Data Processing and Replay Buffer

Process the data stream produced by agent-environment interactions.

class jax_agents.common.data_processor.DataProcessor(n_steps, replay_buffer, folder)

Bases: object

Class to process the data stream of states and actions.

Calculate the rewards and store 3-tuples (state, action, reward) in a deque to support multistep reinforcement learning (see https://arxiv.org/pdf/1901.07510.pdf). The deque contents are then used to fill the replay buffer for off-policy RL algorithms.

close()

Close logger file.

data_callback(normed_state, normed_action, reward, reset_flag, timestep)

Fill the deque and the replay buffer.
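The deque-based multistep idea can be sketched as follows. This is a minimal illustration, not the library's implementation: the class name NStepWindow, the method push, the discount factor gamma, and the emitted (state, action, n-step return, next state) layout are all assumptions made for the example, and a plain list stands in for the replay buffer.

```python
from collections import deque


class NStepWindow:
    """Hypothetical helper: holds the last n_steps (state, action, reward) tuples."""

    def __init__(self, n_steps, gamma=0.99):
        self.n_steps = n_steps
        self.gamma = gamma  # discount factor (assumed, not part of the documented API)
        self.window = deque(maxlen=n_steps)  # oldest tuple is dropped automatically
        self.transitions = []  # stand-in for the replay buffer

    def push(self, state, action, reward, reset_flag):
        if reset_flag:
            # A new episode started: never mix tuples from different episodes.
            self.window.clear()
        if len(self.window) == self.n_steps:
            # The window is full, so the incoming state is the state reached
            # n_steps after the oldest stored tuple.
            first_state, first_action, _ = self.window[0]
            n_step_return = sum(
                self.gamma ** k * r for k, (_, _, r) in enumerate(self.window)
            )
            self.transitions.append(
                (first_state, first_action, n_step_return, state)
            )
        self.window.append((state, action, reward))


# Usage: four steps of one episode with n_steps=2 and gamma=0.5.
window = NStepWindow(n_steps=2, gamma=0.5)
window.push("s0", "a0", 1.0, reset_flag=True)
window.push("s1", "a1", 2.0, reset_flag=False)
window.push("s2", "a2", 4.0, reset_flag=False)
window.push("s3", "a3", 8.0, reset_flag=False)
```

In this sketch the last few steps of an episode are discarded on reset rather than flushed as shorter returns; a real implementation may handle the episode tail differently.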

class jax_agents.common.data_processor.EpisodeLogger(folder)

Bases: object

Monitors training and logs reward, timesteps and seconds.

close()

Close logger file.

log(reward, timesteps, seconds)

Write reward, timesteps and seconds to the CSV log file.
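A logger with this interface could look roughly like the sketch below. It is an illustration under assumptions: the class name CsvEpisodeLog, the file name episodes.csv, the header row, and the flush after every row are choices made for the example and are not taken from the library.

```python
import csv
import os


class CsvEpisodeLog:
    """Hypothetical stand-in for an episode logger backed by a CSV file."""

    def __init__(self, folder, filename="episodes.csv"):
        os.makedirs(folder, exist_ok=True)
        self.path = os.path.join(folder, filename)
        # newline="" is what the csv module expects for files it writes to.
        self._file = open(self.path, "w", newline="")
        self._writer = csv.writer(self._file)
        self._writer.writerow(["reward", "timesteps", "seconds"])

    def log(self, reward, timesteps, seconds):
        self._writer.writerow([reward, timesteps, seconds])
        self._file.flush()  # keep the file readable while training is running

    def close(self):
        self._file.close()
```

Calling close() at the end of training, as both DataProcessor and EpisodeLogger expose, ensures the file handle is released.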

class jax_agents.common.data_processor.ReplayBuffer(buffer_size, state_dim, action_dim, seed)

Bases: object

A simple FIFO experience replay buffer for off-policy agents.

sample_batch(batch_size)

Sample past experience.

store(data_tuple)

Store new experience.
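The FIFO behaviour can be illustrated with a small ring buffer. This is a sketch only: the class name FifoReplayBuffer is hypothetical, it ignores the state_dim and action_dim arguments of the documented constructor, it stores arbitrary Python tuples, and it samples uniformly with replacement, which may differ from what the library does.

```python
import random


class FifoReplayBuffer:
    """Hypothetical fixed-size buffer that overwrites its oldest entry when full."""

    def __init__(self, buffer_size, seed=0):
        self.buffer_size = buffer_size
        self.storage = [None] * buffer_size
        self.next_index = 0  # slot that the next store() call writes to
        self.size = 0  # number of valid entries currently stored
        self.rng = random.Random(seed)  # seeded for reproducible sampling

    def store(self, data_tuple):
        self.storage[self.next_index] = data_tuple
        # Wrap around so the oldest entry is the next one to be overwritten.
        self.next_index = (self.next_index + 1) % self.buffer_size
        self.size = min(self.size + 1, self.buffer_size)

    def sample_batch(self, batch_size):
        # Uniform sampling with replacement over the valid entries only.
        # Raises ValueError if nothing has been stored yet.
        indices = [self.rng.randrange(self.size) for _ in range(batch_size)]
        return [self.storage[i] for i in indices]


# Usage: a buffer of size 3 keeps only the three most recent items.
buffer = FifoReplayBuffer(buffer_size=3, seed=42)
for item in range(5):
    buffer.store(item)
batch = buffer.sample_batch(4)
```

After the loop the buffer holds the items 2, 3 and 4; the items 0 and 1 were overwritten first because they were the oldest.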