Deep Deterministic Policy Gradient

Deep Deterministic Policy Gradient (DDPG).

class jax_agents.algorithms.ddpg.DDPG(config: jax_agents.algorithms.ddpg.DDPGConfig)

Bases: object

DDPG algorithm class. See https://arxiv.org/abs/1509.02971.

classmethod load(config, state_path)

Return a DDPG instance initialized with the pickled algo_state.
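
As a rough illustration of what a pickled algo_state could look like, the sketch below round-trips a state tuple through pickle. The State tuple, save_state, and load_state are hypothetical stand-ins defined here for illustration only. They are not part of jax_agents, and the library's actual on-disk format may differ.

```python
import pickle
import tempfile
from collections import namedtuple
from pathlib import Path

# Hypothetical stand-in mirroring the documented DDPGState field order.
State = namedtuple(
    "State", ["pi_params", "q_params", "pi_opt_state", "q_opt_state"]
)


def save_state(state, path):
    """Pickle the state as a plain dict so loading does not depend on the class."""
    with open(path, "wb") as f:
        pickle.dump(dict(state._asdict()), f)


def load_state(path):
    """Rebuild the state tuple from the pickled dict."""
    with open(path, "rb") as f:
        return State(**pickle.load(f))


# Illustrative values only; real parameters would be network weights.
original = State(
    pi_params=[0.1, 0.2], q_params=[0.3], pi_opt_state=None, q_opt_state=None
)

with tempfile.TemporaryDirectory() as tmp:
    state_path = Path(tmp) / "ddpg_state.pkl"
    save_state(original, state_path)
    restored = load_state(state_path)
```

Storing a plain dict rather than the tuple class itself is one possible design choice: it keeps the file loadable even if the tuple class is later moved or renamed.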

select_action(state, algo_func, algo_state)

Return the action chosen by the current policy for the given state.

train_step(data_batch, algo_func, algo_state)

Update the policy and Q-function parameters from a batch of data.
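
The referenced paper trains the Q function toward a bootstrapped target, y = r + gamma * Q(s', pi(s')). The sketch below computes that target for a small batch in plain Python. The function name td_targets and the batch layout are assumptions for illustration, and the (1 - done) mask for terminal transitions is a common implementation convention rather than something this documentation states. Whether this class evaluates the next-state value with separate target networks, as the paper does, is also not stated here.

```python
def td_targets(rewards, next_q_values, dones, gamma):
    """Bootstrapped critic targets: r + gamma * (1 - done) * Q(s', pi(s'))."""
    return [
        r + gamma * (1.0 - float(d)) * q_next
        for r, q_next, d in zip(rewards, next_q_values, dones)
    ]


# Illustrative batch of three transitions; the last one is terminal,
# so its target is just the immediate reward.
targets = td_targets(
    rewards=[1.0, 0.0, 2.0],
    next_q_values=[10.0, 4.0, 7.0],
    dones=[False, False, True],
    gamma=0.5,
)
```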

class jax_agents.algorithms.ddpg.DDPGConfig

Bases: tuple

Config to initialize DDPG.

Parameters:
  • state_dim – the dimension of the state vector.
  • action_dim – the dimension of the action vector.
  • pi_net_size – a list of ints giving the hidden layer sizes of the policy network.
  • q_net_size – a list of ints giving the hidden layer sizes of the Q network.
  • learning_rate – the learning rate of the Adam optimizer.
  • gamma – the discount factor of the algorithm.
  • seed – the random seed used to initialize the networks.

action_dim

Alias for field number 1

gamma

Alias for field number 5

learning_rate

Alias for field number 4

pi_net_size

Alias for field number 2

q_net_size

Alias for field number 3

seed

Alias for field number 6

state_dim

Alias for field number 0
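
Because DDPGConfig is a tuple with named fields, each field can be read either by name or by the position listed above. The sketch below uses a local stand-in with the same field order instead of importing the library, and every value is an illustrative assumption, not a recommended default.

```python
from typing import List, NamedTuple


class DDPGConfigSketch(NamedTuple):
    """Hypothetical stand-in mirroring the documented DDPGConfig field order."""

    state_dim: int          # field 0
    action_dim: int         # field 1
    pi_net_size: List[int]  # field 2
    q_net_size: List[int]   # field 3
    learning_rate: float    # field 4
    gamma: float            # field 5
    seed: int               # field 6


config = DDPGConfigSketch(
    state_dim=3,
    action_dim=1,
    pi_net_size=[64, 64],
    q_net_size=[64, 64],
    learning_rate=1e-3,
    gamma=0.99,
    seed=0,
)

# Access by name and by position refers to the same value.
same_value = config.gamma == config[5]
```

Passing the fields as keyword arguments avoids depending on their positional order, which is easy to get wrong with seven fields.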

class jax_agents.algorithms.ddpg.DDPGFunc

Bases: tuple

Config to initialize the DDPG functions.

Parameters:
  • pi_net – policy neural network.
  • q_net – Q-function neural network.
  • pi_optimizer – policy optimizer (Adam).
  • q_optimizer – Q-function optimizer (Adam).
  • gamma – discount factor.
  • state_dim – dimension of the state vector.
  • action_dim – dimension of the action vector.

action_dim

Alias for field number 6

gamma

Alias for field number 4

pi_net

Alias for field number 0

pi_optimizer

Alias for field number 2

q_net

Alias for field number 1

q_optimizer

Alias for field number 3

state_dim

Alias for field number 5

class jax_agents.algorithms.ddpg.DDPGState

Bases: tuple

State of the DDPG networks.

Parameters:
  • pi_params – policy neural network parameters.
  • q_params – Q-function neural network parameters.
  • pi_opt_state – state of the policy optimizer (Adam).
  • q_opt_state – state of the Q-function optimizer (Adam).

pi_opt_state

Alias for field number 2

pi_params

Alias for field number 0

q_opt_state

Alias for field number 3

q_params

Alias for field number 1
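
Since DDPGState is an immutable tuple, an update cannot modify it in place and would instead build a new tuple. The sketch below shows that pattern with a stand-in tuple and a hypothetical apply_update helper. It only illustrates the data flow and is not the library's update rule.

```python
from collections import namedtuple

# Hypothetical stand-in mirroring the documented DDPGState field order.
StateSketch = namedtuple(
    "StateSketch", ["pi_params", "q_params", "pi_opt_state", "q_opt_state"]
)


def apply_update(state, new_pi_params, new_q_params):
    """Return a new state with replaced parameters; the input is left untouched."""
    return state._replace(pi_params=new_pi_params, q_params=new_q_params)


# Placeholder values; real entries would be parameter and optimizer pytrees.
old = StateSketch(
    pi_params=(0.0,), q_params=(1.0,), pi_opt_state="opt-a", q_opt_state="opt-b"
)
new = apply_update(old, new_pi_params=(0.5,), new_q_params=(0.9,))
```

Keeping the state immutable means the previous value stays available after an update, which is convenient for checkpointing or for comparing parameters before and after a step.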