Deep Deterministic Policy Gradient
Deep Deterministic Policy Gradient (DDPG).
-
class jax_agents.algorithms.ddpg.DDPG(config: jax_agents.algorithms.ddpg.DDPGConfig)
Bases: object
DDPG algorithm class. Reference: https://arxiv.org/abs/1509.02971.
-
classmethod load(config, state_path)
Return a DDPG instance initialized with the pickled algo_state.
-
select_action(state, algo_func, algo_state)
Output the on-policy action, i.e. the action given by the current policy.
-
train_step(data_batch, algo_func, algo_state)
Update all functions.
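For orientation, the update described in the referenced paper (https://arxiv.org/abs/1509.02971) can be summarized as follows. Whether train_step follows it exactly is an assumption: the paper uses target networks (written Q' and μ' below), while the DDPGState documented on this page lists only one parameter set per network. Here μ denotes the policy (pi_net), Q the q function (q_net), and γ the discount factor (gamma).

```latex
\begin{align}
  % Critic target for a minibatch of N transitions (s_i, a_i, r_i, s_{i+1})
  y_i &= r_i + \gamma \, Q'\big(s_{i+1}, \mu'(s_{i+1} \mid \theta^{\mu'}) \mid \theta^{Q'}\big) \\
  % Critic loss: mean squared error against the target
  L(\theta^{Q}) &= \frac{1}{N} \sum_{i} \big(y_i - Q(s_i, a_i \mid \theta^{Q})\big)^2 \\
  % Actor update: sampled deterministic policy gradient
  \nabla_{\theta^{\mu}} J &\approx \frac{1}{N} \sum_{i}
    \nabla_{a} Q(s, a \mid \theta^{Q})\big|_{s=s_i,\, a=\mu(s_i)} \,
    \nabla_{\theta^{\mu}} \mu(s \mid \theta^{\mu})\big|_{s=s_i}
\end{align}
```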
-
class jax_agents.algorithms.ddpg.DDPGConfig
Bases: tuple
Config to initialize DDPG.
Parameters:
- state_dim – the dimension of the state vector.
- action_dim – the dimension of the action vector.
- pi_net_size – a list of ints giving the hidden layer sizes of the policy network.
- q_net_size – a list of ints giving the hidden layer sizes of the Q-network.
- learning_rate – the learning rate of the Adam optimizer.
- gamma – the discount factor of the algorithm.
- seed – the random seed for initialization of the networks.
-
action_dim – Alias for field number 1
-
gamma – Alias for field number 5
-
learning_rate – Alias for field number 4
-
pi_net_size – Alias for field number 2
-
q_net_size – Alias for field number 3
-
seed – Alias for field number 6
-
state_dim – Alias for field number 0
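The "Alias for field number N" entries suggest that DDPGConfig behaves like a named tuple whose positions follow the parameter order listed above. A minimal sketch of that correspondence, using a hypothetical stand-in class (ConfigSketch is not part of jax_agents, and all values are illustrative):

```python
from collections import namedtuple

# Hypothetical stand-in mirroring the documented field order of DDPGConfig.
# It is NOT the library class; it only illustrates the positional aliases.
ConfigSketch = namedtuple(
    "ConfigSketch",
    ["state_dim", "action_dim", "pi_net_size", "q_net_size",
     "learning_rate", "gamma", "seed"],
)

# Illustrative values only (assumed, not taken from the library).
config = ConfigSketch(
    state_dim=3,
    action_dim=1,
    pi_net_size=[64, 64],
    q_net_size=[64, 64],
    learning_rate=1e-3,
    gamma=0.99,
    seed=0,
)

# Each named field is an alias for a tuple position.
field_positions = {name: index for index, name in enumerate(config._fields)}
```

With the real class, the same seven fields would presumably be passed to DDPGConfig and the result handed to DDPG(config), per the signatures above.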
-
class jax_agents.algorithms.ddpg.DDPGFunc
Bases: tuple
Container for the DDPG functions.
Parameters:
- pi_net – policy neural network.
- q_net – Q-function neural network.
- pi_optimizer – policy optimizer (Adam).
- q_optimizer – Q-function optimizer (Adam).
- gamma – discount factor.
- state_dim – dimension of the state vector.
- action_dim – dimension of the action vector.
-
action_dim – Alias for field number 6
-
gamma – Alias for field number 4
-
pi_net – Alias for field number 0
-
pi_optimizer – Alias for field number 2
-
q_net – Alias for field number 1
-
q_optimizer – Alias for field number 3
-
state_dim – Alias for field number 5
-
class jax_agents.algorithms.ddpg.DDPGState
Bases: tuple
State of the DDPG networks.
Parameters:
- pi_params – policy neural network parameters.
- q_params – Q-function neural network parameters.
- pi_opt_state – state of the policy optimizer (Adam).
- q_opt_state – state of the Q-function optimizer (Adam).
-
pi_opt_state – Alias for field number 2
-
pi_params – Alias for field number 0
-
q_opt_state – Alias for field number 3
-
q_params – Alias for field number 1
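Because DDPGState is a tuple, an update presumably yields a new state object rather than mutating the old one, and load is documented as restoring a pickled algo_state. The sketch below illustrates both ideas with a hypothetical stand-in (StateSketch is not the library class, the values are placeholders, and how jax_agents itself serializes the state is not shown on this page):

```python
import pickle
from collections import namedtuple

# Hypothetical stand-in mirroring the documented fields of DDPGState.
StateSketch = namedtuple(
    "StateSketch", ["pi_params", "q_params", "pi_opt_state", "q_opt_state"]
)

# Placeholder values; real parameters would be arrays (assumption).
state = StateSketch(
    pi_params=[0.1, 0.2],
    q_params=[0.3, 0.4],
    pi_opt_state={"step": 0},
    q_opt_state={"step": 0},
)

# Tuples are immutable, so an update returns a new state and leaves
# the original untouched.
new_state = state._replace(pi_params=[0.15, 0.25], pi_opt_state={"step": 1})

# Round-trip through pickle as a plain dict of fields, then rebuild.
blob = pickle.dumps(dict(new_state._asdict()))
restored = StateSketch(**pickle.loads(blob))
```

Pickling the fields as a plain dict is a choice made for this sketch so that it does not depend on the class being importable; it is not a statement about the file format that load expects.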