curobo.opt.particle.parallel_mppi module

class curobo.opt.particle.parallel_mppi.BaseActionType(value)

Bases: Enum

An enumeration.

REPEAT = 0
NULL = 1
RANDOM = 2
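
The member names suggest how the tail of the mean action sequence is filled when the horizon is shifted forward by one step. The following is a minimal sketch under that assumption, not cuRobo's implementation: the enum is redefined locally so the snippet runs on its own, and `shift_mean` is a hypothetical helper operating on a plain list for a single action dimension.

```python
import random
from enum import Enum


class BaseActionType(Enum):
    # Mirrors the documented members; redefined so the sketch runs without cuRobo.
    REPEAT = 0
    NULL = 1
    RANDOM = 2


def shift_mean(mean, base_action, rng=None):
    """Drop the first action and append a new tail action (hypothetical helper)."""
    if base_action is BaseActionType.REPEAT:
        tail = mean[-1]  # assumed: repeat the last action
    elif base_action is BaseActionType.NULL:
        tail = 0.0  # assumed: append a zero (null) action
    elif base_action is BaseActionType.RANDOM:
        rng = rng or random.Random(0)
        tail = rng.gauss(0.0, 1.0)  # assumed: append a random action
    else:
        raise ValueError(f"unknown base action: {base_action}")
    return mean[1:] + [tail]


shifted_repeat = shift_mean([0.1, 0.2, 0.3], BaseActionType.REPEAT)
shifted_null = shift_mean([0.1, 0.2, 0.3], BaseActionType.NULL)
shifted_random = shift_mean([0.1, 0.2, 0.3], BaseActionType.RANDOM)
```
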
class curobo.opt.particle.parallel_mppi.CovType(value)

Bases: Enum

An enumeration.

SIGMA_I = 0
DIAG_A = 1
FULL_A = 2
FULL_HA = 3
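
Reading the member names as "scalar times identity", "diagonal over action dimensions", "full over action dimensions", and "full over horizon times action dimensions", the covariance parameterization would differ only in the shape of the stored parameters. The sketch below makes that reading explicit. It is an assumption based on the names alone, it ignores any batch dimension over problems, and `cov_param_shape` is a hypothetical helper.

```python
from enum import Enum


class CovType(Enum):
    # Redefined locally so the sketch runs without cuRobo installed.
    SIGMA_I = 0
    DIAG_A = 1
    FULL_A = 2
    FULL_HA = 3


def cov_param_shape(cov_type, horizon, d_action):
    """Assumed shape of the covariance parameters for each CovType."""
    if cov_type is CovType.SIGMA_I:
        return (1,)  # one scalar variance shared by all dimensions
    if cov_type is CovType.DIAG_A:
        return (d_action,)  # one variance per action dimension
    if cov_type is CovType.FULL_A:
        return (d_action, d_action)  # full matrix over action dimensions
    if cov_type is CovType.FULL_HA:
        n = horizon * d_action
        return (n, n)  # full matrix over the flattened horizon x action space
    raise ValueError(f"unknown covariance type: {cov_type}")


shapes = {c.name: cov_param_shape(c, horizon=30, d_action=7) for c in CovType}
```
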
class curobo.opt.particle.parallel_mppi.ParallelMPPIConfig(d_action: int, action_lows: List[float], action_highs: List[float], action_horizon: int, horizon: int, n_iters: int, cold_start_n_iters: Union[int, None], rollout_fn: RolloutBase, tensor_args: TensorDeviceType, use_cuda_graph: bool, store_debug: bool, debug_info: Any, n_problems: int, num_particles: Union[int, None], sync_cuda_time: bool, use_coo_sparse: bool, gamma: float, sample_mode: curobo.opt.particle.particle_opt_base.SampleMode, seed: int, calculate_value: bool, store_rollouts: bool, init_mean: float, init_cov: float, base_action: curobo.opt.particle.parallel_mppi.BaseActionType, step_size_mean: float, step_size_cov: float, null_act_frac: float, squash_fn: curobo.opt.particle.particle_opt_utils.SquashType, cov_type: curobo.opt.particle.parallel_mppi.CovType, sample_params: curobo.util.sample_lib.SampleConfig, update_cov: bool, random_mean: bool, beta: float, alpha: float, kappa: float, sample_per_problem: bool)

Bases: ParticleOptConfig

Parameters:
  • d_action (int) –

  • action_lows (List[float]) –

  • action_highs (List[float]) –

  • action_horizon (int) –

  • horizon (int) –

  • n_iters (int) –

  • cold_start_n_iters (int | None) –

  • rollout_fn (RolloutBase) –

  • tensor_args (TensorDeviceType) –

  • use_cuda_graph (bool) –

  • store_debug (bool) –

  • debug_info (Any) –

  • n_problems (int) –

  • num_particles (int | None) –

  • sync_cuda_time (bool) –

  • use_coo_sparse (bool) –

  • gamma (float) –

  • sample_mode (SampleMode) –

  • seed (int) –

  • calculate_value (bool) –

  • store_rollouts (bool) –

  • init_mean (float) –

  • init_cov (float) –

  • base_action (BaseActionType) –

  • step_size_mean (float) –

  • step_size_cov (float) –

  • null_act_frac (float) –

  • squash_fn (SquashType) –

  • cov_type (CovType) –

  • sample_params (SampleConfig) –

  • update_cov (bool) –

  • random_mean (bool) –

  • beta (float) –

  • alpha (float) –

  • kappa (float) –

  • sample_per_problem (bool) –

init_mean: float
init_cov: float
base_action: BaseActionType
step_size_mean: float
step_size_cov: float
null_act_frac: float
squash_fn: SquashType
cov_type: CovType
sample_params: SampleConfig
update_cov: bool
random_mean: bool
beta: float
alpha: float
gamma: float
kappa: float
sample_per_problem: bool
static create_data_dict(data_dict, rollout_fn, tensor_args=TensorDeviceType(device=device(type='cuda', index=0), dtype=torch.float32), child_dict=None)

Helper function to create dictionary from optimizer parameters and rollout class.

Parameters:
  • data_dict (Dict) – optimizer parameters dictionary.

  • rollout_fn (RolloutBase) – rollout function.

  • tensor_args (TensorDeviceType) – tensor cuda device.

  • child_dict (Dict | None) – new dictionary where parameters will be stored.

Returns:

Dictionary with parameters to create an OptimizerConfig.

class curobo.opt.particle.parallel_mppi.ParallelMPPI(config=None)

Bases: ParticleOptBase, ParallelMPPIConfig

Base optimization solver class

Parameters:

config (ParallelMPPIConfig | None) – Initialized with parameters from a dataclass.

get_rollouts()
reset_distribution()

Reset control distribution

_compute_total_cost(costs)

Calculate weights using exponential utility
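
In MPPI-style optimizers, exponential-utility weights are commonly computed as a softmax over negated, temperature-scaled total costs. The sketch below illustrates that idea in plain Python. It assumes `gamma` acts as a per-step discount and `beta` as the temperature, and the helper names are hypothetical; it is not a copy of cuRobo's tensor implementation.

```python
import math


def discounted_total_costs(costs, gamma):
    """Sum step costs along each rollout with discount gamma (assumed role of gamma)."""
    return [sum((gamma ** t) * c for t, c in enumerate(traj)) for traj in costs]


def exp_util_weights(total_costs, beta):
    """Softmax of -cost / beta; lower-cost particles receive larger weights."""
    lowest = min(total_costs)  # subtract the minimum for numerical stability
    unnormalized = [math.exp(-(c - lowest) / beta) for c in total_costs]
    normalizer = sum(unnormalized)
    return [u / normalizer for u in unnormalized]


# Three particles, two time steps each.
step_costs = [[1.0, 1.0], [2.0, 2.0], [4.0, 0.0]]
totals = discounted_total_costs(step_costs, gamma=0.5)
weights = exp_util_weights(totals, beta=1.0)
```

A smaller `beta` concentrates the weight on the lowest-cost particle, while a larger `beta` spreads it more evenly.
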

_exp_util(total_costs)
_compute_mean(w, actions)
_compute_covariance(w, actions)
_update_cov_scale()
_update_distribution(trajectories)

Update current control distribution using rollout trajectories

Parameters:

trajectories (Trajectory) – Rollout trajectories (dict) containing the following fields:

  • observations (torch.tensor) – observations along rollouts.

  • actions (torch.tensor) – actions sampled from control distribution along rollouts.

  • costs (torch.tensor) – step costs along rollouts.
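
A common way to update the control distribution from weighted rollouts is to take the weighted average of the sampled actions and blend it into the current mean with a step size. The sketch below assumes that `step_size_mean` plays this blending role. It works on plain lists for a single action dimension, and `weighted_mean` and `blend` are hypothetical helpers rather than cuRobo methods.

```python
def weighted_mean(weights, actions):
    """Weighted average over particles; actions is [num_particles][horizon]."""
    horizon = len(actions[0])
    return [sum(w * a[t] for w, a in zip(weights, actions)) for t in range(horizon)]


def blend(old_mean, new_mean, step_size):
    """Move the current mean toward the new estimate by step_size in [0, 1]."""
    return [(1.0 - step_size) * o + step_size * n for o, n in zip(old_mean, new_mean)]


particle_weights = [0.5, 0.5]
particle_actions = [[0.0, 2.0], [2.0, 4.0]]
new_mean = weighted_mean(particle_weights, particle_actions)
updated_mean = blend([0.0, 0.0], new_mean, step_size=0.5)
```

With `step_size=1.0` the mean is replaced outright, and smaller values smooth the update across iterations.
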

sample_actions(init_act)

Sample actions from current control distribution

update_seed(init_act)
update_init_mean(init_mean)
reset_mean()
reset_covariance()
_get_action_seq(mode)

Get action sequence to execute on the system based on current control distribution

Parameters:

mode (SampleMode) – {‘mean’, ‘sample’}. How to choose the action to execute: ‘mean’ plays the mean action, ‘sample’ samples from the distribution.

generate_noise(shape, base_seed=None)

Generate correlated noisy samples using an autoregressive process.
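
For readers unfamiliar with the term, a first-order autoregressive process produces noise in which each sample depends on the previous one. The sketch below is a generic AR(1) illustration in plain Python, not cuRobo's sampling code: the correlation coefficient `rho` and the helper `ar1_noise` are assumptions introduced here.

```python
import random


def ar1_noise(horizon, rho, seed=None):
    """Generate `horizon` samples of unit-variance AR(1) noise with correlation rho."""
    rng = random.Random(seed)
    # Scale the innovation so the stationary variance stays at 1.
    innovation_scale = (1.0 - rho ** 2) ** 0.5
    samples = [rng.gauss(0.0, 1.0)]
    for _ in range(horizon - 1):
        samples.append(rho * samples[-1] + innovation_scale * rng.gauss(0.0, 1.0))
    return samples


smooth_noise = ar1_noise(horizon=10, rho=0.9, seed=0)
constant_noise = ar1_noise(horizon=5, rho=1.0, seed=0)
```

With `rho=0.0` the samples are independent, and as `rho` approaches 1 the sequence becomes smoother along the horizon.
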

_calc_val(trajectories)

Calculate value of state given rollouts from a policy

Parameters:

trajectories (Trajectory) –
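
One common estimate of the value of a state from sampled rollouts in MPPI-style methods is the soft minimum (negative scaled log-sum-exp) of the total rollout costs. The sketch below shows that estimate under the assumption that `beta` is the temperature. `soft_value` is a hypothetical helper, and cuRobo's method may include additional terms such as control costs.

```python
import math


def soft_value(total_costs, beta):
    """Soft-min value estimate: -beta * logsumexp(-cost / beta)."""
    scaled = [-c / beta for c in total_costs]
    largest = max(scaled)  # shift by the maximum for numerical stability
    log_sum_exp = largest + math.log(sum(math.exp(s - largest) for s in scaled))
    return -beta * log_sum_exp


value_equal = soft_value([2.0, 2.0], beta=1.0)
value_mixed = soft_value([1.0, 5.0, 9.0], beta=0.1)
```

For a small `beta`, the estimate approaches the minimum total cost among the rollouts.
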

reset()

Reset the optimizer

property squashed_mean
property full_cov
property full_inv_cov
property full_scale_tril
property entropy
reset_seed()

Reset seeds.

update_samples()
generate_rollouts(init_act=None)

Samples a batch of actions, rolls out trajectories for each particle, and returns the resulting observations, costs, and actions.

Parameters:

state (dict or np.ndarray) – Initial state to set the simulation problem to