coopihczoo.teaching.scripts_to_sort.behavioral_cloning_original.Transitions
- class coopihczoo.teaching.scripts_to_sort.behavioral_cloning_original.Transitions(obs: numpy.ndarray, acts: numpy.ndarray, infos: numpy.ndarray, next_obs: numpy.ndarray, dones: numpy.ndarray)
Bases: coopihczoo.teaching.scripts_to_sort.behavioral_cloning_original.TransitionsMinimal
A batch of obs-act-obs-done transitions.
Methods
register_datapipe_as_function
register_function
Attributes
functions
next_obs: New observation.
dones: Boolean array indicating episode termination.
- acts: np.ndarray
Actions. Shape: (batch_size,) + action_shape.
- dones: numpy.ndarray
Boolean array indicating episode termination. Shape: (batch_size, ).
dones[i] is true iff next_obs[i] is the last observation of an episode.
- infos: np.ndarray
Array of info dicts. Shape: (batch_size,).
- next_obs: numpy.ndarray
New observation. Shape: (batch_size, ) + observation_shape.
The i’th observation next_obs[i] in this array is the observation after the agent has taken action acts[i].
- Invariants:
next_obs.dtype == obs.dtype
len(next_obs) == len(obs)
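The two invariants above can be checked with plain NumPy before a batch is assembled. The following is a minimal sketch, not part of the library: the helper name `check_transition_invariants` and the example arrays are hypothetical, and the class itself may perform its own validation.

```python
import numpy as np


def check_transition_invariants(obs: np.ndarray, next_obs: np.ndarray) -> None:
    """Raise ValueError if the documented invariants do not hold.

    Hypothetical helper for illustration; it only mirrors the two
    invariants listed in the documentation.
    """
    if next_obs.dtype != obs.dtype:
        raise ValueError(
            f"dtype mismatch: obs is {obs.dtype}, next_obs is {next_obs.dtype}"
        )
    if len(next_obs) != len(obs):
        raise ValueError(
            f"length mismatch: len(obs)={len(obs)}, len(next_obs)={len(next_obs)}"
        )


# Example batch of 4 transitions with a 3-dimensional observation space.
obs = np.zeros((4, 3), dtype=np.float32)
next_obs = np.ones((4, 3), dtype=np.float32)
check_transition_invariants(obs, next_obs)  # passes silently
```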
- obs: np.ndarray
Previous observations. Shape: (batch_size, ) + observation_shape.
The i’th observation obs[i] in this array is the observation seen by the agent when choosing action acts[i]. obs[i] is not required to be from the timestep preceding obs[i+1].
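To make the relationship between the five fields concrete, the sketch below builds them from a single made-up episode using only NumPy. The episode data and variable names are assumptions for illustration; the sketch does not call the class, and it shows just one way such arrays might be prepared.

```python
import numpy as np

# Hypothetical episode: T + 1 = 4 observations and T = 3 actions.
episode_obs = np.arange(8, dtype=np.float32).reshape(4, 2)
episode_acts = np.array([0, 1, 0])

# obs[i] is what the agent saw when choosing acts[i];
# next_obs[i] is what it saw after taking acts[i].
obs = episode_obs[:-1]
next_obs = episode_obs[1:]
acts = episode_acts

# dones[i] is True only where next_obs[i] ends the episode.
dones = np.zeros(len(acts), dtype=bool)
dones[-1] = True

# One (here empty) info dict per transition, stored as an object array.
infos = np.array([{} for _ in acts], dtype=object)

# Every field shares the leading batch dimension.
batch_size = len(acts)
for name, arr in [("obs", obs), ("next_obs", next_obs),
                  ("dones", dones), ("infos", infos)]:
    assert len(arr) == batch_size, name
```

Within this single episode, next_obs[i] equals obs[i + 1]. That need not hold in general: a batch may mix or shuffle transitions from several episodes, which is why obs[i] is not required to precede obs[i+1].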