coopihczoo.teaching.scripts_to_sort.behavioral_cloning_original.Transitions
- class coopihczoo.teaching.scripts_to_sort.behavioral_cloning_original.Transitions(obs: numpy.ndarray, acts: numpy.ndarray, infos: numpy.ndarray, next_obs: numpy.ndarray, dones: numpy.ndarray)
- Bases: coopihczoo.teaching.scripts_to_sort.behavioral_cloning_original.TransitionsMinimal
- A batch of obs-act-obs-done transitions.
- Methods: register_datapipe_as_function, register_function
- Attributes: functions, next_obs (New observation.), dones (Boolean array indicating episode termination.)
 - acts: np.ndarray
- Actions. Shape: (batch_size,) + action_shape. 
 - dones: numpy.ndarray
- Boolean array indicating episode termination. Shape: (batch_size,). - dones[i] is true iff next_obs[i] is the last observation of an episode. 
 - infos: np.ndarray
- Array of info dicts. Shape: (batch_size,). 
 - next_obs: numpy.ndarray
- New observation. Shape: (batch_size,) + observation_shape. - The i-th observation next_obs[i] in this array is the observation after the agent has taken action acts[i]. - Invariants:
- next_obs.dtype == obs.dtype 
- len(next_obs) == len(obs) 
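
The two invariants above can be checked directly on plain NumPy arrays. The following is a minimal sketch; the helper name `check_next_obs_invariants` is hypothetical and is not part of coopihczoo, and whether the class itself performs such a check is not stated here.

```python
import numpy as np


def check_next_obs_invariants(obs: np.ndarray, next_obs: np.ndarray) -> None:
    """Raise ValueError if either documented invariant is violated.

    Hypothetical helper for illustration only.
    """
    # Invariant 1: next_obs.dtype == obs.dtype
    if next_obs.dtype != obs.dtype:
        raise ValueError(f"dtype mismatch: {next_obs.dtype} vs {obs.dtype}")
    # Invariant 2: len(next_obs) == len(obs)
    if len(next_obs) != len(obs):
        raise ValueError(f"length mismatch: {len(next_obs)} vs {len(obs)}")


# A batch of 4 transitions with a 3-dimensional observation.
obs = np.zeros((4, 3), dtype=np.float32)
next_obs = np.ones((4, 3), dtype=np.float32)
check_next_obs_invariants(obs, next_obs)  # passes silently
```

Casting only one of the two arrays (for example `next_obs.astype(np.float64)`) would make the first check fail.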
 
 
 - obs: np.ndarray
- Previous observations. Shape: (batch_size,) + observation_shape. - The i-th observation obs[i] in this array is the observation seen by the agent when choosing action acts[i]. obs[i] is not required to be from the timestep preceding obs[i+1].
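
To illustrate how the five arrays relate, here is a sketch that flattens a single episode into arrays with the documented shapes. It uses plain NumPy and a dictionary rather than the class itself; the function `episode_to_arrays` and the toy data are assumptions made for illustration.

```python
import numpy as np


def episode_to_arrays(observations, actions):
    """Split one episode (T+1 observations, T actions) into transition arrays.

    Hypothetical helper: returns a dict keyed like the constructor arguments.
    """
    observations = np.asarray(observations)
    actions = np.asarray(actions)
    if len(actions) == 0 or len(observations) != len(actions) + 1:
        raise ValueError("expected T >= 1 actions and T + 1 observations")
    obs = observations[:-1]        # seen when choosing acts[i]
    next_obs = observations[1:]    # seen after taking acts[i]
    dones = np.zeros(len(actions), dtype=bool)
    dones[-1] = True               # next_obs[-1] ends the episode
    infos = np.array([{} for _ in actions], dtype=object)  # one info dict per step
    return dict(obs=obs, acts=actions, infos=infos, next_obs=next_obs, dones=dones)


# Toy episode: 4 observations of shape (2,), 3 scalar actions.
batch = episode_to_arrays(
    np.arange(8, dtype=np.float32).reshape(4, 2),
    np.array([0, 1, 0]),
)

# obs[i] need not precede obs[i+1], so applying one permutation
# to every array keeps each transition intact.
perm = np.random.default_rng(0).permutation(len(batch["acts"]))
shuffled = {name: arr[perm] for name, arr in batch.items()}
```

Under these assumptions the resulting dictionary could be unpacked into the constructor as `Transitions(**batch)`, since its keys match the parameter names in the signature.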