coopihc.observation.RuleObservationEngine.RuleObservationEngine
- class RuleObservationEngine(*args, deterministic_specification=[('game_info', 'all'), ('task_state', 'all'), ('user_state', None), ('assistant_state', None), ('user_action', 'all'), ('assistant_action', 'all')], extradeterministicrules={}, extraprobabilisticrules={}, mapping=None, **kwargs)[source]
Bases:
coopihc.observation.BaseObservationEngine.BaseObservationEngine
An observation engine that is specified by rules regarding each particular substate, using a so-called mapping. Example usage is given below:
obs_eng = RuleObservationEngine(mapping=mapping)
obs, reward = obs_eng.observe(game_state=example_game_state())
A mapping is any iterable where an item is:
(substate, subsubstate, _slice, _func, _args, _nfunc, _nargs)
The elements of each item are applied to create a particular component of the observation space, as follows:
observation_component = _nfunc(_func(state[substate][subsubstate][_slice], _args), _nargs)
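The composition above can be sketched in plain Python. This is a minimal illustration, not CoopIHC's implementation: the state is a nested dict of lists rather than a State of StateElements, `apply_entry` and `scale` are hypothetical names, and the sketch assumes that rule functions take `(observation, game_state, *args)` and that `None` means "no transformation".

```python
def apply_entry(state, entry):
    """Apply one 7-tuple mapping entry to a nested-dict state (hypothetical helper)."""
    substate, subsubstate, _slice, _func, _args, _nfunc, _nargs = entry
    # Select the observed part of the state.
    component = state[substate][subsubstate][_slice]
    # Deterministic transformation, skipped when _func is None.
    if _func is not None:
        component = _func(component, state, *(_args or ()))
    # Noise transformation, skipped when _nfunc is None.
    if _nfunc is not None:
        component = _nfunc(component, state, *(_nargs or ()))
    return component


def scale(observation, game_state, *args):
    """Multiply every observed value by the gain passed as first extra argument."""
    gain = args[0]
    return [gain * value for value in observation]


# Toy stand-in for a game state (assumption: not the real example_game_state()).
toy_state = {"task_state": {"targets": [3, 5, 7]}}

entry = ("task_state", "targets", slice(0, 2, 1), scale, (2,), None, None)
observed = apply_entry(toy_state, entry)
```

Here the slice keeps the first two targets and `scale` doubles them, while `toy_state` itself is left unchanged.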
For example, a valid mapping for example_game_state which specifies that everything should be observed except the game information is as follows:
from coopihc.base.utils import example_game_state
from coopihc.observation.RuleObservationEngine import RuleObservationEngine

print(example_game_state())

# Define mapping
mapping = [
    ("task_state", "position", slice(0, 1, 1), None, None, None, None),
    ("task_state", "targets", slice(0, 2, 1), None, None, None, None),
    ("user_state", "goal", slice(0, 1, 1), None, None, None, None),
    ("assistant_state", "beliefs", slice(0, 8, 1), None, None, None, None),
    ("user_action", "action", slice(0, 1, 1), None, None, None, None),
    ("assistant_action", "action", slice(0, 1, 1), None, None, None, None),
]

# Apply mapping
obseng = RuleObservationEngine(mapping=mapping)
obseng.observe(example_game_state())
As a more complex example, suppose we want an observation engine that behaves as above, but which doubles the observation of the ("user_state", "goal") StateElement. We also want a noisy observation of the ("task_state", "position") StateElement. We would need the following mapping:
import random


def f(observation, gamestate, *args):
    # Deterministic rule: scale the observation by a gain
    gain = args[0]
    return gain * observation


def g(observation, gamestate, *args):
    # Probabilistic rule: add 0 or 1 at random
    return random.randint(0, 1) + observation


mapping = [
    ("task_state", "position", slice(0, 1, 1), None, None, g, ()),
    ("task_state", "targets", slice(0, 2, 1), None, None, None, None),
    ("user_state", "goal", slice(0, 1, 1), f, (2,), None, None),
    ("user_action", "action", slice(0, 1, 1), None, None, None, None),
    ("assistant_action", "action", slice(0, 1, 1), None, None, None, None),
]
Note
It is important to respect the signature of the functions you pass in the mapping: like f and g above, each must accept (observation, gamestate, *args).
Typing out a mapping may be a bit laborious and hard to comprehend for collaborators; there are some shortcuts that make defining this engine easier.
Example usage:
obs_eng = RuleObservationEngine(
    deterministic_specification=engine_specification,
    extradeterministicrules=extradeterministicrules,
    extraprobabilisticrules=extraprobabilisticrules,
)
There are three types of rules:
Deterministic rules, which specify at a high level which states are observable or not, e.g.:
engine_specification = [
    ("game_info", "all"),
    ("task_state", "targets", slice(0, 1, 1)),
    ("user_state", "all"),
    ("assistant_state", None),
    ("user_action", "all"),
    ("assistant_action", "all"),
]
Extra deterministic rules, which add specific rules to specific substates:
def f(observation, gamestate, *args):
    gain = args[0]
    return gain * observation


f_rule = {("user_state", "goal"): (f, (2,))}
extradeterministicrules = {}
extradeterministicrules.update(f_rule)
Extra probabilistic rules, which are used e.g. to add noise:
import random


def g(observation, gamestate, *args):
    return random.random() + observation


g_rule = {("task_state", "position"): (g, ())}
extraprobabilisticrules = {}
extraprobabilisticrules.update(g_rule)
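To make the relationship between the three kinds of rules and a mapping more concrete, here is a rough sketch of how high-level rules could be expanded into 7-tuple mapping items. It is an illustration under stated assumptions, not the library's create_mapping: `expand_rules` is a hypothetical helper, the state is a plain nested dict, and "all" is assumed to mean every subsubstate observed in full.

```python
def expand_rules(state, specification, extra_det=None, extra_prob=None):
    """Expand high-level rules into 7-tuple mapping items (hypothetical helper)."""
    extra_det = extra_det or {}
    extra_prob = extra_prob or {}
    mapping = []
    for rule in specification:
        substate, what = rule[0], rule[1]
        if what is None:
            continue  # this substate is not observable
        if what == "all":
            # Observe every subsubstate in full.
            targets = [(name, slice(None)) for name in state[substate]]
        else:
            # Observe a single subsubstate, optionally sliced.
            _slice = rule[2] if len(rule) > 2 else slice(None)
            targets = [(what, _slice)]
        for name, _slice in targets:
            # Attach extra rules registered for this (substate, subsubstate) pair.
            func, args = extra_det.get((substate, name), (None, None))
            nfunc, nargs = extra_prob.get((substate, name), (None, None))
            mapping.append((substate, name, _slice, func, args, nfunc, nargs))
    return mapping


def f(observation, gamestate, *args):
    return [args[0] * value for value in observation]


# Toy stand-in for a game state (assumption: not the real example_game_state()).
toy_state = {
    "task_state": {"position": [2], "targets": [0, 9]},
    "user_state": {"goal": [9]},
    "assistant_state": {"beliefs": [0.5, 0.5]},
}
specification = [
    ("task_state", "targets", slice(0, 1, 1)),
    ("user_state", "all"),
    ("assistant_state", None),
]
mapping = expand_rules(
    toy_state, specification, extra_det={("user_state", "goal"): (f, (2,))}
)
```

With this toy input, the result contains one item for ("task_state", "targets") and one for ("user_state", "goal") carrying `f`, and nothing for "assistant_state".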
Warning
This observation engine works on deep copies, so that operations based on observations do not alter the actual states. The copying may be slow, though.
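The reason for deep copies can be illustrated with the standard library alone. The snippet below is a generic sketch using a plain nested dict as a stand-in for a game state; it does not show how the engine itself performs its copies.

```python
import copy

# Toy stand-in for a game state (assumption: not a real CoopIHC State).
game_state = {"task_state": {"position": [2]}}

# Observation built from a deep copy: nested containers are duplicated too,
# so modifying the observation leaves the game state alone.
observation = copy.deepcopy(game_state)
observation["task_state"]["position"][0] = 99

# A shallow copy, by contrast, still shares the nested list with the original.
shallow = copy.copy(game_state)
shares_inner = shallow["task_state"]["position"] is game_state["task_state"]["position"]
```

The price of that isolation is that every nested object is duplicated on each copy, which is where the slowdown comes from.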
- Parameters
deterministic_specification (list(tuples), optional) – deterministic rules, defaults to base_task_engine_specification
extradeterministicrules (dict, optional) – extra deterministic rules, defaults to {}
extraprobabilisticrules (dict, optional) – extra probabilistic rules, defaults to {}
mapping (iterable, optional) – mapping, defaults to None
Methods
apply_mapping – Apply the rule mapping
create_mapping – Create mapping from the high level rules specified in the Rule Engine.
default_value – Apply this decorator to use bundle.game_state as default value to observe if game_state = None
observe
observe_from_substates
reset – Empty by default.
Attributes
action – returns the last action
bundle
observation – returns the last observation
unwrapped
- create_mapping(game_state)[source]
Create mapping from the high level rules specified in the Rule Engine.
- Parameters
game_state (State) – game state
- Returns
Mapping
- Return type
iterable
- static default_value(func)
Apply this decorator to use bundle.game_state as default value to observe if game_state = None
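A decorator with this behavior could look roughly like the sketch below. It is not the library's code: `ToyBundle` and `ToyEngine` are hypothetical stand-ins, and the only assumptions taken from this page are that the engine exposes a bundle attribute with a game_state, and that observe returns an observation and a reward.

```python
import functools


def default_value(func):
    """Fall back to self.bundle.game_state when game_state is None (sketch)."""

    @functools.wraps(func)
    def wrapper(self, game_state=None, *args, **kwargs):
        if game_state is None:
            game_state = self.bundle.game_state
        return func(self, game_state, *args, **kwargs)

    return wrapper


class ToyBundle:
    """Hypothetical stand-in for a bundle holding the current game state."""

    def __init__(self, game_state):
        self.game_state = game_state


class ToyEngine:
    """Hypothetical engine whose observe returns (observation, reward)."""

    def __init__(self, bundle):
        self.bundle = bundle

    @default_value
    def observe(self, game_state=None):
        return game_state, 0


engine = ToyEngine(ToyBundle({"turn": 1}))
from_bundle, _ = engine.observe()  # no argument: uses the bundle's state
explicit, _ = engine.observe(game_state={"turn": 2})  # explicit state wins
```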
- reset(random=True)
Reset the observation engine. Empty by default.
- Parameters
random (bool, optional) – whether states internal to the observation engine are reset randomly, defaults to True. Useful in case of subclassing the Observation Engine.
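As an illustration of where the random flag becomes useful, the following sketch overrides reset in a subclass that has an internal state. Everything here is hypothetical: `ToyBaseEngine` stands in for the real base class, and `noise_level` is an invented internal state. The random module is imported under an alias because the parameter name shadows it inside reset.

```python
import random as rng


class ToyBaseEngine:
    """Hypothetical stand-in for the base engine; reset is empty by default."""

    def reset(self, random=True):
        pass


class NoisyEngine(ToyBaseEngine):
    """Hypothetical subclass with one internal state: a noise level."""

    def __init__(self, default_noise=0.1):
        self.default_noise = default_noise
        self.noise_level = default_noise

    def reset(self, random=True):
        if random:
            # Draw a fresh noise level for the new episode.
            self.noise_level = rng.uniform(0.0, 1.0)
        else:
            # Deterministic reset: fall back to the default value.
            self.noise_level = self.default_noise


engine = NoisyEngine()
engine.reset(random=True)
random_level = engine.noise_level
engine.reset(random=False)
```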