coopihc.observation.RuleObservationEngine.RuleObservationEngine

class RuleObservationEngine(*args, deterministic_specification=[('game_info', 'all'), ('task_state', 'all'), ('user_state', None), ('assistant_state', None), ('user_action', 'all'), ('assistant_action', 'all')], extradeterministicrules={}, extraprobabilisticrules={}, mapping=None, **kwargs)

Bases: coopihc.observation.BaseObservationEngine.BaseObservationEngine

An observation engine that is specified by rules for each particular substate, using a so-called mapping. Example usage is given below:

obs_eng = RuleObservationEngine(mapping=mapping)
obs, reward = obs_eng.observe(game_state=example_game_state())

A mapping is any iterable whose items have the form:

(substate, subsubstate, _slice, _func, _args, _nfunc, _nargs)

The elements of each item are applied to create one component of the observation, as follows:

observation_component = _nfunc(_func(state[substate][subsubstate][_slice], _args), _nargs)
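The formula above can be illustrated with a minimal plain-Python sketch. It is not the library's implementation: `apply_mapping_sketch`, `double`, and the toy state (nested dicts of lists standing in for State and StateElement objects) are hypothetical, and the functions are assumed to follow the `(observation, gamestate, *args)` signature used in the examples on this page.

```python
def apply_mapping_sketch(state, mapping):
    """Build an observation (nested dict) from `state` by following `mapping`."""
    observation = {}
    for substate, subsubstate, _slice, _func, _args, _nfunc, _nargs in mapping:
        # Select the observed part of the state element
        component = state[substate][subsubstate][_slice]
        # Deterministic transformation, skipped when _func is None
        if _func is not None:
            component = _func(component, state, *(_args or ()))
        # Noise (probabilistic) transformation, skipped when _nfunc is None
        if _nfunc is not None:
            component = _nfunc(component, state, *(_nargs or ()))
        observation.setdefault(substate, {})[subsubstate] = component
    return observation


def double(observation, gamestate, *args):
    # Hypothetical deterministic rule: double every observed value
    return [2 * value for value in observation]


# Toy stand-in for a game state (assumption: plain dicts and lists)
toy_state = {
    "task_state": {"position": [3], "targets": [1, 5]},
    "user_state": {"goal": [5]},
}
toy_mapping = [
    ("task_state", "targets", slice(0, 1, 1), None, None, None, None),
    ("user_state", "goal", slice(0, 1, 1), double, (), None, None),
]
toy_observation = apply_mapping_sketch(toy_state, toy_mapping)
print(toy_observation)
# {'task_state': {'targets': [1]}, 'user_state': {'goal': [10]}}
```

Substates that do not appear in the mapping (here, ("task_state", "position")) are simply absent from the resulting observation.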

For example, a valid mapping for the game state returned by example_game_state, which specifies that everything should be observed except the game information, is as follows:

from coopihc.base.utils import example_game_state
print(example_game_state())

# Define mapping
mapping = [
    ("task_state", "position", slice(0, 1, 1), None, None, None, None),
    ("task_state", "targets", slice(0, 2, 1), None, None, None, None),
    ("user_state", "goal", slice(0, 1, 1), None, None, None, None),
    ("assistant_state", "beliefs", slice(0, 8, 1), None, None, None, None),
    ("user_action", "action", slice(0, 1, 1), None, None, None, None),
    ("assistant_action", "action", slice(0, 1, 1), None, None, None, None),
]

# Apply mapping
obseng = RuleObservationEngine(mapping=mapping)
obseng.observe(example_game_state())

As a more complex example, suppose we want an observation engine that behaves as above, but which doubles the observation of the ("user_state", "goal") StateElement and returns a noisy observation of the ("task_state", "position") StateElement. We would need the following mapping:

import random

# Deterministic rule: multiply the observation by a gain
def f(observation, gamestate, *args):
    gain = args[0]
    return gain * observation

# Noise rule: add 0 or 1 at random to the observation
def g(observation, gamestate, *args):
    return random.randint(0, 1) + observation

mapping = [
    ("task_state", "position", slice(0, 1, 1), None, None, g, ()),
    ("task_state", "targets", slice(0, 2, 1), None, None, None, None),
    ("user_state", "goal", slice(0, 1, 1), f, (2,), None, None),
    ("user_action", "action", slice(0, 1, 1), None, None, None, None),
    ("assistant_action", "action", slice(0, 1, 1), None, None, None, None),
]

Note

It is important to respect the signature of the functions you pass in the mapping (viz. f and g’s signatures).

Writing out a full mapping can be laborious, and the result can be hard for collaborators to read. The engine therefore also accepts higher-level rules, from which the mapping is created.

Example usage:

obs_eng = RuleObservationEngine(
    deterministic_specification=engine_specification,
    extradeterministicrules=extradeterministicrules,
    extraprobabilisticrules=extraprobabilisticrules,
)

There are three types of rules:

Deterministic rules, which specify at a high level which states are observable or not, e.g.

engine_specification = [
    ("game_info", "all"),
    ("task_state", "targets", slice(0, 1, 1)),
    ("user_state", "all"),
    ("assistant_state", None),
    ("user_action", "all"),
    ("assistant_action", "all"),
]

Extra deterministic rules, which add some specific rules to specific substates

def f(observation, gamestate, *args):
    gain = args[0]
    return gain * observation

f_rule = {("user_state", "goal"): (f, (2,))}
extradeterministicrules = {}
extradeterministicrules.update(f_rule)

Extra probabilistic rules, which are used to e.g. add noise

import random

def g(observation, gamestate, *args):
    return random.random() + observation

g_rule = {("task_state", "position"): (g, ())}
extraprobabilisticrules = {}
extraprobabilisticrules.update(g_rule)
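One way to picture how the three rule types could be combined into a mapping is sketched below. This is an illustrative assumption, not the code behind create_mapping: `create_mapping_sketch` and the toy state (nested dicts of lists) are hypothetical, and "all" is assumed to mean "observe every element of the substate in full".

```python
def create_mapping_sketch(state, specification, extra_det=None, extra_prob=None):
    """Expand high-level rules into 7-element mapping items."""
    extra_det = extra_det or {}
    extra_prob = extra_prob or {}
    mapping = []
    for substate, *rule in specification:
        if rule[0] is None:
            continue  # (substate, None): substate is not observable
        if rule[0] == "all":
            # (substate, "all"): observe every element in full
            selected = [
                (name, slice(0, len(values), 1))
                for name, values in state[substate].items()
            ]
        else:
            # (substate, subsubstate, slice): observe part of one element
            selected = [(rule[0], rule[1])]
        for name, _slice in selected:
            # Attach extra rules keyed by (substate, subsubstate), if any
            _func, _args = extra_det.get((substate, name), (None, None))
            _nfunc, _nargs = extra_prob.get((substate, name), (None, None))
            mapping.append((substate, name, _slice, _func, _args, _nfunc, _nargs))
    return mapping


def f(observation, gamestate, *args):
    gain = args[0]
    return gain * observation


# Toy stand-in for a game state (assumption: plain dicts and lists)
toy_state = {
    "task_state": {"position": [3], "targets": [1, 5]},
    "user_state": {"goal": [5]},
    "assistant_state": {"beliefs": [0.5, 0.5]},
}
toy_specification = [
    ("task_state", "targets", slice(0, 1, 1)),
    ("user_state", "all"),
    ("assistant_state", None),
]
toy_mapping = create_mapping_sketch(
    toy_state, toy_specification, extra_det={("user_state", "goal"): (f, (2,))}
)
for item in toy_mapping:
    print(item)
```

In this sketch the specification decides which rows exist, and the extra rules only fill in the function slots of rows that are already observable.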

Warning

This observation engine works on deep copies of the states, so that operations on observations cannot alter the actual states. The copying may be slow, though.
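The reason for copying can be shown with the standard library alone. The sketch below uses a toy nested dict as a stand-in for a game state (an assumption, not the library's State class): an in-place change to the deep-copied observation leaves the original untouched.

```python
import copy

# Toy stand-in for a game state (assumption: plain dicts and lists)
game_state = {"task_state": {"position": [3]}}

# Deep-copy before handing the data out as an observation
observation = copy.deepcopy(game_state)
observation["task_state"]["position"][0] = 99  # e.g. an in-place noise step

print(game_state["task_state"]["position"])   # [3], the real state is unchanged
print(observation["task_state"]["position"])  # [99]
```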

Parameters

deterministic_specification (list(tuples), optional) – deterministic rules, defaults to base_task_engine_specification
extradeterministicrules (dict, optional) – extra deterministic rules, defaults to {}
extraprobabilisticrules (dict, optional) – extra probabilistic rules, defaults to {}
mapping (iterable, optional) – mapping, defaults to None

Methods

`apply_mapping`	Apply the rule mapping
`create_mapping`	Create mapping from the high level rules specified in the Rule Engine.
`default_value`	Apply this decorator to use bundle.game_state as default value to observe if game_state = None
`observe`
`observe_from_substates`
`reset`	Reset the observation engine.

Attributes

`action`	returns the last action
`bundle`
`observation`	returns the last observation
`unwrapped`

property action

returns the last action

Returns: last action
Return type: State

apply_mapping(game_state)

Apply the rule mapping

Parameters: game_state (State) – game state
Returns: observation
Return type: State

create_mapping(game_state)

Create mapping from the high level rules specified in the Rule Engine.

Parameters: game_state (State) – game state
Returns: Mapping
Return type: iterable

static default_value(func)

Apply this decorator to use bundle.game_state as the default value to observe if game_state is None.
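A decorator with this behavior could look roughly like the sketch below. The names `default_value_sketch` and `ToyEngine` are hypothetical, and the bundle is replaced by a `SimpleNamespace`; the library's actual decorator may differ in its details.

```python
import functools
from types import SimpleNamespace


def default_value_sketch(func):
    """Fall back to self.bundle.game_state when no game_state is given."""
    @functools.wraps(func)
    def wrapper(self, game_state=None, *args, **kwargs):
        if game_state is None:
            game_state = self.bundle.game_state
        return func(self, game_state, *args, **kwargs)
    return wrapper


class ToyEngine:
    # Hypothetical stand-in for an observation engine attached to a bundle
    def __init__(self, bundle):
        self.bundle = bundle

    @default_value_sketch
    def observe(self, game_state=None):
        return game_state


engine = ToyEngine(SimpleNamespace(game_state={"turn": 0}))
from_bundle = engine.observe()                    # uses bundle.game_state
explicit = engine.observe(game_state={"turn": 1})  # uses the given state
print(from_bundle, explicit)
```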

property observation

returns the last observation

Returns: last observation
Return type: State

reset(random=True)

Reset the observation engine.

Empty by default.

Parameters: random (bool, optional) – whether states internal to the observation engine are reset randomly, defaults to True. Useful in case of subclassing the Observation Engine.