The Observation Engines

States are rarely perfectly observable by agents:

  • the other agent’s internal states are usually unknown,

  • the task’s state may be partially observable; for example, a human observer may produce noisy observations,

  • an agent’s own internal state may be partially observable; for example, a human observer might have poor retention capabilities.

Furthermore, there might be a cost associated with making observations:

  • There can be a tradeoff between the time needed to produce an observation and its quality; for example, precise observations may be costly (in terms of time).

  • A human observer may enjoy making observations that differ (according to some criterion) from previous ones, in which case the observer is rewarded for novel observations that satisfy its curiosity.
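The time/quality tradeoff can be made concrete with a small sketch. It does not use the CoopIHC API: the function name timed_observation, the 1/sqrt(dwell_time) noise model, and the linear time cost are all assumptions chosen for illustration.

```python
import random


def timed_observation(true_value, dwell_time, rng, base_noise=1.0, cost_per_second=0.1):
    """Return (observation, reward) for a hypothetical time-limited observer.

    Assumptions (not taken from CoopIHC): the noise standard deviation
    shrinks as 1/sqrt(dwell_time), and the reward is the negative time cost.
    """
    if dwell_time <= 0:
        raise ValueError("dwell_time must be positive")
    noise_std = base_noise / dwell_time ** 0.5
    observation = true_value + rng.gauss(0.0, noise_std)
    reward = -cost_per_second * dwell_time
    return observation, reward


rng = random.Random(0)
# A quick glance: cheap but noisy.
quick_obs, quick_reward = timed_observation(5.0, dwell_time=1.0, rng=rng)
# A careful look: a tenth of the noise, a hundred times the cost.
careful_obs, careful_reward = timed_observation(5.0, dwell_time=100.0, rng=rng)
```

The function returns an (observation, reward) pair, which is the same shape an observation engine's observe() method is expected to return.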

CoopIHC provides a generic object called an observation engine, which specifies how an observation is created from the game state. To create a new observation engine, you can either base it on an existing observation engine or subclass BaseObservationEngine.

Subclassing BaseObservationEngine

To create a new engine by subclassing the BaseObservationEngine class, you have to redefine the observe() method. You can put virtually anything inside this method, including the output of a neural network, of a complex simulation process, or even of another bundle (see Modularity for an example). Below, we show a basic example where we define an engine that only looks at a particular substate.

from coopihc.base.State import State
from coopihc.observation.BaseObservationEngine import BaseObservationEngine


class ExampleObservationEngine(BaseObservationEngine):
    """ExampleObservationEngine

    A simple example where the engine is only able to see a particular state (observable_state).

    :param observable_state: only state that can be observed
    :type observable_state: string
    """

    def __init__(self, observable_state, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.observable_state = observable_state

    # @BaseObservationEngine.get_params
    @BaseObservationEngine.default_value
    def observe(self, game_state=None):
        """observe

        Select only state observable_state.

        :param game_state: game state
        :type game_state: `State<coopihc.base.State.State>`
        :return: (observation, obs reward)
        :rtype: tuple(`State<coopihc.base.State.State>`, float)
        """
        # Build a new State holding only the observable substate; the reward is 0.
        return (
            State(**{self.observable_state: game_state[self.observable_state]}),
            0,
        )

Don’t forget to return a reward with the observation. The effect of this engine can be tested by plugging in a simple State:

obs_engine = ExampleObservationEngine("substate1")
# Game state before observation
# >>> print(S)
# ----------  ----------  -  ---------
# substate1   substate_x  0  Numeric()
#             substate_y  2  Numeric()
# substate_2  substate_a  0  CatSet(3)
# ----------  ----------  -  ---------

print(obs_engine.observe(game_state=S)[0])
# Produced Observation
# >>> print(obs_engine.observe(game_state=S)[0])
# ---------  ----------  -  ---------
# substate1  substate_x  0  Numeric()
#            substate_y  2  Numeric()
# ---------  ----------  -  ---------

Note

The signature observe(self, game_state=None) is expected. When called with game_state=None, the engine fetches the agent’s observation. If a game state is passed explicitly, the engine uses that state as the basis for the observation. This is useful, for example, when you test your engine and want to control its input.
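The default-argument behaviour described in this note can be sketched with a small decorator. This is a stand-in written for illustration, not CoopIHC's actual default_value implementation; the ToyEngine class and its host_state attribute are hypothetical.

```python
import functools


def default_value(observe):
    """Hypothetical stand-in: fall back to a stored state when game_state is None."""

    @functools.wraps(observe)
    def wrapper(self, game_state=None):
        if game_state is None:
            # Assumed attribute, used here only to illustrate the fallback.
            game_state = self.host_state
        return observe(self, game_state=game_state)

    return wrapper


class ToyEngine:
    """Toy engine that observes only the "x" entry of a dict-based state."""

    def __init__(self, host_state):
        self.host_state = host_state

    @default_value
    def observe(self, game_state=None):
        return {"x": game_state["x"]}, 0


engine = ToyEngine(host_state={"x": 1, "y": 2})
from_host, _ = engine.observe()                               # falls back to host_state
from_input, _ = engine.observe(game_state={"x": 42, "y": 0})  # uses the passed state
```

Passing the state explicitly, as in the second call, is the pattern to use in unit tests because the input is fully under your control.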

Combining Engines – CascadedObservationEngine

Serially combine several engines. Not documented yet; see the API Reference.
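Until this engine is documented, the idea of serial combination can be sketched in plain Python. The sketch does not use CascadedObservationEngine: the cascade helper and the two toy engines are hypothetical, and summing the per-engine rewards is an assumption about how rewards would be combined.

```python
def cascade(engines, game_state):
    """Feed the output of each engine into the next one.

    Each engine is a callable returning (observation, reward).
    Rewards are summed, which is an assumption made for this sketch.
    """
    observation, total_reward = game_state, 0.0
    for engine in engines:
        observation, reward = engine(observation)
        total_reward += reward
    return observation, total_reward


def keep_task_state(state):
    # First stage: drop everything except the task state, at no cost.
    return {"task_state": state["task_state"]}, 0.0


def round_position(state):
    # Second stage: coarsen the position, with an illustrative cost of 0.5.
    rounded = {"task_state": {"position": round(state["task_state"]["position"])}}
    return rounded, -0.5


state = {"task_state": {"position": 2.7}, "user_state": {"goal": 4}}
observation, reward = cascade([keep_task_state, round_position], state)
```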

WrapAsObservationEngine

Wrap a bundle as an engine. Not documented yet; see the API Reference.

RuleObservationEngine

This observation engine is specified by rules for each particular substate, using a so-called mapping.

obs_eng = RuleObservationEngine(mapping=mapping)
obs, reward = obs_eng.observe(game_state=example_game_state())

In the figure below, for example, the observation engine is defined so that it does not observe the first substate, has a noisy observation of the second substate, and perfectly observes the remainder of the state.

Fig. 11 The observation engine

A mapping is any iterable where an item is:

(substate, subsubstate, _slice, _func, _args, _nfunc, _nargs)

The elements of each item are applied to create a particular component of the observation space, as follows:

observation_component = _nfunc(_func(state[substate][subsubstate][_slice], _args), _nargs)

The components are then collected to form an observed state. For example, a valid mapping for example_game_state that observes every substate, game information included, with no transformation applied is as follows:

from coopihc.base.utils import example_game_state
from coopihc.observation.RuleObservationEngine import RuleObservationEngine

print(example_game_state())

# Define mapping: one item per substate element, with no functions applied
mapping = [
    ("game_info", "turn_index", slice(0, 1, 1), None, None, None, None),
    ("game_info", "round_index", slice(0, 1, 1), None, None, None, None),
    ("task_state", "position", slice(0, 1, 1), None, None, None, None),
    ("task_state", "targets", slice(0, 2, 1), None, None, None, None),
    ("user_state", "goal", slice(0, 1, 1), None, None, None, None),
    ("assistant_state", "beliefs", slice(0, 8, 1), None, None, None, None),
    ("user_action", "action", slice(0, 1, 1), None, None, None, None),
    ("assistant_action", "action", slice(0, 1, 1), None, None, None, None),
]

# Apply mapping
obseng = RuleObservationEngine(mapping=mapping)
obseng.observe(example_game_state())

As a more complex example, suppose we want an observation engine that behaves as above, but which doubles the observed value of the ("user_state", "goal") StateElement. We also want a noisy observation of the ("task_state", "position") StateElement. We would need the following mapping:

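import random


def f(observation, gamestate, *args):
    # Deterministic rule: scale the observation by the gain passed in _args
    gain = args[0]
    return gain * observation

def g(observation, gamestate, *args):
    # Noise rule: randomly add 0 or 1 to the observation
    return random.randint(0, 1) + observation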

mapping = [
    ("task_state", "position", slice(0, 1, 1), None, None, g, ()),
    ("task_state", "targets", slice(0, 2, 1), None, None, None, None),
    ("user_state", "goal", slice(0, 1, 1), f, (2,), None, None),
    ("user_action", "action", slice(0, 1, 1), None, None, None, None),
    ("assistant_action", "action", slice(0, 1, 1), None, None, None, None),
]

Note

It is important to respect the signature of the functions you pass in the mapping, namely (observation, gamestate, *args), as in f and g above.
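To see how the seven elements of a mapping item interact, the following sketch applies a mapping to a plain nested dictionary. It is not the RuleObservationEngine implementation: apply_mapping, scale, and offset are hypothetical names, lists stand in for StateElements, and treating None as "no function" is an assumption. The functions follow the (observation, gamestate, *args) signature.

```python
def apply_mapping(state, mapping):
    """Build an observation dict by applying each mapping item in turn."""
    observation = {}
    for substate, subsubstate, _slice, _func, _args, _nfunc, _nargs in mapping:
        value = state[substate][subsubstate][_slice]
        if _func is not None:   # deterministic transformation
            value = _func(value, state, *(_args or ()))
        if _nfunc is not None:  # noise transformation, applied last
            value = _nfunc(value, state, *(_nargs or ()))
        observation.setdefault(substate, {})[subsubstate] = value
    return observation


def scale(observation, gamestate, *args):
    gain = args[0]
    return [gain * x for x in observation]


def offset(observation, gamestate, *args):
    # Deterministic stand-in for a noise function, so the result is reproducible.
    return [x + 1 for x in observation]


state = {
    "task_state": {"position": [3], "targets": [1, 5]},
    "user_state": {"goal": [5]},
    "assistant_state": {"beliefs": [0.5, 0.5]},
}
mapping = [
    ("task_state", "position", slice(0, 1, 1), None, None, offset, ()),
    ("task_state", "targets", slice(0, 2, 1), None, None, None, None),
    ("user_state", "goal", slice(0, 1, 1), scale, (2,), None, None),
]
observation = apply_mapping(state, mapping)
```

Substates that have no item in the mapping, such as assistant_state here, are simply absent from the observation.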

Typing out a mapping may be a bit laborious and hard to comprehend for collaborators; there are some shortcuts that make defining this engine easier.

Example usage:

obs_eng = RuleObservationEngine(
    deterministic_specification=engine_specification,
    extradeterministicrules=extradeterministicrules,
    extraprobabilisticrules=extraprobabilisticrules,
)

There are three types of rules:

  1. Deterministic rules, which specify at a high level which states are observable or not, e.g.

engine_specification = [
    ("game_info", "all"),
    ("task_state", "targets", slice(0, 1, 1)),
    ("user_state", "all"),
    ("assistant_state", None),
    ("user_action", "all"),
    ("assistant_action", "all"),
]

  2. Extra deterministic rules, which add some specific rules to specific substates

def f(observation, gamestate, *args):
    gain = args[0]
    return gain * observation

f_rule = {("user_state", "goal"): (f, (2,))}
extradeterministicrules = {}
extradeterministicrules.update(f_rule)

  3. Extra probabilistic rules, which are used, for example, to add noise

import random


def g(observation, gamestate, *args):
    # Add uniform noise drawn from [0, 1) to the observation
    return random.random() + observation

g_rule = {("task_state", "position"): (g, ())}
extraprobabilisticrules = {}
extraprobabilisticrules.update(g_rule)
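One way to picture how these three rule types relate to a full mapping is the sketch below, which expands a high-level specification into mapping items. It is a hypothetical reconstruction, not CoopIHC code: expand_specification is an invented helper, "all" is assumed to mean every element of the substate with a full slice, and extra rules are assumed to apply only to elements the specification already selects.

```python
def expand_specification(state, specification, extra_deterministic=None,
                         extra_probabilistic=None):
    """Turn a high-level specification into 7-element mapping items."""
    extra_deterministic = extra_deterministic or {}
    extra_probabilistic = extra_probabilistic or {}
    mapping = []
    for substate, *rest in specification:
        selector = rest[0]
        if selector is None:
            continue  # this substate is not observed at all
        if selector == "all":
            targets = [(name, slice(None)) for name in state[substate]]
        else:
            _slice = rest[1] if len(rest) > 1 else slice(None)
            targets = [(selector, _slice)]
        for subsubstate, _slice in targets:
            key = (substate, subsubstate)
            _func, _args = extra_deterministic.get(key, (None, None))
            _nfunc, _nargs = extra_probabilistic.get(key, (None, None))
            mapping.append((substate, subsubstate, _slice, _func, _args, _nfunc, _nargs))
    return mapping


def double(observation, gamestate, *args):
    return args[0] * observation


state = {
    "game_info": {"turn_index": [0], "round_index": [0]},
    "task_state": {"position": [3], "targets": [1, 5]},
    "user_state": {"goal": [5]},
    "assistant_state": {"beliefs": [0.5]},
}
specification = [
    ("game_info", "all"),
    ("task_state", "targets", slice(0, 1, 1)),
    ("user_state", "all"),
    ("assistant_state", None),
]
mapping = expand_specification(
    state, specification, extra_deterministic={("user_state", "goal"): (double, (2,))}
)
```

The result has the same item layout as the hand-written mappings earlier in this section, which is why the shortcuts and the explicit mapping argument can describe the same engine.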

Warning

This observation engine makes deep copies of the state, so that operations on observations cannot modify the actual states. Deep copying can be slow, though. If that becomes a problem, you could subclass RuleObservationEngine to remove the copies.
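The reason for the copies can be shown with the standard library alone. The sketch uses plain dictionaries rather than CoopIHC State objects, which is a simplifying assumption; the aliasing problem it demonstrates is the same.

```python
import copy

game_state = {"task_state": {"position": [3]}}

# An observation that shares its substate with the game state...
shared = {"task_state": game_state["task_state"]}
# ...and one built from a deep copy of it.
copied = copy.deepcopy(shared)

# Editing the deep copy leaves the game state alone.
copied["task_state"]["position"][0] = 99
untouched = game_state["task_state"]["position"][0]

# Editing the shared observation leaks into the game state.
shared["task_state"]["position"][0] = -1
leaked = game_state["task_state"]["position"][0]
```

Here untouched is still the original value while leaked reflects the edit, which is the kind of silent corruption the deep copies guard against.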

Several rules are predefined. Each one sets the observability of the six substates (Game Info, Task State, User State, Assistant State, User Action, Assistant Action):

Rule Name      Full name
-------------  -----------------------------------
Oracle         oracle_engine_specification
Blind          blind_engine_specification
BaseTask       base_task_engine_specification
BaseUser       base_user_engine_specification
BaseAssistant  base_assistant_engine_specification