coopihc.agents.lqrcontrollers.LQRController.LQRController

class LQRController(role, Q, R, *args, **kwargs)

Bases: coopihc.agents.BaseAgent.BaseAgent

A Linear Quadratic Regulator.

This agent will read a state named ‘x’ from the task, and produce actions according to:

\[\text{action} = -K X\]

where X is the value of the task state ‘x’ and K is the feedback gain, which must be specified externally. For an example, see the source code of coopihc.agents.lqrcontrollers.FHDT_LQRController.FHDT_LQRController.
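Because the gain must come from outside this class, one common way to obtain it is the textbook backward Riccati recursion for a finite-horizon, discrete-time problem. The sketch below assumes linear dynamics x[t+1] = A x[t] + B u[t] and a terminal cost equal to Q. The function name finite_horizon_gains and the matrices A and B are illustrative and not taken from CoopIHC, and FHDT_LQRController may compute its gains differently.

```python
import numpy as np


def finite_horizon_gains(A, B, Q, R, horizon):
    """Return [K_0, ..., K_{horizon-1}] for the control law u_t = -K_t x_t.

    Textbook backward Riccati recursion; hypothetical helper, not CoopIHC API.
    """
    P = Q.copy()  # terminal cost-to-go matrix (assumed equal to Q)
    gains = []
    for _ in range(horizon):
        # K = (R + B^T P B)^{-1} B^T P A, solved without forming the inverse
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
        # Cost-to-go update for the preceding time step
        P = Q + A.T @ P @ A - A.T @ P @ B @ K
        gains.append(K)
    gains.reverse()  # the recursion runs backward in time
    return gains


# Scalar illustration with made-up values
A = np.array([[1.0]])
B = np.array([[1.0]])
Q = np.array([[1.0]])
R = np.array([[1.0]])
gains = finite_horizon_gains(A, B, Q, R, horizon=2)
```

In this scalar case the gain at the last step is 0.5 and the gain one step earlier is 0.6, which shows that finite-horizon gains generally vary with time.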

The controller also outputs observation rewards J, computed from state X and action u:

\[J = -X^T Q X - u^T R u\]
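The action and reward formulas above can be checked numerically. The following is a minimal sketch using NumPy only; the values of K, Q, R and the state are made up for illustration, and the helper names lqr_action and lqr_reward are hypothetical rather than part of CoopIHC.

```python
import numpy as np

# Illustrative values: 2-dimensional state, 1-dimensional action
K = np.array([[1.0, 0.5]])       # feedback gain, shape (1, 2)
Q = np.eye(2)                    # state cost, shape (2, 2)
R = np.array([[0.1]])            # control cost, shape (1, 1)
x = np.array([[2.0], [-1.0]])    # state as a column vector


def lqr_action(K, x):
    """Action from linear state feedback: u = -K x."""
    return -K @ x


def lqr_reward(x, u, Q, R):
    """Reward as the negated quadratic cost: J = -x^T Q x - u^T R u."""
    return (-(x.T @ Q @ x) - (u.T @ R @ u)).item()


u = lqr_action(K, x)             # [[-1.5]]
J = lqr_reward(x, u, Q, R)       # -5.0 - 0.225 = -5.225
```

The reward is never positive when Q and R are positive semidefinite, so maximizing J is equivalent to minimizing the usual quadratic cost.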

Note

This class is meant to be subclassed.

Warning

Tested only on 1d output.

Parameters

role (string) – “user” or “assistant”
Q (numpy.ndarray) – State cost
R (numpy.ndarray) – Control cost

Methods

`finit`	Finish initializing.
`infer`	Infer the agent's internal state
`observe`	Produce an observation
`prepare_action`
`render`	Displays actions selected by the LQR agent.
`reset`	Reset the agent (override this)
`reset_all`	Reset the agent and all its components
`take_action`	Select an action

Attributes

`action`	Last agent action
`assistant`	Connected assistant
`bundle`
`bundle_memory`
`inference_engine`	Agent inference engine
`observation`	Last agent observation
`observation_engine`	Agent observation engine
`parameters`
`policy`	Agent policy
`state`	Agent internal state
`task`	Connected task
`user`	Connected user

property action: Last agent action

property assistant: Connected assistant

finit()

Finish initializing.

Method that specifies what happens when the agent is initialized for the very first time (similar to __init__), but after a bundle has already been initialized. This makes it possible to finish initializing (finit) the agent when doing so requires information from another component.

infer(agent_observation=None, affect_bundle=True)

Infer the agent’s internal state

Infer the new agent state from the agent’s observation. By default, the agent uses its last observation. To bypass this behavior, pass an explicit agent_observation. The affect_bundle flag determines whether the agent’s internal state is actually updated.

Parameters

agent_observation (coopihc.base.State, optional) – last agent observation, defaults to None. If None, gets the observation from the inference engine’s buffer.
affect_bundle (bool, optional) – whether or not the agent’s state is updated with the new inferred state, defaults to True.
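The role of affect_bundle can be pictured with a stand-in class. The sketch below is not CoopIHC code: ToyAgent, its averaging rule and its return value are hypothetical, and only the two keyword arguments mirror the signature above.

```python
class ToyAgent:
    """Hypothetical stand-in for an agent; not part of CoopIHC."""

    def __init__(self):
        self.state = {"x": 0.0}
        self.last_observation = {"x": 4.0}

    def infer(self, agent_observation=None, affect_bundle=True):
        # Fall back to the stored observation when none is passed in.
        if agent_observation is None:
            agent_observation = self.last_observation
        # Illustrative inference rule: move halfway toward the observed value.
        new_state = {"x": 0.5 * (self.state["x"] + agent_observation["x"])}
        # Commit the inferred state only when affect_bundle is True.
        if affect_bundle:
            self.state = new_state
        return new_state


agent = ToyAgent()
preview = agent.infer(affect_bundle=False)    # computed but not stored
state_after_preview = dict(agent.state)       # still {"x": 0.0}
committed = agent.infer({"x": 10.0})          # stored as the new state
```

Calling with affect_bundle=False is useful for "what if" queries, since the inferred state is returned without changing the agent.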

property inference_engine: Agent inference engine

property observation: Last agent observation

property observation_engine: Agent observation engine

observe(game_state=None, affect_bundle=True, game_info={}, task_state={}, user_state={}, assistant_state={}, user_action={}, assistant_action={})

Produce an observation

Produce an observation based on state information by querying the agent’s observation engine. By default, the agent finds the appropriate states to observe. To bypass this behavior, provide the state information yourself: either the full game state or the individual states that are needed. The affect_bundle flag determines whether the observation produced this way becomes the agent’s last observation.

Parameters

game_state (coopihc.base.State, optional) – the full game state as defined in the CoopIHC interaction model, defaults to None.
affect_bundle (bool, optional) – whether or not the observation is stored and becomes the agent’s last observation, defaults to True.
game_info (coopihc.base.State, optional) – game_info substate, see the CoopIHC interaction model, defaults to {}.
task_state (coopihc.base.State, optional) – task_state substate, see the CoopIHC interaction model, defaults to {}.
user_state (coopihc.base.State, optional) – user_state substate, see the CoopIHC interaction model, defaults to {}.
assistant_state (coopihc.base.State, optional) – assistant_state substate, see the CoopIHC interaction model, defaults to {}.
user_action (coopihc.base.State, optional) – user_action substate, see the CoopIHC interaction model, defaults to {}.
assistant_action (coopihc.base.State, optional) – assistant_action substate, see the CoopIHC interaction model, defaults to {}.

property policy: Agent policy

render(*args, **kwargs)

Displays actions selected by the LQR agent.

reset()

Reset the agent. Override this.

Override this method to specify how the components of the agent are reset. By default, the agent already calls the reset method of all 4 components (policy, inference engine, observation engine, state). Added behavior goes here: for example, if you want each game to begin with a specific state value, set it in this method:

# Sets the value of state 'x' to 123
def reset(self):
    self.state["x"][...] = 123

reset_all(dic=None, random=True)

Reset the agent and all its components

In addition to running the agent’s reset(), reset_all() also calls the reset() methods of the state, observation engine, inference engine and policy.

Parameters

dic (dictionary, optional) – reset dictionary, defaults to None. See the reset() method of coopihc.bundle.Bundle for more information.
random (bool, optional) – whether states should be randomly reset, defaults to True. See the reset() method of coopihc.bundle.Bundle for more information.

property state: Agent internal state

take_action(agent_observation=None, agent_state=None, increment_turn=True)

Select an action

Select an action based on agent_observation and agent_state, by querying the agent’s policy. If either of these arguments is not provided, then the argument is deduced from the agent’s internals.

Parameters

agent_observation (coopihc.base.State, optional) – last agent observation, defaults to None. If None, gets the observation from the inference engine’s buffer.
agent_state (coopihc.base.State, optional) – current value of the agent’s internal state, defaults to None. If None, gets the state from itself.
increment_turn (bool, optional) – whether to update the bundle’s turn and round, defaults to True.
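The way missing arguments fall back to the agent's internals can be sketched with a 1d stand-in. ToyLQRAgent below is hypothetical and not CoopIHC code: it keeps its gain in a plain dict, uses a float observation, and models the bundle's turn with a simple counter. Only the keyword arguments follow the signature above.

```python
class ToyLQRAgent:
    """Hypothetical 1d stand-in; not part of CoopIHC."""

    def __init__(self, K):
        self.state = {"K": K}           # internal state holding a scalar gain
        self.last_observation = 2.0     # illustrative stored observation
        self.turn = 0                   # stand-in for the bundle's turn counter

    def take_action(self, agent_observation=None, agent_state=None,
                    increment_turn=True):
        # Deduce missing arguments from the agent's own internals.
        if agent_observation is None:
            agent_observation = self.last_observation
        if agent_state is None:
            agent_state = self.state
        # Stand-in policy: linear state feedback, action = -K * x.
        action = -agent_state["K"] * agent_observation
        if increment_turn:
            self.turn += 1
        return action


agent = ToyLQRAgent(K=0.5)
default_action = agent.take_action()   # uses the stored observation and state
what_if = agent.take_action(agent_observation=4.0, increment_turn=False)
```

Here the first call advances the counter, while the second evaluates the policy on a supplied observation and leaves the counter unchanged.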

property task: Connected task

property user: Connected user