coopihc.policy.LinearFeedback.LinearFeedback

class LinearFeedback(action_state, state_indicator, *args, feedback_gain='identity', noise_function=None, noise_func_args=(), **kwargs)[source]

Bases: coopihc.policy.BasePolicy.BasePolicy

Linear feedback policy. It applies a feedback gain to a selected component of the observation, then passes the resulting action to an optional noise function.

For example with:

  • state_indicator = ('user_state', 'substate1', slice(0,2,1))

  • feedback_gain = -numpy.eye(2)

  • noise_function = f(action, observation, *args)

  • noise_func_args = (1,2)

You will get:

obs = observation['user_state']['substate1'][slice(0,2,1)]
action = -numpy.eye(2) @ obs
return f(action, observation, *(1,2))
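The computation described above can be sketched with plain NumPy. This is only an illustration under stated assumptions: the observation is replaced by a nested dict, and `select` and `noise_function` are hypothetical helpers written for this example, not part of CoopIHC.

```python
import numpy as np

# Stand-in for a CoopIHC observation: a plain nested dict (assumption).
observation = {"user_state": {"substate1": np.array([0.5, -1.0, 3.0])}}

state_indicator = ("user_state", "substate1", slice(0, 2, 1))
feedback_gain = -np.eye(2)


def select(obs, indicator):
    """Hypothetical helper: follow each key or slice of the indicator in turn."""
    for key in indicator:
        obs = obs[key]
    return obs


def noise_function(action, observation, *args):
    """Hypothetical noise function: adds a constant offset equal to sum(args)."""
    return action + sum(args)


feedback = select(observation, state_indicator)          # first two entries
raw_action = feedback_gain @ feedback                     # gain applied to feedback
action = noise_function(raw_action, observation, 1, 2)    # noise_func_args = (1, 2)
```

With these values, `raw_action` is `[-0.5, 1.0]` and `action` is `[2.5, 4.0]`. A real noise function would typically draw a random sample rather than add a fixed offset.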

You can change the feedback gain at runtime with the set_feedback_gain() method.

Parameters
  • action_state (State<coopihc.base.State.State>) – see BasePolicy<coopihc.policy.BasePolicy.BasePolicy>

  • state_indicator (iterable) – specifies which component of the observation is used as feedback information, e.g. ('user_state', 'substate1', slice(0,2,1))

  • feedback_gain (str or numpy.ndarray, optional) – Feedback gain matrix, defaults to “identity”, which creates a negative identity matrix.

  • noise_function (function, optional) – a function that produces a noise sample to add to the generated action, defaults to None

  • noise_func_args (tuple, optional) – arguments to the function above, defaults to ()

Methods

default_value

Apply this decorator to use bundle.game_state as default value to observe if game_state = None

reset

Reset the policy

sample

set_feedback_gain

Set the feedback gain.

Attributes

action

Return the last action.

action_keys

observation

Return the last observation.

parameters

state

unwrapped

property action

Return the last action.

Returns

last action

Return type

StateElement<coopihc.base.StateElement.StateElement>

default_value()

Apply this decorator to use bundle.game_state as default value to observe if game_state = None

property observation

Return the last observation.

Returns

last observation

Return type

State<coopihc.base.State.State>

reset(random=True)

Reset the policy

Parameters

random (bool, optional) – reset the policy, defaults to True. Here in case of subclassing BasePolicy.

set_feedback_gain(gain)[source]

Set the feedback gain. Only needed if the gain must change after initialization; otherwise, set the gain when initializing the policy.

Parameters

gain (numpy.ndarray) – feedback gain matrix
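To show the effect of swapping the gain after construction, here is a minimal sketch using a toy stand-in class. `ToyLinearFeedback` is hypothetical and is not the CoopIHC class: its constructor and `sample` signature are simplified assumptions chosen so the example runs with NumPy alone.

```python
import numpy as np


class ToyLinearFeedback:
    """Illustrative stand-in, NOT coopihc's LinearFeedback."""

    def __init__(self, feedback_gain):
        self.feedback_gain = np.asarray(feedback_gain, dtype=float)

    def set_feedback_gain(self, gain):
        # Replace the gain matrix used by all later calls to sample().
        self.feedback_gain = np.asarray(gain, dtype=float)

    def sample(self, feedback):
        # Simplified: returns only gain @ feedback, with no noise and no reward.
        return self.feedback_gain @ np.asarray(feedback, dtype=float)


policy = ToyLinearFeedback(-np.eye(2))
before = policy.sample([1.0, 2.0])

# Change the gain after initialization.
policy.set_feedback_gain(np.array([[-2.0, 0.0], [0.0, -0.5]]))
after = policy.sample([1.0, 2.0])
```

Here `before` is `[-1.0, -2.0]` and `after` is `[-2.0, -1.0]`: the same feedback produces a different action once the gain is replaced.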