coopihc.policy.LinearFeedback.LinearFeedback
- class LinearFeedback(action_state, state_indicator, *args, feedback_gain='identity', noise_function=None, noise_func_args=(), **kwargs)
- Bases: coopihc.policy.BasePolicy.BasePolicy
- Linear feedback policy: applies a feedback gain to a selected component of the observation, then passes the resulting action to a noise function.
- For example, with:
- state_indicator = ('user_state', 'substate1', slice(0,2,1))
- feedback_gain = -numpy.eye(2)
- noise_function = f(action, observation, *args)
- noise_func_args = (1,2)
- you will get:
- obs = observation['user_state']['substate1'][slice(0,2,1)]
- action = -numpy.eye(2) @ obs
- return f(action, observation, *(1,2))
- You can change the feedback gain online via the set_feedback_gain() method.
- Parameters
- action_state (State<coopihc.base.State.State>) – see BasePolicy<coopihc.policy.BasePolicy.BasePolicy>
- state_indicator (iterable) – specifies which component of the observation is used as feedback information, e.g. ('user_state', 'substate1', slice(0,2,1))
- feedback_gain (str or numpy.ndarray, optional) – Feedback gain matrix, defaults to “identity”, which creates a negative identity matrix. 
- noise_function (function, optional) – a function that produces a noise sample to add to the generated action, defaults to None 
- noise_func_args (tuple, optional) – arguments to the function above, defaults to () 
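The computation described above can be sketched in plain NumPy. This is a minimal illustration, not the library's implementation: the helper linear_feedback, the nested-dict observation, and the deterministic add_offset "noise" function are hypothetical stand-ins, and the only behavior assumed is what the example and parameter list state (the indicator selects a slice of the observation, 'identity' yields a negative identity gain, and the noise function receives the action, the observation, and the extra arguments).

```python
import numpy as np


def linear_feedback(observation, state_indicator, feedback_gain="identity",
                    noise_function=None, noise_func_args=()):
    """Hypothetical stand-alone version of the documented computation."""
    # Follow the indicator step by step: key, key, then slice.
    obs = observation
    for key in state_indicator:
        obs = obs[key]
    obs = np.asarray(obs, dtype=float)

    # 'identity' is documented as producing a negative identity matrix.
    if isinstance(feedback_gain, str) and feedback_gain == "identity":
        gain = -np.eye(obs.shape[0])
    else:
        gain = np.asarray(feedback_gain, dtype=float)

    action = gain @ obs
    if noise_function is None:
        return action
    return noise_function(action, observation, *noise_func_args)


def add_offset(action, observation, offset, scale):
    # Deterministic placeholder for a noise function (assumption for the demo).
    return action + offset * scale


# Plain nested dict standing in for a real observation State (assumption).
observation = {"user_state": {"substate1": np.array([1.0, 2.0, 3.0])}}
indicator = ("user_state", "substate1", slice(0, 2, 1))

action = linear_feedback(observation, indicator)
noisy = linear_feedback(observation, indicator,
                        feedback_gain=-np.eye(2),
                        noise_function=add_offset,
                        noise_func_args=(1, 2))
```

With the default gain, the first two entries of substate1 are negated; in the second call the placeholder function adds offset * scale to each component of the action.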
 
 - Methods
- default_value(): Apply this decorator to use bundle.game_state as default value to observe if game_state = None
- reset(): Reset the policy.
- sample()
- set_feedback_gain(): Set the feedback gain.
 - Attributes
- action: Return the last action.
- action_keys
- observation: Return the last observation.
- parameters
- state
- unwrapped
 - property action
- Return the last action.
- Returns
- last action 
- Return type
- State<coopihc.base.StateElement.StateElement> 
 
 - default_value()
- Apply this decorator to use bundle.game_state as default value to observe if game_state = None 
 - property observation
- Return the last observation.
- Returns
- last observation 
- Return type
- State<coopihc.base.State.State> 
 
 - reset(random=True)
- Reset the policy.
- Parameters
- random (bool, optional) – whether to reset the policy randomly, defaults to True. The parameter is present so that subclasses of BasePolicy can use it.