coopihc.policy.ELLDiscretePolicy.ELLDiscretePolicy
- class ELLDiscretePolicy(action_state, *args, seed=None, **kwargs)
Bases: coopihc.policy.BasePolicy.BasePolicy
Explicitly defined Likelihood Policy. A policy which is described by an explicit probabilistic model.
This policy expects you to bind the actual likelihood model. An example is as follows:
    se = StateElement(
        1, autospace([0, 1, 2, 3, 4, 5, 6]), seed=_seed
    )
    action_state = State(**{"action": se})
    policy = ELLDiscretePolicy(action_state, seed=_seed)

    # Define the likelihood model
    def likelihood_model(self, action, observation, *args, **kwargs):
        if action == 0:
            return 1 / 7
        elif action == 1:
            return 1 / 7 + 0.05
        elif action == 2:
            return 1 / 7 - 0.05
        elif action == 3:
            return 1 / 7 + 0.1
        elif action == 4:
            return 1 / 7 - 0.1
        elif action == 5:
            return 1 / 7 + 0.075
        elif action == 6:
            return 1 / 7 - 0.075
        else:
            raise RuntimeError(
                "warning, unable to compute likelihood. You may have not covered all cases in the likelihood definition"
            )

    # Attach it
    policy.attach_likelihood_function(likelihood_model)
Note
The likelihood model must have the signature of a bound method, i.e. its first argument is self.
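The requirement that the first argument be self can be illustrated with plain Python. The sketch below uses a hypothetical stand-in class (`Policy`) and a hypothetical attribute name (`compute_likelihood`), and binds the function with `types.MethodType`; it is an assumption about the general mechanism, not a description of what BasePolicy's _bind method actually does.

```python
import types


class Policy:
    """Hypothetical stand-in for a policy object; not the CoopIHC class."""

    def __init__(self, n_actions):
        self.n_actions = n_actions


def likelihood_model(self, action, observation, *args, **kwargs):
    # Once bound, `self` is the policy instance, so the model can read
    # the policy's attributes.
    return 1 / self.n_actions


policy = Policy(7)

# Bind the plain function to this instance so it behaves like a method.
# The attribute name `compute_likelihood` is chosen for illustration only.
policy.compute_likelihood = types.MethodType(likelihood_model, policy)

# The caller passes only (action, observation); `self` is filled in.
uniform = policy.compute_likelihood(3, None)
```

A function written without the leading self argument would receive the policy instance in place of its first parameter once bound, which is why the signature matters.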
- Parameters
action_state – see the BasePolicy keyword argument with the same name
seed (int, optional) – seed for the RNG
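To show what a seeded RNG buys you, here is a minimal sketch of drawing actions in proportion to their likelihoods. It assumes that sampling picks each action with probability equal to its likelihood, and it uses the standard-library `random.Random` as a stand-in generator; the helper name `sample_actions` is hypothetical and the RNG used inside the policy may differ.

```python
import random

# Likelihoods from the example above: 1/7 plus a per-action offset.
actions = list(range(7))
offsets = [0.0, 0.05, -0.05, 0.1, -0.1, 0.075, -0.075]
weights = [1 / 7 + d for d in offsets]


def sample_actions(seed, n=5):
    """Hypothetical helper: draw n actions with a generator seeded by `seed`."""
    rng = random.Random(seed)
    return [rng.choices(actions, weights=weights, k=1)[0] for _ in range(n)]


# The same seed reproduces the same sequence of sampled actions.
first = sample_actions(seed=123)
second = sample_actions(seed=123)
```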
Methods
attach_likelihood_function – Bind the likelihood model by calling BasePolicy's _bind method.
default_value – Apply this decorator to use bundle.game_state as default value to observe if game_state = None
forward_summary – Compute the likelihood of each action, given the current observation
reset – Reset the policy
sample
Attributes
action – Return the last action.
action_keys
observation – Return the last observation.
parameters
state
unwrapped
- property action
Return the last action.
- Returns
last action
- Return type
State<coopihc.base.StateElement.StateElement>
- attach_likelihood_function(_function)
Bind the likelihood model by calling BasePolicy’s _bind method.
- Parameters
_function (function) – likelihood model to bind to the policy
- default_value()
Apply this decorator to use bundle.game_state as default value to observe if game_state = None
- forward_summary(observation)
Compute the likelihood of each action, given the current observation
- Parameters
observation (State<coopihc.base.State.State>) – current agent observation
- Returns
[description]
- Return type
[type]
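Since the return value is not documented here, the following is only a sketch of the idea of computing the likelihood of each action: evaluate the likelihood model once per candidate action and check that the results form a probability distribution. The helper `summarize` and the standalone `likelihood_model` (written without self so it runs without CoopIHC) are hypothetical.

```python
import math


def likelihood_model(action, observation=None):
    # Same values as the class example: 1/7 plus a per-action offset.
    offsets = {0: 0.0, 1: 0.05, 2: -0.05, 3: 0.1, 4: -0.1, 5: 0.075, 6: -0.075}
    if action not in offsets:
        raise RuntimeError("likelihood undefined for action {}".format(action))
    return 1 / 7 + offsets[action]


def summarize(actions, observation=None):
    """Hypothetical helper: return the actions and their likelihoods."""
    likelihoods = [likelihood_model(a, observation) for a in actions]
    # A discrete likelihood over all actions should sum to 1.
    if not math.isclose(sum(likelihoods), 1.0, abs_tol=1e-9):
        raise ValueError("likelihoods do not sum to 1")
    return actions, likelihoods


actions, likelihoods = summarize(list(range(7)))
```

The normalization check is a design choice of this sketch: it catches a likelihood definition that misses a case or mistypes an offset before any action is sampled from it.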
- property observation
Return the last observation.
- Returns
last observation
- Return type
State<coopihc.base.State.State>
- reset(random=True)
Reset the policy
- Parameters
random (bool, optional) – reset the policy, defaults to True. Present so that subclasses of BasePolicy can use it.