Rule-based Policies
This document describes in more depth how to implement rule-based agents and what to watch out for when doing so.
Implementing the Default Rule-based Policy
No pre-existing rule-based agent is provided, so the default one has
to be implemented in the file
hearts_gym/policies/rule_based_policy_impl.py before it can be
used. Implementing the compute_action method so that it returns a
single action is enough for the agent to work. The policy is available
under the policy ID RULEBASED_POLICY_ID in configuration.py. This is also
the rule-based policy referred to by any policy_mapping_fn with
rulebased in its name.
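As a rough illustration of what such a method might look like, the following sketch returns the first legal action. It is self-contained and does not import hearts_gym: the class name FirstLegalPolicy, the obs argument, and its action_mask key are assumptions made for this example, so check the actual compute_action signature in rule_based_policy_impl.py before reusing any of it.

```python
from typing import Dict, Sequence


class FirstLegalPolicy:
    """Hypothetical stand-in; the real class lives in rule_based_policy_impl.py."""

    def compute_action(self, obs: Dict[str, Sequence[int]]) -> int:
        # Assumption: obs carries a 0/1 mask with one entry per action.
        mask = obs['action_mask']
        for action, is_legal in enumerate(mask):
            if is_legal:
                # Return exactly one action, as the policy requires.
                return action
        raise ValueError('no legal action available')


policy = FirstLegalPolicy()
chosen = policy.compute_action({'action_mask': [0, 0, 1, 1]})
print(chosen)
```

A real rule would replace the loop with game-specific logic, but the contract stays the same: one call, one action.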
Observed Games
Any rule-based policy has access to an ObservedGame under the game
member variable. The observed game provides several utility functions that
may be useful when implementing the policy.
For each observation, the observed game recreates the game state as
viewed by the observing player (the agent being implemented). Because it
has only limited knowledge of the game and works from the
provided observations, some of its variables have to be treated with care.
For example, the specially labeled variables offset_collected and
offset_penalties are not ordered by player indices but instead by
index offsets (see index offsets in
docs/environment.md for an
explanation of these). The cards on the table, available under
table_cards, are simply ordered by time of placement.
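To make the ordering concrete, the sketch below converts an offset-ordered list back into player-index order. It assumes that offset 0 refers to the observing player and offset k to the player k seats further on, wrapping around; docs/environment.md remains the authoritative definition. The helper name offsets_to_player_order and the sample values are made up for illustration.

```python
from typing import List, TypeVar

T = TypeVar('T')


def offsets_to_player_order(values_by_offset: List[T],
                            observer_index: int) -> List[T]:
    """Reorder a list indexed by offset so it is indexed by player.

    Assumption: offset k belongs to player (observer_index + k) % n.
    """
    num_players = len(values_by_offset)
    by_player: List[T] = list(values_by_offset)
    for offset, value in enumerate(values_by_offset):
        by_player[(observer_index + offset) % num_players] = value
    return by_player


# Hypothetical penalties as seen by the player with index 2.
offset_penalties = [5, 0, 13, 8]
penalties_by_player = offsets_to_player_order(offset_penalties, observer_index=2)
print(penalties_by_player)
```

If the actual offset convention differs, only the index expression inside the loop would need to change.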
Compared to a standard HeartsGame, an ObservedGame also offers
different and fewer variables and functions. For example, as an
observed game only has access to the information one player has,
there is only one hand, plus a list of
unknown_cards whose locations are not known to the observing player.
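One way to picture unknown_cards is as the full deck minus every card the observer has already seen. The sketch below computes that set difference with plain tuples. The (suit, rank) card representation, the standard 52-card deck, and the function name remaining_unknown_cards are assumptions for this example, not a description of how ObservedGame stores its cards.

```python
from itertools import product
from typing import Iterable, List, Set, Tuple

Card = Tuple[str, int]  # hypothetical (suit, rank) representation

# Assumption: a standard 52-card deck with ranks 2..14 (ace high).
FULL_DECK: Set[Card] = set(product('CDHS', range(2, 15)))


def remaining_unknown_cards(hand: Iterable[Card],
                            table_cards: Iterable[Card],
                            collected: Iterable[Card]) -> List[Card]:
    """Return all cards the observer has not yet seen, in sorted order."""
    seen = set(hand) | set(table_cards) | set(collected)
    return sorted(FULL_DECK - seen)


hand = [('H', 12), ('S', 14)]
table_cards = [('C', 2)]
unknown = remaining_unknown_cards(hand, table_cards, collected=[])
print(len(unknown))
```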
Implementing Other Rule-based Policies
Any number of rule-based agents may exist in parallel. To add one, create a new deterministic policy implementation by following these steps:
1. Subclass
   hearts_gym.policies.deterministic_policy_impl.DeterministicPolicyImpl.
2. Implement a compute_action method with the same signature as given by the superclass.
3. Add the newly implemented class to the custom_rulebased_policies dictionary in configuration.py, for example like this:

   ```python
   # Note we are adding the class, not an instance of it.
   custom_rulebased_policies: Dict[str, type] = {
       'my_new_policy': MyNewPolicyImpl,
   }
   ```
4. Create a new policy_mapping_fn that includes the new policy ID by mapping a player index (the agent ID) to it.
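The steps above can be sketched end to end as follows. To keep the example runnable on its own, DeterministicPolicyImpl is replaced by a minimal stand-in, and the obs parameter, the LEARNED_POLICY_ID constant, and the function name my_policy_mapping_fn are assumptions for illustration. The extra positional and keyword arguments on the mapping function are accepted and ignored, since the exact arguments passed to it depend on the framework version in use.

```python
from typing import Any, Dict


class DeterministicPolicyImpl:
    """Stand-in for the hearts_gym superclass so the sketch runs on its own."""

    def compute_action(self, obs: Any) -> int:
        raise NotImplementedError


class MyNewPolicyImpl(DeterministicPolicyImpl):
    def compute_action(self, obs: Any) -> int:
        # Placeholder rule: always pick action 0. A real policy must
        # make sure the returned action is legal.
        return 0


# Register the class itself, not an instance of it.
custom_rulebased_policies: Dict[str, type] = {
    'my_new_policy': MyNewPolicyImpl,
}

LEARNED_POLICY_ID = 'learned'  # hypothetical ID for the remaining players


def my_policy_mapping_fn(agent_id: int, *args: Any, **kwargs: Any) -> str:
    # Player 0 uses the new rule-based policy; all others use the
    # hypothetical learned policy.
    if agent_id == 0:
        return 'my_new_policy'
    return LEARNED_POLICY_ID


print([my_policy_mapping_fn(i) for i in range(4)])
```

Mapping only one seat to the new policy makes it easy to compare the rule-based agent against whatever controls the other players.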