lagom.envs

class lagom.envs.RecordEpisodeStatistics(env, deque_size=100)[source]
reset(**kwargs)[source]

Resets the state of the environment and returns an initial observation.

Returns:the initial observation.
Return type:observation (object)
step(action)[source]

Run one timestep of the environment’s dynamics. When end of episode is reached, you are responsible for calling reset() to reset this environment’s state.

Accepts an action and returns a tuple (observation, reward, done, info).

Parameters:action (object) – an action provided by the agent
Returns:agent’s observation of the current environment reward (float) : amount of reward returned after previous action done (bool): whether the episode has ended, in which case further step() calls will return undefined results info (dict): contains auxiliary diagnostic information (helpful for debugging, and sometimes learning)
Return type:observation (object)
class lagom.envs.NormalizeObservation(env, clip=5.0, constant_moments=None)[source]
class lagom.envs.NormalizeReward(env, clip=10.0, gamma=0.99, constant_var=None)[source]
reset()[source]

Resets the state of the environment and returns an initial observation.

Returns:the initial observation.
Return type:observation (object)
step(action)[source]

Run one timestep of the environment’s dynamics. When end of episode is reached, you are responsible for calling reset() to reset this environment’s state.

Accepts an action and returns a tuple (observation, reward, done, info).

Parameters:action (object) – an action provided by the agent
Returns:agent’s observation of the current environment reward (float) : amount of reward returned after previous action done (bool): whether the episode has ended, in which case further step() calls will return undefined results info (dict): contains auxiliary diagnostic information (helpful for debugging, and sometimes learning)
Return type:observation (object)
class lagom.envs.TimeStepEnv(env)[source]
reset(**kwargs)[source]

Resets the state of the environment and returns an initial observation.

Returns:the initial observation.
Return type:observation (object)
step(action)[source]

Run one timestep of the environment’s dynamics. When end of episode is reached, you are responsible for calling reset() to reset this environment’s state.

Accepts an action and returns a tuple (observation, reward, done, info).

Parameters:action (object) – an action provided by the agent
Returns:agent’s observation of the current environment reward (float) : amount of reward returned after previous action done (bool): whether the episode has ended, in which case further step() calls will return undefined results info (dict): contains auxiliary diagnostic information (helpful for debugging, and sometimes learning)
Return type:observation (object)