lagom.utils: Utils

lagom.utils.set_global_seeds(seed)[source]

Set the seed for generating random numbers.

It sets the following dependencies with the given random seed:

  1. PyTorch
  2. Numpy
  3. Python random
Parameters:seed (int) – a given seed.
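
As a rough illustration of what seeding these three libraries involves, the sketch below resets each generator in turn. The helper name `seed_everything_sketch` is hypothetical and is not part of lagom; NumPy and PyTorch are seeded only if they can be imported.

```python
import random


def seed_everything_sketch(seed):
    """Hypothetical stand-in for set_global_seeds; not lagom's code."""
    # Python's built-in generator.
    random.seed(seed)
    # NumPy's global RandomState, if NumPy is installed.
    try:
        import numpy as np
        np.random.seed(seed)
    except ImportError:
        pass
    # PyTorch's default generator, if PyTorch is installed.
    try:
        import torch
        torch.manual_seed(seed)
    except ImportError:
        pass


seed_everything_sketch(0)
first = random.random()
seed_everything_sketch(0)
second = random.random()  # same value as `first`, since the seed was reset
```

Calling the helper again with the same seed restarts every generator from the same state, which is what makes a run repeatable.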
class lagom.utils.Seeder(init_seed=0)[source]

A random seed generator.

Given an initial seed, the seeder can be called continuously to sample a single or a batch of random seeds.

Note

The seeder creates an independent RandomState to generate random numbers. It does not affect the RandomState in np.random.

Example:

>>> seeder = Seeder(init_seed=0)
>>> seeder(size=5)
[209652396, 398764591, 924231285, 1478610112, 441365315]
__call__(size=1)[source]

Return the sampled random seeds according to the given size.

Parameters:size (int or list) – The size of random seeds to sample.
Returns:a list of sampled random seeds.
Return type:list
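
The independence described in the note can be sketched with a private generator object. This is only an illustration: `SeederSketch` is a hypothetical name, it uses the standard library's `random.Random` as a stand-in for a NumPy RandomState, and it accepts only an integer `size`.

```python
import random


class SeederSketch:
    """Hypothetical illustration; lagom's Seeder uses a NumPy RandomState."""

    def __init__(self, init_seed=0):
        # A private generator, separate from the module-level `random` state.
        self._rng = random.Random(init_seed)

    def __call__(self, size=1):
        # Draw `size` non-negative integers below 2**31.
        return [self._rng.randrange(2**31) for _ in range(size)]


random.seed(123)
expected = random.random()

random.seed(123)
seeds = SeederSketch(init_seed=0)(size=5)  # does not touch the global state
observed = random.random()
```

Because the seeder owns its generator, `observed` equals `expected` even though five seeds were sampled in between.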

Smart data type converter

lagom.utils.tensorify(x, device)[source]
lagom.utils.numpify(x, dtype=None)[source]

Use Python multiprocessing library

class lagom.utils.ProcessWorker(master_conn, worker_conn)[source]

Base class for all workers implemented with Python multiprocessing.Process.

It communicates with the master via a Pipe connection. The worker stands by indefinitely: it waits for a task from the master, works on it, and sends back the result. When it receives a close command, it breaks out of the loop and closes the connection.

work(task_id, task)[source]

Work on the given task and return the result.

Parameters:
  • task_id (int) – the task ID.
  • task (object) – a given task.
Returns:

working result.

Return type:

object
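
The stand-by loop described for the worker can be sketched as follows. The message format (a `(task_id, task)` tuple, and the string `'close'` as the close command) is an assumption made for illustration, as are the names `worker_loop` and `square`. To keep the example short and portable, the loop runs in a thread over a `multiprocessing.Pipe` rather than in a separate Process.

```python
import threading
from multiprocessing import Pipe


def worker_loop(conn, work):
    """Serve tasks from `conn` until a close command arrives."""
    while True:
        message = conn.recv()  # blocks until the master sends something
        if message == 'close':  # hypothetical close command
            conn.close()
            break
        task_id, task = message
        conn.send((task_id, work(task_id, task)))


def square(task_id, task):
    return task ** 2


master_conn, worker_conn = Pipe()
# A thread keeps the sketch short; a real worker would run in its own Process.
thread = threading.Thread(target=worker_loop, args=(worker_conn, square))
thread.start()

results = []
for task_id, task in enumerate([1, 2, 3]):
    master_conn.send((task_id, task))
    results.append(master_conn.recv())

master_conn.send('close')
thread.join()
master_conn.close()
```

Returning the task ID alongside each result lets the master match results to tasks regardless of arrival order.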

class lagom.utils.ProcessMaster(worker_class, num_worker)[source]

Base class for all masters implemented with Python multiprocessing.Process.

It creates a number of workers, each in its own Process. The master communicates with each worker through an independent Pipe connection. The master assigns tasks to the workers. When all tasks are done, it stops all workers and terminates all processes.

Note

If there are more tasks than workers, the tasks are split into chunks. If there are fewer tasks than workers, the number of workers is reduced to the number of tasks.
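
The rule in the note can be made concrete with a small helper. This is one possible chunking scheme, not necessarily the one lagom uses, and `plan_chunks` is a hypothetical name.

```python
def plan_chunks(tasks, num_worker):
    """Hypothetical helper: split tasks into per-round chunks."""
    # Never use more workers than there are tasks.
    num_worker = min(num_worker, len(tasks))
    if num_worker == 0:
        return 0, []
    # Each chunk holds at most one task per worker.
    chunks = [tasks[i:i + num_worker] for i in range(0, len(tasks), num_worker)]
    return num_worker, chunks


many = plan_chunks(list(range(7)), num_worker=3)
# many == (3, [[0, 1, 2], [3, 4, 5], [6]])
few = plan_chunks(['a', 'b'], num_worker=4)
# few == (2, [['a', 'b']])
```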

assign_tasks(tasks)[source]

Assign a given list of tasks to the workers and return the received results.

Parameters:tasks (list) – a list of tasks
Returns:received results
Return type:object
close()[source]

Defines everything required after all work is finished, e.g. stopping all workers and cleaning up.

make_tasks()[source]

Returns a list of tasks.

Returns:a list of tasks
Return type:list

Serialization

lagom.utils.pickle_dump(obj, f, ext='.pkl')[source]

Serialize an object using pickling and save in a file.

Note

It uses cloudpickle instead of pickle to support lambda functions and multiprocessing. By default, the highest protocol is used.

Note

Except for pure array objects, using np.save is not recommended because it is often much slower.

Parameters:
  • obj (object) – a serializable object
  • f (str/Path) – file path
  • ext (str, optional) – file extension. Default: .pkl
lagom.utils.pickle_load(f)[source]

Read pickled data from a file.

Parameters:f (str/Path) – file path
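
A dump and load round trip might look like the sketch below. It uses the standard `pickle` module as a stand-in for cloudpickle so that it runs without extra dependencies, and it assumes that `ext` is appended to `f` to form the file name. The names `dump_sketch` and `load_sketch` are hypothetical.

```python
import pickle
import tempfile
from pathlib import Path


def dump_sketch(obj, f, ext='.pkl'):
    """Hypothetical stand-in for pickle_dump using the standard pickle module."""
    path = Path(str(f) + ext)  # assumption: `ext` is appended to `f`
    with open(path, 'wb') as file:
        pickle.dump(obj, file, protocol=pickle.HIGHEST_PROTOCOL)
    return path


def load_sketch(f):
    """Hypothetical stand-in for pickle_load."""
    with open(f, 'rb') as file:
        return pickle.load(file)


with tempfile.TemporaryDirectory() as tmp:
    saved = dump_sketch({'returns': [1.0, 2.5], 'seed': 0}, Path(tmp) / 'metrics')
    restored = load_sketch(saved)
```

Unlike cloudpickle, the standard `pickle` module generally cannot serialize lambda functions, which is the limitation the first note refers to.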
lagom.utils.yaml_dump(obj, f, ext='.yml')[source]

Serialize a Python object using YAML and save in a file.

Note

YAML is recommended for small dictionaries, e.g. configuration settings, because it is highly human-readable. For saving experiment metrics, it is better to use pickle_dump().

Note

Except for pure array objects, using np.load is not recommended because it is often much slower.

Parameters:
  • obj (object) – a serializable object
  • f (str/Path) – file path
  • ext (str, optional) – file extension. Default: .yml
lagom.utils.yaml_load(f)[source]

Read the data from a YAML file.

Parameters:f (str/Path) – file path
class lagom.utils.CloudpickleWrapper(x)[source]

Uses cloudpickle to serialize contents (multiprocessing uses pickle by default).

This is useful when passing a lambda definition through Process arguments.

Misc

lagom.utils.color_str(string, color, bold=False)[source]

Returns a stylized string with coloring and bolding for printing.

Example:

>>> print(color_str('lagom', 'green', bold=True))
Parameters:
  • string (str) – input string
  • color (str) – color name
  • bold (bool, optional) – if True, then the string is bolded. Default: False
Returns:

stylized string

Return type:

str
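
Terminal coloring of this kind is commonly done with ANSI escape sequences. The sketch below shows the idea under that assumption; `color_str_sketch` and the `ANSI_COLORS` table are hypothetical and cover only a few colors.

```python
ANSI_COLORS = {'red': 31, 'green': 32, 'yellow': 33, 'blue': 34}  # standard SGR codes


def color_str_sketch(string, color, bold=False):
    """Hypothetical illustration of ANSI styling; not lagom's code."""
    codes = [str(ANSI_COLORS[color])]
    if bold:
        codes.append('1')  # SGR code 1 selects bold
    # ESC[<codes>m starts the style, ESC[0m resets it.
    return '\x1b[' + ';'.join(codes) + 'm' + string + '\x1b[0m'


styled = color_str_sketch('lagom', 'green', bold=True)
print(styled)
```

The trailing reset sequence keeps the styling from leaking into whatever is printed next.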

lagom.utils.timed(color='green', bold=False)[source]

A decorator that prints the total time taken to execute a body function.

Parameters:
  • color (str, optional) – color name. Default: ‘green’
  • bold (bool, optional) – if True, then the printed message is bolded. Default: False
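
A timing decorator in this spirit can be sketched with the standard library alone. The name `timed_sketch` is hypothetical, the message format is an assumption, and the `color` and `bold` options are left out for brevity.

```python
import functools
import time


def timed_sketch(func):
    """Hypothetical timing decorator; not lagom's implementation."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return func(*args, **kwargs)
        finally:
            # Report elapsed wall-clock time even if `func` raises.
            elapsed = time.perf_counter() - start
            print(f'{func.__name__} took {elapsed:.4f} s')
    return wrapper


@timed_sketch
def total(n):
    return sum(range(n))


result = total(1000)
```

`functools.wraps` preserves the name and docstring of the wrapped function, so the decorated function still reports itself as `total`.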
lagom.utils.timeit(_func=None, *, color='green', bold=False)[source]
lagom.utils.ask_yes_or_no(msg)[source]

Ask user to enter yes or no to a given message.

Parameters:msg (str) – a message
class lagom.utils.IntervalConditioner(interval, mode)[source]
class lagom.utils.NConditioner(max_n, num_conditions, mode)[source]