lagom.utils: Utils
-
lagom.utils.set_global_seeds(seed)
Set the seed for generating random numbers.
It sets the following dependencies with the given random seed:
- PyTorch
- Numpy
- Python random
Parameters: seed (int) – a given seed.
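As a rough illustration of what seeding these three libraries amounts to, here is a minimal sketch. The name `seed_everything_sketch` is hypothetical and this is not lagom's own implementation; NumPy and PyTorch are seeded only if they happen to be installed.

```python
import random


def seed_everything_sketch(seed):
    """Hypothetical stand-in for set_global_seeds; not lagom's own code."""
    # Python's built-in random module
    random.seed(seed)
    # NumPy's global RandomState, if NumPy is available
    try:
        import numpy as np
        np.random.seed(seed)
    except ImportError:
        pass
    # PyTorch's default generator, if PyTorch is available
    try:
        import torch
        torch.manual_seed(seed)
    except ImportError:
        pass


seed_everything_sketch(0)
first = [random.random() for _ in range(3)]
seed_everything_sketch(0)
second = [random.random() for _ in range(3)]
print(first == second)  # True: same seed, same sequence
```

Reseeding with the same value replays the same sequence, which is the property that makes experiments repeatable.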
-
class lagom.utils.Seeder(init_seed=0)
A random seed generator.
Given an initial seed, the seeder can be called repeatedly to sample either a single random seed or a batch of random seeds.
Note
The seeder creates an independent RandomState to generate random numbers. It does not affect the RandomState in np.random.
Example:
>>> seeder = Seeder(init_seed=0)
>>> seeder(size=5)
[209652396, 398764591, 924231285, 1478610112, 441365315]
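The idea of an independent generator can be sketched with the standard library alone. Here `random.Random` is used as a stand-in for NumPy's RandomState, and the class name `SeederSketch` and the seed upper bound are assumptions of this sketch, so the sampled values will not match the ones shown in the example above.

```python
import random


class SeederSketch:
    """Hypothetical seeder built on its own random.Random instance."""

    def __init__(self, init_seed=0):
        # A private generator: sampling from it leaves the global
        # `random` module state untouched.
        self.rng = random.Random(init_seed)
        self.max_seed = 2**31 - 1  # assumed upper bound for sampled seeds

    def __call__(self, size=1):
        # Return a list of `size` random integer seeds.
        return [self.rng.randrange(self.max_seed) for _ in range(size)]


seeder = SeederSketch(init_seed=0)
batch = seeder(size=5)
print(len(batch))  # 5
```

Two seeders created with the same `init_seed` produce the same stream of seeds, which is what lets a single initial seed reproduce a whole batch of runs.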
Smart data type converter
Use Python multiprocessing library
-
class lagom.utils.ProcessWorker(master_conn, worker_conn)
Base class for all workers implemented with Python multiprocessing.Process.
It communicates with the master via a Pipe connection. The worker stands by indefinitely, waiting for a task from the master, working on it, and sending back the result. When it receives a close command, it breaks out of the loop and closes the connection.
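The command loop described above might look roughly like the following sketch. The `(command, data)` message format and the name `worker_loop` are assumptions, not lagom's actual protocol, and the loop runs in a thread here purely to keep the example self-contained, whereas ProcessWorker runs in a multiprocessing.Process.

```python
import threading
from multiprocessing import Pipe


def worker_loop(conn, work):
    """Serve tasks from the master until a 'close' command arrives."""
    while True:
        command, data = conn.recv()  # block until the master sends something
        if command == "task":
            conn.send(work(data))    # do the work and send the result back
        elif command == "close":
            conn.close()             # close our end, then leave the loop
            break
        else:
            raise ValueError(f"unknown command: {command!r}")


master_conn, worker_conn = Pipe()
worker = threading.Thread(target=worker_loop, args=(worker_conn, lambda x: x * x))
worker.start()

results = []
for task in [1, 2, 3]:
    master_conn.send(("task", task))
    results.append(master_conn.recv())

master_conn.send(("close", None))  # ask the worker to shut down
worker.join()
master_conn.close()
print(results)  # [1, 4, 9]
```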
-
class lagom.utils.ProcessMaster(worker_class, num_worker)
Base class for all masters implemented with Python multiprocessing.Process.
It creates a number of workers, each in its own Process. The master communicates with each worker through an independent Pipe connection. The master assigns tasks to the workers. When all tasks are done, it stops all workers and terminates all processes.
Note
If there are more tasks than workers, the tasks are split into chunks. If there are fewer tasks than workers, the number of workers is reduced to the number of tasks.
-
assign_tasks(tasks)
Assign a given list of tasks to the workers and return the received results.
Parameters: tasks (list) – a list of tasks
Returns: received results
Return type: object
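The note on ProcessMaster about tasks and workers can be made concrete with a small sketch. The helper `plan_batches` is hypothetical, and the chunking rule shown (contiguous chunks of at most `num_worker` tasks) is an assumption; lagom may split tasks differently.

```python
def plan_batches(tasks, num_worker):
    """Return (effective number of workers, list of task chunks)."""
    # Fewer tasks than workers: shrink the pool to the number of tasks.
    num_worker = min(num_worker, len(tasks))
    if num_worker == 0:
        return 0, []
    # More tasks than workers: split into chunks of at most num_worker tasks.
    chunks = [tasks[i:i + num_worker] for i in range(0, len(tasks), num_worker)]
    return num_worker, chunks


print(plan_batches([0, 1, 2, 3, 4], num_worker=2))  # (2, [[0, 1], [2, 3], [4]])
print(plan_batches(["a", "b"], num_worker=4))       # (2, [['a', 'b']])
```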
-
Serialization
-
lagom.utils.pickle_dump(obj, f, ext='.pkl')
Serialize an object using pickling and save it in a file.
Note
It uses cloudpickle instead of pickle to support lambda functions and multiprocessing. By default, the highest protocol is used.
Note
Except for pure array objects, it is not recommended to use np.save because it is often much slower.
Parameters:
- obj (object) – a serializable object
- f (str/Path) – file path
- ext (str, optional) – file extension. Default: .pkl
-
lagom.utils.pickle_load(f)
Read pickled data from a file.
Parameters: f (str/Path) – file path
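A dump-and-load round trip might look like the sketch below. The `_sketch` functions are hypothetical; the standard-library pickle is used as a stand-in for cloudpickle, so unlike the documented function it cannot serialize lambdas. Appending `ext` to the path and returning the final path are assumptions of this sketch.

```python
import pickle
import tempfile
from pathlib import Path


def pickle_dump_sketch(obj, f, ext=".pkl"):
    """Hypothetical stand-in: append ext to the path, then pickle obj."""
    path = Path(str(f) + ext)
    with open(path, "wb") as file:
        # Use the highest protocol, as the documentation describes.
        pickle.dump(obj, file, protocol=pickle.HIGHEST_PROTOCOL)
    return path  # returned only for convenience in this sketch


def pickle_load_sketch(f):
    """Read a pickled object back from the given file path."""
    with open(f, "rb") as file:
        return pickle.load(file)


with tempfile.TemporaryDirectory() as tmp:
    metrics = {"returns": [1.0, 2.5], "seed": 0}
    saved = pickle_dump_sketch(metrics, Path(tmp) / "metrics")
    restored = pickle_load_sketch(saved)

print(saved.name)           # metrics.pkl
print(restored == metrics)  # True
```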
-
lagom.utils.yaml_dump(obj, f, ext='.yml')
Serialize a Python object using YAML and save it in a file.
Note
YAML is recommended for small dictionaries, e.g. configuration settings, because it is highly human-readable. For saving experiment metrics, it is better to use pickle_dump().
Note
Except for pure array objects, it is not recommended to use np.load because it is often much slower.
Parameters:
- obj (object) – a serializable object
- f (str/Path) – file path
- ext (str, optional) – file extension. Default: .yml
Misc
-
lagom.utils.color_str(string, color, bold=False)
Return a stylized string with coloring and bolding for printing.
Example:
>>> print(color_str('lagom', 'green', bold=True))
Parameters:
- string (str) – input string
- color (str) – color name
- bold (bool, optional) – if True, then the string is bolded. Default: False
Returns: stylized string
Return type: out
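One common way to produce such a stylized string is with ANSI escape sequences, sketched below. The function `color_str_sketch` and its color table are assumptions that use the standard ANSI foreground codes; the color names and codes lagom actually accepts may differ.

```python
# Standard ANSI foreground color codes; this mapping is an assumption of the sketch.
ANSI_COLORS = {"red": 31, "green": 32, "yellow": 33, "blue": 34, "magenta": 35, "cyan": 36}


def color_str_sketch(string, color, bold=False):
    """Wrap `string` in ANSI escape codes for the given color and boldness."""
    codes = [str(ANSI_COLORS[color])]
    if bold:
        codes.append("1")  # SGR code 1 selects bold
    # "\x1b[...m" starts the style and "\x1b[0m" resets it afterwards.
    return f"\x1b[{';'.join(codes)}m{string}\x1b[0m"


print(color_str_sketch("lagom", "green", bold=True))
print(repr(color_str_sketch("lagom", "green", bold=True)))  # '\x1b[32;1mlagom\x1b[0m'
```

The styling only shows up in terminals that interpret ANSI codes; elsewhere the escape characters are printed literally.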
-
lagom.utils.timed(color='green', bold=False)
A decorator that prints the total time taken to execute the decorated function.
Parameters:
- color (str, optional) – color name. Default: 'green'
- bold (bool, optional) – if True, then the printed message is bolded. Default: False
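A timing decorator that takes arguments, as `timed` does, can be sketched as follows. The name `timed_sketch` and its `label` parameter are hypothetical, and the color and bold styling is left out; the exact message lagom prints is not reproduced here.

```python
import functools
import time


def timed_sketch(label="Total time"):
    """Hypothetical decorator factory: time each call and print the duration."""
    def decorator(func):
        @functools.wraps(func)  # keep the wrapped function's name and docstring
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            result = func(*args, **kwargs)
            elapsed = time.perf_counter() - start
            print(f"{label}: {elapsed:.3f} s")
            return result  # pass the original return value through
        return wrapper
    return decorator


@timed_sketch()
def slow_square(x):
    time.sleep(0.01)
    return x * x


print(slow_square(4))  # prints the elapsed time, then 16
```

Because the decorator takes arguments, it is applied with parentheses even when the defaults are used.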