lagom.transform: Transformations¶

class lagom.transform.Describe(count: int, mean: float, std: float, min: float, max: float, repr_indent: int = 0, repr_prefix: str = None)[source]¶

lagom.transform.describe(x, axis=-1, repr_indent=0, repr_prefix=None)[source]¶

lagom.transform.interp_curves(x, y)[source]¶

Performs piecewise linear interpolation over a discrete set of data points and generates new \(x-y\) values from the interpolated line.

It receives a batch of curves with \(x-y\) values. A global min and max of the x-axis are calculated over the entire batch, and new x-axis values are generated and passed to the interpolation function. All interpolated curves share the same x-axis values.

Note

This is useful for plotting a set of curves with uncertainty bands where each curve has data points at different \(x\) values. To generate such a plot, we need the set of \(y\) values at consistent \(x\) values.

Warning

Piecewise linear interpolation often leads to more realistic uncertainty bands. Do not use polynomial interpolation, because the resulting curve can be extremely misleading.

Example:

>>> import matplotlib.pyplot as plt

>>> x1 = [4, 5, 7, 13, 20]
>>> y1 = [0.25, 0.22, 0.53, 0.37, 0.55]
>>> x2 = [2, 4, 6, 7, 9, 11, 15]
>>> y2 = [0.03, 0.12, 0.4, 0.2, 0.18, 0.32, 0.39]

>>> plt.scatter(x1, y1, c='blue')
>>> plt.scatter(x2, y2, c='red')

>>> new_x, new_y = interp_curves([x1, x2], [y1, y2], num_point=100)
>>> plt.plot(new_x[0], new_y[0], 'blue')
>>> plt.plot(new_x[1], new_y[1], 'red')

Parameters:

x (list) – a batch of x values.
y (list) – a batch of y values.
num_point (int) – number of points to generate from the interpolated line.

Returns:

out_x (list) – interpolated x values (shared for the batch of curves)
out_y (list) – interpolated y values
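
The procedure described above can be sketched with np.interp. This is a minimal illustration under stated assumptions, not the library's implementation: the name interp_curves_sketch is hypothetical, a single shared x array is returned, and each curve's x values are assumed to be increasing.

```python
import numpy as np


def interp_curves_sketch(x, y, num_point):
    """Hypothetical sketch: resample a batch of curves on one shared x grid."""
    # Global range of the x-axis over the entire batch
    x_min = min(min(curve_x) for curve_x in x)
    x_max = max(max(curve_x) for curve_x in x)
    # Shared x values used for every curve
    new_x = np.linspace(x_min, x_max, num=num_point)
    # np.interp is piecewise linear; outside a curve's own x range it
    # holds the nearest endpoint value constant
    new_y = [np.interp(new_x, curve_x, curve_y) for curve_x, curve_y in zip(x, y)]
    return new_x, new_y


new_x, new_y = interp_curves_sketch([[0, 2], [1, 3]], [[0, 2], [1, 3]], num_point=4)
```

Because every curve is evaluated on the same grid, the resulting y values can be stacked and reduced along the batch axis (for example mean and standard deviation) to draw an uncertainty band.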

lagom.transform.geometric_cumsum(alpha, x)[source]¶

Calculate future accumulated sums for each element in a list with an exponential factor.

Given input data \(x_1, \dots, x_n\) and exponential factor \(\alpha\in [0, 1]\), it returns an array \(y\) of the same length, where each element is calculated as follows:

\[y_i = x_i + \alpha x_{i+1} + \alpha^2 x_{i+2} + \dots + \alpha^{n-i-1}x_{n-1} + \alpha^{n-i}x_{n}\]

Note

For optimal runtime speed, it uses scipy.signal.lfilter.

Example

>>> geometric_cumsum(0.1, [1, 2, 3, 4])
array([[1.234, 2.34 , 3.4  , 4.   ]])

Parameters:

alpha (float) – exponential factor between zero and one.
x (list) – input data
Returns:	out – calculated data
Return type:	ndarray
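
The formula above satisfies the recursion \(y_i = x_i + \alpha y_{i+1}\), which is a first-order linear filter run backwards over the input. The following sketch shows one way to express that with scipy.signal.lfilter and checks it against a direct loop. The function names are hypothetical, and unlike the example above this sketch returns a one-dimensional array.

```python
import numpy as np
from scipy.signal import lfilter


def geometric_cumsum_sketch(alpha, x):
    """Hypothetical sketch: y_i = x_i + alpha * y_{i+1}, computed with lfilter."""
    x = np.asarray(x, dtype=np.float64)
    # lfilter([1], [1, -alpha], v) computes out[n] = v[n] + alpha * out[n-1].
    # Reversing the input turns this into a backward (future) accumulation.
    return lfilter([1.0], [1.0, -alpha], x[::-1])[::-1]


def geometric_cumsum_loop(alpha, x):
    """Reference implementation with an explicit backward loop."""
    out = np.zeros(len(x), dtype=np.float64)
    running = 0.0
    for i in reversed(range(len(x))):
        running = x[i] + alpha * running
        out[i] = running
    return out


fast = geometric_cumsum_sketch(0.1, [1, 2, 3, 4])
slow = geometric_cumsum_loop(0.1, [1, 2, 3, 4])
```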

lagom.transform.explained_variance(y_true, y_pred, **kwargs)[source]¶

Computes the explained variance regression score.

It measures the fraction of the variance in the ground truth that the prediction explains.

Let \(\hat{y}\) be the predicted output and let \(y\) be the ground truth output. Then the explained variance is estimated as follows:

\[\text{EV}(y, \hat{y}) = 1 - \frac{\text{Var}(y - \hat{y})}{\text{Var}(y)}\]

The best score is \(1.0\), and lower values are worse. A detailed interpretation is as follows:

\(\text{EV} = 1\): perfect prediction
\(\text{EV} = 0\): might as well have predicted zero
\(\text{EV} < 0\): worse than just predicting zero

Note

It calls the corresponding function from scikit-learn, which handles edge cases better, e.g. zero division and batch size.

Example

>>> explained_variance(y_true=[3, -0.5, 2, 7], y_pred=[2.5, 0.0, 2, 8])
0.9571734475374732

>>> explained_variance(y_true=[[3, -0.5, 2, 7]], y_pred=[[2.5, 0.0, 2, 8]])
0.9571734475374732

>>> explained_variance(y_true=[[0.5, 1], [-1, 1], [7, -6]], y_pred=[[0, 2], [-1, 2], [8, -5]])
0.9838709677419355

>>> explained_variance(y_true=[[0.5, 1], [-1, 10], [7, -6]], y_pred=[[0, 2], [-1, 0.00005], [8, -5]])
0.6704023148857179

Parameters:

y_true (list) – ground truth output
y_pred (list) – predicted output
**kwargs – keyword arguments to specify the estimation of the explained variance.
Returns:	out – estimated explained variance
Return type:	float
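
For a one-dimensional input, the definition above can be evaluated directly with NumPy. This is only a sketch of the formula: the name explained_variance_sketch is hypothetical, and it does not handle the multi-output inputs or the zero-variance edge cases that the scikit-learn function covers.

```python
import numpy as np


def explained_variance_sketch(y_true, y_pred):
    """Hypothetical sketch: EV = 1 - Var(y - y_hat) / Var(y) for 1-D inputs."""
    y_true = np.asarray(y_true, dtype=np.float64)
    y_pred = np.asarray(y_pred, dtype=np.float64)
    # np.var uses the population variance (ddof=0) in both terms,
    # so the normalization cancels in the ratio
    return 1.0 - np.var(y_true - y_pred) / np.var(y_true)


ev = explained_variance_sketch([3, -0.5, 2, 7], [2.5, 0.0, 2, 8])
```

For this input the residual variance is 0.3125 and the ground-truth variance is 7.296875, which gives the same value as the first example above.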

class lagom.transform.LinearSchedule(initial, final, N, start=0)[source]¶

A linear schedule from an initial to a final value over a given number of timesteps; afterwards the value stays fixed at the final value.

Note

This can be useful for the following use cases:

Decay of epsilon-greedy: initialized with \(1.0\) and kept there for start time steps, then linearly decayed to final over N time steps, and fixed at final afterwards.
Beta parameter in prioritized experience replay.

Note that for learning rate decay, one should use PyTorch optim.lr_scheduler instead.

Example

>>> scheduler = LinearSchedule(initial=1.0, final=0.1, N=3, start=0)
>>> [scheduler(i) for i in range(6)]
[1.0, 0.7, 0.4, 0.1, 0.1, 0.1]

Parameters:

initial (float) – initial value
final (float) – final value
N (int) – number of scheduling timesteps
start (int, optional) – the timestep to start the scheduling. Default: 0

__call__(x)[source]¶

Returns the current value of the scheduling.

Parameters:	x (int) – the current timestep.
Returns:	out – current value of the scheduling.
Return type:	float
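
The schedule described for this class can be written as a single function of the timestep. The sketch below is an assumption about the behaviour based on the description and the example above, not the library's code; linear_schedule_sketch is a hypothetical name.

```python
def linear_schedule_sketch(x, initial, final, N, start=0):
    """Hypothetical sketch: hold, then interpolate linearly, then hold again."""
    if x <= start:
        # Before the schedule starts, stay at the initial value
        return initial
    # Fraction of the N scheduling timesteps that has elapsed, capped at 1
    fraction = min((x - start) / N, 1.0)
    return initial + fraction * (final - initial)


values = [linear_schedule_sketch(i, initial=1.0, final=0.1, N=3) for i in range(6)]
delayed = [linear_schedule_sketch(i, initial=1.0, final=0.0, N=2, start=2) for i in range(6)]
```

Up to floating-point rounding, values reproduces the sequence in the example above, and delayed stays at 1.0 for the first two timesteps before decaying to 0.0.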

lagom.transform.rank_transform(x, centered=True)[source]¶

Rank transformation of a vector of values. The ranks have the same dimensionality as the vector. Each element of the ranks gives the position of the corresponding input element in ascending sorted order, i.e. ranks[i] = k means that the i-th element of the input is the \(k\)-th smallest value.

Rank transformation reduces sensitivity to outliers. For example, in OpenAI ES the gradient computation involves the fitness values of the population; outliers (excessively large fitness values) would otherwise affect the gradient too much.

Note that a centered rank transformation to the range [-0.5, 0.5] is supported by an option.

Example

>>> rank_transform([3, 14, 1], centered=True)
array([ 0. ,  0.5, -0.5])

>>> rank_transform([3, 14, 1], centered=False)
array([1, 2, 0])

Parameters:

x (list/ndarray) – a vector of values.
centered (bool, optional) – if `True`, then center the rank transformation to \([-0.5, 0.5]\). Default: `True`
Returns:	ranks – ranks of input data
Return type:	ndarray
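
One common way to obtain such ranks is to apply argsort twice. The sketch below illustrates that idea and reproduces the two examples above; rank_transform_sketch is a hypothetical name, and the centering step assumes the input has at least two elements.

```python
import numpy as np


def rank_transform_sketch(x, centered=True):
    """Hypothetical sketch: 0-based ranks via double argsort, optionally centered."""
    x = np.asarray(x)
    assert x.ndim == 1, "expects a vector"
    # argsort gives the indices that sort x; argsort of that result gives,
    # for each element, its position in the sorted order (its rank)
    ranks = np.argsort(np.argsort(x))
    if centered:
        # Scale ranks from [0, n - 1] to [0, 1], then shift to [-0.5, 0.5]
        ranks = ranks / (x.size - 1) - 0.5
    return ranks


centered_ranks = rank_transform_sketch([3, 14, 1], centered=True)
plain_ranks = rank_transform_sketch([3, 14, 1], centered=False)
```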

class lagom.transform.PolyakAverage(alpha)[source]¶

Keep a running average of a quantity via Polyak averaging.

Compared with a plain mean estimate, it is more sensitive to recent changes.

Parameters:	alpha (float) – factor to control the sensitivity to recent changes, in the range [0, 1]. Zero is most sensitive to recent change.

__call__(x)[source]¶

Update the estimate.

Parameters:	x (object) – additional data to update the estimation of running average.

get_current()[source]¶: Return the current running average.
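
The update rule is not spelled out here. A common form of Polyak averaging, consistent with the statement that an alpha of zero is most sensitive to recent changes, is \(\bar{x} \leftarrow \alpha \bar{x} + (1 - \alpha) x\). The sketch below assumes that rule and assumes the first observation initializes the average; PolyakAverageSketch is a hypothetical class, not the library's implementation.

```python
class PolyakAverageSketch:
    """Hypothetical sketch of a running average with mixing factor alpha."""

    def __init__(self, alpha):
        assert 0.0 <= alpha <= 1.0
        self.alpha = alpha
        self.current = None

    def __call__(self, x):
        if self.current is None:
            # Assumption: the first observation initializes the average
            self.current = x
        else:
            # Assumed update: keep a fraction alpha of the old estimate
            self.current = self.alpha * self.current + (1.0 - self.alpha) * x
        return self.current

    def get_current(self):
        return self.current


avg = PolyakAverageSketch(alpha=0.5)
avg(1.0)
avg(3.0)
```

Under this assumed rule, alpha = 0 makes the average equal to the latest observation, while alpha = 1 keeps it at the first observation.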

class lagom.transform.RunningMeanVar(shape)[source]¶

Estimates sample mean and variance by using Chan’s method.

It supports both scalar and multi-dimensional data; however, the input is expected to be batched. The first dimension is always treated as the batch dimension.

Note

For better precision, we handle the data with np.float64.

Warning

To use the estimated moments for standardization, keep the precision at np.float64 and compute \(\frac{x - \mu}{\sqrt{\sigma^2 + 10^{-8}}}\).

Example

>>> f = RunningMeanVar(shape=())
>>> f([1, 2])
>>> f([3])
>>> f([4])
>>> f.mean
2.499937501562461
>>> f.var
1.2501499923440393

__call__(x)[source]¶

Update the mean and variance given an additional batch of data.

Parameters:	x (object) – additional batched data.

n¶: Returns the total number of samples so far.
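
Chan's method merges the moments of two groups of samples without revisiting the old data. The sketch below shows the merge step for the population variance; merge_moments is a hypothetical helper, and it starts from zero samples, so its results need not match the exact digits shown in the example above.

```python
import numpy as np


def merge_moments(mean_a, var_a, n_a, batch):
    """Hypothetical sketch: merge running (mean, var, n) with a new batch."""
    batch = np.asarray(batch, dtype=np.float64)
    # Moments of the new batch along the batch (first) dimension
    mean_b = batch.mean(axis=0)
    var_b = batch.var(axis=0)
    n_b = batch.shape[0]

    n = n_a + n_b
    delta = mean_b - mean_a
    # Combined mean is the count-weighted average of both means
    mean = mean_a + delta * n_b / n
    # Combined sum of squared deviations, including the between-group term
    m2 = var_a * n_a + var_b * n_b + delta**2 * n_a * n_b / n
    return mean, m2 / n, n


mean, var, n = 0.0, 0.0, 0
for batch in ([1, 2], [3], [4]):
    mean, var, n = merge_moments(mean, var, n, batch)
```

After the three updates the sketch agrees with np.mean and np.var computed over all four samples at once.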

lagom.transform.smooth_filter(x, window_length, polyorder, **kwargs)[source]¶

Smooth a sequence of noisy data points by applying a Savitzky–Golay filter. It uses least squares to fit a polynomial within a small sliding window and uses this polynomial to estimate the point at the center of the window.

This is useful when a curve is highly noisy: smoothing it out leads to better visualization quality.

Example

>>> import matplotlib.pyplot as plt
>>> import numpy as np

>>> x = np.linspace(0, 4*2*np.pi, num=100)
>>> y = x*(np.sin(x) + np.random.random(100)*4)
>>> y2 = smooth_filter(y, window_length=31, polyorder=10)

>>> plt.plot(x, y)
>>> plt.plot(x, y2, 'red')

Parameters:

x (list) – one-dimensional vector of scalar data points of a curve.
window_length (int) – the length of the filter window
polyorder (int) – the order of the polynomial used to fit the samples
Returns:	out – smoothed curve data
Return type:	ndarray
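
SciPy provides a Savitzky–Golay filter as scipy.signal.savgol_filter. Whether smooth_filter wraps that function is an assumption here; the sketch below only illustrates the filtering step on synthetic data, with a seeded random generator so the run is repeatable.

```python
import numpy as np
from scipy.signal import savgol_filter

# Synthetic noisy curve (illustrative data only)
rng = np.random.default_rng(0)
x = np.linspace(0, 4 * np.pi, num=100)
noisy = np.sin(x) + rng.normal(scale=0.3, size=x.shape)

# window_length must exceed polyorder; each window of 31 points is fitted
# with a cubic polynomial by least squares
smoothed = savgol_filter(noisy, window_length=31, polyorder=3)
```

A lower polyorder or a longer window smooths more aggressively; data that is already a polynomial of degree at most polyorder passes through the filter essentially unchanged.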

class lagom.transform.SegmentTree(capacity, operation, identity_element)[source]¶

Defines a segment tree data structure.

It can be regarded as a regular array, but with two major differences:

Value modification is slower: O(ln(capacity)) instead of O(1)
Efficient reduce operation over contiguous subarray: O(ln(segment size))

Parameters:

capacity (int) – total number of elements; it must be a power of two.
operation (lambda) – binary operation forming a group, e.g. sum, min
identity_element (object) – identity element in the group, e.g. 0 for sum

reduce(start=0, end=None)[source]¶

Returns result of operation(A[start], operation(A[start+1], operation(… A[end - 1]))).

Parameters:

start (int) – start of segment
end (int) – end of segment
Returns:	out – result of reduce operation
Return type:	object
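
To make the complexity claims concrete, here is a compact array-backed segment tree. It is a sketch of the general data structure, not the library's implementation: SegmentTreeSketch is a hypothetical class, and it assumes that end is exclusive, as in the reduce description above.

```python
import operator


class SegmentTreeSketch:
    """Hypothetical sketch: leaves live at tree[capacity:], node 1 is the root."""

    def __init__(self, capacity, operation, identity_element):
        assert capacity > 0 and capacity & (capacity - 1) == 0, "power of two"
        self.capacity = capacity
        self.operation = operation
        self.identity = identity_element
        self.tree = [identity_element] * (2 * capacity)

    def __setitem__(self, index, value):
        # Write the leaf, then recompute every ancestor: O(log capacity)
        node = index + self.capacity
        self.tree[node] = value
        node //= 2
        while node >= 1:
            self.tree[node] = self.operation(self.tree[2 * node], self.tree[2 * node + 1])
            node //= 2

    def __getitem__(self, index):
        return self.tree[index + self.capacity]

    def reduce(self, start=0, end=None):
        # Combine A[start], ..., A[end - 1] by walking up from both borders
        if end is None:
            end = self.capacity
        left, right = start + self.capacity, end + self.capacity
        acc_left, acc_right = self.identity, self.identity
        while left < right:
            if left & 1:  # left is a right child: take it and step right
                acc_left = self.operation(acc_left, self.tree[left])
                left += 1
            if right & 1:  # right border is exclusive: step left, then take it
                right -= 1
                acc_right = self.operation(self.tree[right], acc_right)
            left //= 2
            right //= 2
        return self.operation(acc_left, acc_right)


sum_tree = SegmentTreeSketch(4, operator.add, 0.0)
min_tree = SegmentTreeSketch(4, min, float("inf"))
for i, value in enumerate([1.0, 2.0, 3.0, 4.0]):
    sum_tree[i] = value
    min_tree[i] = value
```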

class lagom.transform.SumTree(capacity)[source]¶

Defines the sum tree for storing replay priorities.

Each leaf node contains priority value. Internal nodes maintain the sum of the priorities of all leaf nodes in their subtrees.

find_prefixsum_index(prefixsum)[source]¶

Find the highest index i in the array such that sum(A[0] + A[1] + … + A[i - 1]) <= prefixsum

If the array values are probabilities, this function efficiently samples indices according to the discrete probability distribution.

Parameters:	prefixsum (float) – prefix sum.
Returns:	index – highest index satisfying the prefixsum constraint
Return type:	int

sum(start=0, end=None)[source]¶: Return A[start] + … + A[end - 1]
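
The prefix-sum lookup can be illustrated by descending a sum tree from the root. The sketch below is a stand-alone illustration of that idea, with hypothetical helper names and a plain list as storage; it is not the library's code.

```python
def build_sum_tree(priorities):
    """Hypothetical helper: array-backed sum tree, leaves at tree[capacity:]."""
    capacity = len(priorities)
    assert capacity > 0 and capacity & (capacity - 1) == 0, "power of two"
    tree = [0.0] * capacity + [float(p) for p in priorities]
    # Each internal node stores the sum of its two children
    for node in range(capacity - 1, 0, -1):
        tree[node] = tree[2 * node] + tree[2 * node + 1]
    return tree


def find_prefixsum_index_sketch(tree, prefixsum):
    """Highest i such that A[0] + ... + A[i - 1] <= prefixsum."""
    capacity = len(tree) // 2
    node = 1
    while node < capacity:  # stop once a leaf is reached
        left = 2 * node
        if tree[left] > prefixsum:
            # The target lies inside the left subtree
            node = left
        else:
            # Skip the left subtree and subtract its total mass
            prefixsum -= tree[left]
            node = left + 1
    return node - capacity


tree = build_sum_tree([1, 2, 3, 4])
indices = [find_prefixsum_index_sketch(tree, p) for p in (0.0, 0.5, 1.0, 3.5, 9.9)]
```

Drawing prefixsum uniformly from [0, total) and looking up the index in this way selects index i with probability proportional to A[i], which is how such a tree is typically used for prioritized sampling.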

class lagom.transform.MinTree(capacity)[source]¶

min(start=0, end=None)[source]¶: Returns min(A[start], …, A[end - 1])