
Layers are the building blocks of Pipelines. Each Layer is a transformation applied to your data.

Image Layers

These Layers are for transformations geared towards image data.
class megatron.layers.image.Downsample(new_shape)

Bases: megatron.layers.core.StatelessLayer

Shrink an image to a given size proportionally.

Parameters:new_shape (tuple of int) – the target shape for the new image.

Apply transformation to given input data.

Parameters:inputs (np.ndarray(s)) – input data to be transformed; could be one array or a list of arrays.
class megatron.layers.image.RGBtoBinary(keep_dim=True)

Bases: megatron.layers.core.StatelessLayer

Convert image to binary mask where a 1 indicates a non-black cell.

Parameters:keep_dim (bool) – if True, resulting image will stay 3D and will have 1 color channel. Otherwise 2D.

Apply transformation to given input data.

Parameters:inputs (np.ndarray(s)) – input data to be transformed; could be one array or a list of arrays.
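
The effect of keep_dim can be illustrated with plain NumPy. This is only a sketch of the behaviour described above, not the library's implementation; the function name rgb_to_binary is hypothetical, and treating a pixel as non-black when any channel is non-zero is an assumption.

```python
import numpy as np

def rgb_to_binary(image, keep_dim=True):
    """Hypothetical stand-in: 1 where any color channel is non-zero, else 0."""
    mask = (image != 0).any(axis=-1).astype(int)
    # keep_dim=True keeps a trailing channel axis of length 1.
    return mask[..., np.newaxis] if keep_dim else mask

# A 2x2 RGB image with two black pixels.
image = np.array([[[0, 0, 0], [255, 0, 0]],
                  [[0, 0, 7], [0, 0, 0]]])
mask_3d = rgb_to_binary(image)                  # shape (2, 2, 1)
mask_2d = rgb_to_binary(image, keep_dim=False)  # shape (2, 2)
```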
class megatron.layers.image.RGBtoGrey(method='luminosity', keep_dim=False)

Bases: megatron.layers.core.StatelessLayer

Convert an RGB array representation of an image to greyscale.

Parameters:method ({'luminosity', 'lightness', 'average'}) – the formula used to reduce the three color channels to a single grey value.

Apply transformation to given input data.

Parameters:inputs (np.ndarray(s)) – input data to be transformed; could be one array or a list of arrays.
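
The three method names correspond to common greyscale formulas. The sketch below shows one plausible reading of each in NumPy; the function rgb_to_grey is hypothetical, and the luminosity weights (0.21, 0.72, 0.07) are an assumption, since the library may use different coefficients.

```python
import numpy as np

def rgb_to_grey(image, method="luminosity", keep_dim=False):
    """Hypothetical stand-in for the conversion described above."""
    image = np.asarray(image, dtype=float)
    if method == "luminosity":
        # Assumed weights; green contributes most to perceived brightness.
        grey = image @ np.array([0.21, 0.72, 0.07])
    elif method == "lightness":
        # Midpoint of the brightest and darkest channel.
        grey = (image.max(axis=-1) + image.min(axis=-1)) / 2
    elif method == "average":
        grey = image.mean(axis=-1)
    else:
        raise ValueError(f"unknown method: {method}")
    return grey[..., np.newaxis] if keep_dim else grey

pixel = np.array([[[100, 200, 50]]])  # a 1x1 RGB image
luminosity = rgb_to_grey(pixel, "luminosity")
lightness = rgb_to_grey(pixel, "lightness")
average = rgb_to_grey(pixel, "average")
```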
class megatron.layers.image.Upsample(new_shape)

Bases: megatron.layers.core.StatelessLayer

Expand an image to a given size proportionally.

Parameters:new_shape (tuple of int) – the target shape for the new image.

Apply transformation to given input data.

Parameters:inputs (np.ndarray(s)) – input data to be transformed; could be one array or a list of arrays.

Missing Data Layers

These Layers are for dealing with missing data.
class megatron.layers.missing.Impute(imputation_dict)

Bases: megatron.layers.core.StatelessLayer

Replace instances of one data item with another, such as missing or NaN with zero.

Parameters:imputation_dict (dict) – keys of the dictionary are targets to be replaced; values are corresponding replacements.

Apply transformation to given input data.

Parameters:inputs (np.ndarray(s)) – input data to be transformed; could be one array or a list of arrays.
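
As a rough illustration of how an imputation dictionary maps targets to replacements, here is a NumPy sketch. The function impute is hypothetical and restricted to float arrays. NaN needs special handling because NaN never compares equal to itself.

```python
import numpy as np

def impute(array, imputation_dict):
    """Hypothetical stand-in: replace each target value with its replacement."""
    source = np.asarray(array, dtype=float)
    result = source.copy()
    for target, replacement in imputation_dict.items():
        if isinstance(target, float) and np.isnan(target):
            mask = np.isnan(source)   # NaN != NaN, so test explicitly
        else:
            mask = source == target
        # Masks come from the original data, so replacements never cascade.
        result[mask] = replacement
    return result

data = np.array([1.0, np.nan, -999.0, 4.0])
cleaned = impute(data, {np.nan: 0.0, -999.0: 0.0})
```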

Numeric Layers

These Layers are for mathematical operations on your data, such as arithmetic.
class megatron.layers.numeric.Add

Bases: megatron.layers.core.StatelessLayer

Add up arrays element-wise.


Apply transformation to given input data.

Parameters:inputs (np.ndarray(s)) – input data to be transformed; could be one array or a list of arrays.
class megatron.layers.numeric.Divide(impute=0)

Bases: megatron.layers.core.StatelessLayer

Divide given array by another given array element-wise.

Parameters:impute (int/float or None) – the value to impute when encountering a divide by zero.
transform(X1, X2)

Apply transformation to given input data.

Parameters:inputs (np.ndarray(s)) – input data to be transformed; could be one array or a list of arrays.
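
The role of impute can be sketched in NumPy as follows. The function divide is hypothetical and assumes same-shaped inputs. Leaving inf/NaN in place when impute is None is also an assumption about the intended behaviour.

```python
import numpy as np

def divide(x1, x2, impute=0):
    """Hypothetical stand-in: element-wise x1 / x2 with divide-by-zero imputation."""
    x1 = np.asarray(x1, dtype=float)
    x2 = np.asarray(x2, dtype=float)
    # Silence the warnings; the affected cells are overwritten below.
    with np.errstate(divide="ignore", invalid="ignore"):
        quotient = x1 / x2
    if impute is not None:
        quotient[x2 == 0] = impute
    return quotient

ratios = divide([6, 1, 0], [3, 0, 0])
flagged = divide([1.0], [0.0], impute=-1)
```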
class megatron.layers.numeric.Dot(n_outputs=1, **kwargs)

Bases: megatron.layers.core.StatelessLayer

Multiply multiple arrays together as matrix multiplication.


Apply transformation to given input data.

Parameters:inputs (np.ndarray(s)) – input data to be transformed; could be one array or a list of arrays.
class megatron.layers.numeric.ElementWiseMultiply(n_outputs=1, **kwargs)

Bases: megatron.layers.core.StatelessLayer

Multiply two same-sized arrays element-by-element.

transform(X, Y)

Apply transformation to given input data.

Parameters:inputs (np.ndarray(s)) – input data to be transformed; could be one array or a list of arrays.
class megatron.layers.numeric.Normalize(n_outputs=1, **kwargs)

Bases: megatron.layers.core.StatelessLayer

Divide an array by its total so that it sums to one. An all-zero array is made uniform instead.


Apply transformation to given input data.

Parameters:inputs (np.ndarray(s)) – input data to be transformed; could be one array or a list of arrays.
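
A minimal NumPy sketch of this rule, including the all-zero case. The function normalize is hypothetical, and normalizing over the whole array (rather than per row) is an assumption.

```python
import numpy as np

def normalize(x):
    """Hypothetical stand-in: scale x to sum to one; all-zero input becomes uniform."""
    x = np.asarray(x, dtype=float)
    total = x.sum()
    if total == 0:
        # Nothing to scale, so spread the mass evenly.
        return np.full(x.shape, 1.0 / x.size)
    return x / total

weights = normalize([1, 3])
uniform = normalize([0, 0, 0, 0])
```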
class megatron.layers.numeric.ScalarMultiply(factor)

Bases: megatron.layers.core.StatelessLayer

Multiply array by a given scalar.

Parameters:factor (float) – multiplier.

Apply transformation to given input data.

Parameters:inputs (np.ndarray(s)) – input data to be transformed; could be one array or a list of arrays.
class megatron.layers.numeric.StaticDot(W)

Bases: megatron.layers.core.StatelessLayer

Multiply array by a given matrix, as matrix multiplication.

Parameters:W (np.array) – matrix by which to multiply.

Apply transformation to given input data.

Parameters:inputs (np.ndarray(s)) – input data to be transformed; could be one array or a list of arrays.
class megatron.layers.numeric.Subtract(n_outputs=1, **kwargs)

Bases: megatron.layers.core.StatelessLayer

Subtract one array from another.

transform(X1, X2)

Apply transformation to given input data.

Parameters:inputs (np.ndarray(s)) – input data to be transformed; could be one array or a list of arrays.

Shaping Layers

These Layers are for manipulating the shape of your data, from adding axes to creating time series windows.
class megatron.layers.shaping.AddDim(axis=-1)

Bases: megatron.layers.core.StatelessLayer

Add a dimension to an array.

Parameters:axis (int) – the axis along which to place the new dimension.

Apply transformation to given input data.

Parameters:inputs (np.ndarray(s)) – input data to be transformed; could be one array or a list of arrays.
class megatron.layers.shaping.Cast(new_type)

Bases: megatron.layers.core.StatelessLayer

Redefine the data type of a NumPy array’s contents.

Parameters:new_type (type) – the new type for the array to be cast to.

Apply transformation to given input data.

Parameters:inputs (np.ndarray(s)) – input data to be transformed; could be one array or a list of arrays.
class megatron.layers.shaping.Concatenate(axis=-1)

Bases: megatron.layers.core.StatelessLayer

Combine arrays along a given axis. Does not create a new axis unless all inputs are 1D.

Parameters:axis (int (default: -1)) – axis along which to concatenate arrays. -1 means the last axis.

Apply transformation to given input data.

Parameters:inputs (np.ndarray(s)) – input data to be transformed; could be one array or a list of arrays.
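
One reading of the 1D exception is that 1D arrays are stacked side by side along a new axis, while higher-dimensional arrays are joined along an existing one. The sketch below follows that reading; the function concatenate is hypothetical and the library may behave differently.

```python
import numpy as np

def concatenate(arrays, axis=-1):
    """Hypothetical stand-in for the behaviour described above."""
    arrays = [np.asarray(a) for a in arrays]
    if all(a.ndim == 1 for a in arrays):
        # All inputs are 1D: a new axis is created to hold them.
        return np.stack(arrays, axis=axis)
    # Otherwise join along an axis that already exists.
    return np.concatenate(arrays, axis=axis)

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])
columns = concatenate([a, b])                             # shape (3, 2)
wide = concatenate([columns, np.array([[7], [8], [9]])])  # shape (3, 3)
```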
class megatron.layers.shaping.Filter(n_outputs=1, **kwargs)

Bases: megatron.layers.core.StatelessLayer

Apply given mask to given array along the first axis to filter out observations.

transform(X, mask)

Apply transformation to given input data.

Parameters:inputs (np.ndarray(s)) – input data to be transformed; could be one array or a list of arrays.
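
Filtering along the first axis corresponds to boolean-mask indexing in NumPy. The following sketch uses made-up data to show which observations survive.

```python
import numpy as np

# Four observations with two features each.
X = np.arange(8).reshape(4, 2)
mask = np.array([True, False, True, False])

# Boolean indexing on the first axis keeps only rows where mask is True.
kept = X[mask]
```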
class megatron.layers.shaping.Flatten(n_outputs=1, **kwargs)

Bases: megatron.layers.core.StatelessLayer

Reshape an array to be 1D.


Apply transformation to given input data.

Parameters:inputs (np.ndarray(s)) – input data to be transformed; could be one array or a list of arrays.
class megatron.layers.shaping.OneHotLabels(strict=False)

Bases: megatron.layers.core.StatefulLayer

One-hot encode an array of categorical values, or non-consecutive numeric values.


Update metadata based on given batch of data or full dataset.

Contains the main logic of fitting. This is the method that child classes should override.

Parameters:inputs (numpy.ndarray(s)) – the input data to be fit to; could be one array or a list of arrays.

Apply transformation to given input data.

Parameters:inputs (np.ndarray(s)) – input data to be transformed; could be one array or a list of arrays.
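
To show why this Layer is stateful, here is a sketch of the fit-then-transform pattern in NumPy. The class OneHotLabelsSketch is hypothetical; it does not model the strict argument, whose exact meaning is not documented here.

```python
import numpy as np

class OneHotLabelsSketch:
    """Hypothetical stand-in: learn the categories, then encode against them."""

    def fit(self, labels):
        # The learned state: the sorted set of distinct values seen.
        self.categories = np.unique(labels)
        return self

    def transform(self, labels):
        labels = np.asarray(labels)
        # Compare every label against every category; one column per category.
        return (labels[:, np.newaxis] == self.categories).astype(int)

labels = np.array([10, 30, 10, 20])  # non-consecutive numeric values
encoder = OneHotLabelsSketch().fit(labels)
encoded = encoder.transform(labels)
```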
class megatron.layers.shaping.OneHotRange(strict=False)

Bases: megatron.layers.core.StatefulLayer

One-hot encode a numeric array where the values are a sequence.


Update metadata based on given batch of data or full dataset.

Contains the main logic of fitting. This is the method that child classes should override.

Parameters:inputs (numpy.ndarray(s)) – the input data to be fit to; could be one array or a list of arrays.

Apply transformation to given input data.

Parameters:inputs (np.ndarray(s)) – input data to be transformed; could be one array or a list of arrays.
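
When the values form a consecutive integer range, the encoding only needs the range's bounds. A sketch under that assumption follows; the class OneHotRangeSketch is hypothetical and ignores the strict argument.

```python
import numpy as np

class OneHotRangeSketch:
    """Hypothetical stand-in: one column per integer between the fitted min and max."""

    def fit(self, values):
        values = np.asarray(values)
        self.low, self.high = int(values.min()), int(values.max())
        return self

    def transform(self, values):
        values = np.asarray(values)
        width = self.high - self.low + 1
        # Row i of the identity matrix is the one-hot vector for offset i.
        return np.eye(width, dtype=int)[values - self.low]

range_encoder = OneHotRangeSketch().fit(np.array([3, 5, 4]))
range_encoded = range_encoder.transform(np.array([3, 5]))
```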
class megatron.layers.shaping.Reshape(new_shape)

Bases: megatron.layers.core.StatelessLayer

Reshape an array to a given new shape.

Parameters:new_shape (tuple of int) – desired new shape for array.

Apply transformation to given input data.

Parameters:inputs (np.ndarray(s)) – input data to be transformed; could be one array or a list of arrays.
class megatron.layers.shaping.Slice(*slices)

Bases: megatron.layers.core.StatelessLayer

Apply NumPy array slicing. Each slice corresponds to a dimension.

Slices (passed as hyperparameters) are constructed by the following procedure:

  • To get just N: provide the integer N as the slice.
  • To slice from N to the end: provide a 1-tuple of the integer N, e.g. (5,).
  • To slice from M to N exclusive: provide a 2-tuple of the integers M and N, e.g. (3, 6).
  • To slice from M to N with skip P: provide a 3-tuple of the integers M, N, and P.

Parameters:*slices (int(s) or tuple(s)) – the slices to be applied. Must not overlap. Formatting discussed above.

Apply transformation to given input data.

Parameters:inputs (np.ndarray(s)) – input data to be transformed; could be one array or a list of arrays.
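
The slice formats above map directly onto Python's built-in slice objects. The sketch below shows that mapping in NumPy; the helper names to_index and apply_slices are hypothetical and are not part of the library.

```python
import numpy as np

def to_index(spec):
    """Translate one slice spec (int or tuple) into a NumPy index."""
    if isinstance(spec, int):
        return spec                   # just N
    if len(spec) == 1:
        return slice(spec[0], None)   # N to the end
    return slice(*spec)               # (M, N) or (M, N, P)

def apply_slices(array, *slices):
    """Apply one spec per dimension, in order."""
    return array[tuple(to_index(s) for s in slices)]

grid = np.arange(24).reshape(4, 6)
row = apply_slices(grid, 1)                     # row 1 only
tail = apply_slices(grid, (2,))                 # rows 2 to the end
window = apply_slices(grid, (1, 3), (0, 6, 2))  # rows 1-2, every other column
```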
class megatron.layers.shaping.SplitDict(fields)

Bases: megatron.layers.core.StatelessLayer

Split dictionary data into separate nodes, with one node per key in the dictionary.

Parameters:fields (list of str) – list of fields, dictionary keys, to be pulled out into their own nodes.

Apply transformation to given input data.

Parameters:inputs (np.ndarray(s)) – input data to be transformed; could be one array or a list of arrays.
class megatron.layers.shaping.TimeSeries(window_size, time_axis=1, reverse=False)

Bases: megatron.layers.core.StatefulLayer

Adds a time dimension to a dataset by rolling a window over the data.

Parameters:

  • window_size (int) – length of the window; number of timesteps in the time series.
  • time_axis (int) – on which axis in the array to place the time dimension.
  • reverse (bool (default: False)) – if True, oldest data is first; if False, newest data is first.

Update metadata based on given batch of data or full dataset.

Contains the main logic of fitting. This is the method that child classes should override.

Parameters:inputs (numpy.ndarray(s)) – the input data to be fit to; could be one array or a list of arrays.

Apply transformation to given input data.

Parameters:inputs (np.ndarray(s)) – input data to be transformed; could be one array or a list of arrays.
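
The rolling-window idea can be sketched in NumPy as below. The function rolling_windows is hypothetical, always places the time dimension on axis 1, and drops the first window_size - 1 observations that lack a full history. How the library treats those leading rows, and what state it keeps between batches, is not documented here, so both are assumptions.

```python
import numpy as np

def rolling_windows(data, window_size, reverse=False):
    """Hypothetical stand-in: stack overlapping windows along a new axis 1."""
    data = np.asarray(data)
    n_windows = len(data) - window_size + 1
    # Each window holds window_size consecutive rows, oldest first.
    windows = np.stack([data[i:i + window_size] for i in range(n_windows)])
    # reverse=True keeps oldest first; reverse=False puts newest first.
    return windows if reverse else windows[:, ::-1]

data = np.arange(10).reshape(5, 2)   # 5 timesteps, 2 features
newest_first = rolling_windows(data, window_size=3)
oldest_first = rolling_windows(data, window_size=3, reverse=True)
```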

Text Layers

These Layers are for manipulating text data.
class megatron.layers.text.RemoveStopwords(language='english')

Bases: megatron.layers.core.StatelessLayer

Remove common, low-information words from all elements of text array.

Parameters:language (str (default: english)) – the language in which the text is written.
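
For illustration, stopword removal over a text array might look like the sketch below. The tiny STOPWORDS set and the function remove_stopwords are hypothetical; the library presumably draws on a full per-language stopword list, and its tokenization may differ from a simple whitespace split.

```python
import numpy as np

# Hypothetical, hand-written stopword list for demonstration only.
STOPWORDS = {"the", "a", "an", "is", "of", "and"}

def remove_stopwords(texts):
    """Drop stopwords (case-insensitively) from every string in the array."""
    return np.array([
        " ".join(word for word in text.split() if word.lower() not in STOPWORDS)
        for text in texts
    ])

texts = np.array(["The cat is on a mat", "An end of the story"])
cleaned_texts = remove_stopwords(texts)
```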