Layers¶
Layers are the building blocks of Pipelines. Each Layer is a transformation applied to your data.
Image Layers¶
These Layers are for transformations geared towards image data.
-
class
megatron.layers.image.
Downsample
(new_shape)¶ Bases:
megatron.layers.core.StatelessLayer
Shrink an image to a given size proportionally.
Parameters: new_shape (tuple of int) – the target shape for the new image. -
transform
(X)¶ Apply transformation to given input data.
Parameters: inputs (np.ndarray(s)) – input data to be transformed; could be one array or a list of arrays.
-
-
class
megatron.layers.image.
RGBtoBinary
(keep_dim=True)¶ Bases:
megatron.layers.core.StatelessLayer
Convert image to binary mask where a 1 indicates a non-black cell.
Parameters: keep_dim (bool) – if True, resulting image will stay 3D and will have 1 color channel. Otherwise 2D. -
transform
(X)¶ Apply transformation to given input data.
Parameters: inputs (np.ndarray(s)) – input data to be transformed; could be one array or a list of arrays.
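As a rough illustration of the mask described above, the NumPy sketch below marks a pixel as 1 when any of its color channels is non-zero. The helper name `rgb_to_binary` and the channels-last layout are assumptions made for illustration; this is not the library's implementation.

```python
import numpy as np

def rgb_to_binary(image, keep_dim=True):
    """Hypothetical helper: 1 where a pixel is non-black, 0 elsewhere."""
    # A pixel counts as non-black if any channel is non-zero
    # (assumes the color channels are on the last axis).
    mask = (np.asarray(image) != 0).any(axis=-1).astype(int)
    # keep_dim=True keeps a single trailing color channel (3D); otherwise 2D.
    return mask[..., np.newaxis] if keep_dim else mask

image = np.array([[[0, 0, 0], [255, 0, 0]],
                  [[0, 10, 0], [0, 0, 0]]])
mask_3d = rgb_to_binary(image)                  # shape (2, 2, 1)
mask_2d = rgb_to_binary(image, keep_dim=False)  # shape (2, 2)
```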
-
-
class
megatron.layers.image.
RGBtoGrey
(method='luminosity', keep_dim=False)¶ Bases:
megatron.layers.core.StatelessLayer
Convert an RGB array representation of an image to greyscale.
Parameters: method ({'luminosity', 'lightness', 'average'}) – the method used to compute the grey value from the red, green, and blue channels. -
transform
(X)¶ Apply transformation to given input data.
Parameters: inputs (np.ndarray(s)) – input data to be transformed; could be one array or a list of arrays.
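The three method names correspond to common greyscale conversions. The sketch below shows one conventional reading of each. The helper name `rgb_to_grey`, the luminosity weights (0.21, 0.72, 0.07), and the channels-last layout are assumptions; the library may use different weights or definitions.

```python
import numpy as np

def rgb_to_grey(image, method="luminosity", keep_dim=False):
    """Hypothetical helper illustrating three common greyscale conversions."""
    image = np.asarray(image, dtype=float)
    if method == "luminosity":
        # Weighted sum of R, G, B; these weights are one common convention (assumed).
        grey = image @ np.array([0.21, 0.72, 0.07])
    elif method == "lightness":
        # Midpoint between the brightest and darkest channel.
        grey = (image.max(axis=-1) + image.min(axis=-1)) / 2
    elif method == "average":
        # Plain mean of the three channels.
        grey = image.mean(axis=-1)
    else:
        raise ValueError(f"unknown method: {method!r}")
    # Optionally keep a single trailing channel so the result stays 3D.
    return grey[..., np.newaxis] if keep_dim else grey

pixel = np.array([[[100, 200, 50]]])  # a 1x1 image
lum = rgb_to_grey(pixel, "luminosity")
light = rgb_to_grey(pixel, "lightness")
avg = rgb_to_grey(pixel, "average")
```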
-
-
class
megatron.layers.image.
Upsample
(new_shape)¶ Bases:
megatron.layers.core.StatelessLayer
Expand an image to a given size proportionally.
Parameters: new_shape (tuple of int) – the target shape for the new image. -
transform
(X)¶ Apply transformation to given input data.
Parameters: inputs (np.ndarray(s)) – input data to be transformed; could be one array or a list of arrays.
-
Missing Data Layers¶
These Layers are for dealing with missing data.
-
class
megatron.layers.missing.
Impute
(imputation_dict)¶ Bases:
megatron.layers.core.StatelessLayer
Replace instances of one data item with another, such as missing or NaN with zero.
Parameters: imputation_dict (dict) – keys of the dictionary are targets to be replaced; values are corresponding replacements. -
transform
(X)¶ Apply transformation to given input data.
Parameters: inputs (np.ndarray(s)) – input data to be transformed; could be one array or a list of arrays.
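To make the dictionary-driven replacement concrete, here is a minimal NumPy sketch. The helper name `impute` and the special handling of NaN keys are assumptions for illustration, not the library's code.

```python
import numpy as np

def impute(X, imputation_dict):
    """Hypothetical helper: replace each target value with its replacement."""
    X = np.asarray(X, dtype=float)
    out = X.copy()
    for target, replacement in imputation_dict.items():
        # NaN never compares equal to itself, so it needs np.isnan.
        if isinstance(target, float) and np.isnan(target):
            mask = np.isnan(X)
        else:
            mask = X == target
        # Masks are computed on the original data so replacements do not chain.
        out[mask] = replacement
    return out

data = np.array([1.0, np.nan, -999.0])
cleaned = impute(data, {np.nan: 0.0, -999.0: 5.0})
```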
-
Numeric Layers¶
These Layers are for mathematical operations on your data, such as arithmetic.
-
class
megatron.layers.numeric.
Add
¶ Bases:
megatron.layers.core.StatelessLayer
Add up arrays element-wise.
-
transform
(*arrays)¶ Apply transformation to given input data.
Parameters: inputs (np.ndarray(s)) – input data to be transformed; could be one array or a list of arrays.
-
-
class
megatron.layers.numeric.
Divide
(impute=0)¶ Bases:
megatron.layers.core.StatelessLayer
Divide given array by another given array element-wise.
Parameters: impute (int/float or None) – the value to impute when encountering a divide by zero. -
transform
(X1, X2)¶ Apply transformation to given input data.
Parameters: inputs (np.ndarray(s)) – input data to be transformed; could be one array or a list of arrays.
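The divide-by-zero imputation can be sketched with NumPy's `where` argument. The helper name `divide` is hypothetical, and the sketch only covers a numeric `impute`; how the library treats `impute=None` is not stated here, so it is left out.

```python
import numpy as np

def divide(X1, X2, impute=0):
    """Hypothetical helper: element-wise X1 / X2 with a fallback for zero denominators."""
    X1 = np.asarray(X1, dtype=float)
    X2 = np.asarray(X2, dtype=float)
    # Pre-fill the output with the imputed value...
    out = np.full(np.broadcast(X1, X2).shape, float(impute))
    # ...then divide only where the denominator is non-zero.
    np.divide(X1, X2, out=out, where=(X2 != 0))
    return out

ratios = divide([1, 2, 3], [2, 0, 4])
flagged = divide([1, 2, 3], [2, 0, 4], impute=-1)
```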
-
-
class
megatron.layers.numeric.
Dot
(n_outputs=1, **kwargs)¶ Bases:
megatron.layers.core.StatelessLayer
Multiply multiple arrays together using matrix multiplication.
-
transform
(*arrays)¶ Apply transformation to given input data.
Parameters: inputs (np.ndarray(s)) – input data to be transformed; could be one array or a list of arrays.
-
-
class
megatron.layers.numeric.
ElementWiseMultiply
(n_outputs=1, **kwargs)¶ Bases:
megatron.layers.core.StatelessLayer
Multiply two same-sized arrays element-by-element.
-
transform
(X, Y)¶ Apply transformation to given input data.
Parameters: inputs (np.ndarray(s)) – input data to be transformed; could be one array or a list of arrays.
-
-
class
megatron.layers.numeric.
Normalize
(n_outputs=1, **kwargs)¶ Bases:
megatron.layers.core.StatelessLayer
Divide an array by its total so that it sums to one. If the array is all zeros, make it uniform.
-
transform
(X)¶ Apply transformation to given input data.
Parameters: inputs (np.ndarray(s)) – input data to be transformed; could be one array or a list of arrays.
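A minimal sketch of this behavior, assuming the total is taken over the whole array (the helper name `normalize` is hypothetical, and the library may normalize along a specific axis instead):

```python
import numpy as np

def normalize(X):
    """Hypothetical helper: scale X to sum to one; all-zero input becomes uniform."""
    X = np.asarray(X, dtype=float)
    total = X.sum()
    if total == 0:
        # Nothing to scale, so spread the mass evenly over all elements.
        return np.full(X.shape, 1.0 / X.size)
    return X / total

weights = normalize([1, 1, 2])
uniform = normalize([0, 0, 0, 0])
```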
-
-
class
megatron.layers.numeric.
ScalarMultiply
(factor)¶ Bases:
megatron.layers.core.StatelessLayer
Multiply array by a given scalar.
Parameters: factor (float) – multiplier. -
transform
(X)¶ Apply transformation to given input data.
Parameters: inputs (np.ndarray(s)) – input data to be transformed; could be one array or a list of arrays.
-
-
class
megatron.layers.numeric.
StaticDot
(W)¶ Bases:
megatron.layers.core.StatelessLayer
Multiply array by a given matrix, as matrix multiplication.
Parameters: W (np.array) – matrix by which to multiply. -
transform
(X)¶ Apply transformation to given input data.
Parameters: inputs (np.ndarray(s)) – input data to be transformed; could be one array or a list of arrays.
-
-
class
megatron.layers.numeric.
Subtract
(n_outputs=1, **kwargs)¶ Bases:
megatron.layers.core.StatelessLayer
Subtract one array from another.
-
transform
(X1, X2)¶ Apply transformation to given input data.
Parameters: inputs (np.ndarray(s)) – input data to be transformed; could be one array or a list of arrays.
-
Shaping Layers¶
These Layers are for manipulating the shape of your data, from adding axes to creating time series windows.
-
class
megatron.layers.shaping.
AddDim
(axis=-1)¶ Bases:
megatron.layers.core.StatelessLayer
Add a dimension to an array.
Parameters: axis (int) – the axis along which to place the new dimension. -
transform
(X)¶ Apply transformation to given input data.
Parameters: inputs (np.ndarray(s)) – input data to be transformed; could be one array or a list of arrays.
-
-
class
megatron.layers.shaping.
Cast
(new_type)¶ Bases:
megatron.layers.core.StatelessLayer
Redefine the data type of a NumPy array's contents.
Parameters: new_type (type) – the new type for the array to be cast to. -
transform
(X)¶ Apply transformation to given input data.
Parameters: inputs (np.ndarray(s)) – input data to be transformed; could be one array or a list of arrays.
-
-
class
megatron.layers.shaping.
Concatenate
(axis=-1)¶ Bases:
megatron.layers.core.StatelessLayer
Combine arrays along a given axis. Does not create a new axis unless all inputs are 1D.
Parameters: axis (int (default: -1)) – axis along which to concatenate arrays. -1 means the last axis. -
transform
(*arrays)¶ Apply transformation to given input data.
Parameters: inputs (np.ndarray(s)) – input data to be transformed; could be one array or a list of arrays.
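One plausible reading of the 1D special case is that 1D inputs are stacked into columns, while higher-dimensional inputs are joined along an existing axis. The sketch below shows that interpretation with plain NumPy; the helper name `concatenate` and the stacking behavior are assumptions, not confirmed library behavior.

```python
import numpy as np

def concatenate(*arrays, axis=-1):
    """Hypothetical helper: join arrays, adding an axis only if all are 1D."""
    arrays = [np.asarray(a) for a in arrays]
    if all(a.ndim == 1 for a in arrays):
        # All inputs are 1D: create a new axis so each input becomes a column.
        return np.stack(arrays, axis=axis)
    # Otherwise join along an axis that already exists.
    return np.concatenate(arrays, axis=axis)

columns = concatenate([1, 2, 3], [4, 5, 6])            # shape (3, 2)
joined = concatenate(np.ones((3, 2)), np.zeros((3, 1)))  # shape (3, 3)
```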
-
-
class
megatron.layers.shaping.
Filter
(n_outputs=1, **kwargs)¶ Bases:
megatron.layers.core.StatelessLayer
Apply given mask to given array along the first axis to filter out observations.
-
transform
(X, mask)¶ Apply transformation to given input data.
Parameters: inputs (np.ndarray(s)) – input data to be transformed; could be one array or a list of arrays.
-
-
class
megatron.layers.shaping.
Flatten
(n_outputs=1, **kwargs)¶ Bases:
megatron.layers.core.StatelessLayer
Reshape an array to be 1D.
-
transform
(X)¶ Apply transformation to given input data.
Parameters: inputs (np.ndarray(s)) – input data to be transformed; could be one array or a list of arrays.
-
-
class
megatron.layers.shaping.
OneHotLabels
(strict=False)¶ Bases:
megatron.layers.core.StatefulLayer
One-hot encode an array of categorical values, or non-consecutive numeric values.
-
partial_fit
(X)¶ Update metadata based on given batch of data or full dataset.
Contains the main logic of fitting. This is what should be overwritten by all child classes.
Parameters: inputs (numpy.ndarray(s)) – the input data to be fit to; could be one array or a list of arrays.
-
transform
(X)¶ Apply transformation to given input data.
Parameters: inputs (np.ndarray(s)) – input data to be transformed; could be one array or a list of arrays.
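The split between `partial_fit` and `transform` can be pictured as follows: fitting accumulates the set of categories seen so far, and transforming maps each value to a column. The class `OneHotSketch` below is a hypothetical stand-in written for illustration; it does not model the `strict` option, and encoding unseen values as all zeros is an assumption.

```python
import numpy as np

class OneHotSketch:
    """Hypothetical stand-in for a stateful one-hot encoding layer."""

    def __init__(self):
        self.categories = []  # metadata learned across batches

    def partial_fit(self, X):
        # Record any category not seen in earlier batches.
        for value in np.unique(X):
            value = value.item()
            if value not in self.categories:
                self.categories.append(value)

    def transform(self, X):
        index = {c: i for i, c in enumerate(self.categories)}
        out = np.zeros((len(X), len(self.categories)), dtype=int)
        for row, value in enumerate(X):
            col = index.get(value)
            if col is not None:  # unseen values stay all-zero (assumption)
                out[row, col] = 1
        return out

encoder = OneHotSketch()
encoder.partial_fit(np.array(["cat", "dog"]))  # first batch
encoder.partial_fit(np.array(["bird"]))        # second batch
encoded = encoder.transform(np.array(["dog", "bird", "cat"]))
```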
-
-
class
megatron.layers.shaping.
OneHotRange
(strict=False)¶ Bases:
megatron.layers.core.StatefulLayer
One-hot encode a numeric array where the values are a sequence.
-
partial_fit
(X)¶ Update metadata based on given batch of data or full dataset.
Contains the main logic of fitting. This is what should be overwritten by all child classes.
Parameters: inputs (numpy.ndarray(s)) – the input data to be fit to; could be one array or a list of arrays.
-
transform
(X)¶ Apply transformation to given input data.
Parameters: inputs (np.ndarray(s)) – input data to be transformed; could be one array or a list of arrays.
-
-
class
megatron.layers.shaping.
Reshape
(new_shape)¶ Bases:
megatron.layers.core.StatelessLayer
Reshape an array to a given new shape.
Parameters: new_shape (tuple of int) – desired new shape for array. -
transform
(X)¶ Apply transformation to given input data.
Parameters: inputs (np.ndarray(s)) – input data to be transformed; could be one array or a list of arrays.
-
-
class
megatron.layers.shaping.
Slice
(*slices)¶ Bases:
megatron.layers.core.StatelessLayer
Apply Numpy array slicing. Each slice corresponds to a dimension.
Slices (passed as hyperparameters) are constructed by the following procedure:
- To get just N: provide the integer N as the slice.
- To slice from N to the end: provide a 1-tuple of the integer N, e.g. (5,).
- To slice from M to N exclusive: provide a 2-tuple of the integers M and N, e.g. (3, 6).
- To slice from M to N with skip P: provide a 3-tuple of the integers M, N, and P.
Parameters: *slices (int(s) or tuple(s)) – the slices to be applied. Must not overlap. Formatting discussed above. -
transform
(X)¶ Apply transformation to given input data.
Parameters: inputs (np.ndarray(s)) – input data to be transformed; could be one array or a list of arrays.
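The slice conventions described above map onto ordinary NumPy indexing. The sketch below translates each specification into an index; the helper name `build_index` is hypothetical and only illustrates the convention.

```python
import numpy as np

def build_index(*slices):
    """Hypothetical helper: turn int/tuple specs into a NumPy index tuple."""
    index = []
    for spec in slices:
        if isinstance(spec, tuple):
            if len(spec) == 1:
                # (N,) means "from N to the end"; note slice(N) alone would mean "up to N".
                index.append(slice(spec[0], None))
            else:
                # (M, N) or (M, N, P) map directly onto slice(M, N[, P]).
                index.append(slice(*spec))
        else:
            # A bare integer selects a single position on that dimension.
            index.append(spec)
    return tuple(index)

X = np.arange(24).reshape(4, 6)
row_tail = X[build_index(1, (2,))]            # row 1, columns 2 to the end
block = X[build_index((1, 3), (0, 6, 2))]     # rows 1-2, every second column
```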
-
-
class
megatron.layers.shaping.
SplitDict
(fields)¶ Bases:
megatron.layers.core.StatelessLayer
Split dictionary data into separate nodes, with one node per key in the dictionary.
Parameters: fields (list of str) – list of fields, dictionary keys, to be pulled out into their own nodes. -
transform
(dicts)¶ Apply transformation to given input data.
Parameters: inputs (np.ndarray(s)) – input data to be transformed; could be one array or a list of arrays.
-
-
class
megatron.layers.shaping.
TimeSeries
(window_size, time_axis=1, reverse=False)¶ Bases:
megatron.layers.core.StatefulLayer
Adds a time dimension to a dataset by rolling a window over the data.
Parameters: - window_size (int) – length of the window; number of timesteps in the time series.
- time_axis (int) – on which axis in the array to place the time dimension.
- reverse (bool (default: False)) – if True, oldest data is first; if False, newest data is first.
-
partial_fit
(X)¶ Update metadata based on given batch of data or full dataset.
Contains the main logic of fitting. This is what should be overwritten by all child classes.
Parameters: inputs (numpy.ndarray(s)) – the input data to be fit to; could be one array or a list of arrays.
-
transform
(X)¶ Apply transformation to given input data.
Parameters: inputs (np.ndarray(s)) – input data to be transformed; could be one array or a list of arrays.
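A rolling window over the first axis can be sketched as below. The helper name `rolling_windows` is hypothetical; the sketch assumes the time dimension is placed on axis 1 and that incomplete leading windows are dropped, and it ignores any state carried between batches by `partial_fit`.

```python
import numpy as np

def rolling_windows(X, window_size, reverse=False):
    """Hypothetical helper: stack the trailing window for each observation."""
    X = np.asarray(X)
    windows = []
    for end in range(window_size, X.shape[0] + 1):
        window = X[end - window_size:end]  # oldest observation first
        if not reverse:
            window = window[::-1]          # newest observation first
        windows.append(window)
    # Result shape: (n_observations - window_size + 1, window_size, ...)
    return np.stack(windows)

series = np.arange(5)
newest_first = rolling_windows(series, window_size=3)
oldest_first = rolling_windows(series, window_size=3, reverse=True)
```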
Text Layers¶
These Layers are for manipulating text data.
-
class
megatron.layers.text.
RemoveStopwords
(language='english')¶ Bases:
megatron.layers.core.StatelessLayer
Remove common, low-information words from all elements of text array.
Parameters: language (str (default: english)) – the language in which the text is written.
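As a minimal sketch of stopword removal, the example below filters whitespace-separated tokens against a small set. The helper name `remove_stopwords` and the tiny `STOPWORDS` set are hypothetical; the library presumably draws on a full per-language stopword list.

```python
import numpy as np

# Tiny illustrative stopword set (assumption); a real list is much longer.
STOPWORDS = {"the", "a", "an", "of", "and", "is"}

def remove_stopwords(texts, stopwords=STOPWORDS):
    """Hypothetical helper: drop stopwords from every string in an array."""
    cleaned = [
        " ".join(word for word in text.split() if word.lower() not in stopwords)
        for text in texts
    ]
    return np.array(cleaned)

docs = np.array(["The cat is a friend of the dog", "An apple and a pear"])
filtered = remove_stopwords(docs)
```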