pywatts.modules.preprocessing package¶
Submodules¶
pywatts.modules.preprocessing.average module¶
-
class
pywatts.modules.preprocessing.average.
Average
(weights: List[float] = None, name: str = 'Average')¶ Bases:
pywatts_pipeline.core.transformer.base.BaseTransformer
Aggregation step to average the given time series, either by simple or weighted averaging. By default, simple averaging is applied.
-
transform
(**kwargs) → xarray.core.dataarray.DataArray¶ Aggregate the given time series by simple or weighted averaging.
Returns: xarray DataArray aggregated by simple or weighted averaging. Return type: xr.DataArray
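Example: the difference between the two averaging modes can be illustrated with plain NumPy. This is only a sketch of the arithmetic, not the pyWATTS implementation; the helper name average_series is hypothetical, and how the real module combines its keyword inputs is an assumption.

```python
import numpy as np

def average_series(series, weights=None):
    """Average several equally long series element-wise (hypothetical helper)."""
    stacked = np.stack([np.asarray(s, dtype=float) for s in series], axis=0)
    # np.average computes the simple mean when weights is None
    return np.average(stacked, axis=0, weights=weights)

a = [1.0, 2.0, 3.0]
b = [3.0, 4.0, 5.0]
simple = average_series([a, b])                    # equal weight for a and b
weighted = average_series([a, b], weights=[3, 1])  # a counts three times as much as b
```

With the weights [3, 1], each output value is (3 * a + b) / 4.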
-
pywatts.modules.preprocessing.change_direction module¶
-
class
pywatts.modules.preprocessing.change_direction.
ChangeDirection
(name='change_direction')¶ Bases:
pywatts_pipeline.core.transformer.base.BaseTransformer
This module calculates a time series that indicates whether the next value is higher, lower, or the same.
Parameters: name (str) – The name of the ChangeDirection module -
transform
(x: xarray.core.dataarray.DataArray) → xarray.core.dataarray.DataArray¶ Transforms the time series into a time series that indicates whether the next value is higher, lower, or the same
Parameters: x (xr.DataArray, optional) – The time series that should be transformed Returns: A time series, where 1 indicates that the next value is higher, -1 that the next value is lower, and 0 that the next value is the same Return type: xr.DataArray Raises: WrongParameterException – If not all indexes are part of x
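Example: the 1 / -1 / 0 encoding described above can be sketched with NumPy as the sign of the difference between each value and its successor. The function below is a hypothetical stand-in, not the module's code; in particular, how the module treats the last value (which has no successor) is not specified here, so this sketch simply omits it.

```python
import numpy as np

def change_direction(values):
    """Return 1, -1 or 0 depending on whether the next value is higher, lower or equal."""
    values = np.asarray(values, dtype=float)
    # np.diff yields next - current; the last value has no successor and is dropped
    return np.sign(np.diff(values)).astype(int)

directions = change_direction([1.0, 3.0, 3.0, 2.0])
```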
-
pywatts.modules.preprocessing.clock_shift module¶
-
class
pywatts.modules.preprocessing.clock_shift.
ClockShift
(lag: int, name: str = 'ClockShift', indexes: List[str] = None)¶ Bases:
pywatts_pipeline.core.transformer.base.BaseTransformer
This module shifts the data by a certain offset.
Parameters: - lag (int) – The offset for shifting the time series. Please note: The relative time of the shift is determined by the current temporal resolution of the arrays in the pipeline.
- name (str) – The name of the shift module
- indexes (List) – The list of indexes that determine the dimension in which the time should be shifted. If the list is None or empty, the time is shifted in all temporal dimensions.
-
get_min_data
()¶ Returns the minimum amount of data that this transformer needs
-
transform
(x: xarray.core.dataarray.DataArray) → xarray.core.dataarray.DataArray¶ Shifts the given time series x by the defined lag
Parameters: x (xr.DataArray) – the time series to be shifted Returns: The shifted time series Return type: xr.DataArray Raises: WrongParameterException – If not all indexes are part of x
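Example: a shift by lag positions can be sketched with NumPy as follows. The function is hypothetical and only illustrates the idea; it assumes the convention of xarray's DataArray.shift, where a positive lag moves values towards later positions and vacated positions become NaN. Whether ClockShift behaves identically is an assumption.

```python
import numpy as np

def shift(values, lag):
    """Shift values by lag positions and fill vacated positions with NaN."""
    values = np.asarray(values, dtype=float)
    shifted = np.full_like(values, np.nan)
    if lag > 0:
        shifted[lag:] = values[:-lag]   # move values towards later positions
    elif lag < 0:
        shifted[:lag] = values[-lag:]   # move values towards earlier positions
    else:
        shifted[:] = values
    return shifted

delayed = shift([1.0, 2.0, 3.0, 4.0], lag=1)
advanced = shift([1.0, 2.0, 3.0, 4.0], lag=-1)
```

Because the lag counts positions, its meaning in wall-clock time depends on the resolution of the data: a lag of 1 is one hour on hourly data and 15 minutes on quarter-hourly data.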
-
pywatts.modules.preprocessing.custom_scaler module¶
-
class
pywatts.modules.preprocessing.custom_scaler.
CustomScaler
(multiplier: float = 1.0, bias: float = 0.0, name: str = 'CustomScaler')¶ Bases:
pywatts_pipeline.core.transformer.base.BaseTransformer
Scaling step to scale a time series individually by a multiplier and a bias. By default the scaling does not affect the time series, i.e., the multiplier is 1.0 and the bias 0.0.
-
inverse_transform
(x: xarray.core.dataarray.DataArray) → xarray.core.dataarray.DataArray¶ Apply the inverse scaling to an xarray DataArray.
Parameters: x (xr.DataArray) – xarray DataArray to apply the inverse scaling to. Returns: xarray DataArray with the scaling by multiplier and bias reversed. Return type: xr.DataArray
-
set_params
(multiplier: float = None, **kwargs)¶ Set or change CustomScaler object parameters.
Parameters: - multiplier (float, optional) – Value by which every value in the time series is multiplied.
- bias (float, optional) – Value that is added to every value in the time series
-
transform
(x: xarray.core.dataarray.DataArray) → xarray.core.dataarray.DataArray¶ Apply the scaling to an xarray DataArray.
Parameters: x (xr.DataArray) – xarray DataArray to apply scaling on. Returns: xarray DataArray scaled according to the specified multiplier and bias. Return type: xr.DataArray
-
validate_multiplier
(multiplier)¶
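Example: the scaling and its inverse can be sketched with NumPy. The functions below are hypothetical and assume that the multiplier is applied before the bias is added; the check for a zero multiplier is also an assumption about what a validation step would need to guard against, not a description of validate_multiplier.

```python
import numpy as np

def scale(values, multiplier=1.0, bias=0.0):
    """Assumed forward scaling: multiply first, then add the bias."""
    return np.asarray(values, dtype=float) * multiplier + bias

def inverse_scale(values, multiplier=1.0, bias=0.0):
    """Undo scale(): subtract the bias, then divide by the multiplier."""
    if multiplier == 0:
        raise ValueError("multiplier must be non-zero to invert the scaling")
    return (np.asarray(values, dtype=float) - bias) / multiplier

original = np.array([10.0, 20.0, 30.0])
scaled = scale(original, multiplier=0.5, bias=1.0)
restored = inverse_scale(scaled, multiplier=0.5, bias=1.0)
```

With the defaults (multiplier 1.0, bias 0.0) both functions return the input unchanged.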
-
pywatts.modules.preprocessing.differentiate module¶
-
class
pywatts.modules.preprocessing.differentiate.
Differentiate
(target_index: Union[str, List[str]] = None, name: str = 'Differentiate', n: Union[int, List[int]] = 1, axis: int = -1, pad: bool = False, pad_args: Dict[str, object] = {})¶ Bases:
pywatts_pipeline.core.transformer.base.BaseTransformer
Differentiation step to calculate the n-th order difference of a time series. By default, the difference does not have the same size as the input time series; padding is implemented with np.pad, and specific arguments can be passed via pad_args.
-
transform
(x: xarray.core.dataarray.DataArray) → xarray.core.dataarray.DataArray¶ Add the n-th order differences to the xarray dataset.
Parameters: x (xr.DataArray) – Xarray dataset to apply differentiation on. Returns: Xarray dataset containing the n-th order differentiations. Return type: xr.DataArray
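Example: the size mismatch and the effect of padding can be seen with np.diff and np.pad directly. This is a sketch on a plain NumPy array; padding at the front with a constant 0.0 is an illustrative choice, not necessarily what the module does with a given pad_args.

```python
import numpy as np

values = np.array([1.0, 4.0, 9.0, 16.0])

first = np.diff(values, n=1)    # first-order difference, one element shorter
second = np.diff(values, n=2)   # second-order difference, two elements shorter

# Pad one element at the front so the result has the input's length again
padded = np.pad(first, (1, 0), mode="constant", constant_values=0.0)
```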
-
pywatts.modules.preprocessing.linear_interpolation module¶
-
class
pywatts.modules.preprocessing.linear_interpolation.
LinearInterpolater
(name: str = 'LinearInterpolater', method: str = 'linear', dim: str = 'time', fill_value: str = 'extrapolate')¶ Bases:
pywatts_pipeline.core.transformer.base.BaseTransformer
This module creates a linear interpolator.
Parameters: - name (str) – Name of the linear interpolator
- method (str) – The method used for interpolation (e.g. linear)
- dim (str) – The dimension used
- fill_value (str) – Handling of missing values (see https://docs.scipy.org/doc/scipy/reference/generated/scipy.interpolate.interp1d.html)
-
transform
(x=<class 'xarray.core.dataarray.DataArray'>) → xarray.core.dataarray.DataArray¶ Transforms the input
Parameters: x (xr.DataArray) – Input xarray dataset Returns: Interpolated dataset Return type: xr.DataArray
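Example: linear interpolation of gaps can be sketched with np.interp. The helper below is hypothetical and works on a plain NumPy array with NaN marking missing values. Note one difference from the default fill_value of 'extrapolate': np.interp holds the first and last known value constant outside the known range instead of extrapolating.

```python
import numpy as np

def interpolate_gaps(values):
    """Fill NaN gaps by linear interpolation between the neighbouring known values."""
    values = np.asarray(values, dtype=float)
    positions = np.arange(values.size)
    known = ~np.isnan(values)
    # np.interp evaluates the piecewise-linear function through the known points
    return np.interp(positions, positions[known], values[known])

filled = interpolate_gaps([1.0, np.nan, np.nan, 4.0, 6.0])
```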
pywatts.modules.preprocessing.missing_value_detection module¶
-
class
pywatts.modules.preprocessing.missing_value_detection.
MissingValueDetector
(name: str = 'missingValueDetector')¶ Bases:
pywatts_pipeline.core.transformer.base.BaseTransformer
Module to detect missing values (NaN, NA)
-
transform
(dataset: xarray.core.dataarray.DataArray) → xarray.core.dataarray.DataArray¶ Detects the indexes at which the input has missing values
Parameters: dataset (xr.DataArray) – Dataset in which missing values should be detected Returns: A dataset with boolean values: True if the value is missing and False otherwise Return type: xr.DataArray
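Example: the boolean result has the same shape as the input. The sketch below uses np.isnan on a plain NumPy array to show the idea; it is not the module's implementation.

```python
import numpy as np

values = np.array([1.0, np.nan, 3.0, np.nan])

# True where the value is missing, False otherwise
missing_mask = np.isnan(values)
missing_positions = np.flatnonzero(missing_mask)
```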
-
pywatts.modules.preprocessing.resampler module¶
-
class
pywatts.modules.preprocessing.resampler.
Resampler
(name: str = 'Resampler', time_index: str = 'time', target_time: str = '1H', method: str = 'mean', method_args: Optional[Dict[str, Any]] = None)¶ Bases:
pywatts_pipeline.core.transformer.base.BaseTransformer
Module to resample time series data to a given target time (both up- and downsampling). The resampling is based on xarray’s resample method, which is in turn based on pandas’ resampling implementation, so the methods offered by pandas’ resample are available. See http://xarray.pydata.org/en/stable/generated/xarray.Dataset.resample.html for more details.
Parameters: - name (str) – Name of this processing step (default: “Resampler”).
- time_index (str) – Index of the dataset specifying the time series to be resampled (default: “time”).
- target_time (str) – Target time after the resampling given in string datetime format (default: “1H”). For example, “6H”/“6h” for 6 hours, “s”/“S” for seconds, “m”/“M” for months.
- method (str) – Method to use for down- or upsampling the data (default: “mean”). For example, “mean”, “min”, “sum”, “median”, “reduce”, “map”. http://xarray.pydata.org/en/stable/generated/xarray.core.resample.DatasetResample.html
- method_args (Optional[Dict[str, Any]]) – Optional parameters for the selected method as a dict (default: “None”). Note: Some methods like reduce or map require parameters!
Example
# downsample dataset to 30 minutes (1800s) by using the mean method
Resampler(target_time="1800s", method="mean")
# downsample dataset to 1 day by summing up all data for one day
Resampler(target_time="1d", method="sum")
# upsample the "time_series" index of the dataset to 1 minute by using the interpolate method
Resampler(time_index="time_series", target_time="60s", method="interpolate")
# resample dataset index to 1 hour by using a custom function (in this case the mean)
Resampler(target_time="1h", method="map", method_args={"func": lambda x: x.mean()})
-
get_min_data
()¶ Returns the minimum amount of data that this transformer needs
-
transform
(x: xarray.core.dataarray.DataArray) → xarray.core.dataarray.DataArray¶ Resamples the dataset x as specified in the constructor.
Parameters: x (xr.DataArray) – dataset which should be resampled. Returns: The resampled xarray dataset. Return type: xr.DataArray
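Example: what downsampling with the methods "mean" and "sum" computes can be sketched without any time index. The snippet assumes eight made-up quarter-hourly readings that start on a full hour, so that every four consecutive values form one hour. It uses plain NumPy and is not how the module itself resamples.

```python
import numpy as np

# Eight 15-minute readings covering two full hours (made-up values)
quarter_hourly = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0])

# Group every four consecutive readings into one row per hour
per_hour = quarter_hourly.reshape(-1, 4)

hourly_mean = per_hour.mean(axis=1)  # analogous to method="mean"
hourly_sum = per_hour.sum(axis=1)    # analogous to method="sum"
```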
pywatts.modules.preprocessing.sampler module¶
-
class
pywatts.modules.preprocessing.sampler.
Sampler
(sample_size: int, name: str = 'SampleModule', indexes: List[str] = None)¶ Bases:
pywatts_pipeline.core.transformer.base.BaseTransformer
This module creates samples with a size specified by sample_size. For example, if sample_size corresponds to 24 hours, it creates for each timestamp a vector containing all values of the past 24 hours. This is useful, for instance, if a forecasting algorithm needs the values of the past 24 hours as input.
Parameters: - sample_size (int) – The size of each sample, i.e. the number of past values collected for each timestamp
- indexes (List[str]) – The indexes which should be shifted through time
-
get_min_data
()¶ Returns the minimum amount of data that this transformer needs
-
set_params
(sample_size: int = None, indexes: List[str] = None)¶ Set params.
Parameters: - sample_size (int) – The size of each sample, i.e. the number of past values collected for each timestamp
- indexes (List[str]) – The indexes which should be shifted through time
-
transform
(x: xarray.core.dataarray.DataArray) → xarray.core.dataarray.DataArray¶ Sample the given time series x by the lag.
Parameters: x (xr.DataArray) – the input Returns: A shifted time series. Return type: xr.DataArray
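Example: building one vector of past values per timestamp can be sketched with NumPy. The helper make_samples is hypothetical; whether the current value is part of the sample and how the first timestamps (which lack enough history) are filled are assumptions of this sketch, which uses NaN for the missing history.

```python
import numpy as np

def make_samples(values, sample_size):
    """Return one row per timestamp holding the current and the preceding values."""
    values = np.asarray(values, dtype=float)
    samples = np.full((values.size, sample_size), np.nan)
    for offset in range(min(sample_size, values.size)):
        # Column `offset` holds the value observed `offset` steps earlier
        samples[offset:, offset] = values[: values.size - offset]
    return samples

samples = make_samples([1.0, 2.0, 3.0, 4.0], sample_size=2)
```

Each row can then serve as the input vector of a forecasting model for the corresponding timestamp.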
pywatts.modules.preprocessing.slicer module¶
-
class
pywatts.modules.preprocessing.slicer.
Slicer
(start: Optional[int] = None, end: Optional[int] = None, name: str = 'Slicer')¶ Bases:
pywatts_pipeline.core.transformer.base.BaseTransformer
This module slices the input data array from the start index up to the end index, similar to NumPy array slicing, where an array is filtered with a[start:end].
Parameters: - start (int, optional) – Start index of the slicing operation, defaults to None
- end (int, optional) – End index of the slicing operation, defaults to None
-
transform
(x: xarray.core.dataarray.DataArray) → xarray.core.dataarray.DataArray¶ Perform the slicing operation on the input array.
Parameters: x (xr.DataArray) – Input array which should be sliced. Returns: Sliced array like in numpy a[start:end]. Return type: xr.DataArray
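Example: the a[start:end] semantics referred to above, shown on a plain NumPy array. As in NumPy, the end index is exclusive, and None for start or end leaves that side of the slice open.

```python
import numpy as np

values = np.arange(10)

sliced = values[2:5]                 # elements at positions 2, 3 and 4
from_start = values[slice(None, 3)]  # start=None: slice begins at the first element
to_end = values[slice(7, None)]      # end=None: slice runs to the last element
```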