pywatts.modules.feature_extraction package

Submodules

pywatts.modules.feature_extraction.calendar_extraction module

class pywatts.modules.feature_extraction.calendar_extraction.CalendarExtraction(name: str = 'CalendarExtraction', continent: str = 'Europe', country: str = 'Germany', features: Optional[List[pywatts.modules.feature_extraction.calendar_extraction.CalendarFeature]] = None)

Bases: pywatts_pipeline.core.transformer.base.BaseTransformer

This pipeline step extracts date features from a time series defined by a DataArray input. It can calculate the year, month, month_sine, month_cos, day, day_sine, day_cos, hour, hour_sine, hour_cos, weekday, weekday_sine, weekday_cos, monday, tuesday, wednesday, thursday, friday, saturday, sunday, weekend, workday, and holiday. For the holidays, it is important to set the correct continent and country/region, e.g. ‘Europe’ and ‘BadenWurttemberg’ or ‘Germany’.

Parameters:
  • name (str) – Name of this processing step.
  • continent (str) – Continent where the country or region is located (important for importing calendar module).
  • country (str) – Country or region to use for holiday calendar (default ‘Germany’)
  • features (Optional[List[CalendarFeature]]) – The features that should be extracted. The following features exist: year, month, month_sine, month_cos, day, day_sine, day_cos, hour, hour_sine, hour_cos, weekday, weekday_sine, weekday_cos, monday, tuesday, wednesday, thursday, friday, saturday, sunday, weekend, workday, holiday. (Default: month, day, weekday, hour)
Raises:

WrongParameterException – If ‘continent’ and/or ‘country’ is invalid.

set_params(**kwargs)

Set parameters of the calendar extraction processing step.

Parameters:
  • continent (str) – Continent where the country or region is located (important for importing calendar module).
  • country (str) – Country or region to use for holiday calendar (default ‘Germany’)
  • features (List[str]) – A list containing all features that should be calculated. By default, all are calculated.
Raises:

AttributeError – If ‘continent’ and/or ‘country’ is invalid.

transform(x: xarray.core.dataarray.DataArray) → xarray.core.dataarray.DataArray

Add date features to xarray dataset as configured.

Parameters:x – Xarray dataset containing a timeseries specified by the object’s ‘time_index’
Returns:The xarray dataset with date features added.
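
As a rough illustration of the flag-style features, the following sketch derives a few of them with the standard library only. It does not use pyWATTS: `HOLIDAYS` and `calendar_flags` are hypothetical names, the holiday set is written by hand, and treating a workday as "neither weekend nor holiday" is an assumption rather than the module's confirmed rule.

```python
from datetime import date, datetime

# Hypothetical holiday set for illustration only. The real module looks up
# holidays from a calendar selected via `continent` and `country`.
HOLIDAYS = {date(2021, 1, 1), date(2021, 12, 25)}


def calendar_flags(ts: datetime) -> dict:
    """Return a few calendar features for a single timestamp."""
    is_weekend = ts.weekday() >= 5  # Monday is 0, Sunday is 6
    is_holiday = ts.date() in HOLIDAYS
    return {
        "month": ts.month,
        "day": ts.day,
        "weekday": ts.weekday(),
        "hour": ts.hour,
        "weekend": is_weekend,
        "holiday": is_holiday,
        # Assumption: a workday is neither a weekend day nor a holiday.
        "workday": not is_weekend and not is_holiday,
    }


# 1 January 2021 was a Friday and is in the hand-written holiday set.
flags = calendar_flags(datetime(2021, 1, 1, 13, 0))
```

In the actual transformer these values are computed for every entry of the time index and returned together as a DataArray.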
class pywatts.modules.feature_extraction.calendar_extraction.CalendarFeature

Bases: enum.IntEnum

The calendar features that the calendar extraction module can extract:

  • year: The year of the time series element.
  • month: The month of the time series element.
  • month_sine: The month of the time series element, encoded with sine.
  • month_cos: The month of the time series element, encoded with cosine.
  • day: The day of the time series element.
  • day_sine: The day of the time series element, encoded with sine.
  • day_cos: The day of the time series element, encoded with cosine.
  • hour: The hour of the time series element.
  • hour_sine: The hour of the time series element, encoded with sine.
  • hour_cos: The hour of the time series element, encoded with cosine.
  • weekday: The weekday of the time series element.
  • weekday_sine: The weekday of the time series element, encoded with sine.
  • weekday_cos: The weekday of the time series element, encoded with cosine.
  • monday: A flag indicating whether the time series element falls on a Monday.
  • tuesday: A flag indicating whether the time series element falls on a Tuesday.
  • wednesday: A flag indicating whether the time series element falls on a Wednesday.
  • thursday: A flag indicating whether the time series element falls on a Thursday.
  • friday: A flag indicating whether the time series element falls on a Friday.
  • saturday: A flag indicating whether the time series element falls on a Saturday.
  • sunday: A flag indicating whether the time series element falls on a Sunday.
  • weekend: A flag indicating whether the time series element falls on a weekend.
  • workday: A flag indicating whether the time series element is a workday.
  • holiday: A flag indicating whether the time series element is a holiday.

day = 5
day_cos = 7
day_sine = 6
friday = 18
holiday = 23
hour = 8
hour_cos = 10
hour_sine = 9
monday = 14
month = 2
month_cos = 4
month_sine = 3
saturday = 19
sunday = 20
thursday = 17
tuesday = 15
wednesday = 16
weekday = 11
weekday_cos = 13
weekday_sine = 12
weekend = 21
workday = 22
year = 1
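
The `*_sine` and `*_cos` features encode a periodic quantity so that its first and last values end up close together. A minimal sketch of one common way to do this is shown below; the helper `cyclic_encode` and the chosen periods (24, 7, 12) are assumptions for illustration, and the exact formula used by the module may differ.

```python
import math
from datetime import datetime


def cyclic_encode(value: float, period: float) -> tuple:
    """Map a periodic value onto the unit circle as (sine, cosine)."""
    angle = 2.0 * math.pi * value / period
    return math.sin(angle), math.cos(angle)


ts = datetime(2021, 6, 15, 18, 0)
hour_sine, hour_cos = cyclic_encode(ts.hour, 24)  # hour in 0..23
weekday_sine, weekday_cos = cyclic_encode(ts.weekday(), 7)  # Monday is 0
month_sine, month_cos = cyclic_encode(ts.month, 12)

# Hours 23 and 0 differ by 23 as plain integers but are neighbours
# on the circle, so their encoded distance is small.
distance = math.dist(cyclic_encode(23, 24), cyclic_encode(0, 24))
```

Using the sine and cosine together keeps the encoding unambiguous, since a single sine value is shared by two different positions in the cycle.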

pywatts.modules.feature_extraction.rolling_base module

class pywatts.modules.feature_extraction.rolling_base.RollingBase(name: str = None, window_size=168, window_size_unit='d', group_by: pywatts.modules.feature_extraction.rolling_base.RollingGroupBy = <RollingGroupBy.No: 1>, continent: str = 'Europe', country: str = 'Germany', closed='left')

Bases: pywatts_pipeline.core.transformer.base.BaseTransformer, abc.ABC

Module which calculates a rolling mean over a specific window size. Note, currently the smallest resolution of the generated profile is one minute.

Parameters:
  • name (str) – Name of the new variable
  • window_size (int) – Window size for which to calculate the mean
  • window_size_unit (str) – Unit of the window size (default: “d” [day])
  • group_by (RollingGroupBy) – How the entries of the time series should be grouped
  • continent (str) – If group_by is WorkdayWeekendAndHoliday: Continent where the country or region is located (important for importing calendar module).
  • country (str) – If group_by is WorkdayWeekendAndHoliday: Country or region to use for holiday calendar (default ‘Germany’)
  • closed (str) – Whether the window is closed on the left or on the right
get_min_data()

Returns the minimum amount of data that this transformer needs.

set_params(**kwargs)

Set parameters of the rolling mean.

Parameters:
  • window_size (int) – Window size for which to calculate the mean
  • window_size_unit (str) – Unit of the window size (default: “d” [day])
  • group_by (RollingGroupBy) – How the entries of the time series should be grouped
  • continent (str) – If group_by is WorkdayWeekendAndHoliday: Continent where the country or region is located (important for importing calendar module).
  • country (str) – If group_by is WorkdayWeekendAndHoliday: Country or region to use for holiday calendar (default ‘Germany’)
  • closed (str) – Whether the window is closed on the left or on the right
transform(x: xarray.core.dataarray.DataArray, **kwargs) → xarray.core.dataarray.DataArray

Calculates a rolling mean

Parameters:x – Xarray dataset containing a timeseries specified by the object’s ‘time_index’
Returns:The xarray DataArray containing the rolling mean.
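
To make the effect of `closed` concrete, here is a small standard-library sketch of a time-based rolling mean. It assumes the usual interval convention (closed on the left means the window is [t - window, t), so the current value is excluded); the function `rolling_mean` is hypothetical and is not the library's implementation.

```python
from datetime import datetime, timedelta


def rolling_mean(times, values, window, closed="left"):
    """Time-based rolling mean over the window ending at each timestamp.

    closed="left":  window is [t - window, t), the current value is excluded.
    closed="right": window is (t - window, t], the current value is included.
    """
    result = []
    for t in times:
        start = t - window
        if closed == "left":
            selected = [v for s, v in zip(times, values) if start <= s < t]
        else:
            selected = [v for s, v in zip(times, values) if start < s <= t]
        # No observations in the window: report None instead of a number.
        result.append(sum(selected) / len(selected) if selected else None)
    return result


times = [datetime(2021, 1, 1) + timedelta(days=i) for i in range(5)]
values = [1.0, 2.0, 3.0, 4.0, 5.0]
left = rolling_mean(times, values, timedelta(days=2), closed="left")
right = rolling_mean(times, values, timedelta(days=2), closed="right")
```

With `closed="left"` each result depends only on earlier observations, which is typically what is wanted when the rolling value is used as a forecasting feature.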
class pywatts.modules.feature_extraction.rolling_base.RollingGroupBy

Bases: enum.IntEnum

An enumeration.

HourOnly = 4
No = 1
WorkdayWeekend = 2
WorkdayWeekendAndHoliday = 3
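
The enumeration carries no description, so the sketch below shows one plausible reading of how each member could assign a timestamp to a group. The enum is re-declared locally with the values listed above; `group_key`, `HOLIDAYS`, and the rule that holidays are grouped with weekend days are assumptions, not confirmed behaviour of the module.

```python
from datetime import date, datetime
from enum import IntEnum


class RollingGroupBy(IntEnum):
    """Local copy of the enumeration values listed above."""
    No = 1
    WorkdayWeekend = 2
    WorkdayWeekendAndHoliday = 3
    HourOnly = 4


HOLIDAYS = {date(2021, 1, 1)}  # hypothetical, hand-written holiday set


def group_key(ts: datetime, group_by: RollingGroupBy):
    """Return the group a timestamp falls into (assumed semantics)."""
    weekend = ts.weekday() >= 5
    if group_by is RollingGroupBy.WorkdayWeekend:
        return "weekend" if weekend else "workday"
    if group_by is RollingGroupBy.WorkdayWeekendAndHoliday:
        # Assumption: holidays are grouped together with weekend days.
        free = weekend or ts.date() in HOLIDAYS
        return "weekend_or_holiday" if free else "workday"
    if group_by is RollingGroupBy.HourOnly:
        return ts.hour
    return "all"  # RollingGroupBy.No: every entry shares one group


# 1 January 2021 is a Friday that is also in the holiday set.
friday_holiday = datetime(2021, 1, 1, 9, 0)
keys = {g.name: group_key(friday_holiday, g) for g in RollingGroupBy}
```

A rolling statistic would then be computed separately over the entries that share a key.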

pywatts.modules.feature_extraction.rolling_kurtosis module

class pywatts.modules.feature_extraction.rolling_kurtosis.RollingKurtosis(name: str = None, window_size=168, window_size_unit='d', group_by: pywatts.modules.feature_extraction.rolling_base.RollingGroupBy = <RollingGroupBy.No: 1>, continent: str = 'Europe', country: str = 'Germany', closed='left')

Bases: pywatts.modules.feature_extraction.rolling_base.RollingBase

Module which calculates a rolling kurtosis over a specific window size. Note, currently the smallest resolution of the generated profile is one minute.

For the documentation of the methods see pywatts.modules.feature_extraction.rolling_base.RollingBase.

Parameters:
  • name (str) – Name of the new variable
  • window_size (int) – Window size for which to calculate the kurtosis
  • window_size_unit (str) – Unit of the window size (default: “d” [day])
  • group_by (RollingGroupBy) – How the entries of the time series should be grouped
  • continent (str) – If group_by is WorkdayWeekendAndHoliday: Continent where the country or region is located (important for importing calendar module).
  • country (str) – If group_by is WorkdayWeekendAndHoliday: Country or region to use for holiday calendar (default ‘Germany’)
  • closed (str) – Whether the window is closed on the left or on the right
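
For readers unfamiliar with the statistic, the following sketch computes a rolling excess kurtosis with the standard library. It is a simplification under stated assumptions: windows are counted in observations rather than time units, the biased (population) moment estimator is used, and the helper names `excess_kurtosis` and `rolling` are hypothetical. The module's estimator may differ.

```python
def excess_kurtosis(sample):
    """Biased excess kurtosis: m4 / m2**2 - 3, using central moments."""
    n = len(sample)
    mean = sum(sample) / n
    m2 = sum((x - mean) ** 2 for x in sample) / n
    m4 = sum((x - mean) ** 4 for x in sample) / n
    if m2 == 0.0:
        return None  # undefined for a constant window
    return m4 / m2 ** 2 - 3.0


def rolling(values, size, statistic):
    """Apply `statistic` to every window of `size` consecutive values."""
    return [statistic(values[i - size:i]) for i in range(size, len(values) + 1)]


values = [1.0, 2.0, 3.0, 4.0, 5.0, 50.0]
result = rolling(values, 5, excess_kurtosis)
```

The second window contains the outlier 50.0, so its kurtosis is higher than that of the evenly spaced first window.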

pywatts.modules.feature_extraction.rolling_mean module

class pywatts.modules.feature_extraction.rolling_mean.RollingMean(name: str = 'RollingMean', window_size=168, window_size_unit='d', group_by: pywatts.modules.feature_extraction.rolling_base.RollingGroupBy = <RollingGroupBy.No: 1>, continent: str = 'Europe', country: str = 'Germany', closed='left', alpha=None)

Bases: pywatts.modules.feature_extraction.rolling_base.RollingBase

Module which calculates a rolling mean over a specific window size. Note, currently the smallest resolution of the generated profile is one minute.

For the documentation of the methods see pywatts.modules.feature_extraction.rolling_base.RollingBase.

Parameters:
  • name (str) – Name of the new variable
  • window_size (int) – Window size for which to calculate the mean
  • window_size_unit (str) – Unit of the window size (default: “d” [day])
  • group_by (RollingGroupBy) – How the entries of the time series should be grouped
  • continent (str) – If group_by is WorkdayWeekendAndHoliday: Continent where the country or region is located (important for importing calendar module).
  • country (str) – If group_by is WorkdayWeekendAndHoliday: Country or region to use for holiday calendar (default ‘Germany’)
  • closed (str) – Whether the window is closed on the left or on the right
  • alpha (float) – Alpha value for weighting the most recent value if exponential weighting should be applied. If alpha is not set, the mean of the sliding window is calculated.
set_params(alpha: Optional[float] = None, **kwargs)

Set parameters of the rolling mean.

Parameters:
  • window_size (int) – Window size for which to calculate the mean
  • window_size_unit (str) – Unit of the window size (default: “d” [day])
  • group_by (RollingGroupBy) – How the entries of the time series should be grouped
  • continent (str) – If group_by is WorkdayWeekendAndHoliday: Continent where the country or region is located (important for importing calendar module).
  • country (str) – If group_by is WorkdayWeekendAndHoliday: Country or region to use for holiday calendar (default ‘Germany’)
  • closed (str) – Whether the window is closed on the left or on the right
  • alpha (float) – Alpha value for weighting the most recent value if exponential weighting should be applied. If alpha is not set, the mean of the sliding window is calculated.
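
The role of `alpha` can be illustrated with a simple recursive exponentially weighted mean. This is a sketch under an assumption: it uses the recursion y[t] = alpha * x[t] + (1 - alpha) * y[t - 1], which is one common definition, and the function name is hypothetical. The module may weight values differently.

```python
def exponentially_weighted_mean(values, alpha):
    """Recursive exponential weighting.

    y[0] = x[0]
    y[t] = alpha * x[t] + (1 - alpha) * y[t - 1]
    """
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in the interval (0, 1]")
    result = []
    for x in values:
        if not result:
            result.append(float(x))  # first value starts the recursion
        else:
            result.append(alpha * x + (1.0 - alpha) * result[-1])
    return result


smoothed = exponentially_weighted_mean([10.0, 20.0, 30.0], alpha=0.5)
unsmoothed = exponentially_weighted_mean([10.0, 20.0, 30.0], alpha=1.0)
```

A larger `alpha` gives more weight to the most recent value; with `alpha=1.0` the output simply follows the input.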

pywatts.modules.feature_extraction.rolling_skewness module

class pywatts.modules.feature_extraction.rolling_skewness.RollingSkewness(name: str = None, window_size=168, window_size_unit='d', group_by: pywatts.modules.feature_extraction.rolling_base.RollingGroupBy = <RollingGroupBy.No: 1>, continent: str = 'Europe', country: str = 'Germany', closed='left')

Bases: pywatts.modules.feature_extraction.rolling_base.RollingBase

Module which calculates a rolling skewness over a specific window size. Note, currently the smallest resolution of the generated profile is one minute.

For the documentation of the methods see pywatts.modules.feature_extraction.rolling_base.RollingBase.

Parameters:
  • name (str) – Name of the new variable
  • window_size (int) – Window size for which to calculate the skewness
  • window_size_unit (str) – Unit of the window size (default: “d” [day])
  • group_by (RollingGroupBy) – How the entries of the time series should be grouped
  • continent (str) – If group_by is WorkdayWeekendAndHoliday: Continent where the country or region is located (important for importing calendar module).
  • country (str) – If group_by is WorkdayWeekendAndHoliday: Country or region to use for holiday calendar (default ‘Germany’)
  • closed (str) – Whether the window is closed on the left or on the right
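
As a brief illustration of what the skewness of a window measures, the sketch below uses the biased moment estimator m3 / m2^1.5 on plain Python lists. The function name `skewness` and the choice of estimator are assumptions; the module's estimator and its time-based windowing are not reproduced here.

```python
def skewness(sample):
    """Biased sample skewness: m3 / m2**1.5, using central moments."""
    n = len(sample)
    mean = sum(sample) / n
    m2 = sum((x - mean) ** 2 for x in sample) / n
    m3 = sum((x - mean) ** 3 for x in sample) / n
    if m2 == 0.0:
        return None  # undefined for a constant window
    return m3 / m2 ** 1.5


symmetric = skewness([1.0, 2.0, 3.0, 4.0, 5.0])
right_tailed = skewness([1.0, 2.0, 3.0, 4.0, 10.0])
```

A symmetric window has skewness zero, while a window with a single large value has positive skewness.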

pywatts.modules.feature_extraction.rolling_variance module

class pywatts.modules.feature_extraction.rolling_variance.RollingVariance(name: str = None, window_size=168, window_size_unit='d', group_by: pywatts.modules.feature_extraction.rolling_base.RollingGroupBy = <RollingGroupBy.No: 1>, continent: str = 'Europe', country: str = 'Germany', closed='left')

Bases: pywatts.modules.feature_extraction.rolling_base.RollingBase

Module which calculates a rolling variance over a specific window size. Note, currently the smallest resolution of the generated profile is one minute.

For the documentation of the methods see pywatts.modules.feature_extraction.rolling_base.RollingBase.

Parameters:
  • name (str) – Name of the new variable
  • window_size (int) – Window size for which to calculate the variance
  • window_size_unit (str) – Unit of the window size (default: “d” [day])
  • group_by (RollingGroupBy) – How the entries of the time series should be grouped
  • continent (str) – If group_by is WorkdayWeekendAndHoliday: Continent where the country or region is located (important for importing calendar module).
  • country (str) – If group_by is WorkdayWeekendAndHoliday: Country or region to use for holiday calendar (default ‘Germany’)
  • closed (str) – Whether the window is closed on the left or on the right
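
A rolling variance can be sketched with the standard library's `statistics` module. The example assumes count-based windows and the sample variance (divisor n - 1); whether the module uses the sample or the population variance is not stated here, and `rolling_variance` is a hypothetical helper.

```python
import statistics


def rolling_variance(values, size):
    """Sample variance (divisor n - 1) of every window of `size` values."""
    return [
        statistics.variance(values[i - size:i])
        for i in range(size, len(values) + 1)
    ]


values = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
variances = rolling_variance(values, 4)
```

Swapping `statistics.variance` for `statistics.pvariance` would give the population variance instead.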

pywatts.modules.feature_extraction.statistics_extraction module

class pywatts.modules.feature_extraction.statistics_extraction.StatisticExtraction(name: str = 'statistics', features: List[pywatts.modules.feature_extraction.statistics_extraction.StatisticFeature] = None, dim='horizon')

Bases: pywatts_pipeline.core.transformer.base.BaseTransformer

This module extracts statistical features based on samples of time series defined by a DataArray input. It can calculate the min, max, std, and mean of the samples.

Parameters:
  • name (str) – Name of this module.
  • features (Optional[List[StatisticFeature]]) – The features that should be extracted. The following features exist: min, max, std, and mean. (Default: [min, max, std, mean])
  • dim (str) – The dimension on which the statistics should be extracted
transform(x: xarray.core.dataarray.DataArray) → xarray.core.dataarray.DataArray

Add statistic features to xarray dataarray as configured.

Parameters:x – Xarray dataarray containing a time series
Returns:The xarray dataarray with statistic features.
class pywatts.modules.feature_extraction.statistics_extraction.StatisticFeature

Bases: enum.IntEnum

The available statistic features, that are extractable by the statistic extraction module:

min: Extracting the min of the time series sample. max: Extracting the max of the time series sample. std: Extracting the std of the time series sample. mean: Extracting the mean of the time series sample.

max = 2
mean = 4
min = 1
std = 3
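
The four statistic features can be pictured as a reduction of each sample along one dimension (by default `horizon`). The sketch below does this for plain lists with the standard library; `extract_statistics` is a hypothetical helper, and the use of the population standard deviation is an assumption.

```python
import statistics


def extract_statistics(sample):
    """Reduce one sample (the values along a single dimension)."""
    return {
        "min": min(sample),
        "max": max(sample),
        # Assumption: population standard deviation (divisor n).
        "std": statistics.pstdev(sample),
        "mean": statistics.fmean(sample),
    }


# Each inner list stands for the values of one sample along `horizon`.
samples = [[1.0, 2.0, 3.0, 4.0], [2.0, 2.0, 2.0, 2.0]]
features = [extract_statistics(s) for s in samples]
```

Each sample of several values is reduced to one value per requested feature.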

pywatts.modules.feature_extraction.trend_extraction module

This module contains the trend extraction

class pywatts.modules.feature_extraction.trend_extraction.TrendExtraction(period, length, indexes: List[str] = None, name: str = 'trend_extractor')

Bases: pywatts_pipeline.core.transformer.base.BaseTransformer

Module to extract a trend which can be specified through a period and a length, where the length indicates the length of the time step going back (e.g. 7 for a weekly trend with daily data) and the period indicates the number of times this length is used (e.g. 10 to extract the last 10 weeks).

Parameters:
  • period (int) – Length of one period
  • length (int) – Number of periods which should be extracted
  • indexes (List[str]) – Index over which the trend is extracted (default: all time based indexes)
  • name (str) – Name of the module
get_min_data()

Returns the minimum amount of data that this transformer needs.

transform(x: xarray.core.dataarray.DataArray) → xarray.core.dataarray.DataArray

Extract trend values

Parameters:x (xr.DataArray) – input xarray DataArray
Returns:a dataset containing the trend information
Return type:xr.DataArray
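
The idea of looking back a fixed step a fixed number of times can be sketched on a plain list. The names `step` and `count` are deliberately hypothetical and are not mapped onto the `period` and `length` parameters; the function illustrates the concept only and is not the module's implementation.

```python
def extract_trend(values, step, count):
    """For each position t, collect the values `step`, 2 * `step`, ...,
    `count` * `step` positions earlier, most recent first."""
    min_data = step * count  # positions before this have no full history
    rows = []
    for t in range(min_data, len(values)):
        rows.append([values[t - k * step] for k in range(1, count + 1)])
    return rows


values = list(range(10))
rows = extract_trend(values, step=2, count=3)
```

The first `step * count` positions lack a complete history, which is one reason a transformer of this kind would need a minimum amount of data before it can produce output.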

Module contents