pywatts.modules.feature_extraction package

Submodules

pywatts.modules.feature_extraction.calendar_extraction module

class pywatts.modules.feature_extraction.calendar_extraction.CalendarExtraction(name: str = 'CalendarExtraction', continent: str = 'Europe', country: str = 'Germany', features: Optional[List[pywatts.modules.feature_extraction.calendar_extraction.CalendarFeature]] = None)

Bases: pywatts.core.base.BaseTransformer

This pipeline step extracts date features from a timeseries defined by a DataArray input. It can calculate the year, month, month_sine, month_cos, day, day_sine, day_cos, hour, hour_sine, hour_cos, weekday, weekday_sine, weekday_cos, monday, tuesday, wednesday, thursday, friday, saturday, sunday, weekend, workday, and holiday features. For the holiday feature, it is important to set the correct continent and country or region, e.g. ‘Europe’ and ‘BadenWurttemberg’ or ‘Germany’.

Parameters:
  • name (str) – Name of this processing step.
  • continent (str) – Continent where the country or region is located (important for importing calendar module).
  • country (str) – Country or region to use for holiday calendar (default ‘Germany’)
  • features (Optional[List[CalendarFeature]]) – The features that should be extracted. The following features exist: year, month, month_sine, month_cos, day, day_sine, day_cos, hour, hour_sine, hour_cos, weekday, weekday_sine, weekday_cos, monday, tuesday, wednesday, thursday, friday, saturday, sunday, weekend, workday, holiday. (Default: month, day, weekday, hour)
Raises:

WrongParameterException – If ‘continent’ and/or ‘country’ is invalid.

get_params() → Dict[str, object]

Get parameters of this calendar extraction processing step.

Returns: JSON dict containing the parameters.
set_params(continent: Optional[str] = None, country: Optional[str] = None, features: Optional[List[pywatts.modules.feature_extraction.calendar_extraction.CalendarFeature]] = None)

Set parameters of the calendar extraction processing step.

Parameters:
  • continent (str) – Continent where the country or region is located (important for importing calendar module).
  • country (str) – Country or region to use for holiday calendar (default ‘Germany’)
  • features (Optional[List[CalendarFeature]]) – A list containing all features that should be calculated. By default, all are calculated.
Raises:

AttributeError – If ‘continent’ and/or ‘country’ is invalid.

transform(x: xarray.core.dataarray.DataArray) → xarray.core.dataarray.DataArray

Add date features to xarray dataset as configured.

Parameters: x – xarray DataArray containing a timeseries specified by the object’s ‘time_index’
Returns: The xarray DataArray with date features added.
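
As a rough illustration of the kind of date features described above, the following sketch derives a few of them from a single timestamp using only the Python standard library. It does not call pyWATTS. The function name `calendar_features` and the `HOLIDAYS` set are hypothetical, the weekday numbering follows Python's `datetime` convention (Monday is 0), and the workday rule is an assumption rather than the library's documented behaviour.

```python
from datetime import date, datetime

# Hypothetical holiday set; a real pipeline would take this from a holiday calendar
# selected through the continent and country parameters.
HOLIDAYS = {date(2021, 1, 1)}


def calendar_features(ts: datetime) -> dict:
    """Derive a few calendar features from one timestamp."""
    weekday = ts.weekday()  # Monday == 0 ... Sunday == 6 (Python convention)
    is_weekend = weekday >= 5
    is_holiday = ts.date() in HOLIDAYS
    return {
        "month": ts.month,
        "day": ts.day,
        "weekday": weekday,
        "hour": ts.hour,
        "weekend": is_weekend,
        "holiday": is_holiday,
        # Assumption: a workday is neither a weekend day nor a holiday.
        "workday": not (is_weekend or is_holiday),
    }


features = calendar_features(datetime(2021, 1, 1, 13, 0))
```

Here 1 January 2021 is a Friday that is also in the hypothetical holiday set, so the sketch flags it as a holiday and not as a workday.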
class pywatts.modules.feature_extraction.calendar_extraction.CalendarFeature

Bases: enum.IntEnum

The available calendar features that can be extracted by the calendar extraction module:

  • year – Year of the time series element.
  • month – Month of the time series element.
  • month_sine – Month of the time series element, encoded with sine.
  • month_cos – Month of the time series element, encoded with cosine.
  • day – Day of the time series element.
  • day_sine – Day of the time series element, encoded with sine.
  • day_cos – Day of the time series element, encoded with cosine.
  • hour – Hour of the time series element.
  • hour_sine – Hour of the time series element, encoded with sine.
  • hour_cos – Hour of the time series element, encoded with cosine.
  • weekday – Weekday of the time series element.
  • weekday_sine – Weekday of the time series element, encoded with sine.
  • weekday_cos – Weekday of the time series element, encoded with cosine.
  • monday – Flag indicating whether the time series element is a Monday.
  • tuesday – Flag indicating whether the time series element is a Tuesday.
  • wednesday – Flag indicating whether the time series element is a Wednesday.
  • thursday – Flag indicating whether the time series element is a Thursday.
  • friday – Flag indicating whether the time series element is a Friday.
  • saturday – Flag indicating whether the time series element is a Saturday.
  • sunday – Flag indicating whether the time series element is a Sunday.
  • weekend – Flag indicating whether the time series element falls on a weekend.
  • workday – Flag indicating whether the time series element is a workday.
  • holiday – Flag indicating whether the time series element is a holiday.

day = 5
day_cos = 7
day_sine = 6
friday = 18
holiday = 23
hour = 8
hour_cos = 10
hour_sine = 9
monday = 14
month = 2
month_cos = 4
month_sine = 3
saturday = 19
sunday = 20
thursday = 17
tuesday = 15
wednesday = 16
weekday = 11
weekday_cos = 13
weekday_sine = 12
weekend = 21
workday = 22
year = 1
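
The sine and cosine variants listed above encode a periodic quantity so that the end of a cycle lies next to its start. A minimal sketch of this idea, using only the standard library, is shown below. The helper `cyclic_encode` is hypothetical, and the exact scaling or offset pyWATTS applies (for example whether months start at 0 or 1) is not stated in this reference.

```python
import math


def cyclic_encode(value: float, period: float) -> tuple:
    """Map a periodic value onto the unit circle as (sine, cosine)."""
    angle = 2.0 * math.pi * value / period
    return math.sin(angle), math.cos(angle)


hour_sine_23, hour_cos_23 = cyclic_encode(23, 24)
hour_sine_0, hour_cos_0 = cyclic_encode(0, 24)

# Euclidean distance between 23:00 and 00:00 in the encoded space.
gap = math.hypot(hour_sine_23 - hour_sine_0, hour_cos_23 - hour_cos_0)
```

With the raw hour value, 23 and 0 are 23 units apart. In the encoded space they are as close as any two neighbouring hours, which is the reason for providing both the sine and the cosine component.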

pywatts.modules.feature_extraction.rolling_base module

class pywatts.modules.feature_extraction.rolling_base.RollingBase(name: str = None, window_size=168, window_size_unit='d', group_by: pywatts.modules.feature_extraction.rolling_base.RollingGroupBy = <RollingGroupBy.No: 1>, continent: str = 'Europe', country: str = 'Germany', closed='left')

Bases: pywatts.core.base.BaseTransformer, abc.ABC

Module which calculates a rolling mean over a specific window size. Note, currently the smallest resolution of the generated profile is one minute.

Parameters:
  • name (str) – Name of the new variable
  • window_size (int) – Window size for which to calculate the mean
  • window_size_unit (str) – Unit of the window size (default: “d” [day])
  • group_by (RollingGroupBy) – How the entries of the time series should be grouped
  • continent (str) – If group_by is WorkdayWeekendAndHoliday: Continent where the country or region is located (important for importing calendar module).
  • country (str) – If group_by is WorkdayWeekendAndHoliday: Country or region to use for holiday calendar (default ‘Germany’)
  • closed (str) – Whether the window is closed on the left or on the right (default ‘left’)
get_min_data()
get_params() → Dict[str, object]

Get the parameters of the rolling mean module as dict

set_params(window_size: Optional[int] = None, window_size_unit: Optional[str] = None, group_by: Optional[pywatts.modules.feature_extraction.rolling_base.RollingGroupBy] = None, continent: Optional[str] = None, country: Optional[str] = None, closed: Optional[str] = None)

Set parameters of the rolling mean.

Parameters:
  • window_size (int) – Window size for which to calculate the mean
  • window_size_unit (str) – Unit of the window size (default: “d” [day])
  • group_by (RollingGroupBy) – How the entries of the time series should be grouped
  • continent (str) – If group_by is WorkdayWeekendAndHoliday: Continent where the country or region is located (important for importing calendar module).
  • country (str) – If group_by is WorkdayWeekendAndHoliday: Country or region to use for holiday calendar (default ‘Germany’)
  • closed (str) – Whether the window is closed on the left or on the right
transform(x: xarray.core.dataarray.DataArray) → xarray.core.dataarray.DataArray

Calculates a rolling mean

Parameters: x – xarray DataArray containing a timeseries specified by the object’s ‘time_index’
Returns: The xarray DataArray containing the calculated rolling values.
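
To make the effect of the window size and the closed parameter concrete, the sketch below computes a trailing mean over a time-based window in plain Python. It is not the pyWATTS implementation. The function `rolling_mean` is hypothetical, and the reading of closed (left excludes the current sample, right includes it) follows the convention used by pandas time-based rolling windows, which is an assumption about this module.

```python
from datetime import datetime, timedelta
from statistics import fmean


def rolling_mean(samples, window: timedelta, closed: str = "left"):
    """Mean over a trailing time window for each timestamp.

    closed="left"  uses the interval [t - window, t), excluding the current sample.
    closed="right" uses the interval (t - window, t], including the current sample.
    """
    result = []
    for t, _ in samples:
        if closed == "left":
            values = [v for s, v in samples if t - window <= s < t]
        elif closed == "right":
            values = [v for s, v in samples if t - window < s <= t]
        else:
            raise ValueError("closed must be 'left' or 'right'")
        # An empty window yields None instead of a number.
        result.append(fmean(values) if values else None)
    return result


start = datetime(2021, 1, 1)
samples = [(start + timedelta(days=i), float(i)) for i in range(5)]  # values 0..4
left = rolling_mean(samples, timedelta(days=2), closed="left")
right = rolling_mean(samples, timedelta(days=2), closed="right")
```

With a left-closed window the value at a timestamp depends only on earlier samples, which avoids leaking the current observation into its own feature.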
class pywatts.modules.feature_extraction.rolling_base.RollingGroupBy

Bases: enum.IntEnum

An enumeration.

No = 1
WorkdayWeekend = 2
WorkdayWeekendAndHoliday = 3
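
One plausible reading of the WorkdayWeekend grouping is that the rolling statistic for a timestamp is computed only from earlier samples of the same day type. The sketch below illustrates that reading with the standard library. The function `grouped_rolling_mean` is hypothetical, and how pyWATTS actually applies the grouping is not described in this reference.

```python
from datetime import datetime, timedelta
from statistics import fmean


def day_type(ts: datetime) -> str:
    """Classify a timestamp as 'weekend' (Saturday, Sunday) or 'workday'."""
    return "weekend" if ts.weekday() >= 5 else "workday"


def grouped_rolling_mean(samples, window: timedelta):
    """Trailing mean that only averages earlier samples of the same day type."""
    result = []
    for t, _ in samples:
        peers = [
            v for s, v in samples
            if t - window <= s < t and day_type(s) == day_type(t)
        ]
        result.append(fmean(peers) if peers else None)
    return result


start = datetime(2021, 1, 4)  # a Monday
samples = [(start + timedelta(days=i), float(i)) for i in range(8)]
means = grouped_rolling_mean(samples, timedelta(days=7))
```

In this example the Sunday value is averaged only over the preceding Saturday, while the second Monday is averaged over the five preceding workdays.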

pywatts.modules.feature_extraction.rolling_kurtosis module

class pywatts.modules.feature_extraction.rolling_kurtosis.RollingKurtosis(name: str = None, window_size=168, window_size_unit='d', group_by: pywatts.modules.feature_extraction.rolling_base.RollingGroupBy = <RollingGroupBy.No: 1>, continent: str = 'Europe', country: str = 'Germany', closed='left')

Bases: pywatts.modules.feature_extraction.rolling_base.RollingBase

Module which calculates a rolling kurtosis over a specific window size. Note, currently the smallest resolution of the generated profile is one minute.

For the documentation of the methods see pywatts.modules.feature_extraction.rolling_base.RollingBase.

Parameters:
  • name (str) – Name of the new variable
  • window_size (int) – Window size for which to calculate the kurtosis
  • window_size_unit (str) – Unit of the window size (default: “d” [day])
  • group_by (RollingGroupBy) – How the entries of the time series should be grouped
  • continent (str) – If group_by is WorkdayWeekendAndHoliday: Continent where the country or region is located (important for importing calendar module).
  • country (str) – If group_by is WorkdayWeekendAndHoliday: Country or region to use for holiday calendar (default ‘Germany’)
  • closed (str) – Whether the window is closed on the left or on the right (default ‘left’)
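
For orientation, kurtosis over a window can be sketched as follows. The example uses the population (biased) excess kurtosis and a count-based window for brevity. Both choices are assumptions: the library may use a bias-corrected estimator and a time-based window, and the function names here are hypothetical.

```python
from statistics import fmean


def excess_kurtosis(values):
    """Population excess kurtosis: m4 / m2**2 - 3."""
    n = len(values)
    mu = fmean(values)
    m2 = sum((v - mu) ** 2 for v in values) / n  # second central moment
    m4 = sum((v - mu) ** 4 for v in values) / n  # fourth central moment
    if m2 == 0:
        raise ValueError("kurtosis is undefined for constant data")
    return m4 / m2 ** 2 - 3.0


def rolling_kurtosis(values, size):
    """Excess kurtosis of each full window of `size` consecutive values."""
    return [excess_kurtosis(values[i - size:i]) for i in range(size, len(values) + 1)]


windows = rolling_kurtosis([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], 5)
```

Both windows in this example are evenly spaced, so they share the same negative excess kurtosis, reflecting a distribution flatter than a normal one.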

pywatts.modules.feature_extraction.rolling_mean module

class pywatts.modules.feature_extraction.rolling_mean.RollingMean(name: str = None, window_size=168, window_size_unit='d', group_by: pywatts.modules.feature_extraction.rolling_base.RollingGroupBy = <RollingGroupBy.No: 1>, continent: str = 'Europe', country: str = 'Germany', closed='left')

Bases: pywatts.modules.feature_extraction.rolling_base.RollingBase

Module which calculates a rolling mean over a specific window size. Note, currently the smallest resolution of the generated profile is one minute.

For the documentation of the methods see pywatts.modules.feature_extraction.rolling_base.RollingBase.

Parameters:
  • name (str) – Name of the new variable
  • window_size (int) – Window size for which to calculate the mean
  • window_size_unit (str) – Unit of the window size (default: “d” [day])
  • group_by (RollingGroupBy) – How the entries of the time series should be grouped
  • continent (str) – If group_by is WorkdayWeekendAndHoliday: Continent where the country or region is located (important for importing calendar module).
  • country (str) – If group_by is WorkdayWeekendAndHoliday: Country or region to use for holiday calendar (default ‘Germany’)
  • closed (str) – Whether the window is closed on the left or on the right (default ‘left’)

pywatts.modules.feature_extraction.rolling_skewness module

class pywatts.modules.feature_extraction.rolling_skewness.RollingSkewness(name: str = None, window_size=168, window_size_unit='d', group_by: pywatts.modules.feature_extraction.rolling_base.RollingGroupBy = <RollingGroupBy.No: 1>, continent: str = 'Europe', country: str = 'Germany', closed='left')

Bases: pywatts.modules.feature_extraction.rolling_base.RollingBase

Module which calculates a rolling skewness over a specific window size. Note, currently the smallest resolution of the generated profile is one minute.

For the documentation of the methods see pywatts.modules.feature_extraction.rolling_base.RollingBase.

Parameters:
  • name (str) – Name of the new variable
  • window_size (int) – Window size for which to calculate the skewness
  • window_size_unit (str) – Unit of the window size (default: “d” [day])
  • group_by (RollingGroupBy) – How the entries of the time series should be grouped
  • continent (str) – If group_by is WorkdayWeekendAndHoliday: Continent where the country or region is located (important for importing calendar module).
  • country (str) – If group_by is WorkdayWeekendAndHoliday: Country or region to use for holiday calendar (default ‘Germany’)
  • closed (str) – Whether the window is closed on the left or on the right (default ‘left’)
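
As a point of reference, skewness of a window can be sketched with the population (biased) moment estimator shown below. This is an assumption for illustration: the library may apply a bias correction, and the function name `skewness` is hypothetical.

```python
from statistics import fmean


def skewness(values):
    """Population skewness: m3 / m2**1.5."""
    n = len(values)
    mu = fmean(values)
    m2 = sum((v - mu) ** 2 for v in values) / n  # second central moment
    m3 = sum((v - mu) ** 3 for v in values) / n  # third central moment
    if m2 == 0:
        raise ValueError("skewness is undefined for constant data")
    return m3 / m2 ** 1.5


symmetric = skewness([1.0, 2.0, 3.0])
right_tailed = skewness([0.0, 0.0, 3.0])
```

A symmetric window gives a skewness of zero, while a window with a single large value gives a positive skewness.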

pywatts.modules.feature_extraction.rolling_variance module

class pywatts.modules.feature_extraction.rolling_variance.RollingVariance(name: str = None, window_size=168, window_size_unit='d', group_by: pywatts.modules.feature_extraction.rolling_base.RollingGroupBy = <RollingGroupBy.No: 1>, continent: str = 'Europe', country: str = 'Germany', closed='left')

Bases: pywatts.modules.feature_extraction.rolling_base.RollingBase

Module which calculates a rolling variance over a specific window size. Note, currently the smallest resolution of the generated profile is one minute.

For the documentation of the methods see pywatts.modules.feature_extraction.rolling_base.RollingBase.

Parameters:
  • name (str) – Name of the new variable
  • window_size (int) – Window size for which to calculate the variance
  • window_size_unit (str) – Unit of the window size (default: “d” [day])
  • group_by (RollingGroupBy) – How the entries of the time series should be grouped
  • continent (str) – If group_by is WorkdayWeekendAndHoliday: Continent where the country or region is located (important for importing calendar module).
  • country (str) – If group_by is WorkdayWeekendAndHoliday: Country or region to use for holiday calendar (default ‘Germany’)
  • closed (str) – Whether the window is closed on the left or on the right (default ‘left’)
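
This reference does not say whether the variance is the sample variance (divisor n - 1) or the population variance (divisor n). The sketch below shows both over a count-based window using the standard library, so the difference is visible. The function `rolling_variance` and its `sample` flag are hypothetical and not part of pyWATTS.

```python
from statistics import pvariance, variance


def rolling_variance(values, size, sample=True):
    """Variance of each full window of `size` consecutive values.

    sample=True divides by n - 1, sample=False divides by n.
    """
    stat = variance if sample else pvariance
    return [stat(values[i - size:i]) for i in range(size, len(values) + 1)]


data = [1.0, 2.0, 4.0, 7.0]
sample_var = rolling_variance(data, 3)
population_var = rolling_variance(data, 3, sample=False)
```

For short windows the two estimators differ noticeably, so it is worth checking which one a downstream model expects.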

pywatts.modules.feature_extraction.trend_extraction module

This module contains the trend extraction

class pywatts.modules.feature_extraction.trend_extraction.TrendExtraction(period, length, indexes: List[str] = None, name: str = 'trend_extractor')

Bases: pywatts.core.base.BaseTransformer

Module to extract a trend which can be specified through a period and a length, where the length indicates the length of the time step going back (e.g. 7 for a weekly trend with daily data) and period indicates the number of times this length is used (e.g. 10 to extract the last 10 weeks).

Parameters:
  • period (int) – Length of one period
  • length (int) – Number of periods which should be extracted
  • indexes (List[str]) – Index over which the trend is extracted (default: all time based indexes)
  • name (str) – Name of the module
get_min_data()
get_params() → Dict[str, object]

Get all parameters of the trend extraction

Returns: Dict with params
Return type: Dict[str, object]
set_params(period=None, length=None, indexes: List[str] = None)

Set the parameters

Parameters:
  • period (int) – Length of one period
  • length (int) – Number of periods which should be extracted
  • indexes (List[str]) – Time index
transform(x: xarray.core.dataarray.DataArray) → xarray.core.dataarray.DataArray

Extract trend values

Parameters: x (xr.DataArray) – input xarray DataArray
Returns: a dataset containing the trend information
Return type: xr.DataArray
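
The idea of a trend built from regularly spaced past values can be sketched in plain Python as below. To avoid guessing how period and length map onto the two quantities, the sketch uses its own hypothetical names `step` (distance between lags) and `count` (number of lags). The fill value for missing history is also an assumption, and the code does not use pyWATTS or xarray.

```python
def lagged_values(series, step, count, fill=0.0):
    """For each position t, collect series[t - step], series[t - 2*step], ...

    Returns one row per position with `count` entries; lags that reach
    before the start of the series are replaced by `fill`.
    """
    rows = []
    for t in range(len(series)):
        row = []
        for k in range(1, count + 1):
            idx = t - k * step
            row.append(series[idx] if idx >= 0 else fill)
        rows.append(row)
    return rows


daily = list(range(1, 22))  # 21 days of hypothetical data: 1..21
trend = lagged_values(daily, step=7, count=2)
```

For the last day in this example the row holds the values from one and two weeks earlier, while early positions fall back to the fill value because no history is available yet.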

Module contents