pywatts.modules.feature_extraction package¶
Submodules¶
pywatts.modules.feature_extraction.calendar_extraction module¶
-
class
pywatts.modules.feature_extraction.calendar_extraction.
CalendarExtraction
(name: str = 'CalendarExtraction', continent: str = 'Europe', country: str = 'Germany', features: Optional[List[pywatts.modules.feature_extraction.calendar_extraction.CalendarFeature]] = None)¶ Bases:
pywatts_pipeline.core.transformer.base.BaseTransformer
This pipeline step extracts date features from a time series defined by a DataArray input. It can calculate the year, month, month_sine, month_cos, day, day_sine, day_cos, hour, hour_sine, hour_cos, weekday, weekday_sine, weekday_cos, monday, tuesday, wednesday, thursday, friday, saturday, sunday, weekend, workday, and holiday. For the holiday feature, it is important to set the correct continent and country/region, e.g. ‘Europe’ and ‘BadenWurttemberg’ or ‘Germany’.
Parameters: - name (str) – Name of this processing step.
- continent (str) – Continent where the country or region is located (important for importing calendar module).
- country (str) – Country or region to use for holiday calendar (default ‘Germany’)
- features (Optional[List[CalendarFeature]]) – The features that should be extracted. The following features exist: year, month, month_sine, month_cos, day, day_sine, day_cos, hour, hour_sine, hour_cos, weekday, weekday_sine, weekday_cos, monday, tuesday, wednesday, thursday, friday, saturday, sunday, weekend, workday, holiday. (Default: month, day, weekday, hour)
Raises: WrongParameterException – If ‘continent’ and/or ‘country’ is invalid.
-
set_params
(**kwargs)¶ Set parameters of the calendar extraction processing step.
Parameters: - continent (str) – Continent where the country or region is located (important for importing calendar module).
- country (str) – Country or region to use for holiday calendar (default ‘Germany’)
- features (List[str]) – A list containing all features that should be calculated. By default, all are calculated.
Raises: AttributeError – If ‘continent’ and/or ‘country’ is invalid.
-
transform
(x: xarray.core.dataarray.DataArray) → xarray.core.dataarray.DataArray¶ Add date features to xarray dataset as configured.
Parameters: x – Xarray dataset containing a timeseries specified by the object’s ‘time_index’ Returns: The xarray dataset with date features added.
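The plain (non-encoded) features can be illustrated without pyWATTS. The following is a minimal sketch using only the Python standard library; the helper name calendar_features is hypothetical, and the Monday == 0 weekday convention is Python's own, which is an assumption about how the module numbers weekdays.

```python
from datetime import datetime


def calendar_features(ts: datetime) -> dict:
    """Return a few plain calendar features for a single timestamp."""
    weekday = ts.weekday()  # Python convention: Monday == 0 ... Sunday == 6
    return {
        "year": ts.year,
        "month": ts.month,
        "day": ts.day,
        "hour": ts.hour,
        "weekday": weekday,
        "monday": weekday == 0,   # flag features are simple booleans here
        "weekend": weekday >= 5,  # Saturday or Sunday
    }


# 1 May 2021 was a Saturday.
features = calendar_features(datetime(2021, 5, 1, 13))
print(features)
```

The holiday and workday flags are left out of this sketch because they depend on the holiday calendar selected through continent and country.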
-
class
pywatts.modules.feature_extraction.calendar_extraction.
CalendarFeature
¶ Bases:
enum.IntEnum
The available calendar features, that are extractable by the calendar extraction module:
- year: Extracts the year of the time series element.
- month: Extracts the month of the time series element.
- month_sine: Extracts the month of the time series element and encodes it with sine.
- month_cos: Extracts the month of the time series element and encodes it with cosine.
- day: Extracts the day of the time series element.
- day_sine: Extracts the day of the time series element and encodes it with sine.
- day_cos: Extracts the day of the time series element and encodes it with cosine.
- hour: Extracts the hour of the time series element.
- hour_sine: Extracts the hour of the time series element and encodes it with sine.
- hour_cos: Extracts the hour of the time series element and encodes it with cosine.
- weekday: Extracts the weekday of the time series element.
- weekday_sine: Extracts the weekday of the time series element and encodes it with sine.
- weekday_cos: Extracts the weekday of the time series element and encodes it with cosine.
- monday: Extracts a flag indicating whether the time series element is a Monday.
- tuesday: Extracts a flag indicating whether the time series element is a Tuesday.
- wednesday: Extracts a flag indicating whether the time series element is a Wednesday.
- thursday: Extracts a flag indicating whether the time series element is a Thursday.
- friday: Extracts a flag indicating whether the time series element is a Friday.
- saturday: Extracts a flag indicating whether the time series element is a Saturday.
- sunday: Extracts a flag indicating whether the time series element is a Sunday.
- weekend: Extracts a flag indicating whether the time series element falls on a weekend.
- workday: Extracts a flag indicating whether the time series element is a workday.
- holiday: Extracts a flag indicating whether the time series element is a holiday.
-
day
= 5¶
-
day_cos
= 7¶
-
day_sine
= 6¶
-
friday
= 18¶
-
holiday
= 23¶
-
hour
= 8¶
-
hour_cos
= 10¶
-
hour_sine
= 9¶
-
monday
= 14¶
-
month
= 2¶
-
month_cos
= 4¶
-
month_sine
= 3¶
-
saturday
= 19¶
-
sunday
= 20¶
-
thursday
= 17¶
-
tuesday
= 15¶
-
wednesday
= 16¶
-
weekday
= 11¶
-
weekday_cos
= 13¶
-
weekday_sine
= 12¶
-
weekend
= 21¶
-
workday
= 22¶
-
year
= 1¶
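The sine and cosine features listed above encode a periodic quantity so that the end of a cycle lies next to its beginning (hour 23 next to hour 0, for example). The exact formula used by the module is not given here; the sketch below assumes the common formulation sin(2π·value/period) and cos(2π·value/period), and the function name cyclic_encode is hypothetical.

```python
import math


def cyclic_encode(value: float, period: float) -> tuple:
    """Map a periodic value onto the unit circle (assumed formulation)."""
    angle = 2.0 * math.pi * value / period
    return math.sin(angle), math.cos(angle)


def distance(a: tuple, b: tuple) -> float:
    """Euclidean distance between two encoded points."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


hour_sine, hour_cos = cyclic_encode(6, 24)  # a quarter of the daily cycle

# With the encoding, 23:00 is as close to 00:00 as 01:00 is.
gap_over_midnight = distance(cyclic_encode(23, 24), cyclic_encode(0, 24))
gap_regular = distance(cyclic_encode(0, 24), cyclic_encode(1, 24))
```

Using both the sine and the cosine feature together is what makes the encoding unambiguous: either one alone maps two different hours to the same value.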
-
pywatts.modules.feature_extraction.rolling_base module¶
-
class
pywatts.modules.feature_extraction.rolling_base.
RollingBase
(name: str = None, window_size=168, window_size_unit='d', group_by: pywatts.modules.feature_extraction.rolling_base.RollingGroupBy = <RollingGroupBy.No: 1>, continent: str = 'Europe', country: str = 'Germany', closed='left')¶ Bases:
pywatts_pipeline.core.transformer.base.BaseTransformer
,abc.ABC
Module which calculates a rolling mean over a specific window size. Note, currently the smallest resolution of the generated profile is one minute.
Parameters: - name (str) – Name of the new variable
- window_size (int) – Window size for which to calculate the mean
- window_size_unit (str) – Unit of the window size (default: “d” [day])
- group_by (RollingGroupBy) – How the entries of the time series should be grouped
- continent (str) – If group_by is WorkdayAndHoliday: Continent where the country or region is located (important for importing calendar module).
- country (str) – If group_by is WorkdayAndHoliday: Country or region to use for holiday calendar (default ‘Germany’)
- closed (str) – Whether the window is closed on the left or the right
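To illustrate what grouping by workdays and holidays means for a rolling statistic, the sketch below averages only those earlier values that belong to the same group as the current timestamp. It is not the library's implementation: the helper names is_workday and grouped_rolling_mean are hypothetical, and the holidays argument is an assumed stand-in for the holiday calendar selected through continent and country.

```python
from datetime import datetime, timedelta


def is_workday(ts: datetime, holidays=frozenset()) -> bool:
    """Monday to Friday, unless the date is in the (assumed) holiday set."""
    return ts.weekday() < 5 and ts.date() not in holidays


def grouped_rolling_mean(times, values, window, holidays=frozenset()):
    """Rolling mean over [t - window, t), restricted to the group of t."""
    result = []
    for t in times:
        group = is_workday(t, holidays)
        selected = [
            v
            for u, v in zip(times, values)
            if t - window <= u < t and is_workday(u, holidays) == group
        ]
        result.append(sum(selected) / len(selected) if selected else float("nan"))
    return result


# Daily data from Monday 3 May 2021 to Monday 10 May 2021.
times = [datetime(2021, 5, 3) + timedelta(days=i) for i in range(8)]
values = [10.0, 10.0, 10.0, 10.0, 10.0, 2.0, 2.0, 10.0]
result = grouped_rolling_mean(times, values, window=timedelta(days=7))
```

In this sketch the second Monday sees only the five preceding workdays, and the Sunday sees only the preceding Saturday, so the low weekend values do not pull down the workday profile.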
-
get_min_data
()¶ Returns the minimum amount of data that this transformer needs
-
set_params
(**kwargs)¶ Set parameters of the rolling mean
Parameters: - window_size (int) – Window size for which to calculate the mean
- window_size_unit (str) – Unit of the window size (default: “d” [day])
- group_by (RollingGroupBy) – How the entries of the time series should be grouped
- continent (str) – If group_by is WorkdayAndHoliday: Continent where the country or region is located (important for importing calendar module).
- country (str) – If group_by is WorkdayAndHoliday: Country or region to use for holiday calendar (default ‘Germany’)
- closed (str) – Whether the window is closed on the left or the right
-
transform
(x: xarray.core.dataarray.DataArray, **kwargs) → xarray.core.dataarray.DataArray¶ Calculates a rolling mean
Parameters: x – Xarray DataArray containing a timeseries specified by the object’s ‘time_index’ Returns: The xarray DataArray containing the calculated rolling values.
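The closed argument decides which end of the window belongs to it. The sketch below assumes semantics analogous to the closed argument of pandas' time-based rolling windows: ‘left’ covers [t - window, t) and therefore excludes the current value, while ‘right’ covers (t - window, t]. The function name rolling_mean and the integer time axis are illustrative choices, not part of the module.

```python
import math


def rolling_mean(times, values, window, closed="left"):
    """Rolling mean with a window closed on the left or on the right."""
    means = []
    for t in times:
        if closed == "left":
            selected = [v for u, v in zip(times, values) if t - window <= u < t]
        elif closed == "right":
            selected = [v for u, v in zip(times, values) if t - window < u <= t]
        else:
            raise ValueError("closed must be 'left' or 'right'")
        # An empty window yields NaN instead of raising a ZeroDivisionError.
        means.append(sum(selected) / len(selected) if selected else math.nan)
    return means


times = [0, 1, 2, 3]
values = [1.0, 2.0, 3.0, 4.0]
left = rolling_mean(times, values, window=2, closed="left")
right = rolling_mean(times, values, window=2, closed="right")
```

A left-closed window never uses the value at the current time step, which matters when the rolling feature feeds a forecast of that same step.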
pywatts.modules.feature_extraction.rolling_kurtosis module¶
-
class
pywatts.modules.feature_extraction.rolling_kurtosis.
RollingKurtosis
(name: str = None, window_size=168, window_size_unit='d', group_by: pywatts.modules.feature_extraction.rolling_base.RollingGroupBy = <RollingGroupBy.No: 1>, continent: str = 'Europe', country: str = 'Germany', closed='left')¶ Bases:
pywatts.modules.feature_extraction.rolling_base.RollingBase
Module which calculates a rolling kurtosis over a specific window size. Note, currently the smallest resolution of the generated profile is one minute.
For the documentation of the methods, see pywatts.modules.feature_extraction.rolling_base.RollingBase.
Parameters: - name (str) – Name of the new variable
- window_size (int) – Window size for which to calculate the kurtosis
- window_size_unit (str) – Unit of the window size (default: “d” [day])
- group_by (RollingGroupBy) – How the entries of the time series should be grouped
- continent (str) – If group_by is WorkdayAndHoliday: Continent where the country or region is located (important for importing calendar module).
- country (str) – If group_by is WorkdayAndHoliday: Country or region to use for holiday calendar (default ‘Germany’)
- closed (str) – Whether the window is closed on the left or the right
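For orientation, the kurtosis of one window can be computed from its second and fourth central moments. The sketch below uses the population (biased) excess kurtosis, m4 / m2² - 3; which estimator the module actually applies is not stated here, so treat the choice as an assumption. The function name excess_kurtosis is hypothetical.

```python
import math


def excess_kurtosis(window):
    """Population excess kurtosis of one window (assumed estimator)."""
    n = len(window)
    mean = sum(window) / n
    m2 = sum((v - mean) ** 2 for v in window) / n  # second central moment
    m4 = sum((v - mean) ** 4 for v in window) / n  # fourth central moment
    if m2 == 0:
        return math.nan  # undefined for a constant window
    return m4 / (m2 ** 2) - 3.0


kurtosis = excess_kurtosis([1.0, 2.0, 3.0, 4.0, 5.0])
```

A negative value, as for these evenly spread numbers, indicates lighter tails than a normal distribution.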
pywatts.modules.feature_extraction.rolling_mean module¶
-
class
pywatts.modules.feature_extraction.rolling_mean.
RollingMean
(name: str = 'RollingMean', window_size=168, window_size_unit='d', group_by: pywatts.modules.feature_extraction.rolling_base.RollingGroupBy = <RollingGroupBy.No: 1>, continent: str = 'Europe', country: str = 'Germany', closed='left', alpha=None)¶ Bases:
pywatts.modules.feature_extraction.rolling_base.RollingBase
Module which calculates a rolling mean over a specific window size. Note, currently the smallest resolution of the generated profile is one minute.
For the documentation of the methods, see pywatts.modules.feature_extraction.rolling_base.RollingBase.
Parameters: - name (str) – Name of the new variable
- window_size (int) – Window size for which to calculate the mean
- window_size_unit (str) – Unit of the window size (default: “d” [day])
- group_by (RollingGroupBy) – How the entries of the time series should be grouped
- continent (str) – If group_by is WorkdayAndHoliday: Continent where the country or region is located (important for importing calendar module).
- country (str) – If group_by is WorkdayAndHoliday: Country or region to use for holiday calendar (default ‘Germany’)
- closed (str) – Whether the window is closed on the left or the right
- alpha (float) – Alpha value for weighting the most recent value if exponential weighting should be applied. If alpha is not set, the mean of the sliding window is calculated.
-
set_params
(alpha: Optional[float] = None, **kwargs)¶ Set parameters of the rolling mean
Parameters: - window_size (int) – Window size for which to calculate the mean
- window_size_unit (str) – Unit of the window size (default: “d” [day])
- group_by (RollingGroupBy) – How the entries of the time series should be grouped
- continent (str) – If group_by is WorkdayAndHoliday: Continent where the country or region is located (important for importing calendar module).
- country (str) – If group_by is WorkdayAndHoliday: Country or region to use for holiday calendar (default ‘Germany’)
- closed (str) – Whether the window is closed on the left or the right
- alpha (float) – Alpha value for weighting the most recent value if exponential weighting should be applied. If alpha is not set, the mean of the sliding window is calculated.
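The role of alpha can be illustrated with simple exponential smoothing, where each new value is weighted with alpha and the running average with 1 - alpha. This is one plausible reading of "weighting the most recent value"; the module's exact weighting scheme is not documented here, and the function name exponential_mean is hypothetical.

```python
def exponential_mean(values, alpha):
    """Simple exponential smoothing: s_t = alpha * x_t + (1 - alpha) * s_(t-1)."""
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in the interval (0, 1]")
    smoothed = []
    current = None
    for v in values:
        # The first value initialises the running average.
        current = v if current is None else alpha * v + (1 - alpha) * current
        smoothed.append(current)
    return smoothed


smoothed = exponential_mean([1.0, 2.0, 3.0], alpha=0.5)
```

Larger alpha values follow the most recent observations more closely; with alpha equal to 1 the output simply repeats the input.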
-
pywatts.modules.feature_extraction.rolling_skewness module¶
-
class
pywatts.modules.feature_extraction.rolling_skewness.
RollingSkewness
(name: str = None, window_size=168, window_size_unit='d', group_by: pywatts.modules.feature_extraction.rolling_base.RollingGroupBy = <RollingGroupBy.No: 1>, continent: str = 'Europe', country: str = 'Germany', closed='left')¶ Bases:
pywatts.modules.feature_extraction.rolling_base.RollingBase
Module which calculates a rolling skewness over a specific window size. Note, currently the smallest resolution of the generated profile is one minute.
For the documentation of the methods, see pywatts.modules.feature_extraction.rolling_base.RollingBase.
Parameters: - name (str) – Name of the new variable
- window_size (int) – Window size for which to calculate the skewness
- window_size_unit (str) – Unit of the window size (default: “d” [day])
- group_by (RollingGroupBy) – How the entries of the time series should be grouped
- continent (str) – If group_by is WorkdayAndHoliday: Continent where the country or region is located (important for importing calendar module).
- country (str) – If group_by is WorkdayAndHoliday: Country or region to use for holiday calendar (default ‘Germany’)
- closed (str) – Whether the window is closed on the left or the right
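The skewness of one window measures how asymmetric its values are around the mean. The sketch below uses the population (biased) estimator m3 / m2^1.5 as an assumption; the module may apply a bias-corrected variant. The function name skewness is hypothetical.

```python
import math


def skewness(window):
    """Population skewness of one window (assumed estimator)."""
    n = len(window)
    mean = sum(window) / n
    m2 = sum((v - mean) ** 2 for v in window) / n  # second central moment
    m3 = sum((v - mean) ** 3 for v in window) / n  # third central moment
    if m2 == 0:
        return math.nan  # undefined for a constant window
    return m3 / m2 ** 1.5


symmetric = skewness([1.0, 2.0, 3.0])
right_tailed = skewness([1.0, 1.0, 4.0])
```

A symmetric window yields zero, while a single large value on the right produces a positive result.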
pywatts.modules.feature_extraction.rolling_variance module¶
-
class
pywatts.modules.feature_extraction.rolling_variance.
RollingVariance
(name: str = None, window_size=168, window_size_unit='d', group_by: pywatts.modules.feature_extraction.rolling_base.RollingGroupBy = <RollingGroupBy.No: 1>, continent: str = 'Europe', country: str = 'Germany', closed='left')¶ Bases:
pywatts.modules.feature_extraction.rolling_base.RollingBase
Module which calculates a rolling variance over a specific window size. Note, currently the smallest resolution of the generated profile is one minute.
For the documentation of the methods, see pywatts.modules.feature_extraction.rolling_base.RollingBase.
Parameters: - name (str) – Name of the new variable
- window_size (int) – Window size for which to calculate the variance
- window_size_unit (str) – Unit of the window size (default: “d” [day])
- group_by (RollingGroupBy) – How the entries of the time series should be grouped
- continent (str) – If group_by is WorkdayAndHoliday: Continent where the country or region is located (important for importing calendar module).
- country (str) – If group_by is WorkdayAndHoliday: Country or region to use for holiday calendar (default ‘Germany’)
- closed (str) – Whether the window is closed on the left or the right
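A rolling variance can be sketched with the standard library alone. The example below assumes the sample variance (divisor n - 1) and, for brevity, a count-based window that includes the current value; both are illustrative assumptions rather than documented behaviour of the module, and the function name rolling_variance is hypothetical.

```python
import math
import statistics


def rolling_variance(values, size):
    """Sample variance over the last `size` values, current value included."""
    result = []
    for end in range(len(values)):
        window = values[max(0, end - size + 1): end + 1]
        # The sample variance needs at least two observations.
        result.append(statistics.variance(window) if len(window) >= 2 else math.nan)
    return result


variances = rolling_variance([1.0, 2.0, 4.0, 7.0], size=3)
```

Replacing statistics.variance with statistics.pvariance would give the population variance instead.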
pywatts.modules.feature_extraction.statistics_extraction module¶
-
class
pywatts.modules.feature_extraction.statistics_extraction.
StatisticExtraction
(name: str = 'statistics', features: List[pywatts.modules.feature_extraction.statistics_extraction.StatisticFeature] = None, dim='horizon')¶ Bases:
pywatts_pipeline.core.transformer.base.BaseTransformer
This module extracts statistical features based on samples of time series defined by a DataArray input. It can calculate the min, max, std, and mean of the samples.
Parameters: - name (str) – Name of this module.
- features (Optional[List[StatisticFeature]]) – The features that should be extracted. The following features exist: min, max, std, and mean. (Default: [min, max, std, mean])
- dim (str) – The dimension on which the statistics should be extracted
-
transform
(x: xarray.core.dataarray.DataArray) → xarray.core.dataarray.DataArray¶ Add statistic features to xarray dataarray as configured.
Parameters: x – Xarray dataarray containing a time series Returns: The xarray dataarray with statistic features.
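Conceptually, each sample along the chosen dimension (by default ‘horizon’) is reduced to a handful of summary values. The sketch below mimics this with nested lists, where each inner list plays the role of one sample; the function name extract_statistics is hypothetical, and the use of the population standard deviation is an assumption.

```python
import statistics


def extract_statistics(samples):
    """Reduce every sample (inner list) to min, max, std and mean."""
    return [
        {
            "min": min(sample),
            "max": max(sample),
            "std": statistics.pstdev(sample),  # population std (assumption)
            "mean": statistics.fmean(sample),
        }
        for sample in samples
    ]


stats = extract_statistics([[1.0, 2.0, 3.0], [4.0, 4.0, 4.0]])
```

Each input sample yields one row of features, so the reduced dimension is replaced by a feature dimension of length four.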
-
class
pywatts.modules.feature_extraction.statistics_extraction.
StatisticFeature
¶ Bases:
enum.IntEnum
The available statistic features, that are extractable by the statistic extraction module:
- min: Extracts the minimum of the time series sample.
- max: Extracts the maximum of the time series sample.
- std: Extracts the standard deviation of the time series sample.
- mean: Extracts the mean of the time series sample.
-
max
= 2¶
-
mean
= 4¶
-
min
= 1¶
-
std
= 3¶
-
pywatts.modules.feature_extraction.trend_extraction module¶
This module contains the trend extraction
-
class
pywatts.modules.feature_extraction.trend_extraction.
TrendExtraction
(period, length, indexes: List[str] = None, name: str = 'trend_extractor')¶ Bases:
pywatts_pipeline.core.transformer.base.BaseTransformer
Module to extract a trend which can be specified through a period and a length, where the length indicates the length of the time step going back (e.g. 7 for a weekly trend with daily data) and period indicates the number of times this length is used (e.g. 10 to extract the last 10 weeks).
Parameters: - period (int) – Length of one period
- length (int) – Number of periods which should be extracted
- indexes (List[str]) – Index over which the trend is extracted (default: all time based indexes)
- name (str) – Name of the module
-
get_min_data
()¶ Returns the minimum amount of data that this transformer needs
-
transform
(x: xarray.core.dataarray.DataArray) → xarray.core.dataarray.DataArray¶ Extract trend values
Parameters: x (xr.DataArray) – input xarray DataArray Returns: a dataset containing the trend information Return type: xr.DataArray
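The idea of a trend built from regularly spaced past values can be sketched as follows. The names step (distance between two look-backs) and count (number of look-backs) are hypothetical and stand in for the two constructor arguments; how they map onto period and length, and how the module treats positions without enough history, are not asserted here.

```python
def extract_trend(values, step, count):
    """For every position, collect the values step, 2*step, ... back in time."""
    trend = []
    for t in range(len(values)):
        lags = []
        for k in range(1, count + 1):
            index = t - k * step
            # None marks positions for which not enough history exists.
            lags.append(values[index] if index >= 0 else None)
        trend.append(lags)
    return trend


values = list(range(10))
trend = extract_trend(values, step=3, count=2)
```

In this sketch a position needs step * count earlier values before all of its trend entries are filled.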