kats.detectors.trend_mk module¶
- class kats.detectors.trend_mk.MKDetector(data: Optional[kats.consts.TimeSeriesData] = None, threshold: float = 0.8, alpha: float = 0.05, multivariate: bool = False)[source]¶
Bases:
kats.detectors.detector.Detector
MKDetector applies the Mann-Kendall (MK) test, a non-parametric statistical test used to determine whether there is a monotonic trend in a given time series. See https://vsp.pnnl.gov/help/vsample/Design_Trend_Mann_Kendall.htm for details.
The basic idea is to check whether there is a monotonic trend over a look-back window of window_size time steps.
- Parameters
data – TimeSeriesData, this is time series data at one-day granularity. This time series can be either univariate or multivariate. We require more than training_days points in each time series.
threshold – float, threshold for trend intensity; higher threshold gives trend with high intensity (0.8 by default). If we only want to use the p-value to determine changepoints, set threshold = 0.
alpha – float, significance level (0.05 by default)
multivariate – bool, whether the input time series is multivariate
>>> import pandas as pd
>>> from kats.consts import TimeSeriesData
>>> from kats.detectors.trend_mk import MKDetector
>>> # read data and rename the two columns required by TimeSeriesData
>>> # structure
>>> data = pd.read_csv("../filename.csv") # demo file does not exist
>>> TSdata = TimeSeriesData(data)
>>> # create MKDetector with given data and params
>>> d = MKDetector(data=TSdata)
>>> # call detector method to fit model
>>> detected_time_points = d.detector(window_size=20, direction="up")
>>> # plot the results
>>> d.plot(detected_time_points)
- MKtest(ts: pandas.core.frame.DataFrame) → Tuple[datetime.datetime, str, float, float][source]¶
Performs the Mann-Kendall (MK) test for trend detection.
(Mann 1945, Kendall 1975, Gilbert 1987)
- Parameters
ts – the dataframe of input data with time as index. The time series should be free of seasonality for the MK test.
- Returns
tuple containing:
- anchor_date(datetime): the last time point in ts; the date for which the alert is triggered
- trend(str): tells the trend (decreasing, increasing, or no trend)
- p(float): p-value of the significance test
- Tau(float): Kendall Tau-b statistic (https://en.wikipedia.org/wiki/Kendall_rank_correlation_coefficient#Tau-b)
- Return type
(tuple)
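The test described above can be sketched in plain Python. This is a minimal illustration of the textbook Mann-Kendall procedure (S statistic, tie-corrected variance, normal approximation for the p-value, Tau-b), not the Kats implementation; the function name mann_kendall and its list-based input are assumptions made for the example.

```python
import math
from collections import Counter
from typing import Sequence, Tuple


def mann_kendall(values: Sequence[float], alpha: float = 0.05) -> Tuple[str, float, float]:
    """Return (trend, p_value, tau_b) for an evenly ordered series.

    Hypothetical helper for illustration; it does not mirror MKtest's
    signature, which takes a DataFrame and also returns an anchor date.
    """
    n = len(values)
    if n < 3:
        raise ValueError("need at least 3 observations")

    # S = (number of increasing pairs) - (number of decreasing pairs)
    s = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            diff = values[j] - values[i]
            s += (diff > 0) - (diff < 0)

    # Variance of S under the null hypothesis, corrected for tied values
    tie_counts = list(Counter(values).values())
    var_s = (n * (n - 1) * (2 * n + 5)
             - sum(t * (t - 1) * (2 * t + 5) for t in tie_counts)) / 18.0

    # Continuity-corrected Z score; S != 0 implies var_s > 0
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0

    # Two-sided p-value from the standard normal distribution
    p = math.erfc(abs(z) / math.sqrt(2))

    # Tau-b: time has no ties, so only ties in the values are corrected
    n0 = n * (n - 1) / 2
    n_tied = sum(t * (t - 1) / 2 for t in tie_counts)
    tau = s / math.sqrt(n0 * (n0 - n_tied)) if n0 > n_tied else 0.0

    if p <= alpha and s > 0:
        trend = "increasing"
    elif p <= alpha and s < 0:
        trend = "decreasing"
    else:
        trend = "no trend"
    return trend, p, tau
```

For a strictly increasing series such as list(range(10)), this sketch reports an increasing trend with Tau equal to 1.0.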
- detector(window_size: int = 20, training_days: Optional[int] = None, direction: str = 'both', freq: Optional[str] = None) → List[Tuple[kats.consts.TimeSeriesChangePoint, kats.detectors.trend_mk.MKMetadata]][source]¶
Runs MK test sequentially.
It finds the trend and calculates the related statistics for all time points in a given time series.
- Parameters
window_size – int, the number of look-back days for checking trend persistence
training_days – int, the number of days for time series smoothing; should be greater than or equal to window_size. If training_days is None, we will perform trend detection on the whole time series; otherwise, we will perform trend detection only for the anchor point using the previous training_days data.
direction – string, the direction of the trend to be detected, choose from {"down", "up", "both"}
freq – str, the type of seasonality shown in the time series, choose from {'weekly', 'monthly', 'yearly'}
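The sequential evaluation can be pictured as a sliding window: for each time point, a statistic is computed from the preceding window_size observations. The sketch below illustrates only that windowing idea; it omits the smoothing and seasonality removal controlled by training_days and freq, and it uses the simpler Kendall Tau-a (no tie correction). The names kendall_tau_a and rolling_tau are hypothetical.

```python
from typing import List, Sequence, Tuple


def kendall_tau_a(window: Sequence[float]) -> float:
    """Kendall Tau-a of a window against its own time order (no tie correction)."""
    n = len(window)
    s = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            s += (window[j] > window[i]) - (window[j] < window[i])
    return s / (n * (n - 1) / 2)


def rolling_tau(values: Sequence[float], window_size: int) -> List[Tuple[int, float]]:
    """Return (anchor_index, tau) for every full look-back window.

    anchor_index is the position of the last point in the window,
    analogous to the anchor date in the text above.
    """
    if window_size < 2:
        raise ValueError("window_size must be at least 2")
    results = []
    for end in range(window_size, len(values) + 1):
        window = values[end - window_size:end]
        results.append((end - 1, kendall_tau_a(window)))
    return results


# A V-shaped series: the first full window falls, the last one rises.
series = [5, 4, 3, 2, 1, 2, 3, 4, 5]
taus = rolling_tau(series, window_size=5)
```

Here taus starts with (4, -1.0) for the falling window and ends with (8, 1.0) for the rising one.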
- get_MK_results(MK_statistics: pandas.core.frame.DataFrame, direction: str) → pandas.core.frame.DataFrame[source]¶
Obtains the subset of MK_statistics that matches the desired direction.
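One plausible reading of this filter, combined with the threshold parameter of the class, is sketched below. It assumes that an upward trend is kept when its Tau is at least threshold and a downward trend when its Tau is at most -threshold; the function select_by_direction, the list-of-dicts input, and the row keys are illustrative stand-ins for the dataframe, not the library's API.

```python
from typing import Dict, List, Union

Row = Dict[str, Union[str, float]]


def select_by_direction(rows: List[Row], direction: str, threshold: float = 0.8) -> List[Row]:
    """Keep rows whose trend matches `direction` and whose Tau is strong enough.

    Assumed row keys: "trend_direction" ("increasing", "decreasing",
    or "no trend") and "Tau" (float).
    """
    if direction not in {"up", "down", "both"}:
        raise ValueError("direction must be 'up', 'down', or 'both'")
    selected = []
    for row in rows:
        is_up = row["trend_direction"] == "increasing" and row["Tau"] >= threshold
        is_down = row["trend_direction"] == "decreasing" and row["Tau"] <= -threshold
        if (direction in ("up", "both") and is_up) or (direction in ("down", "both") and is_down):
            selected.append(row)
    return selected


stats = [
    {"trend_direction": "increasing", "Tau": 0.9},
    {"trend_direction": "increasing", "Tau": 0.5},   # significant but too weak
    {"trend_direction": "decreasing", "Tau": -0.85},
    {"trend_direction": "no trend", "Tau": 0.1},
]
up_only = select_by_direction(stats, "up")
both = select_by_direction(stats, "both")
```

Under this assumption, setting threshold to 0 keeps every row whose trend was already judged significant by the p-value, which matches the note on the threshold parameter.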
- get_MK_statistics() → pandas.core.frame.DataFrame[source]¶
Get the dataframe of MK_statistics.
- get_top_k_metrics(time_point: datetime.datetime, top_k: Optional[int] = None) → pandas.core.frame.DataFrame[source]¶
Get k metrics that show the most significant trend at a time point.
Works only for multivariate data.
- Parameters
time_point – the time point to be investigated.
top_k – the number of top metrics.
- Returns
- A dataframe consisting of the top_k metrics and their corresponding
Kendall Tau and trend.
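As a rough illustration of the ranking, the sketch below orders metrics by the absolute value of their Tau at a single time point. Treating "most significant" as "largest |Tau|" is an assumption of this example, and top_k_by_abs_tau with its dictionary input is a hypothetical stand-in for the dataframe-based method.

```python
from typing import Dict, List, Optional, Tuple


def top_k_by_abs_tau(tau_by_metric: Dict[str, float],
                     top_k: Optional[int] = None) -> List[Tuple[str, float]]:
    """Rank metrics by |Tau|, strongest first; return all if top_k is None."""
    ranked = sorted(tau_by_metric.items(), key=lambda kv: abs(kv[1]), reverse=True)
    return ranked if top_k is None else ranked[:top_k]


# Made-up Tau values for three metrics at one time point
taus_at_time_point = {"cpu": 0.9, "mem": -0.95, "disk": 0.1}
strongest = top_k_by_abs_tau(taus_at_time_point, top_k=2)
```

With these made-up values, strongest contains mem followed by cpu, since a strongly negative Tau ranks as high as a strongly positive one.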
- multivariate_MKtest(ts: pandas.core.frame.DataFrame) → Tuple[datetime.datetime, str, float, Dict][source]¶
Performs the Multivariate Mann-Kendall (MK) test.
Proposed by R. M. Hirsch and J. R. Slack (1984).
- Parameters
ts – the dataframe of input data with time as index. The time series should be free of seasonality for the MK test.
- Returns
tuple containing:
- anchor_date(datetime): the last time point in ts; the date for which the alert is triggered.
- trend_dict: tells the trend (decreasing, increasing, or no trend) for each metric.
- p: p-value of the significance test.
- Tau_dict: Dictionary of Kendall Tau-b statistics for each univariate time series, and Tau_dict["overall"] gives the Tau-b statistic for the multivariate time series.
- Return type
(tuple)
- plot(detected_time_points: List[Tuple[kats.consts.TimeSeriesChangePoint, kats.detectors.trend_mk.MKMetadata]]) → None[source]¶
Plots the original time series data, and the detected time points.
- plot_heat_map() → pandas.core.frame.DataFrame[source]¶
Plots the Tau of each metric in a heatmap.
- Returns
A dataframe containing Tau for all metrics at all time points.
- runDetector(ts: pandas.core.frame.DataFrame) → Dict[str, Any][source]¶
Runs MK test for a time point in the input data.
- Parameters
ts – the dataframe of input data with noise and seasonality removed. Its index is time.
- Returns
- A dictionary consisting of MK test statistics for the anchor time
point, including trend, p-value and Kendall Tau.
- class kats.detectors.trend_mk.MKMetadata(is_multivariate: bool, trend_direction: str, Tau: Union[float, Dict])[source]¶
Bases:
object
Metadata object for changepoint of MKDetector
- detector_type¶
Detector, the type of detector this changepoint is for. Currently, this is always MKDetector.
- is_multivariate¶
boolean, whether this is a changepoint for a multivariate time series.
- trend_direction¶
string, Direction of trend, either ‘increasing’ or ‘decreasing’.
- Tau¶
float or Dict, Kendall’s Tau value for changepoint. This is a float in the univariate case and a Dict in the
multivariate case.