kats.detectors.trend_mk module
- class kats.detectors.trend_mk.MKDetector(data: Optional[kats.consts.TimeSeriesData] = None, threshold: float = 0.8, alpha: float = 0.05, multivariate: bool = False)
- Bases: kats.detectors.detector.Detector
- MKDetector (MK stands for Mann-Kendall) applies a non-parametric statistical test to determine whether there is a monotonic trend in a given time series. See https://vsp.pnnl.gov/help/vsample/Design_Trend_Mann_Kendall.htm for details.
- The basic idea is to check whether there is a monotonic trend over a look-back number of time steps (window_size).
- Parameters
- data – TimeSeriesData, the time series data at one-day granularity. The time series can be either univariate or multivariate. Each time series must contain more than training_days points. 
- threshold – float, threshold for trend intensity (0.8 by default); a higher threshold keeps only trends with higher intensity. To use only the p-value to determine changepoints, set threshold = 0. 
- alpha – float, significance level (0.05 by default) 
- multivariate – bool, whether the input time series is multivariate 
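The interplay between alpha and threshold can be sketched as follows. This is a minimal illustration, not the Kats implementation: the helper name `is_trend`, the use of the absolute value of Tau as the "intensity", and the exact comparison operators are all assumptions. It only mirrors the documented behavior that setting threshold = 0 leaves the p-value as the sole criterion.

```python
def is_trend(p_value: float, tau: float,
             alpha: float = 0.05, threshold: float = 0.8) -> bool:
    """Hypothetical helper: flag a trend that is both significant and intense."""
    significant = p_value <= alpha    # assumed comparison against alpha
    intense = abs(tau) >= threshold   # assumed: intensity measured by |Tau|
    return significant and intense


# With threshold=0 the intensity check always passes,
# so only the p-value decides.
print(is_trend(p_value=0.01, tau=0.5))                 # weak trend is rejected
print(is_trend(p_value=0.01, tau=0.5, threshold=0.0))  # p-value alone decides
```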
 
 - >>> import pandas as pd
   >>> from kats.consts import TimeSeriesData
   >>> from kats.detectors.trend_mk import MKDetector
   >>> # read data and rename the two columns as required by the TimeSeriesData structure
   >>> data = pd.read_csv("../filename.csv")  # demo file does not exist
   >>> data.columns = ["time", "value"]  # assumes the file has exactly two columns
   >>> TSdata = TimeSeriesData(data)
   >>> # create MKDetector with given data and params
   >>> d = MKDetector(data=TSdata)
   >>> # call detector method to run the detection
   >>> detected_time_points = d.detector(window_size=20, direction="up")
   >>> # plot the results
   >>> d.plot(detected_time_points)
 - MKtest(ts: pandas.core.frame.DataFrame) → Tuple[datetime.datetime, str, float, float]
- Performs the Mann-Kendall (MK) test for trend detection (Mann 1945, Kendall 1975, Gilbert 1987).
- Parameters
- ts – the dataframe of input data with time as index. The time series should not exhibit seasonality for the MK test. 
- Returns
- tuple containing:
  - anchor_date (datetime): the last time point in ts; the date for which the alert is triggered
  - trend (str): the trend (decreasing, increasing, or no trend)
  - p (float): p-value of the significance test
  - Tau (float): Kendall Tau-b statistic (https://en.wikipedia.org/wiki/Kendall_rank_correlation_coefficient#Tau-b)
- Return type
- (tuple) 
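For orientation, the Mann-Kendall test on a single series can be sketched in plain Python. This is a textbook-style sketch under stated assumptions, not the code behind MKtest: the function name `mann_kendall` is hypothetical, it takes a plain sequence rather than a dataframe, it omits the anchor date, and it uses the normal approximation with a continuity correction, which is crude for very short series.

```python
import math
from collections import Counter
from typing import Sequence, Tuple


def mann_kendall(values: Sequence[float],
                 alpha: float = 0.05) -> Tuple[str, float, float]:
    """Return (trend, p_value, tau_b) for one series ordered in time."""
    n = len(values)
    if n < 3:
        raise ValueError("need at least 3 observations")

    # S = (# pairs where the later value is larger) - (# where it is smaller)
    s = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            s += (values[j] > values[i]) - (values[j] < values[i])

    # Variance of S under the no-trend hypothesis, corrected for tied values
    ties = list(Counter(values).values())
    var_s = (n * (n - 1) * (2 * n + 5)
             - sum(t * (t - 1) * (2 * t + 5) for t in ties)) / 18.0

    # Continuity-corrected normal approximation
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided p-value

    # Tau-b: the time axis has no ties, so only value ties shrink the denominator
    n_pairs = n * (n - 1) / 2.0
    tied_pairs = sum(t * (t - 1) / 2.0 for t in ties)
    denom = math.sqrt(n_pairs * (n_pairs - tied_pairs))
    tau = s / denom if denom > 0 else 0.0

    if p_value <= alpha and s > 0:
        trend = "increasing"
    elif p_value <= alpha and s < 0:
        trend = "decreasing"
    else:
        trend = "no trend"
    return trend, p_value, tau


trend, p_value, tau = mann_kendall(list(range(1, 11)))
print(trend, tau)
```

A strictly increasing series of ten points gives Tau = 1 and a small p-value, while a constant series gives Tau = 0 and is reported as having no trend.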
 
 - detector(window_size: int = 20, training_days: Optional[int] = None, direction: str = 'both', freq: Optional[str] = None) → List[Tuple[kats.consts.TimeSeriesChangePoint, kats.detectors.trend_mk.MKMetadata]]
- Runs MK test sequentially.
- It finds the trend and calculates the related statistics for all time points in a given time series.
- Parameters
- window_size – int, the number of look back days for checking trend persistence 
- training_days – int, the number of days for time series smoothing; should be greater than or equal to window_size. If training_days is None, trend detection is performed on the whole time series; otherwise, it is performed only for the anchor point, using the previous training_days of data. 
- direction – string, the direction of the trend to be detected, choose from {“down”, “up”, “both”} 
- freq – str, the type of seasonality shown in the time series, choose from {‘weekly’, ‘monthly’, ‘yearly’} 
 
 
 - get_MK_results(MK_statistics: pandas.core.frame.DataFrame, direction: str) → pandas.core.frame.DataFrame
- Obtains the subset of MK_statistics that matches the desired direction. 
 - get_MK_statistics() → pandas.core.frame.DataFrame
- Get the dataframe of MK_statistics. 
 - get_top_k_metrics(time_point: datetime.datetime, top_k: Optional[int] = None) → pandas.core.frame.DataFrame
- Get k metrics that show the most significant trend at a time point.
- Works only for multivariate data.
- Parameters
- time_point – the time point to be investigated. 
- top_k – the number of top metrics. 
 
- Returns
- a dataframe consisting of the top_k metrics and their corresponding Kendall Tau and trend. 
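The idea of picking the top metrics at one time point can be sketched as below. This is an illustrative assumption, not the documented ranking rule: the helper `top_k_by_tau` is hypothetical, it takes a plain dictionary instead of a dataframe, and it ranks by the absolute value of Tau as a stand-in for "most significant trend".

```python
from typing import Dict, List, Optional, Tuple


def top_k_by_tau(tau_by_metric: Dict[str, float],
                 top_k: Optional[int] = None) -> List[Tuple[str, float]]:
    """Rank metrics by |Tau| (assumed criterion) and keep the first top_k."""
    ranked = sorted(tau_by_metric.items(),
                    key=lambda item: abs(item[1]), reverse=True)
    return ranked if top_k is None else ranked[:top_k]


# Made-up Tau values for three hypothetical metrics at one time point
taus = {"cpu": 0.35, "latency": -0.92, "errors": 0.81}
print(top_k_by_tau(taus, top_k=2))
```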
 
 
 - multivariate_MKtest(ts: pandas.core.frame.DataFrame) → Tuple[datetime.datetime, str, float, Dict]
- Performs the multivariate Mann-Kendall (MK) test, proposed by R. M. Hirsch and J. R. Slack (1984).
- Parameters
- ts – the dataframe of input data with time as index. The time series should not exhibit seasonality for the MK test. 
- Returns
- tuple containing:
  - anchor_date (datetime): the last time point in ts; the date for which the alert is triggered.
  - trend_dict: the trend (decreasing, increasing, or no trend) for each metric.
  - p: p-value of the significance test.
  - Tau_dict: dictionary of Kendall Tau-b statistics for each univariate time series; Tau_dict[“overall”] gives the Tau-b statistic for the multivariate time series.
- Return type
- (tuple) 
 
 - plot(detected_time_points: List[Tuple[kats.consts.TimeSeriesChangePoint, kats.detectors.trend_mk.MKMetadata]]) → None
- Plots the original time series data, and the detected time points. 
 - plot_heat_map() → pandas.core.frame.DataFrame
- Plots the Tau of each metric in a heatmap.
- Returns
- a dataframe containing Tau for all metrics at all time points. 
 
 - runDetector(ts: pandas.core.frame.DataFrame) → Dict[str, Any]
- Runs MK test for a time point in the input data.
- Parameters
- ts – the dataframe of input data with noise and seasonality removed. Its index is time. 
- Returns
- A dictionary consisting of MK test statistics for the anchor time point, including trend, p-value and Kendall Tau. 
 
 
 
- class kats.detectors.trend_mk.MKMetadata(is_multivariate: bool, trend_direction: str, Tau: Union[float, Dict])
- Bases: object
- Metadata object for a changepoint detected by MKDetector.
 - detector_type
- Detector, the type of detector the changepoint is for. Right now, this is always MKDetector. 
 - is_multivariate
- boolean, whether this is a changepoint for a multivariate time series. 
 - trend_direction
- string, direction of the trend, either ‘increasing’ or ‘decreasing’. 
 - Tau
- float or Dict, Kendall’s Tau value for the changepoint. This is a float in the univariate case and a Dict in the multivariate case.