kats.detectors.trend_mk module

class kats.detectors.trend_mk.MKDetector(data: Optional[kats.consts.TimeSeriesData] = None, threshold: float = 0.8, alpha: float = 0.05, multivariate: bool = False)

Bases: kats.detectors.detector.Detector

MKDetector applies the Mann-Kendall (MK) test, a non-parametric statistical test used to determine whether there is a monotonic trend in a given time series. See https://vsp.pnnl.gov/help/vsample/Design_Trend_Mann_Kendall.htm for details.

The basic idea is to check whether there is a monotonic trend over a look-back window of a given number of time steps (window_size).

Parameters
  • data – TimeSeriesData, time series data at one-day granularity. The time series can be either univariate or multivariate. Each time series must contain more than training_days points.

  • threshold – float, threshold for trend intensity; a higher threshold keeps only trends with higher intensity (0.8 by default). To use only the p-value to determine changepoints, set threshold = 0.

  • alpha – float, significance level (0.05 by default).

  • multivariate – bool, whether the input time series is multivariate.
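One way to picture how threshold and alpha work together is a two-part check: the trend must be statistically significant and strong enough. The sketch below is an assumption inferred from the note that threshold = 0 leaves only the p-value test; is_trend is a hypothetical helper, not part of the library, and the exact comparisons the library uses may differ.

```python
def is_trend(
    tau: float,
    p_value: float,
    threshold: float = 0.8,
    alpha: float = 0.05,
) -> bool:
    """Hypothetical helper: flag a trend that is both significant and strong."""
    significant = p_value <= alpha      # assumed significance comparison
    strong = abs(tau) >= threshold      # assumed intensity comparison on |Tau|
    return significant and strong
```

With threshold = 0 the intensity check always passes, so only the p-value decides.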

>>> import pandas as pd
>>> from kats.consts import TimeSeriesData
>>> from kats.detectors.trend_mk import MKDetector
>>> # read data and rename the two columns required by TimeSeriesData
>>> # structure
>>> data = pd.read_csv("../filename.csv") # demo file does not exist
>>> TSdata = TimeSeriesData(data)
>>> # create MKDetector with given data and params
>>> d = MKDetector(data=TSdata)
>>> # call detector method to fit model
>>> detected_time_points = d.detector(window_size=20, direction="up")
>>> # plot the results
>>> d.plot(detected_time_points)

MKtest(ts: pandas.core.frame.DataFrame) → Tuple[datetime.datetime, str, float, float]

Performs the Mann-Kendall (MK) test for trend detection.

(Mann 1945, Kendall 1975, Gilbert 1987)

Parameters

ts – the dataframe of input data with time as index. This time series should not present seasonality for MK test.

Returns

tuple containing:

  • anchor_date (datetime) – the last time point in ts; the date for which the alert is triggered.

  • trend (str) – the trend (decreasing, increasing, or no trend).

  • p (float) – p-value of the significance test.

  • Tau (float) – Kendall Tau-b statistic (https://en.wikipedia.org/wiki/Kendall_rank_correlation_coefficient#Tau-b).

Return type

(tuple)
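For orientation, the textbook Mann-Kendall computation can be sketched in plain Python. This is a minimal illustration of the standard test (S statistic, tie-corrected variance, normal approximation, Tau-b against a tie-free time index), not the library's implementation; mk_test_sketch is a hypothetical name, and it takes a plain sequence rather than a dataframe.

```python
import math
from collections import Counter
from typing import Sequence, Tuple


def mk_test_sketch(values: Sequence[float], alpha: float = 0.05) -> Tuple[str, float, float]:
    """Hypothetical helper: return (trend, p, tau) for an evenly spaced series."""
    n = len(values)
    if n < 3:
        raise ValueError("need at least 3 observations")

    # S: count of increasing pairs minus count of decreasing pairs.
    s = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            diff = values[j] - values[i]
            s += (diff > 0) - (diff < 0)

    # Variance of S with the usual correction for groups of tied values.
    tie_sizes = [c for c in Counter(values).values() if c > 1]
    var_s = (
        n * (n - 1) * (2 * n + 5)
        - sum(t * (t - 1) * (2 * t + 5) for t in tie_sizes)
    ) / 18.0
    if var_s == 0:  # every value is identical
        return "no trend", 1.0, 0.0

    # Continuity-corrected Z score and two-sided normal p-value.
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0
    p = math.erfc(abs(z) / math.sqrt(2))

    # Kendall Tau-b; the time index has no ties, so only value ties matter.
    n0 = n * (n - 1) / 2
    n1 = sum(t * (t - 1) / 2 for t in tie_sizes)
    tau = s / math.sqrt((n0 - n1) * n0)

    if p <= alpha and s > 0:
        trend = "increasing"
    elif p <= alpha and s < 0:
        trend = "decreasing"
    else:
        trend = "no trend"
    return trend, p, tau
```

The pairwise loop is O(n²), which is acceptable for short look-back windows.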

detector(window_size: int = 20, training_days: Optional[int] = None, direction: str = 'both', freq: Optional[str] = None) → List[Tuple[kats.consts.TimeSeriesChangePoint, kats.detectors.trend_mk.MKMetadata]]

Runs MK test sequentially.

It finds the trend and calculates the related statistics for all time points in a given time series.

Parameters
  • window_size – int, the number of look-back days for checking trend persistence.

  • training_days – int, the number of days for time series smoothing; should be greater than or equal to window_size. If training_days is None, trend detection is performed on the whole time series; otherwise, trend detection is performed only for the anchor point, using the previous training_days of data.

  • direction – string, the direction of the trend to be detected; choose from {"down", "up", "both"}.

  • freq – str, the type of seasonality shown in the time series; choose from {"weekly", "monthly", "yearly"}.
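Running a trend statistic sequentially over a series can be sketched as a trailing window that slides one step at a time. The code below is a simplified illustration under stated assumptions: it scores each window with Kendall's Tau-a (no tie correction), skips smoothing and seasonality removal, and uses hypothetical names (kendall_tau_a, rolling_tau). It is one plausible reading of window_size, not the library's procedure.

```python
from typing import List, Sequence, Tuple


def kendall_tau_a(window: Sequence[float]) -> float:
    """Tau-a of the values against their position (simplified, no tie handling)."""
    n = len(window)
    s = sum(
        (window[j] > window[i]) - (window[j] < window[i])
        for i in range(n - 1)
        for j in range(i + 1, n)
    )
    return s / (n * (n - 1) / 2)


def rolling_tau(values: Sequence[float], window_size: int = 20) -> List[Tuple[int, float]]:
    """Hypothetical helper: (anchor index, Tau) for every full trailing window."""
    if window_size < 2:
        raise ValueError("window_size must be at least 2")
    results = []
    for anchor in range(window_size - 1, len(values)):
        # The window ends at the anchor point and looks back window_size steps.
        window = values[anchor - window_size + 1 : anchor + 1]
        results.append((anchor, kendall_tau_a(window)))
    return results
```

Each result is tied to the last point of its window, mirroring the idea of an anchor date.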

get_MK_results(MK_statistics: pandas.core.frame.DataFrame, direction: str) → pandas.core.frame.DataFrame

Obtain a subset of MK_statistics given the desired direction.

get_MK_statistics() → pandas.core.frame.DataFrame

Get the dataframe of MK_statistics.

get_top_k_metrics(time_point: datetime.datetime, top_k: Optional[int] = None) → pandas.core.frame.DataFrame

Get k metrics that show the most significant trend at a time point.

Works only for multivariate data.

Parameters
  • time_point – the time point to be investigated.

  • top_k – the number of top metrics.

Returns

A dataframe consisting of the top_k metrics and their corresponding Kendall Tau and trend.
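The ranking idea can be illustrated with a plain dictionary of per-metric Tau values. The sketch assumes that "most significant" means largest absolute Tau, which the text above does not state explicitly; top_k_by_abs_tau is a hypothetical helper that returns a list rather than a dataframe.

```python
from typing import Dict, List, Optional, Tuple


def top_k_by_abs_tau(
    tau_by_metric: Dict[str, float],
    top_k: Optional[int] = None,
) -> List[Tuple[str, float]]:
    """Hypothetical helper: metrics ordered by |Tau|, strongest first."""
    ranked = sorted(
        tau_by_metric.items(),
        key=lambda item: abs(item[1]),  # assumed ranking criterion
        reverse=True,
    )
    # With top_k=None, return every metric in ranked order.
    return ranked if top_k is None else ranked[:top_k]
```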

multivariate_MKtest(ts: pandas.core.frame.DataFrame) → Tuple[datetime.datetime, str, float, Dict]

Performs the Multivariate Mann-Kendall (MK) test.

Proposed by R. M. Hirsch and J. R. Slack (1984).

Parameters

ts – the dataframe of input data with time as index. This time series should not present seasonality for MK test.

Returns

tuple containing:

  • anchor_date (datetime) – the last time point in ts; the date for which the alert is triggered.

  • trend_dict – the trend (decreasing, increasing, or no trend) for each metric.

  • p – p-value of the significance test.

  • Tau_dict – dictionary of Kendall Tau-b statistics for each univariate time series; Tau_dict["overall"] gives the Tau-b statistic for the multivariate time series.

Return type

(tuple)

plot(detected_time_points: List[Tuple[kats.consts.TimeSeriesChangePoint, kats.detectors.trend_mk.MKMetadata]]) → None

Plots the original time series data, and the detected time points.

plot_heat_map() → pandas.core.frame.DataFrame

Plots the Tau of each metric in a heatmap.

Returns

A dataframe containing Tau for all metrics at all time points.

runDetector(ts: pandas.core.frame.DataFrame) → Dict[str, Any]

Runs MK test for a time point in the input data.

Parameters

ts – the dataframe of input data with noise and seasonality removed. Its index is time.

Returns

A dictionary consisting of MK test statistics for the anchor time point, including trend, p-value and Kendall Tau.

class kats.detectors.trend_mk.MKMetadata(is_multivariate: bool, trend_direction: str, Tau: Union[float, Dict])

Bases: object

Metadata object for changepoint of MKDetector

detector_type

Detector, the type of detector the changepoint is for. Currently this is always MKDetector.

is_multivariate

boolean, Whether this is a changepoint for a multivariate time series.

trend_direction

string, Direction of trend, either ‘increasing’ or ‘decreasing’.

Tau

float or Dict, Kendall’s Tau value for changepoint. This is a float in the univariate case and a Dict in the multivariate case.
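Since detector results pair each changepoint with its metadata, filtering on the metadata attributes is a natural post-processing step. The sketch below uses a hypothetical stand-in dataclass (MetadataStandIn) that mirrors the attributes documented above so the example runs without the library; increasing_only is likewise a hypothetical helper.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Tuple, Union


@dataclass
class MetadataStandIn:
    """Hypothetical stand-in mirroring the documented MKMetadata attributes."""
    is_multivariate: bool
    trend_direction: str
    Tau: Union[float, Dict[str, float]]


def increasing_only(
    detected: List[Tuple[Any, MetadataStandIn]],
) -> List[Tuple[Any, MetadataStandIn]]:
    """Keep only (changepoint, metadata) pairs flagged as increasing."""
    return [
        (changepoint, meta)
        for changepoint, meta in detected
        if meta.trend_direction == "increasing"
    ]
```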