kats.detectors.changepoint_evaluator module

class kats.detectors.changepoint_evaluator.BenchmarkEvaluator(detector: kats.detectors.detector.Detector)[source]

Bases: abc.ABC

class kats.detectors.changepoint_evaluator.Evaluation(dataset_name, precision, recall, f_score)

Bases: tuple

dataset_name

Alias for field number 0

f_score

Alias for field number 3

precision

Alias for field number 1

recall

Alias for field number 2

class kats.detectors.changepoint_evaluator.TuringEvaluator(detector: kats.detectors.detector.Detector, is_detector_model: bool = False)[source]

Bases: kats.detectors.changepoint_evaluator.BenchmarkEvaluator

Evaluates a changepoint detection algorithm. The evaluation follows the benchmarking method established in https://arxiv.org/pdf/2003.06222.pdf. By default, this evaluates the Turing changepoint benchmark introduced in that paper, which is the most comprehensive benchmark for changepoint detection algorithms.

You can also evaluate your own dataset. The dataset should be a dataframe with 3 columns:

  • 'dataset_name': str
  • 'time_series': str, e.g. "{'0': 0.55, '1': 0.56}"
  • 'annotation': str, e.g. "{'0': [1, 2], '1': [2, 3]}"

The annotation column lets multiple human labelers mark changepoints in the same time series: each key identifies one labeler and maps to that labeler's list of changepoint locations, which allows for uncertainty in the labels. Usage:

>>> model_params = {'p_value_cutoff': 5e-3, 'comparison_window': 2}
>>> turing_2 = TuringEvaluator(detector=RobustStatDetector)
>>> eval_agg_df_2 = turing_2.evaluate(data=eg_df, model_params=model_params)
The above produces a dataframe with scores for each dataset.
To get an average over all datasets you can do
>>> eval_agg = turing_2.get_eval_aggregate()
>>> avg_precision = eval_agg.get_average_precision()
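In the snippet above, eg_df is assumed to be a dataframe in the three-column format described earlier. A minimal sketch of constructing such a dataframe (the dataset name, values, and labels below are purely illustrative):

>>> import json
>>> import pandas as pd
>>> ts = json.dumps({'0': 0.55, '1': 0.56, '2': 2.0, '3': 2.1})
>>> ann = json.dumps({'0': [2], '1': [2, 3]})  # two labelers, possibly disagreeing
>>> row = {'dataset_name': 'toy_series', 'time_series': ts, 'annotation': ann}
>>> eg_df = pd.DataFrame([row])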
get_eval_aggregate()[source]

Returns the EvalAggregate object, which can then be used for further processing.

load_data() → pandas.core.frame.DataFrame[source]

Loads the data; the source is either the simulator or Hive.

kats.detectors.changepoint_evaluator.f_measure(annotations: Dict[str, List[int]], predictions: List[int], margin: int = 5, alpha: float = 0.5) → Dict[str, float][source]

Compute the F-measure based on human annotations.

Remember that all CP locations are 0-based!

>>> f_measure({1: [10, 20], 2: [11, 20], 3: [10], 4: [0, 5]}, [10, 20])
1.0
>>> f_measure({1: [], 2: [10], 3: [50]}, [10])
0.9090909090909091
>>> f_measure({1: [], 2: [10], 3: [50]}, [])
0.8
Parameters
  • annotations – dict from user_id to iterable of CP locations.

  • predictions – iterable of predicted CP locations.

  • margin – maximum absolute difference between an annotated and a predicted CP location for the two to count as a match.

  • alpha – value for the F-measure; alpha=0.5 gives the F1-measure.

kats.detectors.changepoint_evaluator.get_cp_index(changepoints: List[Tuple[kats.consts.TimeSeriesChangePoint, Any]], tsd: kats.consts.TimeSeriesData) → List[int][source]

Accepts the output of the Detector.detector() method, which is a list of (TimeSeriesChangePoint, Metadata) tuples, and returns the indices of the changepoints in the time series.
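A hypothetical usage sketch, assuming a detector such as CUSUMDetector whose detector() method returns (changepoint, metadata) tuples for a TimeSeriesData object tsd:

>>> from kats.detectors.cusum_detection import CUSUMDetector
>>> from kats.detectors.changepoint_evaluator import get_cp_index
>>> detector = CUSUMDetector(tsd)       # tsd: a kats.consts.TimeSeriesData
>>> changepoints = detector.detector()  # list of (TimeSeriesChangePoint, metadata)
>>> cp_indices = get_cp_index(changepoints, tsd)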

kats.detectors.changepoint_evaluator.true_positives(T: Set[int], X: Set[int], margin: int = 5) → Set[int][source]

Compute true positives without double counting.

>>> true_positives({1, 10, 20, 23}, {3, 8, 20})
{1, 10, 20}
>>> true_positives({1, 10, 20, 23}, {1, 3, 8, 20})
{1, 10, 20}
>>> true_positives({1, 10, 20, 23}, {1, 3, 5, 8, 20})
{1, 10, 20}
>>> true_positives(set(), {1, 2, 3})
set()
>>> true_positives({1, 2, 3}, set())
set()
Parameters
  • T – set of true (annotated) changepoint locations.

  • X – set of detected changepoint locations.

  • margin – maximum absolute difference between a true and a detected location for the pair to be counted as a match.

Returns

The set of true positives.
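One way to realize "without double counting" is a greedy pass that matches each true location to its nearest unused detection within the margin. The sketch below is consistent with the doctests above, but it is not necessarily the library's exact implementation:

from typing import Set

def true_positives_sketch(T: Set[int], X: Set[int], margin: int = 5) -> Set[int]:
    """Greedy matching: each detection in X accounts for at most one true location in T."""
    remaining = set(X)
    matched = set()
    for t in sorted(T):
        # candidate detections within `margin` of the true location, nearest first
        candidates = sorted((abs(t - x), x) for x in remaining if abs(t - x) <= margin)
        if candidates:
            matched.add(t)
            remaining.remove(candidates[0][1])
    return matched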