kats.models.metalearner.metalearner_predictability module
A module for meta-learner predictability.
This module contains the class MetaLearnPredictability, which predicts whether a time series is predictable or not.
A time series is considered predictable if the forecasting error of its best forecasting model is less than a user-defined threshold.
- class kats.models.metalearner.metalearner_predictability.MetaLearnPredictability(metadata: Optional[List[Any]] = None, threshold: float = 0.2, load_model=False)
Bases: object
Meta-learner framework on predictability. This framework uses classification algorithms to predict whether a time series is predictable or not (a time series whose error metric is less than a user-defined threshold is defined as predictable). For training, it uses time series features as inputs and, as labels, whether the best forecasting model's error is less than the user-defined threshold. For prediction, it takes time series or time series features as inputs to predict whether the corresponding time series is predictable or not. This class provides preprocess, train, pred, pred_by_feature, save_model and load_model.
- metadata
Optional; A list of dictionaries representing the meta-data of time series (e.g., the meta-data generated by a GetMetaData object). Each dictionary d must contain at least 3 components: ‘hpt_res’, ‘features’ and ‘best_model’. d[‘hpt_res’] represents the best hyper-parameters for each candidate model and the corresponding errors; d[‘features’] are time series features, and d[‘best_model’] is a string representing the best candidate model of the corresponding time series data. metadata should not be None unless load_model is True. Default is None.
- threshold
Optional; A float representing the threshold for the forecasting error. A time series is considered unpredictable if the forecasting error of its best forecasting model is higher than the threshold. Default is 0.2.
- load_model
Optional; A boolean to specify whether or not to load a trained model. Default is False.
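The attributes above can be illustrated with a small, self-contained sketch of what a metadata entry might look like and how threshold turns the best model's error into a predictability label. The model names, feature names and the (parameters, error) tuple layout of the ‘hpt_res’ values are assumptions made for illustration, and is_predictable is a hypothetical helper, not part of Kats.

```python
from typing import Any, Dict, List

# Hypothetical metadata entries. The (params, error) tuple layout of the
# 'hpt_res' values, the model names and the feature names are assumptions.
metadata: List[Dict[str, Any]] = [
    {
        "hpt_res": {
            "arima": ({"p": 1, "d": 0, "q": 1}, 0.12),
            "holtwinters": ({"trend": "add"}, 0.35),
        },
        "features": {"length": 120.0, "mean": 0.4, "var": 0.9},
        "best_model": "arima",
    },
    {
        "hpt_res": {
            "arima": ({"p": 2, "d": 1, "q": 0}, 0.48),
            "holtwinters": ({"trend": "mul"}, 0.41),
        },
        "features": {"length": 80.0, "mean": 1.7, "var": 3.2},
        "best_model": "holtwinters",
    },
]


def is_predictable(entry: Dict[str, Any], threshold: float = 0.2) -> bool:
    """Hypothetical helper: True unless the best model's error exceeds threshold."""
    _params, error = entry["hpt_res"][entry["best_model"]]
    return not error > threshold


# One label per time series, derived from the error of its best model.
labels = [is_predictable(entry) for entry in metadata]
```

With the default threshold of 0.2, the first entry is labeled predictable and the second is not; raising the threshold to 0.5 would make both predictable.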
- Sample Usage:
>>> mlp = MetaLearnPredictability(data)
>>> mlp.train()
>>> mlp.save_model()
>>> mlp.pred(TSdata) # Predict whether a time series is predictable.
>>> mlp2 = MetaLearnPredictability(load_model=True) # Create a new object to load the trained model
>>> mlp2.load_model()
- load_model(file_path) → None
Load a pre-trained model.
- Parameters
file_path – A string representing the path to load the pre-trained model.
- Returns
None.
- pred(source_ts: kats.consts.TimeSeriesData, ts_rescale: bool = True) → bool
Predict whether a time series is predictable or not.
- Parameters
source_ts – A kats.consts.TimeSeriesData object representing the new time series data.
ts_rescale – Optional; A boolean to specify whether or not to rescale the time series data (i.e., normalize it by its maximum value) before calculating features. Default is True.
- Returns
A boolean representing whether the time series is predictable or not.
- pred_by_feature(source_x: Union[numpy.ndarray, List[numpy.ndarray], pandas.core.frame.DataFrame]) → numpy.ndarray
Predict whether a list of time series are predictable or not given their time series features.
- Parameters
source_x – The time series features of the time series that one wants to predict. Can be a np.ndarray, a list of np.ndarray or a pd.DataFrame.
- Returns
A np.ndarray storing whether the corresponding time series are predictable or not.
- preprocess() → None
Rescale input time series features to zero-mean and unit-variance.
- Returns
None.
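As a rough illustration of what rescaling to zero mean and unit variance involves, the sketch below standardizes each feature column using only the standard library. It is not the Kats implementation: standardize is a hypothetical helper, and the use of the population standard deviation and the handling of constant columns are assumptions.

```python
from statistics import fmean, pstdev
from typing import List


def standardize(rows: List[List[float]]) -> List[List[float]]:
    """Rescale each feature column to zero mean and unit variance."""
    columns = list(zip(*rows))
    means = [fmean(col) for col in columns]
    # Population standard deviation (an assumption). Constant columns have
    # a standard deviation of 0, so fall back to 1.0 to avoid dividing by zero.
    stds = [pstdev(col) or 1.0 for col in columns]
    return [
        [(x - m) / s for x, m, s in zip(row, means, stds)]
        for row in rows
    ]


# Three time series described by two features each; the second is constant.
features = [[1.0, 10.0], [2.0, 10.0], [3.0, 10.0]]
scaled = standardize(features)
```

After scaling, the first column has mean 0 and standard deviation 1, while the constant second column is simply centered at 0.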
- save_model(file_path: str) → None
Save the trained model.
- Parameters
file_path – A string representing the path to save the trained model.
- Returns
None.
- train(method: str = 'RandomForest', valid_size: float = 0.1, test_size: float = 0.1, recall_threshold: float = 0.7, n_estimators: int = 500, n_neighbors: int = 5, **kwargs) → Dict[str, float]
Train a classifier with time series features to forecast predictability.
- Parameters
method – Optional; A string representing the name of the classification algorithm. Can be ‘RandomForest’, ‘GBDT’, ‘KNN’ or ‘NaiveBayes’. Default is ‘RandomForest’.
valid_size – Optional; A float representing the size of the validation set for parameter tuning, which should be within (0, 1). Default is 0.1.
test_size – Optional; A float representing the size of test set, which should be within [0., 1-valid_size). Default is 0.1.
recall_threshold – Optional; A float controlling the recall score of the classifier. The recall of the trained classifier will be larger than recall_threshold. Default is 0.7.
n_estimators – Optional; An integer representing the number of trees in random forest model. Default is 500.
n_neighbors – Optional; An integer representing the number of neighbors in KNN model. Default is 5.
- Returns
A dictionary storing the classifier performance on the test set (if test_size is valid).
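The documentation does not say how recall_threshold is enforced. One plausible approach, sketched below purely as an assumption, is to use the validation set to pick the largest score cutoff whose recall is still larger than recall_threshold. The helpers recall_at and pick_cutoff are hypothetical and are not Kats functions.

```python
from typing import Sequence


def recall_at(cutoff: float, scores: Sequence[float], labels: Sequence[int]) -> float:
    """Recall of the positive class when predicting positive for score >= cutoff."""
    positives = sum(labels)
    if positives == 0:
        return 0.0
    hits = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= cutoff)
    return hits / positives


def pick_cutoff(
    scores: Sequence[float],
    labels: Sequence[int],
    recall_threshold: float = 0.7,
) -> float:
    """Return the largest observed score whose recall is above recall_threshold."""
    for cutoff in sorted(set(scores), reverse=True):
        if recall_at(cutoff, scores, labels) > recall_threshold:
            return cutoff
    # No observed cutoff reaches the target: predict everything as positive.
    return 0.0


# Hypothetical validation scores (classifier probabilities) and true labels.
val_scores = [0.9, 0.8, 0.6, 0.4, 0.3, 0.1]
val_labels = [1, 1, 0, 1, 0, 1]
cutoff = pick_cutoff(val_scores, val_labels, recall_threshold=0.7)
```

In this toy example the chosen cutoff is 0.4, which recovers three of the four positive examples (recall 0.75). A lower recall_threshold would allow a stricter cutoff.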