kats.utils.simulator module

This module implements a simulator for generating synthetic time series data.

class kats.utils.simulator.Simulator(n: int = 100, freq: str = 'D', start: Optional[Any] = None)

Bases: object

TimeSeriesData simulator for generating synthetic time series data.

The Simulator currently supports generating synthetic time series from STL-style components (trend, seasonality, and noise) and from ARIMA models. It can also inject level and trend changepoints.

n: length of the time series.

freq: desired frequency (e.g. daily, weekly) of the time series.

start: start date of the time series.

add_noise(magnitude: float = 1.0, multiply: bool = False)

Add noise to the generated time series for the STL-based simulator.

The noise is drawn from an i.i.d. normal distribution. Noise generated by an ARMA process may be added in the future if use cases arise.

Parameters

magnitude – magnitude of the noise, float.
multiply – True if the noise is multiplicative, otherwise additive.

Returns

Generated timeseries.

add_seasonality(magnitude: float = 0.0, period: Union[datetime.timedelta, float, str] = '1D', multiply: bool = False) → kats.consts.TimeSeriesData

Add a seasonality component to the time series for the STL-based simulator.

Parameters

magnitude – magnitude of the seasonality, float.
period – period of the seasonality; a timedelta, float, or string (default '1D').
multiply – True if the seasonality is multiplicative, otherwise additive.

Returns

Generated timeseries.

add_trend(magnitude: float, trend_type: str = 'linear', multiply: bool = False)

Add a trend component to the target time series for the STL-based simulator.

Parameters

magnitude – slope of the trend, float.
trend_type – shape of the trend, string; either "linear" or "sigmoid".
multiply – True if the trend is multiplicative, otherwise additive.

Returns

Generated timeseries.
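The two trend shapes and the additive versus multiplicative option can be illustrated with a minimal, standard-library-only sketch. This is not the Kats implementation: the helper names `make_trend` and `apply_component` are hypothetical, and the sigmoid's centering and steepness are assumptions chosen for the example.

```python
import math
from typing import List


def make_trend(n: int, magnitude: float, trend_type: str = "linear") -> List[float]:
    """Hypothetical helper: return n trend values of the requested shape."""
    if trend_type == "linear":
        # Straight line through the origin with slope `magnitude`.
        return [magnitude * t for t in range(n)]
    if trend_type == "sigmoid":
        # Logistic curve centered at the series midpoint; the factor 10 / n
        # (an assumed steepness) keeps the exponent in a safe numeric range.
        mid = (n - 1) / 2
        return [magnitude / (1.0 + math.exp(-10.0 * (t - mid) / n)) for t in range(n)]
    raise ValueError(f"unknown trend_type: {trend_type!r}")


def apply_component(base: List[float], comp: List[float], multiply: bool) -> List[float]:
    """Combine a component with a base series multiplicatively or additively."""
    if multiply:
        return [b * c for b, c in zip(base, comp)]
    return [b + c for b, c in zip(base, comp)]
```

For instance, `apply_component(base, make_trend(len(base), 2.0), multiply=False)` adds a slope-2 line to `base`, while `multiply=True` scales each point of `base` by the trend value instead.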

arima_sim(ar: List[float], ma: List[float], mu: float = 0, sigma: float = 1, burnin: int = 10, d: int = 0, t: int = 0) → kats.consts.TimeSeriesData

Simulate data from an ARIMA model.

Data generation includes two steps:

(1) Simulate data from an ARMA(p', q) model. The configuration of the ARMA(p', q) model is:

X_t = alpha_1 * X_{t-1} + … + alpha_p' * X_{t-p'} + 1 * epsilon_t + theta_1 * epsilon_{t-1} + … + theta_q * epsilon_{t-q}

(2) Add drift. d is the order of differencing, and p = p' - d for ARIMA(p, d, q).

Parameters

ar – [alpha_1, …, alpha_p'], coefficients of the AR terms; p' = len(ar).
ma – [theta_1, …, theta_q], coefficients of the MA terms; q = len(ma).
epsilon – error terms, which follow a normal distribution with mean mu and standard deviation sigma.
mu – mean of the normal distribution for epsilon.
sigma – standard deviation of the normal distribution for epsilon.
burnin – number of initial data points that are dropped because lagged values are not yet available at the beginning.
d – number of unit roots for non-stationary data.
t – linear trend constant.

Returns

Generated timeseries.

Return type

kats.consts.TimeSeriesData

Examples:

>>> import numpy as np
>>> import pandas as pd
>>> from kats.utils.simulator import Simulator
>>> sim = Simulator(n=100, freq="MS", start=pd.to_datetime("2011-01-01 00:00:00"))
>>> np.random.seed(100)
>>> ts = sim.arima_sim(ar=[0.1, 0.05], ma=[0.04, 0.1], d=1)
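To make the ARMA recursion and the burn-in concrete, here is a minimal sketch in plain Python. It is not the Kats code: `arma_sketch` and its `seed` argument are hypothetical, and the integration step uses `d` cumulative sums, which is an assumption and may differ from how the library applies `d` (the note p = p' - d suggests the unit roots may instead be folded into the AR coefficients). The linear trend constant `t` is left out.

```python
import random
from typing import List


def arma_sketch(ar: List[float], ma: List[float], n: int,
                mu: float = 0.0, sigma: float = 1.0,
                burnin: int = 10, d: int = 0, seed: int = 0) -> List[float]:
    """Hypothetical helper: ARMA recursion, burn-in removal, then d cumulative sums."""
    rng = random.Random(seed)
    p, q = len(ar), len(ma)
    total = n + burnin
    # Error terms epsilon_t ~ Normal(mu, sigma).
    eps = [rng.gauss(mu, sigma) for _ in range(total)]
    x: List[float] = []
    for i in range(total):
        # AR part: alpha_k * X_{t-k}, using only lags that already exist.
        ar_part = sum(ar[k] * x[i - 1 - k] for k in range(p) if i - 1 - k >= 0)
        # MA part: theta_k * epsilon_{t-k}.
        ma_part = sum(ma[k] * eps[i - 1 - k] for k in range(q) if i - 1 - k >= 0)
        x.append(ar_part + eps[i] + ma_part)
    x = x[burnin:]  # discard warm-up values computed with incomplete lags
    for _ in range(d):
        # Each cumulative sum adds one unit root (one order of integration).
        acc, out = 0.0, []
        for v in x:
            acc += v
            out.append(acc)
        x = out
    return x
```

Calling `arma_sketch([0.1, 0.05], [0.04, 0.1], n=100, d=1, seed=100)` returns 100 values and is reproducible for a fixed seed.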

level_shift_multivariate_indep_sim(cp_arr: Optional[List[int]] = None, level_arr: Optional[List[float]] = None, noise: float = 30.0, seasonal_period: int = 7, seasonal_magnitude: float = 3.0, anomaly_arr: Optional[List[int]] = None, z_score_arr: Optional[List[float]] = None, dim: int = 3) → kats.consts.TimeSeriesData

Produces a multivariate time series with level shifts.

The positions of the level shifts are indicated by the beginning and end changepoints: the first change spans [first_cp_begin, first_cp_end], and the second spans [second_cp_begin, self.n]. The number of dimensions is set by dim.

Parameters

cp_arr – Array of changepoint locations.
level_arr – Array containing the level for each segment. Because the number of segments is one more than the number of changepoints, level_arr must be one element longer than cp_arr.
noise – standard deviation of the random Gaussian noise added.
seasonal_period – periodicity of the time series.
seasonal_magnitude – amplitude of the seasonality. Set this to 0 for a time series without seasonality.
anomaly_arr – locations where an anomalous point is introduced.
z_score_arr – same length as anomaly_arr. Each entry is the z-score of the anomaly introduced at the corresponding location in anomaly_arr.
dim – number of dimensions of the timeseries.

Returns

Generated timeseries.

level_shift_sim(random_seed: int = 100, cp_arr: Optional[List[int]] = None, level_arr: Optional[List[float]] = None, noise: float = 30.0, seasonal_period: int = 7, seasonal_magnitude: float = 3.0, anomaly_arr: Optional[List[int]] = None, z_score_arr: Optional[List[float]] = None) → kats.consts.TimeSeriesData

Produces a time series with level shifts.

The positions of the level shifts are indicated by the beginning and end changepoints: the first change spans [first_cp_begin, first_cp_end], and the second spans [second_cp_begin, self.n].

Parameters

random_seed – Seed, to reproduce the same time series.
cp_arr – Array of changepoint locations.
level_arr – Array containing the level for each segment. Because the number of segments is one more than the number of changepoints, level_arr must be one element longer than cp_arr.
noise – standard deviation of the random Gaussian noise added.
seasonal_period – periodicity of the time series.
seasonal_magnitude – amplitude of the seasonality. Set this to 0 for a time series without seasonality.
anomaly_arr – locations where an anomalous point is introduced.
z_score_arr – same length as anomaly_arr. Each entry is the z-score of the anomaly introduced at the corresponding location in anomaly_arr.

Returns

Generated timeseries.

Example Usage:

>>> from kats.utils.simulator import Simulator
>>> sim2 = Simulator(n=450, start="2018-01-01")
>>> ts2 = sim2.level_shift_sim(
...     cp_arr=[100, 200],
...     level_arr=[3, 20, 2],
...     noise=3,
...     seasonal_period=7,
...     seasonal_magnitude=3,
...     anomaly_arr=[50, 150, 250],
...     z_score_arr=[10, -10, 20],
... )
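The relationship between cp_arr, level_arr, and the anomaly arrays can be sketched in plain Python as follows. This is an illustrative sketch, not the library's code: `level_shift_sketch` is a hypothetical helper, the sinusoidal form of the seasonality is an assumption, and so is the choice to model an anomaly as a shift of z times the noise standard deviation.

```python
import math
import random
from typing import List, Optional


def level_shift_sketch(n: int, cp_arr: List[int], level_arr: List[float],
                       noise: float = 0.0, seasonal_period: int = 7,
                       seasonal_magnitude: float = 0.0,
                       anomaly_arr: Optional[List[int]] = None,
                       z_score_arr: Optional[List[float]] = None,
                       seed: int = 100) -> List[float]:
    """Hypothetical helper: piecewise-constant levels plus seasonality and noise."""
    if len(level_arr) != len(cp_arr) + 1:
        raise ValueError("level_arr must have one more element than cp_arr")
    rng = random.Random(seed)
    # Segment boundaries: [0, cp_1), [cp_1, cp_2), ..., [cp_k, n).
    bounds = [0] + list(cp_arr) + [n]
    values: List[float] = []
    for level, lo, hi in zip(level_arr, bounds, bounds[1:]):
        for t in range(lo, hi):
            season = seasonal_magnitude * math.sin(2 * math.pi * t / seasonal_period)
            values.append(level + season + rng.gauss(0.0, noise))
    # Anomalies (assumed form): shift the chosen points by z noise standard deviations.
    for idx, z in zip(anomaly_arr or [], z_score_arr or []):
        values[idx] += z * noise
    return values
```

With two changepoints, as in `cp_arr=[100, 200]`, the series has three segments, which is why `level_arr` needs three entries.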

stl_sim() → kats.consts.TimeSeriesData

Simulate time series data with seasonality, trend, and noise.

Parameters: None.
Returns: Generated timeseries.

Example usage:

>>> from datetime import timedelta
>>> import pandas as pd
>>> from kats.utils.simulator import Simulator
>>> sim = Simulator(n=100, freq="1D", start=pd.to_datetime("2011-01-01"))
>>> sim.add_trend(magnitude=10)
>>> sim.add_seasonality(5, period=timedelta(days=7))
>>> sim.add_noise(magnitude=2)
>>> sim_ts = sim.stl_sim()
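As a rough picture of what an additive trend, seasonality, and noise composition looks like, consider the following standard-library sketch. The function `stl_sketch` and all of its parameter names are hypothetical, and the sinusoidal seasonality and purely additive combination are assumptions. It does not reproduce the values Kats generates.

```python
import math
import random
from typing import List


def stl_sketch(n: int, slope: float, season_amp: float, period: int,
               noise_sd: float, seed: int = 0) -> List[float]:
    """Hypothetical helper: additive linear trend + sinusoidal seasonality + Gaussian noise."""
    rng = random.Random(seed)
    series: List[float] = []
    for t in range(n):
        trend = slope * t
        season = season_amp * math.sin(2 * math.pi * t / period)
        series.append(trend + season + rng.gauss(0.0, noise_sd))
    return series
```

Setting `noise_sd=0.0` and `slope=0.0` leaves only the seasonal component, which repeats every `period` steps. This makes it easy to inspect each component in isolation.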

trend_shift_sim(random_seed: int = 15, cp_arr: Optional[List[int]] = None, trend_arr: Optional[List[float]] = None, intercept: float = 100.0, noise: float = 30.0, seasonal_period: int = 7, seasonal_magnitude: float = 3.0, anomaly_arr: Optional[List[int]] = None, z_score_arr: Optional[List[int]] = None) → kats.consts.TimeSeriesData

Produces a time series with multiple trend shifts and seasonality.

This can be used as synthetic data to test trend changepoints. first_cp_begin is where the trend change begins, and the change continues until the end of the series.

Parameters

random_seed – Seed, to reproduce the same time series.
cp_arr – Array of changepoint locations.
trend_arr – Array containing the trend for each segment. Because the number of segments is one more than the number of changepoints, trend_arr must be one element longer than cp_arr.
noise – standard deviation of the random Gaussian noise added.
seasonal_period – periodicity of the time series.
seasonal_magnitude – amplitude of the seasonality. Set this to 0 for a time series without seasonality.
anomaly_arr – locations where an anomalous point is introduced.
z_score_arr – same length as anomaly_arr. Each entry is the z-score of the anomaly introduced at the corresponding location in anomaly_arr.

Returns

Generated timeseries.

Example usage:

>>> from kats.utils.simulator import Simulator
>>> sim2 = Simulator(n=450, start="2018-01-01")
>>> ts2 = sim2.trend_shift_sim(
...     cp_arr=[100, 200],
...     trend_arr=[3, 20, 2],
...     intercept=30,
...     noise=30,
...     seasonal_period=7,
...     seasonal_magnitude=3,
...     anomaly_arr=[50, 150, 250],
...     z_score_arr=[10, -10, 20],
... )
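The idea of one slope per segment can be illustrated with a short sketch of a piecewise-linear trend. The helper `piecewise_trend` is hypothetical, and the choice to keep the series continuous at each changepoint is an assumption rather than documented library behavior. Noise, seasonality, and anomalies are omitted to keep the focus on the slopes.

```python
from typing import List


def piecewise_trend(n: int, cp_arr: List[int], trend_arr: List[float],
                    intercept: float = 0.0) -> List[float]:
    """Hypothetical helper: continuous piecewise-linear trend with one slope per segment."""
    if len(trend_arr) != len(cp_arr) + 1:
        raise ValueError("trend_arr must have one more element than cp_arr")
    # Segment boundaries: [0, cp_1), [cp_1, cp_2), ..., [cp_k, n).
    bounds = [0] + list(cp_arr) + [n]
    values: List[float] = []
    current = intercept
    for slope, lo, hi in zip(trend_arr, bounds, bounds[1:]):
        for _ in range(lo, hi):
            values.append(current)
            # The segment's slope governs the step to the next point.
            current += slope
    return values
```

Here the slope changes at each index in `cp_arr` while the value itself does not jump, in contrast to a level shift.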
