kats.utils.simulator module
This module implements a simulator for generating synthetic time series data.
- class kats.utils.simulator.Simulator(n: int = 100, freq: str = 'D', start: Optional[Any] = None)
Bases: object
TimeSeriesData simulator for generating synthetic time series data.
The Simulator currently supports generating synthetic time series from STL components (trend, seasonality, and noise) and from ARIMA models, and it can also add level and trend changepoints.
- n
length of the time series.
- freq
desired frequency (e.g. daily, weekly) of the time series.
- start
start date of the time series.
- add_noise(magnitude: float = 1.0, multiply: bool = False)
Add noise to the generated time series for the STL-based simulator.
The noise is drawn from an i.i.d. normal distribution. Noise generated by an ARMA process may be added in the future if there are use cases.
- Parameters
magnitude – scale of the noise, float.
multiply – True if the noise is multiplicative, otherwise additive.
- Returns
Generated timeseries.
- add_seasonality(magnitude: float = 0.0, period: Union[datetime.timedelta, float, str] = '1D', multiply: bool = False) → kats.consts.TimeSeriesData
Add a seasonality component to the time series for the STL-based simulator.
- Parameters
magnitude – amplitude of the seasonality, float.
period – period of the seasonality; a timedelta, float, or string.
multiply – True if the seasonality is multiplicative, otherwise additive.
- Returns
Generated timeseries.
- add_trend(magnitude: float, trend_type: str = 'linear', multiply: bool = False)
Add a trend component to the target time series for the STL-based simulator.
trend_type is the shape of the trend, either 'linear' or 'sigmoid'.
- Parameters
magnitude – slope of the trend, float.
trend_type – 'linear' or 'sigmoid', string.
multiply – True if the trend is multiplicative, otherwise additive.
- Returns
The timeseries generated.
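The two trend shapes can be illustrated with a small standard-library sketch. It does not call Kats, and the helper name `make_trend`, the rescaling of time to [0, 1), and the sigmoid steepness of 10 are all assumptions made for illustration rather than details confirmed by this page.

```python
import math


def make_trend(n, magnitude, trend_type="linear"):
    """Return a trend component of length n (illustrative, not Kats code)."""
    # Assumption: time is rescaled to [0, 1), so `magnitude` is the total
    # rise over the series rather than a per-step increment.
    t = [i / n for i in range(n)]
    if trend_type == "linear":
        return [magnitude * x for x in t]
    if trend_type == "sigmoid":
        # Assumption: logistic curve centred on the midpoint, steepness 10.
        return [magnitude / (1.0 + math.exp(-10.0 * (x - 0.5))) for x in t]
    raise ValueError(f"unknown trend_type: {trend_type!r}")


linear = make_trend(100, 10.0, "linear")
sigmoid = make_trend(100, 10.0, "sigmoid")
print(linear[0], linear[50])    # start and midpoint of the straight line
print(sigmoid[0], sigmoid[50])  # the S-curve starts near zero and reaches half height at the midpoint
```

Under these assumptions both shapes cover the same overall range, but the sigmoid concentrates most of the change around the middle of the series.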
- arima_sim(ar: List[float], ma: List[float], mu: float = 0, sigma: float = 1, burnin: int = 10, d: int = 0, t: int = 0) → kats.consts.TimeSeriesData
Simulate data from ARIMA model.
Data generation includes two steps:
(1) Simulate data from an ARMA(p', q) model. The configuration of the ARMA(p', q) model is:
X_t = alpha_1 * X_{t-1} + ... + alpha_p' * X_{t-p'} + epsilon_t + theta_1 * epsilon_{t-1} + ... + theta_q * epsilon_{t-q}
(2) Add drift d. Here d is the order of differencing, and p = p' - d for ARIMA(p, d, q).
- Parameters
ar – [alpha_1, ..., alpha_p'], the AR coefficients. p' = len(ar).
ma – [theta_1, ..., theta_q], the MA coefficients. q = len(ma).
epsilon – error terms, which follow a normal distribution with parameters (mu, sigma).
mu – mean of the normal distribution for epsilon.
sigma – standard deviation of the normal distribution for epsilon.
burnin – number of initial data points that are dropped because lagged data is not yet available at the beginning.
d – number of unit roots for non-stationary data.
t – linear trend constant.
- Returns
TimeSeries generated.
- Return type
kats.consts.TimeSeriesData
Examples:
>>> import numpy as np
>>> import pandas as pd
>>> from kats.utils.simulator import Simulator
>>> sim = Simulator(n=100, freq='MS', start=pd.to_datetime('2011-01-01 00:00:00'))
>>> np.random.seed(100)
>>> ts = sim.arima_sim(ar=[0.1, 0.05], ma=[0.04, 0.1], d=1)
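The ARMA recursion in step (1) can be sketched with the standard library alone. This is an illustration of the formula, not the Kats implementation: the function names `arma_sim` and `integrate` and the `seed` argument are hypothetical, and integrating with a cumulative sum is one common way to introduce d unit roots, which may differ from how `arima_sim` handles `d` internally. The linear trend constant `t` is left out.

```python
import random
from itertools import accumulate


def arma_sim(n, ar, ma, mu=0.0, sigma=1.0, burnin=10, seed=None):
    """Simulate X_t = sum_i ar[i] * X_{t-i} + eps_t + sum_j ma[j] * eps_{t-j}."""
    rng = random.Random(seed)
    total = n + burnin
    # Error terms drawn from a normal distribution with mean mu and std sigma.
    eps = [rng.gauss(mu, sigma) for _ in range(total)]
    x = []
    for t in range(total):
        # Lags that reach before the start of the series are skipped,
        # which is why the first `burnin` values are discarded below.
        ar_part = sum(a * x[t - i] for i, a in enumerate(ar, start=1) if t - i >= 0)
        ma_part = sum(m * eps[t - j] for j, m in enumerate(ma, start=1) if t - j >= 0)
        x.append(ar_part + eps[t] + ma_part)
    return x[burnin:]


def integrate(series, d):
    """Apply a cumulative sum d times (the inverse of d-th order differencing)."""
    out = list(series)
    for _ in range(d):
        out = list(accumulate(out))
    return out


stationary = arma_sim(100, ar=[0.1, 0.05], ma=[0.04, 0.1], seed=100)
path = integrate(stationary, d=1)
print(len(stationary), len(path))
```

Dropping the burn-in values removes the points whose lagged terms were incomplete, which matches the purpose of the `burnin` parameter described above.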
- level_shift_multivariate_indep_sim(cp_arr: Optional[List[int]] = None, level_arr: Optional[List[float]] = None, noise: float = 30.0, seasonal_period: int = 7, seasonal_magnitude: float = 3.0, anomaly_arr: Optional[List[int]] = None, z_score_arr: Optional[List[float]] = None, dim: int = 3) → kats.consts.TimeSeriesData
Produces a multivariate time series with level shifts.
The positions of the level shifts are given by the changepoints. Each changepoint ends one segment and begins the next, so the first level change lasts from the first changepoint to the second, and the last one lasts from the final changepoint to self.n. The number of dimensions is given by dim.
- Parameters
cp_arr – array of changepoint locations.
level_arr – array containing the level for each segment. Because there is one more segment than there are changepoints, level_arr should be longer than cp_arr by one.
noise – standard deviation of the random Gaussian noise added.
seasonal_period – periodicity of the time series.
seasonal_magnitude – amplitude of the seasonality. Set this to 0 for a time series without seasonality.
anomaly_arr – locations where an anomalous point is introduced.
z_score_arr – same length as anomaly_arr. Each entry is the z-score of the anomaly introduced at the corresponding location in anomaly_arr.
dim – number of dimensions of the time series.
- Returns
Generated timeseries.
- level_shift_sim(random_seed: int = 100, cp_arr: Optional[List[int]] = None, level_arr: Optional[List[float]] = None, noise: float = 30.0, seasonal_period: int = 7, seasonal_magnitude: float = 3.0, anomaly_arr: Optional[List[int]] = None, z_score_arr: Optional[List[float]] = None) → kats.consts.TimeSeriesData
Produces a time series with level shifts.
The positions of the level shifts are given by the changepoints. Each changepoint ends one segment and begins the next, so the first level change lasts from the first changepoint to the second, and the last one lasts from the final changepoint to self.n.
- Parameters
random_seed – seed, to reproduce the same time series.
cp_arr – array of changepoint locations.
level_arr – array containing the level for each segment. Because there is one more segment than there are changepoints, level_arr should be longer than cp_arr by one.
noise – standard deviation of the random Gaussian noise added.
seasonal_period – periodicity of the time series.
seasonal_magnitude – amplitude of the seasonality. Set this to 0 for a time series without seasonality.
anomaly_arr – locations where an anomalous point is introduced.
z_score_arr – same length as anomaly_arr. Each entry is the z-score of the anomaly introduced at the corresponding location in anomaly_arr.
- Returns
Generated timeseries.
Example usage:
>>> from kats.utils.simulator import Simulator
>>> sim2 = Simulator(n=450, start='2018-01-01')
>>> ts2 = sim2.level_shift_sim(
...     cp_arr=[100, 200],
...     level_arr=[3, 20, 2],
...     noise=3,
...     seasonal_period=7,
...     seasonal_magnitude=3,
...     anomaly_arr=[50, 150, 250],
...     z_score_arr=[10, -10, 20],
... )
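How cp_arr, level_arr, and the anomaly arrays fit together can be shown with a standalone sketch. It is not the Kats implementation: the function name `level_shift_series`, the sine-wave seasonality, and the rule that an anomaly offsets a point by z-score times the noise standard deviation are assumptions chosen for illustration.

```python
import math
import random


def level_shift_series(n, cp_arr, level_arr, noise=0.0, seasonal_period=7,
                       seasonal_magnitude=0.0, anomaly_arr=(), z_score_arr=(),
                       seed=100):
    """Piecewise-constant levels plus seasonality, noise, and point anomalies."""
    if len(level_arr) != len(cp_arr) + 1:
        raise ValueError("level_arr must have one more entry than cp_arr")
    if len(anomaly_arr) != len(z_score_arr):
        raise ValueError("anomaly_arr and z_score_arr must have the same length")
    rng = random.Random(seed)
    # Segment k runs from bounds[k] up to (but not including) bounds[k + 1].
    bounds = [0, *cp_arr, n]
    values = []
    for level, start, end in zip(level_arr, bounds, bounds[1:]):
        for t in range(start, end):
            season = seasonal_magnitude * math.sin(2 * math.pi * t / seasonal_period)
            values.append(level + season + rng.gauss(0.0, noise))
    # Assumption: an anomaly shifts the point by z_score * noise std. dev.
    for idx, z in zip(anomaly_arr, z_score_arr):
        values[idx] += z * noise
    return values


series = level_shift_series(
    450, cp_arr=[100, 200], level_arr=[3, 20, 2], noise=3,
    seasonal_period=7, seasonal_magnitude=3,
    anomaly_arr=[50, 150, 250], z_score_arr=[10, -10, 20],
)
print(len(series))
```

The length check mirrors the requirement that level_arr has exactly one more entry than cp_arr, since two changepoints split the series into three segments.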
- stl_sim() → kats.consts.TimeSeriesData
Simulate time series data with seasonality, trend, and noise.
- Parameters
None.
- Returns
Generated timeseries.
Example usage:
>>> from datetime import timedelta
>>> import pandas as pd
>>> from kats.utils.simulator import Simulator
>>> sim = Simulator(n=100, freq='1D', start=pd.to_datetime('2011-01-01'))
>>> sim.add_trend(magnitude=10)
>>> sim.add_seasonality(5, period=timedelta(days=7))
>>> sim.add_noise(magnitude=2)
>>> sim_ts = sim.stl_sim()
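The difference between additive and multiplicative components can be sketched without Kats. The helper `add_component`, the starting series of ones, and the convention that a multiplicative component scales the series by (1 + component) are assumptions for illustration; this page does not state how the library combines components internally.

```python
import math
import random


def add_component(series, component, multiply=False):
    """Combine a component with a series, additively or multiplicatively."""
    if multiply:
        # Assumption: multiplicative means scaling by (1 + component).
        return [s * (1.0 + c) for s, c in zip(series, component)]
    return [s + c for s, c in zip(series, component)]


n = 28
rng = random.Random(42)
trend = [10.0 * i / n for i in range(n)]                           # rises toward 10
season = [5.0 * math.sin(2 * math.pi * i / 7) for i in range(n)]   # weekly cycle
noise = [rng.gauss(0.0, 2.0) for _ in range(n)]                    # i.i.d. normal

series = [1.0] * n  # assumed starting point before any component is added
series = add_component(series, trend)
series = add_component(series, season)
series = add_component(series, noise)
print(len(series))
```

With additive components the seasonal swing stays the same size throughout, whereas a multiplicative seasonality would grow along with the trend.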
- trend_shift_sim(random_seed: int = 15, cp_arr: Optional[List[int]] = None, trend_arr: Optional[List[float]] = None, intercept: float = 100.0, noise: float = 30.0, seasonal_period: int = 7, seasonal_magnitude: float = 3.0, anomaly_arr: Optional[List[int]] = None, z_score_arr: Optional[List[int]] = None) → kats.consts.TimeSeriesData
Produces a time series with multiple trend shifts and seasonality.
This can be used as synthetic data for testing trend changepoints. Each trend change begins at its changepoint and continues until the next changepoint or, for the last one, until the end of the series.
- Parameters
random_seed – seed, to reproduce the same time series.
cp_arr – array of changepoint locations.
trend_arr – array containing the trend for each segment. Because there is one more segment than there are changepoints, trend_arr should be longer than cp_arr by one.
noise – standard deviation of the random Gaussian noise added.
seasonal_period – periodicity of the time series.
seasonal_magnitude – amplitude of the seasonality. Set this to 0 for a time series without seasonality.
anomaly_arr – locations where an anomalous point is introduced.
z_score_arr – same length as anomaly_arr. Each entry is the z-score of the anomaly introduced at the corresponding location in anomaly_arr.
- Returns
Generated timeseries.
Example usage:
>>> from kats.utils.simulator import Simulator
>>> sim2 = Simulator(n=450, start='2018-01-01')
>>> ts2 = sim2.trend_shift_sim(
...     cp_arr=[100, 200],
...     trend_arr=[3, 20, 2],
...     intercept=30,
...     noise=30,
...     seasonal_period=7,
...     seasonal_magnitude=3,
...     anomaly_arr=[50, 150, 250],
...     z_score_arr=[10, -10, 20],
... )
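A piecewise-linear series with trend changes can be sketched in plain Python as follows. This is an illustration rather than the Kats code: the function name `trend_shift_series` is hypothetical, and it assumes that each entry of trend_arr is a per-step slope, that the series stays continuous at each changepoint, and that `intercept` is the value at the first time step.

```python
import math
import random


def trend_shift_series(n, cp_arr, trend_arr, intercept=100.0, noise=0.0,
                       seasonal_period=7, seasonal_magnitude=0.0, seed=15):
    """Piecewise-linear trend with optional seasonality and Gaussian noise."""
    if len(trend_arr) != len(cp_arr) + 1:
        raise ValueError("trend_arr must have one more entry than cp_arr")
    rng = random.Random(seed)
    bounds = [0, *cp_arr, n]
    values = []
    level = float(intercept)  # running value of the trend line
    for slope, start, end in zip(trend_arr, bounds, bounds[1:]):
        for t in range(start, end):
            season = seasonal_magnitude * math.sin(2 * math.pi * t / seasonal_period)
            values.append(level + season + rng.gauss(0.0, noise))
            # Assumption: the slope is a per-step increment, so the line
            # stays continuous when the slope changes at a changepoint.
            level += slope
    return values


series = trend_shift_series(
    450, cp_arr=[100, 200], trend_arr=[3, 20, 2], intercept=30,
    noise=30, seasonal_period=7, seasonal_magnitude=3,
)
print(len(series))
```

Carrying the running level across segments means a changepoint alters only the slope, unlike a level shift, where the value itself jumps.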