zoo_resample {distantia} | R Documentation |
Resample Zoo Objects to a New Time
Description
Objective
Time series resampling involves interpolating new values for time steps not available in the original time series. This operation is useful to:
- Transform irregular time series into regular ones.
- Align time series with different temporal resolutions.
- Increase (upsampling) or decrease (downsampling) the temporal resolution of a time series.
On the other hand, time series resampling should not be used to extrapolate new values outside the original time range of the time series, or to increase the resolution of a time series by a factor of two or more. These operations are known to produce nonsensical results.
Methods
This function offers three methods for time series interpolation:
- "linear" (default): interpolation via piecewise linear regression as implemented in zoo::na.approx().
- "spline": cubic smoothing spline regression as implemented in stats::smooth.spline().
- "loess": local polynomial regression fitting as implemented in stats::loess().
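As a rough illustration of the three underlying functions named above, the sketch below calls them directly on a toy irregular series. It is not the code path of zoo_resample(); the toy data and the object names (time_old, time_new, y_linear, y_spline, y_loess) are assumptions made for this example, and the spline and loess fits use the default settings of each function.

```r
library(zoo)

# toy irregular series: 30 observations at sorted, unique numeric times
set.seed(1)
time_old <- as.numeric(sort(sample(1:100, size = 30)))
y <- sin(time_old / 10) + rnorm(30, sd = 0.1)

# new times, kept inside the original range to avoid extrapolation
time_new <- seq(from = min(time_old), to = max(time_old), length.out = 40)

# "linear": zoo::na.approx() interpolates at the times given in xout
z <- zoo(y, order.by = time_old)
y_linear <- na.approx(z, xout = time_new)

# "spline": cubic smoothing spline fitted on y ~ time, predicted at time_new
fit_spline <- stats::smooth.spline(x = time_old, y = y)
y_spline <- predict(fit_spline, x = time_new)$y

# "loess": local polynomial regression fitted on y ~ time, predicted at time_new
fit_loess <- stats::loess(y ~ x, data = data.frame(x = time_old, y = y))
y_loess <- predict(fit_loess, newdata = data.frame(x = time_new))
```

All three results have one value per element of time_new; only the linear result is returned as a zoo object here.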
These methods are used to fit models y ~ x, where y represents the values of a univariate time series and x represents a numeric version of its time.
The functions utils_optimize_spline() and utils_optimize_loess() are used under the hood to optimize the complexity of the methods "spline" and "loess" by finding the configuration that minimizes the root mean squared error (RMSE) between observed and predicted y. However, when the argument max_complexity = TRUE, the complexity optimization is ignored, and a maximum complexity model is used instead.
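The idea of selecting a complexity by RMSE can be sketched with base R alone. This is only an illustration under stated assumptions: the helper rmse(), the toy data, and the grid df_candidates are hypothetical, and the actual search grid and safeguards used by utils_optimize_spline() are not described in this page.

```r
# hypothetical helper: root mean squared error between observed and predicted
rmse <- function(observed, predicted) {
  sqrt(mean((observed - predicted)^2))
}

# toy series on sorted numeric times
set.seed(1)
x <- sort(runif(60, min = 0, max = 100))
y <- sin(x / 10) + rnorm(60, sd = 0.2)

# assumed grid of complexities: degrees of freedom for smooth.spline()
df_candidates <- seq(from = 4, to = 30, by = 2)

# fit one model per candidate and record its RMSE on the observed data
rmse_by_df <- vapply(
  df_candidates,
  function(df) {
    fit <- stats::smooth.spline(x = x, y = y, df = df)
    rmse(observed = y, predicted = predict(fit, x = x)$y)
  },
  numeric(1)
)

# candidate with the lowest RMSE
best_df <- df_candidates[which.min(rmse_by_df)]
best_df
```

Because the RMSE is computed on the same data used for fitting, it tends to decrease as the degrees of freedom grow, so the result of a sketch like this depends heavily on the upper bound of the grid. An analogous loop over the span argument of stats::loess() would cover the "loess" method.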
New time
The argument new_time offers several alternatives to help define the new time of the resulting time series:
- NULL: the target time series (x) is resampled to a regular time within its original time range and number of observations.
- zoo object: a zoo object to be used as template for resampling. Useful when the objective is equalizing the frequency of two separate zoo objects.
- time vector: a time vector of a class compatible with the time in x.
- keyword: character string defining a resampling keyword, obtained via zoo_time(x, keywords = "resample")$keywords.
- numeric: a single number representing the desired interval between consecutive samples in the units of x (relevant units can be obtained via zoo_time(x)$units).
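To make the NULL, zoo object, and numeric alternatives more concrete, the sketch below builds comparable time vectors by hand with base R and zoo. It only illustrates what each option amounts to; how zoo_resample() constructs these vectors internally is an assumption, and the objects x, template, and the time_* vectors are hypothetical.

```r
library(zoo)

# toy irregular series with a Date index (time units: days)
set.seed(1)
time_old <- as.Date("2010-01-01") + sort(sample(0:3650, size = 50))
x <- zoo(rnorm(50), order.by = time_old)

# analogous to new_time = NULL:
# regular time with the same range and number of observations as x
time_regular <- seq(
  from = min(index(x)),
  to = max(index(x)),
  length.out = length(index(x))
)

# analogous to a numeric new_time:
# fixed interval of 90 days between consecutive samples
time_by_interval <- seq(
  from = min(index(x)),
  to = max(index(x)),
  by = 90
)

# analogous to a zoo object used as template:
# take the template's index and keep only the times within the range of x
template <- zoo(
  rnorm(20),
  order.by = as.Date("2009-01-01") + seq(from = 0, to = 4750, by = 250)
)
time_template <- index(template)
time_template <- time_template[
  time_template >= min(index(x)) & time_template <= max(index(x))
]
```

Any of these vectors could then be passed as a time vector to the new_time argument.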
Step by Step
The steps to resample a zoo object are:
- The time interpolation range is taken from the index of the zoo object. This step ensures that no extrapolation occurs during resampling.
- If new_time is provided, any values of new_time outside of the minimum and maximum interpolation times are removed to avoid extrapolation. If new_time is not provided, a regular time within the interpolation time range of the zoo object is generated.
- For each univariate time series, a model y ~ x is fitted, where y is the time series and x is its own time coerced to numeric.
- If max_complexity == FALSE and method = "spline" or method = "loess", the model with the complexity that minimizes the root mean squared error between the observed and predicted y is returned.
- If max_complexity == TRUE and method = "spline" or method = "loess", the first valid model closest to a maximum complexity is returned.
- The fitted model is predicted over new_time to generate the resampled time series.
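The steps above can be sketched as a small function. This is a simplified stand-in, not the implementation of zoo_resample(): resample_sketch() is a hypothetical helper, it expects a multivariate zoo object, and it fits stats::smooth.spline() with its default settings instead of the RMSE-based optimization or the maximum complexity model.

```r
library(zoo)

# hypothetical helper, not part of distantia
resample_sketch <- function(x, new_time = NULL) {
  time_old <- index(x)

  # interpolation range taken from the index of the zoo object
  time_range <- range(time_old)

  if (is.null(new_time)) {
    # no new_time: regular time within the interpolation range
    new_time <- seq(
      from = time_range[1],
      to = time_range[2],
      length.out = length(time_old)
    )
  } else {
    # new_time provided: drop values outside the range to avoid extrapolation
    new_time <- new_time[
      new_time >= time_range[1] & new_time <= time_range[2]
    ]
  }

  # per column: fit y ~ x with time coerced to numeric, predict over new_time
  y_new <- apply(coredata(x), 2, function(y) {
    fit <- stats::smooth.spline(x = as.numeric(time_old), y = y)
    predict(fit, x = as.numeric(new_time))$y
  })

  zoo(y_new, order.by = new_time)
}

# toy two-column irregular series
set.seed(1)
time_old <- as.Date("2010-01-01") + sort(sample(0:3650, size = 50))
x <- zoo(cbind(a = rnorm(50), b = rnorm(50)), order.by = time_old)

# new time that deliberately overshoots the original range on both sides
new_time <- seq(as.Date("2009-01-01"), as.Date("2021-01-01"), by = "month")

x_new <- resample_sketch(x, new_time = new_time)
range(index(x_new))
```

The range of the resampled index stays within the range of the original index, because the months before the first observation and after the last one are discarded before prediction.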
Other Details
Please use this operation with care, as there are limits to the amount of resampling that can be done without distorting the data. The safest option is to keep the distance between new time points within the same order of magnitude as the distance between the old time points.
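One informal way to check this before resampling is to compare the typical interval of the new time against that of the old time. The helper interval_ratio() below is hypothetical and not part of distantia; the use of the median interval is an assumption made for this sketch.

```r
# hypothetical helper: ratio between the median interval of the new time
# and the median interval of the old time
interval_ratio <- function(time_old, time_new) {
  median(diff(as.numeric(time_new))) / median(diff(as.numeric(time_old)))
}

# toy irregular time (days) and a candidate regular time every 10 days
time_old <- as.Date("2020-01-01") + c(0, 9, 21, 30, 42, 50)
time_new <- seq(from = min(time_old), to = max(time_old), by = 10)

ratio <- interval_ratio(time_old, time_new)
ratio
```

A ratio close to 1 means the new and old intervals are similar. A ratio below 0.5 corresponds to increasing the resolution by a factor of more than two, which the Description advises against.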
Usage
zoo_resample(
  x = NULL,
  new_time = NULL,
  method = "linear",
  max_complexity = FALSE
)
Arguments
x | (required, zoo object) Time series to resample. Default: NULL |
new_time | (optional, zoo object, keyword, or time vector) New time to resample x to. See the section "New time" in the Description for the available options. Default: NULL |
method | (optional, character string) Name of the method to resample the time series. One of "linear", "spline" or "loess". Default: "linear". |
max_complexity | (required, logical) Only relevant for methods "spline" and "loess". If TRUE, model optimization is ignored, and a model of maximum complexity (an overfitted model) is used for resampling. Default: FALSE |
Value
zoo object
See Also
Other zoo_functions: zoo_aggregate(), zoo_name_clean(), zoo_name_get(), zoo_name_set(), zoo_permute(), zoo_plot(), zoo_smooth_exponential(), zoo_smooth_window(), zoo_time(), zoo_to_tsl(), zoo_vector_to_matrix()
Examples
#simulate irregular time series
x <- zoo_simulate(
  cols = 2,
  rows = 50,
  time_range = c("2010-01-01", "2020-01-01"),
  irregular = TRUE
)
#plot time series
if(interactive()){
  zoo_plot(x)
}
#intervals between samples
x_intervals <- diff(zoo::index(x))
x_intervals
#create regular time from the minimum of the observed intervals
new_time <- seq.Date(
  from = min(zoo::index(x)),
  to = max(zoo::index(x)),
  by = floor(min(x_intervals))
)
new_time
diff(new_time)
#resample using piecewise linear regression
x_linear <- zoo_resample(
  x = x,
  new_time = new_time,
  method = "linear"
)
#resample using max complexity splines
x_spline <- zoo_resample(
  x = x,
  new_time = new_time,
  method = "spline",
  max_complexity = TRUE
)
#resample using max complexity loess
x_loess <- zoo_resample(
  x = x,
  new_time = new_time,
  method = "loess",
  max_complexity = TRUE
)
#intervals between new samples
diff(zoo::index(x_linear))
diff(zoo::index(x_spline))
diff(zoo::index(x_loess))
#plotting results
if(interactive()){
  par(mfrow = c(4, 1), mar = c(3, 3, 2, 2))
  zoo_plot(
    x,
    guide = FALSE,
    title = "Original"
  )
  zoo_plot(
    x_linear,
    guide = FALSE,
    title = "Method: linear"
  )
  zoo_plot(
    x_spline,
    guide = FALSE,
    title = "Method: spline"
  )
  zoo_plot(
    x_loess,
    guide = FALSE,
    title = "Method: loess"
  )
}