library("xts")
## Loading required package: zoo
##
## Attaching package: 'zoo'
## The following objects are masked from 'package:base':
##
## as.Date, as.Date.numeric
Sys.setenv(TZ="GMT")
Q: What is xts?
xts is an R package offering a number of functionalities to work on time-indexed data. xts extends zoo, another popular package for time-series analysis.
Q: Why should I use xts rather than zoo or another time-series package?
The main benefit of xts is its seamless compatibility with other packages using different time-series classes (timeSeries, zoo, …). In addition, xts allows the user to add custom attributes to any object. See the main xts vignette for more information.
Q: How do I install xts?
xts depends on zoo and suggests some other packages. You should be able to install xts and all the other required components by simply calling install.packages('xts') from the R prompt.
Q: I have multiple .csv time-series files that I need to load in a single xts object. What is the most efficient way to import the files?
If the files have the same format, load them with read.zoo and then call rbind to join the series together; finally, call as.xts on the result. Using a combination of lapply and do.call can accomplish this with very little code:
filenames <- c("a.csv", "b.csv", "c.csv")
sample.xts <- as.xts(do.call("rbind", lapply(filenames, read.zoo)))
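In practice read.zoo usually needs to be told how the files are laid out. The following is a minimal sketch, assuming each file has a header row, comma separators, and dates in YYYY-MM-DD format in the first column; the file contents and the helper name read_one are made up for illustration.

```r
library(xts)

# Write two small CSV files to temporary locations (illustrative data only).
f1 <- tempfile(fileext = ".csv")
f2 <- tempfile(fileext = ".csv")
write.csv(data.frame(date = c("2020-01-01", "2020-01-02"), value = 1:2),
          f1, row.names = FALSE)
write.csv(data.frame(date = c("2020-01-03", "2020-01-04"), value = 3:4),
          f2, row.names = FALSE)

# Hypothetical helper: read one file with the layout assumed above.
read_one <- function(path) {
  read.zoo(path, header = TRUE, sep = ",", format = "%Y-%m-%d")
}

# Read every file the same way, stack the series, then convert to xts.
combined <- as.xts(do.call("rbind", lapply(c(f1, f2), read_one)))
print(combined)
```

If the files use a different separator or date format, only the arguments inside read_one should need to change.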
Q: Why is xts implemented as a matrix rather than a data frame?
xts uses a matrix rather than a data.frame because:
* xts is a subclass of zoo, and that’s how zoo objects are structured; and
* matrix objects have much better performance than data.frames.
Q: How can I simplify the syntax when referring to xts object column names?
The with function lets you refer to columns by name without the full square-bracket syntax. For example:
lm(sample.xts[, "Res"] ~ sample.xts[, "ThisVar"] + sample.xts[, "ThatVar"])
can be converted to
with(sample.xts, lm(Res ~ ThisVar + ThatVar))
Q: How can I replace the zeros in an xts object with the last non-zero value in the series?
Convert the zeros to NA and then use na.locf:
sample.xts <- xts(c(1:3, 0, 0, 0), as.POSIXct("1970-01-01")+0:5)
sample.xts[sample.xts==0] <- NA
cbind(orig=sample.xts, locf=na.locf(sample.xts))
## orig locf
## 1970-01-01 00:00:00 1 1
## 1970-01-01 00:00:01 2 2
## 1970-01-01 00:00:02 3 3
## 1970-01-01 00:00:03 NA 3
## 1970-01-01 00:00:04 NA 3
## 1970-01-01 00:00:05 NA 3
Q: How do I create an xts index with millisecond precision?
Milliseconds in xts indexes are stored as decimal values. This example builds an index spaced by 100 milliseconds, starting at 1970-01-01 00:00:00:
data(sample_matrix)
sample.xts <- xts(1:10, seq(as.POSIXct("1970-01-01"), by=0.1, length.out=10))
Q: I have a millisecond-resolution index, but the milliseconds aren’t displayed. What went wrong?
Set the digits.secs option to some sub-second precision. Continuing from the previous example, if you are interested in milliseconds:
options(digits.secs=3)
head(sample.xts)
## [,1]
## 1970-01-01 00:00:00.0 1
## 1970-01-01 00:00:00.1 2
## 1970-01-01 00:00:00.2 3
## 1970-01-01 00:00:00.3 4
## 1970-01-01 00:00:00.4 5
## 1970-01-01 00:00:00.5 6
Q: I set digits.secs=3, but R doesn’t show the values correctly.
Sub-second values are stored with approximately microsecond precision. Printing only 3 decimal places hides the rest of the stored value, and the displayed millisecond (3rd) digit can be hard to interpret, depending on how the machine rounds it. Set the digits.secs option to a value higher than 3, or convert the date-time to numeric and use print’s digits argument or sprintf to display the full value. For example:
dt <- as.POSIXct("2012-03-20 09:02:50.001")
print(as.numeric(dt), digits=20)
## [1] 1332234170.0009999275
sprintf("%20.10f", dt)
## [1] "1332234170.0009999275"
Q: I am using apply to run a custom function on my xts object. Why does the returned matrix have different dimensions than the original one?
When working on rows, apply returns a transposed version of the original matrix. Simply call t on the returned matrix to restore the original dimensions:
sample.xts.2 <- xts(t(apply(sample.xts, 1, myfun)), index(sample.xts))
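The line above leaves myfun undefined. As a small sketch of the transposition, assuming a hypothetical myfun that returns one value per column, the dimensions before and after calling t can be checked directly:

```r
library(xts)

# A 3-row, 2-column xts object (illustrative data only).
x <- xts(matrix(1:6, ncol = 2, dimnames = list(NULL, c("a", "b"))),
         as.Date("2020-01-01") + 0:2)

# Hypothetical row-wise function: scale each row so it sums to 1.
myfun <- function(row) row / sum(row)

# apply() over rows puts each row's result in a column, so the
# result has 2 rows (one per original column) and 3 columns.
raw <- apply(x, 1, myfun)

# Transpose and rebuild the xts object with the original index.
fixed <- xts(t(raw), index(x))

print(dim(raw))
print(dim(fixed))
```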
Q: I have an xts object with varying numbers of observations per day (e.g., one day might contain 10 observations, while another day contains 20 observations). How can I apply a function to all observations for each day?
You can use apply.daily, or period.apply more generally:
sample.xts <- xts(1:50, seq(as.POSIXct("1970-01-01"),
as.POSIXct("1970-01-03")-1, length.out=50))
apply.daily(sample.xts, mean)
## [,1]
## 1970-01-01 23:30:36.244 13
## 1970-01-02 23:59:59.000 38
period.apply(sample.xts, endpoints(sample.xts, "days"), mean)
## [,1]
## 1970-01-01 23:30:36.244 13
## 1970-01-02 23:59:59.000 38
period.apply(sample.xts, endpoints(sample.xts, "hours", 6), mean)
## [,1]
## 1970-01-01 05:52:39.061 4.0
## 1970-01-01 11:45:18.122 10.5
## 1970-01-01 17:37:57.183 16.5
## 1970-01-01 23:30:36.244 22.5
## 1970-01-02 05:23:15.306 28.5
## 1970-01-02 11:15:54.367 34.5
## 1970-01-02 17:08:33.428 40.5
## 1970-01-02 23:59:59.000 47.0
Q: How can I process daily data for a specific time subset?
First use time-of-day subsetting to extract the time range you want to work on (note that the leading "T" and leading zeros are required for each time in the range: "T06:00"), then use apply.daily to apply your function to the subset:
apply.daily(sample.xts['T06:00/T17:00',], mean)
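A self-contained sketch of the same idea, assuming two days of made-up hourly observations, shows the two steps separately:

```r
library(xts)
Sys.setenv(TZ = "GMT")

# Two days of hourly observations (illustrative data only).
idx <- seq(as.POSIXct("1970-01-01 00:00:00", tz = "GMT"),
           by = "hour", length.out = 48)
x <- xts(1:48, idx)

# Step 1: keep only observations between 06:00 and 17:00 on every day.
sub <- x["T06:00/T17:00"]

# Step 2: apply the function once per day to what is left.
daily <- apply.daily(sub, mean)
print(daily)
```

Keeping the subset in its own variable makes it easy to inspect which rows survived before aggregating.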
Q: How can I analyze my irregular data in regular blocks, adding observations for each regular block if one doesn’t exist in the original time-series object?
Use align.time to round up the indexes to the periods you are interested in, then call period.apply to apply your function. Finally, merge the result with an empty xts object that contains all the regular index values you want:
sample.xts <- xts(1:6, as.POSIXct(c("2009-09-22 07:43:30",
"2009-10-01 03:50:30", "2009-10-01 08:45:00", "2009-10-01 09:48:15",
"2009-11-11 10:30:30", "2009-11-11 11:12:45")))
# align index into regular (e.g. 3-hour) blocks
aligned.xts <- align.time(sample.xts, n=60*60*3)
# apply your function to each block
count <- period.apply(aligned.xts, endpoints(aligned.xts, "hours", 3), length)
# create an empty xts object with the desired regular index
empty.xts <- xts(, seq(start(aligned.xts), end(aligned.xts), by="3 hours"))
# merge the counts with the empty object
head(out1 <- merge(empty.xts, count))
## count
## 2009-09-22 09:00:00 1
## 2009-09-22 12:00:00 NA
## 2009-09-22 15:00:00 NA
## 2009-09-22 18:00:00 NA
## 2009-09-22 21:00:00 NA
## 2009-09-23 00:00:00 NA
# or fill with zeros
head(out2 <- merge(empty.xts, count, fill=0))
## count
## 2009-09-22 09:00:00 1
## 2009-09-22 12:00:00 0
## 2009-09-22 15:00:00 0
## 2009-09-22 18:00:00 0
## 2009-09-22 21:00:00 0
## 2009-09-23 00:00:00 0
Q: Why do I get a zoo object when I call transform on my xts object?
There’s no xts method for transform, so the zoo method is dispatched. The zoo method explicitly creates a new zoo object. To convert the transformed object back to an xts object, wrap the transform call in as.xts:
sample.xts <- as.xts(transform(sample.xts, ABC=1))
You might also have to reset the index timezone:
tzone(sample.xts) <- Sys.getenv("TZ")
Q: Why can’t I use the & operator in xts objects when querying dates?
"2011-09-21" is not a logical vector and cannot be coerced to a logical vector. See ?"&" for details.
xts ISO-8601 style subsetting is nice, but there’s nothing we can do to change the behavior of .Primitive("&"). You can do something like this though:
sample.xts[sample.xts$Symbol == "AAPL" & index(sample.xts) == as.POSIXct("2011-09-21"),]
or:
sample.xts[sample.xts$Symbol == "AAPL"]['2011-09-21']
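Both forms can be tried on a small object. This sketch assumes a made-up numeric price column in place of the Symbol column, so the condition is a numeric comparison rather than a string match:

```r
library(xts)

# Three daily observations (illustrative data only).
x <- xts(cbind(price = c(10, 20, 30)), as.Date("2011-09-20") + 0:2)

# One step: combine two logical conditions with &.
one_step <- x[x$price > 15 & index(x) == as.Date("2011-09-21"), ]

# Two steps: filter on the value first, then subset by date string.
two_step <- x[x$price > 15]["2011-09-21"]

print(one_step)
print(two_step)
```

The two-step form keeps the ISO-8601 string subsetting, at the cost of creating an intermediate object.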
Q: How do I subset an xts object to only include weekdays (excluding Saturdays and Sundays)?
Use .indexwday to only include Mon-Fri days:
data(sample_matrix)
sample.xts <- as.xts(sample_matrix)
wday.xts <- sample.xts[.indexwday(sample.xts) %in% 1:5]
head(wday.xts)
## Open High Low Close
## 2007-01-02 50.03978 50.11778 49.95041 50.11778
## 2007-01-03 50.23050 50.42188 50.23050 50.39767
## 2007-01-04 50.42096 50.42096 50.26414 50.33236
## 2007-01-05 50.37347 50.37347 50.22103 50.33459
## 2007-01-08 50.03555 50.10363 49.96971 49.98806
## 2007-01-09 49.99489 49.99489 49.80454 49.91333
Q: I need to quickly convert a data.frame that contains the time-stamps in one of the columns. Using as.xts(Data) returns an error. How do I build my xts object?
The as.xts function assumes the date-time index is contained in the rownames of the object to be converted. If this is not the case, you need to use the xts constructor, which requires two arguments: a vector or a matrix carrying data and a vector of type Date, POSIXct, chron, ..., supplying the time index information. If you are certain the time-stamps are in a specific column, you can use:
Data <- data.frame(timestamp=as.Date("1970-01-01"), obs=21)
sample.xts <- xts(Data[,-1], order.by=Data[,1])
If you aren’t certain, you need to explicitly reference the column name that contains the time-stamps:
Data <- data.frame(obs=21, timestamp=as.Date("1970-01-01"))
sample.xts <- xts(Data[,!grepl("timestamp",colnames(Data))],
order.by=Data$timestamp)
Q: I have two time-series with different frequencies. I want to combine the data into a single xts object, but the times are not exactly aligned. I want to have one row in the result for each ten-minute period, with the time index showing the beginning of the time period.
align.time creates evenly spaced time-series from a set of indexes, and merge ensures the two time-series are combined in a single xts object with all original columns and indexes preserved. The new object has one entry for each timestamp from both series, and missing values are filled with NA.
x1 <- align.time(xts(Data1$obs, Data1$timestamp), n=600)
x2 <- align.time(xts(Data2$obs, Data2$timestamp), n=600)
merge(x1, x2)
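Data1 and Data2 are not defined above. The sketch below fills them in with made-up observations so the alignment can be seen end to end. Because align.time rounds each timestamp up to the next boundary, the merged index marks the end of each ten-minute period; shifting the index back by 600 seconds is one possible way to label rows by the beginning of the period instead, and is an assumption of this sketch rather than something the answer above prescribes.

```r
library(xts)
Sys.setenv(TZ = "GMT")

# Two irregular series (illustrative data only).
Data1 <- data.frame(
  timestamp = as.POSIXct(c("2020-01-01 09:03:10", "2020-01-01 09:14:45",
                           "2020-01-01 09:27:30"), tz = "GMT"),
  obs = c(1, 2, 3))
Data2 <- data.frame(
  timestamp = as.POSIXct(c("2020-01-01 09:08:00", "2020-01-01 09:25:00"),
                         tz = "GMT"),
  obs = c(10, 20))

# Round every timestamp up to the next 10-minute (600-second) boundary.
x1 <- align.time(xts(Data1$obs, Data1$timestamp), n = 600)
x2 <- align.time(xts(Data2$obs, Data2$timestamp), n = 600)

# One row per aligned timestamp; gaps in either series become NA.
merged <- merge(x1, x2)

# Optional: relabel each row with the start of its 10-minute period.
start_labeled <- merged
index(start_labeled) <- index(merged) - 600

print(merged)
print(start_labeled)
```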
Q: Why do I get a warning when running the code below?
data(sample_matrix)
sample.xts <- as.xts(sample_matrix)
sample.xts["2007-01"]$Close <- sample.xts["2007-01"]$Close + 1
## Warning in NextMethod(.Generic): number of items to replace is not a multiple of
## replacement length
This code creates two calls to the subset-replacement function xts:::[<-.xts. The first call replaces the value of Close in a temporary copy of the first row of the object on the left-hand-side of the assignment, which works fine. The second call tries to replace the first element of the object on the left-hand-side of the assignment with the modified temporary copy of the first row. This is the problem.
For the command to work, there needs to be a comma in the first subset call on the left-hand-side:
sample.xts["2007-01",]$Close <- sample.xts["2007-01"]$Close + 1
This isn’t encouraged, because the code isn’t clear. Simply remember to subset by column first, then row, if you insist on making two calls to the subset-replacement function. A cleaner and faster solution is below. It’s only one function call and it avoids the $ function (which is marginally slower on xts objects).
sample.xts["2007-01","Close"] <- sample.xts["2007-01","Close"] + 1
Thanks to Alberto Giannetti and Michael R. Weylandt for their many contributions.