
Tuesday, June 11, 2013

Visual Exploration of Time Series

A couple of weeks ago, I stumbled across the following post on using R to discover patterns in time series.
http://dahtah.wordpress.com/2013/05/17/finding-patterns-in-time-series-using-regular-expressions/#comments

The author examined a univariate time series of Australian GDP, looking for recessions, defined as two consecutive quarters of negative GDP growth.  The technique can also be applied to multiple time series, for example to look for correlations.  One could use this on the chicken versus egg data set I blogged about last week!  Having looked at commodities of late, I applied it to a data set of daily prices since January 1st for the exchange traded funds tracking gold (GLD) and oil (USO).  Why gold and oil?  Well, why not!  Enjoy.

> data = read.csv(file.choose())
> attach(data)
> head(data)
      Date    GLD   USO   UGA  CORN
1 1/2/2013 163.17 33.82 59.08 43.66
2 1/3/2013 161.20 33.74 58.87 43.44
3 1/4/2013 160.44 33.88 58.40 42.77
4 1/7/2013 159.43 33.92 58.94 42.92
5 1/8/2013 160.56 33.96 59.02 43.20
6 1/9/2013 160.49 33.88 58.63 43.56
> # data set includes gas prices (UGA) and corn prices (CORN) but let's ignore them

> par(mfrow=c(2,1)) #make a 2x1 plot
> delta = (sign(diff(GLD)) == 1) + 0 #getting started with GLD
> head(delta)
[1] 0 0 0 1 0 1
> # delta is now 1 on days GLD rose and 0 on days it did not (fell or was unchanged)
> ds1 = do.call(paste0, as.list(c(delta)))
> # the pattern "000+" matches runs of 3 or more 0s, i.e. at least 3 days in a row of price decline
> matches1 = gregexpr("000+", ds1, perl = T)[[1]]
> matches1
[1]  1 14 25 29 38 55 61 88
attr(,"match.length")
[1] 3 4 3 5 4 3 3 7
attr(,"useBytes")
[1] TRUE
> # we have 8 runs of at least 3 consecutive days of declining prices
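
To see how a match position and length relate back to prices, here is a small worked sketch using only the six GLD closes printed by head(data) above. The variable names (gld6, up, s, m, idx) are mine and not part of the original session. Because diff() shortens the series by one, a run of n zeros starting at position i in the string spans prices i through i + n.

```r
# First six GLD closes, copied from the head(data) output above
gld6 <- c(163.17, 161.20, 160.44, 159.43, 160.56, 160.49)

# 1 = price rose versus the previous day, 0 = it did not rise
up <- (sign(diff(gld6)) == 1) + 0
s <- paste0(up, collapse = "")          # "00010"

# Runs of three or more 0s
m <- gregexpr("000+", s, perl = TRUE)[[1]]
start <- as.integer(m)                  # 1
len <- attr(m, "match.length")          # 3

# Map the run in the differenced string back to price indices
idx <- start:(start + len)              # 1 2 3 4
gld6[idx]                               # 163.17 161.20 160.44 159.43
```

The first match reported for GLD (position 1, length 3) is this same run: three straight declines covering the first four closing prices.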

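> m.length1 = attr(matches1,"match.length")
> # a run of n zeros in the differenced string spans n+1 prices, hence 0:n
> # lapply keeps x1 a list even if every run happens to have the same length
> x1 = lapply(seq_along(matches1), function(ind) matches1[ind]+0:(m.length1[ind]))
> hl = function(inds) lines(time(GLD)[inds], GLD[inds], col = "red", lwd = 3)
> plot.ts(GLD, main="Gold ETF")
> tmp1 = sapply(x1, hl)
> # that completes GLD, now on to USO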

> delta = (sign(diff(USO)) == 1) + 0
> ds1 = do.call(paste0, as.list(c(delta)))
> matches1 = gregexpr("000+", ds1, perl = T)[[1]]
> matches1
[1] 39 60 68 88
attr(,"match.length")
[1] 3 5 3 4
attr(,"useBytes")
[1] TRUE
> m.length1 = attr(matches1,"match.length")
> x1 = lapply(seq_along(matches1), function(ind) matches1[ind]+0:(m.length1[ind]))
> hl = function(inds) lines(time(USO)[inds], USO[inds], col = "red", lwd = 3)
> plot.ts(USO, main = "Oil ETF")
> tmp1 = sapply(x1, hl)
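
Since the GLD and USO steps are identical, they could be folded into a pair of helpers. This is only a minimal sketch and not the original author's code: find_runs, plot_runs and their arguments are hypothetical names of mine, and the toy series at the end is made up purely to show the output shape.

```r
# Hypothetical helper: return the price indices of every run matching
# `pattern` in the up/down string of a numeric series `x`.
find_runs <- function(x, pattern = "000+") {
  up <- (sign(diff(x)) == 1) + 0          # 1 = rise, 0 = fall or unchanged
  s <- paste0(up, collapse = "")
  m <- gregexpr(pattern, s, perl = TRUE)[[1]]
  if (m[1] == -1) return(list())          # gregexpr reports "no match" as -1
  len <- attr(m, "match.length")
  # A run of length n in the differenced string spans n + 1 prices
  lapply(seq_along(m), function(i) m[i] + 0:len[i])
}

# Hypothetical helper: plot a series and overlay the matched runs in red
plot_runs <- function(x, main = "", pattern = "000+") {
  plot.ts(x, main = main)
  for (inds in find_runs(x, pattern)) {
    lines(time(x)[inds], x[inds], col = "red", lwd = 3)
  }
  invisible(NULL)
}

# Toy series (made-up values, for illustration only)
toy <- c(10, 9, 8, 7, 8, 9, 8, 7, 6, 5)
runs <- find_runs(toy)                    # two runs: prices 1-4 and 6-10
```

With the data attached as above, the two panels would then come from par(mfrow=c(2,1)) followed by plot_runs(GLD, main = "Gold ETF") and plot_runs(USO, main = "Oil ETF").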



This produces the following graph.  Not much insight here, but there are exciting possibilities.
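
One such possibility is to encode two series into a single string, so that one regular expression can describe their joint behaviour. The sketch below is my own illustration, not something taken from the linked post: the letter coding (a, b, c, d) is an arbitrary assumption, "fell" here means a strictly negative change, and the data are just the six GLD and USO rows shown in head(data).

```r
# First six rows of GLD and USO, copied from the head(data) output
gld6 <- c(163.17, 161.20, 160.44, 159.43, 160.56, 160.49)
uso6 <- c(33.82, 33.74, 33.88, 33.92, 33.96, 33.88)

gld_down <- diff(gld6) < 0
uso_down <- diff(uso6) < 0

# Arbitrary coding: a = both fell, b = only gold fell,
#                   c = only oil fell, d = neither fell
code <- ifelse(gld_down & uso_down, "a",
        ifelse(gld_down, "b",
        ifelse(uso_down, "c", "d")))
joint <- paste0(code, collapse = "")      # "abbda"

# Two or more consecutive days where gold fell but oil did not
m <- gregexpr("b{2,}", joint, perl = TRUE)[[1]]
as.integer(m)                             # 2
attr(m, "match.length")                   # 2
```

On these few rows the pattern picks out the second and third daily changes, where GLD kept falling while USO recovered. Other patterns over the same alphabet (for example "a{3,}" for three or more joint declines) would work the same way on the full data set.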

 



