Friday, May 6, 2016

Battle Maps Using R and Leaflet

“Here are those damned black hat fellows again.”
Attributed to men of the 26th North Carolina, Pettigrew's Brigade, describing the "Iron Brigade" at the Battle of Gettysburg.

A while back I wrote a post about plotting Russian airstrikes in Syria using the ggmap package. Of particular note was Bhaskar Karambelkar's commentary and method for taking that data and building an interactive map with the leaflet package. It inspired me to start work on a long-term goal of creating a series of maps around the Battle of the Little Bighorn. That project is on hold until I can do further research, as the mystery of what happened to Custer and the five companies of troopers under his direct command remains quite contentious among professional and amateur historians. In the meantime, I've decided to begin building a series of battle maps that I find interesting. This first effort covers the initial engagement of Wadsworth's Division of the Union 1st Corps against Heth's Confederate Division on the first day of fighting. The map depicts the attack by two of Heth's brigades, under Archer and Davis, against Cutler's Brigade and Meredith's Iron Brigade. Both Archer's and Davis's brigades were virtually wiped out, with Davis's men slaughtered in the infamous Railroad Cut.

The primary source I used was from the Civil War Trust website:

http://www.civilwar.org/battlefields/gettysburg/maps/gettysburg-battle-for.html

I served in the modern-day namesake of the Iron Brigade (2nd Brigade, 1st Armored Division). I hope to expand on the Gettysburg map and also build other maps, such as Kampfgruppe Peiper during the Battle of the Bulge, a look at the WikiLeaks data for Iraq and, yes, the Gordian Knot that is Custer's Last Stand.

The link to the map is on RPubs:  http://rpubs.com/Custer/iron_brigade  

The markers on the map signify individual regiments or artillery batteries.  You can click the markers to identify the unit.

And finally, here is the code.  Note that I used PowerPoint to create the .png files for the unit icons.

df = read.csv("getty_ironbde.csv")
str(df)
'data.frame': 65 obs. of  7 variables:
 $ Regiment: Factor w/ 24 levels "13th AL","147th NY",..: 16 7 5 19 9 3 21 11 15 22 ...
 $ Brigade : Factor w/ 10 levels "Archer","Cutler",..: 7 7 7 7 7 2 2 4 4 9 ...
 $ Icon    : Factor w/ 13 levels "cCdr","cInf1",..: 11 11 11 11 11 11 11 8 8 5 ...
 $ lat     : num  39.8 39.8 39.8 39.8 39.8 ...
 $ long    : num  -77.2 -77.2 -77.2 -77.3 -77.3 ...
 $ Time    : Factor w/ 3 levels "10:15 to 10:45 a.m.",..: 3 3 3 3 3 3 3 3 3 3 ...
 $ Unit    : Factor w/ 25 levels "13th AL/Archer's Brigade",..: 16 7 5 19 9 3 21 11 15 22 ...

library(leaflet)

# Incorporate NATO symbology for icons
milIcons <- iconList(
  uInf1 = makeIcon("uInf1.png", 48, 36),
  cInf1 = makeIcon("cInf1.png", 48, 36),
  uCav1 = makeIcon("uCav1.png", 48, 36),
  uArt1 = makeIcon("uArt1.png", 48, 36),
  uInf2 = makeIcon("uInf2.png", 48, 36),
  cInf2 = makeIcon("cInf2.png", 48, 36),
  uCav2 = makeIcon("uCav2.png", 48, 36),
  uArt2 = makeIcon("uArt2.png", 48, 36),
  uInf3 = makeIcon("uInf3.png", 48, 36),
  cInf3 = makeIcon("cInf3.png", 48, 36),
  uCav3 = makeIcon("uCav3.png", 48, 36),
  uArt3 = makeIcon("uArt3.png", 48, 36),
  cCdr = makeIcon("cCdr.png", 54, 40),
  uCdr = makeIcon("uCdr.png", 54, 40)
)

# Spatial dataframe seems to work best
df2 <- sp::SpatialPointsDataFrame(
  cbind(df$long, df$lat), data = data.frame(df$Unit, df$Icon, df$Time))
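
As an aside, leaflet can also take the plain data frame directly if you point addMarkers() at the coordinate columns. The sketch below assumes the column names shown in str(df) above and is not how the original map was built:

# Sketch only: skip the spatial object and use the data frame's columns directly
leaflet(df) %>%
  addTiles() %>%
  addMarkers(lng = ~long, lat = ~lat,
             icon = ~milIcons[Icon],
             popup = ~Unit)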

leaflet(df2) %>% 
  addProviderTiles("OpenStreetMap.BlackAndWhite", group = "Black&White") %>%
  addProviderTiles("Esri.NatGeoWorldMap", group = "NatGeo") %>%
  addProviderTiles("MapQuestOpen.Aerial", group = "Aerial") %>%
  
  addLayersControl(
    baseGroups = c("Black&White", "NatGeo", "Aerial"),
    overlayGroups = levels(df$Time),
    options = layersControlOptions(collapsed = FALSE)) %>%
      
  addMarkers(data=df2[1:22,], icon = ~ milIcons[df.Icon], 
             group="9:30 to 10:15 a.m.",
             popup = ~ df.Unit) %>%
  addMarkers(data=df2[23:43,], icon = ~ milIcons[df.Icon], 
             group="10:15 to 10:45 a.m.",
             popup = ~ df.Unit) %>%
  addMarkers(data=df2[44:65,], icon = ~ milIcons[df.Icon], 
             group="10:45 to 11:15 a.m.", 
             popup = ~ df.Unit) 
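
The map above was published to RPubs (done from RStudio). If you simply want a standalone HTML file of the widget, one option not shown in the original post is htmlwidgets::saveWidget(); the object name m and the simplified pipeline below are only for illustration.

library(htmlwidgets)

# Assign a (simplified) version of the pipeline above to an object...
m <- leaflet(df2) %>%
  addProviderTiles("OpenStreetMap.BlackAndWhite") %>%
  addMarkers(data = df2, icon = ~ milIcons[df.Icon], popup = ~ df.Unit)

# ...then write it out as a self-contained HTML file
saveWidget(m, "iron_brigade.html", selfcontained = TRUE)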

Friday, January 1, 2016

Simpson's Paradox

“Statistics are used much like a drunk uses a lamppost: for support, not illumination.”
A.E. Housman (commonly attributed to Andrew Lang)


Recently at work I stumbled into a classic example of Simpson's Paradox (the Yule-Simpson effect). Many people are not familiar with this phenomenon, so I thought I would provide a brief example to help shed some light on it. If you are not aware of the problem of confounding or "lurking" variables in your analysis, you can box yourself into an analytical corner. To further understand what is happening in a Simpson's Paradox, the Wikipedia article linked below is a good place to start.

In short, a trend or association that appears between two variables reverses when a third variable is included. You will stumble into this kind of spurious association, or at least need to be on the lookout for it, when you have unbalanced group sizes, as you often do with observational data. That is what happened in my recent discovery.
In the example below, let's have a look at the UC Berkeley data, or at least the portion of it by Department, as provided in a Wikipedia article. https://en.wikipedia.org/wiki/Simpson%27s_paradox#cite_note-Bickel-11

The data contains the number of applications and admissions by gender to six different graduate departments. Again, this is just a portion of the full data set, focusing on the largest departments.

dpt = c("A","B","C","D","E","F","A","B","C","D","E","F")
app = c(825,560,325,417,191,272,108,25,593,375,393,341)
adm = c(512,353,120,138,53,16,89,17,202,131,94,24)
gen = c("m","m","m","m","m","m","f","f","f","f","f","f")
df = cbind.data.frame(dpt,app,adm,gen)
str(df)


## 'data.frame':    12 obs. of  4 variables:
##  $ dpt: Factor w/ 6 levels "A","B","C","D",..: 1 2 3 4 5 6 1 2 3 4 ...
##  $ app: num  825 560 325 417 191 272 108 25 593 375 ...
##  $ adm: num  512 353 120 138 53 16 89 17 202 131 ...
##  $ gen: Factor w/ 2 levels "f","m": 2 2 2 2 2 2 1 1 1 1 ...

There are a number of ways to demonstrate the effect, but I thought I would give it a go using dplyr. First, let's group the data by gender (gen) then look at their overall admission rate.

library(dplyr)
by_gen = group_by(df, gen)
summarize(by_gen, adm_rate=sum(adm)/sum(app))
## Source: local data frame [2 x 2]
##
##   gen  adm_rate
## 1   f 0.3035422
## 2   m 0.4602317

Clearly there is a huge gender bias problem in graduate school admissions at UC Berkeley and it is time to round up a herd of lawyers. On a side note, what is the best way to save a drowning lawyer? It's easy, just take your foot off their head.

We can even provide a statistical test to strengthen the assertion of bias. In R, a proportions test can be done via the prop.test() function, inputting the number of "hits" and the number of "trials".

summarize(by_gen, sum(adm))
## Source: local data frame [2 x 2]
##
##   gen sum(adm)
## 1   f      557
## 2   m     1192

summarize(by_gen, sum(app))
## Source: local data frame [2 x 2]
##
##   gen sum(app)
## 1   f     1835
## 2   m     2590

prop.test(x=c(557,1192), n=c(1835,2590))
##
##  2-sample test for equality of proportions with continuity
##  correction
##
## data:  c(557, 1192) out of c(1835, 2590)
## X-squared = 109.67, df = 1, p-value < 2.2e-16
## alternative hypothesis: two.sided
## 95 percent confidence interval:
##  -0.1856333 -0.1277456
## sample estimates:
##    prop 1    prop 2
## 0.3035422 0.4602317


We see in the output a high level of statistical significance and a confidence interval for the difference in admission rates of roughly 13 to 19 percentage points. However, beware of the analyst proclaiming truths from a 2x2 table of observational data! In this instance, the confounding variable is the department (dpt).

In order to have a look at the data by gender and by department, we will once again use the group_by() function, then calculate the admission rates and finally turn it from "long" to "wide" format.

by_dpt = group_by(df, dpt,gen)
df2 = as.data.frame(summarize(by_dpt, adm_rate=sum(adm)/sum(app)))
df2
##    dpt gen   adm_rate
## 1    A   f 0.82407407
## 2    A   m 0.62060606
## 3    B   f 0.68000000
## 4    B   m 0.63035714
## 5    C   f 0.34064081
## 6    C   m 0.36923077
## 7    D   f 0.34933333
## 8    D   m 0.33093525
## 9    E   f 0.23918575
## 10   E   m 0.27748691
## 11   F   f 0.07038123
## 12   F   m 0.05882353

library(reshape2)
df2_wide = dcast(df2, dpt~gen, value.var="adm_rate")
df2_wide
##   dpt          f          m
## 1   A 0.82407407 0.62060606
## 2   B 0.68000000 0.63035714
## 3   C 0.34064081 0.36923077
## 4   D 0.34933333 0.33093525
## 5   E 0.23918575 0.27748691
## 6   F 0.07038123 0.05882353
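
The source of the reversal is easier to see alongside the application volumes. Here is a quick look using the df built above and the reshape2 package we just loaded (this table is an addition, not part of the original post):

# Applications (not admissions) by department and gender
dcast(df, dpt ~ gen, value.var="app")
##   dpt   f   m
## 1   A 108 825
## 2   B  25 560
## 3   C 593 325
## 4   D 375 417
## 5   E 393 191
## 6   F 341 272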

With the inclusion of department we now see that females actually have higher admission rates in four of the six departments (A, B, D and F). How can this be? It comes down to application volume: women applied mostly to the departments with low admission rates, while men applied mostly to the two departments with high admission rates, so the pooled rates reverse the within-department pattern. Again, the groups were highly unbalanced. Alright, enough of this, as it is now time to focus on something important: Ohio State is squaring off against Notre Dame, and then there's the NHL Winter Classic!

Saturday, November 7, 2015

Plotting Russian AiRstRikes in SyRia


"Who do we think will rise if Assad falls?"
"Do we have a “government in a box” that we think we can fly to Damascus and put into power if the Syrian army collapses, the regime falls and ISIS approaches the capital?"
"Have we forgotten the lesson of “Animal Farm”? When the animals revolt and take over the farm, the pigs wind up in charge." 
Patrick J. Buchanan

In my new book, "Mastering Machine Learning with R", I wanted to include geo-spatial mapping in the chapter on cluster analysis.  I actually completed the entire chapter doing a cluster analysis on the Iraq Wikileaks data, plotting the clusters on a map and building a story around developing an intelligence estimate for the Al-Doura Oil Refinery, which I visited on many occasions during my 2009 "sabbatical".  However, the publisher convinced me that the material was too sensitive for such a book and I totally re-wrote the analysis with a different data set.  I may or may not publish it on this blog at some point, but I want to continue to explore building maps in R.  As luck would have it, I stumbled into a data set showing the locations of Russian airstrikes in Syria at the following site:

http://russia-strikes-syria.silk.co/

The data includes the latitude and longitude of the strikes along with other background information. The what, how and why of the data collection is explained here:

https://www.bellingcat.com/news/mena/2015/10/26/what-russias-own-videos-and-maps-reveal-about-who-they-are-bombing-in-syria/

In short, the site tried to independently verify locations, targets and so on, and it includes what they claim are the reported versus actual strike locations. When I pulled the data, 60 strikes had been analyzed by the site. They were unable to determine the locations of 11 of them, so we have 49 usable data points.

I built the data in Excel and put it in a .csv, which I've already loaded. Here is the structure of the data.
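
The loading step isn't shown in the post; it was presumably something along these lines (the file name here is hypothetical):

> # Hypothetical file name; stringsAsFactors = FALSE matches the character
> # columns shown in str() below
> airstrikes = read.csv("russian_airstrikes.csv", stringsAsFactors = FALSE)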

> str(airstrikes)
'data.frame':  120 obs. of  4 variables:
 $ Airstrikes   : chr  "Strike 1" "Strike 10" "Strike 11" "Strike 12" ...
 $ Lat          : chr  "35.687782" "35.725846" "35.734952" "35.719518" ...
 $ Long         : chr  "36.786667" "36.260419" "36.073837" "36.072385" ...
 $ real_reported: chr  "real" "real" "real" "real" ...

> head(airstrikes)
  Airstrikes       Lat      Long real_reported
1   Strike 1 35.687782 36.786667          real
2  Strike 10 35.725846 36.260419          real
3  Strike 11 35.734952 36.073837          real
4  Strike 12 35.719518 36.072385          real
5  Strike 13 35.309074 36.620506          real
6  Strike 14 35.817206 36.124503          real

> tail(airstrikes)
    Airstrikes       Lat      Long real_reported
115  Strike 59 35.644864 36.338568      reported
116   Strike 6 35.740134 36.247029      reported
117  Strike 60  36.09346 37.085198      reported
118   Strike 7 35.702113 36.563525      reported
119   Strike 8 35.822472 36.018779      reported
120   Strike 9 35.725846 36.260419      reported

Since Lat and Long are character, I need to change them to numeric and also keep a subset of the actual ("real") strike locations.

> airstrikes$Lat = as.numeric(airstrikes$Lat)
Warning message:
NAs introduced by coercion

> airstrikes$Long = as.numeric(airstrikes$Long)
Warning message:
NAs introduced by coercion


> real=subset(airstrikes, airstrikes$real_reported=="real")
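
The coercion warnings presumably correspond to the strikes whose locations could not be determined. A quick sanity check (an addition, not in the original post):

> # Coordinates that failed to parse as numbers (the unlocated strikes)
> sum(is.na(airstrikes$Lat))

> # Located "real" strike points available for plotting
> nrow(real[!is.na(real$Lat), ])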

I will be using ggmap for this effort, pulling in Google Maps for plotting.

> library(ggmap)
Loading required package: ggplot2
Google Maps API Terms of Service: http://developers.google.com/maps/terms.
Please cite ggmap if you use it: see citation('ggmap') for details.

> citation('ggmap')

To cite ggmap in publications, please use:

  D. Kahle and H. Wickham. ggmap: Spatial Visualization with ggplot2. The
  R Journal, 5(1), 144-161. URL
  http://journal.r-project.org/archive/2013-1/kahle-wickham.pdf

The first map will be an overall view of the country with the map type as "terrain".  Note that "satellite", "hybrid" and "roadmap" are also available.

> map1 = ggmap(
   get_googlemap(center="Syria", zoom=7, maptype="terrain"))

With the map created as the object "map1", I plot the strike locations using geom_point().

> map1 + geom_point(
   data = real, aes (x = Long, y = Lat), pch = 19,  size = 6, col="red3")

With the exception of what looks like one strike near Ar Raqqah, we can see that the strikes are concentrated between Aleppo and Homs, with some close to the Turkish border. Let's have a closer look at that region.

> map2 = ggmap(
   get_googlemap(center="Ehsim, Syria", zoom=9, maptype="terrain"))

> map2 + geom_point(data = real, aes (x = Long, y = Lat),
  pch = 18,  size = 9, col="red2")

East of Ghamam is a large concentration, so let's zoom in on that area and add the strike numbers as labels.

> map3 = ggmap(
   get_googlemap(center="Dorien, Syria",zoom=13, maptype="hybrid"))

> map3 + geom_point(
   data = real, aes (x = Long, y = Lat),pch = 18, size = 9, col="red3") +
   geom_text(data=real,aes(x=Long, y=Lat, label=Airstrikes),
   size = 5, vjust = 0, hjust = -0.25, color="white")

The last thing I want to do is focus on the site of Strike 28. To do this we need its lat and long, which we can find with the which() function.

> which(real$Airstrikes =="Strike 28")
[1] 21

> real[21,]
   Airstrikes      Lat     Long real_reported
21  Strike 28 35.68449 36.11946          real

It is now just a simple matter of using those coordinates to call up the Google map.

> map4 = ggmap(
   get_googlemap(center=c(lon=36.11946, lat=35.68449), zoom=17, maptype="satellite"))

> map4 + geom_point(
   data = real, aes (x = Long, y = Lat), pch = 22, size = 12, col="red3") +
   geom_text(data=real, aes(x=Long, y=Lat, label=Airstrikes),
   size = 9, vjust = 0, hjust = -0.25, color="white")

From the looks of it, this seems to be an isolated location, so it was probably some sort of base or logistics center.  If you're interested, the Russian Ministry of Defense posts videos of these strikes and you can see this one on YouTube.

https://www.youtube.com/watch?v=Ape5grS9MEM


OK, so that is a quick tutorial on using ggmap, a very powerful package; we've just scratched the surface of what it can do. I will continue to monitor the site for additional data and perhaps publish a Shiny app if the data becomes large and "rich" enough.

Cheers,

CL

Friday, October 30, 2015

Machine Learning with R

My effort on machine learning with R is now available.

The book is available on amazon.com and packtpub.com.  A preview is also available on scribd.com at this link:

https://www.scribd.com/book/287699993/Mastering-Machine-Learning-with-R


Monday, January 12, 2015

NFL Cluster Analysis...Part 2

Another post on Cluster Analysis and the NFL, this time on the Perceivant.com blog.

http://perceivant.com/nfl-ranking-cluster-analysis/


Monday, December 29, 2014

Cluster Analysis of the NFL’s Top Wide Receivers

“The time has come to get deeply into football. It is the only thing we have left that ain't fixed.”
Hunter S. Thompson, Hey Rube Column, November 9, 2004

I have to confess that I haven’t been following the NFL this year as much as planned or hoped.  On only 3 or 4 occasions this year have I been able to achieve a catatonic state while watching NFL RedZone.  Nonetheless, it is easy to envision how it is all going to end.  Manning will throw four picks on a cold snowy day in Foxboro in the AFC Championship game and the Seahawk defense will curb-stomp Aaron Rodgers and capture a consecutive NFC crown.  As for the Super Bowl, well who cares other than the fact that we must cheer against the evil Patriot empire, rooting for their humiliating demise.  One can simultaneously hate and admire that team.  I prefer to do the former publicly and the latter in private.

We all seem to have a handle on the good and bad quarterbacks out there, but what about their wide receivers? With the playoffs at the doorstep and my ignorance of the situation, I had to bring myself up to speed. This is a great opportunity to do a cluster analysis of the top wide receivers and see who is worth keeping an eye on in the upcoming spectacle.

A good source for interesting statistics and articles on NFL players and teams is http://www.advancedfootballanalytics.com/ .  Here we can download the rankings for the top 40 wide receivers based on Win Probability Added or “WPA”.  To understand the calculation you should head over to the site and read the details.  The site provides a rationale on WPA by saying “WPA has a number of applications. For starters, we can tell which plays were truly critical in each game. From a fan’s perspective, we can call a play the ‘play of the week’ or the ‘play of the year.’ And although we still can't separate an individual player's performance from that of his teammates', we add up the total WPA for plays in which individual players took part. This can help us see who really made the difference when it matters most. It can help tell us who is, or at least appears to be, “clutch.” It can also help inform us who really deserves the player of the week award, the selection to the Pro Bowl, or even induction into the Hall of Fame.”  

I put the website’s wide receiver data table in a .csv and we can start the analysis, reading in the file and examining its structure.

> receivers <- read.csv(file.choose())
> str(receivers)
'data.frame': 40 obs. of  19 variables:
 $ Rank     : int  1 2 3 4 5 6 7 8 9 10 ...
 $ Player   : Factor w/ 40 levels "10-E.Sanders",..: 16 33 1 23 37 24 36 2 4 13 ...
 $ Team     : Factor w/ 28 levels "ARZ","ATL","BLT",..: 11 23 10 22 9 12 12 28 17 19 ...
 $ G        : int  16 16 16 16 16 16 16 14 14 12 ...
 $ WPA      : num  2.43 2.4 2.33 2.33 2.3 2.27 2.19 1.91 1.89 1.76 ...
 $ EPA      : num  59 95.7 81.3 56.8 78.8 86.2 97.3 54.3 63.6 64.6 ...
 $ WPA_G    : num  0.15 0.15 0.15 0.15 0.14 0.14 0.14 0.14 0.14 0.15 ...
 $ EPA_P    : num  0.38 0.48 0.5 0.38 0.54 0.6 0.58 0.54 0.41 0.43 ...
 $ SR_PERC  : num  55.4 62.8 61.3 52 58.5 59.4 60.5 54.5 58.1 55 ...
 $ YPR      : num  13.4 13.2 13.9 15.5 15 14.1 15.5 20.2 10.6 14.3 ...
 $ Rec      : int  99 129 101 86 88 91 98 52 92 91 ...
 $ Yds      : int  1331 1698 1404 1329 1320 1287 1519 1049 972 1305 ...
 $ RecTD    : int  4 13 9 10 16 12 13 5 4 12 ...
 $ Tgts     : int  143 181 141 144 136 127 151 88 134 130 ...
 $ PER_Tgt  : num  24.2 30.2 23.4 23.5 28.9 23.9 28.4 16.2 22.3 21.7 ...
 $ YPT      : num  9.3 9.4 10 9.2 9.7 10.1 10.1 11.9 7.3 10 ...
 $ C_PERC   : num  69.2 71.3 71.6 59.7 64.7 71.7 64.9 59.1 68.7 70 ...
 $ PERC_DEEP: num  19.6 26.5 36.2 30.6 30.9 23.6 31.8 33 16.4 28.5 ...
 $ playoffs : Factor w/ 2 levels "n","y": 2 2 2 1 2 2 2 1 2 1 ...

> head(receivers$Player)
[1] 15-G.Tate    84-A.Brown   10-E.Sanders 18-J.Maclin  88-D.Bryant
[6] 18-R.Cobb

Based on WPA, Golden Tate of Detroit is the highest ranked wide receiver; in contrast his highly-regarded teammate is ranked 13th.

We talked about WPA so here is a quick synopsis on the other variables; again, please go to the website for detailed explanations:

  • EPA - Expected Points Added
  • WPA_G - WPA per game
  • EPA_P - Expected Points Added per Play
  • SR_PERC - Success Rate: the percentage of plays involving the receiver that are considered successful
  • YPR - Yards Per Reception
  • Rec - Total Receptions
  • Yds - Total Reception Yards
  • RecTD - Receiving Touchdowns
  • Tgts - The times a receiver was targeted in the passing game
  • PER_Tgt - Percentage of the team's passes that targeted the receiver
  • YPT - Yards per target
  • C_PERC - Completion percentage
  • PERC_DEEP - Percent of passes targeted deep
  • playoffs - A factor I coded on whether the receiver’s team is in the playoffs or not

To do hierarchical clustering with this data, we can use the hclust() function available in base R.  In preparation for that, we should scale the data and we must create a distance matrix.

> r.df <- receivers[,c(4:18)]
> rownames(r.df) <- receivers[,2]
> scaled <- scale(r.df)
> d <- dist(scaled)

With the data prepared, produce the cluster object and plot it.

> hc <- hclust(d, method="ward.D")
> plot(hc, hang=-1, xlab="", sub="")


This is the standard dendrogram produced with hclust.  We now need to select the proper number of clusters and produce a dendrogram that is easier to examine.  For this, I found some interesting code to adapt on Gaston Sanchez’s blog: http://gastonsanchez.com/blog/how-to/2012/10/03/Dendrograms.html .  Since I am leaning towards 5 clusters, let’s first create a vector of colors.  (Note: you can find/search for color codes on colorhexa.com)

> labelColors = c("#FF0000",  "#800080","#0000ff", "#ff8c00","#013220")

Then use the cutree() function to specify 5 clusters:

> clusMember = cutree(hc, 5)
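
Five clusters is a judgment call based on the dendrogram. One optional way to sanity-check that choice (an addition, not in the original analysis) is the average silhouette width from the cluster package, computed on the distance matrix d created above:

> library(cluster)
> # Average silhouette width for cuts of 2 through 8 clusters from the same tree
> avg_sil <- sapply(2:8, function(k) {
+   cl <- cutree(hc, k)
+   mean(silhouette(cl, d)[, "sil_width"])
+ })
> names(avg_sil) <- 2:8
> round(avg_sil, 3)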

Now, we create a function (courtesy of Gaston) to apply colors to the clusters in the dendrogram.

> colLab <- function(n) {
+   if (is.leaf(n)) {
+     a <- attributes(n)
+     labCol <- labelColors[clusMember[which(names(clusMember) == a$label)]]
+     attr(n, "nodePar") <- c(a$nodePar, lab.col = labCol)
+   }
+   n
+ }

Finally, we turn “hc” into a dendrogram object and plot the new results.

> hcd <- as.dendrogram(hc)
> clusDendro = dendrapply(hcd, colLab)
> plot(clusDendro, main = "NFL Receiver Clusters", type = "triangle")



That is much better.  For more in-depth analysis you can put the clusters back into the original dataframe.

> receivers$cluster <- as.factor(cutree(hc, 5))
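
For example, here is a quick look at the cluster averages for a few of the metrics (an addition, not in the original post):

> # Mean of selected stats by cluster
> aggregate(cbind(WPA, EPA, Rec, Yds, RecTD) ~ cluster, data = receivers, FUN = mean)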

It is now rather interesting to plot the variables by cluster to examine the differences.  In the interest of time and space, I present just a boxplot of WPA by cluster.

> boxplot(WPA~cluster, data=receivers, main="Receiver Rank by Cluster")



Before moving on, I present a simple table of the receiver clusters by playoff qualification. It is interesting to note that cluster 1, the one with the high WPA, also has 10 of its 13 receivers in the playoffs. One thing I think would be worth a look is adjusting a wide receiver's WPA by some weight based on the quality of his quarterback. Note that Randall Cobb and Jordy Nelson of the Packers have high WPA, ranked 6th and 7th respectively, but they have the privilege of having Rodgers as their QB. Remember, as the quote above points out, WPA cannot separate an individual's success from a teammate's success. This raises some interesting questions for me that require further inquiry.

> table(receivers$cluster, receivers$playoffs)
   
      n  y
  1   3 10
  2   4  2
  3   7  4
  4   4  0
  5   3  3

In closing the final blog post of the year, I must make some predictions for the College Football Playoff. I hate to say it, but I think Alabama will roll over Ohio State. In the Rose Bowl, FSU comes from behind to win...again! I'd really like to see the Ducks win it all, but I just don't see their defense being good enough to stop Winston when it counts, which will be in the fourth quarter. 'Bama has that defense, well, the defensive line and linebackers anyway. Therefore, I have to give the Crimson Tide the nod in the championship. The news is not all bad. Nebraska finally let go of Bo Pelini. I was ecstatic about his hire, but the paucity of top-notch recruits finally manifested itself in perpetual high-level mediocrity. His best years were with Callahan's recruits, Ndamukong Suh among many others. They should have hired Paul Johnson from Georgia Tech; at least it would have been fun and somewhat nostalgic to watch Husker football again.

Mahalo,

Cory

Saturday, November 15, 2014

Causality in Time Series - A look at 2 R packages (CausalImpact and Changepoint)

My first blog post on Perceivant.com

http://perceivant.com/causality-time-series-determining-impact-marketing-interventions/