Saturday, November 7, 2015

Plotting Russian AiRstRikes in SyRia

"Who do we think will rise if Assad falls?"
"Do we have a “government in a box” that we think we can fly to Damascus and put into power if the Syrian army collapses, the regime falls and ISIS approaches the capital?"
"Have we forgotten the lesson of “Animal Farm”? When the animals revolt and take over the farm, the pigs wind up in charge." 
Patrick J. Buchanan

In my new book, "Mastering Machine Learning with R", I wanted to include geospatial mapping in the chapter on cluster analysis.  I actually completed the entire chapter with a cluster analysis of the Iraq WikiLeaks data, plotting the clusters on a map and building a story around developing an intelligence estimate for the Al-Doura Oil Refinery, which I visited on many occasions during my 2009 "sabbatical".  However, the publisher convinced me that the material was too sensitive for such a book, so I rewrote the analysis from scratch with a different data set.  I may or may not publish the original on this blog at some point, but I want to continue to explore building maps in R.  As luck would have it, I stumbled onto a data set showing the locations of Russian airstrikes in Syria at the following site:

The data includes the latitude and longitude of the strikes along with other background information.  What data was collected, and how and why, is explained here:

In short, the site tried to independently verify locations, targets and so on, and it lists what it claims are the reported versus actual strike locations.  When I pulled the data, the site had analyzed 60 strikes.  It was unable to determine the locations of 11 of them, so we have 49 data points.

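I built the data in Excel and saved it as a .csv, which I've already loaded.  Here is the structure of the data.  Note the 120 rows: each of the 60 strikes appears twice, once with its real location and once with its reported one.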

> str(airstrikes)
'data.frame':  120 obs. of  4 variables:
 $ Airstrikes   : chr  "Strike 1" "Strike 10" "Strike 11" "Strike 12" ...
 $ Lat          : chr  "35.687782" "35.725846" "35.734952" "35.719518" ...
 $ Long         : chr  "36.786667" "36.260419" "36.073837" "36.072385" ...
 $ real_reported: chr  "real" "real" "real" "real" ...

> head(airstrikes)
  Airstrikes       Lat      Long real_reported
1   Strike 1 35.687782 36.786667          real
2  Strike 10 35.725846 36.260419          real
3  Strike 11 35.734952 36.073837          real
4  Strike 12 35.719518 36.072385          real
5  Strike 13 35.309074 36.620506          real
6  Strike 14 35.817206 36.124503          real

> tail(airstrikes)
    Airstrikes       Lat      Long real_reported
115  Strike 59 35.644864 36.338568      reported
116   Strike 6 35.740134 36.247029      reported
117  Strike 60  36.09346 37.085198      reported
118   Strike 7 35.702113 36.563525      reported
119   Strike 8 35.822472 36.018779      reported
120   Strike 9 35.725846 36.260419      reported

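Since Lat and Long are character, I need to convert them to numeric.  The coercion warnings below are expected, presumably because the strikes whose locations could not be determined have no numeric coordinates and so become NA.  I also want to keep a subset of the data with just the actual/real strike locations.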

> airstrikes$Lat = as.numeric(airstrikes$Lat)
Warning message:
NAs introduced by coercion

> airstrikes$Long = as.numeric(airstrikes$Long)
Warning message:
NAs introduced by coercion

> real = subset(airstrikes, real_reported == "real")
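
To check that the NAs produced by the coercion line up with the strikes that could not be located, a small sketch like the following may help.  The toy data frame and the "unknown" placeholder are assumptions for illustration only; the actual non-numeric entries in the .csv are not shown in this post.

```r
# Toy stand-in for the airstrikes data; the "unknown" marker is an assumption.
toy <- data.frame(
  Airstrikes    = c("Strike 1", "Strike 2", "Strike 1", "Strike 2"),
  Lat           = c("35.687782", "unknown", "35.687782", "35.725846"),
  Long          = c("36.786667", "unknown", "36.786667", "36.260419"),
  real_reported = c("real", "real", "reported", "reported"),
  stringsAsFactors = FALSE
)

# Convert to numeric; non-numeric strings become NA (the warning is silenced here).
toy$Lat  <- suppressWarnings(as.numeric(toy$Lat))
toy$Long <- suppressWarnings(as.numeric(toy$Long))

# Rows whose coordinates could not be parsed.
missing_rows <- toy[is.na(toy$Lat) | is.na(toy$Long), ]

# Keep only "real" strikes that have usable coordinates.
real_toy <- subset(toy, real_reported == "real" & !is.na(Lat) & !is.na(Long))

nrow(missing_rows)
nrow(real_toy)
```

Dropping the NA rows up front also avoids ggplot2 warnings about removed rows when the points are plotted later.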

I will use ggmap for this effort and pull in Google Maps for plotting.

> library(ggmap)
Loading required package: ggplot2
Google Maps API Terms of Service:
Please cite ggmap if you use it: see citation('ggmap') for details.

> citation('ggmap')

To cite ggmap in publications, please use:

  D. Kahle and H. Wickham. ggmap: Spatial Visualization with ggplot2. The
  R Journal, 5(1), 144-161. URL

The first map will be an overall view of the country with the map type as "terrain".  Note that "satellite", "hybrid" and "roadmap" are also available.

> map1 = ggmap(
   get_googlemap(center="Syria", zoom=7, maptype="terrain"))
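
Current releases of ggmap need a Google Maps API key before get_googlemap() will return anything, which was not the case when this post was written.  The sketch below shows one way to guard the call; the environment variable name GOOGLE_MAPS_API_KEY is my own assumption, not something ggmap prescribes.

```r
# Hypothetical environment variable name; use whatever you store your key under.
api_key   <- Sys.getenv("GOOGLE_MAPS_API_KEY")
has_key   <- nzchar(api_key)
has_ggmap <- requireNamespace("ggmap", quietly = TRUE)

if (has_key && has_ggmap) {
  # Register the key for this session, then request the map as in the post.
  ggmap::register_google(key = api_key)
  map1 <- ggmap::ggmap(
    ggmap::get_googlemap(center = "Syria", zoom = 7, maptype = "terrain"))
} else {
  message("Skipping the map download: no API key set or ggmap not installed.")
}
```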

With the map created as object "map1", I plot the locations using "geom_point()".

> map1 + geom_point(
   data = real, aes (x = Long, y = Lat), pch = 19,  size = 6, col="red3")

With the exception of what looks like one strike near Ar Raqqah, we can see they are concentrated between Aleppo and Homs with some close to the Turkish border.  Let's have a closer look at that region.
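
One way to choose the center of a closer view is to summarize the coordinates themselves rather than eyeballing the map.  Here is a minimal sketch using only the six real locations printed by head() above; the variable names are mine, and with the full data the numbers would of course differ.

```r
# The six real strike locations shown by head(airstrikes) above.
pts <- data.frame(
  Lat  = c(35.687782, 35.725846, 35.734952, 35.719518, 35.309074, 35.817206),
  Long = c(36.786667, 36.260419, 36.073837, 36.072385, 36.620506, 36.124503)
)

# Bounding box of the points: min and max of each coordinate.
lat_range  <- range(pts$Lat)
long_range <- range(pts$Long)

# The median is less sensitive to a single outlying strike than the mean.
center <- c(lon = median(pts$Long), lat = median(pts$Lat))
center
```

A named lon/lat vector like this can be passed as the center argument of get_googlemap() in place of a place name.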

> map2 = ggmap(
   get_googlemap(center="Ehsim, Syria", zoom=9, maptype="terrain"))

> map2 + geom_point(data = real, aes (x = Long, y = Lat),
  pch = 18,  size = 9, col="red2")

East of Ghamam is a large concentration, so let's zoom in on that area and add the strike number as labels.

> map3 = ggmap(
   get_googlemap(center="Dorien, Syria",zoom=13, maptype="hybrid"))

> map3 + geom_point(
   data = real, aes (x = Long, y = Lat),pch = 18, size = 9, col="red3") +
   geom_text(data=real,aes(x=Long, y=Lat, label=Airstrikes),
   size = 5, vjust = 0, hjust = -0.25, color="white")

The last thing I want to do is focus in on the site for Strike 28.  To do this we will require the lat and long, which we can find with the which() function.

> which(real$Airstrikes =="Strike 28")
[1] 21

> real[21,]
   Airstrikes      Lat     Long real_reported
21  Strike 28 35.68449 36.11946          real
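
Rather than copying the coordinates by hand, the row can also be selected by name and turned straight into a center vector.  This is only a sketch: real_demo is a small stand-in for the real data frame, built from rows printed in this post.

```r
# Small stand-in for the `real` data frame, using rows printed in this post.
real_demo <- data.frame(
  Airstrikes = c("Strike 1", "Strike 10", "Strike 28"),
  Lat        = c(35.687782, 35.725846, 35.68449),
  Long       = c(36.786667, 36.260419, 36.11946),
  stringsAsFactors = FALSE
)

# Select the row by name instead of by position.
target <- real_demo[real_demo$Airstrikes == "Strike 28", ]

# Build a named lon/lat vector, the form get_googlemap() takes as `center`.
center28 <- c(lon = target$Long, lat = target$Lat)
center28
```

Selecting by name keeps working if rows are added or reordered, whereas a hard-coded index such as 21 would not.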

It is now a simple matter of using those coordinates to call up the Google map.

> map4 = ggmap(
   get_googlemap(center=c(lon=36.11946,lat=35.68449), zoom=17, maptype="satellite"))

> map4 + geom_point(
   data = real, aes (x = Long, y = Lat), pch = 22, size = 12, col="red3") +
   geom_text(data=real,aes(x=Long, y=Lat, label=Airstrikes),
   size = 9, vjust = 0, hjust = -0.25, color="white")

From the looks of it, this seems to be an isolated location, so it was probably some sort of base or logistics center.  If you're interested, the Russian Ministry of Defense posts videos of these strikes and you can see this one on YouTube.

OK, so that is a quick tutorial on using ggmap, a very powerful package.  We've just scratched the surface of what it can do.  I will continue to monitor the site for additional data, and perhaps publish a Shiny app if the data becomes large and "rich" enough.