Tuesday, June 14, 2016

Iraq-Wikileaks Analysis with R

“In a place of extreme violence and devoid of order, the practical subsumes the principle. I drifted down the path of bribery and corruption endemic to the streets of Baghdad.”

Jason Whiteley, Father of Money: Buying Peace in Baghdad

As I mentioned in a previous post, I wanted to explore the Wikileaks data of the US Military's reported Significant Activities (SIGACTS). The data are a subset of the famous classified US military documents published by Wikileaks. Private Bradley Manning provided this material to Wikileaks; he was sentenced to 35 years in 2013 and is now behind bars. The subset of these documents I will use is available on The Guardian’s datablog website at this link:

The Guardian created this subset by selecting only those SIGACT reports that were associated with deaths of personnel and that, in its judgment, did not compromise confidential sources. The subset is stored in a Google Fusion Table.

The code provided merely scratches the surface of the analysis that one can do with this data set of roughly 52,000 SIGACTs. I show how to pull the data into R, conduct some basic data wrangling, create a subset, perform a cluster analysis and, finally, build maps. For the maps, I show how to create a static map with the ggplot2 package as well as an interactive map with the leaflet package.

The subset of the data will focus on 2009 and the area assigned to Multi-National Division Baghdad, since I spent 10 months of that year there and roughly 99% of that time in that Division’s Area of Responsibility.
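
As a rough sketch of what the subsetting and clustering steps could look like, here is a small base R example. Everything in it is an assumption made for illustration: the toy data frame and its values are invented, the column names (date, region, lat, lon) are hypothetical and may not match the Guardian table, and k-means is simply one common clustering method, not necessarily the one used in the linked analysis.

```r
# Toy stand-in for the SIGACT table. The rows and column names are
# invented for illustration and are NOT taken from the real data.
set.seed(42)
sigacts <- data.frame(
  date   = as.Date(c("2008-11-03", "2009-01-15", "2009-03-22", "2009-06-09",
                     "2009-08-30", "2009-10-12", "2009-12-01", "2010-02-14")),
  region = c("MND-BAGHDAD", "MND-BAGHDAD", "MND-N", "MND-BAGHDAD",
             "MND-BAGHDAD", "MND-BAGHDAD", "MND-BAGHDAD", "MND-BAGHDAD"),
  lat    = c(33.31, 33.30, 36.34, 33.35, 33.25, 33.33, 33.27, 33.32),
  lon    = c(44.36, 44.40, 43.13, 44.42, 44.35, 44.38, 44.33, 44.41)
)

# Keep only events from 2009 that fall in the Baghdad division.
baghdad_2009 <- subset(
  sigacts,
  format(date, "%Y") == "2009" & region == "MND-BAGHDAD"
)

# Group the remaining events by location with k-means (assumed method).
# centers = 2 is arbitrary here; nstart reruns the algorithm from several
# random starts and keeps the best solution.
km <- kmeans(baghdad_2009[, c("lon", "lat")], centers = 2, nstart = 10)
baghdad_2009$cluster <- factor(km$cluster)

print(baghdad_2009)
```

The resulting cluster column can then be mapped to color when plotting the points on a static or interactive map.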

The analysis, with code and commentary, is available at the following link: