
Sunday, December 1, 2013

Ensemble, Part 2 (Bootstrap Aggregation)

Part 1 consisted of building a classification tree with the "party" package.  I will now use "ipred" to examine the same data with a bagging (bootstrap aggregation) algorithm.

> library(ipred)
> train_bag = bagging(class ~ ., data=train, coob=T)
> train_bag

Bagging classification trees with 25 bootstrap replications

Call: bagging.data.frame(formula = class ~ ., data = train, coob = T)

Out-of-bag estimate of misclassification error:  0.0424

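> # with no newdata, predict() on the bagged fit returns out-of-bag predictions,
> # so the 20 errors out of 472 below line up with the 0.0424 estimate above
> table(predict(train_bag), train$class)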
         
            benign malignant
  benign       290         9
  malignant     11       162
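
For intuition about what bagging() and its out-of-bag estimate are doing, here is a rough sketch of the idea in plain R. It is not ipred's implementation: it assumes rpart with default settings as the base learner, uses the built-in iris data as a stand-in for the post's train set, and the names votes, oob_pred and oob_err are mine.

```r
library(rpart)  # recommended package that ships with R

set.seed(1)
dat   <- iris   # stand-in data, NOT the post's train set
n     <- nrow(dat)
nbagg <- 25     # same number of replications reported above

# votes[i, b] holds tree b's prediction for row i, or NA if row i was in-bag
votes <- matrix(NA_character_, nrow = n, ncol = nbagg)
for (b in seq_len(nbagg)) {
  idx <- sample(n, replace = TRUE)           # bootstrap sample of row indices
  fit <- rpart(Species ~ ., data = dat[idx, ])
  oob <- setdiff(seq_len(n), idx)            # rows this tree never saw
  votes[oob, b] <- as.character(predict(fit, dat[oob, ], type = "class"))
}

# majority vote over the trees for which each row was out-of-bag
oob_pred <- apply(votes, 1, function(v) {
  v <- v[!is.na(v)]
  if (length(v) == 0) NA_character_ else names(which.max(table(v)))
})

# out-of-bag estimate of the misclassification error
oob_err <- mean(oob_pred != dat$Species, na.rm = TRUE)
oob_err
```

Each tree is only scored on rows left out of its bootstrap sample, which is why the out-of-bag error can serve as an honest error estimate without a separate hold-out set.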

> testbag = predict(train_bag, newdata=test)
> table(testbag, test$class)
         
testbag     benign malignant
  benign       137         1
  malignant      6        67

If you compare the confusion matrices from this week to the prior post, what do you think?
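
One way to put numbers on that comparison is to reduce each matrix to a few rates. A minimal sketch in base R, with the test-set counts from the table above re-entered by hand; the names cm, err, sens and spec are mine, and treating malignant as the positive class is an assumption.

```r
# test-set confusion matrix copied from the output above
# rows = predicted class, columns = actual class (filled column by column)
cm <- matrix(c(137, 6, 1, 67), nrow = 2,
             dimnames = list(predicted = c("benign", "malignant"),
                             actual    = c("benign", "malignant")))

err  <- 1 - sum(diag(cm)) / sum(cm)                            # misclassification rate
sens <- cm["malignant", "malignant"] / sum(cm[, "malignant"])  # malignant cases caught
spec <- cm["benign", "benign"] / sum(cm[, "benign"])           # benign cases cleared

round(c(error = err, sensitivity = sens, specificity = spec), 3)
```

For the bagged model this works out to 7 errors in 211 test cases, an error rate of about 3.3%. The same three lines applied to the Part 1 matrix would give a like-for-like comparison.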

Let's recall the prior ROC curve and combine it with the bagged model.

> # prepare the bagged model for the ROC curve
> # (prediction() and performance() come from ROCR; "perf" is the ctree curve from Part 1)
> library(ROCR)
> test.bagprob = predict(train_bag, type = "prob", newdata = test)
> bagpred = prediction(test.bagprob[,2], test$class)
> bagperf = performance(bagpred, "tpr", "fpr")

> plot(perf, main="ROC", colorize=T)
> plot(bagperf, col=2, add=TRUE)
> plot(perf, col=1, add=TRUE)
> legend(0.6, 0.6, c('ctree', 'bagging'), 1:2)

[Figure: ROC curves, ctree (col 1) versus bagging (col 2)]

As the confusion matrices suggested, the bagged model outperforms the single tree from Part 1; on the test set it misclassifies only 7 of 211 observations. Finally, let's have a look at the AUC (0.992 with bagging versus 0.985 last time around).

> auc.curve = performance(bagpred, "auc")
> auc.curve
An object of class "performance"
Slot "x.name":
[1] "None"

Slot "y.name":
[1] "Area under the ROC curve"

Slot "alpha.name":
[1] "none"

Slot "x.values":
list()

Slot "y.values":
[[1]]
[1] 0.9918244


Slot "alpha.values":
list()
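
The printed object is verbose; since the output above shows the value sitting in the y.values slot, auc.curve@y.values[[1]] should pull out just the number. For intuition about what that number means, here is a small base R sketch that computes AUC as the share of positive/negative pairs in which the positive case gets the higher score. The scores and labels are toy values invented for illustration, not the post's data.

```r
# toy data, made up for illustration only
scores <- c(0.9, 0.8, 0.7, 0.6, 0.55, 0.4, 0.3, 0.1)  # predicted P(positive)
labels <- c(1,   1,   0,   1,   0,    1,   0,   0)    # 1 = positive class

n_pos <- sum(labels == 1)
n_neg <- sum(labels == 0)

# rank-sum (Mann-Whitney) form of the AUC; rank() averages ties
r   <- rank(scores)
auc <- (sum(r[labels == 1]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)

# brute-force check: share of (positive, negative) pairs ordered correctly
pairs <- outer(scores[labels == 1], scores[labels == 0], ">")
c(rank_auc = auc, pair_auc = mean(pairs))
```

Both routes give the same value here (13 of 16 pairs ordered correctly). Read this way, an AUC of 0.992 says the bagged model scores a randomly chosen malignant case above a randomly chosen benign one about 99% of the time.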

OK, more iterations to come: boosting, random forest and, since no self-respecting data scientist would leave it out, logistic regression.

Cheers.
