Campos Magnéticos: Visualizing classifier performance in R, with only 3 commands

Thursday, May 04, 2006

Visualizing classifier performance in R, with only 3 commands

I have recently discovered ROCR, an R package that is really useful for IR students. ROCR was designed for evaluating and visualizing classifier performance, and it supports combinations of many of the typical IR metrics (e.g. precision, recall, F-measure, accuracy or error rate). It adds only three new commands to R and integrates tightly with R's built-in graphics facilities.

Here's a short how-to. Let's assume you have the following experimental data from a binary classification problem in a file called "data.txt". The column titled ClassifyerOutput holds the scores output by the classifier, while the column GroundTruth holds the real class of each sample.


ClassifyerOutput ^ GroundTruth
0.35             ^ 0
1.0              ^ 1
1.0              ^ 0
0.1              ^ 1
0.58             ^ 0
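
To try the example without creating the file by hand, the sample above can be written out from R itself. This is only a sketch: it writes to a temporary file instead of "data.txt" (the names sample_lines and path are my own, not part of the original setup), and it passes strip.white=TRUE so that the padding around the "^" separator cannot end up in the values.

```r
# Sample data from the post, padded with spaces around the "^" separator.
sample_lines <- c(
  "ClassifyerOutput ^ GroundTruth",
  "0.35             ^ 0",
  "1.0              ^ 1",
  "1.0              ^ 0",
  "0.1              ^ 1",
  "0.58             ^ 0"
)

# Hypothetical location: a temporary file stands in for "data.txt".
path <- tempfile(fileext = ".txt")
writeLines(sample_lines, path)

# strip.white = TRUE drops the spaces around each field.
data <- read.table(path, sep = "^", header = TRUE, strip.white = TRUE)
str(data)
```

The result is a data frame with five rows and the two columns ClassifyerOutput and GroundTruth.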

The following R script would produce a nice precision/recall curve.


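# Load ROCR, which provides prediction(), performance() and a plot() method
library(ROCR)
# Read the '^'-separated file; the first row holds the column names
data <- read.table('data.txt', sep='^', header=TRUE)
# Pair the classifier scores with the true labels
pred <- prediction(data$ClassifyerOutput, data$GroundTruth)
# Precision (y axis) as a function of recall (x axis)
perf <- performance(pred, "prec", "rec")
# Draw the raw curve as a light dotted line, then overlay the averaged curve
plot(perf, col="grey82", lty=3)
plot(perf, avg="vertical", spread.estimate="boxplot", add=TRUE)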
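
To see what a precision/recall curve is made of, the same numbers can be worked out by hand in base R. The sketch below is my own illustration, not ROCR's code: it treats every distinct score as a cutoff, calls a sample positive when its score is at or above the cutoff, and computes precision and recall for each cutoff. ROCR's own cutoff handling may differ in detail (for instance at the endpoints), and the variable names are assumptions.

```r
# The five samples from the example table.
scores <- c(0.35, 1.0, 1.0, 0.1, 0.58)  # ClassifyerOutput
truth  <- c(0, 1, 0, 1, 0)              # GroundTruth

# Use each distinct score as a cutoff, from strictest to most lenient.
cutoffs <- sort(unique(scores), decreasing = TRUE)

pr <- t(sapply(cutoffs, function(cutoff) {
  predicted <- scores >= cutoff          # samples called positive
  tp <- sum(predicted & truth == 1)      # true positives
  c(cutoff    = cutoff,
    precision = tp / sum(predicted),     # TP / all predicted positives
    recall    = tp / sum(truth == 1))    # TP / all real positives
}))
print(pr)
```

With these five samples, precision goes 0.5, 0.33, 0.25, 0.4 as the cutoff is lowered, while recall stays at 0.5 until the last cutoff, where it reaches 1.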

It could not be simpler. The complete documentation is only 14 pages long and, assuming that you are familiar with R, in no time you'll be producing nice-looking charts from your data. I had some problems installing the package on Linux (you will also need to install gplots, from the R package bundle gregmisc), but everything worked fine on my Windows machine.
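
A quick way to see whether the two packages are in place before running the script is to ask R directly. This is a minimal sketch that only reports what is missing; the install.packages() call is left commented out and assumes both packages can be fetched from CRAN under the names gplots and ROCR, which may not match every setup.

```r
# Packages the script relies on: ROCR itself and gplots.
needed <- c("gplots", "ROCR")

# TRUE for every package that can be loaded on this machine.
installed <- vapply(needed, requireNamespace, logical(1), quietly = TRUE)
missing_pkgs <- needed[!installed]

if (length(missing_pkgs) > 0) {
  message("Missing packages: ", paste(missing_pkgs, collapse = ", "))
  # install.packages(missing_pkgs)  # uncomment to install (assumes CRAN access)
} else {
  message("gplots and ROCR are both available.")
}
```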

If you're using the package, don't forget to cite the original authors:

Sing, T. & Sander, O. & Beerenwinkel, N. & Lengauer, T. (2004).
"ROCR: An R Package for visualizing the performance of scoring classifiers".
http://rocr.bioinf.mpi-sb.mpg.de


About me

I'm Bruno Martins
From Lisbon, Portugal
For more information, please refer to my Flickr profile, my BookCrossing bookshelf or my curriculum vitae.
