blockcount -- the Perl script used to count occurrences of words, given a wordlist and a series of document files. You may need to modify the first line of the script (the interpreter path) to match where Perl is installed on your system.
author-cda.R -- some R code for doing canonical discriminant analysis. You do not need to download this code: the lda function in the MASS package for R (part of the VR bundle) gives similar results.
Data files of word counts -- You can download the data in either of two formats: as separate text files for each author, or as a single R workspace file (compressed or uncompressed).
The list of function words used is contained in the wordlist.txt file. You do not need to download this if you download the R workspace file.
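For readers who want to see the kind of counting that blockcount performs without reading the Perl source, here is a minimal sketch in Python. It is not the author's script. The function names, the one-word-per-line layout assumed for wordlist.txt, and the tokenization rule (lowercase runs of letters and apostrophes) are all assumptions made for illustration, so the counts it produces may differ from those in the data files.

```python
import re
from collections import Counter


def load_wordlist(path):
    # Assumption: one word per line; the real wordlist.txt layout may differ.
    with open(path, encoding="utf-8") as handle:
        return [line.strip().lower() for line in handle if line.strip()]


def count_function_words(text, wordlist):
    # Assumption: tokens are lowercase runs of letters and apostrophes.
    tokens = re.findall(r"[a-z']+", text.lower())
    tally = Counter(tokens)
    # One count per wordlist entry, in wordlist order; absent words give 0.
    return [tally[word] for word in wordlist]


def count_files(paths, wordlist):
    # Map each document file to its list of function-word counts.
    result = {}
    for path in paths:
        with open(path, encoding="utf-8") as handle:
            result[path] = count_function_words(handle.read(), wordlist)
    return result
```

For example, count_function_words("The cat and the dog", ["the", "and", "of"]) returns [2, 1, 0]. Keeping the counts in wordlist order means every document yields a vector of the same length, which is the shape a discriminant analysis expects.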
The reference for the paper is:
Peng, R. D. and Hengartner, N. W. (2002) "Quantitative analysis of literary styles." The American Statistician, 56 (3), 175--185.