I worked with
Ryan Shaw and John Chuang on an automated method for assessing quality in
Wikipedia articles. We identified twenty features of a Wikipedia article, such as the number of revisions, inbound and outbound links, reading level, references, and markup structure. Then, using articles marked by the Wikipedia community as "good" or "featured" as training material, we evaluated several algorithms, including K-Nearest Neighbors, Random Forests, and Support Vector Machines. In the end, we were able to correctly classify about 98% of our test set.
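
As a rough illustration of the classification step, here is a minimal K-Nearest Neighbors sketch in plain Python. It is not the code or data from the project: the three features, the toy numbers, and the names `TRAIN`, `scale_params`, `scale`, and `knn_predict` are all assumptions made up for this example.

```python
import math
from collections import Counter

# Hypothetical toy feature vectors: (revisions, inbound links, references).
# These numbers are invented for illustration and are not from the study.
TRAIN = [
    ((1200.0, 340.0, 95.0), "featured"),
    ((950.0, 280.0, 80.0), "featured"),
    ((1500.0, 410.0, 120.0), "featured"),
    ((40.0, 12.0, 3.0), "other"),
    ((75.0, 20.0, 5.0), "other"),
    ((20.0, 5.0, 0.0), "other"),
]


def scale_params(rows):
    """Return per-feature (min, max) pairs computed over the given rows."""
    return [(min(col), max(col)) for col in zip(*rows)]


def scale(row, params):
    """Min-max scale one feature vector; constant features map to 0."""
    return tuple(
        (x - lo) / (hi - lo) if hi > lo else 0.0
        for x, (lo, hi) in zip(row, params)
    )


def knn_predict(train, query, k=3):
    """Label `query` by majority vote among its k nearest training rows."""
    params = scale_params([features for features, _ in train])
    q = scale(query, params)
    # Euclidean distance in the scaled feature space, nearest first.
    neighbours = sorted(
        (math.dist(scale(features, params), q), label)
        for features, label in train
    )
    votes = Counter(label for _, label in neighbours[:k])
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    print(knn_predict(TRAIN, (1100.0, 300.0, 90.0)))
    print(knn_predict(TRAIN, (50.0, 10.0, 2.0)))
```

Scaling matters here because raw counts such as revisions would otherwise dominate the distance. A real evaluation would also hold out a separate test set rather than predicting on hand-picked queries.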