|
ISRN Machine Vision 2012
On the Brittleness of Handwritten Digit Recognition ModelsDOI: 10.5402/2012/834127 Abstract: Handwritten digit recognition is an important benchmark task in computer vision. Learning algorithms and feature representations which offer excellent performance for this task have been known for some time. Here, we focus on two major practical considerations: the relationship between the the amount of training data and error rate (corresponding to the effort to collect training data to build a model with a given maximum error rate) and the transferability of models' expertise between different datasets (corresponding to the usefulness for general handwritten digit recognition). While the relationship between amount of training data and error rate is very stable and to some extent independent of the specific dataset used—only the classifier and feature representation have significant effect—it has proven to be impossible to transfer low error rates on one or two pooled datasets to similarly low error rates on another dataset. We have called this weakness brittleness, inspired by an old Artificial Intelligence term that means the same thing. This weakness may be a general weakness of trained image classification systems. 1. Introduction Intelligent image analysis is an interesting research area in Artificial Intelligence and also important to a variety of current open research problems. Handwritten digits recognition is a well-researched subarea within the field, which is concerned with learning models to distinguish presegmented handwritten digits. The application of machine learning techniques over the last decade has proven successful in building systems which are competitive to human performance and which perform far better than manually written classical AI systems used in the beginnings of optical character recognition technology. However, not all aspects of such models have been previously investigated. Here, we systematically investigate two new aspects of such systems.(i)Essential training set size, that is, the relation between training set size and accuracy/error rate so as to determine the number of labeled training samples that are essential for a given performance level. Creating labeled training samples is costly and we are generally interested in algorithms which yield acceptable performance with the fewest number of labeled training samples.(ii)Dataset-Independence, that is, how well models trained on one sample dataset for handwritten digit recognition perform on other sample datasets for handwritten digit recognition after comprehensive normalization between the datasets. Models should be robust to small changes in preprocessing and
|