We consider the problem of detecting mitotic figures in breast cancer histology slides. We investigate whether the performance of state-of-the-art detection algorithms is comparable to that of humans when the two are compared under fair conditions: our test subjects had no prior exposure to the task and were required to learn their own classification criteria solely by studying the same training set available to the algorithms. We designed and implemented a standardized web-based test based on the publicly available MITOS dataset, and compared results with the performance of the 6 top-scoring algorithms in the ICPR 2012 Mitosis Detection Contest.
The problem is presented as a classification task on a balanced dataset. Forty-five different test subjects produced a total of 3009 classifications. The best individual (accuracy = 0.859 ± 0.012) is outperformed by the most accurate algorithm (accuracy = 0.873 ± 0.004). This suggests that state-of-the-art detection algorithms are likely limited by the size of the training set, rather than by a lack of generalization ability.
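The accuracy figures above can be reproduced from raw classification counts; as a minimal sketch, assuming the ± values report the binomial standard error sqrt(p(1-p)/n) (the abstract does not state how the intervals were computed, and the counts below are hypothetical, chosen only for illustration):

```python
import math

def accuracy_with_se(correct, total):
    """Accuracy of a binary classifier and its binomial standard error.

    Assumes the +/- interval is the standard error of a proportion,
    sqrt(p * (1 - p) / n) -- an assumption, not stated in the abstract.
    """
    p = correct / total
    se = math.sqrt(p * (1 - p) / total)
    return p, se

# Hypothetical counts for illustration only.
p, se = accuracy_with_se(859, 1000)
print(f"accuracy = {p:.3f} +/- {se:.3f}")
```

On a balanced dataset such as the one described, accuracy is an unbiased summary because chance performance sits at exactly 0.5 for both humans and algorithms.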