EFFICIENT FEATURE SELECTION FOR POLYP DETECTION
Computed tomographic colonography (CTC) is a promising alternative to traditional invasive colonoscopic methods used in the detection and removal of cancerous growths, or polyps, in the colon. Existing algorithms for CTC typically use a classifier to discriminate between the true and false positives generated by a polyp candidate detection system. However, these classifiers often suffer from the curse of dimensionality, whereby classifier performance degrades markedly as the number of features increases. In addition, a larger feature set increases computational complexity and storage demands. This paper demonstrates the benefits of feature selection, with the aim of increasing specificity while preserving sensitivity in a polyp detection system. It also compares the performance of an individual feature ranking method (F-score) and a mutual information (MI) method on a polyp candidate database, in order to select a subset of features for optimum computer-aided detection (CAD) performance. Experimental results show that SVM+MI appears to perform better when only a small number of features is used, whereas SVM+Fscore appears to dominate when the 30-50 top-ranked features are used. Overall, the area under the ROC curve (AUC) reaches 0.8-0.85 for the 20-40 top-ranked features with either the MI or the F-score method, compared with 0.65-0.7 when all 100 features are used in the worst-case scenario.
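
As an illustration of the two ranking criteria compared above, the following is a minimal sketch rather than the implementation evaluated here. It assumes that the F-score is the Fisher-style ratio of between-class to within-class scatter commonly paired with SVMs, and that MI is estimated from an equal-width histogram of each feature; neither detail is specified in this abstract. The function names (f_score, mutual_information, rank_features) and the toy data are hypothetical and are not drawn from the polyp candidate database.

```python
from math import log
from statistics import mean, variance


def f_score(values, labels):
    """Fisher-style F-score of one feature; labels are 1 (polyp) or 0 (false positive).

    Assumes each class contains at least two samples.
    """
    pos = [v for v, y in zip(values, labels) if y == 1]
    neg = [v for v, y in zip(values, labels) if y == 0]
    overall = mean(values)
    # Between-class scatter: squared distance of each class mean from the overall mean.
    between = (mean(pos) - overall) ** 2 + (mean(neg) - overall) ** 2
    # Within-class scatter: sum of the sample variances (n - 1 denominator).
    within = variance(pos) + variance(neg)
    if within == 0:
        return float("inf") if between > 0 else 0.0
    return between / within


def mutual_information(values, labels, bins=4):
    """Histogram estimate of MI (in nats) between one feature and the class label."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins or 1.0  # fall back to 1.0 for a constant feature
    binned = [min(int((v - lo) / width), bins - 1) for v in values]
    n = len(values)
    joint, px, py = {}, {}, {}
    for b, y in zip(binned, labels):
        joint[(b, y)] = joint.get((b, y), 0) + 1
        px[b] = px.get(b, 0) + 1
        py[y] = py.get(y, 0) + 1
    # Sum of p(b, y) * log(p(b, y) / (p(b) * p(y))) over the observed cells.
    return sum((c / n) * log(c * n / (px[b] * py[y])) for (b, y), c in joint.items())


def rank_features(columns, labels, score):
    """Return feature indices ordered from highest to lowest score."""
    scores = [score(col, labels) for col in columns]
    return sorted(range(len(columns)), key=lambda i: scores[i], reverse=True)


# Toy data (invented for illustration): one column per feature.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
columns = [
    [0.9, 1.1, 1.0, 1.2, 0.1, 0.0, 0.2, 0.1],  # separates the two classes
    [0.5, 0.1, 0.9, 0.3, 0.4, 0.2, 0.8, 0.6],  # overlaps heavily between classes
]
f_ranking = rank_features(columns, labels, f_score)
mi_ranking = rank_features(columns, labels, mutual_information)
```

In a pipeline of the kind described, only the top-ranked features from either ranking would then be passed to the SVM classifier; that step is not shown. Note that this sketch scores each feature independently under both criteria, so it does not account for redundancy between features.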