Monday, April 30, 2012

Improving the Training

No pictures this week (but I do have 210 words, or about 21% of a picture). In my initial attempts to reproduce the previous results, I had simplified the training of the classifier for various reasons. This week I have been focused on making the training equivalent to the scheme used in the prior work. Specifically:

1. I added back the step in which known negative images are sampled and the resulting patches collected to serve as the "background" or "NOT a character" class during training.
2. I'm still working on adding back the "hard negative" cases. In this step, we run the detector against known negative images and record the false positives. The false positive patches are saved as additional training samples for the "NOT a character" class. Finally, the ferns are retrained using good character training images for each character class, and both "easy" and "hard" negative images for the "NOT a character" class.
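
To make the two negative-sampling steps above concrete, here is a rough Python sketch. It is not the actual training code: images are plain lists of pixel rows, and `detect` is a hypothetical stand-in for the fern-based detector that I assume returns `(top, left, score)` tuples for each candidate patch. The function names and the fixed square patch size are also assumptions made for illustration.

```python
import random


def crop(image, top, left, size):
    """Return a size x size patch of `image` (a list of pixel rows)."""
    return [row[left:left + size] for row in image[top:top + size]]


def sample_easy_negatives(image, size, count, rng):
    """Step 1: crop `count` random patches from a known negative image."""
    height, width = len(image), len(image[0])
    patches = []
    for _ in range(count):
        top = rng.randrange(height - size + 1)
        left = rng.randrange(width - size + 1)
        patches.append(crop(image, top, left, size))
    return patches


def mine_hard_negatives(detect, negative_images, size):
    """Step 2: any detection on a known negative image is a false positive,
    so its patch is kept as a "hard" sample for the background class.

    `detect` is a hypothetical callable: image -> [(top, left, score), ...].
    """
    hard = []
    for image in negative_images:
        for top, left, _score in detect(image):
            hard.append(crop(image, top, left, size))
    return hard


# Tiny demonstration with a fake detector that always fires at (2, 3).
rng = random.Random(0)
negative = [[rng.randrange(256) for _ in range(12)] for _ in range(10)]
easy = sample_easy_negatives(negative, size=4, count=5, rng=rng)
hard = mine_hard_negatives(lambda img: [(2, 3, 0.9)], [negative], size=4)
background_class = easy + hard  # both feed the "NOT a character" class
```

After mining, the classifier would be retrained on the character samples plus `background_class`; that retraining call is left out because it depends on the fern implementation.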

One minor challenge in this step is choosing the threshold used to select candidate bounding boxes. In the prior work, this value was determined by manual experimentation. I'm hoping to develop a routine that determines it systematically, perhaps by optimizing an F-score weighted by a user-specified bias toward recall or precision.
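
One way this could look, as a sketch only: sweep every observed detection score as a candidate threshold and keep the one that maximizes the F-beta score, where `beta > 1` favors recall and `beta < 1` favors precision. The names `f_beta` and `best_threshold` are hypothetical, and I am assuming each candidate box has already been labeled as matching ground truth or not. Ground-truth characters that produce no candidate box at all are not counted here, so recall is relative to the candidate set.

```python
def f_beta(tp, fp, fn, beta):
    """F-beta score from true positive, false positive, false negative counts."""
    b2 = beta * beta
    denom = (1 + b2) * tp + b2 * fn + fp
    return (1 + b2) * tp / denom if denom else 0.0


def best_threshold(scores, labels, beta=1.0):
    """Return (threshold, f_score) maximizing F-beta.

    scores: detection score per candidate box.
    labels: True if that box matches a ground-truth character.
    A box is accepted when its score is >= the threshold.
    """
    total_pos = sum(1 for is_char in labels if is_char)
    best_t, best_f = None, -1.0
    for t in sorted(set(scores)):
        tp = sum(1 for s, is_char in zip(scores, labels) if s >= t and is_char)
        fp = sum(1 for s, is_char in zip(scores, labels) if s >= t and not is_char)
        fn = total_pos - tp
        f = f_beta(tp, fp, fn, beta)
        if f > best_f:
            best_t, best_f = t, f
    return best_t, best_f


# Made-up scores and labels, purely for illustration.
scores = [0.9, 0.8, 0.7, 0.4, 0.3]
labels = [True, True, False, True, False]
balanced_t, _ = best_threshold(scores, labels, beta=1.0)
precise_t, _ = best_threshold(scores, labels, beta=0.5)
```

On this toy data, biasing toward precision picks a higher threshold than the balanced setting, which is the behavior I would want from the user-specified bias.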
