Recognizing Chinese Characters in Scene Text: Preliminary Performance

Training Data

500 images per character (100 images per font per character * 5 fonts). Each image is rotated by a random angle between -PI/8 and +PI/8.
100 background images picked at random from a set of images known to have no Chinese characters.
100 background 'hard negative' images saved from the bootstrap step.
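
The positive-sample scheme above can be sketched roughly as follows. This is only an illustration, not the code used for these experiments: the font names are placeholders (the five fonts are not listed here), `sample_plan` is a hypothetical helper, the rotation is assumed to be drawn uniformly, and rendering each character onto an image is left out.

```python
import math
import random

# Placeholder font names; the five fonts actually used are not listed in this post.
FONTS = ["font_1", "font_2", "font_3", "font_4", "font_5"]

def sample_plan(characters, fonts=FONTS, per_font=100, max_angle=math.pi / 8, seed=0):
    """Return one (character, font, rotation_in_radians) tuple per training image."""
    rng = random.Random(seed)
    plan = []
    for ch in characters:
        for font in fonts:
            for _ in range(per_font):
                # Assumption: rotation is uniform over [-PI/8, +PI/8].
                plan.append((ch, font, rng.uniform(-max_angle, max_angle)))
    return plan

plan = sample_plan("向前一小步文明大英发服饰")
```

With 5 fonts and 100 samples per font, each character gets 500 entries in the plan, matching the count above.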

Set1: characters = { 向前一小步文明大英发服饰 }
Set1-yi4: characters = { 向前小步文明大英发服饰 } (Set1 without the character 一)

Image Set

Ground Truth (character, count)

left image	right image
英,1 发,1 服,1 饰,1	前,1 一,2 小,1 步,2 文,1 明,1 大,1

Results (with training data Set1-yi4, i.e. Set1 without 一)

Observations of these results include:

Multiple detections of patches that appear fairly flat and uniform to the naked eye.
All true characters in the right image are detected (along with a number of false detections), but only 1 of the 4 true characters in the left image is detected.
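
The detection rates quoted above can be expressed as recall against the ground-truth counts. The sketch below is a hypothetical helper rather than the evaluation code behind these results, and since it is not recorded here which left-image character was found, the detected character is an assumption made purely for illustration.

```python
from collections import Counter

def recall(ground_truth, detected):
    """Fraction of ground-truth character instances matched by detections."""
    gt = Counter(ground_truth)
    found = gt & Counter(detected)  # per-character minimum of the two counts
    return sum(found.values()) / sum(gt.values())

left_gt = {"英": 1, "发": 1, "服": 1, "饰": 1}
right_gt = {"前": 1, "一": 2, "小": 1, "步": 2, "文": 1, "明": 1, "大": 1}

# Hypothetical detection: the post does not say which left-image character was found.
left_detected = {"英": 1}

print(recall(left_gt, left_detected))  # 0.25
print(recall(right_gt, right_gt))      # 1.0
```

Recall ignores false detections, so it should be read alongside a count of the spurious patches.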

Results (with training data Set1, which includes 一)

Observations of these results include:

The results are overwhelmed by detections once 一 is included in the training set. It is a very hard class because it closely resembles any horizontal edge.
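
A toy illustration of why 一 is so easily confused with horizontal edges. The features used by the detector are not described here, so this sketch assumes a simple stand-in: the share of intensity change running down the patch versus across it. The patches and helper names are hypothetical.

```python
def edge_energy(patch):
    """Return (row_to_row, column_to_column) summed absolute differences."""
    rows, cols = len(patch), len(patch[0])
    vertical_change = sum(abs(patch[r + 1][c] - patch[r][c])
                          for r in range(rows - 1) for c in range(cols))
    horizontal_change = sum(abs(patch[r][c + 1] - patch[r][c])
                            for r in range(rows) for c in range(cols - 1))
    return vertical_change, horizontal_change

def normalized(energy):
    """Scale the two energies so they sum to 1."""
    total = sum(energy)
    return tuple(e / total for e in energy)

size = 8
# Idealized 一: a two-pixel-thick stroke spanning the full patch width.
stroke = [[1 if 3 <= r < 5 else 0 for _ in range(size)] for r in range(size)]
# Plain horizontal edge: lower half of the patch set to 1, upper half to 0.
edge = [[1 if r >= 4 else 0 for _ in range(size)] for r in range(size)]

print(normalized(edge_energy(stroke)))  # (1.0, 0.0)
print(normalized(edge_energy(edge)))    # (1.0, 0.0)
```

Under this simplified feature the stroke and the plain edge are indistinguishable, which is consistent with the flood of detections seen when 一 is in the training set.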

Wednesday, May 9, 2012
