SDC Morphology Toolbox V1.1 15Jan02

mmdlabeltext - Segmenting letters, words and paragraphs.


Description
In this example, a digitized text is processed to identify the letters, words and paragraphs. This demonstration uses only the mmlabel function with different connectivity parameters.
See Also
mmlabel - Label a binary image.

Reading

The text image is read.

f = mmreadgray('stext.tif');
mmshow(f);
[Image: f]

First, label the letters.

The letters are the main connected components in the image, so we use the classical 8-connectivity criterion to identify each letter. The structuring element returned by mmsebox, a 3x3 box, specifies this connectivity.

fl=mmlabel(f,mmsebox);
mmlblshow(fl);
[Image: fl]

Second, label the words.

Words are groups of letters that lie close together. Here the connectivity is specified by a rectangular structuring element 7 pixels high and 11 pixels wide, so any two pixels that can be hit by this rectangle belong to the same connected component. The values 7 and 11 were chosen experimentally and depend on the font size.

sew = mmimg2se(logical(uint8(ones(7,11))));
mmseshow(sew)

ans =
     1     1     1     1     1     1     1     1     1     1     1
     1     1     1     1     1     1     1     1     1     1     1
     1     1     1     1     1     1     1     1     1     1     1
     1     1     1     1     1     1     1     1     1     1     1
     1     1     1     1     1     1     1     1     1     1     1
     1     1     1     1     1     1     1     1     1     1     1
     1     1     1     1     1     1     1     1     1     1     1
fw=mmlabel(f,sew);
mmlblshow(fw);
[Image: fw]
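
To make the idea of connectivity defined by a structuring element more concrete, here is a minimal sketch in plain MATLAB that does not use the toolbox. It is not how mmlabel is implemented. It assumes that two foreground pixels are neighbours when one falls inside the rectangle centred on the other, and it uses a hypothetical toy image in which single pixels stand in for letters.

```matlab
% Toy binary image: three "letters" (single pixels here) on one row.
% The first two are 3 columns apart; the third is 9 columns further on.
f = false(5, 20);
f(3, 2)  = true;
f(3, 5)  = true;
f(3, 14) = true;

% Rectangle half-sizes [half height, half width]:
% [1 1] is the 3x3 box (8-connectivity), [3 5] is the 7x11 rectangle.
halfsizes = [1 1; 3 5];
counts = zeros(1, size(halfsizes, 1));

for k = 1:size(halfsizes, 1)
    hr = halfsizes(k, 1);
    hc = halfsizes(k, 2);
    lbl = zeros(size(f));            % label image, 0 = background
    n = 0;                           % components found so far
    [rows, cols] = find(f);
    for i = 1:numel(rows)
        if lbl(rows(i), cols(i)) ~= 0
            continue;                % already part of a component
        end
        n = n + 1;
        lbl(rows(i), cols(i)) = n;
        stack = [rows(i), cols(i)];  % pixels whose neighbours are unvisited
        while ~isempty(stack)
            r = stack(end, 1);
            c = stack(end, 2);
            stack(end, :) = [];
            % Rectangle centred on (r, c), clipped to the image borders.
            r1 = max(r - hr, 1);  r2 = min(r + hr, size(f, 1));
            c1 = max(c - hc, 1);  c2 = min(c + hc, size(f, 2));
            % Unlabelled foreground pixels inside the rectangle join component n.
            [wr, wc] = find(f(r1:r2, c1:c2) & lbl(r1:r2, c1:c2) == 0);
            for j = 1:numel(wr)
                rr = wr(j) + r1 - 1;
                cc = wc(j) + c1 - 1;
                lbl(rr, cc) = n;
                stack(end + 1, :) = [rr, cc]; %#ok<AGROW>
            end
        end
    end
    counts(k) = n;
end

fprintf('3x3 box: %d components, 7x11 rectangle: %d components\n', ...
        counts(1), counts(2));
```

With the 3x3 box the three pixels stay separate. With the 7x11 rectangle the first two merge into one "word" because they are within 5 columns of each other, while the third remains on its own, so the script reports 3 and 2 components.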

Finally, label the paragraphs.

Similarly, paragraphs are groups of words that lie close together. Here the connectivity is given by a rectangle 35 pixels wide and 20 pixels high.

sep = mmimg2se(logical(uint8(ones(20,35))));
fp=mmlabel(f,sep);
mmlblshow(fp);
[Image: fp]
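
As a possible follow-up, the three label images can be turned into counts. The sketch below repeats the steps above and assumes that mmlabel numbers the components from 1 to N and leaves the background at 0, so that the largest label equals the number of components. The variable names nletters, nwords and nparagraphs are introduced here for illustration only.

```matlab
% Assumes the SDC Morphology Toolbox is on the path and stext.tif is available.
f   = mmreadgray('stext.tif');
sew = mmimg2se(logical(uint8(ones(7,11))));    % connectivity for words
sep = mmimg2se(logical(uint8(ones(20,35))));   % connectivity for paragraphs

fl = mmlabel(f, mmsebox);   % letters (8-connectivity)
fw = mmlabel(f, sew);       % words
fp = mmlabel(f, sep);       % paragraphs

% Assumption: labels run from 1 to N with 0 for the background,
% so the maximum label is the number of connected components.
nletters    = double(max(fl(:)));
nwords      = double(max(fw(:)));
nparagraphs = double(max(fp(:)));

fprintf('letters: %d, words: %d, paragraphs: %d\n', ...
        nletters, nwords, nparagraphs);
```

Note that the letter count includes every 8-connected component, so detached parts such as the dot of an "i" or punctuation marks are counted separately.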

Copyright (c) 1998-2002 by SDC Information Systems