Block types
Besides finding each block, you should try to recognize what kind
of information that block carries. This will make the work of subsequent
modules much easier, and will improve the speed of the processing.
GOCR automatically defines three types of blocks:
  Block type
  ---------------
  TEXT
  PICTURE
  MATH_EXPRESSION
but you can define new types, as explained below. The default
is TEXT.
The block types are objects, which all derive from a common parent,
gocrBlock. This allows any module to access the block, regardless
of its type. This is what allows you to create new block types on
the fly. To do that, you must first define the struct of
your new block type, which must be in the following format:
    struct newblocktype {
      gocrBlock b;
      /* other fields */
    };
It's absolutely necessary that the first field of your structure be
gocrBlock b. This is what allows your structure to be cast to
a simple gocrBlock. (If you are wondering why the hell I didn't
use C++ instead of C, these are the reasons: it's easier to use C
from C++ than the opposite; I have much more experience with C than
C++; there are several people who program in C but not in C++; the
use of C as an OO language, although slightly obfuscated, has proven
to be possible and has been used in successful projects, such as GTK;
C++ name mangling makes it more difficult to write modules, and is not
supported yet by libtool.)
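The first-field rule can be illustrated with a minimal sketch. The gocrBlock
defined here is only a stand-in with made-up fields (the real one comes from
the GOCR headers), and tableBlock, table_as_block, block_as_table and
block_width are hypothetical names used for illustration only:

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for GOCR's gocrBlock. The real structure is defined by the
 * GOCR headers; these fields are assumptions made for this sketch. */
typedef struct {
  int type;            /* hypothetical: block type id */
  int x0, y0, x1, y1;  /* hypothetical: bounding box */
} gocrBlock;

/* A new block type. gocrBlock b must be the first field. */
struct tableBlock {
  gocrBlock b;
  int rows;
  int cols;
};

/* Upcast: a pointer to a struct also points to its first member,
 * so any tableBlock can be handed to code expecting a gocrBlock. */
static gocrBlock *table_as_block(struct tableBlock *t) {
  return &t->b;
}

/* Downcast: only valid when the block really is a tableBlock, so the
 * type id is checked first. */
static struct tableBlock *block_as_table(gocrBlock *b, int table_type) {
  assert(b != NULL && b->type == table_type);
  return (struct tableBlock *)b;
}

/* Generic code that knows nothing about tableBlock. */
static int block_width(const gocrBlock *b) {
  return b->x1 - b->x0 + 1;
}
```

A module that only understands gocrBlock can call block_width() on any block,
while a module that knows about tableBlock can recover the extra fields with
block_as_table() after checking the type id.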
You must register your block type
to make GOCR aware of its existence. To do that, use the following
function:
    blockType gocr_blockTypeRegister ( char *name );
This function takes the name of your new block type, registers
it, and returns a non-negative number, which is the block type id,
or -1 if some error occurred. This id should be saved, to provide
a quick way to check what the block type is. Alternatively, you can
use:
    blockType gocr_blockTypeGetByName ( char *name );
which returns the id of an already registered block type, or -1 if
none was found. Since this function is kind of slow, as it must compare
the given string to the name of every registered block type, it's a
good idea to save the id in a variable. Last, a convenience:
    const char *gocr_blockTypeGetNameByType ( gocrblockType t );
given the block type, returns its name. Do not free this string.
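To make the "register once, save the id" advice concrete, here is a small
self-contained sketch. It does not call GOCR: the demo_ functions are
hypothetical stand-ins that only mimic the behaviour described above
(non-negative id on success, -1 on error, lookup by string comparison), and
the blockType typedef, the fixed table size and the rejection of duplicate
names are assumptions of this sketch, not statements about GOCR internals:

```c
#include <assert.h>
#include <string.h>

#define DEMO_MAX_TYPES 16  /* assumption: small fixed table */

typedef int blockType;     /* assumption: ids are plain ints, -1 is an error */

static const char *demo_names[DEMO_MAX_TYPES];
static int demo_count = 0;

/* Lookup by name: compares against every registered name, which is
 * why this kind of call is slow compared to using a saved id. */
static blockType demo_blockTypeGetByName(const char *name) {
  int i;
  if (name == NULL) return -1;
  for (i = 0; i < demo_count; i++)
    if (strcmp(demo_names[i], name) == 0)
      return i;
  return -1;
}

/* Register a new name; returns its id, or -1 on error. */
static blockType demo_blockTypeRegister(const char *name) {
  if (name == NULL || demo_count >= DEMO_MAX_TYPES) return -1;
  if (demo_blockTypeGetByName(name) != -1) return -1;  /* assumption */
  demo_names[demo_count] = name;  /* the registry owns this pointer */
  return demo_count++;
}

/* Reverse lookup: the returned string belongs to the registry,
 * so the caller must not free it. */
static const char *demo_blockTypeGetNameByType(blockType t) {
  if (t < 0 || t >= demo_count) return NULL;
  return demo_names[t];
}

/* Module-side pattern: register once and keep the id in a variable. */
static blockType table_type = -1;

static int demo_module_init(void) {
  assert(table_type == -1);  /* initialising twice is a programming error */
  table_type = demo_blockTypeRegister("TABLE");
  return table_type == -1 ? -1 : 0;
}
```

After demo_module_init() succeeds, checking a block's type is a plain integer
comparison against table_type, with no further string lookups needed.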
root
2002-02-17