Yes, I have got the documents from szhorvat's topic. But it is just a list of functions and members built by doxygen. It is not what I want. Who can tell me where I can find detailed documents that explain the algorithm for recognizing a single character, how the page is segmented, and so on? Tesseract is so complex. OH MY GOD, I am madding.
Were you able to get v1.02 to work? On what OS?
Thanks,
Tom
Well, you will prefer the second version of the doxygen docs. It's not only a lot better but it also provides links to all the comments in the source that reveal a lot of the "madding" complexity. There's also a start of a glossary of OCR terms as used in Tess heuristics as well as a table of the heuristics themselves.
To tell the truth, a lot of what you call "complexity" is really the noble and conscious effort of the original authors to trade off some of what we now call "standards" for performance and reusability.
BTW, I am really tempted to pre-process the code ('unroll' the macros, so to speak) and only then doxify it. The macros may have saved typing, source-file size and compilation time back in '87, but that's too high a price to pay now (I wrote a BASIC variant with built-in TCP/IP commands in '93 - what a pain to compile on a 386/25 - i.e. been there, done that :-)
Cheers
Yes, Tom, I have got version 1.02 to work. My OS is WinXP SP2, and I compiled the source code with VC 6.0 SP6 (using the old SDK, not the latest one: it would not build with the latest SDK). There are so many warnings :-)
Filip, thanks for your help. Where can I find the "second version of the doxygen docs"? The latest one I downloaded is "tess_docs01.tar.tar", dated 2006-10-27. Is that the one?