Thread: [ccextractor-users] ocr on extracted subtitle from dvb subtitle
|
From: Anshul <er....@gm...> - 2014-06-03 08:12:15
|
Hi, I was looking into doing OCR on subtitle pixel blocks. I went through the Tesseract library and found that this library alone takes 4 minutes 25 seconds to compile and 5 seconds to configure. It has 295 .cpp files and 277 header files. I just wanted to know: do we want to merge such a large source tree into CCExtractor, or can we treat this library as an exception and keep precompiled Tesseract libraries for Windows and Linux? Thanks, Anshul |
|
From: Carlos F. <cf...@gm...> - 2014-06-03 09:02:52
|
I think we can keep it as an exception as long as it's not a mandatory library, i.e. compilation (and execution) cannot fail if it's not present; obviously the OCR functionality just won't be available in that case. |
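The "optional, not mandatory" requirement could be met with a compile-time guard. The following is only a minimal sketch, not code from CCExtractor: the `ENABLE_OCR` macro and the `ocr_compiled_in` and `ocr_bitmap` helpers are hypothetical names chosen for illustration, the `"eng"` language is an assumption, and the guarded branch assumes the Tesseract C API header (`tesseract/capi.h`) is installed. Without `-DENABLE_OCR` the file builds with no Tesseract dependency at all and the OCR call simply reports that it is unavailable.

```c
#include <stddef.h>
#include <stdio.h>

/* ENABLE_OCR is a hypothetical build flag (e.g. -DENABLE_OCR). */
#ifdef ENABLE_OCR
#include <tesseract/capi.h>
#endif

/* Returns 1 if this binary was built with OCR support, 0 otherwise. */
int ocr_compiled_in(void)
{
#ifdef ENABLE_OCR
    return 1;
#else
    return 0;
#endif
}

/* Hypothetical helper: recognise text in an 8-bit grayscale bitmap and
 * copy it into `out`. Returns 0 on success, -1 if OCR is unavailable
 * or recognition failed. It never aborts the program. */
int ocr_bitmap(const unsigned char *pixels, int width, int height,
               char *out, size_t out_len)
{
    if (out != NULL && out_len > 0)
        out[0] = '\0';
#ifdef ENABLE_OCR
    if (pixels == NULL || out == NULL || out_len == 0)
        return -1;
    TessBaseAPI *api = TessBaseAPICreate();
    if (api == NULL)
        return -1;
    /* NULL data path: let Tesseract use its default tessdata location. */
    if (TessBaseAPIInit3(api, NULL, "eng") != 0) {
        TessBaseAPIDelete(api);
        return -1;
    }
    /* 1 byte per pixel, `width` bytes per line. */
    TessBaseAPISetImage(api, pixels, width, height, 1, width);
    char *text = TessBaseAPIGetUTF8Text(api);
    int rc = -1;
    if (text != NULL) {
        snprintf(out, out_len, "%s", text);
        TessDeleteText(text);
        rc = 0;
    }
    TessBaseAPIEnd(api);
    TessBaseAPIDelete(api);
    return rc;
#else
    /* Built without OCR: parameters are unused, report "not available". */
    (void)pixels;
    (void)width;
    (void)height;
    return -1;
#endif
}
```

A caller would check the return value and fall back to whatever it did before (for example, writing the subtitle bitmap out unchanged) when `ocr_bitmap` returns -1, so the same calling code works in both build variants.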
|
From: anshul <ans...@gm...> - 2014-06-03 14:02:04
|
Hi, should I put the binary library and the OCR training data in git, or should I put them on Google Drive and download the binaries only if the user opts in to this functionality? Thanks, Anshul |
|
From: Carlos F. <cf...@gm...> - 2014-06-04 08:25:51
|
I'd just write a docs/OCR.TXT file with instructions on how to build with OCR functionality. Requirements are that all versions must build with and without that library and that if the library is not present CCExtractor can't crash. |
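The "can't crash" requirement also has a run-time side: even a binary built with OCR support may be started on a machine where the training data was never downloaded. One possible way to handle that, sketched here under assumptions, is to probe for the data file at startup and disable OCR with a warning instead of failing later. The function names and the example path are hypothetical and not taken from CCExtractor.

```c
#include <stdio.h>

/* Returns 1 if the file at `path` can be opened for reading, 0 otherwise. */
int file_readable(const char *path)
{
    if (path == NULL)
        return 0;
    FILE *f = fopen(path, "rb");
    if (f == NULL)
        return 0;
    fclose(f);
    return 1;
}

/* Decide at startup whether OCR can be offered.
 * `traineddata_path` is a hypothetical location such as
 * "tessdata/eng.traineddata". Returns 1 if usable, 0 otherwise;
 * on 0 a warning is printed and the caller continues without OCR. */
int ocr_usable(const char *traineddata_path)
{
    if (!file_readable(traineddata_path)) {
        fprintf(stderr, "OCR disabled: training data not found at %s\n",
                traineddata_path != NULL ? traineddata_path : "(null)");
        return 0;
    }
    return 1;
}
```

The design choice is that a missing file downgrades the feature rather than terminating the program, which matches the stated rule that the tool must keep working when the optional pieces are absent.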
|
From: anshul <ans...@gm...> - 2014-06-04 08:44:42
|
Where should I download OCR.TXT from? I checked your main repository, and it is not there. -Anshul |
|
From: anshul <ans...@gm...> - 2014-06-03 09:13:49
|
The original message was posted from a non-subscribed ID. |