Thread: [ccextractor-users] ocr on extracted subtitle from dvb subtitle

[ccextractor-users] ocr on extracted subtitle from dvb subtitle

From: Anshul <er....@gm...> - 2014-06-03 08:12:15

Hi

I am looking into doing OCR on subtitle pixel blocks.
I went through the Tesseract library and found that this library
alone takes 4 minutes 25 seconds to compile and 5 seconds to configure.
It has 295 .cpp files and 277 header files.

I wanted to know: do we want to merge such a large code base into
CCExtractor, or can we treat this library as an exception and keep
precompiled Tesseract libraries for Windows and Linux?


Thanks
Anshul

Re: [ccextractor-users] ocr on extracted subtitle from dvb subtitle

From: Carlos F. <cf...@gm...> - 2014-06-03 09:02:52

I think we can keep it as an exception as long as it's not a mandatory
library, i.e. compilation (and execution) must not fail if it's not
present; in that case the functionality is simply unavailable.

> ------------------------------------------------------------------------------
> Learn Graph Databases - Download FREE O'Reilly Book
> "Graph Databases" is the definitive new guide to graph databases and their
> applications. Written by three acclaimed leaders in the field,
> this first edition is now available. Download your free book today!
> http://p.sf.net/sfu/NeoTech
> _______________________________________________
> ccextractor-users mailing list
> cce...@li...
> https://lists.sourceforge.net/lists/listinfo/ccextractor-users

Re: [ccextractor-users] ocr on extracted subtitle from dvb subtitle

From: anshul <ans...@gm...> - 2014-06-03 14:02:04

Hi

Should I put the binary library and the OCR training data in git,
or should I put them on Google Drive and download the binaries only if
the user opts for this functionality?

Thanks
Anshul

Re: [ccextractor-users] ocr on extracted subtitle from dvb subtitle

From: Carlos F. <cf...@gm...> - 2014-06-04 08:25:51

I'd just write a docs/OCR.TXT file with instructions on how to build
with OCR functionality.

The requirements are that all versions must build both with and without
that library, and that CCExtractor must not crash if the library is not
present.
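
A minimal sketch of how that requirement might look in C. The
ENABLE_OCR build flag and the helper names ocr_available and ocr_bitmap
are hypothetical and do not come from the CCExtractor sources. The
Tesseract calls are taken from its C API header (tesseract/capi.h); the
"eng" language and the 8-bit greyscale pixel layout are assumptions made
only for illustration.

```c
#include <stdio.h>
#include <stddef.h>

/* ENABLE_OCR is a hypothetical flag, e.g. passed as -DENABLE_OCR when
 * Tesseract is installed. Without it, no Tesseract header is needed. */
#ifdef ENABLE_OCR
#include <tesseract/capi.h>
#endif

/* Returns 1 when OCR support was compiled in, 0 otherwise. */
int ocr_available(void)
{
#ifdef ENABLE_OCR
    return 1;
#else
    return 0;
#endif
}

/* Hypothetical helper: recognise text in an 8-bit greyscale bitmap.
 * Returns 0 on success and -1 on failure or when OCR is not compiled
 * in. On failure, out is left as an empty string so callers never read
 * uninitialised data. */
int ocr_bitmap(const unsigned char *pixels, int width, int height,
               char *out, size_t out_len)
{
    if (out == NULL || out_len == 0)
        return -1;
    out[0] = '\0';

#ifdef ENABLE_OCR
    if (pixels == NULL || width <= 0 || height <= 0)
        return -1;

    TessBaseAPI *api = TessBaseAPICreate();
    if (api == NULL)
        return -1;

    /* NULL datapath lets Tesseract use its default training data location. */
    if (TessBaseAPIInit3(api, NULL, "eng") != 0) {
        TessBaseAPIDelete(api);
        return -1;
    }

    /* 1 byte per pixel, one row is `width` bytes long. */
    TessBaseAPISetImage(api, pixels, width, height, 1, width);

    int rc = -1;
    char *text = TessBaseAPIGetUTF8Text(api);
    if (text != NULL) {
        snprintf(out, out_len, "%s", text);
        TessDeleteText(text);
        rc = 0;
    }

    TessBaseAPIEnd(api);
    TessBaseAPIDelete(api);
    return rc;
#else
    /* Built without the library: report "unavailable" instead of failing. */
    (void)pixels;
    (void)width;
    (void)height;
    return -1;
#endif
}
```

With this layout the same source file compiles whether or not the
library is installed, and a caller can check ocr_available() up front
and skip the OCR step rather than crash.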


Re: [ccextractor-users] ocr on extracted subtitle from dvb subtitle

From: anshul <ans...@gm...> - 2014-06-04 08:44:42

Where should I download OCR.TXT from?
I checked your main repository and it is not there.

-Anshul

Re: [ccextractor-users] ocr on extracted subtitle from dvb subtitle

From: anshul <ans...@gm...> - 2014-06-03 09:13:49

The original message was posted from a non-subscribed address.
