in lttoolbox/fst_processor.cc:763 and on we have
    FSTProcessor::compoundAnalysis(wstring input_word, bool uppercase, bool firstupper) {
      const int MAX_COMBINATIONS = 500;
      …
      if(current_state.size() > MAX_COMBINATIONS) {
Was this limit picked out of the air, or was it well tested? Computers are presumably faster now than when this was first written, so it may be ripe for increasing. We should compare time and memory usage with e.g. 500 vs 1000 vs 2000 vs 4000 on a large corpus and with several different large analysers (nob, deu, others?), and possibly increase the limit to a value that doesn't hurt too much.
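Something like the following could serve as a test harness. This is only an untested sketch: since MAX_COMBINATIONS is a compile-time constant, each value means patching fst_processor.cc and rebuilding lt-proc, and dewiki.txt / deu.automorf.bin are placeholder names for whatever corpus and compiled analyser you actually use.

    # Sketch only: run from the top of a checked-out lttoolbox tree.
    for n in 500 1000 2000 4000; do
        # patch the constant and rebuild lt-proc
        sed -i "s/MAX_COMBINATIONS = [0-9]*/MAX_COMBINATIONS = $n/" lttoolbox/fst_processor.cc
        make
        # wall-clock time and peak memory ("Maximum resident set size" in GNU time's -v output);
        # adjust the lt-proc path to wherever your build puts the binary
        /usr/bin/time -v lttoolbox/lt-proc -e deu.automorf.bin < dewiki.txt > /dev/null
    done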
I looked through the source code, and what I understood is that this function (compoundAnalysis) is called when lt-proc -e is invoked. So what this task requires us to do is run lt-comp on a standard dictionary and lt-proc on a large corpus (cat large_corpus | lt-proc -e bin_file_generated.bin), and compare the time and memory usage for different values of the constant MAX_COMBINATIONS. I don't understand how analysers play a role. Can you please elaborate on that?

Last edit: Venkat Parthasarathy 2017-03-13
Your file_generated.bin is a morphological analyser compiled as a finite state transducer. They're typically named things like deu.automorf.bin (for apertium-deu).

So, we consider large analysers by compiling those dictionaries (like apertium-deu.deu.dix or apertium-nob.nob.dix)?

apertium-get apertium-deu (or -nob) will give you that (if you have apertium-all-dev installed).

Last edit: Kevin Brubeck Unhammer 2017-03-13
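Concretely, something along these lines should give you a large analyser to test against. Untested sketch: it assumes apertium-get and the lttoolbox tools are on your PATH, and reuses the file names mentioned above.

    # fetch and build the German language data (needs apertium-get / apertium-all-dev)
    apertium-get apertium-deu
    cd apertium-deu
    # the package build normally produces deu.automorf.bin already, but it can
    # also be compiled by hand from the monolingual dictionary:
    lt-comp lr apertium-deu.deu.dix deu.automorf.bin
    # quick sanity check with compound analysis enabled
    echo "Hausschuhe" | lt-proc -e deu.automorf.bin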
I routinely run dewiki through apertium-deu, if you need something to test with.
I did some unscientific testing with apertium-deu & dewiki, and it seems to me that MAX_COMBINATIONS is not a bottleneck regardless of how high it is set.
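One quick way to check that claim would be to diff the output of two builds with different limits. In this sketch, lt-proc-500 and lt-proc-4000 are placeholder names for lt-proc binaries compiled with MAX_COMBINATIONS set to 500 and 4000 respectively.

    # if the outputs are identical, raising the limit made no difference on this corpus
    lt-proc-500  -e deu.automorf.bin < dewiki.txt > out.500
    lt-proc-4000 -e deu.automorf.bin < dewiki.txt > out.4000
    diff -q out.500 out.4000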
Fixed?
It seems to have been scientifically tested.