From: Jacob N. <jac...@gm...> - 2012-12-23 02:53:15
During Google Summer of Code we had two great students working hard to bring Apertium to the mobile world:

- Mikel Artetxe did a great job making lttoolbox-java embeddable, spinning off a lot of good stuff, such as Apertium-Caffeine <http://wiki.apertium.org/wiki/Apertium-Caffeine> and a one-click installer for trying out language pairs <http://wiki.apertium.org/wiki/Language_pair_packages#List_of_ready-to-use_packages>
- Arink Verma wrote an Apertium Android app <http://www.arinkverma.in/2012/08/summer-of-2012-with-google.html> using lttoolbox-java, which allows the user to do offline, in-phone translation using Apertium

As could and should be expected after any GSoC, there was work left to do. In this case the app suffered from memory constraints: on Android an app may be allowed as little as 16 MB of RAM, and this had to be handled before the app would work properly across all devices. Those of you who tested the previous versions can confirm that.

After the GSoC I have continued optimizing, simplifying and unifying the code, and I am proud to present a fresh version of lttoolbox-java *that can run on a very tiny amount of RAM* and *which is faster than ever*, deployed in the world's first open source, offline, embedded in-device machine translation system, as an Android app.

Download and install it from here: https://apertium.svn.sourceforge.net/svnroot/apertium/builds/apertium-android/

The app has been tested and runs on Android 1.6 and later. I would like to ask everyone with an Android phone to install this app and judge whether anything needs to be done before we publish it. (You have to enable 'install from unknown sources' first and uninstall previous versions.)

Please note that the first translation run is slow, as it needs to index files and optimize transfer bytecode.

How was the lttoolbox-java 'memory magic' done? Well, it's called memory mapping.
Apertium mainly consists of transducers, which are big chunks of memory holding a binary representation of the dictionary files. lttoolbox and the old lttoolbox-java do the 'usual' thing: load all these transducers at startup, pre-build data structures in memory representing the whole file, and keep them in memory during processing.

The new lttoolbox-java indexes the files in the first run and saves these transducer index cache files for later re-use. After the first run it only loads from disk the parts of the transducers that it needs. This brings loading time down to near zero (lttoolbox-java is much faster at loading transducers than lttoolbox).

Loading is done by *memory-mapping*, which is a sophisticated way of doing random file access: you get a chunk of memory which represents the file, and each time you read bytes the operating system makes sure that this part of the file has been loaded into memory, and frees the memory afterwards when it is no longer needed.

Basically, instead of prebuilding the data structures, I just look in the files on an as-needed basis. This is much faster, as translation tasks only use a tiny fraction of the transducers.

The source is in SVN, and you can get an impression of the work by looking there:

lttoolbox-java: http://apertium.svn.sourceforge.net/viewvc/apertium/trunk/lttoolbox-java/src/?view=log
Android app: http://apertium.svn.sourceforge.net/viewvc/apertium/trunk/apertium-mobile/apertium-android/src/?view=log

I've done a lot of work on simplifying the app to make it robust and easy to adopt. In the directory http://apertium.svn.sourceforge.net/viewvc/apertium/trunk/apertium-mobile/apertium-android/src/org/apertium/android/?pathrev=42121 you can see my unification and simplification. In the package 'extended' is Arink's work (modified), which includes some extended functions, such as a widget, SMS translator, database and file manager.
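
For readers unfamiliar with memory mapping, here is a minimal Java sketch of the general idea described above. It is an illustration only, not the actual lttoolbox-java code: the class name MappedRecordFile, the helper writeSample and the file layout (a flat array of 4-byte integers) are all made up for this example. It maps a file once and then reads single records on demand, leaving it to the operating system to page in only the parts that are touched.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;

/** Hypothetical example class; it is not part of lttoolbox-java. */
class MappedRecordFile {
    // Assumed layout for this sketch: the file is a flat array of 4-byte ints.
    private static final int RECORD_SIZE = 4;
    private final MappedByteBuffer buffer;

    MappedRecordFile(File file) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(file, "r")) {
            FileChannel channel = raf.getChannel();
            // Mapping does not read the file up front; pages are loaded by
            // the OS when they are first accessed. The mapping stays valid
            // after the channel is closed.
            buffer = channel.map(FileChannel.MapMode.READ_ONLY, 0, channel.size());
        }
    }

    int recordCount() {
        return buffer.capacity() / RECORD_SIZE;
    }

    /** Reads one record by absolute position, touching only that part of the file. */
    int read(int index) {
        return buffer.getInt(index * RECORD_SIZE);
    }

    /** Writes a temporary sample file whose record i holds the value i * 10. */
    static File writeSample(int count) throws IOException {
        ByteBuffer data = ByteBuffer.allocate(count * RECORD_SIZE);
        for (int i = 0; i < count; i++) {
            data.putInt(i * 10);
        }
        File file = File.createTempFile("records", ".bin");
        file.deleteOnExit();
        try (FileOutputStream out = new FileOutputStream(file)) {
            out.write(data.array());
        }
        return file;
    }

    public static void main(String[] args) throws IOException {
        MappedRecordFile records = new MappedRecordFile(writeSample(1000));
        System.out.println(records.recordCount()); // prints 1000
        System.out.println(records.read(42));      // prints 420
    }
}
```

The point of the sketch is that nothing is parsed or copied into Java objects at load time; a lookup is just an offset calculation into the mapped buffer, which is why startup cost stays low when only a small fraction of the data is used.
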
I'd like to ask you what you think about permissions and functionality: should the Apertium app we publish be a basic app, simple for others to adopt and requiring no permissions (apart from internet permission to download pairs), or a full-blown app able to access your SMS messages and SD card?

Currently the app requires permissions to read your SMS messages and the data on your SD card. SD card access would be nice for manual installation of self-compiled language pairs (but would you really deploy unpublished work on your phone?) and for storing language pairs (each one takes ~10 MB uncompressed, but how many would a user install?).

Yours,
Jacob

--
Jacob Nordfalk <http://profiles.google.com/jacob.nordfalk>
javabog.dk
Android developer and teacher at IHK <http://cv.ihk.dk/diplomuddannelser/itd/vf/MAU> and Lund&Bendsen <https://www.lundogbendsen.dk/undervisning/beskrivelse/LB1809/>