presage-devel Mailing List for Presage
the intelligent predictive text entry platform
Status: Beta
Brought to you by:
matteovescovi
From: rinigus <rin...@gm...> - 2018-03-14 07:13:01
Dear Matteo and Presage developers,

As part of the incorporation of Presage into the Sailfish OS keyboard started by @martonmiklos, I've got engaged in Presage development as well. To make it easier for us, I forked the Presage repo on GitHub. The forked version is available at https://github.com/sailfish-keyboard/presage and has a few new features:

* On the basis of the SQLite predictor, I wrote a predictor that uses a MARISA database together with a raw counts file to represent n-grams. This is a read-only implementation, but it is much faster than the one using SQLite (ballpark is about 10x faster) and significantly smaller for the same number of stored n-grams
* On the basis of the dictionary predictor, a Hunspell predictor has been written
* Ability to forget learned words
* Some packaging scripts for Sailfish added in packaging

All changes can be viewed at https://github.com/rinigus/presage/compare/upstream...master .

We are planning to add Unicode support to Presage in the future, so string normalization would be done via Unicode normalization, not lower-casing as supported by Presage right now. This should also help with the tokenization.

It looks to me that Presage development has slowed down. However, I wonder whether our changes would be of interest upstream and whether you would like to incorporate them?

Best wishes,
Rinigus
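The difference between plain lower-casing and Unicode normalization mentioned above can be illustrated with Python's standard library (a generic sketch, not Presage code): `str.lower()` neither unifies composed/decomposed spellings of the same word nor applies Unicode case folds, while `unicodedata.normalize` plus `casefold` handles both.

```python
import unicodedata

def normalize(token: str) -> str:
    """Normalize a token for language-model lookup.

    NFKC maps canonically equivalent sequences to one form;
    casefold() is a more aggressive, Unicode-aware lower().
    Illustrative only -- not Presage's actual API.
    """
    return unicodedata.normalize("NFKC", token).casefold()

# Two spellings of "café" that lower() leaves distinct:
composed = "Caf\u00e9"      # 'é' as a single code point
decomposed = "Cafe\u0301"   # 'e' + combining acute accent
assert composed.lower() != decomposed.lower()
assert normalize(composed) == normalize(decomposed)

# casefold() also applies folds lower() misses, e.g. German sharp s:
assert normalize("STRASSE") == normalize("Straße")
```

A predictor that normalizes both its stored n-grams and the current prefix this way would treat all of these variants as the same token.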
From: HAYASHI K. <ke...@gm...> - 2016-07-07 01:31:37
Hi,

On the Debian BTS, an FTBFS (fails to build from source) with GCC 6 caused by a narrowing conversion was reported by Martin Michlmayr. Here is the issue:

https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=811758

A patch file that solves the above issue is attached there:

https://bugs.debian.org/cgi-bin/bugreport.cgi?att=1;bug=811758;filename=fix-bug-811758-gcc6.patch;msg=12

I'm not familiar with presage internals, but please consider merging the above patch, or refining it before merging.

Regards,
-- Kentaro Hayashi <ke...@gm...>
From: Matteo V. <mat...@ya...> - 2015-12-15 14:49:29
Hi Moshe,

presage can be used in traditional Windows .NET applications. Take a look at the bindings/csharp/ directory in the source repository, which contains the .NET bindings that allow presage to be accessed from any CLR language, including C#. This directory also contains the implementation of a presage WCF service and a simple presage_csharp_demo application that uses the .NET binding.

I must admit I don't know much about Windows Universal Applications. How did you build presage? presage is written in C++ and provides a number of language bindings, including the .NET binding that you would use to call into presage from a C# application. I'm guessing that to get this to work in a UWP application you would need a native build of presage for the ARM architecture you are targeting.

presage has been built and is known to work on ARM. However, the only ARM builds I am aware of were on Debian using the GCC C++ compiler, not on Windows.

Cheers,
- Matteo

On Tuesday, 15 December 2015, 14:28, Moshe Hoori <moshe.hoori@algo.team> wrote:

> Hi, My name is Moshe, and I'm taking part in a social initiative for ALS patients. I'm trying to use presage on the Windows 10 (phone/tablet) platform (x86/ARM), and I have some questions:
>
> 1. When calling the DLL from a Universal Windows C# App, I get rc=7 (thrown exception) from the presage_new function in the DLL. What does this mean? I read the C++ code and I think it has something to do with the predictor initialization.
> 2. Was presage ever compiled for ARM? Do you expect to build for ARM?
>
> Thank you very very much,
> Moshe Hoori
From: Moshe H. <mos...@al...> - 2015-11-30 22:03:18
Hi,

My name is Moshe, and I'm taking part in a social initiative for ALS patients. I'm trying to use presage on the Windows 10 (phone/tablet) platform (x86/ARM), and I have some questions:

1. When calling the DLL from a *Universal Windows* C# App, I get rc=7 (thrown exception) from the presage_new function in the DLL. What does this mean? I read the C++ code and I think it has something to do with the predictor initialization.
2. Was presage ever compiled for ARM? Do you expect to build for ARM?

Thank you *very very much*,
Moshe Hoori
From: Matteo V. <mat...@ya...> - 2010-05-19 09:43:54
Hi,

marmuta wrote:

>> I think there is scope to join forces between presage and onboard.
>>
>> presage is architected to merge predictions generated by a set of predictors. Each predictor uses a different language model/predictive algorithm to generate predictions.
>>
>> Currently presage provides the following predictors:
>>
>> - ARPA predictor: statistical language modelling data in the ARPA N-gram format
>> - generalized smoothed n-gram statistical predictor: can work with n-grams of arbitrary cardinality
>> - recency predictor: based on the recency promotion principle
>> - dictionary predictor: generates a prediction by returning tokens that are a completion of the current prefix, in alphabetical order
>> - abbreviation expansion predictor: maps the current prefix to a token and returns the token in a prediction with a 1.0 probability
>> - dejavu predictor: learns and then later reproduces previously seen text sequences
>>
>> A bit more information on how these predictors work is available here: http://presage.sourceforge.net/?q=node/15
>>
>> It sounds like the language model and predictive algorithm used in the onboard word-prediction branch is an ideal candidate to be integrated into presage and become a new presage predictor class.
>
> Pretty interesting stuff, but from looking over its feature list I'm wondering what presage would gain. There doesn't seem to be much onboard's prediction could add that isn't implemented already.
>
> Roughly compared, gpredict (name is subject to change) covers these presage components:
>
> - generalized smoothed n-gram statistical predictor
> - recency predictor (with exponential falloff)
> - dictionary predictor (word completion)
> - dejavu predictor? (if it does continuous on-line learning)
>
> The main difference, apart from the general architecture, may be that gpredict uses dynamically updatable language models, handy for on-line learning.
> I'm not completely sure, but it seems presage's three n-gram predictors are based on immutable models and the dejavu predictor keeps a separate adaptable model of unigrams.

The generalized smoothed n-gram predictor does continuous on-line learning (learning can be turned on or off at runtime or via configuration). When learning is turned on, the language model is updated on the fly with new n-gram counts.

The dejavu predictor is just a toy predictor, really. I wrote it to try things out when I started implementing continuous online learning functionality, and it now serves as a simple example of how to implement a learning predictor class. Similarly, the smoothed count predictor and the 3-gram smoothed predictor are remnants from a time when I was experimenting with language models; they really are building steps towards the generalized smoothed n-gram predictor, which is currently the main statistical predictor (along with the ARPA predictor).

>> presage could then be the engine used to power the d-bus prediction service, offering the predictive capabilities of the onboard language model/predictor, plus all the predictors currently provided by presage (all of which can be turned on/off and configured to suit individual needs).
>
> The modularity could be helpful, even though I'm not sure if I could really make use of it.
>
> We were very concerned about memory usage and had initially thought about using static ARPA-compatible structures for large immutable language models, and dynamically updatable models only for on-line learning. However, later the dynamic models turned out to be almost as efficient as the ARPA implementation, and so now there are (flavors of) dynamic models for everything.
>
> Similar consolidation happened with recency caching. It was originally planned as a separate modular component. However, that would have meant redundant storage of n-grams and a forced limit to some arbitrarily small number of recent n-grams.
> So I had it integrated more closely with the generic dynamic models, gaining recency tracking across all known n-grams but sacrificing some modularity (there is still variability through inheritance, though).

If onboard's current predictive functionality was merged into presage and encapsulated in a (say, for lack of a better name) OnboardPredictor class, then presage's modularity would be useful because it would allow us to:

- replicate exactly the same predictive functionality of the current gpredict service, by switching on OnboardPredictor and turning off the other predictors
- augment OnboardPredictor's predictive functionality with other predictors currently provided by presage, as desired by onboard or the user, simply by modifying a config variable.

Presage would definitely benefit from having a new and high-quality predictor in its core.

>> The presage core library itself has minimal dependencies: it pretty much only needs a C++ runtime and sqlite, which is used as the backing store for n-gram based language models (this ensures fast access, minimum memory footprint and no delays while loading the language model in memory).
>
> That is definitely an advantage, as gpredict currently takes around 5s (@3GHz) to load the English base model with ~1.4 million n-grams. Memory usage may or may not be an issue; the D-Bus service with only English as the resident language takes around 30MB.

I trained presage's smoothed n-gram predictor language model on the text corpora currently used by gpredict to yield a language model with ~1.2 million n-grams, compared to presage's default language model, which is trained on a single text (namely The Picture of Dorian Gray), totaling about ~75000 n-grams.
The increase in prediction time and resident memory required on a control text is very small compared to the increase in n-grams:

- ~75 thousand n-grams -- prediction time: ~7 seconds, resident memory size: ~3MB
- ~1.2 million n-grams -- prediction time: ~17 seconds, resident memory size: ~5MB

This preliminary testing shows that prediction time and memory consumption do not grow linearly with the number of n-grams.

> That said, when I first saw presage, I wasn't too happy about its sqlite dependency. Sqlite often means frequent hard drive accesses and a choice between general slowness due to generous fsync'ing, or all bets off concerning data security. That may be unfounded prejudice in this case, and perhaps presage has overcome all that. I didn't do any real-world testing with it.

Yes, that's the trade-off of having the language model on disk rather than in memory. There are advantages and disadvantages to having the lm reside in memory or on disk. The great thing about it is that, strictly speaking, it's not presage that has a dependency on sqlite, but rather the individual predictors that store their language model in an sqlite database. In other words, the dependency on sqlite could be removed from the presage library itself and moved to the smoothed n-gram predictor. This would be very little work (a 10-minute job, I believe). In practice, I found sqlite very fast and reliable. Presage's database connector layer encloses all writes to the database (and reads too, for that matter) in transactions, which guarantees atomicity of updates to the language model.

>>> For details about the word prediction service, please contact marmuta, who did nearly all the work on the word prediction service.
>>
>> I'll follow up with marmuta to discuss the feasibility of making this happen and work out the technical details, in case there is consensus to go ahead with this.
>
> I'm happy to further discuss this, even though I'm a bit torn currently.
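The point about transactional writes can be demonstrated with Python's stdlib sqlite3 module (a generic illustration of sqlite transaction semantics, not presage's connector code): when a batch of n-gram count updates is wrapped in a transaction, either all of them become visible or none do.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE ngram (tokens TEXT PRIMARY KEY, count INT)")
con.execute("INSERT INTO ngram VALUES ('hello world', 1)")
con.commit()

# Atomic batch update: the duplicate-key error rolls the whole
# transaction back, so the first, valid insert never lands either.
try:
    with con:  # opens a transaction; commits, or rolls back on error
        con.execute("INSERT INTO ngram VALUES ('foo bar', 1)")
        con.execute("INSERT INTO ngram VALUES ('hello world', 1)")  # PK clash
except sqlite3.IntegrityError:
    pass

rows = con.execute("SELECT tokens FROM ngram").fetchall()
print(rows)  # → [('hello world',)] -- 'foo bar' was rolled back too
```

The same guarantee is what keeps an on-disk language model consistent if the process is interrupted mid-update.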
> I can see the appeal of having presage (or other candidates like nltk) be the central repository for all kinds of prediction needs. On the other hand, the advantages of merging gpredict into presage don't seem to be that obvious. Most of the functionality does exist already in presage, and from onboard's point of view using presage appears to currently gain it little except for new dependencies.

I need to look at gpredict's language model and predictive algorithm in more detail, but I currently believe that presage will benefit from having a new predictor available, which can be turned on and combined with the existing predictors. onboard would benefit from having access to presage's other predictors, which can be configured on or off and customized by the user (e.g. the abbreviation expansion predictor).

> Also, onboard's prediction service was already meant to be a full-featured standalone word predictor. It is largely working as planned, and we were going to split it off from onboard as a ready-to-use D-Bus service soon. Rebasing on presage at this point would probably delay things considerably for onboard. Not sure yet if this is the right thing to do, but I'm open to pro-arguments.

Well, I understand the concerns about delaying things for onboard, but I think there are significant benefits in integrating gpredict and presage together and building a prediction D-Bus service on presage. Perhaps we could start by trying onboard with the presage D-Bus service that David has created, while we integrate gpredict into presage (basically, it would mean moving the C++ code into a class implementing a Predictor interface). I'm willing to help with this.

Cheers,
- Matteo
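The architecture discussed throughout this thread — several predictors each producing scored suggestions, with presage merging them into one prediction — can be sketched in a few lines of Python. This is a toy illustration of the idea only; the function names, word lists, and scoring are invented for the example and do not reflect presage's actual C++ Predictor interface.

```python
from collections import defaultdict

# Each "predictor" maps a prefix to {suggestion: score}.
# Toy stand-ins for presage's predictor classes (names invented here).
def dictionary_predictor(prefix: str) -> dict[str, float]:
    words = ["prediction", "predictor", "presage", "prefix"]
    hits = [w for w in words if w.startswith(prefix)]
    return {w: 1.0 / len(hits) for w in hits} if hits else {}

def recency_predictor(prefix: str) -> dict[str, float]:
    recent = ["presage", "predictor"]  # most recently seen first
    return {w: 0.5 ** i for i, w in enumerate(recent)
            if w.startswith(prefix)}   # exponential falloff by recency

def merge(predictors, prefix: str, n: int = 3) -> list[str]:
    """Combine per-predictor scores and return the top-n suggestions."""
    combined: defaultdict[str, float] = defaultdict(float)
    for predict in predictors:
        for word, score in predict(prefix).items():
            combined[word] += score
    return sorted(combined, key=combined.get, reverse=True)[:n]

print(merge([dictionary_predictor, recency_predictor], "pre"))
# → ['presage', 'predictor', 'prediction']
```

Turning a predictor on or off amounts to adding it to or removing it from the list passed to `merge` — which is the modularity argument made above: an OnboardPredictor would simply be one more entry in that list.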
From: Vescovi, M. <mat...@pr...> - 2010-05-18 13:40:47
Welcome to the presage-devel mailing list! - Matteo Vescovi |
From: Matteo V. <mat...@ya...> - 2010-05-18 13:26:26
Welcome to the presage-devel mailing list! - Matteo Vescovi