From: <ben...@ug...> - 2007-01-31 11:33:48
Zsolt,

I don't fully understand b. A grdb database is never in memory, only parts of it. Getting all notes and creating clean text in memory would be contrary to what the database is for: getting what you need without keeping things in memory.

The only alternative I can think of is that you store both the markup and the clean version in the database, in the same way people were asking to store namedisplay in the database. For notes that would mean my 540 MB database would become much bigger again, so I would vote against it.

So, technically, I think you should go with a., but be intelligent about making the clean text. The problem I can think of is filters, but you could apply the filter as follows:

1/ Apply the filter to the markup text, and only if you get a positive result, convert to clean text and redo the filter.
2/ Apply the filter to all allowed markup identifiers. If there is no positive, search in the markup text. If there is a positive, take the markup identifiers into account by replacing only those that give a positive out of the markup text.

In any case, nothing of the conversion to clean text should stay in memory.

About GEDCOM export: people can expect that to take some time, so the extra 5/11 seconds will be no big deal. You could actually give the option to export the marked-up code if GEDCOM doesn't crash on <>. However, how to handle import of <bold> in GEDCOM into non-markup code...

Benny

Quoting Zsolt Foldvari <zso...@no...>:

> Hi,
>
> While thinking about the possibility of having formatted notes (i.e.
> rich text notes), we've bumped into the problem of speed again.
>
> After having markup text for notes there will be still some
> functionalities, which require the clean text version of the note (e.g.
> filters, gedcom export, etc.). So far we've had two solutions in mind:
>
> a. Keep only the markup version of the text and create the clean text on
> the fly. This solution has some speed consequence though. Converting
> 100.000 notes with 30 markup tag pairs takes around 11 sec on 1.6GHz +
> 512MB. This could be unbearable for filtering...
> At the moment clearing the markup text is done with:
> "text = re.sub(r'(</?span.*?>)', '', markup_text)"
>
> b. Keep both the markup and the clean text version of the note in memory
> parallel. The clear text version could be created e.g. by a background
> process after loading the db. This solution has some memory issue
> instead of the speed problem, i.e. all notes are doubled in memory.
>
> Any better idea or comment is appreciated.
>
> Cheers,
> Zsolt
>
> _______________________________________________
> Gramps-devel mailing list
> Gra...@li...
> https://lists.sourceforge.net/lists/listinfo/gramps-devel
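
For illustration, step 1/ above could be sketched roughly like this in Python. This is only a sketch and not Gramps code: the names strip_markup, note_matches and filter_notes are hypothetical, and a plain substring test stands in for a real filter. Only the regular expression comes from Zsolt's message. The idea is that the cheap test on the raw markup rejects most notes before any conversion happens, and the clean text is built only for the candidates and thrown away right after the check.

```python
import re

# Regular expression from Zsolt's message, compiled once up front so the
# pattern is not looked up again for every note.
SPAN_TAG = re.compile(r'</?span.*?>')


def strip_markup(markup_text):
    """Return the note text with <span ...> and </span> tags removed."""
    return SPAN_TAG.sub('', markup_text)


def note_matches(markup_text, needle):
    """Hypothetical substring filter following step 1/.

    First test the raw markup (cheap). Only on a hit, build the clean
    text and test again, so that hits inside a tag are rejected.
    """
    if needle not in markup_text:
        return False
    return needle in strip_markup(markup_text)


def filter_notes(notes, needle):
    """Yield matching notes one by one; no clean text is kept around."""
    for markup_text in notes:
        if note_matches(markup_text, needle):
            yield markup_text
```

One caveat with this sketch: a search string that crosses a tag boundary in the clean text (for example "John Smith" in "<span>John</span> Smith") is not found by the first test on the markup, so a real filter would need extra handling for that case.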