From: jerome <rom...@ya...> - 2013-02-07 09:57:57
> The parsing of the result to a format that is importable by
> Gramps could be done by XSLT. This way, adding a new resource would be a
> matter of supplying a URL and an XSLT sheet.
> Handling multiple results and having to choose which one to
> use is a different matter though.

Sure, it is possible. I did it for my local use[1], for migrating my
genealogical data to Gramps and for experimenting with a quick Gramps XML
parser in Python and XSLT[2]. ;)
Also, Michiel made something more advanced with an experimental Gramps
Exhibit[3]...

The limitation: you have to maintain code for two APIs/namespaces/projects!
Keeping code up to date with the Gramps API is simple. Some websites are
not willing to share their data with you, especially if you do not plan to
stay...

[1] http://gramps-project.org/wiki/index.php?title=Xsl
[2] http://gramps-project.org/wiki/index.php?title=Lxml_Gramplet#Goals
[3] http://members.tele2.nl/m.d.nauta/typeless_data_entry/typeless_data_entry.html

--- On Thu 7.2.13, Age Bosma <age...@gm...> wrote:

> From: Age Bosma <age...@gm...>
> Subject: Re: [Gramps-users] Import data from online resource
> To: "gra...@li..." <gra...@li...>
> Date: Thursday, 7 February 2013, 10:32
>
> On 06-02-13 16:41, Doug Blank wrote:
> > On Wed, Feb 6, 2013 at 10:19 AM, Age Bosma <age...@gm...
> > <mailto:age...@gm...>> wrote:
> >
> >     The idea is to use data like names, locations and event dates from
> >     an existing family tree as search criteria on the website. For each
> >     event I'd start searching with as much info as possible to limit
> >     the amount of results. A selection dialog will be presented if I
> >     end up with more than one result. Fewer search criteria will be
> >     used when no results are returned.
> >     Accepting a result will present you with a dialog that allows you
> >     to compare/confirm the result and specify what to do with the
> >     retrieved data.
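To illustrate the "URL plus XSLT sheet per resource" approach mentioned
above, here is a minimal, self-contained sketch using the third-party
lxml package (pip install lxml). The record layout, field names and the
stylesheet are all invented for the example; a real sheet would have to
match the resource's actual schema and the columns Gramps expects.

```python
# One XSLT sheet per website turns that site's XML result into a
# flat, importable form.  All element names below are hypothetical.
# Requires the third-party lxml package.
from lxml import etree

# Hypothetical search result returned by a genealogy website.
RESULT = b"""<results>
  <record>
    <name>Jan Jansen</name>
    <event>Birth</event>
    <date>1832-04-01</date>
    <place>Amsterdam</place>
  </record>
</results>"""

# A minimal XSLT sheet flattening each <record> to one CSV-like line.
SHEET = b"""<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>
  <xsl:template match="/results">
    <xsl:for-each select="record">
      <xsl:value-of select="name"/>,<xsl:value-of select="event"/>,<xsl:value-of select="date"/>,<xsl:value-of select="place"/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>"""

transform = etree.XSLT(etree.fromstring(SHEET))
csv_text = str(transform(etree.fromstring(RESULT)))
print(csv_text)
```

Supporting a new resource would then indeed come down to shipping a new
URL and a new stylesheet, with no change to the Python driver code.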
> >     Search results will give me a download link for the scanned
> >     certificate as well as names, dates and locations involved with
> >     the event.
> >     The main goal is to download the scans and link them to the
> >     appropriate event. The other data can be used as confirmation for,
> >     correction of and contribution to existing data.
> >     As a next step I could also add capabilities for searching for new
> >     data, instead of using existing data, and use the results for
> >     creating new people, locations and events. It beats manual input,
> >     if you ask me.
> >
> >     What do you think of the idea itself?
> >
> > It sounds like it could work, but also sounds like a lot of work, and
> > something that could break easily if the web pages change.
>
> Having to update the plugin when website changes occur is inevitable,
> yes, but not all changes mean that the data is presented in a different
> way, and luckily websites like these don't change on a daily basis.
> Updates can be reduced somewhat by not depending on the complete HTML
> structure.
>
> >     Would it be a valuable addition to the Gramps plugin set or do you
> >     advise against it? If it is, I'll put more focus on creating the
> >     plugin as one for the general public to use instead of a more
> >     hack-ish one for personal use.
> >
> > I think that there is a way that you could reuse existing Gramps code,
> > and make something that could be more generally useful.
> >
> > I'm thinking of an intermediate tool that could "screen scrape" or
> > otherwise get the data out of an external format (like HTML), store it
> > in a known format, and then integrate it into Gramps. For example, if
> > you can read the HTML and output a Comma-Separated-Value spreadsheet
> > file, then Gramps could import it.
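Doug's screen-scrape-to-CSV route can be sketched with the Python
standard library alone. The HTML table layout and the CSV column names
here are assumptions for illustration; the headers Gramps' CSV importer
actually accepts are listed in the wiki manual.

```python
# Scrape a (hypothetical) result page and write an intermediate CSV
# file that Gramps could import.  Uses only the standard library.
import csv
import io
from html.parser import HTMLParser

# Hypothetical result page: one table row per matching record.
HTML = """<table>
<tr><td>Jan Jansen</td><td>1832-04-01</td><td>Amsterdam</td></tr>
<tr><td>Piet Pietersen</td><td>1840-11-23</td><td>Utrecht</td></tr>
</table>"""

class RowScraper(HTMLParser):
    """Collect the text of every <td>, grouped per <tr>."""
    def __init__(self):
        super().__init__()
        self.rows, self.row, self.in_td = [], [], False
    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self.row = []
        elif tag == "td":
            self.in_td = True
    def handle_endtag(self, tag):
        if tag == "td":
            self.in_td = False
        elif tag == "tr" and self.row:
            self.rows.append(self.row)
    def handle_data(self, data):
        if self.in_td:
            self.row.append(data.strip())

scraper = RowScraper()
scraper.feed(HTML)

# Write the intermediate CSV; the column names are illustrative only.
buf = io.StringIO()
writer = csv.writer(buf)
writer.writerow(["Person", "Birth date", "Birth place"])
writer.writerows(scraper.rows)
print(buf.getvalue())
```

Because only the table cells are read, not the full page structure, the
scraper survives many cosmetic changes to the surrounding HTML, which is
the point Age makes below about not depending on the complete HTML
structure.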
> > For more on the Spreadsheet format and import/export, see:
> >
> > http://gramps-project.org/wiki/index.php?title=Gramps_3.4_Wiki_Manual_-_Manage_Family_Trees:_CSV_Import_and_Export
>
> Making it a more general plugin, not targeted at one specific site, is
> an interesting idea, though it would introduce quite some additional
> complexity. There would have to be some form of mapping capability to
> specify what Gramps data should be used in what way in the search query.
>
> The first thing that comes to mind would be to supply a complete search
> query URL with variable place-holders. The place-holder strings can
> largely be predefined.
> The site I'm interested in has quite a straightforward URL and
> forgiving structure, almost like a clean web service. Others might not,
> but this could just be a requirement.
> The parsing of the result to a format that is importable by Gramps
> could be done by XSLT. This way, adding a new resource would be a
> matter of supplying a URL and an XSLT sheet.
> Handling multiple results and having to choose which one to use is a
> different matter though. I don't see an easy way to implement having to
> first determine what we got back from the search query, presenting that
> in a selection dialog when required, and continuing with the next step.
> Where one website might always return a list even if it has one result,
> others might present you with the end result when only one option is
> left.
>
> The CSV import would have to be extended. It currently does not allow
> you to import sources and citations, just simple notes. Importing
> sources and citations is essential to me.
>
> Age
>
> ------------------------------------------------------------------------------
> Free Next-Gen Firewall Hardware Offer
> Buy your Sophos next-gen firewall before the end of March 2013
> and get the hardware for free! Learn more.
> http://p.sf.net/sfu/sophos-d2d-feb
> _______________________________________________
> Gramps-users mailing list
> Gra...@li...
> https://lists.sourceforge.net/lists/listinfo/gramps-users
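As a closing illustration of Age's place-holder idea: a minimal sketch
of how a per-resource search-URL template could be filled from existing
Gramps data. The site, its query parameters and the place-holder names
are all hypothetical; only the templating mechanism is the point.

```python
# A resource is defined by a search-URL template; predefined
# place-holders are filled from the selected person/event.
# The example site and its parameters are made up.
from string import Template
from urllib.parse import quote_plus

SEARCH_URL = Template(
    "https://archive.example.org/search?surname=$surname"
    "&place=$place&year_from=$year_from&year_to=$year_to")

def build_query(**fields):
    """URL-encode the field values and substitute them into the template."""
    return SEARCH_URL.substitute(
        {k: quote_plus(str(v)) for k, v in fields.items()})

url = build_query(surname="Jansen", place="Den Haag",
                  year_from=1830, year_to=1835)
print(url)
```

Dropping search criteria when a query returns nothing would then just be
a matter of substituting an empty value or a wildcard for a place-holder
before retrying.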