From: Jan F. B. <bo...@in...> - 2007-12-01 12:19:16
Hi happy JabRef developers,

In the feature request
http://sourceforge.net/tracker/index.php?func=detail&aid=1805170&group_id=92314&atid=600309
someone would like "command line options to fetch medline, etc.", and fancy words like "fully scriptable" also appeared. What is this about?

Useful Scenarios

If someone invokes JabRef from the command line with such options, the GUI should not start. Instead, the actions derived from the given options are executed. Let's start with a straightforward command:

    jabref --fetch medline --query cotton mydb.bib

The expected behaviour would be to add all Medline articles about cotton to the mydb.bib database, right? What if there are 400 articles about cotton? Maybe an interactive approach is needed for that, but that wouldn't be fully scriptable ;-)

Code Dependencies

A naive approach would be the following:

1. parse the command line options
2. load the BibTeX database from mydb.bib
3. load the specified fetcher
4. execute the query
5. put the results in the database
6. save the database and exit

This sounds too easy. Why is this not possible?

First of all, the fetcher engines and the GUI are too closely tied together. In the GUI, each fetcher is represented by a side pane that passes the entered query to the engine. The query process is then run together with another window, the ImportInspectionDialog, which is supposed to show the search progress and the results. Even a reference to the main window (JabRefFrame) itself is passed to the fetcher. It is used for showing warnings (e.g. if there is no result for the query) and extra information. The problem is that the fetch engine makes calls to these GUI components and therefore expects them to be present. Since we are working with the command line only, no window objects are (or should be) present.

The second problem is that a window component (BasePanel) holds the main reference to the current database. This makes it difficult for any command line invocation to access the database.

Finally, fetcher engines do not all look alike. The Medline and Citeseer fetchers do not implement the general fetcher interface (EntryFetcher). IEEEXplore and OAI2 do implement it, and so do all the plugin-based fetchers.

Designing a Solution

Let's continue with the fetchers that look alike, i.e. those that implement the EntryFetcher interface.

Fetcher engines like windows. To keep them happy, objects that look like these windows have to be passed to them. This can be done by defining interfaces that specify all the methods called by the fetcher engine. When the fetcher is invoked from the command line, simple objects that implement these interfaces can be instantiated instead of the windows. These can even fulfil typical GUI functionality such as displaying the status, but write to standard output/error instead.

Fetcher engines also like a database. Since we cannot access the currently loaded database through the GUI, we create a new one manually and load it with the entries of the given file. This is used to find duplicates between the loaded and the fetched entries. The remaining fetched entries can be extracted and passed to a method that adds these new entries to the database and saves it.

Difficulties

Threads. A search is spawned as a thread. A command line invocation currently has no way to find out whether a search is ongoing or completed.

Heterogeneous fetcher design. A solution should focus on plugin-based fetchers only, since these will gain more importance.
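
To make the two points above a bit more concrete (stand-in objects for the windows, and not knowing when a search thread is done), here is a minimal sketch in modern Java. It is only an illustration under assumptions: FetchListener, ConsoleFetchListener and DummyFetcher are hypothetical names invented for this example and do not mirror the real EntryFetcher, ImportInspectionDialog or JabRefFrame signatures, and entries are plain strings rather than BibTeX entry objects. The idea is that the fetcher only sees a small interface, the console implementation prints status to stderr, and a latch lets the command line code block until the search thread reports completion.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

/** Hypothetical: the few window methods a fetcher engine actually calls. */
interface FetchListener {
    void showMessage(String message);   // status, warnings, extra info
    void addEntry(String entry);        // one fetched entry
    void searchFinished();              // called once by the search thread
}

/** Console stand-in that is passed to the fetcher when no GUI exists. */
class ConsoleFetchListener implements FetchListener {
    private final List<String> entries = new ArrayList<>();
    private final CountDownLatch done = new CountDownLatch(1);

    @Override public void showMessage(String message) {
        System.err.println(message);    // instead of a dialog or status bar
    }

    @Override public synchronized void addEntry(String entry) {
        entries.add(entry);
    }

    @Override public void searchFinished() {
        done.countDown();               // releases awaitEntries()
    }

    /** Blocks until searchFinished() was called, or fails after the timeout. */
    List<String> awaitEntries(long timeout, TimeUnit unit) {
        try {
            if (!done.await(timeout, unit)) {
                throw new IllegalStateException("search did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        synchronized (this) {
            return new ArrayList<>(entries);
        }
    }
}

/** Hypothetical fetcher that runs its search on its own thread. */
class DummyFetcher {
    void processQuery(String query, FetchListener listener) {
        Thread worker = new Thread(() -> {
            listener.showMessage("Searching for: " + query);
            listener.addEntry("@article{" + query + "1}");
            listener.addEntry("@article{" + query + "2}");
            listener.searchFinished();
        });
        worker.start();
    }
}

class CommandLineFetchSketch {
    public static void main(String[] args) {
        ConsoleFetchListener listener = new ConsoleFetchListener();
        new DummyFetcher().processQuery("cotton", listener);
        // Without this wait the result list could still be empty here.
        List<String> fetched = listener.awaitEntries(10, TimeUnit.SECONDS);
        System.out.println(fetched.size() + " entries fetched");
    }
}
```

A latch is used here rather than Thread.join() because, in this sketch, the caller never gets a handle on the thread the fetcher starts. Whether the real fetchers can signal completion this way is an open question.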

Implementation

Our current implementation of this solution is available on request. It

- supports the command line parameters --fetch and --query
- loads the right fetcher if present (not through the plugin manager)
- invokes a query and produces a ParserResult

Known Issues:

- the query is spawned as a thread, and the creation of the ParserResult does not wait for this thread to complete (so the ParserResult is empty)
- the given file is not loaded into the database

Discussion

Our current implementation has 2 new interfaces and a lot of refactoring. Is it worth the trouble? Is this the desired behaviour? Should all results be added to the database?

Greetings,
Jan F. Boldt <bo...@in...>, David Kaltschmidt <da...@in...>