From: Robert I. <rob...@go...> - 2010-10-14 10:09:25
Hi Sebastian,

very nice. I had a closer look at your example and extended it to also handle the dependencies of extractors. I used mixin traits instead of structural typing. It's a matter of taste, but I find this solution clearer, and it should also make it possible to write extractors in Java.

I attached a Maven module with the extended example. Extractors are still implemented by inheriting from the Extractor trait, but can additionally define their dependencies.

In the example there are two configurations:

WikipediaProfile: This specifies the extraction profile, which defines the available dependencies (e.g. mappings). In addition, we could define a Wiktionary profile.

MyExtractionConfig: This specifies a specific extraction job (e.g. wikisource, extractors).

The example is still very crude and needs to be worked out in detail.

Cheers,
Robert

On Wed, Oct 13, 2010 at 1:17 AM, Sebastian Hellmann <hel...@in...> wrote:
> Hi,
> I looked at
> http://jonasboner.com/2008/10/06/real-world-scala-dependency-injection-di.html
> and the first alternative, using structural typing,
> seems to be quite similar to what is done in Spring.
> I still have to admit that I didn't understand 90% of the rest that is
> written in the post, though.
> Jonas will start working on some adaptations during the next month (at
> least for the Wiktionary things).
> @Robert Do you think this would be the way to go?
> I tried to apply it:
>
> // The job only states which members its environment must provide.
> class Extractionjob(env: {
>   val source: Source
>   val destination: Destination
>   val extractor: List[Extractor]
> })
>
> // Config supplies those members and passes itself in as the environment.
> object Config {
>   lazy val source = XMLSource.fromFile(new File("mediawiki_test_dump.xml"),
>     _.namespace == WikiTitle.Namespace.File)
>
>   lazy val extractor = List(new TestExtractor(1, 20, "myOption"))
>   lazy val destination = new StringDestination
>
>   lazy val extractionjob = new Extractionjob(this)
> }
>
> It looks very slim and efficient ;)
> I will be on holiday from 21.10.2010 until 20.11.2010, so it would be nice
> if we decided right now.
> Cheers,
> Sebastian
>
>
> On 12.10.2010 13:03, Robert Isele wrote:
> > Hi Sebastian,
> > I also agree that we should generalize the DBpedia Framework. In my
> > opinion, its biggest drawback is the lack of configurability. E.g. at
> > the moment each extractor takes an ExtractionContext object in its
> > constructor, which contains the complete configuration even if most
> > extractors only need a part of it (e.g. some extractors don't need the
> > ontology, in which case it does not need to be loaded). We could gain
> > much flexibility if we made this more configurable.
> >
> > I see two ways to achieve this:
> > 1) Using Spring. As I understand it, this would make it possible to
> > configure the complete extraction process, including all extractors,
> > using an XML configuration file.
> > 2) Making the API more flexible, e.g. letting the extractors define
> > which data they need in a static way (e.g. by using the Cake pattern
> > [1]). The configuration would then be a small Scala script.
> >
> > I will have to take a deeper look into this. The biggest drawback
> > that I see with Spring is that it might not fit smoothly into Scala
> > and makes the configuration more complicated than necessary,
> > e.g.
> > comparing the configuration of the XMLFileSource:
> > Spring:
> > <bean id="testSource" class="org.dbpedia.extraction.XMLFileSource">
> >   <constructor-arg index="0">
> >     <value>file:mediawiki_test_dump.xml</value>
> >   </constructor-arg>
> >   <constructor-arg index="1">
> >     <list value-type="java.lang.Integer">
> >       <value>0</value>
> >     </list>
> >   </constructor-arg>
> > </bean>
> >
> > Scala:
> > XMLSource.fromFile(new File("mediawiki_test_dump.xml"), _.namespace ==
> > WikiTitle.Namespace.File)
> >
> > For me, the second version looks much clearer and more descriptive. I
> > understand that using the implementation language itself for
> > configuration, instead of XML, may sound unusual in the Java world
> > (although it is advocated by Google Guice [2]). But I think it is much
> > cleaner and more flexible than Spring's way of replicating the
> > JavaBeans model in XML. While this may be a good idea in Java, I think
> > Scala, with its more concise syntax and better type system, would
> > provide a perfect way to configure a specific extraction script.
> >
> > As I don't know much about Spring yet, especially in the context of
> > using it together with Scala, I will take a deeper look into it in the
> > next few days. As we are planning to make another release of DBpedia in a
> > few weeks, I can also commit some time to improving the framework,
> > but will discuss it on the list before doing any bigger
> > refactoring.
> >
> > [1] http://jonasboner.com/2008/10/06/real-world-scala-dependency-injection-di.html
> > [2] http://code.google.com/p/google-guice/
> >
> > On Mon, Oct 11, 2010 at 11:53 PM, Sebastian Hellmann
> > <hel...@in...> wrote:
> > > Hi,
> > > today I tested whether the framework is compatible with Spring, and it works.
> > > See the Wiktionary module:
> > > wiktionary/src/main/resources/config.xml
> > > wiktionary/src/main/scala/org.dbpedia.extraction.wiktionary.Extract line 32
> > > wiktionary/src/main/scala/org.dbpedia.extraction.XMLFileSource
> > > (in XMLFileSource, note that I didn't manage to do the conversion
> > > correctly on line 19)
> > > wiktionary/src/main/scala/org.dbpedia.extraction.mappints/TestExtractor
> > >
> > > By default, all resources instantiated with Spring are singletons.
> > > This might be useful for things like the ontology source or the commons.
> > > These can be injected easily into the extractors without the
> > > ExtractionContext.
> > > We would gain quite some flexibility with that. One of the LOD2 tasks is
> > > about generalizing the Framework software,
> > > which would be easily achieved if we had a Wiktionary dump ;)
> > >
> > > - @Robert/Max Please review and give feedback ASAP, I'm only here for 10
> > > more days.
> > > - Could somebody add Jonas to the developers list... I seem to have
> > > forgotten the password.
> > >
> > > Cheers,
> > > Sebastian
> > >
> > > --
> > > Dipl. Inf. Sebastian Hellmann
> > > Department of Computer Science, University of Leipzig
> > > Homepage: http://bis.informatik.uni-leipzig.de/SebastianHellmann
> > > Research Group: http://aksw.org
> > >
> > > ------------------------------------------------------------------------------
> > > Beautiful is writing same markup. Internet Explorer 9 supports
> > > standards for HTML5, CSS3, SVG 1.1, ECMAScript5, and DOM L2 & L3.
> > > Spend less time writing and rewriting code and more time creating great
> > > experiences on the web. Be a part of the beta today.
> > > http://p.sf.net/sfu/beautyoftheweb
> > > _______________________________________________
> > > Dbpedia-developers mailing list
> > > Dbp...@li...
> > > https://lists.sourceforge.net/lists/listinfo/dbpedia-developers
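
For readers of this thread, here is a minimal sketch of how the mixin-trait approach described in Robert's message at the top might look. The attached Maven module is not part of the archive, so this is not that code. Only the names Extractor, WikipediaProfile and MyExtractionConfig come from the thread; OntologyDependency, MappingsDependency, LabelExtractor, MappingExtractor, the run() method and the use of plain strings for pages and triples are hypothetical stand-ins, not the DBpedia framework's API.

```scala
// Sketch only: all types below are assumptions made for illustration.

// Each dependency is a small trait with one abstract member.
trait OntologyDependency { def ontology: Set[String] }
trait MappingsDependency { def mappings: Map[String, String] }

// As in the thread, every extractor inherits from the Extractor trait.
trait Extractor { def extract(pageTitle: String): List[String] }

// An extractor that needs no dependencies at all.
class LabelExtractor extends Extractor {
  def extract(pageTitle: String): List[String] =
    List(pageTitle + " rdfs:label \"" + pageTitle + "\"")
}

// An extractor that declares a dependency on mappings by mixing in the
// dependency trait. It stays abstract until a profile supplies `mappings`.
abstract class MappingExtractor extends Extractor with MappingsDependency {
  def extract(pageTitle: String): List[String] =
    mappings.get(pageTitle).map(cls => pageTitle + " rdf:type " + cls).toList
}

// The profile defines the available dependencies. They are lazy, so a
// dependency that no extractor reads is never loaded.
trait WikipediaProfile extends OntologyDependency with MappingsDependency {
  lazy val ontology: Set[String] = Set("Place", "Person")
  lazy val mappings: Map[String, String] = Map("Leipzig" -> "Place")
}

// One specific extraction job: the source pages and the extractors to run.
object MyExtractionConfig {
  val source: List[String] = List("Leipzig", "Unmapped_Page")

  val extractors: List[Extractor] = List(
    new LabelExtractor,
    // Mixing in the profile satisfies the declared dependency; leaving it
    // out is a compile error because MappingExtractor is abstract.
    new MappingExtractor with WikipediaProfile
  )

  def run(): List[String] =
    for {
      page <- source
      extractor <- extractors
      triple <- extractor.extract(page)
    } yield triple
}
```

Under these assumptions, MyExtractionConfig.run() returns three triples: a label for each of the two pages and one type statement for "Leipzig". The ontology is never initialised, because no extractor in this job reads it. Note that in this simplified form each extractor that mixes in the profile gets its own copy of the lazy values; sharing them would need the profile to delegate to a single object.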