From: Markus <ma...@ai...> - 2006-08-14 16:47:23
Hi all,

Denny and I were at the International Wikimedia Conference in Boston the week before last, and there is much news to tell regarding the future of Semantic MediaWiki. I will try to sum up the core points:

* Wikimania at Harvard Law School was a wonderful and very well organised event with a great line-up of invited speakers. I strongly recommend that everyone who was not there have a look at the online videos and audio recordings. The talks of Lessig, Kahle, and Weinberger should not be missed.

We gave a talk on Semantic MediaWiki which was very well received, and we had an interesting panel discussion with Brion Vibber and Tim Berners-Lee afterwards. Various people approached us to help in development, but it remains to be seen who remembers this after the conference ;-) We received very encouraging feedback from many conference participants, most prominently from Jimbo Wales and Brad Patrick, and we are positive that the remaining technical and social challenges of including SMW into Wikipedia can be settled "in the near future" (in lack of a concrete time schedule). SMW 0.6 will emphasise this goal.

We also gave a tutorial ("workshop") on reusing data on the Semantic Web where we explained how RDF is used to exchange data, and we did ad hoc demonstrations of SMW everywhere around the conference ;-)

* We were also at the MediaWiki Hacking Days at MIT/"One Laptop Per Child" before the event. Discussions among the small group of active developers in and around MediaWiki were very inspiring. Most importantly, we reached the conclusion that <ask> queries will probably scale only to medium-sized wikis. For huge wikis such as Wikipedia, they will need some restriction, and possibly some more elaborate asynchronous processing. The general problem is huge joins, which might be mere intersections of categories without much semantics.
MySQL is not too good at those, and it might be worth looking at text-search systems that routinely have to perform this operation. Domas Mituzas (MediaWiki) suggested trying to abuse such a system right away, but I have not had a closer look at this idea so far.

* We will set up a mirror of Wikipedia for testing SMW in a large-scale wiki soon. This will be a nice real-size playground which should give us details on possible performance bottlenecks that still need to be resolved.

* Wikidata/WiktionaryZ is making good progress, but the project is still not complete enough for practical use (especially versioning is missing). The team around Eric Möller is trying to build a tightly structured wiki (more like a spreadsheet), that features typed input fields instead of free text. The application scenarios are thus mostly complementary to SMW, but we will aim at close cooperation to ensure compatibility of "meta-data" (i.e. they better build an RDF export at some stage ;-). Wikidata and Semantic MediaWiki have been prominently arguing for more structured data throughout the conference, and it seems that many people now see the need of those enhancements.

* We have met with Elias Torres and his colleagues at IBM just around the corner. Other than admiring the great tools that they develop there (and that, as they assure us, will be free software soon), we also talked about Semantic MediaWiki and possible future extensions. There might well be some (optional) RDFa support in future versions of SMW ...

* We have met with people at Simile (W3C/MIT-founded company and renowned makers of RDF tools). We have enjoyed loading RDF from ontoworld into their latest devel version of Longwell (http://simile.mit.edu/wiki2/Longwell), a rather good-looking (free) tool for RDF querying and faceted browsing.
I will implement a browser-independent script for making a full RDF dump, so that we can offer daily exports for external reuse in these and similar tools.

Moreover, we have had a closer look at Timeline (look at http://simile.mit.edu/timeline/examples/dinosaurs/dinosaurs2.html), and found that it would actually make a nice addition for formatting <ask> results. Stay tuned.

* Finally, we have heard about various technical novelties that we can expect in and around MediaWiki soon (though I might forget some):
** LiquidThreads, a new forum-like discussion module for MediaWiki, is underway.
** SingleSignon, the new feature of having one login to rule them all (or at least to rule all Wikimedia sites) is scheduled for the next months.

OK, I think that's all.

Cheers,

Markus

--
Markus Krötzsch
Institute AIFB, University of Karlsruhe, D-76128 Karlsruhe
ma...@ai... phone +49 (0)721 608 7362
www.aifb.uni-karlsruhe.de/WBS/ fax +49 (0)721 693 717
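
On the category-intersection point in the message above: text-search engines typically avoid a general join by keeping, for each term, a sorted list of document ids and merging those lists directly. The following is only a minimal Python sketch of that idea, assuming each category is available as a sorted list of page ids. The category names and ids are made up for illustration, and nothing here reflects how SMW or MySQL actually stores category membership.

```python
from typing import Dict, List


def intersect_sorted(a: List[int], b: List[int]) -> List[int]:
    """Intersect two ascending lists of page ids in O(len(a) + len(b))."""
    result = []
    i = j = 0
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            result.append(a[i])
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1
        else:
            j += 1
    return result


def pages_in_all(index: Dict[str, List[int]], categories: List[str]) -> List[int]:
    """Return the ids of pages that belong to every listed category."""
    if not categories:
        return []
    # Start with the shortest list so intermediate results stay small.
    lists = sorted((index.get(c, []) for c in categories), key=len)
    result = lists[0]
    for other in lists[1:]:
        result = intersect_sorted(result, other)
    return result


# Hypothetical toy index: category name -> sorted page ids.
index = {
    "City": [1, 4, 7, 9, 12],
    "Located in Germany": [2, 4, 9, 15],
    "Capital": [4, 20],
}
print(pages_in_all(index, ["City", "Located in Germany"]))
```

Each merge is linear in the lengths of the two lists, which is the property that makes this operation routine for search systems even when the lists are long.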
From: <jf...@mo...> - 2006-08-14 19:34:00
On Mon, Aug 14, 2006 at 06:47:08PM +0200, Markus Krötzsch wrote:
> We gave a talk on Semantic MediaWiki which was very well received, and we had
> an interesting panel discussion with Brion Vibber and Tim Berners-Lee
> afterwards. Various people approached us to help in development, but it
> remains to be seen who remembers this after the conference ;-) We received
> very encouraging feedback from many conference participants, most prominently
> from Jimbo Wales and Brad Patrick, and we are positive that the remaining
> technical and social challenges of including SMW into Wikipedia can be
> settled "in the near future" (in lack of a concrete time schedule). SMW 0.6
> will emphasise this goal.

Which changes do you plan for 0.6?

Do you plan changes in the user interface? While I think that a computer scientist easily learns what "'''San Diego''' is a [[is a::city]] in the [[is located in::United States]]" means, I doubt that my mother will understand this immediately.

I'm about to finish my current mini-project (only one bug left) and would like to spend some time on SMW.

> We also gave a tutorial ("workshop") on reusing data on the Semantic Web where
> we explained how RDF is used to exchange data, and we did ad hoc
> demonstrations of SMW everywhere around the conference ;-)

The main reason for me to work on SMW is Wikimaps, which requires structured data export from MediaWiki to the map renderer. Coordinates, type (city, lake, hill, place), attributes (inhabitants, area, administrative level). Data that we already have, but that we can't extract.

> * We will set up a mirror of Wikipedia for testing SMW in a large-scale wiki
> soon. This will be a nice real-size playground which should give us details
> on possible performance bottlenecks that still need to be resolved.

I think most bottlenecks will show if you increase the number of relations. Using a WP dump, category-related parts of the code can be tested.
But to test the rest of the code, you'll have to create random relations between the articles. Or start to automatically find relations :-)

> * Finally, we have heard about various technical novelties that we can expect
> in and around MediaWiki soon (though I might forget some):
> ** LiquidThreads, a new forum-like discussion module for MediaWiki, is
> underway.
> ** SingleSignon, the new feature of having one login to rule them all (or at
> least to rule all Wikimedia sites) is scheduled for the next months.

Hm, SingleSignon... Didn't we announce this at Wikimania 2005 in Frankfurt, too?

Regards,

jens
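
The suggestion above of creating random relations between articles for load testing could be sketched roughly as follows. This is an illustrative Python sketch only: the page titles, the relation names, and the helper names `random_relations` and `as_annotation` are hypothetical, and only the `[[relation::target]]` annotation form is taken from the thread.

```python
import random
from typing import List, Tuple


def random_relations(titles: List[str], relation_names: List[str],
                     per_page: int, seed: int = 0) -> List[Tuple[str, str, str]]:
    """Generate (source, relation, target) triples between distinct pages."""
    rng = random.Random(seed)  # fixed seed keeps test runs reproducible
    triples = []
    for source in titles:
        candidates = [t for t in titles if t != source]
        if not candidates:
            continue  # a single page cannot link to another page
        for _ in range(per_page):
            triples.append((source, rng.choice(relation_names), rng.choice(candidates)))
    return triples


def as_annotation(relation: str, target: str) -> str:
    """Render one relation in the inline annotation syntax quoted in the thread."""
    return f"[[{relation}::{target}]]"


# Hypothetical sample data.
titles = ["Berlin", "San Diego", "Germany", "United States"]
relations = ["is located in", "borders", "is a"]
for source, relation, target in random_relations(titles, relations, per_page=2):
    print(source, as_annotation(relation, target))
```

Random triples like these exercise storage and query code paths, but they say nothing about realistic value distributions, so results from such a test would need cautious interpretation.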
From: Markus <ma...@ai...> - 2006-08-15 16:05:21
On Monday 14 August 2006 21:33, Jens Frank wrote:
> On Mon, Aug 14, 2006 at 06:47:08PM +0200, Markus Krötzsch wrote:
> > We gave a talk on Semantic MediaWiki which was very well received, and we
> > had an interesting panel discussion with Brion Vibber and Tim Berners-Lee
> > afterwards. Various people approached us to help in development, but it
> > remains to be seen who remembers this after the conference ;-) We
> > received very encouraging feedback from many conference participants,
> > most prominently from Jimbo Wales and Brad Patrick, and we are positive
> > that the remaining technical and social challenges of including SMW into
> > Wikipedia can be settled "in the near future" (in lack of a concrete time
> > schedule). SMW 0.6 will emphasise this goal.
>
> Which changes do you plan for 0.6?

Currently, we plan the following major changes:

== Assigned tasks ==

* Full rewrite of Special:SearchTriple to become usable, performant, and helpful again. This is done by Denny.
* Rewrite of (parts of) the RDF export, so that it is more sound and more complete ;-), and so that one can easily make full dumps of the RDF in a scripted way. This is done by me.
* Further improvement of inline query performance. This is probably done by me.
* Rewrite of Special:Types to show all of the new custom types. This is done by S.
* A lot of fixes and cleanups that the respective code owners should take care of.

== Open tasks ==

* Support for Timeline as an output format for <ask>. See http://simile.mit.edu/timeline/examples/dinosaurs/dinosaurs2.html for an example of how this would look (click/drag around to see how it works). This task is open, and would be a nice place to start.
* Improvement of the current tooltip code. Both the JScript and the PHP creating it are not in good shape, and the code often fails. This requires knowledge of JavaScript.
* Fix table sorting code.
JScript again: the in-article sorting of result tables breaks a lot, since the script does not recognise numbers of the form "123,234" or numbers with units. An idea would be to change the sort script to optionally use some HTML parameter to sort the table cells, instead of their content. SMW could easily create such purely numeric sorting information for the script.
* There could be more that one could do with some JScript, e.g. collapsible Factboxes similar to the hidable table of contents.

There could of course be more, but this is what comes to my mind now. Maybe you see something that could be interesting to you?

> Do you plan changes in the user interface? While I think that a computer
> scientist easily learns what "'''San Diego''' is a [[is a::city]] in the
> [[is located in::United States]]" means, I doubt that my mother will
> understand this immediately.

I agree. But it is not harder than many other things in Wikipedia. I hope that the upcoming WYSIWYG frontend for MediaWiki will be the key to helping our mothers to contribute on a more regular basis as well. Anyway, I am open to ideas for improvement -- we are flexible and could easily offer site admins multiple syntaxes to choose from according to taste.

> I'm about to finish my current mini-project (only one bug left) and
> would like to spend some time on SMW.

Great! Just let me know whether the above is of interest to you. The tasks I suggested have the advantage that they are of manageable size and might not require too much future maintenance (or that they are so badly maintained that any help is an improvement). So this might be a good place to start. What is your background in PHP/JScript/other stuff?
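
As a rough illustration of the "purely numeric sorting information" idea for the table-sorting task above, the sketch below derives a numeric key from a displayed cell value such as "123,234" or a number followed by a unit. It is written in Python purely for illustration (the real sort script is client-side and SMW itself is PHP), the function name `numeric_sort_key` is hypothetical, and it assumes English-style formatting with "," as thousands separator and "." as decimal point.

```python
import re
from typing import Optional

# Leading number: either digit groups separated by commas, or a plain
# run of digits, optionally followed by a decimal fraction.
_NUMBER = re.compile(r"^\s*([-+]?\d{1,3}(?:,\d{3})+|[-+]?\d+)(\.\d+)?")


def numeric_sort_key(cell: str) -> Optional[float]:
    """Return the leading number of a table cell, or None if there is none.

    Anything after the number (for example a unit) is ignored.
    """
    match = _NUMBER.match(cell)
    if match is None:
        return None
    integer_part = match.group(1).replace(",", "")
    fraction = match.group(2) or ""
    return float(integer_part + fraction)


cells = ["123,234", "99 km", "1,000", "n/a"]
# Cells without a number sort last; the others sort by numeric value.
ordered = sorted(
    cells,
    key=lambda c: (numeric_sort_key(c) is None, numeric_sort_key(c) or 0.0),
)
print(ordered)
```

A key computed once on the server side could be attached to each table cell, so the client-side script would only compare plain numbers instead of parsing display strings.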
>
> > We also gave a tutorial ("workshop") on reusing data on the Semantic Web
> > where we explained how RDF is used to exchange data, and we did ad hoc
> > demonstrations of SMW everywhere around the conference ;-)
>
> The main reason for me to work on SMW is Wikimaps, which requires
> structured data export from MediaWiki to the map renderer. Coordinates,
> type (city, lake, hill, place), attributes (inhabitants, area,
> administrative level). Data that we already have, but that we can't
> extract.

We have the Type:Geographic coordinate for this, but it does not generate a very nice RDF output yet (the coordinates are still a string, since XSD has no type for coordinates). We should use the geopos vocabulary in the future. This might be part of my rewrite of the RDF export.

> > * We will set up a mirror of Wikipedia for testing SMW in a large-scale
> > wiki soon. This will be a nice real-size playground which should give us
> > details on possible performance bottlenecks that still need to be
> > resolved.
>
> I think most bottlenecks will show if you increase the number of
> relations. Using a WP dump, category-related parts of the code can be
> tested. But to test the rest of the code, you'll have to create random
> relations between the articles. Or start to automatically find relations
> :-)

Yes, we plan to do both :-)

> > * Finally, we have heard about various technical novelties that we can
> > expect in and around MediaWiki soon (though I might forget some):
> > ** LiquidThreads, a new forum-like discussion module for MediaWiki, is
> > underway.
> > ** SingleSignon, the new feature of having one login to rule them all (or
> > at least to rule all Wikimedia sites) is scheduled for the next months.
>
> Hm, SingleSignon... Didn't we announce this at Wikimania 2005 in
> Frankfurt, too?

Ha ha. This year Brion sounded quite serious.
The details are fixed, the fights about policy have ceased, some people will be very unhappy about losing their logins to others of the same name ... but discussion is over and it will finally happen.

Regards,

Markus
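
On the coordinate export mentioned in the message above: assuming that "geopos" refers to the W3C Basic Geo (WGS84 lat/long) vocabulary, a coordinate could be exported as two numeric properties instead of a single string. The Python sketch below only illustrates that shape. The subject URI, the function name `coordinate_triples`, and the choice of `xsd:decimal` as datatype are assumptions, and this is not SMW's actual export code.

```python
from typing import List

# Namespace of the W3C Basic Geo (WGS84 lat/long) vocabulary.
GEO = "http://www.w3.org/2003/01/geo/wgs84_pos#"
# Datatype chosen for this sketch (an assumption, not mandated by the vocabulary).
XSD_DECIMAL = "http://www.w3.org/2001/XMLSchema#decimal"


def coordinate_triples(subject_uri: str, latitude: float, longitude: float) -> List[str]:
    """Return N-Triples lines stating geo:lat and geo:long for a subject."""
    if not -90 <= latitude <= 90:
        raise ValueError("latitude must be within [-90, 90]")
    if not -180 <= longitude <= 180:
        raise ValueError("longitude must be within [-180, 180]")
    lines = []
    for prop, value in (("lat", latitude), ("long", longitude)):
        # Fixed-point formatting avoids exponent notation in the literal.
        lines.append(
            f'<{subject_uri}> <{GEO}{prop}> "{value:.6f}"^^<{XSD_DECIMAL}> .'
        )
    return lines


# Hypothetical subject URI for a wiki page.
for line in coordinate_triples("http://example.org/wiki/Berlin", 52.52, 13.405):
    print(line)
```

Separate latitude and longitude values are what a map renderer such as the Wikimaps use case would need, since it can read them without parsing a display string.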
From: S P. <ski...@ea...> - 2006-08-23 01:26:40
Thanks for the inspiring update, I wish I could have attended.

Markus Krötzsch wrote:
> * Wikidata/WiktionaryZ is making good progress, but the project is still not
> complete enough for practical use (especially versioning is missing).

Disclaimer: "Talk is cheap, implementation costs"

Is there an updated page on Wikidata internals? Everything I looked at suggests Wikidata is dormant.

Just from using the alpha, WiktionaryZ has a lot of tantalizing overlap with Semantic MediaWiki; for example

* http://www.wiktionaryz.org/WiktionaryZ:Berlin vs. http://wiki.ontoworld.org/wiki/Berlin
* http://wiktionaryz.org/WiktionaryZ:world , click on '+' signs until you expand '+ Relations', see things like http://wiktionaryz.org/WiktionaryZ:is_part_of_theme. Hmmm.
* and http://wiktionaryz.org/Editing_relational_data (very early, I realize).

> The team around Eric Möller is trying to build a tightly structured wiki (more like a spreadsheet), that features typed input fields instead of free text.

Alas I can't try editing on WiktionaryZ during "Closed alpha testing" to see how this works. It seems filling in MediaWiki templates, with or without SMW properties, could use a "typed input fields" UI rather than wiki text.

> The application scenarios are thus mostly complementary to SMW, but we will
> aim at close cooperation to ensure compatibility of "meta-data" (i.e. they
> better build an RDF export at some stage ;-). Wikidata and Semantic MediaWiki
> have been prominently arguing for more structured data throughout the
> conference, and it seems that many people now see the need of those
> enhancements.

It would be nice if users just entered SMW property values, and MediaWiki under the covers figured out whether to store them in the "tight structure" like Wikidata or the self-describing attribute and relation tables of SMW.
I know people including movGP0 have been trying to store complex SMW datatypes like box and vector as attribute values, and IMO at some point you need a dedicated table. It seems to me the REALLY hard part is implementing querying across both kinds.

Cheers, again thanks for the Wikimania report.
--
=S

P.S.
> The application scenarios are thus mostly complementary to SMW

You must not be a native English speaker, as 90% of them misspell this "complimentary" :-) ;-)
From: Johann D. <joh...@ao...> - 2006-08-23 17:08:56
> "S Page" wrote:
> Thanks for the inspiring update, I wish I could have attended.

Me too. But Markus's report is also fine.

> > Markus Krötzsch wrote:
> > * Wikidata/WiktionaryZ is making good progress, but the project is still not
> > complete enough for practical use (especially versioning is missing).
> Disclaimer: "Talk is cheap, implementation costs"
>
> Is there an updated page on Wikidata internals? Everything I looked at
> suggests Wikidata is dormant.
>
> Just from using the alpha, WiktionaryZ has a lot of tantalizing overlap
> with Semantic MediaWiki; for example
>
> * http://www.wiktionaryz.org/WiktionaryZ:Berlin vs.
> http://wiki.ontoworld.org/wiki/Berlin
>
> * http://wiktionaryz.org/WiktionaryZ:world , click on '+' signs until
> you expand '+ Relations', see things like
> http://wiktionaryz.org/WiktionaryZ:is_part_of_theme. Hmmm.
>
> * and http://wiktionaryz.org/Editing_relational_data (very early, I
> realize).

Yes, indeed. Some things would be better solved in SMW. WiktionaryZ seems to use a parser that reacts to a "{{" + KEYWORD + "|" + VALUES + "}}" pattern, which is also a nice feature, because it allows types to be defined within a template and allows a GUI that works directly on that template, asking for parameter and value.
Also, this allows relations like:

(WiktionaryZ:Berlin, has translation, _1)
(_1, xml:lang, de)
(_1, WiktionaryZ:hasSpelling, Berlin)
(_1, rdf:type, WiktionaryZ:has_identical_meaning)

by adding a template of the form:

{{Translation|de|Berlin|true}}

To use this template to get SMW-compatible data, there seems to be a need for the syntax extensions I've suggested:

[[ Relation:has translation :: {xml:lang:=de}{WiktionaryZ:hasSpelling:=Berlin}{rdf:type :: WiktionaryZ:has_identical_meaning} ]]

I'm not sure about the correct syntax in Wikidata for defining types, so I give a semantic-only example for the Translation template:

[[ Relation:has translation :: {xml:lang:={{{1}}}} {WiktionaryZ:hasSpelling:={{{2}}}} {{if|{{{3}}}|true|{rdf:type::WiktionaryZ:has_identical_meaning}}} ]]

Doing such a thing with the current solution would lead to one dummy page per statement.

> > The team around Eric Möller is trying to build a tightly structured
> > wiki (more like a spreadsheet),
> > that features typed input fields instead of free text.
>
> Alas I can't try editing on WiktionaryZ during "Closed alpha testing" to
> see how this works. It seems filling in MediaWiki templates, with or
> without SMW properties, could use a "typed input fields" UI rather than
> wiki text.

Yeah, that's really awful. It doesn't give the possibility to give much feedback. So this could lead to compatibility problems, because there is not much discussion between the WiktionaryZ guys and the SMW guys.

> > The application scenarios are thus mostly complementary to SMW, but we will
> > aim at close cooperation to ensure compatibility of "meta-data" (i.e. they
> > better build an RDF export at some stage ;-). Wikidata and Semantic MediaWiki
> > have been prominently arguing for more structured data throughout the
> > conference, and it seems that many people now see the need of those
> > enhancements.
Wouldn't it be nice if we could store the data within the SMW tables directly using a MediaWiki template or PHP interface?

> It would be nice if users just entered SMW property values, and
> MediaWiki under the covers figured out whether to store them in the
> "tight structure" like Wikidata or the self-describing attribute and
> relation tables of SMW. I know people including movGP0 have been trying
> to store complex SMW datatypes like box and vector as attribute values,
> and IMO at some point you need a dedicated table. It seems to me the
> REALLY hard part is implementing querying across both kinds.

For storing vectors you can also create a special schema; using a subclass of the RDF list is one possibility. But you would need a more powerful syntax and the imported RDF schema, so you can make more general statements:

[[ math:Vector :: {
{ {rdf:type := xsd:int}{Attribute:value:=1} }
{ {rdf:type := xsd:int}{2} }
{ {rdf:type := xsd:int}{-3} }
{ {rdf:type := xsd:int}{5} }
} ]]

corresponding to:

( ARTICLE, math:Vector, _1 )
( _1, rdf:type, rdf:list )
( _1, rdf:_1, _2 )
( _1, rdf:_2, _3 )
( _1, rdf:_3, _4 )
( _1, rdf:_4, _5 )
( _2, rdf:type, xsd:int )
( _2, Attribute:value, "1" )
( _3, rdf:type, xsd:int )
( _3, Attribute:value, "2" )
( _4, rdf:type, xsd:int )
( _4, Attribute:value, "-3" )
( _5, rdf:type, xsd:int )
( _5, Attribute:value, "5" )

I know that SMW is not meant to be a general RDF editor, but indeed it could (and needs to) be one.

> Cheers, again thanks for the Wikimania report.
> --
> =S
>
> P.S.
> > The application scenarios are thus mostly complementary to SMW
>
> You must not be a native English speaker, as 90% of them misspell this
> "complimentary" :-) ;-)

ys, MovGP0
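
The expansion of a vector annotation into triples described in the message above can be sketched mechanically. The Python below is only an illustration of that proposed mapping, not anything SMW implements: the function name `vector_to_triples` is hypothetical, the property names `math:Vector` and `Attribute:value` and the `_1`, `_2`, ... node labels follow the message, and the sketch writes `rdf:type` and `rdf:_n`, the names RDF itself defines for typing and numbered membership.

```python
from typing import List, Tuple

Triple = Tuple[str, str, str]


def vector_to_triples(article: str, values: List[int]) -> List[Triple]:
    """Expand a vector of integers into the triple pattern from the message.

    Node _1 stands for the vector itself; nodes _2, _3, ... stand for its
    elements, in order.
    """
    list_node = "_1"
    triples: List[Triple] = [
        (article, "math:Vector", list_node),
        (list_node, "rdf:type", "rdf:list"),
    ]
    element_nodes = [f"_{i + 2}" for i in range(len(values))]
    # Link the vector node to each element by position.
    for position, node in enumerate(element_nodes, start=1):
        triples.append((list_node, f"rdf:_{position}", node))
    # Describe each element: its datatype and its value as a quoted literal.
    for node, value in zip(element_nodes, values):
        triples.append((node, "rdf:type", "xsd:int"))
        triples.append((node, "Attribute:value", f'"{value}"'))
    return triples


for triple in vector_to_triples("ARTICLE", [1, 2, -3, 5]):
    print("( " + ", ".join(triple) + " )")
```

A vector of n elements yields 2 + 3n triples under this scheme, which gives a feeling for how quickly such structured values would grow the relation tables.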