On Wed, Apr 23, 2008 at 12:01 PM, Markus Krötzsch <mak@aifb.uni-karlsruhe.de> wrote:
> I see. Maybe there should be some easy way to create RSS links - I use them
> with FeedBurner (a fixed set of links, which does not allow dynamic feeds
> by topic, for example) and they were a pain in the neck to create,

Do you use queries to make them? They are surely not meant to be created by
hand! We cannot promise long-term support for URLs that cannot be created
by any part of SMW (even if they happen to work now).
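(For context, feed links of this kind can be produced with an inline query. This is a sketch based on the SMW 1.x "rss" result format, so the parameter names may differ in other versions:)

```
{{#ask: [[Category:News]] [[Modification date::+]]
 | sort=Modification date
 | order=descending
 | format=rss
 | title=Site news
 | limit=10
}}
```

SMW then renders the feed URL itself, which avoids assembling the long Special:Ask parameters by hand.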

I create them by hand after reverse-engineering the SMW code (this is the only way I can provide them to FeedBurner, right?), so it would be great to know when this syntax gets updated.

> also I had
> to hide all the complexity behind a mod_rewrite rule (actually Apache
> freaked out on these URLs too, so I did it on the reverse proxy I use) because
> FeedBurner failed to understand these long URLs.

I see that it is a long URL, but it contains no uncommon symbols or anything.
What was Apache's problem?

I don't know what Apache was complaining about, but I couldn't create a mod_rewrite rule for it in 15 tries, so I gave up ;) In short, the "raw" parameter is already URL-encoded, Apache's rewrite back-references use similar encoding, and I couldn't figure out how to escape everything properly.
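To illustrate the escaping clash (a sketch; the exact Special:Ask parameter layout varies between SMW versions): the query text is percent-encoded once to fit into the URL, so a rewrite rule sees literal `%` sequences that look just like Apache's own `%N` back-references.

```python
from urllib.parse import quote, unquote

# A hypothetical SMW inline query to be shipped as a URL parameter.
query = "[[Category:News]] [[Modification date::+]]"

# First encoding: the query text becomes a percent-encoded URL parameter,
# full of %5B, %3A, %20 sequences that a rewrite rule must treat literally.
encoded = quote(query, safe="")
print(encoded)  # %5B%5BCategory%3ANews%5D%5D%20...

# Decoding restores the original query text.
assert unquote(encoded) == query
```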

> It brings the question of savable queries - I wrote about this idea in
> context of performance if you remember. In two words, it might make sense
> to create special namespace for saved queries - where each page will define
> a query and then that page serves as a name shortcut that can be used
> instead of the full syntax. I use this approach in my Widgets extension:
> http://www.mediawiki.org/wiki/Extension:Widgets - SMW can use it in various
> places, e.g. as argument to {{#ask}} (instead of full query), as arguments
> to RSS/iCal and other exporters, and so on. This will relieve developers from
> URL issues (and, as I mentioned in my last post, it might also be usable for
> restricting querying functionality if combined with namespace permissions).
> Let me know what you think - I'll be happy to discuss in more detail if
> needed.

I think that would be feasible, but I also hear concerns from people saying
that SMW gets too large/complex. I am not sure what would be best. Another
option for short URIs would be to create an internal ID for queries that is
then used instead of the lengthy query in URL parameters. That would also be
useful for caching query results, but it does not have the extra control
feature (namespace permission -- btw. how would I best do that? We still did
not open talk pages on s-mw.org :-( ). On the other hand, it saves you a
namespace, and adding a namespace for queries that are then transcluded is
always possible anyway if that kind of control is desired (this just fails
for Special:Ask, but again the cached query ID would solve that too). We
would just need to restrict SMW queries to certain namespaces.

What I'm suggesting is to use page names in a separate namespace as that query ID; otherwise, how would you define a query? In LocalSettings.php? That would be too hard to maintain and would defeat the purpose. Pages from that namespace would not really be transcluded, but used as the data source for {{#ask}}, just as they are for {{#widget}} in the Widgets extension or for Special:AddPage in Semantic Forms, for example.
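To make the idea concrete, here is a purely hypothetical sketch (the "Query:" namespace name and the shortcut syntax are my invention, not existing SMW features): a page in the query namespace holds only the query definition, and other places reference it by name.

```
A page [[Query:RecentNews]] contains only the query definition:

    [[Category:News]] [[Modification date::+]]

Elsewhere it is referenced by name instead of the full syntax:

    {{#ask: Query:RecentNews | format=ul | limit=5 }}

And exporter URLs shrink to something like:

    Special:Ask?query=RecentNews&format=rss
```

Namespace permissions on Query: would then control who may define queries at all.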

You can check the source of the Yahoo! Video extension to see what really goes inside <includeonly></includeonly> tags:

> >    - SMW Registration is done by parsing Special:Version as I understand
> >
> > >    - why not create an RDF export of the statistical data that your
> > > service can consume, in addition to the base data in Special:Version? I
> > > understand that old versions will not contain it, but there is lots of
> > > other useful data that can be extracted, and why not use the best
> > > Semantic Web format to do that? The project is called "Semantic" MediaWiki,
> > > right? ;)
> >
> > We do :-) But that feature is only available since 1.1, hence we check
> > Special:Version to be able to also register older SMWs. But all the
> > additional information (e.g. page count) comes from
> > Special:ExportRDF?stats. It would otherwise be very hard to parse this
> > data from Special:Statistics in all languages.
> Perfect! Now everyone can create a tool for visualizing SMW statistics.

Well, the data in there is still rather restricted. But, yes, in principle it
would work. By the way: this is also the page that one gets when crawling
Special:URIResolver with a caller that requests RDF (content negotiation). So
it is kind of the "main page" for RDF spiders.

So http://www.semantic-mediawiki.org/wiki/Special:ExportRDF?stats is the RDF representation, but of which page? http://www.semantic-mediawiki.org/wiki/Special:URIResolver/Special:SemanticStatistics? Or http://semantic-mediawiki.org/wiki/Special:URIResolver/ itself? (I see http://semantic-mediawiki.org/wiki/Special:URIResolver/#wiki as the ID within the RDF output.)
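As a quick way to see what a content-negotiating crawler sends (a sketch; the URL is the one from this thread, and whether the server honors the header depends on the wiki's setup):

```python
import urllib.request

# Build the request an RDF-aware crawler would make: ask the
# URIResolver for RDF/XML via the HTTP Accept header.
url = "http://semantic-mediawiki.org/wiki/Special:URIResolver/"
req = urllib.request.Request(url, headers={"Accept": "application/rdf+xml"})

# The request now carries the content-negotiation header; actually
# fetching it (urllib.request.urlopen(req)) would return whatever
# RDF the resolver serves for that Accept type.
print(req.get_header("Accept"))  # application/rdf+xml
```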

> BTW, how do I get
> http://semantic-mediawiki.org/wiki/Special:SMWRegistry page as RDF?
> with seeAlso links to specific feeds for each wiki?

Right now: not at all. This service is currently very basic, with just enough
functions to get things started. We will consider adding better interface
features once we have a reasonable number of wikis registered. Do you have
anything in mind for working on these semantic wikis? We can supply you with
information about the current registrations (if you promise to be nice to
them ...).

I don't have anything specific in mind, but I think "eating our own dog food" is good, and popularizing SMW among SemWeb people will require a good use case. This one is quite good and generic; it's probably the only data that is generic to SMW (plus, maybe, some bug reports and news...).

> I actually added it to robots.txt and will probably allow Special:ExportRDF
> universally.


It might be worth describing the robots.txt setup for users on the http://www.semantic-mediawiki.org/ site, both for SMWRegistry and for all RDF crawlers, e.g. that they need to allow crawling:


Especially if users want to block all special pages from being crawled but want these two crawled, since the robots.txt standard doesn't have an "Allow:" clause, just "Disallow:".
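For example, a robots.txt along these lines (a sketch: the path prefixes depend on the wiki's URL layout, and "Allow:" is a non-standard extension that major crawlers honor even though the original standard only defines "Disallow:"):

```
User-agent: *
# Non-standard but widely honored: permit the RDF entry points...
Allow: /wiki/Special:ExportRDF
Allow: /wiki/Special:URIResolver
# ...while keeping other special pages out of the crawl.
Disallow: /wiki/Special:
```

Crawlers that only understand "Disallow:" would still skip all special pages, which is the safe failure mode here.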

And so on. It would also help to link to a tool that verifies they did everything correctly. Do you know of any tool like that, one that gives you an idea of what an RDF crawler will see, similar to how W3C's XML/RSS/RDF validators show you what the parser will see?
> I know you use
> Wikimedia's SVN repository and they don't encourage tags or branches, but it
> might be very useful to be able to roll back somehow. Maybe it's worth posting a
> timestamp for each release so it can be used for checkout?

Yes, maybe we should do that. SMW 1.1.1 corresponds to revision 33778. Not
sure about 1.1 (but the changes there are really minor -- the main
incompatibility is that the iCal URLs changed, so early upgrade for iCal
users is certainly encouraged).

Hmm. Being able to back up like that is good, but it's not always the case. Do you mind posting a change history on the site, maybe similar to how it's done (by me) on http://www.mediawiki.org/wiki/Header_Tabs#CHANGES but with revision numbers instead of links? Maybe with a link to the viewvc interface, like this:
http://svn.wikimedia.org/viewvc/mediawiki?view=rev&revision=33778 - SMW 1.1.1

Or, maybe, just add it to http://svn.wikimedia.org/viewvc/mediawiki/trunk/extensions/SemanticMediaWiki/RELEASE-NOTES?view=markup


Sergey Chernyshev


>             Sergey

Markus Krötzsch
Institut AIFB, Universität Karlsruhe (TH), 76128 Karlsruhe
phone +49 (0)721 608 7362          fax +49 (0)721 608 5998
mak@aifb.uni-karlsruhe.de          www  http://korrekt.org