On 2/3/11 10:03 AM, Paul Houle wrote:
> On 2/2/2011 4:10 PM, Lushan Han wrote:
>> FYI, the class type dbpedia-owl:City is missing for capitals, for
>> example, http://dbpedia.org/page/London.
>> And the example on the DBpedia website "Cities with more than 2 million
>> habitants" therefore failed to give out capitals.
>> Best regards,
> Here we go again.
> (1) There's a fundamental ontological problem here. Technically
> London (like Tokyo) is not a city. London is a metropolitan area that
> is composed of 33 boroughs such as Westminster, Kensington, Hackney
> and Camden. The actual "City of London" is the financial district and
> is about a square mile in area.
> (2) DBpedia has poor recall for many common types such as human
> settlements and people. The underlying issue is that it extracts type
> information from infoboxes, which are used inconsistently... There
> isn't a "city infobox", but rather, there are different infoboxes that
> are used in different regions and different areas. The signal is
> imperfect (many people have no infoboxes at all) and the set of rules
> that DBpedia uses to extract types is also imperfect. The flip side is
> that the precision of types in DBpedia is absolutely excellent, and
> I've found quite literally a handful of cases where things were mistyped
> in a blatantly wrong way.
So why don't you make a linkset that addresses these issues? You can
tweak the DBpedia TBox or make your own. I can load it into a Named
Graph distinct from the main DBpedia graph. Then it can be evaluated en
route to becoming part of the main Graph, if you choose.
I performed a similar exercise (which I hope becomes the norm) with
@danbri a few days ago. This process is a nice stop-gap while Wikipedia
evolves re. structured data.
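As a rough illustration of what such a linkset might contain, the sketch below writes rdf:type statements as N-Quads, so that each statement carries its own graph name and can be loaded apart from the main DBpedia graph. The graph IRI is a hypothetical placeholder, and the resource list is an assumption: London comes from the report above, and Tokyo is included only as a second example.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
DBO_CITY = "http://dbpedia.org/ontology/City"  # dbpedia-owl:City

# Hypothetical graph IRI; any IRI distinct from the main DBpedia graph would do.
LINKSET_GRAPH = "http://example.org/graphs/city-types"


def type_quad(resource, rdf_class, graph):
    """Return one N-Quads line asserting `resource rdf:type rdf_class` in `graph`."""
    return f"<{resource}> <{RDF_TYPE}> <{rdf_class}> <{graph}> ."


# Illustrative list of resources that should receive the City type.
missing_city_type = [
    "http://dbpedia.org/resource/London",
    "http://dbpedia.org/resource/Tokyo",
]

quads = [type_quad(r, DBO_CITY, LINKSET_GRAPH) for r in missing_city_type]
print("\n".join(quads))
```

Keeping the additions in their own graph means they can be queried, reviewed, or dropped without touching the extracted data.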
> The answer to (1) in commonsense reasoning systems is to maintain
> "vernacular types" that reflect popular understandings.
> It's still tricky; the classification of human settlements is
> difficult because there's no clear line between "city", "town" and
> "village"; people in other language areas, such as de, have concepts
> that are similar but different, such as "stadt" and "dorf". A
> vernacular type that would work in the en-zone is to say, "anything
> that has "town" in its name is a :Town" but a place that's called a
> "Town" in the U.S. could be a small city, a village, a rural area
> where 20-30% of people live in a few concentrated areas (the "Town" that
> I write a tax check to every year), or a centerless suburban or
> posturban area like Derry, N.H.
> In New York State there are approximately 20 types of local
> government, and the law for the establishment of local governments is
> different in all 50 states of the :United_States, and different in the
> 200 or so other countries that are out there. One could imagine a very
> detailed data model that represents this very precisely, but it would
> be a difficult model to work with and you'd still need some kind of
> vernacular layer to make it easier to work with.
> As for (2) the easy thing to do is get your types from Freebase.
> Precision in Freebase is slightly worse than in DBpedia, but recall is
> better by a factor of 2 or more for many types. Freebase has used both
> machine learning and crowdsourcing techniques to produce a type system
> that's easy to work with.
Yes, so make a linkbase for now as I suggested.
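Before a linkbase like that is merged, its type assertions could be checked against a hand-verified sample using the same precision and recall measures discussed above. The sketch below is one minimal way to do that; the function name and the data are invented for illustration and are not measurements of DBpedia or Freebase.

```python
def precision_recall(predicted, gold):
    """Compare predicted instances of a type against a hand-checked gold set."""
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


# Toy data, invented for illustration only.
gold_cities = {"A", "B", "C", "D"}
source_one = {"A", "B"}            # few assertions, all correct
source_two = {"A", "B", "C", "X"}  # more assertions, one of them wrong

p1, r1 = precision_recall(source_one, gold_cities)
p2, r2 = precision_recall(source_two, gold_cities)
print(f"source_one: precision={p1:.2f} recall={r1:.2f}")
print(f"source_two: precision={p2:.2f} recall={r2:.2f}")
```

In this toy case the first source is perfectly precise but finds only half the cities, while the second trades a little precision for better recall, which is the shape of the trade-off described in the quoted message.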
> Dbpedia-discussion mailing list