From: Kuys, G. <ger...@or...> - 2018-02-12 15:05:22
|
Hi Roland,

I looked into the matter and report the following:

* If there were a (datatype) property imdb, I would see it at http://mappings.dbpedia.org/index.php/OntologyProperty:Imdb , but there is no such property;
* There is, however, a property http://mappings.dbpedia.org/index.php/OntologyProperty:ImdbId , which is defined as a datatype property with owl:Thing as its domain (far too general; it should be dbo:Movie or, if it refers to directors, actors etc. as well, also dbo:Person or dbo:Organisation);
* The latter datatype property has xsd:string as its range, so it should be possible to store leading zeroes;
* Clearly, if the dbo:imdbId values lack leading zeroes, it is in the source (Wikipedia) that those should be supplied;
* Otherwise, but that sounds like an unwelcome patch-up to me, you could add SHACL constraints revealing every dbo:imdbId value that does not have the required number of digits, and afterwards do a SPARQL update for all of those (tricky business, but feasible);
* As far as I am concerned, the correct solution would be:
  * Correct wherever possible in the source, otherwise you will have to repeat your actions with every extraction cycle;
  * Model every identifier as a blank node of type dbo:Identifier, in which you can record not only which identifier is there, but also what it is required to look like;
  * Identifiers should be treated as the more complex things they really are, for even if you have no context at all you should include clues as to what they really should refer to;
* That doesn't help for now, I know. But hopefully we will soon have a plan for how to get to the 'next generation DBpedia ontology' (apologies for the marketeers' language).

Kind regards,
Gerard

________________________________
From: Roland Cornelissen <met...@gm...>
Sent: Saturday 10 February 2018 14:37:50
To: dbp...@li...
Subject: Re: [Dbpedia-dutch] Errors and Quality issues

Another quality issue: IMDB nrs. are missing leading zeroes.
In order to construct a link from the provided identifier (http://nl.dbpedia.org/property/imdb), the identifier needs to contain 7 digits. When fewer than 7 digits are present, the string should be padded with leading zeroes in order to construct a working link to IMDB. Example: 87060 needs 2 leading zeroes: http://www.imdb.com/title/tt0087060

On 02-03-17 11:21, Roland Cornelissen wrote:

Hi,

I would like to register quality issues on the Dutch DBpedia data in this list, in order to have a record of issues we can work on to improve. Sometimes you run into data peculiarities that need some extra attention, e.g.:

1. Error in the mapping of type:
   * http://nl.dbpedia.org/resource/Binnenstad_(Leiden) where type is http://dbpedia.org/ontology/EthnicGroup : this is wrong and should be some specialisation of the type Place, like http://dbpedia.org/ontology/District
   * ....

I am pretty sure I have seen more of these errors in the data and will register those here when I bump into them again. Please do so too when you meet one of those!

Thanks,
Roland

_______________________________________________
Dbpedia-dutch mailing list
Dbp...@li...
https://lists.sourceforge.net/lists/listinfo/dbpedia-dutch

--
metamatter b.v. | Drs. Roland Cornelissen | Weersterweg 12 | 9832TE | Den Horn | T +31 (0)50 5515369 | M +31 (0)6 14797518 | www.metamatter.nl

Disclaimer
This e-mail and any attachments are confidential and are solely intended for the addressee. If you are not the intended recipient, please notify the sender and delete and/or destroy this message and any attachments immediately. It is prohibited to copy, to distribute, to disclose or to use this e-mail and any attachments in any other way. Ordina N.V. and/or its group companies do not accept any responsibility nor liability for any damage resulting from the content of and/or the transmission of this message. |
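
For anyone consuming the data before it is corrected at the source, the padding rule described in the thread (7 digits, leading zeroes, then the `tt` prefix) can be applied on the client side. The following is only a minimal sketch: the function name `imdb_title_url` and the constant `IMDB_TITLE_URL` are hypothetical, and the URL pattern is taken from the single example given above, so it is assumed to hold for title identifiers only.

```python
# URL pattern copied from the example in the thread (title pages only; assumption).
IMDB_TITLE_URL = "http://www.imdb.com/title/tt{}"


def imdb_title_url(raw_id: str, width: int = 7) -> str:
    """Left-pad a numeric IMDB identifier with zeroes and build a title link.

    `width` defaults to the 7 digits mentioned in the thread.
    """
    digits = raw_id.strip()
    # Reject anything that is not a plain run of ASCII digits, so that
    # already-prefixed or malformed values are not silently "repaired".
    if not (digits.isascii() and digits.isdigit()):
        raise ValueError(f"not a numeric identifier: {raw_id!r}")
    # zfill only adds zeroes; identifiers that are already long enough are kept as-is.
    return IMDB_TITLE_URL.format(digits.zfill(width))


print(imdb_title_url("87060"))  # http://www.imdb.com/title/tt0087060
```

This only works around the symptom for one consumer; as noted in the thread, values fixed this way would have to be fixed again after every extraction cycle unless the source is corrected.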