2010-08-30 09:26:14 PDT
Hi,
On the page
http://sw.opencyc.org/ the downloadable OpenCyc OWL files are dated 2008-06-10, but the semantic web endpoint dataset (used on the sw.opencyc.org site) is much more up to date, perhaps based on the OpenCyc 2.0 release of 2009-07-27.
Are there plans to release a more up-to-date version of the downloadable OpenCyc OWL files? If so, what time frame are we looking at for this update?
The problem with the OpenCyc OWL files currently available (2008-06-10 dump) is that they contain many errors and inconsistencies. Specifically, both the wikipediaArticleUrl and the sameAs dbpedia.org mappings cannot be trusted in many cases. These errors are not present in the dataset used by the web-based endpoint.
A handful of examples:
entity id: Mx4r-pa52L7VQdecB-mbh_CI7Q
label: 1,2-dimethylhydrazine
bad wikipediaArticleUrl:
http://en.wikipedia.org/wiki/12-inch_single
bad sameAs:
http://dbpedia.org/resource/12-inch_single
-- same mistakes not present at
http://sw.opencyc.org/concept/Mx4r-pa52L7VQdecB-mbh_CI7Q
entity id: Mx4rOOVVsL7GQdeY5f287_-Xgw
label: 2,4-tolueneDiisocyanate
bad wikipediaArticleUrl:
http://en.wikipedia.org/wiki/24_%28TV_series%29
bad sameAs:
http://dbpedia.org/resource/24_%28TV_series%29
-- same mistakes not present at
http://sw.opencyc.org/concept/Mx4rOOVVsL7GQdeY5f287_-Xgw
entity id: Mx4rvjHrCpwpEbGdrcN5Y29ycA
label: AH1J (a type of helicopter)
bad wikipediaArticleUrl:
http://en.wikipedia.org/wiki/Ampere-hour
bad sameAs:
http://dbpedia.org/resource/Ampere-hour
-- ok at
http://sw.opencyc.org/concept/Mx4rvjHrCpwpEbGdrcN5Y29ycA
entity id: Mx4rGNN6ShPiEdqAAAACs6hfSg
label: aboveground thing
bad wikipediaArticleUrl:
http://en.wikipedia.org/wiki/Mass_Rapid_Transit_%28Singapore%29
bad sameAs:
http://dbpedia.org/resource/Mass_Rapid_Transit_%28Singapore%29
-- ok at
http://sw.opencyc.org/concept/Mx4rGNN6ShPiEdqAAAACs6hfSg
entity id: Mx4rvViadJwpEbGdrcN5Y29ycA
label: aggressing (cyc label: AdvancingOnSportsOpponent)
bad wikipediaArticleUrl:
http://en.wikipedia.org/wiki/Fronted_%28phonetics%29
bad sameAs:
http://dbpedia.org/resource/Fronted_%28phonetics%29
-- error not present in live-data at
http://sw.opencyc.org/
entity id: Mx4rc-XS8IJLQdeU9IWdf04C6A
label: professional counselling
bad wikipediaArticleUrl:
http://en.wikipedia.org/wiki/Commodity_trading_advisor
bad sameAs:
http://dbpedia.org/resource/Commodity_trading_advisor
entity id: Mx4rvsaaeJwpEbGdrcN5Y29ycA
label: generic agent
bad wikipediaArticleUrl:
http://en.wikipedia.org/wiki/World_Health_Organization
This problem extends throughout the knowledge base contained in the currently available OpenCyc OWL downloads.
Having such erroneous mappings makes this a far less useful resource than it otherwise would be.
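One way mismatches like the examples above could be flagged mechanically is sketched below in Python. This is only a rough illustration under stated assumptions: the function names (`title_tokens`, `label_tokens`, `is_suspect`) are hypothetical, and the heuristic (flag a mapping when the label and the article title share no word at all) is crude. It will miss some bad mappings and will flag some legitimate ones where the label is a synonym of the article title.

```python
import re
from urllib.parse import unquote, urlparse


def title_tokens(url):
    """Lowercased word tokens from the last path segment of a Wikipedia or DBpedia URL."""
    segment = unquote(urlparse(url).path.rsplit("/", 1)[-1])
    return set(re.findall(r"[a-z0-9]+", segment.lower()))


def label_tokens(label):
    """Lowercased word tokens from a concept label."""
    return set(re.findall(r"[a-z0-9]+", label.lower()))


def is_suspect(label, url):
    """Assumed heuristic: suspect when label and article title share no token."""
    return not (label_tokens(label) & title_tokens(url))


# Label / URL pairs taken from the examples in this message.
examples = [
    ("1,2-dimethylhydrazine", "http://en.wikipedia.org/wiki/12-inch_single"),
    ("2,4-tolueneDiisocyanate", "http://en.wikipedia.org/wiki/24_%28TV_series%29"),
    ("AH1J (a type of helicopter)", "http://en.wikipedia.org/wiki/Ampere-hour"),
    ("generic agent", "http://en.wikipedia.org/wiki/World_Health_Organization"),
]

for label, url in examples:
    print(label, "->", url, "suspect" if is_suspect(label, url) else "plausible")
```

A check like this only narrows down candidates for manual review; comparing against the live data at sw.opencyc.org remains the more reliable test.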
I am looking forward to the release of a more up-to-date, cleaned-up OWL dump.
Many thanks,
/Jim