From: <wfi...@ts...> - 2009-01-23 13:24:28
Hi all,

In reply to my own query, I wonder whether it would be better to work at
the XML level and avoid populating the ontology, as described previously,
via a hybrid of Java file I/O and the Java OWL-API.

For example, I could write (whether it is practical or not) a Perl script
that reads and parses data from text-based log files and then writes this
data in XML format back into the predefined OWL-DL ontology.
(Predefined ontology: classes and properties already created using Protege.)

Again, I would appreciate any feedback on how best to populate an ontology
with data (individuals) from existing external resources (in particular
text files).

regards,
Will.

> Dear OWL-API experts.
>
> I am wondering what experiences people have in extracting/importing data
> from text files and incorporating that data as new asserted knowledge in
> an ontology.
>
> An example to clarify.
>
> Suppose I have a text file (containing 3 lines) that holds some data about
> a person and whether they own a car:
>
> Person=Mary Car=Nissan CarYear=2006
> Person=Susan Car=Mazda CarYear=2009
> Person=Tom
>
> An ontology to represent these data facts could be defined as follows:
>
> Class Person
> Class Car
>
> Class Person and Class Car are DISJOINT
>
> Object Property owns (Domain = Person, Range = Car)
>
> Class Person has a Universal restriction on the object property "owns".
> This is because a person may or may not own a car at this time, Tom for
> example.
>
> DataType Property hasName (Domain = Person, Range = String)
> DataType Property hasCarMake (Domain = Car, Range = String)
> DataType Property hasCarYear (Domain = Car, Range = Int)
>
> (The complete example ontology was developed using Protege 3.4 beta build
> 519 and is located at http://www.williamfitzgerald.info/ontology/person.owl)
>
> QUESTION:
> Having built my ontology (albeit via Protege and not directly via the
> OWL-API), how do you populate the ontology from a data file with
> individuals and instantiated properties associated with them via the
> OWL-API?
>
> Thus, importing the above data file into the ontology as:
>
> Individuals mary, susan, tom
>
> mary hasName "Mary"
> susan hasName "Susan"
> tom hasName "Tom"
>
> mary owns car1
> susan owns car2
>
> car1 hasCarMake "Nissan"
> car1 hasCarYear 2006
>
> car2 hasCarMake "Mazda"
> car2 hasCarYear 2009
>
> Having built a taxonomy (classes) and having created various properties to
> reflect the kind of knowledge I want to capture about the data in the text
> file, how do I create the above individuals using the OWL-API?
>
> Is there some sort of (magic/automatic) parsing API that I can use
> readily?
>
> Or, more realistically, must I develop my own specific Java-based file
> reading/parsing via Java IO support in conjunction with the OWL-API? So when
> encountering the text "Person=", I use the OWL-API to create an individual
> of class Person (for simplicity named after the text found in the file,
> e.g. mary) and then set its hasName property to a string equal to the text
> value parsed. I would also set the "owns" property should that person own
> a car. (Note that for simplicity of the above example I call the
> individuals mary, susan and tom. I know that this has no significance from
> a reasoner support point of view and they could just as easily have been
> called person1, person2 and person3.)
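A rough sketch of the parsing half of that idea, in plain Java with no OWL-API calls at all. It reads the "Person=... Car=... CarYear=..." lines and produces the intended assertions as plain strings; each string marks the point where the matching OWL-API axiom would be created, and that hand-off is deliberately left out here. The class and method names (PersonFileSketch, parseLine, toAssertions) and the "type" pseudo-predicate are my own assumptions for illustration, not anything taken from the OWL-API.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

/** Hypothetical helper: turns "Person=Mary Car=Nissan CarYear=2006" lines into assertion strings. */
class PersonFileSketch {

    /** Splits one line into key/value pairs, keeping the order found in the file. */
    static Map<String, String> parseLine(String line) {
        Map<String, String> fields = new LinkedHashMap<>();
        for (String token : line.trim().split("\\s+")) {
            int eq = token.indexOf('=');
            if (eq > 0) {
                fields.put(token.substring(0, eq), token.substring(eq + 1));
            }
        }
        return fields;
    }

    /** Builds the planned assertions; car individuals are numbered car1, car2, ... */
    static List<String> toAssertions(List<String> lines) {
        List<String> out = new ArrayList<>();
        int carCount = 0;
        for (String line : lines) {
            Map<String, String> f = parseLine(line);
            String name = f.get("Person");
            if (name == null) {
                continue; // not a person record, skip it
            }
            String person = name.toLowerCase(Locale.ROOT);
            out.add(person + " type Person");
            out.add(person + " hasName \"" + name + "\"");

            String make = f.get("Car");
            if (make != null) {
                String car = "car" + (++carCount);
                out.add(car + " type Car");
                out.add(person + " owns " + car);
                out.add(car + " hasCarMake \"" + make + "\"");
                String year = f.get("CarYear");
                if (year != null) {
                    // parse as int because hasCarYear has an integer range
                    out.add(car + " hasCarYear " + Integer.parseInt(year));
                }
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> lines = List.of(
                "Person=Mary Car=Nissan CarYear=2006",
                "Person=Susan Car=Mazda CarYear=2009",
                "Person=Tom");
        toAssertions(lines).forEach(System.out::println);
    }
}
```

Keeping the parsing separate from the ontology code means a different file layout (for instance the comma-separated variant) only needs a different parseLine, while the part that creates individuals stays the same.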
>
> Similarly, on encountering the texts "Car=" and "CarYear=", I use the
> OWL-API to create an individual (carX) and set the values of its 2 datatype
> properties to the appropriate string and integer values.
>
> Is that how it's usually done? My gut feeling is that a home-brew of Java
> file parsing, feeding the parsed information to the OWL-API libraries, is
> the path to take.
>
> I understand a text data/log file can have multiple formats, for example,
> the file could have had the following structure:
> Mary, Nissan, 2006,
> Susan, Mazda, 2009,
> Tom
>
> As a result I appreciate that the OWL-API may not have one-size-fits-all
> data file parsing (if any at all) in order to input new knowledge into the
> ontology.
>
> To provide you with some context, in reality I am looking to load various
> network security log files (reasonably large files of different formats!)
> into an ontology I have created.
>
> For example, an iptables firewall log entry could be the following:
> Feb 25 12:11:24 myFirewall: INBOUND TCP: IN=br0 PHYSIN=eth0 OUT=br0
> PHYSOUT=eth1 SRC=220.228.136.38 DST=11.11.79.83 LEN=64 TOS=0x00 PREC=0x00
> TTL=47 ID=17159 DF PROTO=TCP SPT=1629 DPT=139 WINDOW=44620 RES=0x00 SYN
> URGP=0
>
> For example, a snort intrusion detection log entry might look as follows:
> Feb 25 13:09:34 myIntrusionSystem: [1:2001669:1] BLEEDING-EDGE Web Proxy
> GET Request [Classification: Potentially Bad Traffic] [Priority: 2]: {TCP}
> 220.170.88.36:3047 -> 11.11.79.82:80
>
>
> All feedback/experiences on how to import/parse data as new knowledge of a
> predefined ontology (such as the Person.owl example) are very welcome.
>
> I think my next step in ontology engineering is to work with the OWL-API
> directly to gain more fine-grained control over the ontology
> manipulation.
>
> Eagerly awaiting your response,
> Will.
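For the two log entries quoted above, here is a rough sketch of what the per-format parsing step might look like in plain Java. It only extracts fields into a map; turning those fields into individuals and property values is left out. It assumes the log entry has been joined back into a single line, and it has only been shaped around the two sample entries, so real logs may need a more tolerant pattern. All names here (SecurityLogSketch, parseIptables, parseSnort, the "FLAGS" key and the snort field names) are hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper: one small parser per log format, each returning a field map. */
class SecurityLogSketch {

    /** iptables: KEY=VALUE tokens after the prefix; bare tokens (DF, SYN) are collected as flags. */
    static Map<String, String> parseIptables(String line) {
        Map<String, String> fields = new LinkedHashMap<>();
        int start = line.indexOf(" IN=");
        if (start < 0) {
            return fields; // not an iptables entry in the expected shape
        }
        List<String> flags = new ArrayList<>();
        for (String token : line.substring(start).trim().split("\\s+")) {
            int eq = token.indexOf('=');
            if (eq > 0) {
                fields.put(token.substring(0, eq), token.substring(eq + 1));
            } else if (!token.isEmpty()) {
                flags.add(token);
            }
        }
        fields.put("FLAGS", String.join(",", flags)); // "FLAGS" is a made-up key
        return fields;
    }

    // Pattern written against the single sample snort entry; an assumption, not a full grammar.
    private static final Pattern SNORT = Pattern.compile(
            "\\[(\\d+):(\\d+):(\\d+)\\]\\s+(.*?)\\s+\\[Classification:\\s*([^\\]]+)\\]"
            + "\\s+\\[Priority:\\s*(\\d+)\\]:\\s+\\{(\\w+)\\}"
            + "\\s+(\\S+):(\\d+)\\s+->\\s+(\\S+):(\\d+)");

    /** snort: rule id, message, classification, priority, protocol and both endpoints. */
    static Map<String, String> parseSnort(String line) {
        Map<String, String> fields = new LinkedHashMap<>();
        Matcher m = SNORT.matcher(line);
        if (!m.find()) {
            return fields; // leave the map empty when the line does not match
        }
        fields.put("sid", m.group(2));
        fields.put("message", m.group(4));
        fields.put("classification", m.group(5));
        fields.put("priority", m.group(6));
        fields.put("protocol", m.group(7));
        fields.put("srcIp", m.group(8));
        fields.put("srcPort", m.group(9));
        fields.put("dstIp", m.group(10));
        fields.put("dstPort", m.group(11));
        return fields;
    }

    public static void main(String[] args) {
        String fw = "Feb 25 12:11:24 myFirewall: INBOUND TCP: IN=br0 PHYSIN=eth0 OUT=br0 "
                + "PHYSOUT=eth1 SRC=220.228.136.38 DST=11.11.79.83 LEN=64 TOS=0x00 PREC=0x00 "
                + "TTL=47 ID=17159 DF PROTO=TCP SPT=1629 DPT=139 WINDOW=44620 RES=0x00 SYN URGP=0";
        String ids = "Feb 25 13:09:34 myIntrusionSystem: [1:2001669:1] BLEEDING-EDGE Web Proxy "
                + "GET Request [Classification: Potentially Bad Traffic] [Priority: 2]: {TCP} "
                + "220.170.88.36:3047 -> 11.11.79.82:80";
        System.out.println(parseIptables(fw));
        System.out.println(parseSnort(ids));
    }
}
```

Both parsers return the same kind of map, so the code that later populates the ontology would not need to know which log format a record came from.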