From: <wfi...@ts...> - 2009-01-21 18:13:57
Dear OWL-API experts,

I am wondering what experience people have with extracting/importing data from text files and incorporating that data as new asserted knowledge in an ontology.

An example to clarify. Suppose I have a text file (containing 3 lines) that holds some data about people and whether they own a car:

    Person=Mary Car=Nissan CarYear=2006
    Person=Susan Car=Mazda CarYear=2009
    Person=Tom

An ontology to represent these facts could be defined as follows:

    Class Person
    Class Car
    Class Person and Class Car are DISJOINT
    Object Property owns (Domain = Person, Range = Car)
    Class Person has a Universal restriction on the object property "owns". This is because a person may or may not own a car at this time, Tom for example.
    DataType Property hasName (Domain = Person, Range = String)
    DataType Property hasCarMake (Domain = Car, Range = String)
    DataType Property hasCarYear (Domain = Car, Range = Int)

(The complete example ontology was developed using Protege 3.4 beta build 519 and is located at http://www.williamfitzgerald.info/ontology/person.owl)

QUESTION: having built my ontology (albeit via Protege and not directly via the OWL-API), how do I populate it from a data file with individuals and their instantiated properties via the OWL-API? That is, importing the above data file into the ontology as:

    Individuals mary, susan, tom
    mary hasName "Mary"
    susan hasName "Susan"
    tom hasName "Tom"
    mary owns car1
    susan owns car2
    car1 hasCarMake "Nissan"
    car1 hasCarYear 2006
    car2 hasCarMake "Mazda"
    car2 hasCarYear 2009

Having built a taxonomy (classes) and created various properties to reflect the kind of knowledge I want to capture about the data in the text file, how do I create the above individuals using the OWL-API? Is there some sort of (magic/automatic) parsing API that I can readily use? Or, more realistically, must I develop my own Java-based file reading/parsing using Java IO in conjunction with the OWL-API?
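A minimal sketch of the file-reading half of that approach, using only the Java standard library and assuming each line consists of whitespace-separated key=value tokens exactly as in the three-line sample. The class and method names (PersonFileParser, parseLine, parseLines) are hypothetical and are not part of the OWL-API.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: turns lines such as "Person=Mary Car=Nissan CarYear=2006"
// into ordered key/value maps. It knows nothing about OWL.
class PersonFileParser {

    // Parses a single line into a map that preserves the order of the keys.
    static Map<String, String> parseLine(String line) {
        Map<String, String> fields = new LinkedHashMap<>();
        for (String token : line.trim().split("\\s+")) {
            int eq = token.indexOf('=');
            if (eq <= 0) {
                continue; // not a key=value token (or an empty key), so skip it
            }
            fields.put(token.substring(0, eq), token.substring(eq + 1));
        }
        return fields;
    }

    // Parses every non-blank line; one map per line of the data file.
    static List<Map<String, String>> parseLines(List<String> lines) {
        List<Map<String, String>> records = new ArrayList<>();
        for (String line : lines) {
            if (!line.trim().isEmpty()) {
                records.add(parseLine(line));
            }
        }
        return records;
    }
}
```

In practice the lines could come from java.nio.file.Files.readAllLines, and the resulting maps would then be handed to whatever code creates the individuals.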
So, on encountering the text "Person=", I would use the OWL-API to create an individual of class Person (for simplicity named after the text found in the file, e.g. mary) and at the same time set the value of its hasName property to the parsed name string. I would also set the "owns" property should that person own a car. (Note that, for simplicity in the above example, I call the individuals mary, susan and tom. I know that this has no significance from a reasoner's point of view and they could just as easily have been called person1, person2 and person3.)

Similarly, on encountering the texts "Car=" and "CarYear=", I would use the OWL-API to create an individual (carX) and set the values of its two datatype properties to the appropriate string and integer.

Is that how it's usually done? My gut feeling is that a home-brew of Java file parsing, which then feeds that information to the OWL-API libraries, is the path to take.

I understand that a text data/log file can have multiple formats. For example, the file could have had the following structure:

    Mary, Nissan, 2006, Susan, Mazda, 2009, Tom

As a result, I appreciate that the OWL-API may not have one-size-fits-all data file parsing (if it has any at all) in order to input new knowledge into the ontology.

To give you some context, in reality I am looking to load various network security log files (reasonably large files of different formats!) into an ontology I have created.
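The procedure described above can be sketched without committing to any particular OWL-API version. The hypothetical AssertionBuilder below takes already-parsed key/value records and emits plain-text statements in the same shape as the target list (mary hasName "Mary", mary owns car1, and so on). It is an assumption of this sketch that each emitted statement would later be replaced by the corresponding assertion axiom built through the OWL-API; the "type" statements are likewise an illustrative stand-in for class membership.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper: maps parsed records to textual assertions.
// Each string stands in for an axiom that real code would add to the ontology.
class AssertionBuilder {

    static List<String> toAssertions(List<Map<String, String>> records) {
        List<String> out = new ArrayList<>();
        int carCounter = 0; // used to name car individuals car1, car2, ...
        for (Map<String, String> rec : records) {
            String name = rec.get("Person");
            if (name == null) {
                continue; // a record without a person is ignored in this sketch
            }
            String person = name.toLowerCase(Locale.ROOT); // individual named after the text
            out.add(person + " type Person");
            out.add(person + " hasName \"" + name + "\"");

            String make = rec.get("Car");
            if (make != null) { // only people who own a car get a car individual
                String car = "car" + (++carCounter);
                out.add(car + " type Car");
                out.add(person + " owns " + car);
                out.add(car + " hasCarMake \"" + make + "\"");
                String year = rec.get("CarYear");
                if (year != null) {
                    // parse so that a non-numeric year fails early
                    out.add(car + " hasCarYear " + Integer.parseInt(year));
                }
            }
        }
        return out;
    }
}
```

Keeping the mapping separate from the file parsing means a different input layout, such as the comma-separated variant, would only need a new parser that produces the same records.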
For example, an iptables firewall log entry could be the following:

    Feb 25 12:11:24 myFirewall: INBOUND TCP: IN=br0 PHYSIN=eth0 OUT=br0 PHYSOUT=eth1 SRC=220.228.136.38 DST=11.11.79.83 LEN=64 TOS=0x00 PREC=0x00 TTL=47 ID=17159 DF PROTO=TCP SPT=1629 DPT=139 WINDOW=44620 RES=0x00 SYN URGP=0

A snort intrusion detection log entry might look as follows:

    Feb 25 13:09:34 myIntrusionSystem: [1:2001669:1] BLEEDING-EDGE Web Proxy GET Request [Classification: Potentially Bad Traffic] [Priority: 2]: {TCP} 220.170.88.36:3047 -> 11.11.79.82:80

All feedback/experiences on how to import/parse data as new knowledge into a predefined ontology (such as the person.owl example) are greatly welcomed. I think my next step in ontology engineering is to work with the OWL-API directly, to gain more fine-grained control over ontology manipulation.

Eagerly awaiting your response,
Will.
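For the two log formats shown, a regular-expression pass is one plausible way to get from raw lines to fields. The sketch below is standard-library Java only; the class name LogLineParser, its method names, and the field keys chosen for the snort flow (PROTO, SRC, SPT, DST, DPT, borrowed from the iptables entry) are assumptions for illustration. It extracts only the KEY=value pairs of an iptables entry and the protocol/address/port part of a snort entry, and it ignores bare flags such as DF and SYN.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for the two sample log formats; not tied to the OWL-API.
class LogLineParser {

    // Upper-case key, '=', then everything up to the next whitespace.
    private static final Pattern KEY_VALUE = Pattern.compile("\\b([A-Z]+)=(\\S*)");

    // Matches e.g. "{TCP} 220.170.88.36:3047 -> 11.11.79.82:80".
    private static final Pattern SNORT_FLOW =
            Pattern.compile("\\{(\\w+)\\} (\\S+):(\\d+) -> (\\S+):(\\d+)");

    // Collects every KEY=value pair of an iptables log entry, in order of appearance.
    static Map<String, String> parseIptables(String line) {
        Map<String, String> fields = new LinkedHashMap<>();
        Matcher m = KEY_VALUE.matcher(line);
        while (m.find()) {
            fields.put(m.group(1), m.group(2));
        }
        return fields;
    }

    // Extracts protocol, source and destination from a snort entry;
    // returns an empty map when the flow part is not found.
    static Map<String, String> parseSnortFlow(String line) {
        Map<String, String> fields = new LinkedHashMap<>();
        Matcher m = SNORT_FLOW.matcher(line);
        if (m.find()) {
            fields.put("PROTO", m.group(1));
            fields.put("SRC", m.group(2));
            fields.put("SPT", m.group(3));
            fields.put("DST", m.group(4));
            fields.put("DPT", m.group(5));
        }
        return fields;
    }
}
```

Since both parsers return the same kind of key/value map, the step that feeds values into the ontology could stay the same across log formats.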