[DL-Learner SVN] SF.net SVN: dl-learner: [95] trunk/doc/configOptions.txt

Revision: 95
          http://dl-learner.svn.sourceforge.net/dl-learner/?rev=95&view=rev
Author:   jenslehmann
Date:     2007-08-28 07:56:37 -0700 (Tue, 28 Aug 2007)

Log Message:
-----------
added overview about configuration options

Added Paths:
-----------
    trunk/doc/configOptions.txt

Added: trunk/doc/configOptions.txt
===================================================================

--- trunk/doc/configOptions.txt	                        (rev 0)
+++ trunk/doc/configOptions.txt	2007-08-28 14:56:37 UTC (rev 95)
@@ -0,0 +1,294 @@
+Configuration Files
+===================
+
+This file gives an overview of running DL-Learner using configuration files
+as provided in the examples directory.
+
+The background knowledge can either be given as an OWL DL file (using the
+import function in the configuration files) or by specifying it directly in the
+configuration file (which we refer to as the internal knowledge base).
+
+Some examples of the syntax of the background knowledge in the internal 
+knowledge base:
+
+person = (male OR female).
+mother = (female AND EXISTS hasChild.TOP).
+motherManyDaughters = (female AND >= 4 hasChild.female).
+(mother AND father) SUBCLASSOF person.
+
+Also see the example files.
+
+This is the EBNF description of the input language [slightly outdated]:
+
+Number = ["1"-"9"] (["0"-"9"])*
+Id = ["a"-"z"] (["_","a"-"z","A"-"Z","0"-"9"])*
+String = "\"" (~["\"","\\","\n","\r"])* "\""
+Instruction =   ConfOption 
+              | FunctionCall 
+              | PosExample 
+              | NegExample 
+              | ABoxConcept 
+              | ABoxRole 
+              | Transitive 
+              | Functional 
+              | Symmetric 
+              | Inverse 
+              | Subrole 
+              | TBoxEquiv 
+              | TBoxSub
+ConfOption = Id [ "." Id ] "="  ( Id | Number ) ";"
+FunctionCall = Id "(" String ")" ";"
+PosExample = "+" Id "(" Id ")" "."
+NegExample = "-" Id "(" Id ")" "."
+ABoxConcept = Concept "(" Id ")" "."
+ABoxRole = Id "(" Id "," Id ")" "."
+Transitive = "Transitive" "(" Id ")" "."
+Functional = "Functional" "(" Id ")" "."
+Symmetric = "Symmetric" "(" Id ")" "."
+Inverse = "Inverse" "(" Id "," Id ")" "."
+Subrole = "Subrole" "(" Id "," Id ")" "."
+TBoxEquiv = Concept "=" Concept "."
+TBoxSub = Concept ("SUBCLASSOF" | "SUB" ) Concept "."
+Concept =   "TOP"
+          | "BOTTOM"
+          | Id
+          | "(" Concept "AND" Concept ")"
+          | "(" Concept "OR" Concept ")"
+          | "EXISTS" Id "." Concept
+          | "ALL" Id "." Concept
+          | "NOT" Concept
+          | ">=" Number Id "." Concept
+          | "<=" Number Id "." Concept
+
+Configuration Options
+=====================
+
+General
+-------
+
+Option: algorithm
+Possible Values: bruteForce, gp, random, refinement, hybridGP
+Default: refinement
+Effect: Specifies the algorithm to use for solving the learning problem. Note
+        that hybridGP is not an algorithm itself, but starts the GP algorithm
+        with a sensible set of default values for the hybrid algorithm combining
+        GP with refinement operators. In particular the probability of all
+        operators except refinement is set to 0.
+
+Option: reasoner
+Possible Values: dig, kaon2, fastRetrieval
+Default: dig
+Effect: Specifies the reasoner to be used. DIG communicates with a reasoner
+        using the DIG Interface. KAON2 means to use the KAON2 Java API directly.
+        FastRetrieval is an internal algorithm, which can only be used for 
+        retrieval (not for subsumption). Currently the DIG reasoner cannot read
+        OWL files.
+
+Option: digReasonerURL
+Possible Values: a valid URL
+Default: http://localhost:8081
+Effect: Specifies the URL to be used to look for a DIG capable reasoner.
+
+Option: writeDIGProtocol
+Possible Values: true, false
+Default: false
+Effect: Specifies whether to store all DIG communication.
+
+Option: digProtocolFile
+Possible Values: strings
+Default: digProtocol.txt
+Effect: The file to store all DIG communication if writeDIGProtocol is true.
+
+Option: useRetrievalForClassification
+Possible Values: true, false
+Default: false
+Effect: To determine which examples a concept covers, one can either use one
+        retrieval or several instance checks (at most one for each example).
+        This option controls which of the two is used.
+
+Option: percentPerLengthUnit
+Possible Values: 0-1
+Default: 0.05
+Effect: The amount (as a fraction of classification accuracy) by which a
+        concept can be worse to justify an increase in length of 1. This
+        variable is used for GP and in refinement when the flexible heuristic
+        is used. For GP, you should use a value smaller than the default.
+
+> general options below are ignored <
+> by the refinement operator algorithm <
+
+Option: accuracyPenalty
+Possible Values: 1-1000
+Default: 1
+Effect: Sets the penalty for "small misclassifications".
+
+Option: errorPenalty
+Possible Values: 1-1000
+Default: 3
+Effect: Sets the penalty for classification errors.
+
+Option: maxLength
+Possible Values: 1-20
+Default: 7
+Effect: For the brute force learner this specifies the depth limit for the
+        search. The GP learner currently ignores it.
+
+Option: scoreMethod
+Possible Values: full, positive
+Default: positive
+Effect: The positive score method ignores negative examples that cannot be
+        classified. This is often useful because of the limited expressiveness
+        of SHIQ wrt. negated role assertions. The full method penalizes this.
+		
+Option: showCorrectClassifications
+Possible Values: true, false
+Default: false
+Effect: Controls whether correct classifications are printed (does not affect
+        the algorithm).
+		
+Option: penalizeNeutralExamples
+Possible Values: true, false
+Default: false
+Effect: If true there is a penalty if a neutral (neither positive nor negative)
+        individual is classified as either positive or negative. This should
+        usually be set to false.
+
+Refinement Operator Algorithm Specific
+--------------------------------------
+
+Option: refinement.horizontalExpansionFactor
+Possible Values: 0-1
+Default: 0.6
+Effect: Specifies the horizontal expansion factor.
+
+Option: refinement.writeSearchTree
+Possible Values: true, false
+Default: false
+Effect: Specifies whether to write the search tree to a file.
+
+Option: refinement.searchTreeFile
+Possible Values: strings
+Default: "searchTree.txt"
+Effect: Specifies a file to save the current search tree after each loop of
+        the refinement algorithm.
+
+Option: refinement.heuristic
+Possible Values: flexible, lexicographic
+Default: lexicographic
+Effect: The refinement operator together with a heuristic yields a learning
+        algorithm. The lexicographic heuristic uses a lexicographic order of
+        covered negative examples and horizontal expansion of a node (i.e.
+        the covered examples are the first criterion, the horizontal expansion
+        the second criterion). The flexible heuristic computes a combined node
+        score of both criteria. Note that the lexicographic heuristic needs a
+        horizontal expansion factor greater than 0 to ensure correctness of the
+        learning algorithm.
+
+Option: refinement.quiet
+Possible Values: true, false
+Default: false
+Effect: If set to true, no messages will be shown during the run of the 
+        algorithm (but there will still be startup and summary messages).
+
+Option: refinement.applyAllFilter
+Possible Values: true, false
+Default: true
+Effect: Specifies whether all equivalences should be used.
+
+Option: refinement.applyExistsFilter
+Possible Values: true, false
+Default: true
+Effect: Specifies whether exists equivalences should be used.
+
+Option: refinement.useTooWeakList
+Possible Values: true, false
+Default: true
+Effect: Specifies whether a too weak list should be used to reduce reasoner
+        requests.
+
+Option: refinement.useOverlyGeneralList
+Possible Values: true, false
+Default: true
+Effect: Specifies whether an overly general list should be used to reduce
+        reasoner requests.
+
+Option: refinement.useShortConceptConstruction
+Possible Values: true, false
+Default: true
+Effect: Specifies whether the algorithm should try to reduce a concept to a
+        known more general concept to reduce the number of necessary 
+        subsumption checks for the reasoner.
+
+Option: refinement.useDIGMultiInstanceChecks
+Possible Values: never, twoChecks, oneCheck
+Default: twoChecks
+Effect: The DIG protocol allows sending several queries to a DIG reasoner at
+        once. [This is automatically done for subsumption tests.] However,
+        for instance checks this has the disadvantage that it may not be
+        necessary to send all instances to the DIG reasoner if one of the
+        positive examples is not covered (meaning that the concept is
+        classified as too weak).
+        If the option is set to never, then each instance check is sent
+        separately.
+        If the option is set to twoChecks, then first all positive examples will
+        be sent in one query. If all of them are covered, i.e. the concept is
+        not classified as too weak, then all the negative examples are sent in
+        one query.
+        If the option is set to oneCheck, then all examples will be sent in one
+        query.
+
+Genetic Programming Specific
+----------------------------
+
+Option: gp.algorithmType
+Possible Values: steadyState, generational
+Default: steadyState
+Effect: Uses either a steady state (population partly replaced) or generational
+        (population completely replaced) algorithm.
+		
+Option: gp.elitism
+Possible Values: true, false
+Default: true
+Effect: If true, the GP algorithm uses elitism, i.e. the best individual is
+        guaranteed to survive.
+		
+Option: gp.numberOfIndividuals
+Possible Values: 1-1000000
+Default: 1000
+Effect: Sets the number of individuals in the population. A higher value
+        improves classification, but is computationally more expensive.
+		
+Option: gp.numberOfSelectedIndividuals
+Possible Values: 1-1000000
+Default: 960
+Effect: Sets the number of individuals that are selected for replacement in a
+        steady state GP algorithm.
+		
+Option: gp.crossoverPercent
+Possible Values: 0-100
+Default: 95
+Effect: The probability that offspring is produced using crossover (in contrast
+        to simply being copied over to the next generation).
+		
+Option: gp.mutationPercent
+Possible Values: 0-100
+Default: 3
+Effect: The probability that offspring is mutated after reproduction.
+
+Option: gp.hillClimbingPercent
+Possible Values: 0-100
+Default: 0
+Effect: The probability that offspring is produced using the hill climbing
+        operator.
+
+Option: gp.refinementPercent
+Possible Values: 0-100
+Default: 0
+Effect: The probability that offspring is produced using the genetic refinement
+        operator.
+
+Option: gp.postConvergenceGenerations
+Possible Values: 10-1000
+Default: 50
+Effect: If the algorithm does not find a better solution for this number of
+        generations it stops. 
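
For illustration, a minimal configuration file assembled from the grammar and
options documented in the diff above might look as follows. This is only a
sketch, not a tested example: the grammar is described as slightly outdated, so
the actual parser may differ. The concept and individual names (father, stefan,
markus, anna) are hypothetical, and the file contains no comments because the
grammar does not define a comment syntax. It uses the internal knowledge base
rather than an imported OWL file, and only options whose values fit the
ConfOption rule (an Id or a Number).

```
algorithm = refinement;
reasoner = dig;
refinement.heuristic = lexicographic;
refinement.quiet = true;

person = (male OR female).
male(stefan).
male(markus).
female(anna).
hasChild(stefan,markus).
hasChild(anna,markus).

+father(stefan).
-father(markus).
-father(anna).
```

The first block sets options, the second gives background knowledge (one TBox
equivalence followed by concept and role assertions), and the last block lists
one positive and two negative examples for the target concept.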

