Hello all,
First off, thank you for your excellent work on this project. I am still just learning both CCG and openCCG, but I am finding it to be very exciting stuff. My question is about how to represent sets of words that are separated by spaces but that all actually make up a single given "thing" in the sentence.
For example, if you want to represent something like a sports team name such as "New York Mets." It is made up of three words, but it is all one "thing" for the sentence.
Let's say you have a grammar such as the one in Steedman's very short introduction to CCG (I apologize for the horrid ASCII here):
Mary    likes        musicals
----    ---------    --------
NP      (S\NP)/NP    NP
        --------------------->
                S\NP
-----------------------------<
             S
Is there an easy way to have the NP that is filled by the single word "Mary" above be made up of an arbitrary number of nouns instead? That way a sentence like "New York Mets love chicken" would parse without requiring something like underscores in the proper name to make it all "one word".
Expressing something like this in BNF form would be like:
S -> NP VP
NP -> N NP | N
VP -> V NP
I apologize if this is already explained somewhere that I haven't come across yet, but I sincerely thank you all for your time and effort. :)
For named entities, it's probably easiest to handle these outside of OpenCCG, and then just group them with underscores. Otherwise, you can use N/N categories for the leading words together with a unary rule, e.g. N => NP, as in CCGbank. I believe the treatment of such NPs is discussed in the CCGbank manual (see below).
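To make the second option concrete, here is a minimal sketch in Python (not OpenCCG's grammar format, and not how OpenCCG is implemented) of a toy chart parser with forward and backward application plus a unary N => NP rule. The lexicon, the category encoding, and the function names are all assumptions made up for illustration; "New" and "York" are simply given the category N/N so that they stack onto the noun "Mets".

```python
# Toy CCG sketch: atoms are strings, functors are (result, slash, argument).
# Everything here (lexicon, helper names) is hypothetical and illustrative.

def fwd(result, arg):
    return (result, "/", arg)   # X/Y: looks for Y to its right

def bwd(result, arg):
    return (result, "\\", arg)  # X\Y: looks for Y to its left

N, NP, S = "N", "NP", "S"

LEXICON = {
    "New": [fwd(N, N)],             # N/N
    "York": [fwd(N, N)],            # N/N
    "Mets": [N],
    "chicken": [N],
    "love": [fwd(bwd(S, NP), NP)],  # (S\NP)/NP
}

UNARY = {N: NP}  # type-changing rule N => NP

def close_unary(cats):
    """Add the result of the unary rule for every category it applies to."""
    out = set(cats)
    for cat in cats:
        if cat in UNARY:
            out.add(UNARY[cat])
    return out

def combine(left, right):
    """Forward (>) and backward (<) application only."""
    results = set()
    if isinstance(left, tuple) and left[1] == "/" and left[2] == right:
        results.add(left[0])
    if isinstance(right, tuple) and right[1] == "\\" and right[2] == left:
        results.add(right[0])
    return results

def parse(words):
    """CKY-style chart parse; returns the categories spanning all words."""
    n = len(words)
    chart = {}
    for i, word in enumerate(words):
        chart[i, i + 1] = close_unary(LEXICON[word])
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            k = i + span
            cell = set()
            for j in range(i + 1, k):
                for left in chart[i, j]:
                    for right in chart[j, k]:
                        cell |= combine(left, right)
            chart[i, k] = close_unary(cell)
    return chart[0, n]

if __name__ == "__main__":
    print(parse("New York Mets love chicken".split()))
```

In this sketch "York Mets" first combines to N, "New" then applies to that N, and only afterwards does the unary rule turn the whole N into an NP that the verb can take as its subject. This is the categorial counterpart of the recursive NP -> N NP | N rule in the question.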
-Mike
from http://groups.inf.ed.ac.uk/ccg/publications.html:
Julia Hockenmaier and Mark Steedman (2005). CCGbank: User's Manual. Technical Report MS-CIS-05-09, Department of Computer and Information Science, University of Pennsylvania.
http://www.cis.upenn.edu/~juliahr/Papers/CCGbank/CCGbankManual.pdf