[Senseclusters-users] Help needed regarding Sense Clusters.
From: Prashant M. <mo...@gm...> - 2007-12-06 06:47:24
Respected Sir,

I've experimented with SenseClusters using the datasets provided on the site. Now I want to use my own data with SenseClusters. I have the data in plain text files and need to convert it to Senseval2 format, as SenseClusters requires that format.

The script "text2sval.pl" converts plain text files into Senseval2 format. For that, it asks for a KeyFile, which is supposed to contain the instance ids and optional sense tags of the instances in the text file. Though the KeyFile is an optional argument to "text2sval.pl", the output is not very clear without one. So I want to know whether the KeyFile is created manually (if so, is there a standard procedure?) or whether some tool is used to create it.

To make the point clear, I'm giving a snapshot of both the input and the output below.

The sample input to "text2sval.pl" is:

-------------------------------------------------------------------------------------
us all natives of this region as soon we heard about the catastrophe Saturday morning said one of the volunteers Bajaj Zanji a 20 year old <head>idlypuri</head> restaurant worker in Tehran My job consists of digging out the dead with a shovel because we have no other means at our disposal he
called us his picture wouldn't be spotted in this ad The advertisement notes that Atta lived among us attending classes shopping at the mall earing <head>idlypuri</head> going out now and then with friends But it also calls attention to signs that should have drawn attention to the Egyptian student like the
-------------------------------------------------------------------------------------

The output displayed is:

-------------------------------------------------------------------------------------
<corpus lang="english">
<lexelt item="LEXELT">
<instance id="0">
<answer instance="0" senseid="NOTAG"/>
<context>
us all natives of this region as soon we heard about the catastrophe Saturday morning said one of the volunteers Bajaj Zanji a 20 year old <head>idlypuri</head> restaurant worker in Tehran My job consists of digging out the dead with a shovel because we have no other means at our disposal he
</context>
</instance>
<instance id="1">
<answer instance="1" senseid="NOTAG"/>
<context>
called us his picture wouldn't be spotted in this ad The advertisement notes that Atta lived among us attending classes shopping at the mall earing <head>idlypuri</head> going out now and then with friends But it also calls attention to signs that should have drawn attention to the Egyptian student like the
</context>
-------------------------------------------------------------------------------------

*Since I have not given any KeyFile argument here, it is using the default "senseid", "instance id" and "lexelt item". I want to know how these three things are supplied in the KeyFile: is it written manually, is a tool used, and is there a standard procedure?*

I hope the point is clear now. Sorry for the lengthy mail.

Expecting your positive reply.

Thanking you.

--
Cheers!!
More Prashant J.
C-DAC (Erstwhile NCST),
Mumbai.
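
For reference, the default behaviour seen in the snapshot above (one instance per input line, sequential instance ids, senseid "NOTAG", lexelt item "LEXELT") can be approximated with a short Python sketch. This only illustrates the shape of the output; it is not how "text2sval.pl" is implemented, it does not cover the KeyFile case, and the function name `to_senseval2` and its parameters are hypothetical.

```python
def to_senseval2(lines, lexelt="LEXELT", sense="NOTAG"):
    """Wrap each non-empty input line as one Senseval2-style instance.

    Assumption: one instance per line, with the target word already
    marked as <head>...</head>, as in the sample input above.
    """
    out = ['<corpus lang="english">', f'<lexelt item="{lexelt}">']
    # Skip blank lines so instance ids stay consecutive.
    instances = [ln.strip() for ln in lines if ln.strip()]
    for i, text in enumerate(instances):
        out.append(f'<instance id="{i}">')
        out.append(f'<answer instance="{i}" senseid="{sense}"/>')
        out.append("<context>")
        out.append(text)
        out.append("</context>")
        out.append("</instance>")
    out.append("</lexelt>")
    out.append("</corpus>")
    return "\n".join(out)


if __name__ == "__main__":
    sample = [
        "a 20 year old <head>idlypuri</head> restaurant worker in Tehran",
        "shopping at the mall earing <head>idlypuri</head> going out now",
    ]
    print(to_senseval2(sample))
```

Unlike the snapshot, the sketch closes every element it opens, so the result is a complete document.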