senseclusters-users Mailing List for SenseClusters (Page 2)
Status: Beta
Brought to you by:
tpederse
From: Stefano S. <ste...@gm...> - 2014-10-23 14:03:08
Hi Ted, and thanks.

The PoS tagging, entity recognition, feature extraction and clustering tasks were all done with our own system (not SenseClusters), which is still in development. Now I'm trying to use the cluster_labeling module of SenseClusters to show that we have found, with an unsupervised approach, the relations between medical entities in the clinical records (e.g. diabetes mellitus <> glycemia), and in this way to obtain some labels for the clusters.

I'm now writing the code to create the context files, and then I'll run the cluster labeling experiments. I'll let you know in a few days whether everything worked well and, in case of a new publication, I'll cite your great work.

I'm sure I will have more questions in the next few days, so I thank you in advance.

Stefano Silvestri
From: Ted P. <dul...@gm...> - 2014-10-23 13:07:17
Hi Stefano,

This sounds like an interesting project, and it's good to know SenseClusters is proving to be useful. See my responses inline...

On Wed, Oct 22, 2014 at 5:58 AM, Stefano Silvestri <ste...@gm...> wrote:
> The rlabel file, if I'm right, is a file containing the explicit
> name of each entity in the cluster (in my case, the relation).
> Is that right?

Yes, rlabel shows the cluster to which each instance has been assigned.

> Now the problems with the context file...
> It should be in Senseval-2 format. My experimental data consists of
> plain text files, so I should use the plain-text-to-headless-Senseval-2 utility.
>
> I have some questions.
>
> 1) Does the context file have to combine all my input files (the
> medical records) into one large file, with each context corresponding to one
> medical record?

Yes, the input for each run of SenseClusters should be a single file with all your contexts included.

> 2) Can the contexts be headless, or do I have to tag (<head></head>) all the
> entities (medical names) in the input?

Your contexts can be headless, and so there is no need to include <head> tags in your contexts.

> 3) Are there other constraints on the context files (formatting, tags, or other)?

There shouldn't be. The output from text2sval.pl should be acceptable for input "as is".

> If the experiments succeed, I will of course credit and cite the
> SenseClusters project.
>
> PS - my system works on the Italian language.

That's great! We'd be happy to answer further questions as they arise, and will be curious to know how things work out!

Good luck,
Ted

--
Ted Pedersen
http://www.d.umn.edu/~tpederse
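The "single file with all your contexts" step could be sketched as below. This is only an illustration under stated assumptions: it supposes each medical record lives in its own plain text file and that the plain-text converter reads one context per line, which the thread does not state and should be checked against the text2sval.pl documentation. The function names and file layout are hypothetical.

```python
import re
from pathlib import Path


def record_to_context(text: str) -> str:
    """Collapse one record into a single line so it can serve as one context."""
    return re.sub(r"\s+", " ", text).strip()


def merge_records(record_paths, out_path, encoding="utf-8"):
    """Write one context per line to out_path; return the number of contexts written.

    Assumption (not confirmed in the thread): the downstream converter
    expects exactly one context per line of plain text.
    """
    count = 0
    with open(out_path, "w", encoding=encoding) as out:
        for path in record_paths:
            context = record_to_context(Path(path).read_text(encoding=encoding))
            if context:  # skip records that are empty after normalization
                out.write(context + "\n")
                count += 1
    return count
```

The merged file would then be passed to the conversion utility; no head tags are added, in line with the answer that contexts can be headless.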
From: Stefano S. <ste...@gm...> - 2014-10-22 10:58:08
I've used a clustering technique to discover, in an unsupervised way, relations between medical entities contained in a large collection of anonymized medical records, in a research project at the University of Naples. The data set is composed of a large set of features; all the results will shortly be published in a journal.

The next step in the development of our system is unsupervised cluster (relation) labeling. To do that, I'd like to try the clusterlabeling module from SenseClusters. To create the input for clusterlabeling I have to use the format_clusters module with the --context option, and here I have some problems.

I have already produced a CLUTO-style cluster solution file from my system (no problem with that).

The rlabel file, if I'm right, is a file containing the explicit name of each entity in the cluster (in my case, the relation). Is that right?

Now the problems with the context file. It should be in Senseval-2 format. My experimental data consists of plain text files, so I should use the plain-text-to-headless-Senseval-2 utility.

I have some questions.

1) Does the context file have to combine all my input files (the medical records) into one large file, with each context corresponding to one medical record?

2) Can the contexts be headless, or do I have to tag (<head></head>) all the entities (medical names) in the input?

3) Are there other constraints on the context files (formatting, tags, or other)?

If the experiments succeed, I will of course credit and cite the SenseClusters project.

PS - my system works on the Italian language.

Thanks for your response,
Stefano Silvestri,
NLP researcher at University of Naples "Federico II"
From: Ted P. <tpederse@d.umn.edu> - 2014-07-18 23:57:32
For many years now, http://search.cpan.org has been my go-to link for finding CPAN distributions, and it has been the URL we've listed on our web sites to direct users to Perl software downloads. Sadly, the site has become very unreliable in the last few months, and there does not appear to be a solution in the works.

So, I've decided to gradually migrate to using https://metacpan.org as our default web site for finding and pointing at CPAN distributions. This will involve making changes on web pages and in documentation, and it will take a while to do. But it seems important, since the search site can create the impression that "CPAN is down". It's not. CPAN is alive and well; it's just that one particular navigator is not working too well.

I hope to make these changes on the main package pages fairly soon, but in the event you run into a 503 or 504 error when accessing the search site, please realize there are other ways, and that CPAN is just fine.

Here's some additional commentary and info about this issue:

https://github.com/perlorg/perlweb/issues/115
http://perlhacks.com/2013/01/give-me-metacpan/
http://www.perlmonks.org/index.pl?node_id=1093542
http://grokbase.com/t/perl/beginners/145nsxqz2w/cpan-unavailable

When we started using the search site in about 2002 it was pretty great. The good news is that https://metacpan.org is even better, so this is a positive change.

Thanks,
Ted

--
Ted Pedersen
http://www.d.umn.edu/~tpederse
From: Ted P. <tpederse@d.umn.edu> - 2014-05-23 15:19:50
Yes, it does. That's a WordNet sense index, which is explained a bit more here:

https://wordnet.princeton.edu/wordnet/man/senseidx.5WN.html

Good luck,
Ted

On Fri, May 23, 2014 at 9:53 AM, Jing Wang <jw...@ui...> wrote:
> Thank you, Ted! That makes sense. Also, does “art%1:04:00::” represent one
> sense?
>
> Best,
> Jing

--
Ted Pedersen
http://www.d.umn.edu/~tpederse
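To make the format of a string like “art%1:04:00::” more concrete, here is a small sketch that splits it into the fields described on the WordNet senseidx page (lemma, then ss_type, lex_filenum, lex_id, head_word and head_id separated by colons). The class and function names are illustrative and not part of SenseClusters or WordNet.

```python
from typing import NamedTuple

# ss_type codes as documented for WordNet sense keys.
SS_TYPES = {1: "noun", 2: "verb", 3: "adjective", 4: "adverb", 5: "adjective satellite"}


class SenseKey(NamedTuple):
    lemma: str
    ss_type: str
    lex_filenum: int
    lex_id: int
    head_word: str  # non-empty only for adjective satellites
    head_id: str    # non-empty only for adjective satellites


def parse_sense_key(key: str) -> SenseKey:
    """Split a key of the form lemma%ss_type:lex_filenum:lex_id:head_word:head_id."""
    lemma, sep, lex_sense = key.partition("%")
    fields = lex_sense.split(":")
    if not lemma or not sep or len(fields) != 5:
        raise ValueError(f"not a well-formed sense key: {key!r}")
    ss_type, lex_filenum, lex_id, head_word, head_id = fields
    return SenseKey(lemma, SS_TYPES[int(ss_type)], int(lex_filenum), int(lex_id),
                    head_word, head_id)


if __name__ == "__main__":
    print(parse_sense_key("art%1:04:00::"))
```

For the two keys in the question, the only field that differs is lex_filenum (04 versus 06), so they do denote two different noun senses of "art".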
From: Jing W. <jw...@ui...> - 2014-05-23 14:54:07
Thank you, Ted! That makes sense. Also, does “art%1:04:00::” represent one sense?

Best,
Jing
From: Ted P. <tpederse@d.umn.edu> - 2014-05-23 14:48:51
Hi Jing,

The P tag indicates a proper noun. I think below is the instance you are looking at. In this case "art" is part of a proper noun (New York Academy of Art), and so it gets that little bit of extra information in the tags.

<instance id="art.30003" docsrc="wsj_1272.mrg_14">
<context>
His strategy against D.T. was based on a thorough study of dozens of its
games, he said, including its notorious whippings of the grandmasters Bent
Larsen of Denmark and Robert Byrne of the U.S.. Mr. Kasparov was
underwhelmed. ``The computer's mind is too straight, too primitive,''
lacking the intuition and creativity needed to reach the top, he said. The
champion apparently was not worried at all about D.T.'s strong points. Its
chief builder, Taiwan-born Feng-hsiung Hsu, nicknamed his brainchild ``the
Weasel'' for its tactical flair at wriggling out of horrible positions.
D.T. also has a prodigious and flawless memory, is utterly fearless, and
could n't be distracted by the sexy nude sculptures spread around the
playing hall, in the New York Academy of <head>Art</head>.
</context>
</instance>

I hope that helps!

Good luck,
Ted

--
Ted Pedersen
http://www.d.umn.edu/~tpederse
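A key-file line such as “art art.30003 P art%1:04:00:: art%1:06:00::” can be read mechanically along these lines. This is a sketch based only on the one line quoted in this thread: it assumes whitespace-separated fields, with the word first, the instance id second, sense keys recognizable by the "%" character, and anything else (such as P) treated as an extra tag. The function name is hypothetical.

```python
def parse_key_line(line: str) -> dict:
    """Split one key-file line into word, instance id, extra tags, and sense keys."""
    word, instance_id, *rest = line.split()
    return {
        "word": word,
        "instance": instance_id,
        # Tokens without '%' are kept as tags (e.g. P for a proper noun).
        "tags": [tok for tok in rest if "%" not in tok],
        # Tokens with '%' are WordNet-style sense keys.
        "senses": [tok for tok in rest if "%" in tok],
    }


if __name__ == "__main__":
    print(parse_key_line("art art.30003 P art%1:04:00:: art%1:06:00::"))
```

Keeping the tags separate from the sense keys means an instance marked P can be filtered or counted without disturbing the list of senses assigned to it.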
From: Jing W. <jw...@ui...> - 2014-05-23 14:03:57
Hi All,

This might be a silly question, but I cannot figure out how to read the .key file properly. For instance, here is one line from the .key file:

“art art.30003 P art%1:04:00:: art%1:06:00::”

I understand that the word is “art” and the instance is “art.30003”, but what is “P” supposed to mean? In addition, does “1:04:00” represent one sense? Why is the sense formatted like this?

I would really appreciate it if someone could help me understand this.

Best,
Jing
From: Ted P. <tpederse@d.umn.edu> - 2013-11-11 13:49:39
The error you are getting about not being able to create a directory looks to me like something independent of SenseClusters, perhaps a permission problem of some kind. It looks like you might be running as root. I wonder what would happen if you ran the command as a "normal" user in your home directory. If everything is set up correctly, you should not have to run from within the Toolkit directory itself; things should have been installed in an appropriate system directory when you ran "make install". Could you try that, perhaps?

I'll respond in a little more detail later on today regarding the other points you asked about.

More soon,
Ted

--
Ted Pedersen
http://www.d.umn.edu/~tpederse
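On the question raised in this thread of turning three-column (target, slot-filler, weight) data into a sparse matrix, one possible sketch follows. It writes the CLUTO sparse matrix layout: a header line with the number of rows, columns, and non-zero entries, then one line per row holding "column value" pairs with columns numbered from 1. Whether clusterstopping.pl accepts the result unchanged is not confirmed anywhere in the thread, the function name is hypothetical, and this version keeps everything in memory, so a 5 GB input would need a streaming or two-pass variant.

```python
from collections import defaultdict


def triples_to_sparse(lines):
    """Convert 'target filler weight' lines into CLUTO-style sparse matrix lines.

    Returns (matrix_lines, row_labels, column_labels). Rows are targets,
    columns are slot-fillers; repeated (target, filler) pairs are summed.
    """
    rows = {}                  # target -> 0-based row index, in first-seen order
    cols = {}                  # filler -> 1-based column number
    cells = defaultdict(dict)  # row index -> {column number: weight}
    for line in lines:
        parts = line.split()
        if not parts:
            continue  # ignore blank lines
        if len(parts) != 3:
            raise ValueError(f"expected 3 fields, got {len(parts)}: {line!r}")
        target, filler, weight = parts
        r = rows.setdefault(target, len(rows))
        c = cols.setdefault(filler, len(cols) + 1)
        cells[r][c] = cells[r].get(c, 0.0) + float(weight)

    nnz = sum(len(row) for row in cells.values())
    out = [f"{len(rows)} {len(cols)} {nnz}"]
    for r in range(len(rows)):
        out.append(" ".join(f"{c} {v:g}" for c, v in sorted(cells[r].items())))
    return out, list(rows), list(cols)


if __name__ == "__main__":
    sample = [
        "abduction-n\tinto+n-the+n-a-j-loss-n\t1",
        "abduction-n\tinto+n-the+n-a-small-cut-n\t2",
        "zoonosis-n\tof+n-j+n-the-world-n\t1",
    ]
    matrix, row_labels, col_labels = triples_to_sparse(sample)
    print("\n".join(matrix))
```

The returned row and column labels can be written to separate files so that cluster assignments can later be mapped back to the original targets and slot-fillers.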
From: Lauren R. <rom...@gm...> - 2013-11-11 09:43:02
|
One more comment on the troubleshooting: I also tried removing "perl" from the command:

root@webservices:~/.cpan/build/Text-SenseClusters-1.03-5f8DVC/Toolkit/clusterstop# clusterstopping.pl /usr/input_file/sample_clusters_stop_test.txt
clusterstopping.pl: orden no encontrada
(in English: command not found)

In this case, still using the same input data, I get the error above, and the run outputs 2 empty files:

expr1384162590.cr.dat
expr1384162590.pk3

Sorry for the double mail: I just wanted to make all of my steps clear so that I can understand what is going wrong and how it can be fixed, so that I can start using the program!

Thanks!!

--
Lauren Romeo
+34 687 18 29 86 - mobile
www.laurenmromeo.com
|
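A note on the "command not found" message above: a shell typically looks up a bare command name only in the directories listed in PATH, and the current directory is usually not one of them, so a script sitting in the working directory is not found unless it is called as `./clusterstopping.pl` or through `perl`. The sketch below is one way to check what the shell would find. The helper name `find_tools` is hypothetical and is not part of SenseClusters; the tool names are the ones that appear in this thread.

```python
import shutil


def find_tools(names, search_path=None):
    """Map each tool name to the location a PATH lookup resolves it to, or None.

    search_path=None means "use the PATH of the current environment".
    """
    return {name: shutil.which(name, path=search_path) for name in names}


if __name__ == "__main__":
    # Tool names taken from the thread; whether they are installed is unknown.
    for tool, location in find_tools(["clusterstopping.pl", "vcluster"]).items():
        print(tool, "->", location or "not on PATH")
```

A `None` entry means a bare call to that name would fail in the same way as the command shown above.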
From: Lauren R. <rom...@gm...> - 2013-11-11 08:36:08
|
Hi and good morning,

I resaved the file as a .txt (UTF-8) and I am still getting the same error. This is the data in the file:

5 15
1 1.0000 3 0.5544 5 0.4431
2 1.0000 3 0.1386 4 0.4599 5 0.5413
1 0.5544 2 0.1386 3 1.0000
2 0.4599 4 1.0000
1 0.4431 2 0.5413 5 1.0000

I found this data here: http://search.cpan.org/~tpederse/Text-SenseClusters-1.03/Toolkit/clusterstop/clusterstopping.pl

This is the command that I have used:

root@webservices:~/.cpan/build/Text-SenseClusters-1.03-5f8DVC/Toolkit/clusterstop# perl clusterstopping.pl /usr/SenseCluster/sample_clusters_stop_test.txt

and this is the error that I continue to get:

sh: cannot create /root/.cpan/build/Text-SenseClusters-1.03-5f8DVC/Toolkit/clusterstop//usr/SenseCluster/sample_clusters_stop_test.txt.1: Directory nonexistent
Error while running vcluster --clmethod rb --crfun i2 --sim cos --rowmodel none --colmodel none --nooutput /usr/SenseCluster/sample_clusters_stop_test.txt 1

Does the second part of the error have something to do with the command that I am entering? I have successfully run csh ./ALL-TESTS.sh, so I was under the impression that it is correctly installed... should I try another command or another test?

The first part of the error seems to be looking for a directory (?), so I reran the command, this time ending the path at a directory that I created and that contains only this input file (I don't know if this is correct, I was troubleshooting), and I got this error:

root@webservices:~/.cpan/build/Text-SenseClusters-1.03-5f8DVC/Toolkit/clusterstop# perl clusterstopping.pl /usr/input_file/

Use of uninitialized value $line in scalar chomp at clusterstopping.pl line 735.
Use of uninitialized value $line in substitution (s///) at clusterstopping.pl line 738.
Use of uninitialized value $line in split at clusterstopping.pl line 744.
Use of uninitialized value $rcnt in numeric gt (>) at clusterstopping.pl line 916.
Use of uninitialized value $delta in concatenation (.) or string at clusterstopping.pl line 918.
Use of uninitialized value $thres in concatenation (.) or string at clusterstopping.pl line 918.
ERROR(clusterstopping.pl): i2 values do not converge for the delta value of (internally: ). Try using larger delta value.

I used the same sample file as before. I really do not know what I am doing wrong.

Thank you again in advance for any assistance or help; it is greatly appreciated.

--
Lauren Romeo
+34 687 18 29 86 - mobile
www.laurenmromeo.com
|
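A possible reading of the first error above, offered as a guess rather than a confirmed diagnosis: the path that `sh` cannot create looks like the working directory, a slash, the input argument exactly as typed, and a `.1` suffix glued together. With an absolute input path that yields a directory that does not exist. The sketch below reproduces that concatenation and shows one way to plan a run that passes only a bare file name from inside the file's own directory. Both function names are hypothetical, and whether this actually resolves the error is an assumption.

```python
import os


def concatenated_output_path(cwd, input_arg, suffix=".1"):
    """Rebuild the path shape seen in the error message: cwd + "/" + argument + suffix.

    This only mimics the visible message; it is not taken from the script's source.
    """
    return cwd + "/" + input_arg + suffix


def plan_run(input_path, tool="clusterstopping.pl"):
    """Return (working_directory, argv) so that the tool receives a bare file name."""
    directory, name = os.path.split(os.path.abspath(input_path))
    return directory, [tool, name]


if __name__ == "__main__":
    cwd, argv = plan_run("/usr/SenseCluster/sample_clusters_stop_test.txt")
    # The plan could be executed with subprocess.run(argv, cwd=cwd) once the tool is on PATH.
    print(cwd, argv)
```

The design choice here is simply to avoid handing the tool any directory component at all, so that whatever prefix it adds still points at an existing directory.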
From: Ted P. <tpederse@d.umn.edu> - 2013-11-10 19:13:23
|
This error you report is a little unusual ...

> however, the sample data that you provide in the website (I used to
> understand what I am doing) - continuously gives me the following error --
> which is why I have been unable to solve the problem on my own --
> it is because I do not understand what the machine is requiring from me.
>
> sh: cannot create /root/.cpan/build/Text-SenseClusters-1.03-5f8DVC/Toolkit/clusterstop//homedtic/usr/SenseCluster/sense_clusters-sample.rtf.1: Directory nonexistent
> Error while running vcluster --clmethod rb --crfun i2 --sim cos --rowmodel none --colmodel none --nooutput /homedtic/usr/SenseCluster/sense_clusters-sample.rtf 1

When I try to run clusterstopping with some of the sample data from the documentation, I get the following.

This is from the web page, stored in t2.txt:

6 5
1.3 2 0 0 3
2.1 0 4 2.7 0
1.3 2 0 0 3
2.1 0 4 2.7 0
1.3 2 0 0 3
2.1 0 4 2.7 0

ted@marimba:~$ clusterstopping.pl t2.txt
1

I notice you seem to be using an rtf file for the input data... my guess is that this will be a problem. Could you try my example above as a plain text file, and let me know what you get?

Thanks!
Ted

--
Ted Pedersen
http://www.d.umn.edu/~tpederse
|
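The two checks suggested in the message above (plain text rather than RTF, and a matrix shaped like the t2.txt sample) can be sketched as follows. This is a minimal sketch, not part of SenseClusters: the function names are hypothetical, and the dense layout it checks (a header line with row and column counts, then that many rows of that many numbers) is inferred from the sample shown in the message, not from the tool's source.

```python
def looks_like_rtf(raw: bytes) -> bool:
    """RTF documents start with the control group '{\\rtf'."""
    return raw.lstrip().startswith(b"{\\rtf")


def check_dense_matrix(text: str) -> list:
    """Return a list of problems found; an empty list means the layout looks consistent."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    if not lines:
        return ["file is empty"]
    header = lines[0].split()
    if len(header) != 2 or not all(tok.isdigit() for tok in header):
        return ["header should be '<rows> <columns>', got %r" % lines[0]]
    n_rows, n_cols = int(header[0]), int(header[1])
    problems = []
    body = lines[1:]
    if len(body) != n_rows:
        problems.append("expected %d rows, found %d" % (n_rows, len(body)))
    for number, line in enumerate(body, start=1):
        fields = line.split()
        if len(fields) != n_cols:
            problems.append("row %d has %d values, expected %d" % (number, len(fields), n_cols))
            continue
        try:
            [float(field) for field in fields]
        except ValueError:
            problems.append("row %d contains a non-numeric value" % number)
    return problems


if __name__ == "__main__":
    sample = "6 5\n" + "1.3 2 0 0 3\n2.1 0 4 2.7 0\n" * 3
    print(check_dense_matrix(sample))
```

Reading the file in binary mode first and passing the bytes to `looks_like_rtf` keeps the format check independent of the text encoding.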
From: Lauren R. <rom...@gm...> - 2013-11-09 21:39:32
|
Hi and thank you for your reply!

I will try the text2sval.pl option by marking the lemma in the first column with <head>noun</head> and removing the freq information in the 4th column by using the non-sorted file.

In this way, if I have correctly read the description of the input required for the preprocessing, I should have the one-line-per-context requirement fulfilled, correct? In that case, after processing with text2sval.pl, can I use one of the vector converters to produce the proper input for clusterstopping.pl?

Regarding the sample program that I was trying: I wanted to get a feel for what the results would look like. So I copied the example input in sparse matrix format (from the clusterstopping.pl documentation page) and simply ran the command:

$ perl clusterstopping.pl sample_input.txt

As the arguments are optional, I just wanted to understand what the results would look like before setting any task-specific parameters, and the error mentioned in my original email occurred.

Thank you again, in advance, for any insight that helps me better understand both the program and how to solve my error!!

Lauren

--
Lauren Romeo
|
From: Ted P. <tpederse@d.umn.edu> - 2013-11-09 19:18:54
|
Perhaps the easiest way to put data into the format required by SenseClusters is by using one of the converter programs we have. In your case I think text2sval.pl would be the right choice...

If you have SenseClusters installed, you should simply be able to run

text2sval.pl

(there are a few options you could use, all of which are described at the link below), or you could run

text2sval.pl --help

http://cpansearch.perl.org/src/TPEDERSE/Text-SenseClusters-1.03/Toolkit/preprocess/plain/text2sval.pl

About the sample program you are trying to run and getting the error with, can you let me know the command you were running?

Thanks!
Ted

--
Ted Pedersen
http://www.d.umn.edu/~tpederse
|
From: Lauren R. <rom...@gm...> - 2013-11-09 15:47:09
|
Hi Professor Pedersen,

I am a new SenseClusters user and I am particularly interested in one tool: ClusterStopping. I have been trying to use it as a standalone part of an experiment that I am running.

I already have the data that I am working with, in a 3-column tab-separated format (target, slot-filler, weight).

Here is a small (10-line) sample of the format of my input:

abduction-n	into+n-the+n-a-j-loss-n	1
abduction-n	into+n-the+n-a-small-cut-n	2
abduction-n	into+n-the+n-j-bleeding-n	1
abduction-n	into+n-the+n-j-loss-n	1
zoonosis-n	of+n-j+n-the-location-n	1
zoonosis-n	of+n-j+n-the-world-n	1
zoonosis-n	of+n-j+n-the-development-n	1
zoonosis-n	of+n-j+n-the-j-collection-n	1
zoonosis-n	of+n-j+n-the-j-success-n	1
zoonosis-n	of+n-j+ns-photo-n	1

I have been unable to work out how to translate my data into a usable input file for this particular process.

It is a rather large file (5GB). I am also not clear whether there is an option to convert this format of data directly into a sparse format (considering the size, I suppose that is the best option). However, the sample data that you provide on the website (which I used to understand what I am doing) continuously gives me the following error, which is why I have been unable to solve the problem on my own: I do not understand what the machine is requiring from me.

sh: cannot create /root/.cpan/build/Text-SenseClusters-1.03-5f8DVC/Toolkit/clusterstop//homedtic/usr/SenseCluster/sense_clusters-sample.rtf.1: Directory nonexistent
Error while running vcluster --clmethod rb --crfun i2 --sim cos --rowmodel none --colmodel none --nooutput /homedtic/usr/SenseCluster/sense_clusters-sample.rtf 1

Any assistance that you can provide would be fantastic. I have been searching online for answers and looking at the sample test data available in the program, etc., but I have come to a dead end. Do you think you would be able to help me with how my data could be used with your clusterstopping.pl program?

In advance, thank you very much for any assistance you might be able to give.

Again, thank you very much.
|
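For readers with data in the same shape, one way to get 3-column rows into a sparse matrix is sketched below. It assumes the CLUTO sparse matrix layout (a header line with the number of rows, columns and non-zero entries, then one line per row made of 1-based "column value" pairs), and it assumes that targets become rows and slot-fillers become columns, which the thread does not confirm. The function name is hypothetical and is not part of SenseClusters. Everything is held in memory, so a 5GB file would need a streaming or two-pass variant.

```python
from collections import defaultdict


def triples_to_sparse(lines):
    """Turn 'target<TAB>slot-filler<TAB>weight' rows into sparse-matrix text."""
    row_ids = {}              # target -> 0-based row number, in order of first appearance
    col_ids = {}              # slot-filler -> 1-based column number
    rows = defaultdict(dict)  # row number -> {column number: summed weight}
    for line in lines:
        parts = line.split()
        if len(parts) != 3:
            continue  # skip blank or malformed rows
        target, filler, weight = parts
        r = row_ids.setdefault(target, len(row_ids))
        c = col_ids.setdefault(filler, len(col_ids) + 1)
        rows[r][c] = rows[r].get(c, 0.0) + float(weight)
    nonzeros = sum(len(cols) for cols in rows.values())
    out = ["%d %d %d" % (len(row_ids), len(col_ids), nonzeros)]
    for r in range(len(row_ids)):
        out.append(" ".join("%d %g" % (c, v) for c, v in sorted(rows[r].items())))
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    sample = [
        "abduction-n\tinto+n-the+n-j-loss-n\t1",
        "abduction-n\tinto+n-the+n-a-small-cut-n\t2",
        "zoonosis-n\tof+n-j+n-the-world-n\t1",
    ]
    print(triples_to_sparse(sample), end="")
```

Repeated (target, slot-filler) pairs are summed rather than overwritten; that is a design choice of this sketch, not something the original data format prescribes.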
From: Ted P. <tpederse@d.umn.edu> - 2013-11-05 03:32:54
|
Ah, I think I have spotted the problem.

sh: vcluster: command not found

You may still need to install Cluto (scluster and vcluster). There are instructions on how to do that in the directory External. Some of the test cases will work without vcluster and scluster, so it seems possible to me that you don't have it installed...

Let us know if questions persist!

Good luck,
Ted

-- Ted Pedersen http://www.d.umn.edu/~tpederse |
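The "sh: vcluster: command not found" error discussed in this thread can be checked for before running discriminate.pl. The following is only a minimal sketch, assuming a POSIX shell; the function name check_cluto is made up for illustration and is not part of SenseClusters or Cluto. It reports whether the two Cluto programs named above (vcluster and scluster) can be found on the PATH.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with SenseClusters): report whether the
# Cluto programs that discriminate.pl shells out to are on the PATH.
check_cluto() {
    missing=0
    for prog in vcluster scluster; do
        if command -v "$prog" >/dev/null 2>&1; then
            echo "found: $prog -> $(command -v "$prog")"
        else
            echo "missing: $prog"
            missing=1
        fi
    done
    # Return 0 only when both programs were found.
    return "$missing"
}

check_cluto || echo "Install Cluto (see the External directory) and add it to PATH."
```

If either program is reported as missing, the instructions in the External directory mentioned above are the place to start; once Cluto is installed, its directory also has to be on the PATH of the shell that runs discriminate.pl.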
From: Jing W. <jw...@ui...> - 2013-11-05 02:41:17
|
Hello Ted,

Thank you very much for the quick reply! I tried running the tests, and they ran successfully. However, when I try the command with the --token option, it still does not work.

I type in the command:

perl discriminate.pl samples/Data/begin.v-test.xml --token samples/Regexs/token.regex

And the output is:

defined(@array) is deprecated at /usr/local/bin/preprocess.pl line 1285.
(Maybe you should just omit the defined()?)
defined(@array) is deprecated at /usr/local/bin/preprocess.pl line 1286.
(Maybe you should just omit the defined()?)
File samples/Data/begin.v-test.xml.pro exists! Overwrite (Y/N)? Y
sh: vcluster: command not found
ERROR(format_clusters.pl):
Cluster solution file <expr1383619046.cluster_solution> does not exist.
Error while formatting clusters.

Could you please help me figure out what is going wrong?

Thanks,
Jing |
From: Ted P. <tpederse@d.umn.edu> - 2013-11-05 02:06:14
|
Hi Jing,

I tried to run the same command on my distribution, and got the following results - one difference you might notice is that I used a --token option, which is normally required. So, I wonder if you could try using the --token option? Also, are you able to run the tests, as in:

cd Testing
csh all-tests.sh

Does that work?

Below are the results when I run on the sample data for begin...

lincoln:~/.cpan/build/Text-SenseClusters-1.03-2dUNe9 # discriminate.pl samples/Data/begin.v-test.xml --token samples/Regexs/token.regex

=================================================================
Output when #clusters = 2 (Set manually)
=================================================================
********************************************************************************
vcluster (CLUTO 2.1.2) Copyright 2001-06, Regents of the University of Minnesota

Matrix Information -----------------------------------------------------------
  Name: expr1383616949.vectors, #Rows: 255, #Columns: 4242, #NonZeros: 417931

Options ----------------------------------------------------------------------
  CLMethod=RB, CRfun=I2, SimFun=Cosine, #Clusters: 2
  RowModel=None, ColModel=None, GrModel=SY-DIR, NNbrs=40
  Colprune=1.00, EdgePrune=-1.00, VtxPrune=-1.00, MinComponent=5
  CSType=Best, AggloFrom=0, AggloCRFun=I2, NTrials=10, NIter=10

Solution ---------------------------------------------------------------------

------------------------------------------------------------------------
2-way clustering: [I2=2.47e+02] [255 of 255]
------------------------------------------------------------------------
cid  Size  ISim    ISdev   ESim    ESdev  |
------------------------------------------------------------------------
  0   162  +0.948  +0.027  +0.893  +0.047 |
  1    93  +0.930  +0.046  +0.893  +0.088 |
------------------------------------------------------------------------

------------------------------------------------------------------------------
Hierarchical Tree that optimizes the I2 criterion function...
------------------------------------------------------------------------------

-----------------
2
|---0
|---1
-----------------
------------------------------------------------------------------------------

Timing Information -----------------------------------------------------------
  I/O: 0.153 sec
  Clustering: 0.126 sec
  Reporting: 0.033 sec

Memory Usage Information -----------------------------------------------------
  Maximum memory used: 11472896 bytes
  Current memory used: 3541224 bytes
********************************************************************************

Clusters of given contexts can be found in file: expr1383616949.clusters

Good luck, and please let us know what happens!

Thanks,
Ted

-- Ted Pedersen http://www.d.umn.edu/~tpederse |
From: Jing W. <jw...@ui...> - 2013-11-04 23:56:58
|
Hello,

I am a new user of SenseClusters, and I am trying to run discriminate.pl on the sample file under the samples directory; however, I cannot make it work.

I simply type the command:

perl discriminate.pl ./samples/Data/begin.v-test.xml

And the error is:

ERROR(discriminate.pl):
Only 2 FEATURES found in the <expr1383608414.bigrams> file.
At least 10 FEATURES required to proceed with context representation.

This might be a silly mistake, but I cannot figure it out. Can someone help me with this issue? Thank you very much!

Best,
Safari |
From: Ted P. <tpederse@d.umn.edu> - 2013-06-30 03:11:34
|
We are pleased to announce the release of version 1.03 of SenseClusters. This is the first new release in 5 years, and should be the first of several upcoming releases.

There has been a little bit of clean up in the test scripts and other places, but the main new functionality is some additional ways of labeling the discovered clusters. Before this version clusters have been labeled with significant bigrams - as of version 1.03 it is now possible to label clusters with trigrams or 4-grams. Additional functionality related to cluster labeling is expected to be released in the coming months, so please give this a try and let us know of any suggestions or observations you might have.

The changes in this version are enumerated below. You can download from CPAN or sourceforge via the links provided here:

http://senseclusters.sourceforge.net

1.03 Released June 29, 2013 (changes by TDP and AMJ)

Modified install.sh to default to Linux-x86_64 for Cluto installation (TDP)

Removed various instances of if (defined %hash) in preprocess/sval2 in favor of if (%hash) - defined %hash is now deprecated - however, left that in keyconvert.pl, as removing it caused a syntax issue that should be checked out further (TDP)

Fixed Testing/ALL-TESTS.sh to run all test cases by enumerating them in a for loop - the previous method of using a wild card did not seem to be running all cases (TDP)

Fixed some test cases for clusterstopping in Testing - note that we still have Sun test cases included although there is no Sun platform to test on. Should keep those though, as cluto still comes with a Sun version (TDP)

Added the flag "ngram" for clusterlabeling.pl. It allows the user to provide the value for ngram. The feature selection while creating the labels of a cluster will be based on this parameter. (AMJ)

Added --label_ngram option to discriminate.pl to support the new --ngram option in clusterlabeling.pl (AMJ)

Added test cases testA6 and testA7 to test changes in clusterlabeling. (AMJ)

Updated INSTALL to mention dependencies on csh and using bash as the system shell (TDP)

Please let us know of any questions, problems, or suggestions!

Enjoy,
Ted and Anand

-- Ted Pedersen http://www.d.umn.edu/~tpederse |
From: Ted P. <tpederse@d.umn.edu> - 2013-06-06 13:16:13
|
SenseClusters participated (yet again) in a SemEval task this year. The paper describing the system and a little bit about the task is available here : http://www.d.umn.edu/~tpederse/Pubs/pedersen-semeval-2013.pdf And I will present a poster next Friday (June 14) in Atlanta at the SemEval workshop. I participated in task 11, which has its poster session at 3:30 I believe... http://www.cs.york.ac.uk/semeval-2013/index.php?id=schedule I do hope to see you there! Cordially, Ted -- Ted Pedersen http://www.d.umn.edu/~tpederse |
From: Ted P. <tpederse@d.umn.edu> - 2013-05-08 19:52:48
|
The upgrade of marimba is complete, and the web interfaces to WordNet::Similarity, WordNet::SenseRelate::AllWords, and SenseClusters are back up and running there. However, please do use http://maraca.d.umn.edu as your first choice for our web interfaces - that is a much newer and more powerful server. http://marimba.d.umn.edu is intended to be a backup / reserve system. Please let me know if you see anything amiss on either system! Enjoy, Ted |
From: Ted P. <tpederse@d.umn.edu> - 2013-05-08 12:34:47
|
The web interface at http://marimba.d.umn.edu will be down for a few days for a system upgrade. Please use http://maraca.d.umn.edu instead - this is actually a better and newer server, so I'd suggest generally using that instead of marimba. I'll keep you posted as to when marimba is back. Enjoy, Ted |
From: Ted P. <tpederse@d.umn.edu> - 2011-05-27 19:18:40
|
Greetings all,

http://talisker.d.umn.edu
http://marimba.d.umn.edu

are both back up and fully operational. Please let us know if you have any questions or concerns.

Enjoy,
Ted

-- Ted Pedersen http://www.d.umn.edu/~tpederse |
From: Ted P. <tpederse@d.umn.edu> - 2011-05-27 15:22:54
|
Greetings all,

http://talisker.d.umn.edu and all the interfaces provided there are back up. This includes WordNet::Similarity, WordNet::SenseRelate::AllWords, and SenseClusters.

The other system (marimba) is in the process of coming back up, and should be fully available later today.

Enjoy,
Ted

On Thu, May 26, 2011 at 2:10 PM, Ted Pedersen <tpederse@d.umn.edu> wrote:
> Greetings all,
>
> The web interfaces for WordNet::Similarity,
> WordNet::SenseRelate::AllWords, and SenseClusters are all down due to
> a long overdue upgrade. But, at least one of our systems will be back
> before 5pm Friday April 27, perhaps both.
>
> These interfaces will continue to be located at the following URLs
> once they are back:
>
> http://marimba.d.umn.edu
> http://talisker.d.umn.edu
>
> Sorry for the short notice, but hopefully things will be back in a day
> or two. If you really need something run *now* let me know and I'll
> see what I can do to assist.
>
> Cordially,
> Ted
>
> --
> Ted Pedersen
> http://www.d.umn.edu/~tpederse

-- Ted Pedersen http://www.d.umn.edu/~tpederse |