From: bthomson <bth...@em...> - 2011-01-13 00:09:26
|
Hello, We are pleased to announce the release of CuiTools version 0.29. CuiTools (Coo-e Tools) is a freely available package of Perl programs for unsupervised and supervised word sense disambiguation (WSD) experiments. The name CuiTools comes from the Concept Unique Identifiers (CUIs) found in the Unified Medical Language System (UMLS). This package allows users to perform supervised or unsupervised word sense disambiguation using information extracted from the UMLS such as CUIs, semantic types and semantic relations as well as general English features such as unigrams, bigrams and part-of-speech information. As of version 0.29, there are even more changes in the prolog2mm.pl program. Most notably, we added a --metamap option which takes as input the two-digit year. This will hopefully allow users to run it with the various versions of MetaMap that come out. You can download CuiTools from SourceForge here: http://cuitools.sourceforge.net/ If you have any questions, please don't hesitate to email. Thank you, Bridget |
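The thread refers to MetaMap releases by names such as metamap08, metamap09 and metamap10, so one plausible reading of the two-digit --metamap value is that it selects a release by that suffix. The sketch below only illustrates that idea: the function name metamap_binary_name is hypothetical, and the announcement does not show how prolog2mm.pl actually interprets the option.

```shell
# Hypothetical helper (not part of CuiTools): turn a two-digit year into a
# release name of the form used in this thread, e.g. 10 -> metamap10.
metamap_binary_name() {
    case "$1" in
        [0-9][0-9])
            printf 'metamap%s\n' "$1"
            ;;
        *)
            # Anything other than exactly two digits is rejected.
            echo "expected a two-digit year, got: $1" >&2
            return 1
            ;;
    esac
}
```

Rejecting four-digit input such as 2010 up front keeps a mistyped year from silently producing a name that no MetaMap release uses.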
From: Ted P. <tpederse@d.umn.edu> - 2011-01-01 21:08:04
|
Hi Bridget, BTW, one thing I noticed as well with MetaMap 2010 is that the environment variable they use is METAMAP_HOME I noticed CuiTools uses METAMAP_PATH, so we end up with two METAMAP home variables. That's actually not really a problem, and I guess if earlier MetaMaps use the METAMAP_PATH variable then we need to keep both of them.... Just a very minor observation! Thanks, Ted On Sat, Jan 1, 2011 at 8:30 AM, bthomson <bth...@em...> wrote: > Hi Ted, > > I haven't upgraded to metamap10 yet but I will do that soon. I haven't used > version 10 yet, but yes, it should work if you just rename it as metamap09. > I will move CuiTools over to version 10 though soon! > > Thanks, > > Bridget > > On Fri, 31 Dec 2010, Ted Pedersen wrote: > >> Hi Bridget, >> >> I noticed, I think, that Cuitools has metamap08 and metamap09 >> "covered", but not metamap10. I am using that now (metamap10). Should >> things work for me if I rename as metamap09, or has the metamap output >> changed? We probably want to include support for metamap10 either way. >> >> Thanks! >> Ted >> >> >> >> -- >> Ted Pedersen >> http://www.d.umn.edu/~tpederse >> >> >> ------------------------------------------------------------------------------ >> Learn how Oracle Real Application Clusters (RAC) One Node allows customers >> to consolidate database storage, standardize their database environment, >> and, >> should the need arise, upgrade to a full multi-node Oracle RAC database >> without downtime or disruption >> http://p.sf.net/sfu/oracle-sfdevnl >> _______________________________________________ >> Cuitools-users mailing list >> Cui...@li... >> https://lists.sourceforge.net/lists/listinfo/cuitools-users >> > -- Ted Pedersen http://www.d.umn.edu/~tpederse |
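One way to live with both variables is to resolve the MetaMap directory in a single place, preferring METAMAP_HOME and falling back to METAMAP_PATH. This is a minimal sketch under that assumption; the function name resolve_metamap_dir is made up for illustration, and CuiTools itself may handle the two variables differently.

```shell
# Hypothetical helper: print the MetaMap install directory.
# METAMAP_HOME (used by MetaMap 2010, per the message above) wins;
# METAMAP_PATH (used by CuiTools) is the fallback.
resolve_metamap_dir() {
    if [ -n "${METAMAP_HOME:-}" ]; then
        printf '%s\n' "$METAMAP_HOME"
    elif [ -n "${METAMAP_PATH:-}" ]; then
        printf '%s\n' "$METAMAP_PATH"
    else
        echo "neither METAMAP_HOME nor METAMAP_PATH is set" >&2
        return 1
    fi
}
```

With this in place, a wrapper script could set METAMAP_PATH="$(resolve_metamap_dir)" before calling the CuiTools programs, so a user who has set only one of the two variables is still covered.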
From: Ted P. <tpederse@d.umn.edu> - 2011-01-01 16:40:48
|
Hi Bridget, See comments inline... On Sat, Jan 1, 2011 at 9:29 AM, bthomson <bth...@em...> wrote: > Hi Ted, > > I am going throught the log file now. I know that errors are generated > order1vec.pl which have always happened. Since it was an external program, I > haven't messed with it. Maybe I should replace it with an update version of > it? The version of order1vec.pl in SenseClusters is from 2008, so I suspect you have the most current version of that already...? Also, if I'm not mistaken don't you have some modifications to order1vec.pl in the CuiTools version? I don't remember exactly what they are (or if you have them even). Ah, in looking at the code I see the following.... # 3 JUNE 2008: # # THIS PROGRAM ORIGINALLY BELONGED TO THE SenseClusters PACKAGE # VERSION 0.95 (http://senseclusters.sourceforge.net) DEVELOPED # BY Ted Pedersen, Amruta Purandare, Anagha Kulkarni, and Mahesh # Joshi. ITS ORIGINALLY NAME IS ORDER1VEC - IT HAS BEEN MODIFIED # AND CHANGED TO SIMORDER1VEC AND INCLUDED IN CuiTools DISTRIBUTION Some of the order1vec.pl errors look to me like a missing file or something.... Test A11 for order1vec.pl Running order1vec.pl --extarget --target test-A11.target test-A11.sval2 test-A11.regex Use of uninitialized value $outputfile in concatenation (.) or string at /usr/bin/order1vec.pl line 660, <DATA> line 206. Use of uninitialized value $outputfile in concatenation (.) or string at /usr/bin/order1vec.pl line 660, <DATA> line 206. Could not open output file: Also, I ran a test of the SenseClusters order1vec.pl and it came back clean, so I'm wondering if there might not be something specific to the CuiTools version? > > There are two internal errors: > > 1. Error in Testing/supervised/internal/supervised-disambiguate Test 3 > > supervised-disambiguate.pl --ngramcount "--ngram 1" --wekacv 10 > --directory output mm/adjustment.mm > > > 2. 
Error in Testing/supervised/internal/mm2arff Test 2 > > mm2arff.pl --sentence --stcount --ngram 1 output mm/art.mm > > > I do not get them on one of my machines and do on the other so was able to > do a comparison. The difference was the versions of Text-NSP. I was messing > around with Text-NSP package on one of my computers in order to get some > internal count information for another project a while back, and forgot to > revert back. I ended up lower casing everything in the process. When I > released CuiTools the other day, I went through and updated the test cases > including the mm format which meant updating all of the test cases I think > including the supervised tests. I also revised the output format of the > supervised-disambiguate program last month so I also could have done it then > as well. I just can't remember. Understood, and I'm glad this isn't a major issue! > > Either way, I will get these test cases fixed and update CuiTools to run on > metamap10. That sounds great! Thanks, Ted > > Sorry for these errors! > > Thanks, > > Bridget > > On Fri, 31 Dec 2010, Ted Pedersen wrote: > >> Hi Bridget, >> >> I decided to go ahead and install the new CuiTools (since I'm in the >> process of setting up a new system I recently installed MetaMap and >> Weka and so it seemed like good timing). The install generally seems >> to have went well, except there are a few glitches in the testall.sh >> output. >> >> Any ideas on any of those, and any things I should look at to sort >> those out? Here's a little bit on my setup.... 
>> >> ted@linux-qdw9:~/Download/CuiTools-0.27/Testing> echo $WEKAHOME >> /home/ted/Download/weka-3-6-4 >> ted@linux-qdw9:~/Download/CuiTools-0.27/Testing> echo $METAMAP_PATH >> /home/ted/Download/public_mm >> ted@linux-qdw9:~/Download/CuiTools-0.27/Testing> echo $CUITOOLSHOME >> /home/ted/Download/CuiTools-0.27 >> ted@linux-qdw9:~/Download/CuiTools-0.27/Testing> uname -a >> Linux linux-qdw9 2.6.34.7-0.5-desktop #1 SMP PREEMPT 2010-10-25 >> 08:40:12 +0200 i686 i686 i386 GNU/Linux >> >> I have not yet downloaded or installed the NLM WSD data, but didn't >> think I would need that quite yet... >> >> I've attached a log file, hopefully the mailing list let's that through? >> >> Thanks, >> Ted >> >> >> >> >> -- >> Ted Pedersen >> http://www.d.umn.edu/~tpederse >> > -- Ted Pedersen http://www.d.umn.edu/~tpederse |
From: bthomson <bth...@em...> - 2011-01-01 15:41:43
|
Hi Ted, I am going through the log file now. I know that order1vec.pl generates errors, and it always has. Since it was an external program, I haven't messed with it. Maybe I should replace it with an updated version? There are two internal errors: 1. Error in Testing/supervised/internal/supervised-disambiguate Test 3 supervised-disambiguate.pl --ngramcount "--ngram 1" --wekacv 10 --directory output mm/adjustment.mm 2. Error in Testing/supervised/internal/mm2arff Test 2 mm2arff.pl --sentence --stcount --ngram 1 output mm/art.mm I do not get them on one of my machines but do on the other, so I was able to do a comparison. The difference was the version of Text-NSP. I was messing around with the Text-NSP package on one of my computers in order to get some internal count information for another project a while back, and forgot to revert it. I ended up lowercasing everything in the process. When I released CuiTools the other day, I went through and updated the test cases, including the mm format, which I think meant updating all of the test cases, including the supervised tests. I also revised the output format of the supervised-disambiguate program last month, so I could have done it then as well. I just can't remember. Either way, I will get these test cases fixed and update CuiTools to run on metamap10. Sorry for these errors! Thanks, Bridget On Fri, 31 Dec 2010, Ted Pedersen wrote: > Hi Bridget, > > I decided to go ahead and install the new CuiTools (since I'm in the > process of setting up a new system I recently installed MetaMap and > Weka and so it seemed like good timing). The install generally seems > to have went well, except there are a few glitches in the testall.sh > output. > > Any ideas on any of those, and any things I should look at to sort > those out? Here's a little bit on my setup.... 
> > ted@linux-qdw9:~/Download/CuiTools-0.27/Testing> echo $WEKAHOME > /home/ted/Download/weka-3-6-4 > ted@linux-qdw9:~/Download/CuiTools-0.27/Testing> echo $METAMAP_PATH > /home/ted/Download/public_mm > ted@linux-qdw9:~/Download/CuiTools-0.27/Testing> echo $CUITOOLSHOME > /home/ted/Download/CuiTools-0.27 > ted@linux-qdw9:~/Download/CuiTools-0.27/Testing> uname -a > Linux linux-qdw9 2.6.34.7-0.5-desktop #1 SMP PREEMPT 2010-10-25 > 08:40:12 +0200 i686 i686 i386 GNU/Linux > > I have not yet downloaded or installed the NLM WSD data, but didn't > think I would need that quite yet... > > I've attached a log file, hopefully the mailing list let's that through? > > Thanks, > Ted > > > > > -- > Ted Pedersen > http://www.d.umn.edu/~tpederse > |
From: bthomson <bth...@em...> - 2011-01-01 14:41:28
|
Hi Ted, I haven't upgraded to metamap10 yet but I will do that soon. I haven't used version 10 yet, but yes, it should work if you just rename it as metamap09. I will move CuiTools over to version 10 though soon! Thanks, Bridget On Fri, 31 Dec 2010, Ted Pedersen wrote: > Hi Bridget, > > I noticed, I think, that Cuitools has metamap08 and metamap09 > "covered", but not metamap10. I am using that now (metamap10). Should > things work for me if I rename as metamap09, or has the metamap output > changed? We probably want to include support for metamap10 either way. > > Thanks! > Ted > > > > -- > Ted Pedersen > http://www.d.umn.edu/~tpederse > |
From: Ted P. <tpederse@d.umn.edu> - 2011-01-01 05:03:07
|
Hi Bridget, I noticed, I think, that Cuitools has metamap08 and metamap09 "covered", but not metamap10. I am using that now (metamap10). Should things work for me if I rename as metamap09, or has the metamap output changed? We probably want to include support for metamap10 either way. Thanks! Ted -- Ted Pedersen http://www.d.umn.edu/~tpederse |
From: Ted P. <tpederse@d.umn.edu> - 2011-01-01 04:34:36
|
Hi Bridget, I decided to go ahead and install the new CuiTools (since I'm in the process of setting up a new system I recently installed MetaMap and Weka and so it seemed like good timing). The install generally seems to have gone well, except there are a few glitches in the testall.sh output. Any ideas on any of those, and any things I should look at to sort those out? Here's a little bit on my setup.... ted@linux-qdw9:~/Download/CuiTools-0.27/Testing> echo $WEKAHOME /home/ted/Download/weka-3-6-4 ted@linux-qdw9:~/Download/CuiTools-0.27/Testing> echo $METAMAP_PATH /home/ted/Download/public_mm ted@linux-qdw9:~/Download/CuiTools-0.27/Testing> echo $CUITOOLSHOME /home/ted/Download/CuiTools-0.27 ted@linux-qdw9:~/Download/CuiTools-0.27/Testing> uname -a Linux linux-qdw9 2.6.34.7-0.5-desktop #1 SMP PREEMPT 2010-10-25 08:40:12 +0200 i686 i686 i386 GNU/Linux I have not yet downloaded or installed the NLM WSD data, but didn't think I would need that quite yet... I've attached a log file; hopefully the mailing list lets that through? Thanks, Ted -- Ted Pedersen http://www.d.umn.edu/~tpederse |
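The setup report above checks three variables by hand with echo. A small pre-flight check along the following lines could do the same thing before running testall.sh. It is only a sketch: the function name check_cuitools_env is hypothetical, and the assumption that each variable must point at an existing directory comes from the paths shown in the message, not from the CuiTools documentation.

```shell
# Hypothetical pre-flight check: report which of the three variables from
# the message are unset or do not point at a directory.
# Returns non-zero if any check fails.
check_cuitools_env() {
    status=0
    for name in WEKAHOME METAMAP_PATH CUITOOLSHOME; do
        # Indirect lookup of the variable called $name (POSIX sh has no ${!name}).
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "$name is not set"
            status=1
        elif [ ! -d "$value" ]; then
            echo "$name=$value is not a directory"
            status=1
        fi
    done
    return "$status"
}
```

A typical use would be `check_cuitools_env && ./testall.sh`, so that a missing variable is reported once up front instead of surfacing as scattered test failures.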
From: bthomson <bth...@em...> - 2010-12-31 05:54:50
|
Hello, We are pleased to announce the release of CuiTools version 0.27. CuiTools (Coo-e Tools) is a freely available package of Perl programs for unsupervised and supervised word sense disambiguation (WSD) experiments. The name CuiTools comes from the Concept Unique Identifiers (CUIs) found in the Unified Medical Language System (UMLS). This package allows users to perform supervised or unsupervised word sense disambiguation using information extracted from the UMLS such as CUIs, semantic types and semantic relations as well as general English features such as unigrams, bigrams and part-of-speech information. As of version 0.27, there have been significant changes in the prolog2mm.pl program. The main change is how the target word is identified during the format conversion. I think this change results in a much more reliable conversion from the plain to prolog to mm formats. Thank you, Bridget |
From: bthomson <bth...@em...> - 2010-08-31 23:01:58
|
Hello, After a small hibernation, we are pleased to announce the release of CuiTools version 0.23! CuiTools (Coo-e Tools) is a freely available package of Perl programs for unsupervised and supervised word sense disambiguation (WSD) experiments. The name CuiTools comes from the Concept Unique Identifiers (CUIs) found in the Unified Medical Language System (UMLS). This package allows users to perform supervised or unsupervised word sense disambiguation using information extracted from the UMLS such as CUIs, semantic types and semantic relations as well as general English features such as unigrams, bigrams and part-of-speech information. As of version 0.23, we have: 1) added two more vector measures, Euclidean distance and the Dice coefficient, to our unsupervised method; 2) modified the package to work with the 2009 release of MetaMap; 3) added a utils directory which contains the calculate-statistics.pl program to obtain various statistics from the supervised results; 4) added a demo which uses the supervised-disambiguate.pl program to classify abstracts as related or unrelated. CuiTools is available at: http://cuitools.sourceforge.net/ If you have any questions or comments, please don't hesitate to email. Thank you, Bridget ------------------------------------------------------------------------------ This SF.net email is sponsored by: SourceForge Community SourceForge wants to tell your story. http://p.sf.net/sfu/sf-spreadtheword _______________________________________________ CuiTools-news mailing list Cui...@li... https://lists.sourceforge.net/lists/listinfo/cuitools-news |
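For readers unfamiliar with the two measures named in the 0.23 announcement, the sketch below computes them for two small vectors using awk from the shell. It assumes the common real-valued form of the Dice coefficient, 2(a·b) / (|a|² + |b|²); the announcement does not say which formulation CuiTools uses, and the function name vector_measure and its argument format are invented for this example.

```shell
# Hypothetical helper: vector_measure <euclidean|dice> "a1 a2 ..." "b1 b2 ..."
# Vectors are space-separated numbers of equal length.
vector_measure() {
    awk -v measure="$1" -v a="$2" -v b="$3" 'BEGIN {
        n = split(a, x, " ")
        m = split(b, y, " ")
        if (n != m) {
            print "vectors differ in length" > "/dev/stderr"
            exit 1
        }
        for (i = 1; i <= n; i++) {
            diff2 += (x[i] - y[i]) ^ 2        # squared differences
            dot   += x[i] * y[i]              # dot product a.b
            norms += x[i] ^ 2 + y[i] ^ 2      # |a|^2 + |b|^2
        }
        if (measure == "euclidean") {
            printf "%.4f\n", sqrt(diff2)
        } else if (measure == "dice") {
            # Defined as 0 when both vectors are all zeros.
            printf "%.4f\n", (norms ? 2 * dot / norms : 0)
        } else {
            print "unknown measure: " measure > "/dev/stderr"
            exit 1
        }
    }'
}
```

Note that the two measures point in opposite directions: Euclidean distance is smaller for more similar vectors, while the Dice coefficient is larger, so any code that ranks senses by score has to account for which one is in use.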
From: Bridget T. M. <bth...@cs...> - 2009-01-21 00:09:02
|
Hello, We are pleased to announce the release of CuiTools version 0.21. CuiTools (Coo-e Tools) is a freely available package of Perl programs for unsupervised and supervised word sense disambiguation (WSD) experiments. The name CuiTools comes from the Concept Unique Identifiers (CUIs) found in the Unified Medical Language System (UMLS). This package allows users to perform supervised or unsupervised word sense disambiguation using information extracted from the UMLS such as CUIs, semantic types and semantic relations as well as general English features such as unigrams, bigrams and part-of-speech information. As of version 0.21, there have been significant changes in our unsupervised tools and the converters. For our unsupervised tools, we now require the UMLS-Interface module to be installed. This module accesses the UMLS through a MySQL database, which allows us to incorporate additional UMLS information. For our converters, we have finally fully converted from using the MetaMap Transfer Program (MMTx) to MetaMap. This should dramatically speed up any tool that was using MMTx. Thank you, Bridget |
From: Ted P. <tpederse@d.umn.edu> - 2008-08-04 14:07:28
|
Hi Bridget, Some small warnings/errors from mm2arff.pl when using longer ngrams....below is the output from a few different runs that show this...it's unclear to me what effect these warnings are having...the concern of course is that we might be "missing" a feature (since it's coming from mm2arff), which will be a fairly subtle thing, but potentially important... Defaults options set: --seed 1 --javaparams "-Xmx600m" --cv 10 --weka weka.classifiers.bayes.NaiveBayes --line User defined options set: --relation /home/cs/tpederse/CuiTools-0.19/default_options/relation --lc --mesh --ngramcount "--ngram 4 --remove 5 " --firstranked --stcount "--ngram 1 --remove 5" Output Directories: WEKA directory : ted-4gram.weka ARFF directory : ted-4gram.arff RESULTS directory: ted-4gram.results LOG directory : ted-4gram.log FILE: Demos/TDP.mm/energy.mm FILE: Demos/TDP.mm/scale.mm Use of uninitialized value in pattern match (m//) at /home/cs/tpederse/bin/mm2arff.pl line 1016, <SRC> line 22233. FILE: Demos/TDP.mm/cold.mm FILE: Demos/TDP.mm/strains.mm FILE: Demos/TDP.mm/white.mm FILE: Demos/TDP.mm/transport.mm FILE: Demos/TDP.mm/nutrition.mm FILE: Demos/TDP.mm/determination.mm FILE: Demos/TDP.mm/surgery.mm FILE: Demos/TDP.mm/fluid.mm FILE: Demos/TDP.mm/evaluation.mm FILE: Demos/TDP.mm/frequency.mm FILE: Demos/TDP.mm/discharge.mm FILE: Demos/TDP.mm/weight.mm FILE: Demos/TDP.mm/secretion.mm FILE: Demos/TDP.mm/blood_pressure.mm FILE: Demos/TDP.mm/pressure.mm Use of uninitialized value in pattern match (m//) at /home/cs/tpederse/bin/mm2arff.pl line 995, <SRC> line 22419. Use of uninitialized value in string eq at /home/cs/tpederse/bin/mm2arff.pl line 1003, <SRC> line 22419. Exiting subroutine via next at /home/cs/tpederse/bin/mm2arff.pl line 1003, <SRC> line 22419. 
FILE: Demos/TDP.mm/reduction.mm FILE: Demos/TDP.mm/adjustment.mm FILE: Demos/TDP.mm/degree.mm FILE: Demos/TDP.mm/single.mm FILE: Demos/TDP.mm/mosaic.mm FILE: Demos/TDP.mm/ganglion.mm FILE: Demos/TDP.mm/depression.mm FILE: Demos/TDP.mm/transient.mm Use of uninitialized value in pattern match (m//) at /home/cs/tpederse/bin/mm2arff.pl line 995, <SRC> line 60016. Use of uninitialized value in string eq at /home/cs/tpederse/bin/mm2arff.pl line 1003, <SRC> line 60016. Exiting subroutine via next at /home/cs/tpederse/bin/mm2arff.pl line 1003, <SRC> line 60016. FILE: Demos/TDP.mm/fit.mm FILE: Demos/TDP.mm/growth.mm FILE: Demos/TDP.mm/culture.mm FILE: Demos/TDP.mm/association.mm FILE: Demos/TDP.mm/man.mm FILE: Demos/TDP.mm/sensitivity.mm FILE: Demos/TDP.mm/repair.mm FILE: Demos/TDP.mm/fat.mm FILE: Demos/TDP.mm/japanese.mm FILE: Demos/TDP.mm/ultrasound.mm FILE: Demos/TDP.mm/mole.mm FILE: Demos/TDP.mm/pathology.mm FILE: Demos/TDP.mm/variation.mm FILE: Demos/TDP.mm/condition.mm FILE: Demos/TDP.mm/failure.mm FILE: Demos/TDP.mm/resistance.mm FILE: Demos/TDP.mm/immunosuppression.mm FILE: Demos/TDP.mm/radiation.mm Use of uninitialized value in pattern match (m//) at /home/cs/tpederse/bin/mm2arff.pl line 1016, <SRC> line 5289. 
FILE: Demos/TDP.mm/support.mm FILE: Demos/TDP.mm/implantation.mm FILE: Demos/TDP.mm/extraction.mm FILE: Demos/TDP.mm/glucose.mm FILE: Demos/TDP.mm/lead.mm FILE: Demos/TDP.mm/inhibition.mm FILE: Demos/TDP.mm/sex.mm Results are located: ted-4gram.results/OverallResults Defaults options set: --seed 1 --javaparams "-Xmx600m" --cv 10 --weka weka.classifiers.bayes.NaiveBayes --line User defined options set: --relation /home/cs/tpederse/CuiTools-0.19/default_options/relation --lc --mesh --ngramcount "--ngram 5 --remove 5 " --firstranked --stcount "--ngram 1 --remove 5" Output Directories: WEKA directory : ted-5gram.weka ARFF directory : ted-5gram.arff RESULTS directory: ted-5gram.results LOG directory : ted-5gram.log FILE: Demos/TDP.mm/energy.mm FILE: Demos/TDP.mm/scale.mm Use of uninitialized value in pattern match (m//) at /home/cs/tpederse/bin/mm2arff.pl line 995, <SRC> line 3. Use of uninitialized value in string eq at /home/cs/tpederse/bin/mm2arff.pl line 1003, <SRC> line 3. Exiting subroutine via next at /home/cs/tpederse/bin/mm2arff.pl line 1003, <SRC> line 3. FILE: Demos/TDP.mm/cold.mm FILE: Demos/TDP.mm/strains.mm FILE: Demos/TDP.mm/white.mm FILE: Demos/TDP.mm/transport.mm FILE: Demos/TDP.mm/nutrition.mm Use of uninitialized value in pattern match (m//) at /home/cs/tpederse/bin/mm2arff.pl line 1016, <SRC> line 10666. FILE: Demos/TDP.mm/determination.mm FILE: Demos/TDP.mm/surgery.mm FILE: Demos/TDP.mm/fluid.mm FILE: Demos/TDP.mm/evaluation.mm FILE: Demos/TDP.mm/frequency.mm FILE: Demos/TDP.mm/discharge.mm FILE: Demos/TDP.mm/weight.mm FILE: Demos/TDP.mm/secretion.mm FILE: Demos/TDP.mm/blood_pressure.mm FILE: Demos/TDP.mm/pressure.mm FILE: Demos/TDP.mm/reduction.mm FILE: Demos/TDP.mm/adjustment.mm FILE: Demos/TDP.mm/degree.mm Use of uninitialized value in pattern match (m//) at /home/cs/tpederse/bin/mm2arff.pl line 995, <SRC> line 8660. Use of uninitialized value in string eq at /home/cs/tpederse/bin/mm2arff.pl line 1003, <SRC> line 8660. 
Exiting subroutine via next at /home/cs/tpederse/bin/mm2arff.pl line 1003, <SRC> line 8660. FILE: Demos/TDP.mm/single.mm FILE: Demos/TDP.mm/mosaic.mm FILE: Demos/TDP.mm/ganglion.mm FILE: Demos/TDP.mm/depression.mm FILE: Demos/TDP.mm/transient.mm FILE: Demos/TDP.mm/fit.mm FILE: Demos/TDP.mm/growth.mm FILE: Demos/TDP.mm/culture.mm FILE: Demos/TDP.mm/association.mm FILE: Demos/TDP.mm/man.mm FILE: Demos/TDP.mm/sensitivity.mm Use of uninitialized value in pattern match (m//) at /home/cs/tpederse/bin/mm2arff.pl line 1016, <TRC> line 9019. FILE: Demos/TDP.mm/repair.mm FILE: Demos/TDP.mm/fat.mm FILE: Demos/TDP.mm/japanese.mm FILE: Demos/TDP.mm/ultrasound.mm FILE: Demos/TDP.mm/mole.mm Use of uninitialized value in pattern match (m//) at /home/cs/tpederse/bin/mm2arff.pl line 1016, <SRC> line 22203. FILE: Demos/TDP.mm/pathology.mm FILE: Demos/TDP.mm/variation.mm Use of uninitialized value in pattern match (m//) at /home/cs/tpederse/bin/mm2arff.pl line 995, <SRC> line 1484. Use of uninitialized value in string eq at /home/cs/tpederse/bin/mm2arff.pl line 1003, <SRC> line 1484. Exiting subroutine via next at /home/cs/tpederse/bin/mm2arff.pl line 1003, <SRC> line 1484. FILE: Demos/TDP.mm/condition.mm FILE: Demos/TDP.mm/failure.mm Use of uninitialized value in pattern match (m//) at /home/cs/tpederse/bin/mm2arff.pl line 1016, <SRC> line 39529. 
FILE: Demos/TDP.mm/resistance.mm FILE: Demos/TDP.mm/immunosuppression.mm FILE: Demos/TDP.mm/radiation.mm FILE: Demos/TDP.mm/support.mm FILE: Demos/TDP.mm/implantation.mm FILE: Demos/TDP.mm/extraction.mm FILE: Demos/TDP.mm/glucose.mm FILE: Demos/TDP.mm/lead.mm FILE: Demos/TDP.mm/inhibition.mm FILE: Demos/TDP.mm/sex.mm Results are located: ted-5gram.results/OverallResults Defaults options set: --seed 1 --javaparams "-Xmx600m" --cv 10 --weka weka.classifiers.bayes.NaiveBayes --line User defined options set: --relation /home/cs/tpederse/CuiTools-0.19/default_options/relation --lc --mesh --ngramcount "--ngram 2 --remove 2 --stop default_options/stoplist" --ngrammeasure "pmi.pm" --ngramstat "--score 5.00" --firstranked --stcount "--ngram 1 --remove 5" Output Directories: WEKA directory : ted-bigram.weka ARFF directory : ted-bigram.arff RESULTS directory: ted-bigram.results LOG directory : ted-bigram.log FILE: Demos/TDP.mm/energy.mm FILE: Demos/TDP.mm/scale.mm FILE: Demos/TDP.mm/cold.mm Use of uninitialized value in pattern match (m//) at /home/cs/tpederse/bin/mm2arff.pl line 995, <TRC> line 6051. Use of uninitialized value in string eq at /home/cs/tpederse/bin/mm2arff.pl line 1003, <TRC> line 6051. Exiting subroutine via next at /home/cs/tpederse/bin/mm2arff.pl line 1003, <TRC> line 6051. FILE: Demos/TDP.mm/strains.mm FILE: Demos/TDP.mm/white.mm FILE: Demos/TDP.mm/transport.mm FILE: Demos/TDP.mm/nutrition.mm FILE: Demos/TDP.mm/determination.mm Use of uninitialized value in pattern match (m//) at /home/cs/tpederse/bin/mm2arff.pl line 995, <SRC> line 42211. Use of uninitialized value in string eq at /home/cs/tpederse/bin/mm2arff.pl line 1003, <SRC> line 42211. Exiting subroutine via next at /home/cs/tpederse/bin/mm2arff.pl line 1003, <SRC> line 42211. FILE: Demos/TDP.mm/surgery.mm Use of uninitialized value in pattern match (m//) at /home/cs/tpederse/bin/mm2arff.pl line 1016, <SRC> line 19944. 
FILE: Demos/TDP.mm/fluid.mm Use of uninitialized value in pattern match (m//) at /home/cs/tpederse/bin/mm2arff.pl line 995, <SRC> line 77990. Use of uninitialized value in string eq at /home/cs/tpederse/bin/mm2arff.pl line 1003, <SRC> line 77990. Exiting subroutine via next at /home/cs/tpederse/bin/mm2arff.pl line 1003, <SRC> line 77990. FILE: Demos/TDP.mm/evaluation.mm Use of uninitialized value in pattern match (m//) at /home/cs/tpederse/bin/mm2arff.pl line 1016, <SRC> line 22075. FILE: Demos/TDP.mm/frequency.mm FILE: Demos/TDP.mm/discharge.mm FILE: Demos/TDP.mm/weight.mm FILE: Demos/TDP.mm/secretion.mm FILE: Demos/TDP.mm/blood_pressure.mm FILE: Demos/TDP.mm/pressure.mm Use of uninitialized value in pattern match (m//) at /home/cs/tpederse/bin/mm2arff.pl line 1016, <SRC> line 54855. Use of uninitialized value in pattern match (m//) at /home/cs/tpederse/bin/mm2arff.pl line 1016, <SRC> line 10883. FILE: Demos/TDP.mm/reduction.mm FILE: Demos/TDP.mm/adjustment.mm Use of uninitialized value in pattern match (m//) at /home/cs/tpederse/bin/mm2arff.pl line 995, <SRC> line 45699. Use of uninitialized value in string eq at /home/cs/tpederse/bin/mm2arff.pl line 1003, <SRC> line 45699. Exiting subroutine via next at /home/cs/tpederse/bin/mm2arff.pl line 1003, <SRC> line 45699. FILE: Demos/TDP.mm/degree.mm FILE: Demos/TDP.mm/single.mm FILE: Demos/TDP.mm/mosaic.mm FILE: Demos/TDP.mm/ganglion.mm FILE: Demos/TDP.mm/depression.mm Use of uninitialized value in pattern match (m//) at /home/cs/tpederse/bin/mm2arff.pl line 1016, <SRC> line 59911. Use of uninitialized value in pattern match (m//) at /home/cs/tpederse/bin/mm2arff.pl line 995, <SRC> line 6431. Use of uninitialized value in string eq at /home/cs/tpederse/bin/mm2arff.pl line 1003, <SRC> line 6431. Exiting subroutine via next at /home/cs/tpederse/bin/mm2arff.pl line 1003, <SRC> line 6431. 
FILE: Demos/TDP.mm/transient.mm FILE: Demos/TDP.mm/fit.mm FILE: Demos/TDP.mm/growth.mm FILE: Demos/TDP.mm/culture.mm FILE: Demos/TDP.mm/association.mm FILE: Demos/TDP.mm/man.mm FILE: Demos/TDP.mm/sensitivity.mm FILE: Demos/TDP.mm/repair.mm FILE: Demos/TDP.mm/fat.mm FILE: Demos/TDP.mm/japanese.mm Use of uninitialized value in pattern match (m//) at /home/cs/tpederse/bin/mm2arff.pl line 995, <SRC> line 49873. Use of uninitialized value in string eq at /home/cs/tpederse/bin/mm2arff.pl line 1003, <SRC> line 49873. Exiting subroutine via next at /home/cs/tpederse/bin/mm2arff.pl line 1003, <SRC> line 49873. FILE: Demos/TDP.mm/ultrasound.mm Use of uninitialized value in pattern match (m//) at /home/cs/tpederse/bin/mm2arff.pl line 995, <SRC> line 3. Use of uninitialized value in string eq at /home/cs/tpederse/bin/mm2arff.pl line 1003, <SRC> line 3. Exiting subroutine via next at /home/cs/tpederse/bin/mm2arff.pl line 1003, <SRC> line 3. FILE: Demos/TDP.mm/mole.mm FILE: Demos/TDP.mm/pathology.mm FILE: Demos/TDP.mm/variation.mm FILE: Demos/TDP.mm/condition.mm FILE: Demos/TDP.mm/failure.mm FILE: Demos/TDP.mm/resistance.mm Use of uninitialized value in pattern match (m//) at /home/cs/tpederse/bin/mm2arff.pl line 995, <SRC> line 5971. Use of uninitialized value in string eq at /home/cs/tpederse/bin/mm2arff.pl line 1003, <SRC> line 5971. Exiting subroutine via next at /home/cs/tpederse/bin/mm2arff.pl line 1003, <SRC> line 5971. FILE: Demos/TDP.mm/immunosuppression.mm FILE: Demos/TDP.mm/radiation.mm FILE: Demos/TDP.mm/support.mm FILE: Demos/TDP.mm/implantation.mm FILE: Demos/TDP.mm/extraction.mm FILE: Demos/TDP.mm/glucose.mm FILE: Demos/TDP.mm/lead.mm Use of uninitialized value in pattern match (m//) at /home/cs/tpederse/bin/mm2arff.pl line 1016, <SRC> line 37416. 
FILE: Demos/TDP.mm/inhibition.mm FILE: Demos/TDP.mm/sex.mm Results are located: ted-bigram.results/OverallResults Defaults options set: --seed 1 --javaparams "-Xmx600m" --cv 10 --weka weka.classifiers.bayes.NaiveBayes --line User defined options set: --relation /home/cs/tpederse/CuiTools-0.19/default_options/relation --lc --mesh --ngramcount "--ngram 3 --remove 5 " --firstranked --stcount "--ngram 1 --remove 5" Output Directories: WEKA directory : ted-trigram-freq.weka ARFF directory : ted-trigram-freq.arff RESULTS directory: ted-trigram-freq.results LOG directory : ted-trigram-freq.log FILE: Demos/TDP.mm/energy.mm FILE: Demos/TDP.mm/scale.mm FILE: Demos/TDP.mm/cold.mm FILE: Demos/TDP.mm/strains.mm FILE: Demos/TDP.mm/white.mm FILE: Demos/TDP.mm/transport.mm FILE: Demos/TDP.mm/nutrition.mm FILE: Demos/TDP.mm/determination.mm Use of uninitialized value in pattern match (m//) at /home/cs/tpederse/bin/mm2arff.pl line 995, <SRC> line 77440. Use of uninitialized value in string eq at /home/cs/tpederse/bin/mm2arff.pl line 1003, <SRC> line 77440. Exiting subroutine via next at /home/cs/tpederse/bin/mm2arff.pl line 1003, <SRC> line 77440. FILE: Demos/TDP.mm/surgery.mm FILE: Demos/TDP.mm/fluid.mm Use of uninitialized value in pattern match (m//) at /home/cs/tpederse/bin/mm2arff.pl line 995, <SRC> line 12697. Use of uninitialized value in string eq at /home/cs/tpederse/bin/mm2arff.pl line 1003, <SRC> line 12697. Exiting subroutine via next at /home/cs/tpederse/bin/mm2arff.pl line 1003, <SRC> line 12697. FILE: Demos/TDP.mm/evaluation.mm FILE: Demos/TDP.mm/frequency.mm FILE: Demos/TDP.mm/discharge.mm FILE: Demos/TDP.mm/weight.mm FILE: Demos/TDP.mm/secretion.mm Use of uninitialized value in pattern match (m//) at /home/cs/tpederse/bin/mm2arff.pl line 1016, <SRC> line 76091. 
FILE: Demos/TDP.mm/blood_pressure.mm FILE: Demos/TDP.mm/pressure.mm FILE: Demos/TDP.mm/reduction.mm FILE: Demos/TDP.mm/adjustment.mm Use of uninitialized value in pattern match (m//) at /home/cs/tpederse/bin/mm2arff.pl line 995, <SRC> line 61847. Use of uninitialized value in string eq at /home/cs/tpederse/bin/mm2arff.pl line 1003, <SRC> line 61847. Exiting subroutine via next at /home/cs/tpederse/bin/mm2arff.pl line 1003, <SRC> line 61847. FILE: Demos/TDP.mm/degree.mm FILE: Demos/TDP.mm/single.mm FILE: Demos/TDP.mm/mosaic.mm FILE: Demos/TDP.mm/ganglion.mm FILE: Demos/TDP.mm/depression.mm FILE: Demos/TDP.mm/transient.mm Use of uninitialized value in pattern match (m//) at /home/cs/tpederse/bin/mm2arff.pl line 1016, <SRC> line 18003. FILE: Demos/TDP.mm/fit.mm Use of uninitialized value in pattern match (m//) at /home/cs/tpederse/bin/mm2arff.pl line 995, <SRC> line 3. Use of uninitialized value in string eq at /home/cs/tpederse/bin/mm2arff.pl line 1003, <SRC> line 3. Exiting subroutine via next at /home/cs/tpederse/bin/mm2arff.pl line 1003, <SRC> line 3. FILE: Demos/TDP.mm/growth.mm FILE: Demos/TDP.mm/culture.mm FILE: Demos/TDP.mm/association.mm FILE: Demos/TDP.mm/man.mm FILE: Demos/TDP.mm/sensitivity.mm FILE: Demos/TDP.mm/repair.mm FILE: Demos/TDP.mm/fat.mm FILE: Demos/TDP.mm/japanese.mm FILE: Demos/TDP.mm/ultrasound.mm FILE: Demos/TDP.mm/mole.mm FILE: Demos/TDP.mm/pathology.mm FILE: Demos/TDP.mm/variation.mm FILE: Demos/TDP.mm/condition.mm FILE: Demos/TDP.mm/failure.mm FILE: Demos/TDP.mm/resistance.mm FILE: Demos/TDP.mm/immunosuppression.mm Use of uninitialized value in pattern match (m//) at /home/cs/tpederse/bin/mm2arff.pl line 995, <SRC> line 28027. Use of uninitialized value in string eq at /home/cs/tpederse/bin/mm2arff.pl line 1003, <SRC> line 28027. Exiting subroutine via next at /home/cs/tpederse/bin/mm2arff.pl line 1003, <SRC> line 28027. 
FILE: Demos/TDP.mm/radiation.mm Use of uninitialized value in pattern match (m//) at /home/cs/tpederse/bin/mm2arff.pl line 995, <SRC> line 32085. Use of uninitialized value in string eq at /home/cs/tpederse/bin/mm2arff.pl line 1003, <SRC> line 32085. Exiting subroutine via next at /home/cs/tpederse/bin/mm2arff.pl line 1003, <SRC> line 32085. FILE: Demos/TDP.mm/support.mm FILE: Demos/TDP.mm/implantation.mm FILE: Demos/TDP.mm/extraction.mm FILE: Demos/TDP.mm/glucose.mm FILE: Demos/TDP.mm/lead.mm FILE: Demos/TDP.mm/inhibition.mm Use of uninitialized value in pattern match (m//) at /home/cs/tpederse/bin/mm2arff.pl line 1016, <SRC> line 3. FILE: Demos/TDP.mm/sex.mm Results are located: ted-trigram-freq.results/OverallResults Defaults options set: --seed 1 --javaparams "-Xmx600m" --cv 10 --weka weka.classifiers.bayes.NaiveBayes --line User defined options set: --relation /home/cs/tpederse/CuiTools-0.19/default_options/relation --lc --mesh --ngramcount "--ngram 3 --remove 2 --stop default_options/stoplist" --ngrammeasure "ll3.pm" --ngramstat "--ngram 3 --score 3.841" --firstranked --stcount "--ngram 1 --remove 5" Output Directories: WEKA directory : ted-trigram.weka ARFF directory : ted-trigram.arff RESULTS directory: ted-trigram.results LOG directory : ted-trigram.log FILE: Demos/TDP.mm/energy.mm FILE: Demos/TDP.mm/scale.mm FILE: Demos/TDP.mm/cold.mm Use of uninitialized value in pattern match (m//) at /home/cs/tpederse/bin/mm2arff.pl line 1016, <SRC> line 8064. FILE: Demos/TDP.mm/strains.mm FILE: Demos/TDP.mm/white.mm FILE: Demos/TDP.mm/transport.mm FILE: Demos/TDP.mm/nutrition.mm FILE: Demos/TDP.mm/determination.mm FILE: Demos/TDP.mm/surgery.mm FILE: Demos/TDP.mm/fluid.mm FILE: Demos/TDP.mm/evaluation.mm Use of uninitialized value in pattern match (m//) at /home/cs/tpederse/bin/mm2arff.pl line 995, <SRC> line 3. Use of uninitialized value in string eq at /home/cs/tpederse/bin/mm2arff.pl line 1003, <SRC> line 3. 
Exiting subroutine via next at /home/cs/tpederse/bin/mm2arff.pl line 1003, <SRC> line 3. FILE: Demos/TDP.mm/frequency.mm FILE: Demos/TDP.mm/discharge.mm FILE: Demos/TDP.mm/weight.mm Use of uninitialized value in pattern match (m//) at /home/cs/tpederse/bin/mm2arff.pl line 995, <SRC> line 11739. Use of uninitialized value in string eq at /home/cs/tpederse/bin/mm2arff.pl line 1003, <SRC> line 11739. Exiting subroutine via next at /home/cs/tpederse/bin/mm2arff.pl line 1003, <SRC> line 11739. FILE: Demos/TDP.mm/secretion.mm FILE: Demos/TDP.mm/blood_pressure.mm Use of uninitialized value in pattern match (m//) at /home/cs/tpederse/bin/mm2arff.pl line 995, <TRC> line 8059. Use of uninitialized value in string eq at /home/cs/tpederse/bin/mm2arff.pl line 1003, <TRC> line 8059. Exiting subroutine via next at /home/cs/tpederse/bin/mm2arff.pl line 1003, <TRC> line 8059. FILE: Demos/TDP.mm/pressure.mm FILE: Demos/TDP.mm/reduction.mm FILE: Demos/TDP.mm/adjustment.mm FILE: Demos/TDP.mm/degree.mm FILE: Demos/TDP.mm/single.mm FILE: Demos/TDP.mm/mosaic.mm FILE: Demos/TDP.mm/ganglion.mm FILE: Demos/TDP.mm/depression.mm FILE: Demos/TDP.mm/transient.mm FILE: Demos/TDP.mm/fit.mm FILE: Demos/TDP.mm/growth.mm FILE: Demos/TDP.mm/culture.mm FILE: Demos/TDP.mm/association.mm FILE: Demos/TDP.mm/man.mm FILE: Demos/TDP.mm/sensitivity.mm FILE: Demos/TDP.mm/repair.mm FILE: Demos/TDP.mm/fat.mm Use of uninitialized value in pattern match (m//) at /home/cs/tpederse/bin/mm2arff.pl line 995, <SRC> line 90936. Use of uninitialized value in string eq at /home/cs/tpederse/bin/mm2arff.pl line 1003, <SRC> line 90936. Exiting subroutine via next at /home/cs/tpederse/bin/mm2arff.pl line 1003, <SRC> line 90936. 
FILE: Demos/TDP.mm/japanese.mm FILE: Demos/TDP.mm/ultrasound.mm FILE: Demos/TDP.mm/mole.mm FILE: Demos/TDP.mm/pathology.mm FILE: Demos/TDP.mm/variation.mm FILE: Demos/TDP.mm/condition.mm FILE: Demos/TDP.mm/failure.mm FILE: Demos/TDP.mm/resistance.mm FILE: Demos/TDP.mm/immunosuppression.mm Use of uninitialized value in pattern match (m//) at /home/cs/tpederse/bin/mm2arff.pl line 995, <SRC> line 4432. Use of uninitialized value in string eq at /home/cs/tpederse/bin/mm2arff.pl line 1003, <SRC> line 4432. Exiting subroutine via next at /home/cs/tpederse/bin/mm2arff.pl line 1003, <SRC> line 4432. FILE: Demos/TDP.mm/radiation.mm FILE: Demos/TDP.mm/support.mm FILE: Demos/TDP.mm/implantation.mm FILE: Demos/TDP.mm/extraction.mm FILE: Demos/TDP.mm/glucose.mm FILE: Demos/TDP.mm/lead.mm FILE: Demos/TDP.mm/inhibition.mm FILE: Demos/TDP.mm/sex.mm Results are located: ted-trigram.results/OverallResults -- Ted Pedersen http://www.d.umn.edu/~tpederse |
From: Ted P. <tpederse@d.umn.edu> - 2008-08-04 14:06:23
|
Hi Bridget,

I was doing a little experimenting with the --nspconfig option, and it seems like it might not be recognizing the --score option with statistic::

This was my configuration file, which is a slightly modified version of your example in the perldoc...

ngramcount:: count:: --ngram 2 --remove 3 statistic:: ll.pm --score 3.841
cuicount:: count:: --ngram 2 --remove 3 statistic:: ll.pm --score 3.841
stcount:: count:: --ngram 1 --remove 3

It got the following error...

Defaults options set: --seed 1 --javaparams "-Xmx600m" --cv 10 --weka weka.classifiers.bayes.NaiveBayes --line
User defined options set: --lc --nspconfig ./ted-err.txt
ngramcount:: count:: --ngram 2 --remove 3 statistic:: ll.pm --score 3.841
cuicount:: count:: --ngram 2 --remove 3 statistic:: ll.pm --score 3.841
stcount:: count:: --ngram 1 --remove 3
Output Directories:
WEKA directory : ted-err.weka
ARFF directory : ted-err.arff
RESULTS directory: ted-err.results
LOG directory : ted-err.log
FILE: Demos/TDP.mm/energy.mm
Unknown option: score
Use of uninitialized value in scalar chomp at /usr/local/bin/count.pl line 341.
Output file statistic:: already exists! Overwrite (Y/N)? Could not open NSP output file ted-err.log/ngram.0.374441146691428.input.cnt
ERROR: The ARFF file (ted-err.arff/energy/1.arff.train) is empty

Now, after this happened I thought perhaps statistic:: needed to be on its own line, so I formatted it like this:

ngramcount:: count:: --ngram 2 --remove 3
statistic:: ll.pm --score 3.841
cuicount:: count:: --ngram 2 --remove 3
statistic:: ll.pm --score 3.841
stcount:: count:: --ngram 1 --remove 3

However, in this case it seemed like the statistic:: portion was simply ignored. So I think we might want to have some simple parsing of the file to either detect that statistic:: is being used invalidly here, or to allow line breaks in the middle of the lines (ie treat the two files above as identical, which is probably the better option).

Thanks!
Ted -- Ted Pedersen http://www.d.umn.edu/~tpederse |
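[Editor's sketch] The second option Ted proposes, treating the two layouts as identical, might look roughly like the following. This is a Python illustration only, not CuiTools' actual Perl code. The function name parse_nspconfig and the sets SECTIONS and SUBKEYS are hypothetical, and the key names are taken solely from the example file in the message above. The idea is to split the whole file on whitespace, so a line break before statistic:: cannot change the result.

```python
# Hypothetical key names, taken only from the example file in the message above.
SECTIONS = {"ngramcount", "cuicount", "stcount"}
SUBKEYS = {"count", "statistic"}


def parse_nspconfig(text):
    """Parse nspconfig-style text into {section: {subkey: "arguments"}}.

    The text is split on any whitespace, so a line break before
    "statistic::" makes no difference to the result.
    """
    config = {}
    section = None
    subkey = None
    for token in text.split():
        if token.endswith("::"):
            name = token[:-2]
            if name in SECTIONS:
                # A new top-level block starts; no sub-key is active yet.
                section, subkey = name, None
                config[section] = {}
            elif name in SUBKEYS:
                if section is None:
                    raise ValueError(f"{token} appears before any section")
                subkey = name
                config[section][subkey] = []
            else:
                raise ValueError(f"unknown key: {token}")
        else:
            if section is None or subkey is None:
                raise ValueError(f"argument {token!r} has no owning key")
            config[section][subkey].append(token)
    # Join the collected arguments back into one option string per sub-key.
    return {
        sec: {key: " ".join(args) for key, args in subs.items()}
        for sec, subs in config.items()
    }


ONE_LINE = """\
ngramcount:: count:: --ngram 2 --remove 3 statistic:: ll.pm --score 3.841
stcount:: count:: --ngram 1 --remove 3
"""

SPLIT_LINES = """\
ngramcount:: count:: --ngram 2 --remove 3
statistic:: ll.pm --score 3.841
stcount:: count:: --ngram 1 --remove 3
"""

if __name__ == "__main__":
    # Both layouts should parse to the same structure.
    print(parse_nspconfig(ONE_LINE) == parse_nspconfig(SPLIT_LINES))
```

Because unknown keys and stray arguments raise an error, the same loop also covers Ted's first option (detecting an invalid use) instead of silently ignoring part of the file.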
From: Bridget T. M. <bth...@cs...> - 2008-07-25 13:49:07
|
Hi Ted, This sounds good. I will add it! Thanks, Bridget On Thu, 24 Jul 2008, Ted Pedersen wrote: > Hi Bridget, > > In your INSTALL document, I think we probably want to clarify what > people should be downloading by giving not only what they see on the > link (which you have below) but also the names of the files they get. > Also, I think what they download from the full test collection are the > first and second files there...? > > I see that you say in the INSTALL document we should use the PMID > version, and that seems fine to me...I did not use the PMID version > when I did my install so I'm going back to use PMID.... > > ======================== > The first in the ``Basic Test Collection'': 1. Basic Reviewed Set > > The second and third in the ``Full Test Collection'': 2. Common Files > 3. Full Reviewed Result Set > > Unpack the files in a directory called NLM-WSD (for example - you can > call it anything you would like). You should end up with three > directories in the NLM-WSD directory: 1. Basic_Reviewed_Results 2. > common 3. Reviewed_Results > ========================= > > We may also want caution the user that the file names for the PMID and > non-PMID versions are the same, and give them some idea of how to tell > them apart...and we should probably still have a check somewhere that > automatically verifies that we have the PMID format, not only in the > event that people download the wrong version, but that they might try > to use completely wrong sorts of data.... > > Thanks! > Ted > > On Thu, Jul 24, 2008 at 10:49 AM, Ted Pedersen <tpederse@d.umn.edu> wrote: > > Hi Bridget, > > > > One small question below... > > > > On Wed, Jul 23, 2008 at 4:00 PM, Bridget Thomson McInnes > > <bth...@cs...> wrote: > >> Hi Ted, > >> > >> You are using the wrong version of the NLM-WSD dataset. There are two > >> version, one in which the identifiers are PMIDs and the other which are UIs. > >> Unfortuently, the UIs one is the first one. 
> >> > >> If you go to the NLM-WSD website: http://wsd.nlm.nih.gov/ > >> > >> and log in, scroll down to the first white box. You will see: > >> > >> Switch to PMID identified version of the WSD Test Collection > >> > >> > >> It originally didn't matter which version that you used but now when using > >> the --mesh option or running the make-MMTxNLM-data.sh Demo - the PMID > >> version is required. > > > > I think I'm running the make-MMTxNLM-data.sh script on the same data I'm using > > for everything else (which isn't probably the PMID data....) Is there something > > I should see in the output that would let me know that it wasn't the PMID data? > > > > I can't remember if I asked this before, but I'm wondering if we should simply > > require the PMID data, to avoid having the user get two different > > forms of the same > > data (and potentially use them in the wrong places...) It sounds like > > some things > > will work with both forms of the data, but a few things will not (and require > > PMID), so that sort of suggests that PMID is the more "generic" form of the > > data....is there any downside to using PMID rather than the other form? > > > > Related to this, are there any other possible enhancements in the > > future (that we've > > already discussed) that would require PMID? > > > > What do you think? > > > > Thanks, > > Ted > > > >> > >> I will think about how to go about adding a check for that. I have done it > >> myself to many times as well. And state it clearer in the INSTALL > >> documentation. > >> > >> Sorry about that! > >> > >> Thanks, > >> > >> Bridget > >> > >> On Wed, 23 Jul 2008, Ted Pedersen wrote: > >> > >>> Hi Bridget, > >>> > >>> I'm using whichever one is used by the McInnes07 demo, since I'm just > >>> copying McInnes07.mm and using that as TDP.mm. 
> >>> > >>> Here's the first few lines of adjustment.mm, hopefully that will make > >>> it clear what I'm using - I *think* this is the PMID version since it > >>> has that number in it (which I think is the PMID)? > >>> > >>> <corpus lang='en'> > >>> <lexelt item="adjustment" senses="M1,M2,M3,None"> > >>> <instance id="98076825" alias="adjustment"> > >>> <answer instance="98076825.ab.7" senseid="M2"/> > >>> <context line="Influence of physiological factors on the > >>> age-related increase in blood pressure in healthy men. The independent > >>> a > >>> nd collective influences of several physiological factors on the > >>> age-related increase in blood pressure in healthy men were examined. T > >>> wenty-seven younger and 25 older, mostly normotensive, healthy men > >>> were studied. Blood pressure, body fat, body fat distribution, maxim > >>> al oxygen consumption (VO2max), plasma norepinephrine, dietary Na, and > >>> erythrocyte Na-K pump activity were measured. Older men showed 5 > >>> 7% higher percent body fat, 40% higher plasma norepinephrine > >>> concentration, 14% greater mean arterial blood pressure (MAP), and 5% > >>> high > >>> er plasma K concentration than younger men (all p < 0.01). Older men > >>> showed a 38% (p < 0.01) lower VO2max, 19% (p < 0.05) lower energy > >>> intake, 18% (p < 0.05) lower Na-K pump rate constant, and a 17% (p < > >>> 0.05) lower Na-K pump rate. Group means for MAP were adjusted for > >>> combinations of plasma norepinephrine, waist:thigh ratio, VO2max, and > >>> the Na-K pump rate constant, to determine if any one variable or > >>> combination could account for the age related increase in MAP. > >>> Statistical adjustment for plasma norepinephrine, waist:thigh ratio, > >>> and > >>> Na-K pump rate constant eliminated the significant difference between > >>> MAPs for the two groups. 
Thus, alterations in sympathetic nervou > >>> s system activity, body fat distribution, and the membrane Na-K pump > >>> activity independently contribute to the age-related increase in M > >>> AP in healthy men. "/> > >>> <sentence tw="" id="98076825.ti.1" line="Influence of > >>> physiological factors on the age-related increase in blood pressure in > >>> he > >>> althy men."> > >>> > >>> Does that look like the right format? > >>> > >>> Thanks! > >>> Ted > >>> > >>> On Wed, Jul 23, 2008 at 3:30 PM, Bridget Thomson McInnes > >>> <bth...@cs...> wrote: > >>>> > >>>> Hi Ted, > >>>> > >>>> What version of the NLM-WSD dataset are you using? The --mesh option > >>>> requires that the PMID version be used. > >>>> > >>>> Thanks! > >>>> > >>>> Bridget > >>>> > >>>> On Wed, 23 Jul 2008, Bridget Thomson McInnes wrote: > >>>> > >>>>> Hi Ted, > >>>>> > >>>>> I just downloaded the CuiTools from the webpage and got the same error. > >>>>> I > >>>>> will see why it is doing this! > >>>>> > >>>>> Thanks! > >>>>> > >>>>> Bridget > >>>>> > >>>>> On Wed, 23 Jul 2008, Ted Pedersen wrote: > >>>>> > >>>>>> Hi Bridget, > >>>>>> > >>>>>> I'm in the process of running the --mesh option again, this time just > >>>>>> using adjustment and --mesh, as in... > >>>>>> > >>>>>> supervised-disambiguate.pl Demos/TDP.mm/adjustment.mm --mesh > >>>>>> --directory ted-adjustment-mesh > >>>>>> > >>>>>> One thing I've noticed in the previous cases with --mesh is that the > >>>>>> log directory is empty - which I guess means that no features were > >>>>>> found, or something....then of course the ARFF files don't have any > >>>>>> features in the them either, leading to the majority classifier... 
> >>>>>> > >>>>>> This isn't just specific to --mesh, but I do think it would be a good > >>>>>> idea to issue a warning or possibly even an error when no features are > >>>>>> found, just so the user doesn't end up getting a majority classifier > >>>>>> without realizing it - unless I had been curious about the Mesh > >>>>>> features I might not have noticed any of this, just because you do get > >>>>>> results back even after finding no features. I think it might be ok to > >>>>>> default to a majority classifier in this case, but we'd want the user > >>>>>> to know that this has happened... > >>>>>> > >>>>>> My adjustment run just finished, so I've attached a zip file with the > >>>>>> log, arff, weka and results directories... > >>>>>> > >>>>>> Thanks, > >>>>>> Ted > >>>>>> > >>>>>> On Wed, Jul 23, 2008 at 2:33 PM, Bridget Thomson McInnes > >>>>>> <bth...@cs...> wrote: > >>>>>>> > >>>>>>> Hi Ted, > >>>>>>> > >>>>>>> I am not certain. I am going to redownload the package, do a clean > >>>>>>> install > >>>>>>> and try it again. Hopefully I will be able to recreate it. > >>>>>>> > >>>>>>> Thanks! > >>>>>>> > >>>>>>> Bridget > >>>>>>> > >>>>>>> On Wed, 23 Jul 2008, Ted Pedersen wrote: > >>>>>>> > >>>>>>>> Hi Bridget, > >>>>>>>> > >>>>>>>> Thanks for this script - I ran it and it seems to give me back a fair > >>>>>>>> number of results...so I seem to be able to access PubMed ok... 
> >>>>>>>> > >>>>>>>> marimba(4): perl get-mesh.pl 9337195 > >>>>>>>> 9337195 : *Birth Weight,Blood Glucose/metabolism,Blood > >>>>>>>> Pressure,Brachial Artery/anatomy & > >>>>>>>> histology/*physiology/ultrasonography,Cardiovascular > >>>>>>>> Diseases/*epidemiology,Child,Cholesterol, > >>>>>>>> LDL/blood,England,Female,Humans,Lipids/blood,Muscle, Smooth, > >>>>>>>> Vascular/anatomy & > >>>>>>>> histology/physiology/ultrasonography,Parity,Pregnancy,Regional Blood > >>>>>>>> Flow,Risk Factors,Socioeconomic Factors,Tobacco Smoke > >>>>>>>> Pollution,*Vasodilation > >>>>>>>> > >>>>>>>> What could I check next? > >>>>>>>> > >>>>>>>> Thanks, > >>>>>>>> Ted > >>>>>>>> > >>>>>>>> On Wed, Jul 23, 2008 at 2:17 PM, Bridget Thomson McInnes > >>>>>>>> <bth...@cs...> wrote: > >>>>>>>>> > >>>>>>>>> Hi Ted > >>>>>>>>> > >>>>>>>>> I attached a test script to check. It is called : get-msh.pl > >>>>>>>>> > >>>>>>>>> Here is an example run: > >>>>>>>>> > >>>>>>>>> bthomson@caesar (~) % perl get-msh.pl 9337195 > >>>>>>>>> 9337195 : *Birth Weight,Blood Glucose/metabolism,Blood > >>>>>>>>> Pressure,Brachial > >>>>>>>>> Artery/anatomy & > >>>>>>>>> histology/*physiology/ultrasonography,Cardiovascular > >>>>>>>>> Diseases/*epidemiology,Child,Cholesterol, > >>>>>>>>> LDL/blood,England,Female,Humans,Lipids/blood,Muscle, Smooth, > >>>>>>>>> Vascular/anatomy & > >>>>>>>>> histology/physiology/ultrasonography,Parity,Pregnancy,Regional Blood > >>>>>>>>> Flow,Risk Factors,Socioeconomic Factors,Tobacco Smoke > >>>>>>>>> Pollution,*Vasodilation > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> This is the same result that I get on my computer at school and here > >>>>>>>>> at > >>>>>>>>> work. > >>>>>>>>> > >>>>>>>>> Thanks! > >>>>>>>>> > >>>>>>>>> Bridget > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> On Wed, 23 Jul 2008, Ted Pedersen wrote: > >>>>>>>>> > >>>>>>>>>> Hi Bridget, > >>>>>>>>>> > >>>>>>>>>> TDP.mm is just a copy of McInnesPC07.dir.mm from the Demos > >>>>>>>>>> directory. 
> >>>>>>>>>> Did you mean > >>>>>>>>>> that, or the actual output? > >>>>>>>>>> > >>>>>>>>>> I do have an internet connection so I don't think that's the > >>>>>>>>>> problem. > >>>>>>>>>> How would I know if > >>>>>>>>>> PubMed cut me off? > >>>>>>>>>> > >>>>>>>>>> Thanks! > >>>>>>>>>> Ted > >>>>>>>>>> > >>>>>>>>>> On Wed, Jul 23, 2008 at 12:40 PM, Bridget Thomson McInnes > >>>>>>>>>> <bth...@cs...> wrote: > >>>>>>>>>>> > >>>>>>>>>>> Hi Ted, > >>>>>>>>>>> > >>>>>>>>>>> I am not certain why this is happening. I don't have this problem. > >>>>>>>>>>> The > >>>>>>>>>>> mesh terms are obtained using the PubMed API. I can see two > >>>>>>>>>>> potential > >>>>>>>>>>> problems: > >>>>>>>>>>> 1. No internet connection > >>>>>>>>>>> - which I will put in the documentation! > >>>>>>>>>>> > >>>>>>>>>>> 2. Do you think PubMed cut you off? They have done that to me > >>>>>>>>>>> before. They just start rejecting my queries if they > >>>>>>>>>>> think I have been using it to much. I have not quite > >>>>>>>>>>> determined what to much is yet. I will write a > >>>>>>>>>>> check in the program to make certain that something > >>>>>>>>>>> is coming back and if not error out. > >>>>>>>>>>> > >>>>>>>>>>> Otherwise I can't think of what it is. It isn't like using the > >>>>>>>>>>> UMLSKS > >>>>>>>>>>> API > >>>>>>>>>>> where the ip address needs to be registered. I have only tested > >>>>>>>>>>> this > >>>>>>>>>>> inside NLM - my connection at the apartment goes in and out so I > >>>>>>>>>>> haven't > >>>>>>>>>>> been able to test this on my laptop. > >>>>>>>>>>> > >>>>>>>>>>> Could you send me your TDP.mm file? > >>>>>>>>>>> > >>>>>>>>>>> Thanks! > >>>>>>>>>>> > >>>>>>>>>>> Bridget > >>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>>> On Wed, 23 Jul 2008, Ted Pedersen wrote: > >>>>>>>>>>> > >>>>>>>>>>>> Hi Bridget, > >>>>>>>>>>>> > >>>>>>>>>>>> When using the Mesh option (--mesh) by itself, I seem to never > >>>>>>>>>>>> get > >>>>>>>>>>>> any > >>>>>>>>>>>> features... 
> >>>>>>>>>>>> > >>>>>>>>>>>> My arff files all look something like this... > >>>>>>>>>>>> > >>>>>>>>>>>> @RELATION pressure > >>>>>>>>>>>> @ATTRIBUTE Sense {M1,M2,M3,None} > >>>>>>>>>>>> @DATA > >>>>>>>>>>>> M1 % 97403834 > >>>>>>>>>>>> M1 % 98281278 > >>>>>>>>>>>> M1 % 98124304 > >>>>>>>>>>>> > >>>>>>>>>>>> And so I end up getting a majority classifier... > >>>>>>>>>>>> > >>>>>>>>>>>> Is there something I am supposed to be doing to get the mesh > >>>>>>>>>>>> features? > >>>>>>>>>>>> I am just running like this... > >>>>>>>>>>>> > >>>>>>>>>>>> supervised-disambiguate.pl TDP.mm --mesh > >>>>>>>>>>>> > >>>>>>>>>>>> where TDP.mm is a directory with all the NLM-WSD data in .mm > >>>>>>>>>>>> format > >>>>>>>>>>>> (one file per word). All the files > >>>>>>>>>>>> seem to be getting processed, and no errors are shown, but the > >>>>>>>>>>>> results > >>>>>>>>>>>> are pretty much just a majority > >>>>>>>>>>>> classifier (due to lack of features...) > >>>>>>>>>>>> > >>>>>>>>>>>> Any idea on this? > >>>>>>>>>>>> > >>>>>>>>>>>> Thanks! 
> >>>>>>>>>>>> Ted > >>>>>>>>>>>> > >>>>>>>>>>>> -- > >>>>>>>>>>>> Ted Pedersen > >>>>>>>>>>>> http://www.d.umn.edu/~tpederse > >>>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>> > >>>>>>>>>> > >>>>>>>>>> > >>>>>>>>>> -- > >>>>>>>>>> Ted Pedersen > >>>>>>>>>> http://www.d.umn.edu/~tpederse > >>>>>>>>>> > >>>>>>>> > >>>>>>>> > >>>>>>>> > >>>>>>>> -- > >>>>>>>> Ted Pedersen > >>>>>>>> http://www.d.umn.edu/~tpederse > >>>>>>>> > >>>>>>> > >>>>>> > >>>>>> > >>>>>> > >>>>>> -- > >>>>>> Ted Pedersen > >>>>>> http://www.d.umn.edu/~tpederse > >>>>>> > >>>>> > >>>>> > >>>>> ------------------------------------------------------------------------- > >>>>> This SF.Net email is sponsored by the Moblin Your Move Developer's > >>>>> challenge > >>>>> Build the coolest Linux based applications with Moblin SDK & win great > >>>>> prizes > >>>>> Grand prize is a trip for two to an Open Source event anywhere in the > >>>>> world > >>>>> http://moblin-contest.org/redirect.php?banner_id=100&url=/ > >>>>> _______________________________________________ > >>>>> Cuitools-users mailing list > >>>>> Cui...@li... > >>>>> https://lists.sourceforge.net/lists/listinfo/cuitools-users > >>>>> > >>>> > >>> > >>> > >>> > >>> -- > >>> Ted Pedersen > >>> http://www.d.umn.edu/~tpederse > >>> > >> > > > > > > > > -- > > Ted Pedersen > > http://www.d.umn.edu/~tpederse > > > > > > -- > Ted Pedersen > http://www.d.umn.edu/~tpederse > |
From: Bridget T. M. <bth...@cs...> - 2008-07-25 13:38:15
|
Hi Ted,

> I think I'm running the make-MMTxNLM-data.sh script on the same data I'm using
> for everything else (which isn't probably the PMID data....) Is there something
> I should see in the output that would let me know that it wasn't the PMID data?
>

The make-MMTxNLM-data.sh script should run without error on the UI data (the non-PMID version of the NLM-WSD dataset). But the nlm2sval2 program puts the head tags in the wrong place when using this version, so the output data is not always correct. The nlm2sval2 program requires the PMID version of the NLM-WSD dataset.

There is no visible difference when looking at the dataset; the only difference is the identifier (PMID or UI). I can create a hard-coded check by looking at the first PMID/UI of the target word adjustment in the cit00000 file, which should tell me what version is being used. And then I can set up a check for the --mesh option.

> I can't remember if I asked this before, but I'm wondering if we should simply
> require the PMID data, to avoid having the user get two different
> forms of the same
> data (and potentially use them in the wrong places...) It sounds like
> some things
> will work with both forms of the data, but a few things will not (and require
> PMID), so that sort of suggests that PMID is the more "generic" form of the
> data....is there any downside to using PMID rather than the other form?
>
> Related to this, are there any other possible enhancements in the
> future (that we've
> already discussed) that would require PMID?
>

I do require the PMID version now. I don't see any downside to using the PMID version. In many ways I think it is better, since all new information about an abstract is now stored based on the PMID of the abstract, not the UI. From my understanding, all PubMed abstracts have a PMID but not all have a UI anymore.

Thanks,

Bridget

> What do you think?
>
> Thanks,
> Ted
>
> >
> > I will think about how to go about adding a check for that. I have done it
> > myself too many times as well.
And state it clearer in the INSTALL > > documentation. > > > > Sorry about that! > > > > Thanks, > > > > Bridget > > > > On Wed, 23 Jul 2008, Ted Pedersen wrote: > > > >> Hi Bridget, > >> > >> I'm using whichever one is used by the McInnes07 demo, since I'm just > >> copying McInnes07.mm and using that as TDP.mm. > >> > >> Here's the first few lines of adjustment.mm, hopefully that will make > >> it clear what I'm using - I *think* this is the PMID version since it > >> has that number in it (which I think is the PMID)? > >> > >> <corpus lang='en'> > >> <lexelt item="adjustment" senses="M1,M2,M3,None"> > >> <instance id="98076825" alias="adjustment"> > >> <answer instance="98076825.ab.7" senseid="M2"/> > >> <context line="Influence of physiological factors on the > >> age-related increase in blood pressure in healthy men. The independent > >> a > >> nd collective influences of several physiological factors on the > >> age-related increase in blood pressure in healthy men were examined. T > >> wenty-seven younger and 25 older, mostly normotensive, healthy men > >> were studied. Blood pressure, body fat, body fat distribution, maxim > >> al oxygen consumption (VO2max), plasma norepinephrine, dietary Na, and > >> erythrocyte Na-K pump activity were measured. Older men showed 5 > >> 7% higher percent body fat, 40% higher plasma norepinephrine > >> concentration, 14% greater mean arterial blood pressure (MAP), and 5% > >> high > >> er plasma K concentration than younger men (all p < 0.01). Older men > >> showed a 38% (p < 0.01) lower VO2max, 19% (p < 0.05) lower energy > >> intake, 18% (p < 0.05) lower Na-K pump rate constant, and a 17% (p < > >> 0.05) lower Na-K pump rate. Group means for MAP were adjusted for > >> combinations of plasma norepinephrine, waist:thigh ratio, VO2max, and > >> the Na-K pump rate constant, to determine if any one variable or > >> combination could account for the age related increase in MAP. 
> >> Statistical adjustment for plasma norepinephrine, waist:thigh ratio, > >> and > >> Na-K pump rate constant eliminated the significant difference between > >> MAPs for the two groups. Thus, alterations in sympathetic nervou > >> s system activity, body fat distribution, and the membrane Na-K pump > >> activity independently contribute to the age-related increase in M > >> AP in healthy men. "/> > >> <sentence tw="" id="98076825.ti.1" line="Influence of > >> physiological factors on the age-related increase in blood pressure in > >> he > >> althy men."> > >> > >> Does that look like the right format? > >> > >> Thanks! > >> Ted > >> > >> On Wed, Jul 23, 2008 at 3:30 PM, Bridget Thomson McInnes > >> <bth...@cs...> wrote: > >>> > >>> Hi Ted, > >>> > >>> What version of the NLM-WSD dataset are you using? The --mesh option > >>> requires that the PMID version be used. > >>> > >>> Thanks! > >>> > >>> Bridget > >>> > >>> On Wed, 23 Jul 2008, Bridget Thomson McInnes wrote: > >>> > >>>> Hi Ted, > >>>> > >>>> I just downloaded the CuiTools from the webpage and got the same error. > >>>> I > >>>> will see why it is doing this! > >>>> > >>>> Thanks! > >>>> > >>>> Bridget > >>>> > >>>> On Wed, 23 Jul 2008, Ted Pedersen wrote: > >>>> > >>>>> Hi Bridget, > >>>>> > >>>>> I'm in the process of running the --mesh option again, this time just > >>>>> using adjustment and --mesh, as in... > >>>>> > >>>>> supervised-disambiguate.pl Demos/TDP.mm/adjustment.mm --mesh > >>>>> --directory ted-adjustment-mesh > >>>>> > >>>>> One thing I've noticed in the previous cases with --mesh is that the > >>>>> log directory is empty - which I guess means that no features were > >>>>> found, or something....then of course the ARFF files don't have any > >>>>> features in the them either, leading to the majority classifier... 
> >>>>> > >>>>> This isn't just specific to --mesh, but I do think it would be a good > >>>>> idea to issue a warning or possibly even an error when no features are > >>>>> found, just so the user doesn't end up getting a majority classifier > >>>>> without realizing it - unless I had been curious about the Mesh > >>>>> features I might not have noticed any of this, just because you do get > >>>>> results back even after finding no features. I think it might be ok to > >>>>> default to a majority classifier in this case, but we'd want the user > >>>>> to know that this has happened... > >>>>> > >>>>> My adjustment run just finished, so I've attached a zip file with the > >>>>> log, arff, weka and results directories... > >>>>> > >>>>> Thanks, > >>>>> Ted > >>>>> > >>>>> On Wed, Jul 23, 2008 at 2:33 PM, Bridget Thomson McInnes > >>>>> <bth...@cs...> wrote: > >>>>>> > >>>>>> Hi Ted, > >>>>>> > >>>>>> I am not certain. I am going to redownload the package, do a clean > >>>>>> install > >>>>>> and try it again. Hopefully I will be able to recreate it. > >>>>>> > >>>>>> Thanks! > >>>>>> > >>>>>> Bridget > >>>>>> > >>>>>> On Wed, 23 Jul 2008, Ted Pedersen wrote: > >>>>>> > >>>>>>> Hi Bridget, > >>>>>>> > >>>>>>> Thanks for this script - I ran it and it seems to give me back a fair > >>>>>>> number of results...so I seem to be able to access PubMed ok... > >>>>>>> > >>>>>>> marimba(4): perl get-mesh.pl 9337195 > >>>>>>> 9337195 : *Birth Weight,Blood Glucose/metabolism,Blood > >>>>>>> Pressure,Brachial Artery/anatomy & > >>>>>>> histology/*physiology/ultrasonography,Cardiovascular > >>>>>>> Diseases/*epidemiology,Child,Cholesterol, > >>>>>>> LDL/blood,England,Female,Humans,Lipids/blood,Muscle, Smooth, > >>>>>>> Vascular/anatomy & > >>>>>>> histology/physiology/ultrasonography,Parity,Pregnancy,Regional Blood > >>>>>>> Flow,Risk Factors,Socioeconomic Factors,Tobacco Smoke > >>>>>>> Pollution,*Vasodilation > >>>>>>> > >>>>>>> What could I check next? 
> >>>>>>> > >>>>>>> Thanks, > >>>>>>> Ted > >>>>>>> > >>>>>>> On Wed, Jul 23, 2008 at 2:17 PM, Bridget Thomson McInnes > >>>>>>> <bth...@cs...> wrote: > >>>>>>>> > >>>>>>>> Hi Ted > >>>>>>>> > >>>>>>>> I attached a test script to check. It is called : get-msh.pl > >>>>>>>> > >>>>>>>> Here is an example run: > >>>>>>>> > >>>>>>>> bthomson@caesar (~) % perl get-msh.pl 9337195 > >>>>>>>> 9337195 : *Birth Weight,Blood Glucose/metabolism,Blood > >>>>>>>> Pressure,Brachial > >>>>>>>> Artery/anatomy & > >>>>>>>> histology/*physiology/ultrasonography,Cardiovascular > >>>>>>>> Diseases/*epidemiology,Child,Cholesterol, > >>>>>>>> LDL/blood,England,Female,Humans,Lipids/blood,Muscle, Smooth, > >>>>>>>> Vascular/anatomy & > >>>>>>>> histology/physiology/ultrasonography,Parity,Pregnancy,Regional Blood > >>>>>>>> Flow,Risk Factors,Socioeconomic Factors,Tobacco Smoke > >>>>>>>> Pollution,*Vasodilation > >>>>>>>> > >>>>>>>> > >>>>>>>> This is the same result that I get on my computer at school and here > >>>>>>>> at > >>>>>>>> work. > >>>>>>>> > >>>>>>>> Thanks! > >>>>>>>> > >>>>>>>> Bridget > >>>>>>>> > >>>>>>>> > >>>>>>>> On Wed, 23 Jul 2008, Ted Pedersen wrote: > >>>>>>>> > >>>>>>>>> Hi Bridget, > >>>>>>>>> > >>>>>>>>> TDP.mm is just a copy of McInnesPC07.dir.mm from the Demos > >>>>>>>>> directory. > >>>>>>>>> Did you mean > >>>>>>>>> that, or the actual output? > >>>>>>>>> > >>>>>>>>> I do have an internet connection so I don't think that's the > >>>>>>>>> problem. > >>>>>>>>> How would I know if > >>>>>>>>> PubMed cut me off? > >>>>>>>>> > >>>>>>>>> Thanks! > >>>>>>>>> Ted > >>>>>>>>> > >>>>>>>>> On Wed, Jul 23, 2008 at 12:40 PM, Bridget Thomson McInnes > >>>>>>>>> <bth...@cs...> wrote: > >>>>>>>>>> > >>>>>>>>>> Hi Ted, > >>>>>>>>>> > >>>>>>>>>> I am not certain why this is happening. I don't have this problem. > >>>>>>>>>> The > >>>>>>>>>> mesh terms are obtained using the PubMed API. I can see two > >>>>>>>>>> potential > >>>>>>>>>> problems: > >>>>>>>>>> 1. 
No internet connection > >>>>>>>>>> - which I will put in the documentation! > >>>>>>>>>> > >>>>>>>>>> 2. Do you think PubMed cut you off? They have done that to me > >>>>>>>>>> before. They just start rejecting my queries if they > >>>>>>>>>> think I have been using it to much. I have not quite > >>>>>>>>>> determined what to much is yet. I will write a > >>>>>>>>>> check in the program to make certain that something > >>>>>>>>>> is coming back and if not error out. > >>>>>>>>>> > >>>>>>>>>> Otherwise I can't think of what it is. It isn't like using the > >>>>>>>>>> UMLSKS > >>>>>>>>>> API > >>>>>>>>>> where the ip address needs to be registered. I have only tested > >>>>>>>>>> this > >>>>>>>>>> inside NLM - my connection at the apartment goes in and out so I > >>>>>>>>>> haven't > >>>>>>>>>> been able to test this on my laptop. > >>>>>>>>>> > >>>>>>>>>> Could you send me your TDP.mm file? > >>>>>>>>>> > >>>>>>>>>> Thanks! > >>>>>>>>>> > >>>>>>>>>> Bridget > >>>>>>>>>> > >>>>>>>>>> > >>>>>>>>>> On Wed, 23 Jul 2008, Ted Pedersen wrote: > >>>>>>>>>> > >>>>>>>>>>> Hi Bridget, > >>>>>>>>>>> > >>>>>>>>>>> When using the Mesh option (--mesh) by itself, I seem to never > >>>>>>>>>>> get > >>>>>>>>>>> any > >>>>>>>>>>> features... > >>>>>>>>>>> > >>>>>>>>>>> My arff files all look something like this... > >>>>>>>>>>> > >>>>>>>>>>> @RELATION pressure > >>>>>>>>>>> @ATTRIBUTE Sense {M1,M2,M3,None} > >>>>>>>>>>> @DATA > >>>>>>>>>>> M1 % 97403834 > >>>>>>>>>>> M1 % 98281278 > >>>>>>>>>>> M1 % 98124304 > >>>>>>>>>>> > >>>>>>>>>>> And so I end up getting a majority classifier... > >>>>>>>>>>> > >>>>>>>>>>> Is there something I am supposed to be doing to get the mesh > >>>>>>>>>>> features? > >>>>>>>>>>> I am just running like this... > >>>>>>>>>>> > >>>>>>>>>>> supervised-disambiguate.pl TDP.mm --mesh > >>>>>>>>>>> > >>>>>>>>>>> where TDP.mm is a directory with all the NLM-WSD data in .mm > >>>>>>>>>>> format > >>>>>>>>>>> (one file per word). 
All the files > >>>>>>>>>>> seem to be getting processed, and no errors are shown, but the > >>>>>>>>>>> results > >>>>>>>>>>> are pretty much just a majority > >>>>>>>>>>> classifier (due to lack of features...) > >>>>>>>>>>> > >>>>>>>>>>> Any idea on this? > >>>>>>>>>>> > >>>>>>>>>>> Thanks! > >>>>>>>>>>> Ted > >>>>>>>>>>> > >>>>>>>>>>> -- > >>>>>>>>>>> Ted Pedersen > >>>>>>>>>>> http://www.d.umn.edu/~tpederse > >>>>>>>>>>> > >>>>>>>>>> > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> -- > >>>>>>>>> Ted Pedersen > >>>>>>>>> http://www.d.umn.edu/~tpederse > >>>>>>>>> > >>>>>>> > >>>>>>> > >>>>>>> > >>>>>>> -- > >>>>>>> Ted Pedersen > >>>>>>> http://www.d.umn.edu/~tpederse > >>>>>>> > >>>>>> > >>>>> > >>>>> > >>>>> > >>>>> -- > >>>>> Ted Pedersen > >>>>> http://www.d.umn.edu/~tpederse > >>>>> > >>>> > >>>> > >>>> ------------------------------------------------------------------------- > >>>> This SF.Net email is sponsored by the Moblin Your Move Developer's > >>>> challenge > >>>> Build the coolest Linux based applications with Moblin SDK & win great > >>>> prizes > >>>> Grand prize is a trip for two to an Open Source event anywhere in the > >>>> world > >>>> http://moblin-contest.org/redirect.php?banner_id=100&url=/ > >>>> _______________________________________________ > >>>> Cuitools-users mailing list > >>>> Cui...@li... > >>>> https://lists.sourceforge.net/lists/listinfo/cuitools-users > >>>> > >>> > >> > >> > >> > >> -- > >> Ted Pedersen > >> http://www.d.umn.edu/~tpederse > >> > > > > > > -- > Ted Pedersen > http://www.d.umn.edu/~tpederse > |
From: Ted P. <tpederse@d.umn.edu> - 2008-07-24 16:40:47
|
Hi Bridget,

In your INSTALL document, I think we should clarify what people need to download by giving not only the link text they see (which you have below) but also the names of the files they get. Also, I think what they download from the Full Test Collection are the first and second files there...? I see that the INSTALL document says to use the PMID version, and that seems fine to me. I did not use the PMID version when I did my install, so I'm going back to use PMID.

========================
The first in the ``Basic Test Collection'':

  1. Basic Reviewed Set

The second and third in the ``Full Test Collection'':

  2. Common Files
  3. Full Reviewed Result Set

Unpack the files in a directory called NLM-WSD (for example - you can
call it anything you would like). You should end up with three
directories in the NLM-WSD directory:

  1. Basic_Reviewed_Results
  2. common
  3. Reviewed_Results
=========================

We may also want to caution the user that the file names for the PMID and non-PMID versions are the same, and give them some idea of how to tell them apart. We should probably still have a check somewhere that automatically verifies that the data is in PMID format, to cover not only people who download the wrong version but also people who try to use completely the wrong sort of data.

Thanks!
Ted

On Thu, Jul 24, 2008 at 10:49 AM, Ted Pedersen <tpederse@d.umn.edu> wrote:
> Hi Bridget,
>
> One small question below...
>
> On Wed, Jul 23, 2008 at 4:00 PM, Bridget Thomson McInnes
> <bth...@cs...> wrote:
>> Hi Ted,
>>
>> You are using the wrong version of the NLM-WSD dataset. There are two
>> versions, one in which the identifiers are PMIDs and the other which are UIs.
>> Unfortunately, the UIs one is the first one.
>>
>> If you go to the NLM-WSD website: http://wsd.nlm.nih.gov/
>>
>> and log in, scroll down to the first white box.
You will see: >> >> Switch to PMID identified version of the WSD Test Collection >> >> >> It originally didn't matter which version that you used but now when using >> the --mesh option or running the make-MMTxNLM-data.sh Demo - the PMID >> version is required. > > I think I'm running the make-MMTxNLM-data.sh script on the same data I'm using > for everything else (which isn't probably the PMID data....) Is there something > I should see in the output that would let me know that it wasn't the PMID data? > > I can't remember if I asked this before, but I'm wondering if we should simply > require the PMID data, to avoid having the user get two different > forms of the same > data (and potentially use them in the wrong places...) It sounds like > some things > will work with both forms of the data, but a few things will not (and require > PMID), so that sort of suggests that PMID is the more "generic" form of the > data....is there any downside to using PMID rather than the other form? > > Related to this, are there any other possible enhancements in the > future (that we've > already discussed) that would require PMID? > > What do you think? > > Thanks, > Ted > >> >> I will think about how to go about adding a check for that. I have done it >> myself to many times as well. And state it clearer in the INSTALL >> documentation. >> >> Sorry about that! >> >> Thanks, >> >> Bridget >> >> On Wed, 23 Jul 2008, Ted Pedersen wrote: >> >>> Hi Bridget, >>> >>> I'm using whichever one is used by the McInnes07 demo, since I'm just >>> copying McInnes07.mm and using that as TDP.mm. >>> >>> Here's the first few lines of adjustment.mm, hopefully that will make >>> it clear what I'm using - I *think* this is the PMID version since it >>> has that number in it (which I think is the PMID)? 
>>> >>> <corpus lang='en'> >>> <lexelt item="adjustment" senses="M1,M2,M3,None"> >>> <instance id="98076825" alias="adjustment"> >>> <answer instance="98076825.ab.7" senseid="M2"/> >>> <context line="Influence of physiological factors on the >>> age-related increase in blood pressure in healthy men. The independent >>> a >>> nd collective influences of several physiological factors on the >>> age-related increase in blood pressure in healthy men were examined. T >>> wenty-seven younger and 25 older, mostly normotensive, healthy men >>> were studied. Blood pressure, body fat, body fat distribution, maxim >>> al oxygen consumption (VO2max), plasma norepinephrine, dietary Na, and >>> erythrocyte Na-K pump activity were measured. Older men showed 5 >>> 7% higher percent body fat, 40% higher plasma norepinephrine >>> concentration, 14% greater mean arterial blood pressure (MAP), and 5% >>> high >>> er plasma K concentration than younger men (all p < 0.01). Older men >>> showed a 38% (p < 0.01) lower VO2max, 19% (p < 0.05) lower energy >>> intake, 18% (p < 0.05) lower Na-K pump rate constant, and a 17% (p < >>> 0.05) lower Na-K pump rate. Group means for MAP were adjusted for >>> combinations of plasma norepinephrine, waist:thigh ratio, VO2max, and >>> the Na-K pump rate constant, to determine if any one variable or >>> combination could account for the age related increase in MAP. >>> Statistical adjustment for plasma norepinephrine, waist:thigh ratio, >>> and >>> Na-K pump rate constant eliminated the significant difference between >>> MAPs for the two groups. Thus, alterations in sympathetic nervou >>> s system activity, body fat distribution, and the membrane Na-K pump >>> activity independently contribute to the age-related increase in M >>> AP in healthy men. "/> >>> <sentence tw="" id="98076825.ti.1" line="Influence of >>> physiological factors on the age-related increase in blood pressure in >>> he >>> althy men."> >>> >>> Does that look like the right format? 
>>> >>> Thanks! >>> Ted >>> >>> On Wed, Jul 23, 2008 at 3:30 PM, Bridget Thomson McInnes >>> <bth...@cs...> wrote: >>>> >>>> Hi Ted, >>>> >>>> What version of the NLM-WSD dataset are you using? The --mesh option >>>> requires that the PMID version be used. >>>> >>>> Thanks! >>>> >>>> Bridget >>>> >>>> On Wed, 23 Jul 2008, Bridget Thomson McInnes wrote: >>>> >>>>> Hi Ted, >>>>> >>>>> I just downloaded the CuiTools from the webpage and got the same error. >>>>> I >>>>> will see why it is doing this! >>>>> >>>>> Thanks! >>>>> >>>>> Bridget >>>>> >>>>> On Wed, 23 Jul 2008, Ted Pedersen wrote: >>>>> >>>>>> Hi Bridget, >>>>>> >>>>>> I'm in the process of running the --mesh option again, this time just >>>>>> using adjustment and --mesh, as in... >>>>>> >>>>>> supervised-disambiguate.pl Demos/TDP.mm/adjustment.mm --mesh >>>>>> --directory ted-adjustment-mesh >>>>>> >>>>>> One thing I've noticed in the previous cases with --mesh is that the >>>>>> log directory is empty - which I guess means that no features were >>>>>> found, or something....then of course the ARFF files don't have any >>>>>> features in the them either, leading to the majority classifier... >>>>>> >>>>>> This isn't just specific to --mesh, but I do think it would be a good >>>>>> idea to issue a warning or possibly even an error when no features are >>>>>> found, just so the user doesn't end up getting a majority classifier >>>>>> without realizing it - unless I had been curious about the Mesh >>>>>> features I might not have noticed any of this, just because you do get >>>>>> results back even after finding no features. I think it might be ok to >>>>>> default to a majority classifier in this case, but we'd want the user >>>>>> to know that this has happened... >>>>>> >>>>>> My adjustment run just finished, so I've attached a zip file with the >>>>>> log, arff, weka and results directories... 
>>>>>> >>>>>> Thanks, >>>>>> Ted >>>>>> >>>>>> On Wed, Jul 23, 2008 at 2:33 PM, Bridget Thomson McInnes >>>>>> <bth...@cs...> wrote: >>>>>>> >>>>>>> Hi Ted, >>>>>>> >>>>>>> I am not certain. I am going to redownload the package, do a clean >>>>>>> install >>>>>>> and try it again. Hopefully I will be able to recreate it. >>>>>>> >>>>>>> Thanks! >>>>>>> >>>>>>> Bridget >>>>>>> >>>>>>> On Wed, 23 Jul 2008, Ted Pedersen wrote: >>>>>>> >>>>>>>> Hi Bridget, >>>>>>>> >>>>>>>> Thanks for this script - I ran it and it seems to give me back a fair >>>>>>>> number of results...so I seem to be able to access PubMed ok... >>>>>>>> >>>>>>>> marimba(4): perl get-mesh.pl 9337195 >>>>>>>> 9337195 : *Birth Weight,Blood Glucose/metabolism,Blood >>>>>>>> Pressure,Brachial Artery/anatomy & >>>>>>>> histology/*physiology/ultrasonography,Cardiovascular >>>>>>>> Diseases/*epidemiology,Child,Cholesterol, >>>>>>>> LDL/blood,England,Female,Humans,Lipids/blood,Muscle, Smooth, >>>>>>>> Vascular/anatomy & >>>>>>>> histology/physiology/ultrasonography,Parity,Pregnancy,Regional Blood >>>>>>>> Flow,Risk Factors,Socioeconomic Factors,Tobacco Smoke >>>>>>>> Pollution,*Vasodilation >>>>>>>> >>>>>>>> What could I check next? >>>>>>>> >>>>>>>> Thanks, >>>>>>>> Ted >>>>>>>> >>>>>>>> On Wed, Jul 23, 2008 at 2:17 PM, Bridget Thomson McInnes >>>>>>>> <bth...@cs...> wrote: >>>>>>>>> >>>>>>>>> Hi Ted >>>>>>>>> >>>>>>>>> I attached a test script to check. 
It is called : get-msh.pl >>>>>>>>> >>>>>>>>> Here is an example run: >>>>>>>>> >>>>>>>>> bthomson@caesar (~) % perl get-msh.pl 9337195 >>>>>>>>> 9337195 : *Birth Weight,Blood Glucose/metabolism,Blood >>>>>>>>> Pressure,Brachial >>>>>>>>> Artery/anatomy & >>>>>>>>> histology/*physiology/ultrasonography,Cardiovascular >>>>>>>>> Diseases/*epidemiology,Child,Cholesterol, >>>>>>>>> LDL/blood,England,Female,Humans,Lipids/blood,Muscle, Smooth, >>>>>>>>> Vascular/anatomy & >>>>>>>>> histology/physiology/ultrasonography,Parity,Pregnancy,Regional Blood >>>>>>>>> Flow,Risk Factors,Socioeconomic Factors,Tobacco Smoke >>>>>>>>> Pollution,*Vasodilation >>>>>>>>> >>>>>>>>> >>>>>>>>> This is the same result that I get on my computer at school and here >>>>>>>>> at >>>>>>>>> work. >>>>>>>>> >>>>>>>>> Thanks! >>>>>>>>> >>>>>>>>> Bridget >>>>>>>>> >>>>>>>>> >>>>>>>>> On Wed, 23 Jul 2008, Ted Pedersen wrote: >>>>>>>>> >>>>>>>>>> Hi Bridget, >>>>>>>>>> >>>>>>>>>> TDP.mm is just a copy of McInnesPC07.dir.mm from the Demos >>>>>>>>>> directory. >>>>>>>>>> Did you mean >>>>>>>>>> that, or the actual output? >>>>>>>>>> >>>>>>>>>> I do have an internet connection so I don't think that's the >>>>>>>>>> problem. >>>>>>>>>> How would I know if >>>>>>>>>> PubMed cut me off? >>>>>>>>>> >>>>>>>>>> Thanks! >>>>>>>>>> Ted >>>>>>>>>> >>>>>>>>>> On Wed, Jul 23, 2008 at 12:40 PM, Bridget Thomson McInnes >>>>>>>>>> <bth...@cs...> wrote: >>>>>>>>>>> >>>>>>>>>>> Hi Ted, >>>>>>>>>>> >>>>>>>>>>> I am not certain why this is happening. I don't have this problem. >>>>>>>>>>> The >>>>>>>>>>> mesh terms are obtained using the PubMed API. I can see two >>>>>>>>>>> potential >>>>>>>>>>> problems: >>>>>>>>>>> 1. No internet connection >>>>>>>>>>> - which I will put in the documentation! >>>>>>>>>>> >>>>>>>>>>> 2. Do you think PubMed cut you off? They have done that to me >>>>>>>>>>> before. They just start rejecting my queries if they >>>>>>>>>>> think I have been using it to much. 
I have not quite >>>>>>>>>>> determined what to much is yet. I will write a >>>>>>>>>>> check in the program to make certain that something >>>>>>>>>>> is coming back and if not error out. >>>>>>>>>>> >>>>>>>>>>> Otherwise I can't think of what it is. It isn't like using the >>>>>>>>>>> UMLSKS >>>>>>>>>>> API >>>>>>>>>>> where the ip address needs to be registered. I have only tested >>>>>>>>>>> this >>>>>>>>>>> inside NLM - my connection at the apartment goes in and out so I >>>>>>>>>>> haven't >>>>>>>>>>> been able to test this on my laptop. >>>>>>>>>>> >>>>>>>>>>> Could you send me your TDP.mm file? >>>>>>>>>>> >>>>>>>>>>> Thanks! >>>>>>>>>>> >>>>>>>>>>> Bridget >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> On Wed, 23 Jul 2008, Ted Pedersen wrote: >>>>>>>>>>> >>>>>>>>>>>> Hi Bridget, >>>>>>>>>>>> >>>>>>>>>>>> When using the Mesh option (--mesh) by itself, I seem to never >>>>>>>>>>>> get >>>>>>>>>>>> any >>>>>>>>>>>> features... >>>>>>>>>>>> >>>>>>>>>>>> My arff files all look something like this... >>>>>>>>>>>> >>>>>>>>>>>> @RELATION pressure >>>>>>>>>>>> @ATTRIBUTE Sense {M1,M2,M3,None} >>>>>>>>>>>> @DATA >>>>>>>>>>>> M1 % 97403834 >>>>>>>>>>>> M1 % 98281278 >>>>>>>>>>>> M1 % 98124304 >>>>>>>>>>>> >>>>>>>>>>>> And so I end up getting a majority classifier... >>>>>>>>>>>> >>>>>>>>>>>> Is there something I am supposed to be doing to get the mesh >>>>>>>>>>>> features? >>>>>>>>>>>> I am just running like this... >>>>>>>>>>>> >>>>>>>>>>>> supervised-disambiguate.pl TDP.mm --mesh >>>>>>>>>>>> >>>>>>>>>>>> where TDP.mm is a directory with all the NLM-WSD data in .mm >>>>>>>>>>>> format >>>>>>>>>>>> (one file per word). All the files >>>>>>>>>>>> seem to be getting processed, and no errors are shown, but the >>>>>>>>>>>> results >>>>>>>>>>>> are pretty much just a majority >>>>>>>>>>>> classifier (due to lack of features...) >>>>>>>>>>>> >>>>>>>>>>>> Any idea on this? >>>>>>>>>>>> >>>>>>>>>>>> Thanks! 
>>>>>>>>>>>> Ted >>>>>>>>>>>> >>>>>>>>>>>> -- >>>>>>>>>>>> Ted Pedersen >>>>>>>>>>>> http://www.d.umn.edu/~tpederse >>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> -- >>>>>>>>>> Ted Pedersen >>>>>>>>>> http://www.d.umn.edu/~tpederse >>>>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> -- >>>>>>>> Ted Pedersen >>>>>>>> http://www.d.umn.edu/~tpederse >>>>>>>> >>>>>>> >>>>>> >>>>>> >>>>>> >>>>>> -- >>>>>> Ted Pedersen >>>>>> http://www.d.umn.edu/~tpederse >>>>>> >>>>> >>>>> >>>>> ------------------------------------------------------------------------- >>>>> This SF.Net email is sponsored by the Moblin Your Move Developer's >>>>> challenge >>>>> Build the coolest Linux based applications with Moblin SDK & win great >>>>> prizes >>>>> Grand prize is a trip for two to an Open Source event anywhere in the >>>>> world >>>>> http://moblin-contest.org/redirect.php?banner_id=100&url=/ >>>>> _______________________________________________ >>>>> Cuitools-users mailing list >>>>> Cui...@li... >>>>> https://lists.sourceforge.net/lists/listinfo/cuitools-users >>>>> >>>> >>> >>> >>> >>> -- >>> Ted Pedersen >>> http://www.d.umn.edu/~tpederse >>> >> > > > > -- > Ted Pedersen > http://www.d.umn.edu/~tpederse > -- Ted Pedersen http://www.d.umn.edu/~tpederse |
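The automatic PMID check asked for in the message above could be sketched roughly as follows. This is not CuiTools code, and the function names are hypothetical. It assumes instance ids appear as `<instance id="...">` attributes, as in the adjustment.mm sample quoted in this thread, and it guesses that an 8-digit id (such as 98076825) is an old MEDLINE UI while a 7-digit id (such as 9337195) is a PMID. That rule is a heuristic inferred only from the ids shown in this thread; it would misfire on collections that contain 8-digit PMIDs.

```python
import re

# Hypothetical helper names. The id pattern follows the adjustment.mm
# sample quoted in this thread.
INSTANCE_ID = re.compile(r'<instance\s+id="(\d+)"')


def instance_ids(mm_text):
    """Return the numeric instance ids found in the text of a .mm file."""
    return INSTANCE_ID.findall(mm_text)


def looks_like_ui(identifier):
    """Heuristic (an assumption): treat 8-digit ids as old MEDLINE UIs."""
    return len(identifier) == 8


def check_pmid_format(mm_text):
    """Raise ValueError if no ids are found or any id looks like a UI."""
    ids = instance_ids(mm_text)
    if not ids:
        raise ValueError("no <instance id=...> entries found; is this .mm data?")
    suspects = [i for i in ids if looks_like_ui(i)]
    if suspects:
        raise ValueError(
            "ids look like UIs, not PMIDs: " + ", ".join(suspects[:3])
        )
    return ids


sample = '<instance id="98076825" alias="adjustment">'
try:
    check_pmid_format(sample)
except ValueError as err:
    print("warning:", err)
```

Raising an error rather than silently continuing matches the concern in this thread that a run on the wrong data still produces results.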
From: Ted P. <tpederse@d.umn.edu> - 2008-07-24 15:49:59
|
Hi Bridget,

One small question below...

On Wed, Jul 23, 2008 at 4:00 PM, Bridget Thomson McInnes <bth...@cs...> wrote:
> Hi Ted,
>
> You are using the wrong version of the NLM-WSD dataset. There are two
> versions, one in which the identifiers are PMIDs and the other which are UIs.
> Unfortunately, the UIs one is the first one.
>
> If you go to the NLM-WSD website: http://wsd.nlm.nih.gov/
>
> and log in, scroll down to the first white box. You will see:
>
> Switch to PMID identified version of the WSD Test Collection
>
>
> It originally didn't matter which version that you used but now when using
> the --mesh option or running the make-MMTxNLM-data.sh Demo - the PMID
> version is required.

I think I'm running the make-MMTxNLM-data.sh script on the same data I'm using for everything else (which probably isn't the PMID data). Is there something I should see in the output that would let me know that it wasn't the PMID data?

I can't remember if I asked this before, but I'm wondering if we should simply require the PMID data, to avoid having the user get two different forms of the same data (and potentially use them in the wrong places). It sounds like some things will work with both forms of the data, but a few things will not (and require PMID), which suggests that PMID is the more "generic" form of the data. Is there any downside to using PMID rather than the other form?

Related to this, are there any other possible enhancements in the future (that we've already discussed) that would require PMID?

What do you think?

Thanks,
Ted

>
> I will think about how to go about adding a check for that. I have done it
> myself too many times as well. And state it clearer in the INSTALL
> documentation.
>
> Sorry about that!
>
> Thanks,
>
> Bridget
>
> On Wed, 23 Jul 2008, Ted Pedersen wrote:
>
>> Hi Bridget,
>>
>> I'm using whichever one is used by the McInnes07 demo, since I'm just
>> copying McInnes07.mm and using that as TDP.mm.
>> >> Here's the first few lines of adjustment.mm, hopefully that will make >> it clear what I'm using - I *think* this is the PMID version since it >> has that number in it (which I think is the PMID)? >> >> <corpus lang='en'> >> <lexelt item="adjustment" senses="M1,M2,M3,None"> >> <instance id="98076825" alias="adjustment"> >> <answer instance="98076825.ab.7" senseid="M2"/> >> <context line="Influence of physiological factors on the >> age-related increase in blood pressure in healthy men. The independent >> a >> nd collective influences of several physiological factors on the >> age-related increase in blood pressure in healthy men were examined. T >> wenty-seven younger and 25 older, mostly normotensive, healthy men >> were studied. Blood pressure, body fat, body fat distribution, maxim >> al oxygen consumption (VO2max), plasma norepinephrine, dietary Na, and >> erythrocyte Na-K pump activity were measured. Older men showed 5 >> 7% higher percent body fat, 40% higher plasma norepinephrine >> concentration, 14% greater mean arterial blood pressure (MAP), and 5% >> high >> er plasma K concentration than younger men (all p < 0.01). Older men >> showed a 38% (p < 0.01) lower VO2max, 19% (p < 0.05) lower energy >> intake, 18% (p < 0.05) lower Na-K pump rate constant, and a 17% (p < >> 0.05) lower Na-K pump rate. Group means for MAP were adjusted for >> combinations of plasma norepinephrine, waist:thigh ratio, VO2max, and >> the Na-K pump rate constant, to determine if any one variable or >> combination could account for the age related increase in MAP. >> Statistical adjustment for plasma norepinephrine, waist:thigh ratio, >> and >> Na-K pump rate constant eliminated the significant difference between >> MAPs for the two groups. Thus, alterations in sympathetic nervou >> s system activity, body fat distribution, and the membrane Na-K pump >> activity independently contribute to the age-related increase in M >> AP in healthy men. 
"/> >> <sentence tw="" id="98076825.ti.1" line="Influence of >> physiological factors on the age-related increase in blood pressure in >> he >> althy men."> >> >> Does that look like the right format? >> >> Thanks! >> Ted >> >> On Wed, Jul 23, 2008 at 3:30 PM, Bridget Thomson McInnes >> <bth...@cs...> wrote: >>> >>> Hi Ted, >>> >>> What version of the NLM-WSD dataset are you using? The --mesh option >>> requires that the PMID version be used. >>> >>> Thanks! >>> >>> Bridget >>> >>> On Wed, 23 Jul 2008, Bridget Thomson McInnes wrote: >>> >>>> Hi Ted, >>>> >>>> I just downloaded the CuiTools from the webpage and got the same error. >>>> I >>>> will see why it is doing this! >>>> >>>> Thanks! >>>> >>>> Bridget >>>> >>>> On Wed, 23 Jul 2008, Ted Pedersen wrote: >>>> >>>>> Hi Bridget, >>>>> >>>>> I'm in the process of running the --mesh option again, this time just >>>>> using adjustment and --mesh, as in... >>>>> >>>>> supervised-disambiguate.pl Demos/TDP.mm/adjustment.mm --mesh >>>>> --directory ted-adjustment-mesh >>>>> >>>>> One thing I've noticed in the previous cases with --mesh is that the >>>>> log directory is empty - which I guess means that no features were >>>>> found, or something....then of course the ARFF files don't have any >>>>> features in the them either, leading to the majority classifier... >>>>> >>>>> This isn't just specific to --mesh, but I do think it would be a good >>>>> idea to issue a warning or possibly even an error when no features are >>>>> found, just so the user doesn't end up getting a majority classifier >>>>> without realizing it - unless I had been curious about the Mesh >>>>> features I might not have noticed any of this, just because you do get >>>>> results back even after finding no features. I think it might be ok to >>>>> default to a majority classifier in this case, but we'd want the user >>>>> to know that this has happened... 
>>>>> >>>>> My adjustment run just finished, so I've attached a zip file with the >>>>> log, arff, weka and results directories... >>>>> >>>>> Thanks, >>>>> Ted >>>>> >>>>> On Wed, Jul 23, 2008 at 2:33 PM, Bridget Thomson McInnes >>>>> <bth...@cs...> wrote: >>>>>> >>>>>> Hi Ted, >>>>>> >>>>>> I am not certain. I am going to redownload the package, do a clean >>>>>> install >>>>>> and try it again. Hopefully I will be able to recreate it. >>>>>> >>>>>> Thanks! >>>>>> >>>>>> Bridget >>>>>> >>>>>> On Wed, 23 Jul 2008, Ted Pedersen wrote: >>>>>> >>>>>>> Hi Bridget, >>>>>>> >>>>>>> Thanks for this script - I ran it and it seems to give me back a fair >>>>>>> number of results...so I seem to be able to access PubMed ok... >>>>>>> >>>>>>> marimba(4): perl get-mesh.pl 9337195 >>>>>>> 9337195 : *Birth Weight,Blood Glucose/metabolism,Blood >>>>>>> Pressure,Brachial Artery/anatomy & >>>>>>> histology/*physiology/ultrasonography,Cardiovascular >>>>>>> Diseases/*epidemiology,Child,Cholesterol, >>>>>>> LDL/blood,England,Female,Humans,Lipids/blood,Muscle, Smooth, >>>>>>> Vascular/anatomy & >>>>>>> histology/physiology/ultrasonography,Parity,Pregnancy,Regional Blood >>>>>>> Flow,Risk Factors,Socioeconomic Factors,Tobacco Smoke >>>>>>> Pollution,*Vasodilation >>>>>>> >>>>>>> What could I check next? >>>>>>> >>>>>>> Thanks, >>>>>>> Ted >>>>>>> >>>>>>> On Wed, Jul 23, 2008 at 2:17 PM, Bridget Thomson McInnes >>>>>>> <bth...@cs...> wrote: >>>>>>>> >>>>>>>> Hi Ted >>>>>>>> >>>>>>>> I attached a test script to check. 
It is called : get-msh.pl >>>>>>>> >>>>>>>> Here is an example run: >>>>>>>> >>>>>>>> bthomson@caesar (~) % perl get-msh.pl 9337195 >>>>>>>> 9337195 : *Birth Weight,Blood Glucose/metabolism,Blood >>>>>>>> Pressure,Brachial >>>>>>>> Artery/anatomy & >>>>>>>> histology/*physiology/ultrasonography,Cardiovascular >>>>>>>> Diseases/*epidemiology,Child,Cholesterol, >>>>>>>> LDL/blood,England,Female,Humans,Lipids/blood,Muscle, Smooth, >>>>>>>> Vascular/anatomy & >>>>>>>> histology/physiology/ultrasonography,Parity,Pregnancy,Regional Blood >>>>>>>> Flow,Risk Factors,Socioeconomic Factors,Tobacco Smoke >>>>>>>> Pollution,*Vasodilation >>>>>>>> >>>>>>>> >>>>>>>> This is the same result that I get on my computer at school and here >>>>>>>> at >>>>>>>> work. >>>>>>>> >>>>>>>> Thanks! >>>>>>>> >>>>>>>> Bridget >>>>>>>> >>>>>>>> >>>>>>>> On Wed, 23 Jul 2008, Ted Pedersen wrote: >>>>>>>> >>>>>>>>> Hi Bridget, >>>>>>>>> >>>>>>>>> TDP.mm is just a copy of McInnesPC07.dir.mm from the Demos >>>>>>>>> directory. >>>>>>>>> Did you mean >>>>>>>>> that, or the actual output? >>>>>>>>> >>>>>>>>> I do have an internet connection so I don't think that's the >>>>>>>>> problem. >>>>>>>>> How would I know if >>>>>>>>> PubMed cut me off? >>>>>>>>> >>>>>>>>> Thanks! >>>>>>>>> Ted >>>>>>>>> >>>>>>>>> On Wed, Jul 23, 2008 at 12:40 PM, Bridget Thomson McInnes >>>>>>>>> <bth...@cs...> wrote: >>>>>>>>>> >>>>>>>>>> Hi Ted, >>>>>>>>>> >>>>>>>>>> I am not certain why this is happening. I don't have this problem. >>>>>>>>>> The >>>>>>>>>> mesh terms are obtained using the PubMed API. I can see two >>>>>>>>>> potential >>>>>>>>>> problems: >>>>>>>>>> 1. No internet connection >>>>>>>>>> - which I will put in the documentation! >>>>>>>>>> >>>>>>>>>> 2. Do you think PubMed cut you off? They have done that to me >>>>>>>>>> before. They just start rejecting my queries if they >>>>>>>>>> think I have been using it to much. I have not quite >>>>>>>>>> determined what to much is yet. 
I will write a >>>>>>>>>> check in the program to make certain that something >>>>>>>>>> is coming back and if not error out. >>>>>>>>>> >>>>>>>>>> Otherwise I can't think of what it is. It isn't like using the >>>>>>>>>> UMLSKS >>>>>>>>>> API >>>>>>>>>> where the ip address needs to be registered. I have only tested >>>>>>>>>> this >>>>>>>>>> inside NLM - my connection at the apartment goes in and out so I >>>>>>>>>> haven't >>>>>>>>>> been able to test this on my laptop. >>>>>>>>>> >>>>>>>>>> Could you send me your TDP.mm file? >>>>>>>>>> >>>>>>>>>> Thanks! >>>>>>>>>> >>>>>>>>>> Bridget >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> On Wed, 23 Jul 2008, Ted Pedersen wrote: >>>>>>>>>> >>>>>>>>>>> Hi Bridget, >>>>>>>>>>> >>>>>>>>>>> When using the Mesh option (--mesh) by itself, I seem to never >>>>>>>>>>> get >>>>>>>>>>> any >>>>>>>>>>> features... >>>>>>>>>>> >>>>>>>>>>> My arff files all look something like this... >>>>>>>>>>> >>>>>>>>>>> @RELATION pressure >>>>>>>>>>> @ATTRIBUTE Sense {M1,M2,M3,None} >>>>>>>>>>> @DATA >>>>>>>>>>> M1 % 97403834 >>>>>>>>>>> M1 % 98281278 >>>>>>>>>>> M1 % 98124304 >>>>>>>>>>> >>>>>>>>>>> And so I end up getting a majority classifier... >>>>>>>>>>> >>>>>>>>>>> Is there something I am supposed to be doing to get the mesh >>>>>>>>>>> features? >>>>>>>>>>> I am just running like this... >>>>>>>>>>> >>>>>>>>>>> supervised-disambiguate.pl TDP.mm --mesh >>>>>>>>>>> >>>>>>>>>>> where TDP.mm is a directory with all the NLM-WSD data in .mm >>>>>>>>>>> format >>>>>>>>>>> (one file per word). All the files >>>>>>>>>>> seem to be getting processed, and no errors are shown, but the >>>>>>>>>>> results >>>>>>>>>>> are pretty much just a majority >>>>>>>>>>> classifier (due to lack of features...) >>>>>>>>>>> >>>>>>>>>>> Any idea on this? >>>>>>>>>>> >>>>>>>>>>> Thanks! 
>>>>>>>>>>> Ted >>>>>>>>>>> >>>>>>>>>>> -- >>>>>>>>>>> Ted Pedersen >>>>>>>>>>> http://www.d.umn.edu/~tpederse >>>>>>>>>>> >>>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> -- >>>>>>>>> Ted Pedersen >>>>>>>>> http://www.d.umn.edu/~tpederse >>>>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> -- >>>>>>> Ted Pedersen >>>>>>> http://www.d.umn.edu/~tpederse >>>>>>> >>>>>> >>>>> >>>>> >>>>> >>>>> -- >>>>> Ted Pedersen >>>>> http://www.d.umn.edu/~tpederse >>>>> >>>> >>>> >>>> ------------------------------------------------------------------------- >>>> This SF.Net email is sponsored by the Moblin Your Move Developer's >>>> challenge >>>> Build the coolest Linux based applications with Moblin SDK & win great >>>> prizes >>>> Grand prize is a trip for two to an Open Source event anywhere in the >>>> world >>>> http://moblin-contest.org/redirect.php?banner_id=100&url=/ >>>> _______________________________________________ >>>> Cuitools-users mailing list >>>> Cui...@li... >>>> https://lists.sourceforge.net/lists/listinfo/cuitools-users >>>> >>> >> >> >> >> -- >> Ted Pedersen >> http://www.d.umn.edu/~tpederse >> > -- Ted Pedersen http://www.d.umn.edu/~tpederse |
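The warning for feature-less ARFF files suggested in the quoted thread above could be sketched like this. It is an illustration only, not part of CuiTools, and the function names are hypothetical. It assumes the class attribute is named `Sense`, as in the ARFF excerpt quoted in this thread, and treats any other `@ATTRIBUTE` line as a feature.

```python
def count_feature_attributes(arff_text, class_name="Sense"):
    """Count @ATTRIBUTE lines other than the class attribute."""
    count = 0
    for line in arff_text.splitlines():
        parts = line.split()
        if len(parts) >= 2 and parts[0].upper() == "@ATTRIBUTE":
            if parts[1] != class_name:
                count += 1
    return count


def warn_if_no_features(arff_text):
    """Return a warning string when only the class attribute is present."""
    if count_feature_attributes(arff_text) == 0:
        return ("warning: no features found; "
                "results will be a majority classifier")
    return None


# The ARFF excerpt from this thread: a class attribute and nothing else.
example = """@RELATION pressure
@ATTRIBUTE Sense {M1,M2,M3,None}
@DATA
M1 % 97403834
M1 % 98281278
"""
print(warn_if_no_features(example))
```

Returning a message instead of exiting leaves the caller free to either warn and fall back to the majority classifier or stop with an error, which are the two options raised in the thread.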
From: Ted P. <tpederse@d.umn.edu> - 2008-07-23 23:22:33
|
Hi Bridget,

Thanks for clarifying this. I agree: we should have some kind of error that indicates that the data is not in PMID format when --mesh is being used (and for anything else that requires PMID).

BTW, can the Demos be run using PMID? I don't exactly know why I ran with the version I used.

Thanks!
Ted

On Wed, Jul 23, 2008 at 4:00 PM, Bridget Thomson McInnes <bth...@cs...> wrote:
> Hi Ted,
>
> You are using the wrong version of the NLM-WSD dataset. There are two
> versions, one in which the identifiers are PMIDs and the other which are UIs.
> Unfortunately, the UIs one is the first one.
>
> If you go to the NLM-WSD website: http://wsd.nlm.nih.gov/
>
> and log in, scroll down to the first white box. You will see:
>
> Switch to PMID identified version of the WSD Test Collection
>
>
> It originally didn't matter which version that you used but now when using
> the --mesh option or running the make-MMTxNLM-data.sh Demo - the PMID
> version is required.
>
> I will think about how to go about adding a check for that. I have done it
> myself too many times as well. And state it clearer in the INSTALL
> documentation.
>
> Sorry about that!
>
> Thanks,
>
> Bridget
>
> On Wed, 23 Jul 2008, Ted Pedersen wrote:
>
>> Hi Bridget,
>>
>> I'm using whichever one is used by the McInnes07 demo, since I'm just
>> copying McInnes07.mm and using that as TDP.mm.
>>
>> Here's the first few lines of adjustment.mm, hopefully that will make
>> it clear what I'm using - I *think* this is the PMID version since it
>> has that number in it (which I think is the PMID)?
>>
>> <corpus lang='en'>
>> <lexelt item="adjustment" senses="M1,M2,M3,None">
>> <instance id="98076825" alias="adjustment">
>> <answer instance="98076825.ab.7" senseid="M2"/>
>> <context line="Influence of physiological factors on the
>> age-related increase in blood pressure in healthy men.
The independent >> a >> nd collective influences of several physiological factors on the >> age-related increase in blood pressure in healthy men were examined. T >> wenty-seven younger and 25 older, mostly normotensive, healthy men >> were studied. Blood pressure, body fat, body fat distribution, maxim >> al oxygen consumption (VO2max), plasma norepinephrine, dietary Na, and >> erythrocyte Na-K pump activity were measured. Older men showed 5 >> 7% higher percent body fat, 40% higher plasma norepinephrine >> concentration, 14% greater mean arterial blood pressure (MAP), and 5% >> high >> er plasma K concentration than younger men (all p < 0.01). Older men >> showed a 38% (p < 0.01) lower VO2max, 19% (p < 0.05) lower energy >> intake, 18% (p < 0.05) lower Na-K pump rate constant, and a 17% (p < >> 0.05) lower Na-K pump rate. Group means for MAP were adjusted for >> combinations of plasma norepinephrine, waist:thigh ratio, VO2max, and >> the Na-K pump rate constant, to determine if any one variable or >> combination could account for the age related increase in MAP. >> Statistical adjustment for plasma norepinephrine, waist:thigh ratio, >> and >> Na-K pump rate constant eliminated the significant difference between >> MAPs for the two groups. Thus, alterations in sympathetic nervou >> s system activity, body fat distribution, and the membrane Na-K pump >> activity independently contribute to the age-related increase in M >> AP in healthy men. "/> >> <sentence tw="" id="98076825.ti.1" line="Influence of >> physiological factors on the age-related increase in blood pressure in >> he >> althy men."> >> >> Does that look like the right format? >> >> Thanks! >> Ted >> >> On Wed, Jul 23, 2008 at 3:30 PM, Bridget Thomson McInnes >> <bth...@cs...> wrote: >>> >>> Hi Ted, >>> >>> What version of the NLM-WSD dataset are you using? The --mesh option >>> requires that the PMID version be used. >>> >>> Thanks! 
>>> >>> Bridget >>> >>> On Wed, 23 Jul 2008, Bridget Thomson McInnes wrote: >>> >>>> Hi Ted, >>>> >>>> I just downloaded the CuiTools from the webpage and got the same error. >>>> I >>>> will see why it is doing this! >>>> >>>> Thanks! >>>> >>>> Bridget >>>> >>>> On Wed, 23 Jul 2008, Ted Pedersen wrote: >>>> >>>>> Hi Bridget, >>>>> >>>>> I'm in the process of running the --mesh option again, this time just >>>>> using adjustment and --mesh, as in... >>>>> >>>>> supervised-disambiguate.pl Demos/TDP.mm/adjustment.mm --mesh >>>>> --directory ted-adjustment-mesh >>>>> >>>>> One thing I've noticed in the previous cases with --mesh is that the >>>>> log directory is empty - which I guess means that no features were >>>>> found, or something....then of course the ARFF files don't have any >>>>> features in the them either, leading to the majority classifier... >>>>> >>>>> This isn't just specific to --mesh, but I do think it would be a good >>>>> idea to issue a warning or possibly even an error when no features are >>>>> found, just so the user doesn't end up getting a majority classifier >>>>> without realizing it - unless I had been curious about the Mesh >>>>> features I might not have noticed any of this, just because you do get >>>>> results back even after finding no features. I think it might be ok to >>>>> default to a majority classifier in this case, but we'd want the user >>>>> to know that this has happened... >>>>> >>>>> My adjustment run just finished, so I've attached a zip file with the >>>>> log, arff, weka and results directories... >>>>> >>>>> Thanks, >>>>> Ted >>>>> >>>>> On Wed, Jul 23, 2008 at 2:33 PM, Bridget Thomson McInnes >>>>> <bth...@cs...> wrote: >>>>>> >>>>>> Hi Ted, >>>>>> >>>>>> I am not certain. I am going to redownload the package, do a clean >>>>>> install >>>>>> and try it again. Hopefully I will be able to recreate it. >>>>>> >>>>>> Thanks! 
>>>>>> >>>>>> Bridget >>>>>> >>>>>> On Wed, 23 Jul 2008, Ted Pedersen wrote: >>>>>> >>>>>>> Hi Bridget, >>>>>>> >>>>>>> Thanks for this script - I ran it and it seems to give me back a fair >>>>>>> number of results...so I seem to be able to access PubMed ok... >>>>>>> >>>>>>> marimba(4): perl get-mesh.pl 9337195 >>>>>>> 9337195 : *Birth Weight,Blood Glucose/metabolism,Blood >>>>>>> Pressure,Brachial Artery/anatomy & >>>>>>> histology/*physiology/ultrasonography,Cardiovascular >>>>>>> Diseases/*epidemiology,Child,Cholesterol, >>>>>>> LDL/blood,England,Female,Humans,Lipids/blood,Muscle, Smooth, >>>>>>> Vascular/anatomy & >>>>>>> histology/physiology/ultrasonography,Parity,Pregnancy,Regional Blood >>>>>>> Flow,Risk Factors,Socioeconomic Factors,Tobacco Smoke >>>>>>> Pollution,*Vasodilation >>>>>>> >>>>>>> What could I check next? >>>>>>> >>>>>>> Thanks, >>>>>>> Ted >>>>>>> >>>>>>> On Wed, Jul 23, 2008 at 2:17 PM, Bridget Thomson McInnes >>>>>>> <bth...@cs...> wrote: >>>>>>>> >>>>>>>> Hi Ted >>>>>>>> >>>>>>>> I attached a test script to check. It is called : get-msh.pl >>>>>>>> >>>>>>>> Here is an example run: >>>>>>>> >>>>>>>> bthomson@caesar (~) % perl get-msh.pl 9337195 >>>>>>>> 9337195 : *Birth Weight,Blood Glucose/metabolism,Blood >>>>>>>> Pressure,Brachial >>>>>>>> Artery/anatomy & >>>>>>>> histology/*physiology/ultrasonography,Cardiovascular >>>>>>>> Diseases/*epidemiology,Child,Cholesterol, >>>>>>>> LDL/blood,England,Female,Humans,Lipids/blood,Muscle, Smooth, >>>>>>>> Vascular/anatomy & >>>>>>>> histology/physiology/ultrasonography,Parity,Pregnancy,Regional Blood >>>>>>>> Flow,Risk Factors,Socioeconomic Factors,Tobacco Smoke >>>>>>>> Pollution,*Vasodilation >>>>>>>> >>>>>>>> >>>>>>>> This is the same result that I get on my computer at school and here >>>>>>>> at >>>>>>>> work. >>>>>>>> >>>>>>>> Thanks! 
>>>>>>>> >>>>>>>> Bridget >>>>>>>> >>>>>>>> >>>>>>>> On Wed, 23 Jul 2008, Ted Pedersen wrote: >>>>>>>> >>>>>>>>> Hi Bridget, >>>>>>>>> >>>>>>>>> TDP.mm is just a copy of McInnesPC07.dir.mm from the Demos >>>>>>>>> directory. >>>>>>>>> Did you mean >>>>>>>>> that, or the actual output? >>>>>>>>> >>>>>>>>> I do have an internet connection so I don't think that's the >>>>>>>>> problem. >>>>>>>>> How would I know if >>>>>>>>> PubMed cut me off? >>>>>>>>> >>>>>>>>> Thanks! >>>>>>>>> Ted >>>>>>>>> >>>>>>>>> On Wed, Jul 23, 2008 at 12:40 PM, Bridget Thomson McInnes >>>>>>>>> <bth...@cs...> wrote: >>>>>>>>>> >>>>>>>>>> Hi Ted, >>>>>>>>>> >>>>>>>>>> I am not certain why this is happening. I don't have this problem. >>>>>>>>>> The >>>>>>>>>> mesh terms are obtained using the PubMed API. I can see two >>>>>>>>>> potential >>>>>>>>>> problems: >>>>>>>>>> 1. No internet connection >>>>>>>>>> - which I will put in the documentation! >>>>>>>>>> >>>>>>>>>> 2. Do you think PubMed cut you off? They have done that to me >>>>>>>>>> before. They just start rejecting my queries if they >>>>>>>>>> think I have been using it to much. I have not quite >>>>>>>>>> determined what to much is yet. I will write a >>>>>>>>>> check in the program to make certain that something >>>>>>>>>> is coming back and if not error out. >>>>>>>>>> >>>>>>>>>> Otherwise I can't think of what it is. It isn't like using the >>>>>>>>>> UMLSKS >>>>>>>>>> API >>>>>>>>>> where the ip address needs to be registered. I have only tested >>>>>>>>>> this >>>>>>>>>> inside NLM - my connection at the apartment goes in and out so I >>>>>>>>>> haven't >>>>>>>>>> been able to test this on my laptop. >>>>>>>>>> >>>>>>>>>> Could you send me your TDP.mm file? >>>>>>>>>> >>>>>>>>>> Thanks! 
>>>>>>>>>> >>>>>>>>>> Bridget >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> On Wed, 23 Jul 2008, Ted Pedersen wrote: >>>>>>>>>> >>>>>>>>>>> Hi Bridget, >>>>>>>>>>> >>>>>>>>>>> When using the Mesh option (--mesh) by itself, I seem to never >>>>>>>>>>> get >>>>>>>>>>> any >>>>>>>>>>> features... >>>>>>>>>>> >>>>>>>>>>> My arff files all look something like this... >>>>>>>>>>> >>>>>>>>>>> @RELATION pressure >>>>>>>>>>> @ATTRIBUTE Sense {M1,M2,M3,None} >>>>>>>>>>> @DATA >>>>>>>>>>> M1 % 97403834 >>>>>>>>>>> M1 % 98281278 >>>>>>>>>>> M1 % 98124304 >>>>>>>>>>> >>>>>>>>>>> And so I end up getting a majority classifier... >>>>>>>>>>> >>>>>>>>>>> Is there something I am supposed to be doing to get the mesh >>>>>>>>>>> features? >>>>>>>>>>> I am just running like this... >>>>>>>>>>> >>>>>>>>>>> supervised-disambiguate.pl TDP.mm --mesh >>>>>>>>>>> >>>>>>>>>>> where TDP.mm is a directory with all the NLM-WSD data in .mm >>>>>>>>>>> format >>>>>>>>>>> (one file per word). All the files >>>>>>>>>>> seem to be getting processed, and no errors are shown, but the >>>>>>>>>>> results >>>>>>>>>>> are pretty much just a majority >>>>>>>>>>> classifier (due to lack of features...) >>>>>>>>>>> >>>>>>>>>>> Any idea on this? >>>>>>>>>>> >>>>>>>>>>> Thanks! 
>>>>>>>>>>> Ted >>>>>>>>>>> >>>>>>>>>>> -- >>>>>>>>>>> Ted Pedersen >>>>>>>>>>> http://www.d.umn.edu/~tpederse >>>>>>>>>>> >>>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> -- >>>>>>>>> Ted Pedersen >>>>>>>>> http://www.d.umn.edu/~tpederse >>>>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> -- >>>>>>> Ted Pedersen >>>>>>> http://www.d.umn.edu/~tpederse >>>>>>> >>>>>> >>>>> >>>>> >>>>> >>>>> -- >>>>> Ted Pedersen >>>>> http://www.d.umn.edu/~tpederse >>>>> >>>> >>>> >>>> ------------------------------------------------------------------------- >>>> This SF.Net email is sponsored by the Moblin Your Move Developer's >>>> challenge >>>> Build the coolest Linux based applications with Moblin SDK & win great >>>> prizes >>>> Grand prize is a trip for two to an Open Source event anywhere in the >>>> world >>>> http://moblin-contest.org/redirect.php?banner_id=100&url=/ >>>> _______________________________________________ >>>> Cuitools-users mailing list >>>> Cui...@li... >>>> https://lists.sourceforge.net/lists/listinfo/cuitools-users >>>> >>> >> >> >> >> -- >> Ted Pedersen >> http://www.d.umn.edu/~tpederse >> > -- Ted Pedersen http://www.d.umn.edu/~tpederse |
From: Bridget T. M. <bth...@cs...> - 2008-07-23 21:00:15
|
Hi Ted,

You are using the wrong version of the NLM-WSD dataset. There are two versions, one in which the identifiers are PMIDs and the other in which they are UIs. Unfortunately, the UI version is the first one.

If you go to the NLM-WSD website:

    http://wsd.nlm.nih.gov/

and log in, scroll down to the first white box. You will see:

    Switch to PMID identified version of the WSD Test Collection

It originally didn't matter which version you used, but now when using the --mesh option or running the make-MMTxNLM-data.sh Demo, the PMID version is required.

I will think about how to go about adding a check for that. I have done it myself too many times as well. I will also state it more clearly in the INSTALL documentation.

Sorry about that!

Thanks,

Bridget |
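A check of the kind discussed in this message could look roughly like the following. This is a minimal sketch, not CuiTools code: CuiTools is written in Perl, Python is used here only for illustration, and `fetch_mesh` is a hypothetical callable standing in for whatever performs the PubMed lookup. The idea is simply to stop with an error when no identifier yields any MeSH terms, which would happen both with UI-identified data and when PubMed is unreachable.

```python
from typing import Callable, Dict, Iterable, List


class NoMeshTermsError(RuntimeError):
    """Raised when no instance id yields any MeSH terms."""


def require_mesh_terms(ids: Iterable[str],
                       fetch_mesh: Callable[[str], List[str]]) -> Dict[str, List[str]]:
    """Look up MeSH terms for each id and fail loudly if none come back.

    fetch_mesh is a hypothetical stand-in for the real PubMed lookup: it
    takes an identifier and returns a (possibly empty) list of terms.
    """
    results = {doc_id: fetch_mesh(doc_id) for doc_id in ids}
    if not results:
        raise NoMeshTermsError("no instance ids were supplied")
    if not any(results.values()):
        raise NoMeshTermsError(
            "no MeSH terms returned for any of %d ids; the data may use UI "
            "rather than PMID identifiers, or PubMed may be unreachable"
            % len(results)
        )
    return results
```

Passing the lookup in as a parameter keeps the check testable without network access; a single id with no terms is tolerated, and only an entirely empty result is treated as an error.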
From: Ted P. <tpederse@d.umn.edu> - 2008-07-23 20:50:27
|
Hi Bridget,

I'm using whichever one is used by the McInnes07 demo, since I'm just copying McInnes07.mm and using that as TDP.mm.

Here are the first few lines of adjustment.mm, hopefully that will make it clear what I'm using - I *think* this is the PMID version since it has that number in it (which I think is the PMID)?

<corpus lang='en'>
<lexelt item="adjustment" senses="M1,M2,M3,None">
<instance id="98076825" alias="adjustment">
<answer instance="98076825.ab.7" senseid="M2"/>
<context line="Influence of physiological factors on the age-related increase in blood pressure in healthy men. The independent and collective influences of several physiological factors on the age-related increase in blood pressure in healthy men were examined. Twenty-seven younger and 25 older, mostly normotensive, healthy men were studied. Blood pressure, body fat, body fat distribution, maximal oxygen consumption (VO2max), plasma norepinephrine, dietary Na, and erythrocyte Na-K pump activity were measured. Older men showed 57% higher percent body fat, 40% higher plasma norepinephrine concentration, 14% greater mean arterial blood pressure (MAP), and 5% higher plasma K concentration than younger men (all p < 0.01). Older men showed a 38% (p < 0.01) lower VO2max, 19% (p < 0.05) lower energy intake, 18% (p < 0.05) lower Na-K pump rate constant, and a 17% (p < 0.05) lower Na-K pump rate. Group means for MAP were adjusted for combinations of plasma norepinephrine, waist:thigh ratio, VO2max, and the Na-K pump rate constant, to determine if any one variable or combination could account for the age related increase in MAP. Statistical adjustment for plasma norepinephrine, waist:thigh ratio, and Na-K pump rate constant eliminated the significant difference between MAPs for the two groups. Thus, alterations in sympathetic nervous system activity, body fat distribution, and the membrane Na-K pump activity independently contribute to the age-related increase in MAP in healthy men. "/>
<sentence tw="" id="98076825.ti.1" line="Influence of physiological factors on the age-related increase in blood pressure in healthy men.">

Does that look like the right format?

Thanks!
Ted

--
Ted Pedersen
http://www.d.umn.edu/~tpederse |
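To see which identifiers a .mm file actually carries, the instance ids can be pulled out and inspected directly. The following is a small sketch in Python (an illustration only, not part of CuiTools), assuming the file looks like the excerpt in this message. It uses a regular expression rather than an XML parser because the context line, at least as displayed here, contains unescaped "<" characters (for example "p < 0.01") that a strict parser would reject.

```python
import re
from typing import List

# Matches the id attribute of an <instance ...> tag as shown in the excerpt.
# It does not match <answer instance="..."> because that tag name differs.
INSTANCE_ID = re.compile(r'<instance\s+id="([^"]+)"')


def instance_ids(mm_text: str) -> List[str]:
    """Return the instance ids in the order they appear in the .mm text."""
    return INSTANCE_ID.findall(mm_text)


sample = '''<corpus lang='en'>
<lexelt item="adjustment" senses="M1,M2,M3,None">
<instance id="98076825" alias="adjustment">
<answer instance="98076825.ab.7" senseid="M2"/>
'''

print(instance_ids(sample))  # prints ['98076825']
```

Both kinds of identifier are plain numbers, so this only lists the ids; deciding whether they are PMIDs still means comparing them against the PMID-identified download or looking one up in PubMed.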
From: Bridget T. M. <bth...@cs...> - 2008-07-23 20:30:08
|
Hi Ted,

What version of the NLM-WSD dataset are you using? The --mesh option requires that the PMID version be used.

Thanks!

Bridget

On Wed, 23 Jul 2008, Bridget Thomson McInnes wrote:

> Hi Ted,
>
> I just downloaded the CuiTools from the webpage and got the same error. I
> will see why it is doing this!
>
> Thanks!
>
> Bridget
>
> On Wed, 23 Jul 2008, Ted Pedersen wrote:
>
>> Hi Bridget,
>>
>> I'm in the process of running the --mesh option again, this time just
>> using adjustment and --mesh, as in...
>>
>> supervised-disambiguate.pl Demos/TDP.mm/adjustment.mm --mesh
>> --directory ted-adjustment-mesh
>>
>> One thing I've noticed in the previous cases with --mesh is that the
>> log directory is empty - which I guess means that no features were
>> found, or something....then of course the ARFF files don't have any
>> features in them either, leading to the majority classifier...
>>
>> This isn't just specific to --mesh, but I do think it would be a good
>> idea to issue a warning or possibly even an error when no features are
>> found, just so the user doesn't end up getting a majority classifier
>> without realizing it - unless I had been curious about the Mesh
>> features I might not have noticed any of this, just because you do get
>> results back even after finding no features. I think it might be ok to
>> default to a majority classifier in this case, but we'd want the user
>> to know that this has happened...
>>
>> My adjustment run just finished, so I've attached a zip file with the
>> log, arff, weka and results directories...
>>
>> Thanks,
>> Ted
>>
>> On Wed, Jul 23, 2008 at 2:33 PM, Bridget Thomson McInnes
>> <bth...@cs...> wrote:
>>> Hi Ted,
>>>
>>> I am not certain. I am going to redownload the package, do a clean install
>>> and try it again. Hopefully I will be able to recreate it.
>>>
>>> Thanks!
>>>
>>> Bridget
>>>
>>> On Wed, 23 Jul 2008, Ted Pedersen wrote:
>>>
>>>> Hi Bridget,
>>>>
>>>> Thanks for this script - I ran it and it seems to give me back a fair
>>>> number of results...so I seem to be able to access PubMed ok...
>>>>
>>>> marimba(4): perl get-mesh.pl 9337195
>>>> 9337195 : *Birth Weight,Blood Glucose/metabolism,Blood
>>>> Pressure,Brachial Artery/anatomy &
>>>> histology/*physiology/ultrasonography,Cardiovascular
>>>> Diseases/*epidemiology,Child,Cholesterol,
>>>> LDL/blood,England,Female,Humans,Lipids/blood,Muscle, Smooth,
>>>> Vascular/anatomy &
>>>> histology/physiology/ultrasonography,Parity,Pregnancy,Regional Blood
>>>> Flow,Risk Factors,Socioeconomic Factors,Tobacco Smoke
>>>> Pollution,*Vasodilation
>>>>
>>>> What could I check next?
>>>>
>>>> Thanks,
>>>> Ted
>>>>
>>>> On Wed, Jul 23, 2008 at 2:17 PM, Bridget Thomson McInnes
>>>> <bth...@cs...> wrote:
>>>>>
>>>>> Hi Ted
>>>>>
>>>>> I attached a test script to check. It is called: get-msh.pl
>>>>>
>>>>> Here is an example run:
>>>>>
>>>>> bthomson@caesar (~) % perl get-msh.pl 9337195
>>>>> 9337195 : *Birth Weight,Blood Glucose/metabolism,Blood Pressure,Brachial
>>>>> Artery/anatomy & histology/*physiology/ultrasonography,Cardiovascular
>>>>> Diseases/*epidemiology,Child,Cholesterol,
>>>>> LDL/blood,England,Female,Humans,Lipids/blood,Muscle, Smooth,
>>>>> Vascular/anatomy &
>>>>> histology/physiology/ultrasonography,Parity,Pregnancy,Regional Blood
>>>>> Flow,Risk Factors,Socioeconomic Factors,Tobacco Smoke
>>>>> Pollution,*Vasodilation
>>>>>
>>>>> This is the same result that I get on my computer at school and here at
>>>>> work.
>>>>>
>>>>> Thanks!
>>>>>
>>>>> Bridget
>>>>>
>>>>> On Wed, 23 Jul 2008, Ted Pedersen wrote:
>>>>>
>>>>>> Hi Bridget,
>>>>>>
>>>>>> TDP.mm is just a copy of McInnesPC07.dir.mm from the Demos directory.
>>>>>> Did you mean
>>>>>> that, or the actual output?
>>>>>>
>>>>>> I do have an internet connection so I don't think that's the problem.
>>>>>> How would I know if
>>>>>> PubMed cut me off?
>>>>>>
>>>>>> Thanks!
>>>>>> Ted
>>>>>>
>>>>>> On Wed, Jul 23, 2008 at 12:40 PM, Bridget Thomson McInnes
>>>>>> <bth...@cs...> wrote:
>>>>>>>
>>>>>>> Hi Ted,
>>>>>>>
>>>>>>> I am not certain why this is happening. I don't have this problem. The
>>>>>>> mesh terms are obtained using the PubMed API. I can see two potential
>>>>>>> problems:
>>>>>>> 1. No internet connection
>>>>>>>    - which I will put in the documentation!
>>>>>>>
>>>>>>> 2. Do you think PubMed cut you off? They have done that to me
>>>>>>>    before. They just start rejecting my queries if they
>>>>>>>    think I have been using it too much. I have not quite
>>>>>>>    determined what too much is yet. I will write a
>>>>>>>    check in the program to make certain that something
>>>>>>>    is coming back and if not error out.
>>>>>>>
>>>>>>> Otherwise I can't think of what it is. It isn't like using the UMLSKS
>>>>>>> API
>>>>>>> where the IP address needs to be registered. I have only tested this
>>>>>>> inside NLM - my connection at the apartment goes in and out so I
>>>>>>> haven't
>>>>>>> been able to test this on my laptop.
>>>>>>>
>>>>>>> Could you send me your TDP.mm file?
>>>>>>>
>>>>>>> Thanks!
>>>>>>>
>>>>>>> Bridget
>>>>>>>
>>>>>>> On Wed, 23 Jul 2008, Ted Pedersen wrote:
>>>>>>>
>>>>>>>> Hi Bridget,
>>>>>>>>
>>>>>>>> When using the Mesh option (--mesh) by itself, I seem to never get any
>>>>>>>> features...
>>>>>>>>
>>>>>>>> My arff files all look something like this...
>>>>>>>>
>>>>>>>> @RELATION pressure
>>>>>>>> @ATTRIBUTE Sense {M1,M2,M3,None}
>>>>>>>> @DATA
>>>>>>>> M1 % 97403834
>>>>>>>> M1 % 98281278
>>>>>>>> M1 % 98124304
>>>>>>>>
>>>>>>>> And so I end up getting a majority classifier...
>>>>>>>>
>>>>>>>> Is there something I am supposed to be doing to get the mesh features?
>>>>>>>> I am just running like this...
>>>>>>>>
>>>>>>>> supervised-disambiguate.pl TDP.mm --mesh
>>>>>>>>
>>>>>>>> where TDP.mm is a directory with all the NLM-WSD data in .mm format
>>>>>>>> (one file per word). All the files
>>>>>>>> seem to be getting processed, and no errors are shown, but the results
>>>>>>>> are pretty much just a majority
>>>>>>>> classifier (due to lack of features...)
>>>>>>>>
>>>>>>>> Any idea on this?
>>>>>>>>
>>>>>>>> Thanks!
>>>>>>>> Ted
>>>>>>>>
>>>>>>>> --
>>>>>>>> Ted Pedersen
>>>>>>>> http://www.d.umn.edu/~tpederse
>>>>>>
>>>>>> --
>>>>>> Ted Pedersen
>>>>>> http://www.d.umn.edu/~tpederse
>>>>
>>>> --
>>>> Ted Pedersen
>>>> http://www.d.umn.edu/~tpederse
>>
>> --
>> Ted Pedersen
>> http://www.d.umn.edu/~tpederse |
From: Bridget T. M. <bth...@cs...> - 2008-07-23 20:13:48
|
Hi Ted, I just downloaded CuiTools from the webpage and got the same error. I will see why it is doing this! Thanks! Bridget On Wed, 23 Jul 2008, Ted Pedersen wrote: > Hi Bridget, > > I'm in the process of running the --mesh option again, this time just > using adjustment and --mesh, as in... > > supervised-disambiguate.pl Demos/TDP.mm/adjustment.mm --mesh > --directory ted-adjustment-mesh > > One thing I've noticed in the previous cases with --mesh is that the > log directory is empty - which I guess means that no features were > found, or something...then of course the ARFF files don't have any > features in them either, leading to the majority classifier... > > This isn't just specific to --mesh, but I do think it would be a good > idea to issue a warning or possibly even an error when no features are > found, just so the user doesn't end up getting a majority classifier > without realizing it - unless I had been curious about the MeSH > features I might not have noticed any of this, just because you do get > results back even after finding no features. I think it might be ok to > default to a majority classifier in this case, but we'd want the user > to know that this has happened... > > My adjustment run just finished, so I've attached a zip file with the > log, arff, weka and results directories... > > Thanks, > Ted > > -- > Ted Pedersen > http://www.d.umn.edu/~tpederse > |
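Ted's suggestion quoted above, to warn the user when no features are found, could be sketched roughly as follows. This is a Python illustration only, not CuiTools code (CuiTools is Perl); the function names are hypothetical, and it assumes the class attribute is named Sense, as in the ARFF excerpt quoted in this thread.

```python
import warnings


def count_feature_attributes(arff_text, class_name="Sense"):
    """Count @ATTRIBUTE declarations other than the class attribute.

    Hypothetical helper: assumes the class attribute is called `class_name`.
    """
    count = 0
    for line in arff_text.splitlines():
        stripped = line.strip()
        upper = stripped.upper()
        if upper.startswith("@DATA"):
            break  # the header ends where the data section begins
        if upper.startswith("@ATTRIBUTE"):
            parts = stripped.split(None, 2)
            if len(parts) >= 2 and parts[1] != class_name:
                count += 1
    return count


def warn_if_no_features(arff_text, class_name="Sense"):
    """Warn when the ARFF header declares no features besides the class."""
    n = count_feature_attributes(arff_text, class_name)
    if n == 0:
        warnings.warn(
            "no feature attributes found; the classifier can only "
            "fall back to the majority sense"
        )
    return n


# The feature-less ARFF header reported in this thread.
EMPTY_ARFF = """@RELATION pressure
@ATTRIBUTE Sense {M1,M2,M3,None}
@DATA
M1 % 97403834
M1 % 98281278
M1 % 98124304
"""

if __name__ == "__main__":
    warn_if_no_features(EMPTY_ARFF)
```

Whether this should be a warning or a hard error is the open question Ted raises; the sketch only warns, so a run would still finish with a majority classifier.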
From: Ted P. <tpederse@d.umn.edu> - 2008-07-23 19:35:34
|
Thanks Bridget, This all sounds good! Ted On Wed, Jul 23, 2008 at 12:15 PM, Bridget Thomson McInnes <bth...@cs...> wrote: > Hi Ted, > >> Do we have any way to convert the text to lower case? If not that's >> fairly easy to do, but couldn't remember if that was something we had >> included or not... >> > > We do. I am sorry it didn't get into the supervised-disambiguate > documentation. It is the --lc option. > >> Also, in the help for supervised-disambiguate.pl - there are two >> entries for --mesh, there isn't a space under the --wekacv entry, >> and a few of the left-hand margins are slightly off (some indented >> once, others not at all, something like that...) The entry for >> --cv has a kind of funny line break in it...--nokey seems to have a >> repeated word (This). Very, very minor cosmetic issues. >> > > Thanks! I made the changes to these in the documentation. > >> Slightly (but just barely) more substantial issues in the --help are >> that I don't think it makes it clear that SOURCE can either be a >> directory containing lots of .mm files or a single .mm file...and it >> also says that at least one feature must be selected - however, >> it will run ok with just the SOURCE indicated, so we should probably >> make it clear what will happen if you just give it SOURCE. >> --version shows copyright as 2007 but we probably want that to be >> 2007-2008 at this point, and you should list yourself as >> first author on copyright and other stuff... >> > > I fixed the copyright in the version number. > > And I added the following to the perldoc and help: > > If no feature option is chosen, the default feature setting > is used: > > --ngramcount "--ngram 1 --frequency 2" > > in order to be clear that no feature option is required to run the > supervised-disambiguate.pl program. > > I also changed the SOURCE description to: > > A directory containing the CuiTools xml-like .mm formatted > training file(s) or a single file in the CuiTools xml-like > .mm format.
> > I modified the help and the perldoc description to be more > clear that SOURCE can be either a directory containing the > .mm files or a single .mm file. > > Here is what I wrote: > > This is a wrapper program for supervised WSD using CuiTools > programs. It takes as input (SOURCE) a directory containing > files in the CuiTools xml-like .mm format or a single file > in the xml-like .mm format. The program extracts specified > features and trains/tests a classifier using the WEKA data > mining package. The overall results are stored in a file > called overallResults located in the results directory. If > no feature option is chosen, the default feature setting is > used: --ngramcount "--ngram 1 --frequency 2" > > I hope this makes the documentation clearer. > > Thanks! > > Bridget > -- Ted Pedersen http://www.d.umn.edu/~tpederse |
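As a rough illustration of what the default setting --ngramcount "--ngram 1 --frequency 2" and the --lc option discussed above might amount to, here is a small Python sketch. It is not CuiTools code. It assumes that --frequency 2 means "keep only n-grams seen at least twice" and that --lc lowercases the text before counting; neither reading is confirmed in this thread, and the function name and tokenization are hypothetical.

```python
import re
from collections import Counter


def unigram_features(text, min_frequency=2, lowercase=False):
    """Return unigrams occurring at least `min_frequency` times.

    Assumption: this mirrors an "--ngram 1 --frequency 2" style cutoff;
    tokens are runs of word characters, which is a simplification.
    """
    if lowercase:
        text = text.lower()  # stand-in for the --lc option
    counts = Counter(re.findall(r"\w+", text))
    return {token: n for token, n in counts.items() if n >= min_frequency}


if __name__ == "__main__":
    sample = "Blood pressure and blood flow"
    print(unigram_features(sample))                  # case-sensitive counts
    print(unigram_features(sample, lowercase=True))  # counts after lowercasing
```

Lowercasing can change which unigrams survive the cutoff, since "Blood" and "blood" are otherwise counted as different tokens.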
From: Bridget T. M. <bth...@cs...> - 2008-07-23 19:34:04
|
Hi Ted, I am not certain. I am going to redownload the package, do a clean install and try it again. Hopefully I will be able to recreate it. Thanks! Bridget On Wed, 23 Jul 2008, Ted Pedersen wrote: > Hi Bridget, > > Thanks for this script - I ran it and it seems to give me back a fair > number of results...so I seem to be able to access PubMed ok... > > marimba(4): perl get-mesh.pl 9337195 > 9337195 : *Birth Weight,Blood Glucose/metabolism,Blood > Pressure,Brachial Artery/anatomy & > histology/*physiology/ultrasonography,Cardiovascular > Diseases/*epidemiology,Child,Cholesterol, > LDL/blood,England,Female,Humans,Lipids/blood,Muscle, Smooth, > Vascular/anatomy & > histology/physiology/ultrasonography,Parity,Pregnancy,Regional Blood > Flow,Risk Factors,Socioeconomic Factors,Tobacco Smoke > Pollution,*Vasodilation > > What could I check next? > > Thanks, > Ted > > -- > Ted Pedersen > http://www.d.umn.edu/~tpederse > |
From: Ted P. <tpederse@d.umn.edu> - 2008-07-23 19:32:31
|
Hi Bridget, Thanks for this script - I ran it and it seems to give me back a fair number of results...so I seem to be able to access PubMed ok... marimba(4): perl get-mesh.pl 9337195 9337195 : *Birth Weight,Blood Glucose/metabolism,Blood Pressure,Brachial Artery/anatomy & histology/*physiology/ultrasonography,Cardiovascular Diseases/*epidemiology,Child,Cholesterol, LDL/blood,England,Female,Humans,Lipids/blood,Muscle, Smooth, Vascular/anatomy & histology/physiology/ultrasonography,Parity,Pregnancy,Regional Blood Flow,Risk Factors,Socioeconomic Factors,Tobacco Smoke Pollution,*Vasodilation What could I check next? Thanks, Ted On Wed, Jul 23, 2008 at 2:17 PM, Bridget Thomson McInnes <bth...@cs...> wrote: > Hi Ted > > I attached a test script to check. It is called : get-msh.pl > > Here is an example run: > > bthomson@caesar (~) % perl get-msh.pl 9337195 > 9337195 : *Birth Weight,Blood Glucose/metabolism,Blood Pressure,Brachial > Artery/anatomy & histology/*physiology/ultrasonography,Cardiovascular > Diseases/*epidemiology,Child,Cholesterol, > LDL/blood,England,Female,Humans,Lipids/blood,Muscle, Smooth, > Vascular/anatomy & > histology/physiology/ultrasonography,Parity,Pregnancy,Regional Blood > Flow,Risk Factors,Socioeconomic Factors,Tobacco Smoke > Pollution,*Vasodilation > > > This is the same result that I get on my computer at school and here at > work. > > Thanks! > > Bridget > -- Ted Pedersen http://www.d.umn.edu/~tpederse |
From: Bridget T. M. <bth...@cs...> - 2008-07-23 19:17:31
|
Hi Ted, I attached a test script to check whether PubMed is returning MeSH terms for you. It is called get-msh.pl. Here is an example run: bthomson@caesar (~) % perl get-msh.pl 9337195 9337195 : *Birth Weight,Blood Glucose/metabolism,Blood Pressure,Brachial Artery/anatomy & histology/*physiology/ultrasonography,Cardiovascular Diseases/*epidemiology,Child,Cholesterol, LDL/blood,England,Female,Humans,Lipids/blood,Muscle, Smooth, Vascular/anatomy & histology/physiology/ultrasonography,Parity,Pregnancy,Regional Blood Flow,Risk Factors,Socioeconomic Factors,Tobacco Smoke Pollution,*Vasodilation This is the same result that I get on my computer at school and here at work. Thanks! Bridget On Wed, 23 Jul 2008, Ted Pedersen wrote: > Hi Bridget, > > TDP.mm is just a copy of McInnesPC07.dir.mm from the Demos directory. > Did you mean > that, or the actual output? > > I do have an internet connection so I don't think that's the problem. > How would I know if > PubMed cut me off? > > Thanks! > Ted > > On Wed, Jul 23, 2008 at 12:40 PM, Bridget Thomson McInnes > <bth...@cs...> wrote: > > Hi Ted, > > > > I am not certain why this is happening. I don't have this problem. The > > MeSH terms are obtained using the PubMed API. I can see two potential > > problems: > > 1. No internet connection > > - which I will put in the documentation! > > > > 2. Do you think PubMed cut you off? They have done that to me > > before. They just start rejecting my queries if they > > think I have been using it too much. I have not quite > > determined what too much is yet. I will write a > > check in the program to make certain that something > > is coming back and if not error out. > > > > Otherwise I can't think of what it is. It isn't like using the UMLSKS API > > where the IP address needs to be registered. I have only tested this > > inside NLM - my connection at the apartment goes in and out so I haven't > > been able to test this on my laptop. > > > > Could you send me your TDP.mm file? > > > > Thanks!
> > > > Bridget > > > > > > On Wed, 23 Jul 2008, Ted Pedersen wrote: > > > >> Hi Bridget, > >> > >> When using the Mesh option (--mesh) by itself, I seem to never get any > >> features... > >> > >> My arff files all look something like this... > >> > >> @RELATION pressure > >> @ATTRIBUTE Sense {M1,M2,M3,None} > >> @DATA > >> M1 % 97403834 > >> M1 % 98281278 > >> M1 % 98124304 > >> > >> And so I end up getting a majority classifier... > >> > >> Is there something I am supposed to be doing to get the mesh features? > >> I am just running like this... > >> > >> supervised-disambiguate.pl TDP.mm --mesh > >> > >> where TDP.mm is a directory with all the NLM-WSD data in .mm format > >> (one file per word). All the files > >> seem to be getting processed, and no errors are shown, but the results > >> are pretty much just a majority > >> classifier (due to lack of features...) > >> > >> Any idea on this? > >> > >> Thanks! > >> Ted > >> > >> -- > >> Ted Pedersen > >> http://www.d.umn.edu/~tpederse > >> > > > > > > -- > Ted Pedersen > http://www.d.umn.edu/~tpederse > |
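The check Bridget mentions in the quoted message above ("make certain that something is coming back and if not error out") might look something like the following Python sketch. It is not the get-msh.pl script, whose contents are not shown in this thread. It assumes the response is PubMed XML containing MeshHeading elements with DescriptorName and QualifierName children; the function name and the hand-written sample record are illustrative only.

```python
import xml.etree.ElementTree as ET


def mesh_terms(pubmed_xml):
    """Extract MeSH headings from PubMed XML, failing loudly if none come back.

    Major topics are marked with a leading '*', and qualifiers are joined
    to their descriptor with '/', similar to the output shown in this thread.
    """
    root = ET.fromstring(pubmed_xml)
    terms = []
    for heading in root.iter("MeshHeading"):
        parts = []
        for node in heading:
            if node.tag in ("DescriptorName", "QualifierName"):
                star = "*" if node.get("MajorTopicYN") == "Y" else ""
                parts.append(star + (node.text or ""))
        if parts:
            terms.append("/".join(parts))
    if not terms:
        # Nothing came back: stop instead of silently producing no features.
        raise ValueError("no MeSH headings found in the PubMed response")
    return terms


# Hand-written fragment for illustration; a real record carries more fields.
SAMPLE = """<PubmedArticleSet><PubmedArticle><MedlineCitation>
<PMID>9337195</PMID>
<MeshHeadingList>
<MeshHeading><DescriptorName MajorTopicYN="Y">Birth Weight</DescriptorName></MeshHeading>
<MeshHeading><DescriptorName MajorTopicYN="N">Blood Glucose</DescriptorName>
<QualifierName MajorTopicYN="N">metabolism</QualifierName></MeshHeading>
</MeshHeadingList>
</MedlineCitation></PubmedArticle></PubmedArticleSet>"""

if __name__ == "__main__":
    print(",".join(mesh_terms(SAMPLE)))
```

Raising an error here, rather than returning an empty list, is one way to avoid the silent majority-classifier outcome described earlier in the thread.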