Re: [Senserelate-developers] test cases and testing

I've added a "skip all" condition to wsd.t that seems to be working
reasonably well when the WordNet version is not 2.0, 2.1, or 3.0. Our
test output looks like this in that case...

marimba(111): make test
PERL_DL_NONLAZY=1 /usr/local/bin/perl "-MExtUtils::Command::MM" "-e"
"test_harness(0, 'blib/lib', 'blib/arch')" t/*.t
t/WordNet-SenseRelate-AllWords....ok 5/35# WordNet hash :
eOS9lXC6GvMWznF1wkZofDdtbBU
t/WordNet-SenseRelate-AllWords....ok 6/35# WordNet path :
/usr/local/WordNet-3.0/dict
t/WordNet-SenseRelate-AllWords....ok
t/wsd.............................skipped
        all skipped: WordNet version is not 2.0 2.1 3.0 -> skip tests
All tests successful, 1 test skipped.
Files=2, Tests=35, 24 wallclock secs (23.28 cusr +  0.74 csys = 24.02 CPU)

This is the gist of the code that handles that...

use constant WNver20 => 'US9EUGPpJj2jVr+fRrZqQX6vcGs';
use constant WNver21 => 'LL1BZMsWkr0YOuiewfbiL656+Q4';
use constant WNver30 => 'eOS9lXC6GvMWznF1wkZofDdtbBU';

# find out what version of wordnet we are using and print that
# the hashcode will tell us the version

use WordNet::SenseRelate::AllWords;
use WordNet::QueryData;
use WordNet::Tools;

my $qd = WordNet::QueryData->new;
my $wntools = WordNet::Tools->new($qd);
my $wnHashCode = $wntools->hashCode();

# skip all these tests if WordNet version is not 2.0, 2.1, or 3.0

use Test::More;
if ( !($wnHashCode eq WNver20) &&
     !($wnHashCode eq WNver21) &&
     !($wnHashCode eq WNver30)) {
        plan skip_all => 'WordNet version is not 2.0 2.1 3.0 -> skip tests';
     }
else {
        plan tests => 7;
     }

The only drawback to skip_all is that you can't have any tests execute
before it, so I had to remove the BEGIN (use_ok ...) tests for loading
up the modules, and then remove the corresponding END statements (as
those caused a problem when the test was skipped).
But compared to having problems because someone is using 1.6, this
seems like a small price to pay...
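
One possible way to keep the module-loading checks, sketched here only as
an idea: decide on the plan first, then call use_ok at run time after the
plan instead of inside a BEGIN block. The helper name plan_args_for and
the count of 9 tests (the 7 existing tests plus two use_ok calls) are
assumptions for illustration, not what wsd.t currently does.

```perl
use strict;
use warnings;

# Hash codes for WordNet 2.0, 2.1 and 3.0, as quoted above.
my %known_hash = map { $_ => 1 } (
    'US9EUGPpJj2jVr+fRrZqQX6vcGs',
    'LL1BZMsWkr0YOuiewfbiL656+Q4',
    'eOS9lXC6GvMWznF1wkZofDdtbBU',
);

# Hypothetical helper (not part of wsd.t): returns the argument list
# to hand to Test::More's plan() for a given hash code.
sub plan_args_for {
    my ($hash_code) = @_;
    return ( tests => 9 )
        if defined $hash_code && $known_hash{$hash_code};
    return ( skip_all => 'WordNet version is not 2.0 2.1 3.0 -> skip tests' );
}

# Intended use in a test file (left as comments so that this sketch
# runs without WordNet installed):
#
#   use Test::More;
#   plan( plan_args_for( $wntools->hashCode() ) );
#   use_ok('WordNet::SenseRelate::AllWords');  # reached only if not skipped
#   use_ok('WordNet::QueryData');
```

Since plan skip_all exits the test script right away, the use_ok calls
would never run on an unknown WordNet, so they should not trip over
the skip.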

Now, in wsd.t we don't have any differences due to version changes
between 2.0, 2.1, and 3.0, but we do in WordNet-SenseRelate-AllWords.t,
and those can be handled with a check for the version and then
changing the expected values based on that. So two levels of checking
I guess...just skip everything if the version is not one of the ones
we know about, and then change expected values as needed. It occurs to
me that this skip_all will help too if we encounter different hash
values because of Windows or other variations like that, so that will
actually help avoid failure in those cases and might make the nature
of the problem more clear...
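
A rough sketch of the two levels, assuming a simple lookup from hash
code to version and from version to expected value. The helper name
expected_for_hash is made up for this example, and the expected strings
are placeholders, not real disambiguation output.

```perl
use strict;
use warnings;

# Level 1: hash codes quoted earlier in this message, mapped to versions.
my %wn_version = (
    'US9EUGPpJj2jVr+fRrZqQX6vcGs' => '2.0',
    'LL1BZMsWkr0YOuiewfbiL656+Q4' => '2.1',
    'eOS9lXC6GvMWznF1wkZofDdtbBU' => '3.0',
);

# Level 2: expected value per version. These are placeholder strings
# only; the real expected values would come from the .t file.
my %expected = (
    '2.0' => 'PLACEHOLDER expected output for 2.0',
    '2.1' => 'PLACEHOLDER expected output for 2.1',
    '3.0' => 'PLACEHOLDER expected output for 3.0',
);

# Hypothetical helper: returns the expected value for a known WordNet,
# or nothing (undef in scalar context) for an unknown hash code, which
# the caller would treat as "skip everything".
sub expected_for_hash {
    my ($hash_code) = @_;
    my $version = $wn_version{$hash_code};
    return unless defined $version;
    return $expected{$version};
}
```

An unknown hash code (say, from a Windows build of WordNet) comes back
undefined here, which is the case where the skip_all would kick in.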

So, I'm still monkeying around with this, but this is more or less the idea...

Thanks,
Ted

On Wed, Apr 9, 2008 at 9:24 PM, Ted Pedersen <dul...@gm...> wrote:
> Hi Sid,
>
>  Actually the disturbing thing about the errors we see with measures other than
>  lesk is that the results disambiguate gives are of the form "cat#n", without a
>  sense number! So it's not just that the sense has shifted, it's that there is no
>  sense indicated at all! I think Varada noticed this for some things she was
>  doing with jcn as well...so, it's kind of a puzzle....
>
>  I'm working on that idea of  'todo' test now - I think it's kind of
>  neat....serves as a
>  good reminder I guess of errors, and then as they get fixed you have a built
>  in test case ready and waiting for it....I'll see how that works out...
>
>  Thanks!
>  Ted
>
>
>
>  On Wed, Apr 9, 2008 at 9:11 PM, Siddharth Patwardhan <si...@cs...> wrote:
>  > > Argh! Just realized that SenseRelate-AllWords was using QueryData's
>  > > now deprecated version() method! So, I need to change that. We used
>  > > that in my tests in 0.08 that were attempting to get the version info
>  > > from the user, but since that isn't always reliable now, I think the
>  > > test didn't perform as expected!
>  > >
>  > > In any case, I think that's at the root of these puzzling bug reports
>  > > for 0.08 (the case of fire#n#2 - the version thing wasn't working as
>  > > expected, and I think users got the wrong expected values in some
>  > > cases...)
>  >
>  > Right. Since, WordNet::Similarity is already a dependency, you can use
>  > the hashCode method of WordNet::Tools in it.
>  >
>  > > Also, I tried changing some of the measures in the .t tests to ones
>  > > other than lesk, specifically wup, path, and lch, and there were tons
>  > > of errors. :) So, I think we want to look at that. Those actually
>  > > might be a good way to try out this "todo" type of test in
>  > > Test::More, where you expect a test to fail so you can work on it in
>  > > future. I'll see if I can get that figured out.
>  >
>  > Hmmm... I think the failures should be expected. I think the test is
>  > basically running AllWords with the lesk measure, and then comparing the
>  > output to some expected output (hard-coded in the test). So, most likely
>  > this output is different with different measures.
>  >
>  > But I think it would be good to add tests for each of the measures doing
>  > this.
>  >
>  > -- Sid.
>  >
>  >
>
>
>
>  --
>
>
> Ted Pedersen
>  http://www.d.umn.edu/~tpederse
>

-- 
Ted Pedersen
http://www.d.umn.edu/~tpederse