Re: [Htmlparser-developer] HttpUnit etc. was (Re: Table Scanner )
From: Somik R. <so...@ya...> - 2002-12-21 08:47:08
Hi Sam,

> It seems that really we want something like:
>
> 1. Specify use case, and data-in/data-out
> 2. Automatically generate test code and shell implementation code
> 3. Run test code to fail
> 4. Fill in implementation details ...etc

I like your idea. You can check this week's release: you'll find searching support for form tags, which allows you to pick up input tags, textarea tags, select tags, ...

By the way, I'd written earlier about this: what is your opinion on using Bayesian networks to have a rule-based learning system that gets better over time? Right now the tag identification mechanism is linear, and there is only so far that can go. But with the sort of dirty HTML we get, the system has to be self-learning. I am thinking of an approach where we'd try to replace a lot of the hard-coded rules with a learning network. Of course, we'd have our tests to verify that we haven't broken anything, and from there it should only get better. It would be great to have your insight on this.

> p.s. I'm impressed by the frequency with which you are releasing
> htmlParser, and your process of having multiple candidates etc. I
> struggle to release often as the release process itself still seems a
> little cumbersome (sourceforge has got better) .... have you any tips
> for streamlining it ....? I guess what I really need is ant methods like
>
> ant release-bug-fix version
> ant create-new-version-release
> ant create-new-candidate-release
>
> which handle all the necessary communication with sourceforge,
> uploading, packaging and handling of release numbers ....

Ha ha! I am not sure if you'll believe this, but I was inspired to structure the htmlparser project based on the neurogrid project; you had ant scripts long before we did. Of course, ant scripts are important for doing the job automatically, but I like keeping things simple, in the sense that there is no separate bug-fix version, only the next integration release (Candidate).

I am not yet a fan of branches; they're OK if they don't live more than two weeks (I've been thinking real hard about it for a while). I'm planning to get the production release out this week, so we can all move on to 1.3 (instead of having two versions, we'll live with 1.3 integration releases). I'd hate to make the same bug fixes twice.

Regards,
Somik