From: Fidelis A. <fi...@po...> - 2004-11-15 20:38:22
Quoting Christian Siefkes <si...@mi...>:

> Fidelis,
>
> Fidelis Assis wrote:
> > Here our results, again, differ a lot too. I have simulated TUNE, deleting
> > all css files before starting and setting :preseve_css: to /1/ in toer.crm.
> > The results were very good. See the evolution in these reports:
> >
> > - First shuffle:
>
> I'm a bit confused about your test setup. Did you disable training for the
> test set (last 500 mails)? That's necessary for TUNE, since it doesn't make
> much sense to test a classifier on mails it has already trained on.

No, I didn't disable it. I understand what you mean, but I only did a TUNE
"simulation", using the :preserve: option as a means to run a quick test.
Although Pavel didn't get good results with TUNE, I had no reason to doubt
that strict TUNE, as you describe it, would work fine, so I did a quick and
dirty test.

> > - Second shuffle, starting with the css files trained in the
> > previous run:
>
> Actually, that's still the same shuffle, isn't it?

No, a different shuffle was used in every pass, just like in
TOER+Reinforcement. The only difference is that the css files weren't deleted
after each pass. The idea was to stress the accumulation of knowledge and to
test the quality of the pruning method, which could have some impact on
methods like TUNE. The test was mainly to see whether stuffing in many
features, in a more stressful way than TUNE does, would do any harm.

> Usually, TUNE training means to get more information from your training set
> (first 3647 mails) by TOE/reinforcement-training the mails in the set for
> several times. Either a fixed number of iterations (e.g. 3), or until there
> are no errors on the training set ("train-until-no-errors", but that will
> often lead to overtraining), or using some other stopping criterion.
>
> Then you look at the so-far-unseen test set and see the results you got. And
> after that, of course, you delete your CSS files and move on to the second
> shuffle, etc.

Yes, it wasn't a strict TUNE, just a quick test to get a feel for what TUNE
results would look like. I don't think that strict TUNE results would be much
different anyway. BTW, the method I used is what I do when preparing css files
for production.

--
Fidelis Assis
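
For reference, the strict TUNE procedure Christian describes above could be sketched roughly as below. This is only an illustration under stated assumptions: `TokenCountClassifier`, `tune` and `evaluate` are hypothetical names, the toy classifier is a stand-in and not CRM114, and only plain train-on-error is shown (the reinforcement step is left out). The stopping rule is either a fixed number of passes or a pass with no errors, whichever comes first.

```python
from collections import Counter


class TokenCountClassifier:
    """Toy stand-in for a real filter (hypothetical, not CRM114)."""

    def __init__(self):
        self.counts = {"spam": Counter(), "good": Counter()}

    def classify(self, text):
        tokens = text.lower().split()
        spam = sum(self.counts["spam"][t] for t in tokens)
        good = sum(self.counts["good"][t] for t in tokens)
        # Ties (including unseen tokens only) fall back to "good".
        return "spam" if spam > good else "good"

    def train(self, text, label):
        self.counts[label].update(text.lower().split())


def tune(classifier, training_set, max_passes=3):
    """Repeat train-on-error passes over the training set.

    Stops after max_passes, or earlier if a pass produces no errors.
    Returns the number of errors seen in each pass.
    """
    errors_per_pass = []
    for _ in range(max_passes):
        errors = 0
        for text, label in training_set:
            if classifier.classify(text) != label:
                classifier.train(text, label)  # train only on errors (TOE)
                errors += 1
        errors_per_pass.append(errors)
        if errors == 0:
            break
    return errors_per_pass


def evaluate(classifier, test_set):
    """Count errors on the held-out test set; no training happens here."""
    return sum(1 for text, label in test_set
               if classifier.classify(text) != label)
```

In a full run one would create a fresh classifier for each shuffle (the equivalent of deleting the css files), call `tune` on the training part, and then call `evaluate` on the test part that was never trained on.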