Re: [Dclib-devel] pegasos in dlib
From: Davis K. <dav...@us...> - 2009-07-24 14:20:36
My first guess would be that you are limiting it to too few support vectors. Underneath the svm_pegasos algorithm in dlib is an object (the kcentroid) that basically just keeps a linearly independent set of sample vectors. If you tell svm_pegasos to use only 20 "support vectors" then the kcentroid internally tries to find a set of 20 sample vectors that best span the vector space induced by the kernel you use. The resulting decision function is a linear combination of these vectors, so if, as you noted, it picks bad ones then you most likely won't get a good answer. But if you picked ones that really did span the space covered by your samples then you could apply an SVM learning algorithm using just that set and there wouldn't be any local minima to get stuck in. The catch is you have to pick a nice spanning set somehow. This is what dlib::kcentroid tries to do, and svm_pegasos is built on top of it. So the success of the dlib pegasos implementation is predicated on the ability of the kcentroid to create a good spanning set.

So what should you do? I use svm_pegasos all the time and there are some things you have to watch for related to how it approximates the spanning set of vectors. The first is that if you set the tolerance too high then it might not use enough support vectors, even if you set the limit to 100000. For example, the other day I was using it and it tried to use just 1 support vector. Obviously no good. So if the number of SVs you get out of the algorithm is less than whatever you set as the max allowable number (via set_max_num_sv()) then that is because the tolerance value is too large. The default is 0.01, but 0.001 is sometimes more appropriate. I should probably make that the default.
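To make the tolerance/spanning-set interaction concrete, here is a rough stdlib-only sketch of the general idea (keep a sample only if its projection error onto the span of the current dictionary, in kernel feature space, exceeds the tolerance). This is an illustration of the technique, not dlib's actual kcentroid code, and the gaussian kernel with gamma = 1 is just an assumption for the demo:

```cpp
#include <cmath>
#include <vector>

// Sketch of a kcentroid-style dictionary: a new sample is added only when
// phi(x) is not well approximated by the span of the current dictionary
// vectors in the kernel-induced feature space.  A larger tolerance means
// fewer vectors survive -- which is how a too-large tolerance starves
// svm_pegasos of support vectors.

using Vec = std::vector<double>;

double gaussian_kernel(const Vec& a, const Vec& b, double gamma = 1.0) {
    double d2 = 0;
    for (size_t i = 0; i < a.size(); ++i) d2 += (a[i] - b[i]) * (a[i] - b[i]);
    return std::exp(-gamma * d2);
}

// Solve K x = k by Gaussian elimination with partial pivoting
// (fine for the tiny systems in this demo).
Vec solve(std::vector<Vec> K, Vec k) {
    const size_t n = k.size();
    for (size_t c = 0; c < n; ++c) {
        size_t p = c;
        for (size_t r = c + 1; r < n; ++r)
            if (std::fabs(K[r][c]) > std::fabs(K[p][c])) p = r;
        std::swap(K[p], K[c]); std::swap(k[p], k[c]);
        for (size_t r = c + 1; r < n; ++r) {
            double f = K[r][c] / K[c][c];
            for (size_t j = c; j < n; ++j) K[r][j] -= f * K[c][j];
            k[r] -= f * k[c];
        }
    }
    Vec x(n);
    for (size_t i = n; i-- > 0; ) {
        double s = k[i];
        for (size_t j = i + 1; j < n; ++j) s -= K[i][j] * x[j];
        x[i] = s / K[i][i];
    }
    return x;
}

struct Dictionary {
    std::vector<Vec> dict;
    double tolerance;

    explicit Dictionary(double tol) : tolerance(tol) {}

    // Returns true if x was kept as a new spanning ("support") vector.
    bool add(const Vec& x) {
        if (dict.empty()) { dict.push_back(x); return true; }
        const size_t n = dict.size();
        std::vector<Vec> K(n, Vec(n));
        Vec kx(n);
        for (size_t i = 0; i < n; ++i) {
            kx[i] = gaussian_kernel(dict[i], x);
            for (size_t j = 0; j < n; ++j)
                K[i][j] = gaussian_kernel(dict[i], dict[j]);
        }
        Vec a = solve(K, kx);
        // squared projection error of phi(x) onto span(dict)
        double delta = gaussian_kernel(x, x);
        for (size_t i = 0; i < n; ++i) delta -= a[i] * kx[i];
        if (delta > tolerance) { dict.push_back(x); return true; }
        return false;
    }
};
```

With a small tolerance like 0.001 almost every genuinely new direction gets kept; crank the tolerance up and nearby samples get rejected, which is the "only 1 support vector" failure mode described above.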
As an aside, the reason I don't dispense with the tolerance (essentially setting it to 0) and let the cap on support vectors be the only thing that determines how many you end up with is that using a small tolerance and an SV cap together tends to make the algorithm a whole lot faster. The only problem is when the tolerance is too big; then you run into trouble. But this is easy to spot. Similarly, if you set the SV limit too low to accurately represent the span of all the samples (in the feature space induced by your kernel) then the result will probably be chaotic and bad. So try using more. The only times I have seen svm_pegasos get weird are when I was using too few SVs. The more you use the slower it gets, but it also gets more accurate. Usually, at least with the datasets I typically encounter, I get by just fine with under about 150 support vectors. Often under half that is fine. But it all depends on your dataset. For example, if you had a 2D dataset that was a checkerboard and you wanted to separate the white squares from the black squares you would need, at the very minimum (assuming you used a gaussian kernel), a number of support vectors equal to the number of squares on your board.

My second guess is that you aren't letting the algorithm run long enough. Are you using the batch() or verbose_batch() training adapters (like in this example: http://dclib.sourceforge.net/svm_pegasos_ex.cpp.html)? I would let the algorithm run until the learning rate has dropped below something like 0.1. If you stop the algorithm before convergence then it definitely won't work well. Note that the time it takes to converge is also a function of what you set as the value of lambda (I typically use lambda values between 1e-4 and 1e-7, but YMMV). What kernel are you using?

Another thing to think about is trying to remove some features. Generally, the more features you have, the more vectors it takes to span the space inhabited by your samples.
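For intuition about the learning-rate and lambda relationship, here is a minimal sketch of the plain linear Pegasos update (dlib's svm_pegasos is the kernelized version built on the kcentroid; this is only an illustration of the update rule, with a toy cyclic sample schedule instead of the usual random draws):

```cpp
#include <cstddef>
#include <vector>

// Minimal sketch of the linear Pegasos subgradient step.  The learning rate
// is eta = 1/(lambda * t), so it decays over time; with lambda = 1e-4 it
// only drops below 0.1 after t = 100,000 steps, which is why small lambda
// values need many more iterations before stopping.

using Vec = std::vector<double>;

double dot(const Vec& a, const Vec& b) {
    double s = 0;
    for (size_t i = 0; i < a.size(); ++i) s += a[i] * b[i];
    return s;
}

// Train on samples x[i] with labels y[i] in {-1, +1}.
Vec pegasos_train(const std::vector<Vec>& x, const std::vector<double>& y,
                  double lambda, size_t iterations) {
    Vec w(x[0].size(), 0.0);
    for (size_t t = 1; t <= iterations; ++t) {
        size_t i = t % x.size();            // toy schedule: cycle through samples
        double eta = 1.0 / (lambda * t);    // decaying learning rate
        double margin = y[i] * dot(w, x[i]);
        for (size_t j = 0; j < w.size(); ++j) {
            w[j] *= (1.0 - eta * lambda);   // regularization shrink (= 1 - 1/t)
            if (margin < 1.0)
                w[j] += eta * y[i] * x[i][j];  // hinge-loss subgradient step
        }
    }
    return w;
}
```

Stopping this loop while eta is still large leaves w dominated by the last few noisy steps, which is the "stopped before convergence" failure mode: run until the rate is down around 0.1 or less.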
There are a lot of feature selection algorithms out there. Currently there is only one in dlib (the rank_features() function); you might give it a go and see what it says.

Anyway, I'm carrying on... I would suggest setting the tolerance to 0.001 and trying maybe a max SV number of 100 and seeing what that gets you as a starting point. I would also make sure you have normalized your features if you haven't already. That usually helps a lot (you might use dlib::vector_normalizer).

Hope the above helps. Certainly let me know if it doesn't.

I'm just an engineer from Maryland. Not much to tell really :) I work in a group at Northrop Grumman that does a lot of image processing and computer vision kinds of things. They don't sponsor dlib or anything though. It is just a project I started back as an undergrad and work on for fun on my own time.

Cheers,
Davis

On Fri, Jul 24, 2009 at 8:35 AM, Alex Siegel <al...@st...> wrote:
> I think I understand the pegasos API pretty well, and I've read the
> relevant research papers. One aspect that has been a source of frustration
> is the lack of predictability in the quality of the output decision
> function. As I'm searching the parameter space for optimal values, I'm
> finding that very small changes produce large, random variations in the
> classification percentage. Adding or removing features has a similar effect.
> It seems like pegasos is too easily trapped in local minima during its
> optimization, and it's making poor decisions about what support vectors to
> start with. The data I'm working with is extremely noisy, but there is a lot
> of it (65,000 samples with 50 features). I'd appreciate some insight into
> how to improve performance.
>
> Frankly, I'm also a little curious about you as a person. Your developer
> page has no information on it. Where do you live? Where do you work?
>
> On Thu, Jul 23, 2009 at 11:42 AM, Davis King <dav...@us...> wrote:
>
>> Thanks, good to know you find it useful :) I'm not particular, you can
>> email me and the developer list, or post in the help or open discussion
>> forums on dlib's sourceforge page. Whatever floats your boat.
>>
>> So what's your question?
>>
>> Cheers,
>> Davis