Add a feature evaluation function to RankLib. While this is perhaps more easily done via a script from outside RankLib, it would be nice if RankLib actually offered an internal, automated way of determining the usefulness (or lack of usefulness) of specified features in producing a ranking model.
The function should be able to work with all the currently defined ranking models.
johywx@users.sf.net has suggested possible methods for determining feature usefulness:
Feature ablation: train/evaluate the model on the full set of features, then remove one of them, train/evaluate on the reduced set, put it back and remove another one, etc. You can remove either 1 feature at a time or a group of similar features to speed up the process at the cost of less precision. This basically gives an idea of how important each feature is within the set: if the score drops when a feature is removed, it is obviously important. This particularly helps in figuring out which features are useless (no change in the score) or redundant (another feature provides the same information).
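The ablation loop above can be sketched as follows. `train_and_evaluate` is a hypothetical placeholder standing in for an actual RankLib training/evaluation run (e.g. invoking its command line with a feature-subset file); the toy scoring inside it exists only so the sketch runs end to end.

```python
def train_and_evaluate(features):
    """Placeholder: train a ranker on `features`, return its test metric.
    Toy scoring so the sketch is runnable: pretend features 1 and 3 matter."""
    useful = {1: 0.30, 3: 0.15}
    return 0.40 + sum(gain for f, gain in useful.items() if f in features)

def ablation(all_features):
    """Leave-one-out ablation: score drop when a feature is removed."""
    baseline = train_and_evaluate(all_features)
    drops = {}
    for f in all_features:
        reduced = [x for x in all_features if x != f]
        drops[f] = baseline - train_and_evaluate(reduced)
    return baseline, drops  # large drop => important feature

baseline, drops = ablation([1, 2, 3, 4])
# drops[2] == 0.0, suggesting feature 2 is useless or redundant here
```

Grouping similar features (removing a whole group per iteration) only changes the loop variable, trading precision for far fewer training runs.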
Single feature test: this is just the other way around. Test the results with no re-ranking, then train the model with only one feature (or group of features, as above). This way, you can assess the individual contribution of each feature. Be careful though: a feature with a low contribution is not necessarily useless, because its association with other features may convey a lot of information. So this experiment allows you to detect good features but not bad ones.
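A minimal sketch of the single-feature test, using the same kind of hypothetical `train_and_evaluate` placeholder as before (not RankLib API; the toy scoring is only there to make the sketch executable):

```python
def train_and_evaluate(features):
    """Placeholder for a real RankLib run; toy scoring for illustration."""
    useful = {1: 0.30, 3: 0.15}
    return 0.40 + sum(gain for f, gain in useful.items() if f in features)

def single_feature_gains(all_features, no_rerank_score):
    """Train on each feature alone; report gain over the no-reranking baseline."""
    return {f: train_and_evaluate([f]) - no_rerank_score for f in all_features}

gains = single_feature_gains([1, 2, 3, 4], no_rerank_score=0.40)
# a positive gain flags an individually useful feature; a zero gain
# does NOT prove the feature is bad (it may help only in combination)
```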
Feature frequency: it still makes sense to study the frequency of the features in the model, although this is a bit more complex. I'm not an expert in LambdaMART so I didn't try this one and I was happy enough with 1 and 2. This would require understanding how the model works exactly (so you certainly need to dig in the code a bit) and making sure you're comparing comparable things.
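For tree-based rankers, a crude frequency count can be pulled from the saved model file without digging into the training code. This sketch assumes the serialized ensemble marks each split with a `<feature> N </feature>` element, which is how RankLib's MART/LambdaMART dumps look; verify against your model file before relying on it.

```python
import re
from collections import Counter

def feature_frequency(model_text):
    """Count how often each feature id is used as a split point
    in a serialized tree-ensemble model (assumed <feature> N </feature> markup)."""
    ids = re.findall(r"<feature>\s*(\d+)\s*</feature>", model_text)
    return Counter(int(i) for i in ids)

sample = """
<ensemble>
  <tree id="1" weight="0.1">
    <split><feature> 5 </feature><threshold> 0.3 </threshold></split>
    <split><feature> 2 </feature><threshold> 1.0 </threshold></split>
  </tree>
  <tree id="2" weight="0.1">
    <split><feature> 5 </feature><threshold> 0.7 </threshold></split>
  </tree>
</ensemble>
"""
print(feature_frequency(sample))  # Counter({5: 2, 2: 1})
```

As the comment above warns, split frequency is only a rough proxy: a feature used in many shallow splits is not necessarily more useful than one used in a few decisive ones.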
Other packages provide some information as well. GBM in R for instance gives the relative contribution of features (or something like that) as well as other interesting plots. Currently we're working with both RankLib and GBM in our experiments. One is more production-oriented, the other is more analysis-oriented (although I'm sure you can do many things with RankLib, it's just not out of the box).
Whatever solution you pick, it takes time. Because you generally have lots of features, 1 and 2 require repeating the process many times. 3 requires good knowledge of the model or a deep look into the source files. 4 is tedious because it's R, and GBM isn't on CRAN, so you pretty much have to install it manually and then learn how to use it.
FeatureManager can produce simple feature distribution statistics for models that restrict feature use (e.g. tree models), but it is not applicable to models that make use of all defined features (as it is not known how feature weights might correspond to feature usefulness).
See the feature_stats parameter in the FeatureManager.