How about nominal features?

RankLib
Zuguang He
2017-04-19
  • Zuguang He

    Zuguang He - 2017-04-19

    Hello!
    I am using RankLib to solve a simple ranking problem, but it is not an IR problem.
    I have some nominal features, for example temperature: {hot, mild, cool} and weather: {sunny, overcast, rainy}.
    How should I deal with nominal features?
    Can I simply encode them as numbers, e.g. hot=0, mild=1, cool=2?
    Thank you very much!

     
  • Lemur Project

    Lemur Project - 2017-04-19

    Conventional wisdom says you shouldn't convert nominal feature data into numeric feature data, because the arithmetic the ranking algorithms perform on those values doesn't have any real meaning.

    For some nominal data such as your examples, though, there does seem to be a gradient from cold to hot and from rainy to sunny, so a straight numerical substitution may not be meaningless. In that case the numeric values carry the same implied order as the nominal ones.

    You might want to normalize the converted values by the maximum of the nominal range, so that they fall between 0 and 1 (e.g. cold = 0/3, cool = 1/3, mild = 2/3 and hot = 3/3).
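
    As a rough illustration of that normalization, here is a minimal Python sketch. The function name ordinal_scale and the list TEMPERATURE_ORDER are hypothetical, and the order of the levels is an assumption you would have to choose for your own data.

```python
# Hypothetical ordering of the levels, lowest to highest (an assumption).
TEMPERATURE_ORDER = ["cold", "cool", "mild", "hot"]


def ordinal_scale(value, ordered_levels):
    """Map an ordered category to [0, 1] by its position in ordered_levels."""
    index = ordered_levels.index(value)  # raises ValueError for unknown values
    if len(ordered_levels) == 1:
        return 0.0  # a single level carries no ordering information
    return index / (len(ordered_levels) - 1)


# cold -> 0/3, cool -> 1/3, mild -> 2/3, hot -> 3/3
scaled = {level: ordinal_scale(level, TEMPERATURE_ORDER) for level in TEMPERATURE_ORDER}
print(scaled)
```

    The resulting numbers can then be written out as ordinary feature values in the RankLib input file.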

    You may also want to check out pandas and scikit-learn. They are Python libraries, but they do have some capabilities for converting nominal data to numeric values. I don't know much about how they do it, but they are fairly widely used. You might use them to develop numeric feature values from the nominal values for input into RankLib.
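
    One conversion those libraries offer for features with no natural order is one-hot (dummy) encoding, for example pandas.get_dummies or scikit-learn's OneHotEncoder. The plain-Python sketch below only illustrates the idea and is not what either library does internally. The helper names one_hot and to_ranklib_line and the sample rows are made up, and the output assumes the LETOR-style "label qid:id feature:value" lines that RankLib reads.

```python
WEATHER_LEVELS = ["sunny", "overcast", "rainy"]  # levels taken from the question


def one_hot(value, levels):
    """Return one 0/1 indicator per level; exactly one of them is 1."""
    if value not in levels:
        raise ValueError(f"unknown level: {value!r}")
    return [1 if value == level else 0 for level in levels]


def to_ranklib_line(label, qid, features):
    """Format a row as '<label> qid:<qid> 1:<v1> 2:<v2> ...' (feature ids start at 1)."""
    pairs = " ".join(f"{i}:{v}" for i, v in enumerate(features, start=1))
    return f"{label} qid:{qid} {pairs}"


# Made-up sample rows: (relevance label, query id, weather)
rows = [(2, 1, "sunny"), (0, 1, "rainy")]
lines = [
    to_ranklib_line(label, qid, one_hot(weather, WEATHER_LEVELS))
    for label, qid, weather in rows
]
print("\n".join(lines))
```

    Each nominal level becomes its own 0/1 feature, so the learner never has to treat "rainy" as numerically larger or smaller than "sunny".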

     
