Let us assume the following situation. You want to use the part-of-speech tag of a word as a feature in your training data. The training data file will then look like this:

.... pos=N.... c1
.... pos=V.... c2
.... pos=P.... c1

After training, you want to know whether using the part-of-speech tag was a good idea, without excluding it and retraining the model. I thought (and this was also recommended to me) that a good approach would be to look at the feature weights in the model file. Let us assume we get feature weights like this:

pos=N
pos=V
pos=P
....
10.446606836643648
-0.744115260153853
3.5105152438851084

How do you interpret those? What is the difference between the positive and the negative ones? How can I construct a single weight for the abstract "part of speech" feature out of the collection of weights for all of its possible values (pos=N, pos=V, pos=P)?
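
To make the last question concrete, here is a minimal Python sketch of the kind of aggregation I have in mind. It assumes that the names and weights in the model file are paired by position (I am not sure this is true), and the summaries it computes (mean absolute weight, largest absolute weight, range) are only hypothetical candidates, not something I know to be a valid importance measure.

```python
from collections import defaultdict

# Assumption: names and weights are paired by their order in the model file.
weights = {
    "pos=N": 10.446606836643648,
    "pos=V": -0.744115260153853,
    "pos=P": 3.5105152438851084,
}


def summarize_by_feature(weights):
    """Group 'name=value' weights by name and compute candidate summaries."""
    groups = defaultdict(list)
    for key, weight in weights.items():
        # Everything before the first '=' is treated as the abstract feature name.
        name, _, _value = key.partition("=")
        groups[name].append(weight)

    return {
        name: {
            "mean_abs": sum(abs(w) for w in ws) / len(ws),
            "max_abs": max(abs(w) for w in ws),
            "range": max(ws) - min(ws),
        }
        for name, ws in groups.items()
    }


summary = summarize_by_feature(weights)
print(summary["pos"])
```

Is any of these numbers a meaningful score for the "part of speech" feature as a whole, or is there a better way to combine the per-value weights?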

I would be really grateful for any help, since I need this for a task that is very important to me. Thanks in advance.