The Lemur Project / Discussion / RankLib: Error Code 1 on Training data

Darren L - 2018-04-28

Hi,

I am running RankLib on my data (shape: 218279 rows × 1504 columns) from Python and getting error code 1 with no output. Is there any documentation on RankLib error codes?

I am using Jupyter (IPython) for my project and run the process with subprocess.run. In case it helps, below is my training code.

train_data = 'learning_to_rank_data/training.txt'
test_data = ''
validate_data = ''
metric2t = 'NDCG@2'
model_dest = 'learning_to_rank_data/model.txt'
try:
    subprocess.run(['java', '-jar', ranklibjar, '-train', train_data,
                    '-ranker', '3', '-metric2t', metric2t, '-save', model_dest],
                   shell=True, check=True)
except subprocess.CalledProcessError as e:
    raise RuntimeError("command '{}' return with error (code {}): {}".format(e.cmd, e.returncode, e.output))

Below is the output:

RuntimeError: command '['java', '-jar', 'RankLib-2.9.jar', '-train', 'learning_to_rank_data/training.txt', '-ranker', '3', '-metric2t', 'NDCG@2', '-save', 'learning_to_rank_data/model.txt']' return with error (code 1): None

I have tried running the RankLib library itself (i.e. java -jar bin/RankLib.jar) in Jupyter using the same approach (subprocess.run) and it works fine.

What is causing this error code 1? Could it be because my data is too big? Or is it because I am only training, without testing and validation?

Any help would be appreciated!

Cheers.

EDIT
I just tried slicing my data down to 1000 rows and still get return code 1, so the data size is not the issue. What exactly is causing this problem?

Last edit: Darren L 2018-04-28

Lemur Project - 2018-04-29

I don't have any experience running an external process from inside Python. I presume the error code is coming from the system, not RankLib, i.e. 0 for success and 1 for failure.

Often these errors are environment issues when calling RankLib (or any external process). Does Python know your work environment? Is it finding the data as specified by the path? Perhaps use a fully specified path rather than a relative one?
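One way to act on these suggestions is to capture what the child process writes to stderr, so the actual message is visible instead of `None`. The following is only a minimal sketch: the helper name `run_and_capture` is hypothetical, and the demo command is a stand-in Python child process rather than RankLib itself. Note also that on POSIX systems, passing a list together with `shell=True` makes only the first item the command, so the sketch leaves `shell=True` out.

```python
import subprocess
import sys


def run_and_capture(cmd):
    """Run cmd (a list of strings) and return (returncode, stdout, stderr)."""
    # No shell=True: with a list, the arguments go straight to the program.
    # check=False so a failing command returns normally and its output
    # can still be inspected.
    result = subprocess.run(cmd, capture_output=True, text=True, check=False)
    return result.returncode, result.stdout, result.stderr


if __name__ == "__main__":
    # Stand-in child process (an assumption for illustration): it writes a
    # message to stderr and exits with code 1, like a failing tool might.
    demo_cmd = [sys.executable, "-c",
                "import sys; sys.stderr.write('boom'); sys.exit(1)"]
    code, out, err = run_and_capture(demo_cmd)
    print("exit code:", code)
    print("stderr:", err)
```

For the RankLib call in the original post, the same helper could be given the `java -jar ...` argument list, with `os.path.abspath` applied to the jar, training file, and model paths so the result does not depend on the notebook's working directory.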

- Darren L - 2018-05-09
  
  Hi Stephen,
  
  Thank you for your response. I solved this issue once I got a login account on the machine I was working on. Previously I was only using the Jupyter endpoint environment. Using the console on the machine, I could see the actual error message and compare it with the RankLib source code. It turns out that the minimum number of relevance ranking for List-Wise is 1, not 0 as I initially thought.
  
  Cheers!
  

Lemur Project - 2018-05-09

Not sure I understand your relevance ranking comment.

Are you talking about relevance labels themselves or relevance scores that determine ranking (or something else)?

I believe all RankLib algorithms accept a relevance value of 0 as simply non-relevant, with values above that indicating some degree of relevance. It depends on the metric being used, as some of the metrics are binary in nature, while others will accept a variety of "relevant" values.

Some of the list-wise algorithms make use of the natural exponential of a relevance label, which would start at 1 [e^0 = 1].
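To see which relevance labels a training file actually contains, and what their natural exponentials look like, something along these lines may help. It is a rough sketch, not RankLib code: the helper name `label_counts` is hypothetical, the sample rows are made up, and it assumes the usual LETOR-style layout in which each line starts with the label, followed by `qid:<id>` and `<feature>:<value>` pairs, with an optional `#` comment at the end.

```python
import math
from collections import Counter


def label_counts(lines):
    """Count relevance labels in LETOR-style lines.

    Assumes each data line starts with the label, e.g. "2 qid:1 1:0.5 2:0.1".
    """
    counts = Counter()
    for line in lines:
        line = line.split("#", 1)[0].strip()  # drop any trailing comment
        if not line:
            continue  # skip blank and comment-only lines
        counts[int(float(line.split()[0]))] += 1
    return counts


# Made-up sample rows, for illustration only.
sample = [
    "0 qid:1 1:0.1 2:0.7",
    "2 qid:1 1:0.9 2:0.3 # doc-b",
    "1 qid:2 1:0.4 2:0.4",
    "0 qid:2 1:0.2 2:0.8",
]
counts = label_counts(sample)
for label in sorted(counts):
    # exp(0) == 1, so a label of 0 still maps to a positive value.
    print(label, counts[label], math.exp(label))
```

On a real file, the same function can be fed an open file handle, e.g. `label_counts(open(train_data))`, since it only iterates over lines.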
