I am running RankLib from Python on my data (shape: 218279 rows × 1504 columns) and getting error code 1 with no output. Is there any documentation on RankLib's error codes?
I am using Jupyter (IPython) for my project and run the process with subprocess.run. In case it helps, below is my training code.
train_data = 'learning_to_rank_data/training.txt'
test_data = ''
validate_data = ''
metric2t = 'NDCG@2'
model_dest = 'learning_to_rank_data/model.txt'
try:
    subprocess.run(['java', '-jar', ranklibjar, '-train', train_data, '-ranker', '3', '-metric2t', metric2t, '-save', model_dest], shell=True, check=True)
except subprocess.CalledProcessError as e:
    raise RuntimeError("command '{}' return with error (code {}): {}".format(e.cmd, e.returncode, e.output))
I have tried running RankLib itself (i.e. java -jar bin/RankLib.jar) in Jupyter using the same approach (subprocess.run) and it works fine.
What is causing this error code 1? Could it be that my data is too big? Or is it because I am only training, without testing and validation?
Any help would be appreciated!
Cheers.
EDIT
I just tried slicing my data down to 1000 rows and still get return code 1, so the size of the data is not the issue. What exactly is causing this problem?
Last edit: Darren L 2018-04-28
I don't have any experience running an external process from inside Python. I presume the error code is coming from the system, not RankLib, i.e. 0 for success and 1 for failure.
Often these errors are environment issues when calling RankLib (or any external process). Does Python know your work environment? Is it finding the data as specified by the path? Perhaps use a fully specified path rather than a relative one?
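One way to follow this advice from inside a notebook is to capture the child process's output and to pass absolute paths. The sketch below is only an illustration: the helper names `run_and_capture` and `ranklib_train_command` are hypothetical, while the RankLib flags are the ones used in the question. It passes the argument list without `shell=True`, because on POSIX systems a list combined with `shell=True` runs only the first element as the command. It also merges stderr into stdout, since `CalledProcessError.output` is `None` unless output was captured, which would explain the "none output" in the question.

```python
import subprocess
import sys
from pathlib import Path


def run_and_capture(cmd):
    """Run cmd (a list of arguments, no shell) and return (returncode, text output)."""
    result = subprocess.run(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # fold stderr into stdout so the error text is kept
        universal_newlines=True,   # decode bytes to str
    )
    return result.returncode, result.stdout


def ranklib_train_command(jar, train_file, model_file, ranker="3", metric="NDCG@2"):
    """Build the training command from the question, with absolute paths (hypothetical helper)."""
    return [
        "java", "-jar", str(Path(jar).resolve()),
        "-train", str(Path(train_file).resolve()),
        "-ranker", ranker,
        "-metric2t", metric,
        "-save", str(Path(model_file).resolve()),
    ]


# Demonstration with a stand-in child process that fails with exit code 1,
# so the example runs even where Java or RankLib is not installed.
code, output = run_and_capture(
    [sys.executable, "-c", "import sys; print('boom'); sys.exit(1)"]
)
print("return code:", code)
print("captured output:", output.strip())
```

For the real run, something like `run_and_capture(ranklib_train_command("bin/RankLib.jar", "learning_to_rank_data/training.txt", "learning_to_rank_data/model.txt"))` should return whatever message the Java process printed before exiting, assuming the jar sits at that path.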
Thank you for your response. I have solved this issue now that I have a login account on the machine I was working on. Previously I was only using the Jupyter endpoint. From a console on the machine, I could see the actual error message and compare it with the RankLib source code. It turns out that the minimum number of relevance ranking for List-Wise is 1, not 0 as I had initially thought.
Cheers!
Not sure I understand your relevance ranking comment.
Are you talking about relevance labels themselves or relevance scores that determine ranking (or something else)?
I believe all RankLib algorithms accept a relevance value of 0 as simply non-relevant, with values above that indicating some degree of relevance. It depends on the metric being used: some of the metrics are binary in nature, while others will accept a variety of "relevant" values.
Some of the list-wise algorithms use the natural exponential of a relevance label, which would start at 1 [e^0 = 1].
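To see which relevance labels a training file actually contains, a quick tally can help settle whether they start at 0 or 1. This is a minimal sketch, assuming the file uses the usual `<label> qid:<id> <feature>:<value> ... # comment` layout. The function name `label_counts` and the sample lines are made up for illustration and are not taken from the data in this thread.

```python
import math
from collections import Counter


def label_counts(lines):
    """Count relevance labels, i.e. the first token on each non-empty line."""
    counts = Counter()
    for line in lines:
        line = line.split("#", 1)[0].strip()  # drop any trailing comment
        if not line:
            continue
        counts[float(line.split()[0])] += 1
    return counts


# Made-up sample lines in the assumed format.
sample = [
    "0 qid:1 1:0.2 2:0.5",
    "2 qid:1 1:0.9 2:0.1 # doc-b",
    "1 qid:2 1:0.4 2:0.4",
    "0 qid:2 1:0.3 2:0.7",
]

counts = label_counts(sample)
for label in sorted(counts):
    # exp(label) is the quantity mentioned above; it equals 1 for a label of 0
    print(label, counts[label], math.exp(label))
print("smallest label:", min(counts))
```

For a real file, passing an open file handle (for example `label_counts(open(path))`) works the same way, since the function only iterates over lines.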