Problems using HadoopImageTerrier

  • Anonymous

    Anonymous - 2014-07-30

    I followed the steps described in the wiki to use the HadoopImageTerrier tool to
    create a Terrier index, but I encountered several problems using it:

    1. I used the command described in the wiki to run HadoopImageTerrier:

    hadoop jar HadoopImageTerrier.jar -t BASIC -nr 1 -fc QuantisedKeypoint -o hdfs://servername/data/imageterrier-index.idx -m QUANTISED_FEATURES hdfs://servername/data/quantised-sift-features.seq

    but I got a message telling me that I need to specify the -k parameter. This is
    strange, since HadoopImageTerrier builds the index from already-quantised features.
    There is no clustering procedure to run, so why does it need the -k parameter?

    2. I specified the -k parameter and tried to run it on Hadoop 2.0.0-cdh4.7.0 (the
      version the OpenIMAJ 1.3 snapshot uses), but I got an error message:

    Exception in thread "main" java.lang.IncompatibleClassChangeError: Found interface org.apache.hadoop.mapreduce.JobContext, but class was expected
    at org.imageterrier.hadoop.mapreduce.PositionAwareSequenceFileInputFormat.getSplits(PositionAwareSequenceFileInputFormat.java:71)
    at org.apache.hadoop.mapreduce.JobSubmitter.writeNewSplits(JobSubmitter.java:468)
    at org.apache.hadoop.mapreduce.JobSubmitter.writeSplits(JobSubmitter.java:485)
    at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:369)
    at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1286)
    at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1283)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1438)
    at org.apache.hadoop.mapreduce.Job.submit(Job.java:1283)
    at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1304)
    at org.imageterrier.indexers.hadoop.HadoopIndexer.run(HadoopIndexer.java:569)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
    at org.imageterrier.indexers.hadoop.HadoopIndexer.main(HadoopIndexer.java:609)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:208)

    It seems to be caused by incompatible Hadoop versions. I found that Terrier uses
    Hadoop 0.20.2. Could that be the reason for the error? Thanks.

     
    • Jonathon Hare

      Jonathon Hare - 2014-07-31

      Regarding 1 - the simple answer is that it doesn't need to know; however, unfortunately, because of the way the argument parsing works and the fact that we "borrow" arguments from other tools (i.e. ImageTerrierTools and OpenIMAJ's ClusterQuantiserTool), the -k argument gets marked as required.
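
      Passing any value just to keep the parser happy should be enough; for example (the -k value below is arbitrary and, as far as I can tell, ignored when the input is already quantised):

      hadoop jar HadoopImageTerrier.jar -t BASIC -nr 1 -fc QuantisedKeypoint -k 100 -o hdfs://servername/data/imageterrier-index.idx -m QUANTISED_FEATURES hdfs://servername/data/quantised-sift-features.seq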

      Regarding 2 - yes, this looks like a problem with mixed Hadoop versions... I'm trying to make a fix atm.
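
      For reference, JobContext was a concrete class in the Hadoop 0.20.x mapreduce API but became an interface in 2.x, so a binary compiled against the old jars fails at runtime in exactly this way. Roughly speaking - and this is only an illustrative sketch, not the actual ImageTerrier source - the failing call site looks something like this:

      import java.io.IOException;
      import java.util.List;

      import org.apache.hadoop.conf.Configuration;
      import org.apache.hadoop.mapreduce.InputSplit;
      import org.apache.hadoop.mapreduce.JobContext;
      import org.apache.hadoop.mapreduce.lib.input.SequenceFileInputFormat;

      // Sketch named after the class in the stack trace, not the real code.
      public class PositionAwareSequenceFileInputFormat<K, V>
              extends SequenceFileInputFormat<K, V> {

          @Override
          public List<InputSplit> getSplits(JobContext context) throws IOException {
              // This is where the mismatch bites: code compiled against 0.20.x
              // invokes getConfiguration() as a class method (invokevirtual),
              // but on a 2.x cluster JobContext is an interface, so the JVM
              // throws IncompatibleClassChangeError here.
              Configuration conf = context.getConfiguration();
              // ... (the real implementation would use conf to build its
              // position-aware splits)
              return super.getSplits(context);
          }
      }

      Recompiling code like this against the Hadoop 2.x jars should be enough; the source itself does not need to change.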

       
      • Jonathon Hare

        Jonathon Hare - 2014-07-31

        Following up, I've just rebuilt Terrier (and deployed it at maven.openimaj.org) to use the same Hadoop version as OpenIMAJ, and committed the required change to the ImageTerrier Maven POMs to pick up the new Terrier version. Untested, but hopefully it will work...

         
        • Anonymous

          Anonymous - 2014-08-01

          I tested it and the same error appeared. I am new to Hadoop, but from the error message:

          Exception in thread "main" java.lang.IncompatibleClassChangeError: Found interface org.apache.hadoop.mapreduce.JobContext, but class was expected
          at org.imageterrier.hadoop.mapreduce.PositionAwareSequenceFileInputFormat.getSplits(PositionAwareSequenceFileInputFormat.java:71)

          it seems to be a problem in the MapReduce code of org.imageterrier.hadoop.mapreduce.PositionAwareSequenceFileInputFormat. Does it need to be updated for the Hadoop 2.0 API? Thanks.

           
  • Jonathon Hare

    Jonathon Hare - 2014-08-01

    I think for some reason you still have the old version being linked... Try a "mvn -U clean install" to force an update, and then verify that the only Hadoop version being linked is 2.0.0 by looking through the output of "mvn dependency:tree".
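
    Something like the following should show what's actually on the classpath (the grep is just one way of filtering the tree down to the Hadoop artifacts):

    mvn -U clean install
    mvn dependency:tree | grep -i hadoop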

    I've also made a pre-compiled version for you to try here: http://degas.ecs.soton.ac.uk/~jsh2/HadoopImageTerrier-20140801.jar

     
    • Anonymous

      Anonymous - 2014-08-02

      I downloaded the pre-compiled version and it ran without the previous problem, but it failed to generate the Terrier index. Here is the error I got:

      java.io.IOException: No run status files found in hdfs://localhost:9001/images.idx
      at org.terrier.indexing.HadoopIndexerReducer.loadRunData(HadoopIndexerReducer.java:331)
      at org.terrier.indexing.HadoopIndexerReducer.reduce(HadoopIndexerReducer.java:146)
      at org.terrier.indexing.HadoopIndexerReducer.reduce(HadoopIndexerReducer.java:1)
      at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:170)
      at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:648)
      at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:404)
      at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:443)

       
  • Jonathon Hare

    Jonathon Hare - 2014-08-06

    I would guess that there was an earlier error that caused that to happen... can you paste the complete output log?

     
  • Anonymous

    Anonymous - 2014-08-07
    Post awaiting moderation.
