OpenIMAJ / Discussion / General Discussion: OpenImaj Hadoop support. Can't find proper document, please help

Hakunami - 2014-04-17

Hey,

I am currently building an image retrieval system using Hadoop, and I
want to use OpenIMAJ as part of this system. However, I can't find any
proper documentation or an example that shows how to use it. I do think
OpenIMAJ is cool and useful, but it really needs more documentation.

Can you teach me how to use OpenIMAJ with Hadoop?

Thank you.

- Jonathon Hare - 2014-04-17
  
  Hi,
  
  I can try and help, but you'll need to be a bit more specific as to what you want to do. What kind of techniques are you using in your retrieval system? What bits of the process do you want to distribute on a Hadoop cluster?
  
  Jon
  
  - Hakunami - 2014-04-18
    
    Thank you, Jon.
    
    What I want to do is implement a naive image similarity retrieval application. The idea is from this paper: http://bit.ly/1r3LvWH
    
    First, I want to use MapReduce to extract features from the images in parallel. Each feature descriptor, together with its image ID, will form a record that I want to index. After the MapReduce job, I want the records to be written into HBase. When a query image comes in, I first extract its features, then search HBase for stored features whose distance to them is within a threshold. (I am not very clear on how to do the search.)
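One simple way to picture the query step is a brute-force scan: compare each query descriptor with every stored descriptor and vote for the image ID of any stored descriptor that falls within the distance threshold. The sketch below does this in memory with plain Java. The class and method names, the choice of Euclidean distance, and the in-memory list standing in for an HBase scan are all assumptions made for illustration, and a linear scan like this would not scale to a large index.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory stand-in for the <imageID, descriptor> records kept in HBase.
final class BruteForceMatcher {

    /** One stored record: an image ID and a single descriptor. */
    static final class Record {
        final int imageId;
        final byte[] descriptor;

        Record(int imageId, byte[] descriptor) {
            this.imageId = imageId;
            this.descriptor = descriptor;
        }
    }

    /** Squared Euclidean distance between two descriptors of equal length. */
    static long squaredDistance(byte[] a, byte[] b) {
        long sum = 0;
        for (int i = 0; i < a.length; i++) {
            // bytes are compared as signed values; a constant offset on both sides cancels out
            long d = a[i] - b[i];
            sum += d * d;
        }
        return sum;
    }

    /** Counts, per image ID, how many (query, stored) descriptor pairs lie within the threshold. */
    static Map<Integer, Integer> vote(List<byte[]> query, List<Record> index, double threshold) {
        Map<Integer, Integer> votes = new HashMap<>();
        for (byte[] q : query) {
            for (Record r : index) {
                // compare squared values to avoid taking a square root per pair
                if (squaredDistance(q, r.descriptor) <= threshold * threshold) {
                    votes.merge(r.imageId, 1, Integer::sum);
                }
            }
        }
        return votes;
    }
}
```

The image IDs with the most votes would be the candidate matches; the threshold itself has to be tuned for the descriptors in use.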
    
    So, for the first stage, I want to use OpenIMAJ in my mapper or reducer to extract SIFT features.
    
    Thank you so much for your help.
    
    Sincerely
    
    - Jonathon Hare - 2014-04-22
      
      Hi again,
      
      I'll try to give you some general guidance for the first step:
      
      The first thing you'll need to do is to get your images into Hadoop SequenceFiles. Exactly how you do this depends on your image identifiers. If you are going to use textual identifiers (i.e. the Hadoop Text class), then you can use the SequenceFileTool in OpenIMAJ (see below); otherwise (the paper you referenced uses IntWritable) you'll need to implement something yourself, either from scratch or by extending the SequenceFileTool, and perhaps by making a new subclass of the SequenceFileUtility class in the core-hadoop OpenIMAJ subproject. The images themselves will need to be stored as Hadoop BytesWritable.
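As a rough illustration of the "implement something yourself" route for integer identifiers, the sketch below walks a directory, assigns each file a sequential integer ID, and loads its raw bytes. It uses only the Java standard library. The names ImageIdAssigner and assignIds are hypothetical and not part of OpenIMAJ, and the step that would append each pair to a SequenceFile as IntWritable/BytesWritable is left as a comment.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Stream;

// Hypothetical helper (not part of OpenIMAJ): pairs each file with an integer ID.
final class ImageIdAssigner {

    /** Returns id -> raw file bytes, with IDs assigned in sorted path order. */
    static Map<Integer, byte[]> assignIds(Path imageDir) throws IOException {
        List<Path> files = new ArrayList<>();
        try (Stream<Path> listing = Files.list(imageDir)) {
            // sorting makes the ID assignment reproducible between runs
            listing.filter(Files::isRegularFile).sorted().forEach(files::add);
        }

        Map<Integer, byte[]> records = new LinkedHashMap<>();
        int nextId = 0;
        for (Path file : files) {
            // In a real job each pair would be appended to a SequenceFile here,
            // as (IntWritable id, BytesWritable imageBytes), rather than held in memory.
            records.put(nextId++, Files.readAllBytes(file));
        }
        return records;
    }
}
```

Keeping a separate record of which ID was given to which filename is worth doing at this point, since the integer keys are all that survives into the feature extraction output.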
      
      Secondly, you'll need to implement a mapper to do the feature extraction. The HadoopLocalFeaturesTool in OpenIMAJ can do this, but it will output the key-value pairs from the mapper in the form of the image ID and a serialised list of SIFT features (with all the information like the location and scale of the features as well as the descriptor). In the paper you linked, the authors output multiple pairs of <imageID, feature> (with just the raw SIFT descriptor as the feature) for each image, so if you want to do this, you'll need to write your own. Assuming that you're using IntWritable keys, the following implementation should get you started:
      
      =========
      package org.openimaj.hadoop.tools.localfeature;

      import java.io.ByteArrayInputStream;
      import java.io.IOException;

      import org.apache.hadoop.io.BytesWritable;
      import org.apache.hadoop.io.IntWritable;
      import org.apache.hadoop.mapreduce.Mapper;
      import org.openimaj.feature.local.list.LocalFeatureList;
      import org.openimaj.image.FImage;
      import org.openimaj.image.ImageUtilities;
      import org.openimaj.image.feature.local.engine.DoGSIFTEngine;
      import org.openimaj.image.feature.local.keypoints.Keypoint;

      public class SimpleSIFT extends Mapper<IntWritable, BytesWritable, IntWritable, BytesWritable> {

          private DoGSIFTEngine engine;

          @Override
          protected void setup(Context context) throws IOException, InterruptedException {
              // construct and set up the SIFT extractor
              this.engine = new DoGSIFTEngine();
          }

          @Override
          protected void map(IntWritable key, BytesWritable value, Context context) throws IOException, InterruptedException {
              // read the image; getBytes() returns a backing array that may be longer
              // than the valid data, so limit the stream to getLength() bytes
              FImage image = ImageUtilities.readF(new ByteArrayInputStream(value.getBytes(), 0, value.getLength()));

              // extract features
              LocalFeatureList<Keypoint> features = engine.findFeatures(image);

              // emit one <imageID, descriptor> pair per feature
              for (Keypoint kp : features) {
                  context.write(key, new BytesWritable(kp.ivec));
              }
          }
      }
      =========
      
      You'll obviously need to write something to configure the MapReduce job (i.e. a class that implements the Hadoop Tool interface and is launched through ToolRunner) to use this mapper, and define the input, etc. Make sure you set the number of reducers to 0, as there doesn't need to be a reduction step for feature extraction.
      
      Basic usage of the SequenceFileTool is as follows:
      Navigate in the OpenIMAJ sources to hadoop/tools/SequenceFileTool.
      Run "mvn install assembly:assembly" to build it (it will be packaged in target/SequenceFileTool.jar).
      Run the tool with "java -jar SequenceFileTool.jar" and it will print a list of options.
      
      For example, say you have a directory of images that you want to build into a SequenceFile; the following command will build the file (mysequencefile.seq) for you:
      java -jar SequenceFileTool.jar -m CREATE -o mysequencefile.seq images/
      
      (You should be able to give a full HDFS URI instead of mysequencefile.seq if you want the output SequenceFile to be created directly on your Hadoop HDFS.)
      
      The HadoopLocalFeaturesTool can be built in the same way as the SequenceFileTool above. Once you've got the jar built, you can run it directly on your cluster by issuing a command like:
      hadoop jar HadoopLocalFeaturesTool.jar --mode SIFT -o hdfs://servername/data/mysequencefile.seq -i hdfs://servername/data/sift-features.seq
      
      (As with the SequenceFileTool, information on all the options will be printed if you run it without arguments: hadoop jar HadoopLocalFeaturesTool.jar)
      
      Hope that helps,
      
      Jon
      
      - nam vo - 2015-11-03
        
        I'm new to OpenIMAJ. I need some source code to practice with. Thanks, Jonathon Hare.
        
      - Anonymous - 2015-11-03
        
        http://openimaj.org/tutorial/
        
      - nam vo - 2015-11-10
        
        Hi Jonathon Hare,
        
        I tried to write a ReducerSIFT class, but it does not work. Please help me review it:
        
        ======================================================
        public class SimpleSIFT extends Mapper<IntWritable, BytesWritable, IntWritable, BytesWritable> {
        ....
        }

        ======================================================
        import org.apache.hadoop.io.Text;
        import org.apache.hadoop.mapreduce.Reducer;

        import java.io.IOException;

        public class ReducerSIFT extends Reducer<Text, Text, Text, Text> {

            public void reduce(Text key, Iterable<Text> values, Context context) throws IOException, InterruptedException {
                int sum = 0;
                Text imageFilePath = null;
                for (Text filePath : values) {
                    imageFilePath = filePath;
                    sum = sum + 1;
                }
                context.write(key, new Text(String.valueOf(sum) + " " + imageFilePath));
            }

        }
        
        Thanks
        nam
        

Anonymous - 2014-05-22

hadoop jar HadoopLocalFeaturesTool.jar --mode SIFT -o hdfs://servername/data/mysequencefile.seq -i sift-features.seq

Will this command extract SIFT features from the image sequence file if sift-features.seq is given as the input?
Shouldn't sift-features.seq be the output?

- Jonathon Hare - 2014-05-22
  
  Yes, you're right, the files are the wrong way round: the image SequenceFile should be passed with -i and the SIFT features file with -o.
  