
OpenImaj Hadoop support. Can't find proper document, please help

Hakunami
2014-04-17
2015-11-10
  • Hakunami

    Hakunami - 2014-04-17

    Hey,

    I am currently building an image retrieval system using Hadoop, and I
    want to use OpenIMAJ as part of this system. However, I can't find
    proper documentation or an example that shows me how to use it. I do think
    OpenIMAJ is cool and useful, but it really needs more documentation.

    Can you teach me how to use OpenIMAJ with Hadoop?

    Thank you.

     
    • Jonathon Hare

      Jonathon Hare - 2014-04-17

      Hi,

      I can try and help, but you'll need to be a bit more specific as to what you want to do. What kind of techniques are you using in your retrieval system? What bits of the process do you want to distribute on a Hadoop cluster?

      Jon


       
      • Hakunami

        Hakunami - 2014-04-18

        Thank you, Jon.

        What I want to do is implement a naive image similarity retrieval application. The idea is from this paper: http://bit.ly/1r3LvWH

        First, I want to use MapReduce to extract features from the images in parallel. Each feature descriptor, together with its image ID, will then form a record that I want to build an index on. After the MapReduce job, I want the records to be written into HBase. When a query image comes in, I first extract its features, then search HBase for stored features whose distance to them is within a threshold. (I am not very clear on how to do the search.)
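The thresholded comparison at the heart of the query stage could be sketched as follows. This is only a brute-force illustration in plain Java, assuming each record is an image ID plus one raw descriptor stored as a byte array; the class and method names (DescriptorSearch, Record, withinThreshold) are hypothetical and belong neither to OpenIMAJ nor to HBase. It says nothing about how the records would be fetched from HBase, and a linear scan like this would not scale to a large index.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not part of OpenIMAJ or HBase.
final class DescriptorSearch {

    // One indexed record: an image ID together with one raw descriptor.
    static final class Record {
        final int imageId;
        final byte[] descriptor;

        Record(int imageId, byte[] descriptor) {
            this.imageId = imageId;
            this.descriptor = descriptor;
        }
    }

    // Squared Euclidean distance between two equal-length descriptors.
    // Assumption: both arrays use the same byte encoding, so differences are meaningful.
    static long squaredDistance(byte[] a, byte[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("Descriptor lengths differ");
        }
        long sum = 0;
        for (int i = 0; i < a.length; i++) {
            int d = a[i] - b[i];
            sum += (long) d * d;
        }
        return sum;
    }

    // Returns the image ID of every record whose descriptor lies within
    // `threshold` (Euclidean distance) of the query descriptor. An image ID
    // appears once per matching descriptor, so duplicates are possible.
    static List<Integer> withinThreshold(List<Record> index, byte[] query, double threshold) {
        double limit = threshold * threshold; // compare squared values, avoiding sqrt
        List<Integer> hits = new ArrayList<>();
        for (Record r : index) {
            if (squaredDistance(r.descriptor, query) <= limit) {
                hits.add(r.imageId);
            }
        }
        return hits;
    }
}
```

Counting how often each image ID appears in the returned list would give a crude similarity score per image, which is one possible way to rank the results.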

        So, for the first stage, I want to use OpenIMAJ in my mapper or reducer to extract SIFT features.

        Thank you so much for your help.

        Sincerely

         
        • Jonathon Hare

          Jonathon Hare - 2014-04-22

          Hi again,

          I'll try to give you some general guidance for the first step:

          The first thing you'll need to do is to get your images into Hadoop SequenceFiles. Exactly how you do this depends on your image identifiers. If you are going to use textual identifiers (i.e. the Hadoop Text class), then you can use the SequenceFileTool in OpenIMAJ (see below); otherwise (the paper you referenced uses IntWritable) you'll need to implement something yourself (either from scratch or by extending the SequenceFileTool, and perhaps making a new subclass of the SequenceFileUtility class in the core-hadoop OpenIMAJ subproject). The images themselves will need to be stored as Hadoop BytesWritable.
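For the "from scratch" route with IntWritable keys, a minimal sketch might look like the one below. It uses only the standard Hadoop SequenceFile API and assumes Hadoop 2 or later on the classpath. The class name IntKeyedImageSequenceFile is hypothetical, and assigning IDs by sorted file order is just an assumption made for illustration; it does not use or extend the OpenIMAJ SequenceFileUtility.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.util.Arrays;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.BytesWritable;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.SequenceFile;

// Hypothetical helper, not part of OpenIMAJ: packs a directory of image files
// into a SequenceFile of <IntWritable, BytesWritable> records.
final class IntKeyedImageSequenceFile {

    // Writes every regular file in imageDir to the SequenceFile at output and
    // returns the number of records written.
    // Assumption: the image ID is the file's position in sorted order.
    static int write(File imageDir, Path output, Configuration conf) throws IOException {
        File[] files = imageDir.listFiles(File::isFile);
        if (files == null) {
            throw new IOException("Not a readable directory: " + imageDir);
        }
        Arrays.sort(files); // deterministic ID assignment

        try (SequenceFile.Writer writer = SequenceFile.createWriter(conf,
                SequenceFile.Writer.file(output),
                SequenceFile.Writer.keyClass(IntWritable.class),
                SequenceFile.Writer.valueClass(BytesWritable.class))) {
            for (int id = 0; id < files.length; id++) {
                // store the encoded image bytes untouched; decoding happens in the mapper
                byte[] bytes = Files.readAllBytes(files[id].toPath());
                writer.append(new IntWritable(id), new BytesWritable(bytes));
            }
        }
        return files.length;
    }

    public static void main(String[] args) throws IOException {
        // args[0]: local image directory, args[1]: output path (local or an hdfs:// URI)
        int n = write(new File(args[0]), new Path(args[1]), new Configuration());
        System.out.println("Wrote " + n + " images");
    }
}
```

In a real system you would probably also want to record the mapping from integer ID back to the original file name somewhere, since the SequenceFile itself only keeps the ID.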

          Secondly, you'll need to implement a mapper to do the feature extraction. The HadoopLocalFeaturesTool tool in OpenIMAJ can do this, but will output the key value pairs from the mapper in the form of the image ID and a serialised list of SIFT features (with all the information like the location and scale of the features as well as the descriptor). In the paper you linked, the authors output multiple pairs of <imageID, feature> (with just the raw SIFT descriptor as the feature) for each image, so if you want to do this, you'll need to write your own. Assuming that you're using IntWritable keys, the following implementation should get you started:

          =========
          package org.openimaj.hadoop.tools.localfeature;

          import java.io.ByteArrayInputStream;
          import java.io.IOException;

          import org.apache.hadoop.io.BytesWritable;
          import org.apache.hadoop.io.IntWritable;
          import org.apache.hadoop.io.Text;
          import org.apache.hadoop.mapreduce.Mapper;
          import org.openimaj.feature.local.list.LocalFeatureList;
          import org.openimaj.image.FImage;
          import org.openimaj.image.ImageUtilities;
          import org.openimaj.image.feature.local.engine.DoGSIFTEngine;
          import org.openimaj.image.feature.local.keypoints.Keypoint;
          import org.openimaj.io.IOUtils;

          public class SimpleSIFT extends Mapper<IntWritable, BytesWritable, IntWritable, BytesWritable> {

          private DoGSIFTEngine engine;
          
          @Override
          protected void setup(Context context) throws IOException,
                  InterruptedException
          {
              //construct and setup the sift extractor
              this.engine = new DoGSIFTEngine();
          }
          
          @Override
          protected void
                  map(IntWritable key, BytesWritable value, Context context) throws IOException, InterruptedException
          {
              //read the image (only the first getLength() bytes of the backing array are valid)
              FImage image = ImageUtilities.readF(new ByteArrayInputStream(value.getBytes(), 0, value.getLength()));
          
              //extract features
              LocalFeatureList<Keypoint> features = engine.findFeatures(image);
          
              //emit the features
              for (Keypoint kp : features) {
                  context.write(key, new BytesWritable(kp.ivec));
              }
          }
          

          }

          You'll obviously need to write something to configure the MapReduce job (i.e. a class that implements the Hadoop Tool interface and is launched through ToolRunner) to use this mapper, define the input, etc. Make sure you set the number of reducers to 0, as there doesn't need to be a reduction step for feature extraction.
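A minimal sketch of such a job configuration is given below, assuming Hadoop 2 or later and that the SimpleSIFT mapper above is on the classpath. The driver class name SimpleSIFTDriver, the job name and the two-argument command line are assumptions made for illustration; this is not the driver used by HadoopLocalFeaturesTool.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.BytesWritable;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.SequenceFileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.SequenceFileOutputFormat;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;
import org.openimaj.hadoop.tools.localfeature.SimpleSIFT;

// Hypothetical driver for the SimpleSIFT mapper; a sketch, not an OpenIMAJ class.
public class SimpleSIFTDriver extends Configured implements Tool {

    @Override
    public int run(String[] args) throws Exception {
        if (args.length != 2) {
            System.err.println("Usage: SimpleSIFTDriver <input sequencefile> <output directory>");
            return 2;
        }

        Job job = Job.getInstance(getConf(), "simple-sift");
        job.setJarByClass(SimpleSIFTDriver.class);

        // map-only job: feature extraction needs no reduction step
        job.setMapperClass(SimpleSIFT.class);
        job.setNumReduceTasks(0);

        // read <IntWritable, BytesWritable> images, write <IntWritable, BytesWritable> descriptors
        job.setInputFormatClass(SequenceFileInputFormat.class);
        job.setOutputFormatClass(SequenceFileOutputFormat.class);
        job.setOutputKeyClass(IntWritable.class);
        job.setOutputValueClass(BytesWritable.class);

        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));

        return job.waitForCompletion(true) ? 0 : 1;
    }

    public static void main(String[] args) throws Exception {
        System.exit(ToolRunner.run(new Configuration(), new SimpleSIFTDriver(), args));
    }
}
```

Going through ToolRunner means the generic Hadoop options (for example -D property=value) are parsed before run() sees the remaining arguments. Note that the output path must not already exist, otherwise the job will refuse to start.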

          Basic usage of the SequenceFileTool is as follows:
          Build the tool by navigating in the OpenIMAJ sources to hadoop/tools/SequenceFileTool
          Run "mvn install assembly:assembly" to build it (it will be packaged in target/SequenceFileTool.jar)
          Run the tool with "java -jar SequenceFileTool.jar" and it will print a list of options

          For example, say you have a directory of images that you want to build into a SequenceFile, then the following command will build the file (mysequencefile.seq) for you:
          java -jar SequenceFileTool.jar -m CREATE -o mysequencefile.seq images/

          (You should be able to give a full HDFS URI instead of mysequencefile.seq if you want the output SequenceFile to be created directly on your Hadoop HDFS.)

          The HadoopLocalFeaturesTool can be built in the same way as the SequenceFileTool above. Once you've got the jar built you can run it directly on your cluster by issuing a command like:
          hadoop jar HadoopLocalFeaturesTool.jar --mode SIFT -o hdfs://servername/data/mysequencefile.seq -i hdfs://servername/data/sift-features.seq

          (As with the SequenceFileTool, information on all the options will be printed if you run it without arguments: hadoop jar HadoopLocalFeaturesTool.jar)

          Hope that helps,

          Jon


           
          • nam vo

            nam vo - 2015-11-03

            I'm new to OpenIMAJ. I need some source code to practise with this. Thanks, Jonathon Hare.

             
          • nam vo

            nam vo - 2015-11-10

            Hi Jonathon Hare,

            I tried to write a ReducerSIFT class, but it does not work. Could you please help me review it?

            ======================================================
            public class SimpleSIFT extends Mapper<IntWritable, BytesWritable, IntWritable, BytesWritable> {
            ....
            }

            ======================================================
            import org.apache.hadoop.io.Text;
            import org.apache.hadoop.mapreduce.Reducer;

            import java.io.IOException;

            public class ReducerSIFT extends Reducer<Text, Text, Text, Text> {

            public void reduce(Text key, Iterable<Text> values, Context context) throws IOException, InterruptedException {
            
                int sum = 0;
                Text imageFilePath = null;
                for (Text filePath : values) {
                    imageFilePath = filePath;
                    sum = sum + 1;
                }
                context.write(key, new Text(String.valueOf(sum) + "       " + imageFilePath));
            }
            

            }

            Thanks
            nam

             
  • Anonymous

    Anonymous - 2014-05-22

    hadoop jar HadoopLocalFeaturesTool.jar --mode SIFT -o hdfs://servername/data/mysequencefile.seq -i sift-features.seq

    Will this command extract SIFT features from the image sequence file if sift-features.seq is given as the input?
    Shouldn't sift-features.seq be the output?

     
    • Jonathon Hare

      Jonathon Hare - 2014-05-22

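      Yes, you're right, the files are round the wrong way in my example: -i should point at the image SequenceFile and -o at the SIFT features output, i.e.
      hadoop jar HadoopLocalFeaturesTool.jar --mode SIFT -i hdfs://servername/data/mysequencefile.seq -o hdfs://servername/data/sift-features.seq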

       
