I am currently building an image retrieval system using Hadoop, and I want to use OpenIMAJ as part of this system. However, I can't find proper documentation or an example that can teach me how to use it. I do think OpenIMAJ is cool and useful, but it really needs more documentation.
Can you teach me how to use OpenIMAJ with Hadoop?
Thank you.
I can try and help, but you'll need to be a bit more specific as to what you want to do. What kind of techniques are you using in your retrieval system? What bits of the process do you want to distribute on a Hadoop cluster?
OpenImaj Hadoop support. Can't find proper document, please help
What I want to do is implement a naive image-similarity retrieval application. The idea is from this paper: http://bit.ly/1r3LvWH
First, I want to use MapReduce to extract features from the images in parallel; each feature descriptor together with its corresponding image ID will then be a record that I want to build an index on. After the MapReduce job, I want the records written into HBase. When a query image comes in, I first extract its features, then search HBase for stored features whose distance is within a threshold. (I am not very clear on how to do the search.)
So, for the first stage, I want to use OpenIMAJ in my mapper or reducer to extract SIFT features.
Thank you so much for your help.
Sincerely
I'll try to give you some general guidance for the first step:
The first thing you'll need to do is to get your images into Hadoop SequenceFiles. Specifically how you do this is going to depend on your image identifiers. If you are going to use textual identifiers (i.e. the Hadoop Text class), then you can use the SequenceFileTool in OpenIMAJ (see below); otherwise (the paper you referenced uses IntWritable) you'll need to implement something yourself (either from scratch or by extending the SequenceFileTool, and perhaps making a new subclass of the SequenceFileUtility class in the core-hadoop OpenIMAJ subproject). The images themselves will need to be stored as Hadoop BytesWritable.
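If you do end up writing your own, a minimal sketch along these lines might help get you going. Note this is only an illustration under my assumptions: the class name ImageSequenceFileWriter is made up, it assigns sequential integer IDs, and it uses the plain Hadoop SequenceFile API rather than the OpenIMAJ utilities. The important detail is that the raw encoded image bytes go into the BytesWritable, undecoded:

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.BytesWritable;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.SequenceFile;

public class ImageSequenceFileWriter {
    public static void main(String[] args) throws IOException {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);
        Path output = new Path(args[0]); // e.g. hdfs://servername/data/images.seq

        // a SequenceFile with IntWritable keys and BytesWritable values
        SequenceFile.Writer writer = SequenceFile.createWriter(
                fs, conf, output, IntWritable.class, BytesWritable.class);
        try {
            int id = 0;
            for (int i = 1; i < args.length; i++) {
                // read the raw (encoded) image bytes; don't decode them here
                byte[] data = Files.readAllBytes(new File(args[i]).toPath());
                writer.append(new IntWritable(id++), new BytesWritable(data));
            }
        } finally {
            writer.close();
        }
    }
}
```

Run it with the output path followed by a list of image files. A real version would probably derive the integer IDs from your own naming scheme rather than just counting.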
Secondly you'll need to implement a mapper to do the feature extraction. The HadoopLocalFeaturesTool in OpenIMAJ can do this, but will output the key-value pairs from the mapper in the form of the image ID and a serialised list of SIFT features (with all the information like the location and scale of the features as well as the descriptor). In the paper you linked, the authors output multiple pairs of <imageID, feature> (with just the raw SIFT descriptor as the feature) for each image, so if you want to do this, you'll need to write your own. Assuming that you're using IntWritable keys, the following implementation should get you started:
import java.io.ByteArrayInputStream;
import java.io.IOException;

import org.apache.hadoop.io.BytesWritable;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.mapreduce.Mapper;

import org.openimaj.feature.local.list.LocalFeatureList;
import org.openimaj.image.FImage;
import org.openimaj.image.ImageUtilities;
import org.openimaj.image.feature.local.engine.DoGSIFTEngine;
import org.openimaj.image.feature.local.keypoints.Keypoint;

public class SimpleSIFT extends Mapper<IntWritable, BytesWritable, IntWritable, BytesWritable> {
    private DoGSIFTEngine engine;

    @Override
    protected void setup(Context context) throws IOException, InterruptedException {
        // construct and set up the SIFT extractor
        this.engine = new DoGSIFTEngine();
    }

    @Override
    protected void map(IntWritable key, BytesWritable value, Context context)
            throws IOException, InterruptedException
    {
        // decode the image from the raw bytes held in the value
        FImage image = ImageUtilities.readF(
                new ByteArrayInputStream(value.getBytes(), 0, value.getLength()));

        // extract the SIFT features
        LocalFeatureList<Keypoint> features = engine.findFeatures(image);

        // emit one <imageID, descriptor> pair per keypoint
        for (Keypoint kp : features) {
            context.write(key, new BytesWritable(kp.ivec));
        }
    }
}
You'll obviously need to write something to configure the MapReduce job (i.e. an implementation of the Hadoop Tool interface, run via ToolRunner) to use this mapper, and define the input and output, etc. Make sure you set the number of reducers to 0, as there doesn't need to be a reduction step for feature extraction.
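As a rough sketch of that driver (the class name SimpleSIFTDriver is just a placeholder, and I'm assuming the SimpleSIFT mapper above and SequenceFile input/output), it could look something like this:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.BytesWritable;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.SequenceFileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.SequenceFileOutputFormat;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

public class SimpleSIFTDriver extends Configured implements Tool {
    @Override
    public int run(String[] args) throws Exception {
        Job job = new Job(getConf(), "sift-extraction");
        job.setJarByClass(SimpleSIFTDriver.class);

        job.setMapperClass(SimpleSIFT.class);
        job.setNumReduceTasks(0); // map-only: no reduction step is needed

        // read and write SequenceFiles of <IntWritable, BytesWritable>
        job.setInputFormatClass(SequenceFileInputFormat.class);
        job.setOutputFormatClass(SequenceFileOutputFormat.class);
        job.setOutputKeyClass(IntWritable.class);
        job.setOutputValueClass(BytesWritable.class);

        SequenceFileInputFormat.addInputPath(job, new Path(args[0]));
        SequenceFileOutputFormat.setOutputPath(job, new Path(args[1]));

        return job.waitForCompletion(true) ? 0 : 1;
    }

    public static void main(String[] args) throws Exception {
        System.exit(ToolRunner.run(new Configuration(), new SimpleSIFTDriver(), args));
    }
}
```

You'd launch it with hadoop jar, giving the input image SequenceFile and an output directory as arguments.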
Basic usage of the SequenceFileTool is as follows:
Build the tool by navigating in the OpenIMAJ sources to hadoop/tools/SequenceFileTool
Run "mvn install assembly:assembly" to build it (it will be packaged in target/SequenceFileTool.jar)
Run the tool with "java -jar SequenceFileTool.jar" and it will print a list of options
For example, say you have a directory of images that you want to build into a SequenceFile; the following command will build the file (mysequencefile.seq) for you:
java -jar SequenceFileTool.jar -m CREATE -o mysequencefile.seq images/
(You should be able to give a full HDFS URI instead of mysequencefile.seq if you want the output SequenceFile to be created directly on HDFS.)
The HadoopLocalFeaturesTool can be built in the same way as the SequenceFileTool above. Once you've got the jar built you can run it directly on your cluster by issuing a command like:
hadoop jar HadoopLocalFeaturesTool.jar --mode SIFT -o hdfs://servername/data/mysequencefile.seq -i hdfs://servername/data/sift-features.seq
(As with the SequenceFileTool, information on all the options will be printed if you run it without arguments: hadoop jar HadoopLocalFeaturesTool.jar.)
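Once you have a features SequenceFile, you can sanity-check its contents with a small reader along these lines. This is only a sketch: the class name DumpFeatures is made up, and it assumes BytesWritable values, so it just prints each key and the length of its serialised value:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.BytesWritable;
import org.apache.hadoop.io.SequenceFile;
import org.apache.hadoop.io.Writable;
import org.apache.hadoop.util.ReflectionUtils;

public class DumpFeatures {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);
        Path path = new Path(args[0]); // e.g. hdfs://servername/data/sift-features.seq

        SequenceFile.Reader reader = new SequenceFile.Reader(fs, conf, path);
        try {
            // instantiate key/value holders of whatever types the file declares
            Writable key = (Writable) ReflectionUtils.newInstance(reader.getKeyClass(), conf);
            Writable value = (Writable) ReflectionUtils.newInstance(reader.getValueClass(), conf);

            while (reader.next(key, value)) {
                int len = ((BytesWritable) value).getLength();
                System.out.println(key + "\t" + len + " bytes");
            }
        } finally {
            reader.close();
        }
    }
}
```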
Hope that helps,
Jon
I'm new to OpenIMAJ. I need some source code to practice with. Thanks, Jonathon Hare
http://openimaj.org/tutorial/
Hi Jonathon Hare,
I tried to write a ReducerSIFT class, but it does not work. Please help me review it:
======================================================
public class SimpleSIFT extends Mapper<IntWritable, BytesWritable, IntWritable, BytesWritable> {
....
}
======================================================
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;
import java.io.IOException;
public class ReducerSIFT extends Reducer<Text, Text, Text, Text> {
}
Thanks
nam
hadoop jar HadoopLocalFeaturesTool.jar --mode SIFT -o hdfs://servername/data/mysequencefile.seq -i sift-features.seq
Will this command extract SIFT features from the image sequence if sift-features.seq is given as the input?
Shouldn't sift-features.seq be the output?
Yes, you're right, the files are round the wrong way: the image SequenceFile should be the -i input and sift-features.seq the -o output...