I have a PDF file with text and images, say product descriptions and photos.
Some photos consist of several adjacent images.
I want to extract the product photos and, for each product, join all the photo parts into one image.
To start with, I need to know which images are adjacent.
I studied the sample source code and saw that I can read an image's size (height, width), but I haven't found how to read its position on the page.
Could someone give me a clue how to get image positions from an existing PDF file?
It seems I found a solution for ImageXObject in the sample ContentScanningSample:
if(xObject is xObjects::ImageXObject)
{
  Console.WriteLine(
    "Image '" + xObjectKey + "' (" + xObject.BaseObject + ") " // Image key and indirect reference.
    + "on page " + (page.Index + 1) + " (" + page.BaseObject + ")" // Page index and indirect reference.
    );
  // Get the coordinates of the image!
  double[] ctm = level.State.CTM; // Current transformation matrix [a b c d e f].
  SizeF imageSize = xObject.Size; // Image native size.
  Console.WriteLine(" x: " + Math.Round(ctm[4])); // Horizontal translation (e).
  Console.WriteLine(" y: " + Math.Round(page.Size.Value.Height - ctm[5])); // Vertical translation (f), measured from the top of the page.
  Console.WriteLine(" width: " + Math.Round(ctm[0]) + " (native: " + Math.Round(imageSize.Width) + ")"); // Horizontal scale (a).
  Console.WriteLine(" height: " + Math.Round(Math.Abs(ctm[3])) + " (native: " + Math.Round(imageSize.Height) + ")"); // Vertical scale (d).
}
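Once the position and size of every image are known, the adjacency check from the original question could be sketched roughly as below. This is only an illustration under stated assumptions: PlacedImage and ImageAdjacency are hypothetical helper types (they are not part of PDF Clown), every image is assumed to be an unrotated rectangle described by its top-left corner and size in one consistent coordinate system, and the tolerance value is a guess that may need tuning. Depending on how the CTM is set up, the y printed above may refer to the image's lower edge, in which case the height would have to be subtracted first.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper type (not part of PDF Clown): one image placement,
// described by its top-left corner and size in page units.
public sealed class PlacedImage
{
    public readonly string Key;
    public readonly double X, Y, Width, Height;

    public PlacedImage(string key, double x, double y, double width, double height)
    {
        Key = key; X = x; Y = y; Width = width; Height = height;
    }

    public double Right { get { return X + Width; } }
    public double Bottom { get { return Y + Height; } }
}

public static class ImageAdjacency
{
    // Length of the overlap between intervals [a1, a2] and [b1, b2] (negative if disjoint).
    private static double Overlap(double a1, double a2, double b1, double b2)
    {
        return Math.Min(a2, b2) - Math.Max(a1, b1);
    }

    // True when a and b touch along a vertical or horizontal edge, within 'tolerance' units.
    public static bool AreAdjacent(PlacedImage a, PlacedImage b, double tolerance)
    {
        bool sideBySide =
            (Math.Abs(a.Right - b.X) <= tolerance || Math.Abs(b.Right - a.X) <= tolerance)
            && Overlap(a.Y, a.Bottom, b.Y, b.Bottom) > tolerance;
        bool stacked =
            (Math.Abs(a.Bottom - b.Y) <= tolerance || Math.Abs(b.Bottom - a.Y) <= tolerance)
            && Overlap(a.X, a.Right, b.X, b.Right) > tolerance;
        return sideBySide || stacked;
    }

    // Groups images into clusters of (transitively) adjacent parts using a flood fill.
    public static List<List<PlacedImage>> Group(IList<PlacedImage> images, double tolerance)
    {
        var groups = new List<List<PlacedImage>>();
        var visited = new bool[images.Count];
        for (int i = 0; i < images.Count; i++)
        {
            if (visited[i]) continue;
            var group = new List<PlacedImage>();
            var pending = new Stack<int>();
            pending.Push(i);
            visited[i] = true;
            while (pending.Count > 0)
            {
                int current = pending.Pop();
                group.Add(images[current]);
                for (int j = 0; j < images.Count; j++)
                {
                    if (!visited[j] && AreAdjacent(images[current], images[j], tolerance))
                    {
                        visited[j] = true;
                        pending.Push(j);
                    }
                }
            }
            groups.Add(group);
        }
        return groups;
    }
}
```

The idea would be to collect one PlacedImage per image while scanning a page, call ImageAdjacency.Group(images, 1.0) for that page, and then stitch the members of each resulting group into a single picture. Images that merely touch at a corner are not treated as adjacent, because the overlap along the shared edge has to exceed the tolerance.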