PDF Clown / Discussion / Open Discussion: How To replace the images in a PDF?

Yes, of course you can: PDF Clown is a neutral, universal tool intended to cover the *whole* domain of the PDF spec. Its only limitation is its current implementation state, which depends on code contributions, so the more people volunteer, the more quickly it gets developed!

There are various approaches to your task, depending on how you want to discover existing images and match them with new ones. I can suggest these strategies:
1) you can iterate through the resources associated with the document's pages (see documents.contents.Resources [1]), replacing images based on their names: this is the simplest way, but resource names often aren't meaningful;
2) you can scan each page's contents looking for the operations used to show images (see documents.contents.objects.PaintXObject [2]), then get their location and size in order to decide how to match them with the new ones; after that, you can replace the corresponding page resources (note: the correspondence is established through the name references obtained from the PaintXObject operations).

[1] http://clown.sourceforge.net/API/it/stefanochizzolini/clown/documents/contents/Resources.html
[2] http://clown.sourceforge.net/API/it/stefanochizzolini/clown/documents/contents/objects/PaintXObject.html
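
To make the matching step of strategy 2 a bit more concrete, here is a minimal sketch of selecting resource names by position. Please note that it is NOT PDF Clown code: the Placement class and the namesInside method are hypothetical stand-ins I'm assuming just for illustration, representing what a content scan would give you for each PaintXObject operation (resource name, location and size). The names it returns are the ones you would then replace in the page's resources.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in types: nothing in this class is PDF Clown API.
class PositionMatchingSketch
{
  /** One image placement, as a content scan might report it (hypothetical). */
  static final class Placement
  {
    final String resourceName; // Name referenced by the paint operation (e.g. "Im0").
    final double x, y, width, height; // Location and size, in page units.

    Placement(String resourceName, double x, double y, double width, double height)
    {
      this.resourceName = resourceName;
      this.x = x;
      this.y = y;
      this.width = width;
      this.height = height;
    }
  }

  /** Returns the resource names of the placements lying entirely inside the given area. */
  static Set<String> namesInside(
    List<Placement> placements,
    double areaX, double areaY, double areaWidth, double areaHeight
    )
  {
    // LinkedHashSet keeps the scan order and drops repeated names.
    Set<String> names = new LinkedHashSet<String>();
    for(Placement p : placements)
    {
      boolean inside = p.x >= areaX
        && p.y >= areaY
        && p.x + p.width <= areaX + areaWidth
        && p.y + p.height <= areaY + areaHeight;
      if(inside)
        names.add(p.resourceName);
    }
    return names;
  }

  public static void main(String[] args)
  {
    List<Placement> placements = Arrays.asList(
      new Placement("Im0", 50, 700, 100, 60),
      new Placement("Im1", 300, 100, 200, 150),
      new Placement("Im2", 60, 720, 40, 30)
      );
    // Select the images placed within a 200x200 area whose origin is (0,600).
    System.out.println(namesInside(placements, 0, 600, 200, 200));
  }
}
```

Matching by full containment is just one possible criterion; depending on your documents, you may prefer to match by size, by overlap or by order of appearance.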

To give you something useful to start from, I've prepared an example of strategy 1 (resource substitution by name). It has been implemented as an ISample (just like any other sample of PDF Clown usage), so you can run it by adding it to the CLI samples project that ships with the downloadable distribution. Strategy 2 can be implemented by reusing the code of ContentScanningSample (you can find it in the 0.0.7 release), which takes care of spotting PaintXObject operations and determining each image's size and location.

Here is the example of strategy 1:

package it.stefanochizzolini.clown.samples;

import it.stefanochizzolini.clown.documents.Document;
import it.stefanochizzolini.clown.documents.Page;
import it.stefanochizzolini.clown.documents.contents.Resources;
import it.stefanochizzolini.clown.documents.contents.entities.Image;
import it.stefanochizzolini.clown.documents.contents.xObjects.ImageXObject;
import it.stefanochizzolini.clown.documents.contents.xObjects.XObject;
import it.stefanochizzolini.clown.files.File;
import it.stefanochizzolini.clown.objects.PdfName;

import java.util.Map;

/**
This sample demonstrates how to replace images appearing in a document's pages through their resource names.

@author Stefano Chizzolini (http://www.stefanochizzolini.it)
@version 0.0.7
*/
public class ImageSubstitutionSample
implements ISample
{
public void run(
    SampleLoader loader
    )
{
    // 1. Opening the PDF file...
    File file;
    {
      // (boilerplate user choice -- ignore it)
      String filePath = loader.getPdfFileChoice("Please select a PDF file");

      try{file = new File(filePath);}
      catch(Exception e){throw new RuntimeException(filePath + " file access error.",e);}
    }
    Document document = file.getDocument();

    // 2. Replace the images!
    replaceImages(document,loader);

    // (boilerplate metadata insertion -- ignore it)
    loader.buildAccessories(document,this.getClass(),"Image substitution","replacing a document's images");

    // 3. Serialize the PDF file (again, boilerplate code -- see the SampleLoader class source code)!
    loader.serialize(file,this.getClass().getSimpleName());
}

private void replaceImages(
    Document document,
    SampleLoader loader
    )
{
    // Get the image used to replace existing ones!
    // (Image is an abstract entity, as it still has to be included into the PDF document.)
    Image image = Image.get(loader.getInputPath() + java.io.File.separator + "images" + java.io.File.separator + "gnu.jpg");
    // Add the image to the document!
    // (XObject, i.e. external object, is a reusable object in PDF spec jargon.)
    XObject imageXObject = image.toXObject(document);
    // Looking for images to replace...
    for(Page page : document.getPages())
    {
      Resources resources = page.getResources();
      Map<PdfName,XObject> xObjects = resources.getXObjects();
      if(xObjects == null)
        continue;
      for(PdfName xObjectKey : xObjects.keySet())
      {
        XObject xObject = xObjects.get(xObjectKey);
        // Is the page's resource an image?
        if(xObject instanceof ImageXObject)
        {
          System.out.println("Substituting " + xObjectKey + " image xobject.");
          xObjects.put(xObjectKey,imageXObject);
        }
      }
    }
}
}

Note: the xobjects iteration could be a bit more concise using entrySet() instead of keySet(); I opted for the latter only because entrySet() hasn't been implemented yet for this collection.
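
For reference, this is roughly what the entrySet() variant looks like on a standard java.util map. It's only a sketch under that assumption: the EntrySetReplacementSketch class and its replaceInstances method are hypothetical names of mine, and plain Integer/String values stand in for image and non-image xobjects, so it won't work on the sample's xobjects map until entrySet() is available there.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustration on a standard java.util map; this is not PDF Clown code.
class EntrySetReplacementSketch
{
  /** Replaces every value that is an instance of the given type; returns the number of replacements. */
  static <K,V> int replaceInstances(Map<K,V> map, Class<? extends V> type, V replacement)
  {
    int count = 0;
    for(Map.Entry<K,V> entry : map.entrySet())
    {
      if(type.isInstance(entry.getValue()))
      {
        // setValue() writes through to the map: no second lookup by key is needed.
        entry.setValue(replacement);
        count++;
      }
    }
    return count;
  }

  public static void main(String[] args)
  {
    Map<String,Object> xObjects = new LinkedHashMap<String,Object>();
    xObjects.put("Im0", Integer.valueOf(1)); // Integer stands in for an image xobject.
    xObjects.put("Fm0", "form"); // String stands in for a non-image xobject.

    int replaced = replaceInstances(xObjects, Integer.class, Integer.valueOf(42));
    System.out.println(replaced + " " + xObjects);
  }
}
```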

Enjoy!
Stefano
