File Date Author Commit
 src 2009-10-12 lomby [r6] Fixes with comments from mailing list
 COPYING 2009-10-07 lomby [r3] Added license
 COPYING.LESSER 2009-10-07 lomby [r3] Added license
 README.txt 2009-10-12 lomby [r7] Add thank you
 build.xml 2009-10-12 lomby [r8] Removes wrong comments.
 input.pdf 2009-10-08 lomby [r5] Version 1.0

Read Me

Name: PdfJavascriptStripper
Version: 1.1
License: LGPL
Author: http://www.oneoverzero.net andrea.lombardoni@oneoverzero.net
Description: This Java utility removes the JavaScript parts from a PDF
             document. It may help to prevent injection/phishing attacks.
             It is based on the iText library http://www.lowagie.com/iText/
Thanks to: Mark Storer 

How to compile:
---------------

You have to obtain a copy of iText (supported version is 2.0.8).
You can obtain it at SourceForge.net:

https://sourceforge.net/projects/itext/files/iText/iText2.0.8/iText-2.0.8.jar/download

First, edit the build.xml file so that the classpath points to your copy of
the iText jar.

The line:
                <pathelement location="${basedir}/iText-2.0.8.jar" />

must be changed so that its location attribute is the path to your iText jar.

Run: 

ant compile

This will build everything.

How to run:
-----------

Prepare the PDF file that must be processed and name it input.pdf

Run:

ant pdfstripper

The output will be in a file called output.pdf

How to use:
-----------

In class:
  
  net.oneoverzero.common.itext.PdfJavascriptStripper

Use the method:
  
  public static Pair<byte[],Boolean> stripJavascript(final byte[] in)

It takes a PDF document as a byte array and returns the same PDF document
with all the JavaScript removed. The boolean flag tells whether any JavaScript
was removed.
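
A minimal sketch of how the method above might be wired into a file-to-file
workflow. This README does not say which Pair class is returned or how its
fields are read, so the sketch takes the stripper as a
java.util.function.Function and uses java.util.Map.Entry as a stand-in for
the pair. StripFileExample, stripFile and the pass-through stripper are
hypothetical names, and Java 8 or later is assumed.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.Map;
import java.util.function.Function;

public class StripFileExample {

    /**
     * Reads the file 'input', passes its bytes to 'stripper', and writes the
     * bytes returned by the stripper to 'output'.
     *
     * The stripper is a stand-in for PdfJavascriptStripper.stripJavascript:
     * the entry key is the processed document, the entry value is the flag.
     *
     * @return the flag reported by the stripper
     */
    public static boolean stripFile(
            Path input,
            Path output,
            Function<byte[], Map.Entry<byte[], Boolean>> stripper) throws IOException {
        byte[] in = Files.readAllBytes(input);
        Map.Entry<byte[], Boolean> result = stripper.apply(in);
        Files.write(output, result.getKey());
        return result.getValue();
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 2) {
            System.out.println("usage: StripFileExample <input.pdf> <output.pdf>");
            return;
        }
        // Placeholder only: returns the input unchanged and reports that
        // nothing was removed. Replace it with a call to the real stripper.
        Function<byte[], Map.Entry<byte[], Boolean>> passThrough =
                bytes -> new SimpleImmutableEntry<byte[], Boolean>(bytes, Boolean.FALSE);

        boolean removed = stripFile(Paths.get(args[0]), Paths.get(args[1]), passThrough);
        System.out.println(removed ? "JavaScript was removed" : "Nothing was removed");
    }
}
```

To use the real stripper, replace the placeholder with a lambda that calls
PdfJavascriptStripper.stripJavascript and converts the returned Pair into a
Map.Entry, using whatever accessors that Pair class provides.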

Roadmap:
--------

- verify that all JavaScript is removed
- also remove Flash
- also remove HTML/anchors
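
As a rough illustration of the first roadmap item, the sketch below scans the
raw bytes of a document for the PDF name tokens /JavaScript and /JS. It is not
part of this project and it is only a crude heuristic: it cannot see inside
compressed object streams or encrypted documents, and it misses names written
with # escapes, so a negative result proves nothing. JavascriptMarkerScan and
containsMarker are hypothetical names.

```java
import java.nio.charset.StandardCharsets;

public class JavascriptMarkerScan {

    // Name tokens that usually accompany JavaScript actions in a PDF.
    private static final String[] MARKERS = { "/JavaScript", "/JS" };

    /**
     * Returns true if one of the markers occurs in the raw bytes as a
     * complete name token, that is, followed by the end of the data or by
     * a PDF whitespace or delimiter character.
     */
    public static boolean containsMarker(byte[] pdf) {
        // ISO-8859-1 maps every byte to exactly one char, so nothing is lost.
        String text = new String(pdf, StandardCharsets.ISO_8859_1);
        for (String marker : MARKERS) {
            int from = 0;
            while (true) {
                int idx = text.indexOf(marker, from);
                if (idx < 0) {
                    break;
                }
                int end = idx + marker.length();
                if (end == text.length() || endsName(text.charAt(end))) {
                    return true;
                }
                from = idx + 1;
            }
        }
        return false;
    }

    // True for characters that terminate a PDF name token.
    private static boolean endsName(char c) {
        return c == 0 || c == '\t' || c == '\n' || c == '\f' || c == '\r' || c == ' '
                || "()<>[]{}/%".indexOf(c) >= 0;
    }
}
```

A check of this kind could be run on the bytes returned by stripJavascript;
only a positive result is meaningful, since it shows that a marker survived.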
