
Encoding issues

Graeme
2011-03-28
2013-04-29
  • Graeme

    Graeme - 2011-03-28

    Hi,

    I am using Ghost4J to convert EPS graphics to PDF and have found that some of our EPS graphics contain TIFF/WMF thumbnail previews. As a result, the first line starts with "ÅÐÓÆ" (C5 D0 D3 C6 in hex), followed by binary data (much of it null characters), before the "%!PS-Adobe-2.0 EPSF-1.2" section begins.
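
    For reference, the bytes C5 D0 D3 C6 are the magic number of the 30-byte DOS EPS binary header, which stores the offset and length of the PostScript section as little-endian 32-bit integers at bytes 4-7 and 8-11. The sketch below shows one way to cut the preview away before the data reaches Ghostscript. It is an illustration, not part of Ghost4J; the class and method names are hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.Arrays;

// Hypothetical helper, not part of Ghost4J.
public class EpsPreviewStripper {
    // Magic bytes C5 D0 D3 C6 that open a DOS EPS binary header
    private static final byte[] MAGIC = {(byte) 0xC5, (byte) 0xD0, (byte) 0xD3, (byte) 0xC6};
    // The binary header is 30 bytes long
    private static final int HEADER_LENGTH = 30;

    /** Returns only the PostScript section, or the input unchanged if it has no binary header. */
    public static byte[] extractPostScript(byte[] eps) {
        if (eps.length < HEADER_LENGTH || !Arrays.equals(Arrays.copyOf(eps, 4), MAGIC)) {
            return eps;
        }
        ByteBuffer header = ByteBuffer.wrap(eps).order(ByteOrder.LITTLE_ENDIAN);
        // Bytes 4-7: offset of the PostScript section; bytes 8-11: its length (unsigned)
        long offset = header.getInt(4) & 0xFFFFFFFFL;
        long length = header.getInt(8) & 0xFFFFFFFFL;
        if (offset < HEADER_LENGTH || offset + length > eps.length) {
            throw new IllegalArgumentException("Corrupt DOS EPS binary header");
        }
        return Arrays.copyOfRange(eps, (int) offset, (int) (offset + length));
    }
}
```

    Working on bytes rather than strings sidesteps the encoding question for the preview data, because the binary section is never decoded as text.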

    Depending on the encoding in use (e.g. UTF-8), some of this binary data seems to corrupt the first line; for example, I can't copy and paste anything after "ÅÐÓÆ". In Java this often results in the first few lines of the EPS file being dropped.
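
    The corruption can be reproduced without Ghost4J. The sketch below decodes a few header-like bytes into a String and encodes them again, which is roughly what happens when binary data passes through a text API. The class name and sample bytes are made up for illustration.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class, not part of Ghost4J.
public class EncodingRoundTrip {
    /** True if decoding the bytes to a String and encoding them again yields the same bytes. */
    public static boolean survivesRoundTrip(byte[] data, Charset charset) {
        byte[] back = new String(data, charset).getBytes(charset);
        return Arrays.equals(data, back);
    }

    public static void main(String[] args) {
        // The EPS magic number followed by a few sample binary bytes
        byte[] header = {(byte) 0xC5, (byte) 0xD0, (byte) 0xD3, (byte) 0xC6, 0x1E, 0x00, 0x00, 0x00};
        System.out.println("UTF-8:      " + survivesRoundTrip(header, StandardCharsets.UTF_8));
        System.out.println("ISO-8859-1: " + survivesRoundTrip(header, StandardCharsets.ISO_8859_1));
    }
}
```

    The magic bytes are not valid UTF-8, so the decoder substitutes replacement characters and the original bytes are lost. A single-byte encoding such as ISO-8859-1 maps every byte to exactly one character, so the data survives the round trip.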

    In NetBeans I can fix this by simply changing the project's encoding options. However, the project I am working on still needs to use UTF-8 as its encoding, so I need to change the encoding used by Ghost4J directly instead.

    I have done this by changing the “stdin” callback method in the “Ghostscript” class to retrieve the encoding used by JNA and use it when creating a string from the buffer. I can then set the JNA encoding elsewhere in my project.

    stdinCallback = new GhostscriptLibrary.stdin_fn() {
        // Use the encoding configured for JNA, falling back to windows-1252
        Properties props = System.getProperties();
        String jnaEncoding = props.getProperty("jna.encoding", "windows-1252");

        public int callback(Pointer caller_handle, Pointer buf, int len) {
            try {
                // Never read more than the buffer supplied by Ghostscript can hold
                byte[] buffer = new byte[Math.min(len, 1000)];
                int read = getStdIn().read(buffer);
                if (read != -1) {
                    // Decode with the encoding JNA uses to encode the string again
                    buf.setString(0, new String(buffer, 0, read, jnaEncoding));
                    return read;
                }
            } catch (Exception e) {
                // an error occurred: report that no data was read
            }
            return 0;
        }
    };
    

    The above is just a suggested change, so you may want to create a more central encoding parameter that is available to the higher-level API. In either case, I hope this fix is of some use; it would be nice to see it in the next release.

    Thanks

     
  • zippy1978

    zippy1978 - 2011-05-31

    Hi,

    Sorry for the late reply (I did not see this post at the time).

    Thanks for your tip. I will add a parameter to support a custom encoding (from a system property as you did, or with an attribute of the Ghostscript class).

    Regards,
    Gilles

     
  • zippy1978

    zippy1978 - 2011-06-01

    I released version 0.4.3 today.

    Now you can set the 'ghost4j.encoding' property to use a specific encoding with Ghostscript stdin.
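
    A minimal sketch of setting that property from code, assuming it is read as an ordinary JVM system property and must be set before Ghostscript is first used (both are assumptions, not confirmed in this thread). The helper class is hypothetical; only the property name comes from the post above.

```java
import java.nio.charset.Charset;

// Hypothetical helper; only the property name "ghost4j.encoding" comes from this thread.
public class Ghost4jEncodingConfig {
    public static final String ENCODING_PROPERTY = "ghost4j.encoding";

    /** Sets the encoding property after checking that the JVM knows the charset. */
    public static void useEncoding(String charsetName) {
        // Fail early on a misspelled or unsupported charset name
        if (!Charset.isSupported(charsetName)) {
            throw new IllegalArgumentException("Unsupported charset: " + charsetName);
        }
        System.setProperty(ENCODING_PROPERTY, charsetName);
    }
}
```

    Calling Ghost4jEncodingConfig.useEncoding("windows-1252") at startup should have the same effect as passing -Dghost4j.encoding=windows-1252 on the java command line.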

    Gilles

     
  • Anonymous

    Anonymous - 2011-09-27

    Hi, I have the same problem with file encoding. I need to convert PDF to JPEG, and the path of the PDF file contains Traditional Chinese characters. When I tried parameters such as "-dNOPAUSE -dBATCH -sDEVICE=display -dJPEGQ=100 …  C:\測試.pdf", I got an exception like "Error: /undefinedfilename in C:\??.pdf".

    But everything works if I set the VM argument -Dfile.encoding=MS950. Apparently the path containing Chinese characters is passed internally using the default file encoding. The exception above is thrown whenever the file encoding is not set to MS950. Any tips for solving the problem? Thanks!
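
    The "??" in the error message is what Java produces when a string is encoded with a charset that cannot represent some of its characters. The sketch below reproduces that effect and shows a way to check a path up front. It assumes, as guessed above, that the path is encoded with the default charset on its way to Ghostscript; the class and method names are hypothetical.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of Ghost4J.
public class PathEncodingCheck {
    /** Shows what the path looks like after being encoded and decoded with the given charset. */
    public static String asSeenThrough(String path, Charset charset) {
        // Characters the charset cannot represent are replaced with '?'
        return new String(path.getBytes(charset), charset);
    }

    /** True if every character of the path can be represented in the charset. */
    public static boolean isRepresentable(String path, Charset charset) {
        return charset.newEncoder().canEncode(path);
    }

    public static void main(String[] args) {
        // The two Chinese characters from the path in the post, written as Unicode escapes
        String path = "C:\\\u6e2c\u8a66.pdf";
        System.out.println(asSeenThrough(path, StandardCharsets.ISO_8859_1));
        System.out.println(isRepresentable(path, Charset.defaultCharset()));
    }
}
```

    Checking isRepresentable against Charset.defaultCharset() before calling Ghostscript would at least turn the undefinedfilename error into a clearer message about the encoding.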

     
  • Rameswar Prusty

    Rameswar Prusty - 2013-03-01

    Hello Gilles,

    If I do not set any encoding, which encoding does Ghost4J use by default?

     
  • zippy1978

    zippy1978 - 2013-03-01

    Hi,

    Have a look at http://www.ghost4j.org/encoding.html

    Regards,

    Gilles

     
  • zippy1978

    zippy1978 - 2013-03-01

    One more thing: the project has moved to GitHub and is no longer active here.
    For support, go to https://github.com/zippy1978/ghost4j instead.

     
