Load source image over HTTP

  • Tom Crane

    Tom Crane - 2012-03-01


    I understand why it's preferable to have source images loaded from a local
    file system rather than across the network over http, hence answers like:


    But how about this scenario? We're looking at a setup where the source images
    (of which there might be millions) are held in an asset management system and
    are ONLY available over http. Sure, they are ultimately held on a file system
    somewhere, but that file system is opaque to us behind many layers of
    security, and probably not something we could mount from the DMZ where
    IIPImage would live. Or they could be blobs in a database. The underlying
    storage of the asset management system is off limits.

    The sheer number of images, and the fact that they are continually added to,
    makes copying them to a filesystem that IIPImage can see (i.e., in advance)

    I've made a proof of concept using Djatoka which handles this nicely, pulling
    the source images from the asset management system on demand via HTTP, but I'd
    like to use IIPImage for all sorts of other reasons! I can think of one way of
    doing this - a proxy in front of IIPImage which determines the source image
    required, sees if it's already present on a file system visible to IIPImage,
    and if not, obtains it via http and writes it to the file system. But that
    feels horrible and isn't exactly making the most of IIPImage's performance.

    I'm wondering if I'm missing something more obvious here. Would adding the
    ability to load the source file via http cause all sorts of other problems in
    IIPImage? This feature could use a short term cache to avoid repeatedly
    loading the same image over http for popular images.
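
One way to picture the "short term cache" idea above is a directory of downloaded source images that is pruned by age. This is only a sketch under assumptions: `mark_used` and `prune_cache` are hypothetical helper names, nothing here is part of IIPImage, and it assumes whatever serves the image calls `mark_used` on each hit so that popular images stay fresh.

```python
import os
import time
from pathlib import Path


def mark_used(path):
    """Bump the file's timestamps to now, so popular images are kept longer."""
    os.utime(path)


def prune_cache(cache_dir, max_age_seconds, now=None):
    """Delete files whose mtime is older than max_age_seconds; return their names."""
    now = time.time() if now is None else now
    removed = []
    for entry in Path(cache_dir).iterdir():
        if not entry.is_file():
            continue  # leave subdirectories alone
        if now - entry.stat().st_mtime > max_age_seconds:
            entry.unlink()
            removed.append(entry.name)
    return sorted(removed)
```

Run periodically (from cron, say), this keeps the cache bounded by age rather than by size; a size-based limit would need a different policy.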


  • Ruven

    Ruven - 2012-03-02

    At the moment, IIPImage can only read files on a filesystem. Of course in
    theory you can map any remote system to your local filesystem using some
    synchronization mechanism such as NFS, Samba or even things like Dropbox,
    WebDAV etc. Does your asset management system support anything other than HTTP?

    The issue with IIPImage is that it requires fast and random access to the
    image. It can be done via HTTP, but it'll be very slow as HTTP is not at all
    designed for this.

    So, the best way of course would be to do this at the filesystem level (like
    NFS). Otherwise you could modify the server itself to first download the image
    from your asset management system into some temporary space and then read it
    normally. If you don't want to modify the server, then you could, as you
    describe, use some proxy service to handle this and give the temporary path to
    the IIPImage server. It's not such a bad solution. Wikimedia Commons use a
    similar principle for their IIPImage support, whereby the first time you
    request it, a proxy will convert the image to multi-resolution tiled TIFF
    ready for IIPImage.
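
The "download into some temporary space, then read it normally" approach could look roughly like the sketch below. It is an illustration only: `ensure_local`, the cache directory and the base URL are all hypothetical, and the real asset management system may well need authentication or a different URL scheme. The download goes to a temporary file that is renamed into place, so the image server never sees a half-written file.

```python
import os
import shutil
import tempfile
import urllib.request
from pathlib import Path


def ensure_local(name, cache_dir, base_url, opener=urllib.request.urlopen):
    """Return the local path for `name`, downloading it first if it is missing."""
    # Accept plain file names only, so a request cannot escape cache_dir.
    if not name or name != os.path.basename(name) or name in (".", ".."):
        raise ValueError(f"unsafe image name: {name!r}")

    cache_dir = Path(cache_dir)
    target = cache_dir / name
    if target.exists():
        return target  # already on the local file system, nothing to fetch

    cache_dir.mkdir(parents=True, exist_ok=True)
    # Write to a temporary file in the same directory, then rename it into place.
    fd, tmp_path = tempfile.mkstemp(dir=cache_dir, suffix=".part")
    try:
        with os.fdopen(fd, "wb") as tmp, opener(f"{base_url}/{name}") as response:
            shutil.copyfileobj(response, tmp)
        os.replace(tmp_path, target)
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)  # do not leave partial downloads behind
        raise
    return target
```

A proxy would call something like `ensure_local("xxxx.jp2", "/var/cache/iip-sources", "http://assets.example/images")` (both paths invented here) and then hand the returned path to the IIPImage server.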

  • Tom Crane

    Tom Crane - 2012-03-06

    Thanks for the reply. I'll do some experiments with a proxy solution and see
    how that goes. I can see the need for fast, random access, obviously!

    My main concern with a proxy is having to have that step for every individual
    tile request; but if I can combine that step with a reverse proxy cache (for
    the tiles) it might be OK.

    I'll report back on how we get on.


  • Illtud Daniel

    Illtud Daniel - 2012-04-04

    Could you do this with the HTTP Range header?

    Vanilla apache or other httpd servers probably could, but people (like
    ourselves) who would like to join IIPserver to http are doing it for the same
    reason as tomcrane, ie we have an asset management system with an http
    interface, and that's unlikely to support range, I guess (not that I've
    tried).

  • Ruven

    Ruven - 2012-04-04

    that's unlikely to support range, I guess (not that I've tried).

    Maybe you should try it ;-)
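
Trying it is cheap: ask for a small byte range and see what comes back. In the sketch below, `probe` and `supports_ranges` are hypothetical names and the URL you pass in is whatever your asset management system exposes. A server that honours the `Range` request header answers `206 Partial Content` with a `Content-Range` header; one that ignores it answers `200` and sends the whole file.

```python
import urllib.request


def supports_ranges(status, headers):
    """True if a response to a ranged GET shows the range was honoured."""
    # 206 plus Content-Range means only the requested bytes were sent;
    # a plain 200 means the Range header was ignored.
    content_range = headers.get("Content-Range", "")
    return status == 206 and content_range.startswith("bytes ")


def probe(url, first=0, last=1023):
    """Request bytes first..last of `url` and report whether that worked."""
    request = urllib.request.Request(url, headers={"Range": f"bytes={first}-{last}"})
    with urllib.request.urlopen(request) as response:
        return supports_ranges(response.status, response.headers)
```

Even where this returns true, every random read becomes an HTTP round trip, so the point above about HTTP being slow for this kind of access still applies.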

  • Tom Crane

    Tom Crane - 2012-05-04

    For anyone interested, here's what we decided to do in the end.

    Rather than proxying individual tile requests to ensure that the source image
    is present, we have a "pre-fetch" HTTP endpoint that our viewer has to call
    before it starts generating tile requests for a new image, e.g.,
    /check/xxxx.jp2. The viewer makes a GET request to this, and it must wait for
    the response before proceeding. The response should normally be "1", which
    will come back very quickly if the file is present on the local file system
    IIPImage can see, or a bit more slowly if we have to retrieve xxxx.jp2 from
    the asset management system and write it to the IIPImage server's filesystem.
    If for some reason xxxx.jp2 is unobtainable it can return a sensible error

    Compared to everything else going on in the system, this adds virtually no
    overhead; the local file system acts as a cache of raw image files for
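
For illustration, a pre-fetch endpoint of the kind described above could be sketched as follows. This is not the code we run: the cache directory, the `fetch_from_asset_system` placeholder, and the 400/502 status codes for the error cases are all assumptions made for the example. Only the `/check/xxxx.jp2` path and the "1" response come from the description.

```python
import os
from http.server import BaseHTTPRequestHandler
from pathlib import Path

CACHE_DIR = Path("/var/cache/iip-sources")  # hypothetical directory IIPImage reads from


def fetch_from_asset_system(name, target):
    """Placeholder: copy `name` from the asset management system to `target`."""
    raise NotImplementedError("replace with a real download")


def check(path, cache_dir=CACHE_DIR, fetch=fetch_from_asset_system):
    """Map a request path such as /check/xxxx.jp2 to a (status, body) pair."""
    prefix = "/check/"
    name = path[len(prefix):] if path.startswith(prefix) else ""
    # Accept plain file names only, so a request cannot leave cache_dir.
    if not name or name != os.path.basename(name) or name in (".", ".."):
        return 400, "bad image name"
    target = Path(cache_dir) / name
    if not target.exists():
        try:
            fetch(name, target)  # slow path: first request for this image
        except Exception as exc:
            return 502, f"could not obtain {name}: {exc}"
    if not target.exists():
        return 502, f"could not obtain {name}"
    return 200, "1"  # the viewer may now start requesting tiles


class CheckHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        status, body = check(self.path)
        payload = body.encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)
```

To try it, the handler can be served with the standard library's `http.server.ThreadingHTTPServer`, for example `ThreadingHTTPServer(("127.0.0.1", 8080), CheckHandler).serve_forever()`. The viewer then issues `GET /check/xxxx.jp2` and waits for the "1" before requesting tiles.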


