#159 File: Support reading from a single directory (instead of a MediaWiki hive)

Milestone: v1.3.*
Status: closed
Owner: nobody
Labels: file (4)
Expected release: v1.3.1
Updated: 2014-03-21
Created: 2013-07-06
Creator: gnosygnu
Private: No

XOWA can download images from either an internet wiki site or a local tarball. In both cases, the folder structure has a specific layout. For example, the A.png file would be at "/root/7/70/A.png".

A request was made for XOWA to be able to read images from a single directory. Using the above example, the file would be at "/root/A.png".

This requires non-trivial changes to the code. It may be implemented whenever the thumbnails initiative is completed.

Discussion

  • gnosygnu
    2013-07-13

    As a further clarification, XOWA should support some form of relative paths (and not just absolute ones)

     
  • gnosygnu
    2013-10-15

    This will be covered by the redesign of the image code for the offline thumbnails. It should be done sometime in a v0.11.* release.

     
  • gnosygnu
    2013-11-26

    • status: pfe --> queued
    • Expected release: --> v0.12.*
    • Milestone: PFE --> v0.12.*
     
  • gnosygnu
    2013-11-26

    I'm moving this ticket to v0.12.*. I have some infrastructure in place, so hopefully it'll be possible within the next few weeks.

     
  • gnosygnu
    2013-12-31

    • Expected release: v0.12.* --> v1.1.*
    • Milestone: v0.12.* --> v1.1.*
     
  • gnosygnu
    2013-12-31

    I'm pushing this ticket out another month for the following reasons:

    • The ticket is harder than I expected. It not only requires ImageMagick / Inkscape, but also a separate database of widths and heights.
    • There are other tickets that have higher priority, such as support for Chinese Wikipedia, missing extensions, and various parser bugs
    • A lot of my time has been spent on tasks revolving around the offline image databases

    I'll try again for a v1.1.* release, but frankly there is a good deal of work involved, and editable wikis are a lower priority than readable wikis.

     
  • gnosygnu
    2014-02-03

    • status: queued --> in-progress
    • Expected release: v1.1.* --> v1.2.*
    • Milestone: v1.1.* --> v1.2.*
     
  • gnosygnu
    2014-02-03

    I did some work over the weekend, but this is still a difficult ticket. I'm going to try to get this done in a v1.2.* release, but it will be tough.

     
  • gnosygnu
    2014-03-01

    • status: in-progress --> done
    • Expected release: v1.2.* --> v1.3.1
    • Milestone: v1.2.* --> v1.3.*
     
  • gnosygnu
    2014-03-01

    This feature will be included in v1.3.1

    See home/wiki/Help:File_sources/Directory in XOWA for more info. I've added the page dump below, but further details may be subject to change

    == Overview ==
    Some wikis are not Wikimedia Foundation wikis, and may not have their images / files arranged in a WMF filesystem layout
    
    XOWA supports using files from a single directory.
    
    == Background ==
    Wikimedia Foundation wikis place their images in a precisely defined filesystem layout.
    
    For example, a file in a WMF tarball may have the following path /wmf_tarball/wikipedia/commons/7/70/A.png
    
    Note that this path embeds the MD5 hash of the title. In this case, "7" is the first character and "70" the first two characters of the MD5 hash of "A.png", which is "701ccaf6ec1641a9ff778fd0b862e5a2"
    
    Because an MD5 hash is a non-trivial function, non-WMF wikis may find it difficult to arrange their files in the same filesystem layout.
    
    XOWA allows these wikis to use an alternate method, where the files need only be placed inside a single directory.
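
The hashed layout described in this section can be sketched in Python. This is only an illustration, not XOWA's or MediaWiki's code: the function name `wmf_hashed_path` and its `root` argument are hypothetical, and the sketch assumes the file name is hashed as UTF-8 text with spaces replaced by underscores.

```python
import hashlib
import posixpath


def wmf_hashed_path(root: str, file_name: str) -> str:
    """Return root/<h>/<hh>/<file_name>, where <h> and <hh> are the first
    one and two hex digits of the MD5 hash of the file name."""
    # Assumption: spaces are stored as underscores before hashing.
    name = file_name.replace(" ", "_")
    digest = hashlib.md5(name.encode("utf-8")).hexdigest()
    return posixpath.join(root, digest[0], digest[:2], name)


if __name__ == "__main__":
    # Hashed layout versus the single-directory layout for the same file.
    print(wmf_hashed_path("/root", "A.png"))
    print(posixpath.join("/root", "A.png"))
```

The second `print` shows the single-directory alternative, which needs no hash at all.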
    
    == Instructions ==
    * Go to home/wiki/Help:Options/Config script
    * Enter in the following:
    <pre>
    app.wikis.get('my_wiki').files.wkrs.get('fs.dir') {
      orig_dir  = '~{<>xowa_root_dir<>}wiki/my_wiki/orig/';
      thumb_dir = '~{<>xowa_root_dir<>}wiki/my_wiki/thumb/';
    }
    </pre>
    * Place a file called "A.png" in "~{<>xowa_root_dir<>}wiki/my_wiki/orig/". For example, if XOWA is set up on a Windows machine at C:\xowa\ and your wiki is my_wiki, then your file should be at C:\xowa\wiki\my_wiki\orig\A.png.
    * Restart XOWA
    * Go to any page in my_wiki
    * Enter in <nowiki>[[File:A.png]]</nowiki>
    * Preview the page. The file should show
    * Enter in <nowiki>[[File:A.png|200px]]</nowiki>
    * Preview the page. The thumb should show
    
    == ImageMagick and Inkscape ==
    * You must have ImageMagick and Inkscape installed on your machine. They are needed because:
    ** MediaWiki has a lot of logic that depends on the image's size. ImageMagick is used to get the size.
    ** Thumbs are resized from the original. ImageMagick and Inkscape do the resizing.
    
    == ^orig_regy.sqlite3 ==
    * A file called "^orig_regy.sqlite3" will be in the orig directory. 
    ** This db caches the sizes of the original files (so ImageMagick doesn't need to be continually run).
    ** Note that if this file is deleted, it will be automatically regenerated
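
The caching idea behind this file can be sketched as follows. The real schema of "^orig_regy.sqlite3" is not described in this page, so the table name `orig_size`, its columns, and the helper names `open_cache` and `cached_size` are all hypothetical; `measure` stands in for whatever call asks ImageMagick for an image's size.

```python
import sqlite3
from typing import Callable, Tuple


def open_cache(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) a small size cache. The schema is hypothetical."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS orig_size "
        "(name TEXT PRIMARY KEY, w INTEGER, h INTEGER)"
    )
    return conn


def cached_size(
    conn: sqlite3.Connection,
    name: str,
    measure: Callable[[str], Tuple[int, int]],
) -> Tuple[int, int]:
    """Return (width, height), calling measure() only on a cache miss."""
    row = conn.execute(
        "SELECT w, h FROM orig_size WHERE name = ?", (name,)
    ).fetchone()
    if row is not None:
        return (row[0], row[1])
    w, h = measure(name)  # the expensive step, e.g. running ImageMagick
    conn.execute(
        "INSERT INTO orig_size (name, w, h) VALUES (?, ?, ?)", (name, w, h)
    )
    conn.commit()
    return (w, h)
```

Because the cache only stores values that can be measured again, deleting it loses nothing, which is consistent with the note that the file is regenerated automatically.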
    
    == Urls ==
    * XOWA takes the following type of urls
    ** Absolute urls: C:\xowa\wiki\my_wiki\images\
    ** XOWA relative urls: ~{<>xowa_root_dir<>}wiki/my_wiki/images/
    *** Note that relative urls can use "\" instead of "/". However, "/" is recommended for sharing across different machines (for example, the same USB drive can be used on either a Windows or a Linux machine if a "/" path is used)
    * File names need to comply with valid MediaWiki titles. For example, certain characters are invalid, such as [].
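
To make the difference between the two url types concrete, here is a minimal Python sketch of resolving a configured directory. It is not XOWA's implementation: `resolve_dir` is a hypothetical helper, and rejecting any placeholder other than `xowa_root_dir` is an assumption of this sketch.

```python
import re

# Matches the ~{<>name<>} form used in the examples on this page.
PLACEHOLDER = re.compile(r"~\{<>(.*?)<>\}")


def resolve_dir(configured: str, xowa_root_dir: str) -> str:
    """Expand ~{<>xowa_root_dir<>}; return absolute paths unchanged."""

    def expand(match: "re.Match") -> str:
        name = match.group(1)
        if name != "xowa_root_dir":
            # Assumption: only this one placeholder name is meaningful here.
            raise ValueError(f"unsupported placeholder: {name!r}")
        return xowa_root_dir

    return PLACEHOLDER.sub(expand, configured)


if __name__ == "__main__":
    root = "C:\\xowa\\"
    print(resolve_dir("~{<>xowa_root_dir<>}wiki/my_wiki/images/", root))
    print(resolve_dir("C:\\xowa\\wiki\\my_wiki\\images\\", root))
```

In this sketch an absolute path is written literally, with no `~{<>...<>}` wrapper, and wrapping an arbitrary directory name in the wrapper raises an error.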
    
    == Orig directory ==
    * All original files should go into the orig directory
    * The orig directory can be nested.
    ** For example, /xowa/wiki/my_wiki/orig/ can have a subfile in /xowa/wiki/my_wiki/orig/level_0/level_00/A.png. 
    ** <nowiki>[[File:A.png]]</nowiki> will pick up this file
    ** Note that file names should be unique across the orig directory and its subfolders. If there are two files called A.png in two different subfolders, then XOWA will only use one, and ignore the other.
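
A nested orig directory can be thought of as a lookup from file name to one path. The Python sketch below is only an illustration with a hypothetical helper name, `index_orig_dir`. The page does not say which duplicate XOWA keeps, so keeping the first one found in sorted walk order is purely an assumption of this sketch.

```python
import os


def index_orig_dir(orig_dir: str) -> dict:
    """Map each file name to a single path under orig_dir (subfolders included)."""
    index = {}
    for folder, subfolders, files in os.walk(orig_dir):
        subfolders.sort()  # visit subfolders in a deterministic order
        for name in sorted(files):
            # Assumption: the first file found wins; later duplicates are ignored.
            index.setdefault(name, os.path.join(folder, name))
    return index


if __name__ == "__main__":
    for name, path in sorted(index_orig_dir(".").items()):
        print(name, "->", path)
```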
    
    == Thumb directory ==
    * All thumbs will go into the thumb directory
    * Thumbs can be deleted, and they will be recreated.
    * For nested files, thumbs will be created in a parallel directory
    ** For example, the original file is in /xowa/wiki/my_wiki/orig/level_0/level_00/A.png. 
    ** The thumb file will be created in /xowa/wiki/my_wiki/thumb/level_0/level_00/A.png/30px.png
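
The parallel-directory rule above can be expressed as a small path calculation. This is a sketch, not XOWA's code: `thumb_path` is a hypothetical name, and reusing the original file's extension for the thumb is an assumption taken from the single example on this page.

```python
from pathlib import PurePosixPath


def thumb_path(orig_dir: str, thumb_dir: str, orig_file: str, width: int) -> str:
    """Mirror orig_file's location under orig_dir into thumb_dir,
    then append '<width>px<ext>' below a folder named after the file."""
    orig = PurePosixPath(orig_file)
    relative = orig.relative_to(orig_dir)  # e.g. level_0/level_00/A.png
    # Assumption: the thumb keeps the original's extension.
    return str(PurePosixPath(thumb_dir) / relative / f"{width}px{orig.suffix}")


if __name__ == "__main__":
    print(
        thumb_path(
            "/xowa/wiki/my_wiki/orig/",
            "/xowa/wiki/my_wiki/thumb/",
            "/xowa/wiki/my_wiki/orig/level_0/level_00/A.png",
            30,
        )
    )
```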
    
     

  • Anonymous
    2014-03-04

    I read your home/wiki/Help:File_sources/Directory readme in this ticket.

    I put the following in user_custom_cfg.gfs:


    app.wikis.get('atcolib').files.wkrs.get('fs.dir') {
    orig_dir = '~{<>C:\atcolib\<>}imagedump\';
    thumb_dir = '~{<>C:\atcolib\<>}imagedump\';
    }

    The imagedump folder is at C:\atcolib\imagedump on my machine, so the main xowa folder is at C:\atcolib\. My Windows xowa root folder is at C:\atcolib\xowa_win_v1.3.1.1

    What exactly do I need to change? At present, the images still do not show up.

     
  • gnosygnu
    2014-03-04

    Hi. Thanks for the details.

    First, the diamond syntax (<>) is only used for relative paths. The documentation used an example of <>xowa_root_dir<>

    For absolute paths, just add the path literally. In your case, it would be 'C:\atcolib\imagedump\'

    Second, please use a different directory for the thumb directory. Don't use the same directory as the orig directory. The thumb directory will be auto-created, so you don't have to create it beforehand.

    Third, I don't quite understand your filesystem layout. Is the xowa_windows.jar at C:\atcolib\xowa_win_v1.3.1.1\xowa_windows.jar or in C:\atcolib\xowa_windows.jar?

    I'm assuming it's the latter. If so, your wiki must be at C:\atcolib\wiki\atcolib*.sqlite3. I'm hoping that's the case, but I just want to be clear.

    Finally, here's the recommended script:

    app.wikis.get('atcolib').files.wkrs.get('fs.dir') {
      orig_dir  = 'C:\atcolib\imagedump\';
      thumb_dir = 'C:\atcolib\imagedump_thumbs\';
    }
    

    ... with this filesystem setup ...

    XOWA            C:\atcolib\xowa_windows.jar
    atcolib wiki    C:\atcolib\wiki\atcolib\atcolib.000.sqlite3
    atcolib images  C:\atcolib\imagedump\A.png
    

    If this still doesn't work, please confirm that you're following the above. Also, give me an example of a specific page and image and I'll investigate further

     
  • gnosygnu
    2014-03-05

    I'm going to mark this ticket investigating for now, though I'm fairly certain the steps should resolve the issue.

     

    • Anonymous
      2014-03-05

      Thanks for the clarification on how to write the path. I used a relative path now.

      For the Windows xowa, I used the following text in the gfs file (I updated the text slightly for the OS X and Linux xowa):

      app.wikis.scripts.set
      ( 'override_main_page'
      , 'other~wikimedia' // NOTE: 'wikimedia' is required due to a defect
      ,
      <:['
      app.wikis.get('atcolib').props.main_page = 'AT CoLib Main Page';
      ']
      :>
      );


      app.wikis.get('atcolib').files.wkrs.get('fs.dir') {
      orig_dir = '~{<>\atcolib\<>}xowa_win_v1.3.1.1\imagedump\';
      thumb_dir = '~{<>\atcolib\xowa_win_v1.3.1.1\imagedump\<>}thumb\';
      }

      For my folder hierarchy:
      I used a main folder called atcolib that contains all the other folders. This allows quick copying, for example from a USB stick to a hard disk, or moving everything around on the hard disk.

      In the atcolib folder, there are 6 folders, namely:
      imagedump
      filedump
      deleted images
      xowa_lin_v1.3.1.1
      xowa_win_v1.3.1.1
      xowa_osx_v1.3.1.1

      The xowa_windows.jar is at C:\atcolib\xowa_win_v1.3.1.1\xowa_windows.jar

      The images work now, although not all of them. I'm not sure whether this is because I used a slightly outdated version of ImageMagick or Inkscape.

       
      • gnosygnu
        2014-03-06

        Hi. Your config script is still not correct. I don't even know how you're seeing images.

        When I said the diamond syntax is for relative paths, I should've said XOWA relative paths. The example I gave was "<>xowa_root_dir<>". You can't use this syntax for other strings, such as your "{<>\atcolib\<>}"

        Please try the absolute paths I specified above. If images still don't load, please specify a specific page and image. Ideally, you'd give me the wikitext that doesn't produce the image.

        For example:

        • I have a page that has this wikitext: [[File:A.png]]
        • I have a file in C:\atcolib\imagedump\A.png
        • No image shows

        "In the atcolib folder, there are 6 folders, namely:"

        I've said this in another ticket, but this is not going to work well. You're creating separate directories by OS. XOWA doesn't support this. You have to understand that what you're doing is non-standard. I can't think of any cross-platform application that sets up multiple directories per OS. It just doesn't scale well, and is fraught with confusion.

        If you go ahead and decide to create separate directories for each OS, you'll run into strange problems (for example, the Page history will be different for each OS). You'll also have extra duplicate files, and upgrades will become difficult (you may have to update all three folders instead of one root directory).

        I'd really recommend that you keep one root directory, and instruct your users to use the .zip file for their OS. If you still persist in going your own route, then so be it, but please don't expect any support for this.

         
        Last edit: gnosygnu 2014-03-06

        • Anonymous
          2014-03-06

          I checked the script again (the one you said was incorrect), namely:

          app.wikis.get('atcolib').files.wkrs.get('fs.dir') {
            orig_dir = '~{<>\atcolib\<>}xowa_win_v1.3.1.1\imagedump\';
            thumb_dir = '~{<>\atcolib\xowa_win_v1.3.1.1\imagedump\<>}thumb\';
          }

          I also saw that I made a huge mistake: the imagedump folder is directly under /atcolib/ and not under /xowa_win_v1.3.1.1/. I'm puzzled why this nonetheless more or less worked. The correct code would thus have been:

          app.wikis.get('atcolib').files.wkrs.get('fs.dir') {
            orig_dir  = '~{<>atcolib<>}imagedump';
            thumb_dir     = '~{<>atcolib\imagedump\<>}thumb\';
          }
          

          Using this code didn't improve things, however.

          I tried putting the Windows xowa files directly under /atcolib/, but that didn't improve things (no more images loaded than before). I then also used absolute paths, exactly as you proposed in your previous post:

          app.wikis.get('atcolib').files.wkrs.get('fs.dir') {
            orig_dir = 'C:\atcolib\imagedump\';
            thumb_dir = 'C:\atcolib\imagedump_thumbs\';
          }

          That didn't improve things either (the same limited number of images still show up).

          I'm wondering whether I can use relative pathnames while still keeping my old folder hierarchy. The benefit of my hierarchy is that the OS's are nicely separated, so if a USB stick is used on various computers and other computers have other OS's running, they can simply pick the correct xowa, rather than needing to delete xowa specific folders and replacing them with other xowa folders (to be compatible with that particular OS). I'm thinking of something like this:

          app.wikis.get('atcolib').files.wkrs.get('fs.dir') {
            orig_dir  = '~{<>..\atcolib\<>}\imagedump\';
            thumb_dir     = '~{<>..\atcolib\imagedump\<>}\thumb\';
          }
          

          Another issue I didn't immediately spot is that the main page override no longer works. It now just loads the regular (Appropedia welcome) page instead of AT CoLib Main Page.

          Attached you'll find some images that did and didn't work. I don't have the wikitext, since the images were checked on the Appropedia welcome page, which has weird code (not wikitext, but HTML code to show random images). An overview of which images worked and which didn't:

          works:

          Image:Muhammad Yunus - World Economic Forum Annual Meeting 2012.jpg
          Image:Muir portrait 1872.jpg
          Image:Treadle_pump_malawi.jpg
          File:Wall3.jpg

          doesn't work:
          File:BarrelPlant5.jpg
          File:AleihaDish.jpg
          File:ethanol_production_my_pipebomb.jpg
          File:DoggieDooDigester.JPG
          File:Bikeblender_side.JPG
          File:fullsideview1.jpg
          File:Picture_A.png
          File:Diagrm.jpg
          File:Layout.png

          Weirdly enough, the first two that did work are located in a folder that I didn't indicate at all in my script (the folder deletedimages)

           

  • Anonymous
    2014-03-05

    Now that this issue is solved, I'm reminded of a different problem: putting the offline wiki data in a separate folder (higher up). This reduces file size when transferring everything via the internet, and also makes it much easier to update the xowa folders when a new version of xowa arrives (there's then no need to replace folders individually while making sure no offline wiki data gets removed in the process). See
    http://sourceforge.net/p/xowa/tickets/270/

     
  • gnosygnu
    2014-03-07

    Hi. I'm going to be direct, so there won't be any misunderstanding.

    • Don't use the diamond syntax. It was designed for specific uses, and none of your uses are correct.
    • Don't try any relative path syntax. XOWA does not support relative path syntax.
    • Do not use multiple OS folders. XOWA is designed to run in multiple OS's from one folder. I will list my reasons in a section below.

    Finally, use the following instructions. Please confirm that it works for these instructions. I tested now with these instructions and the images appeared. Please do not deviate from them, until after you've verified that they work.

    • Create a folder called C:\atcolib_test
    • Extract xowa_app_windows_v1.3.1.1.zip to C:\atcolib_test\xowa. You will have a file called C:\atcolib_test\xowa\xowa.exe
    • Copy your atcolib wiki to C:\atcolib_test\xowa\wiki. You will have a file like C:\atcolib_test\xowa\wiki\atcolib\atcolib.000.sqlite3
    • Copy your images to C:\atcolib_test\images. You will have a file called C:\atcolib_test\images\150px.jpg
    • Double-click C:\atcolib_test\xowa\xowa.exe
    • Click on the "Set up images (Windows)" on the Main Page. This will download ImageMagick / Inkscape
    • Go to home/wiki/Help:Options/Files. Make sure "Retrieval enabled" is checked
    • Go to home/wiki/Help:Options/Config_script. Enter the following script

      app.wikis.get('atcolib').files.wkrs.get('fs.dir') {
      orig_dir = 'C:\atcolib_test\images\';
      thumb_dir = 'C:\atcolib_test\images_thumbs\';
      }

    • Go to atcolib/wiki/Project:Sandbox

    • Click Edit
    • Enter the following text:

      [[File:150px.jpg]]
      [[File:Aleiha dish.jpg]]
      [[File:BarrelPlant5.jpg]]
      [[File:Bikeblender side.jpg]]
      [[File:DoggieDooDigester.jpg]]
      [[File:Fullsideview1.jpg]]
      [[File:Muhammad Yunus - World Economic Forum Annual Meeting 2012.jpg]]
      [[File:Muir portrait 1872.jpg]]
      [[File:Recyclebot.png]]
      [[File:Wall3.jpg]]

    • Click preview. All the images should show.


    Disadvantages of multiple OS directories

    • Settings will be saved in each different folder. Your page history on a windows machine will be different than your page history on a linux machine
    • Duplicate files. XOWA has many files that are shared in common. For example, the files in /xowa/user/anonymous/lang/ These files will be repeated for each OS
    • Maintenance. Updates to common files will need to be made to multiple OS's. For example, if the wikidata.js file changes, you will have to update this for each OS.

    I see no reason why one root folder wouldn't work. You said this: "they can simply pick the correct xowa, rather than needing to delete xowa specific folders and replacing them with other xowa folders (to be compatible with that particular OS)". I don't understand what this means, especially the "delete" part.

    This is how I've used XOWA on different OS's with my memory card

    • Create a root folder of /xowa/
    • Unzip the xowa_app_windows, xowa_app_linux, xowa_app_macosx into this same root folder
    • Plug it into a Windows machine. Run "X:\xowa\xowa.exe"
    • Plug it into a Linux machine. Run "sh /media/mnt/SDCARD/xowa/xowa_linux.sh"
    • Plug it into a Mac OS X machine. Run "sh /Volumes/SDCARD/xowa/xowa_macosx.sh"
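
    The per-OS commands above can be summarized in a short Python sketch. This is not part of XOWA: `launch_command` is a hypothetical helper, and it assumes the root folder is passed with its trailing separator. The launcher file names are the ones listed above.

```python
import platform

# Launcher per OS, as listed in the steps above.
LAUNCHERS = {
    "Windows": ([], "xowa.exe"),
    "Linux": (["sh"], "xowa_linux.sh"),
    "Darwin": (["sh"], "xowa_macosx.sh"),  # platform.system() name for Mac OS X
}


def launch_command(root: str, system: str = "") -> list:
    """Return the command for running XOWA from one shared root folder.

    root must end with the path separator of the target OS.
    """
    system = system or platform.system()
    if system not in LAUNCHERS:
        raise ValueError(f"no launcher known for {system!r}")
    prefix, launcher = LAUNCHERS[system]
    return prefix + [root + launcher]


if __name__ == "__main__":
    print(launch_command("X:\\xowa\\", "Windows"))
    print(launch_command("/media/mnt/SDCARD/xowa/", "Linux"))
    print(launch_command("/Volumes/SDCARD/xowa/", "Darwin"))
```

    Only the launcher differs per OS; the root folder and everything else in it are shared.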

    If you have objections to this, please state them clearly. Please try to provide a specific usage example, like mine above.

     

  • Anonymous
    2014-03-07

    [EDIT: by gnosygnu for formatting]

    Hi gnosygnu,
    I followed your instructions and I can confirm that that worked; I then backtracked on what was actually different since last time and it seemed to be something as simple as the pre-tags: removing these in the script made it work.

    I then tried it again with my own folder-hierarchy, and that worked too, so I reinstated these.

    Next, I wanted to clarify those lines I typed ("they can simply pick the correct xowa, rather than needing to delete xowa specific folders and replacing them with other xowa folders (to be compatible with that particular OS)"):

    With your proposed folder hierarchy, the Windows xowa is directly under /atcolib/ without the /xowa_win_v1.3.1.1/ folder in between (so having the xowa.exe at C:\atcolib\xowa.exe and the bin, user, ... folders at /atcolib/bin/, /atcolib/user/).

    Only Windows computers can run this xowa version, so to be able to run xowa on a Linux machine, for example, using the exact same files, you would need to:

    • remove the windows xowa.exe, xowa_build.gfs, xowa.gfs, ... files
    • remove the /bin folder
    • remove all folders in /file/ folder, yet keep /file/atcolib
    • remove /user/, yet move the user\anonymous\app\data\cfg\user_custom_cfg.gfs file to a safe place to reinstate in the linux xowa, in the same location
    • download linux imagemagick and inkscape and unzip to /bin/linux/
    • move /wiki/atcolib to the linux wiki

    That's all doable, but still a fair bit of work, and it needs to be done every time a machine with a different OS is used. It's far simpler to just run xowa_linux.jar at /atcolib/xowa_lin_v1.3.1.1/ on a Linux computer and xowa.exe at /atcolib/xowa_win_v1.3.1.1/ on a Windows computer, for example; there is then no work involved. The only things we really need for this to work well are:

    • the ability to put all offline wiki files at a higher folder (say /atcolib/atcolibdata/ )
    • a changeable path for the user_custom_cfg.gfs file (say at /win_configfiles/, /lin_configfiles/, ...)
     
    Last edit: gnosygnu 2014-03-10

    • Anonymous
      2014-03-07

      [EDIT: by gnosygnu for formatting]
      I forgot: in regards to your method of simply unzipping all the different xowa's to /atcolib/

      • there is no folder like /xowa_win_v1.3.1.1/, /xowa_lin_v1.3.1.1/, /xowa_osx_v1.3.1.1/, so:
      • one does not know which xowa version is in use for Windows, Linux or OS X. The only way to check this is by opening the program and then looking for the version number inside xowa
      • there is a very large number of folders and files. Regular users won't be able to see what's what anymore, let alone update xowa to new versions.
       
      Last edit: gnosygnu 2014-03-10