Downloading Media files?

Help
2009-05-18
2013-05-13
  • Wetware Bot
    2009-05-18

    Hello,

    Would someone have any suggestions about the best/easiest way to download the Media files included in a page (i.e. the files referenced in [[Media:...]] tags)?
    For downloading images I use getImages()/downloadImage(), but for Media files I am not sure of the best way to proceed.
    Any help appreciated.

    Cheers.

     
    • CodeDriller
      2009-05-19

      Currently there is no special method to get files included in [[Media:...]] tags, so it has to be done manually. Just add "Media" to the regular expression on line 443 of the DotNetWikiBot.cs file (as of version 2.65), like this:

        wikiImageRE = new Regex(@"\[\[(?i)((File|Image|Media" +

      After that change, and after recompiling DotNetWikiBot.cs, the getImages() function will "see" files in [[Media:...]] tags.

      The DownloadImage() function downloads audio/video files correctly.
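
      To illustrate why adding "Media" to the alternation is enough, here is a minimal standalone sketch. The pattern below is a simplified, hypothetical one written for this example; it is not the actual expression from DotNetWikiBot.cs, and the class and method names are made up for illustration.

      ```csharp
      using System;
      using System.Collections.Generic;
      using System.Text.RegularExpressions;

      public static class MediaLinkDemo
      {
          // Hypothetical simplified pattern, NOT the regex used by DotNetWikiBot.
          // Group 1 is the namespace (File, Image or Media), group 2 is the
          // target name, which ends at the first "|" or "]".
          private static readonly Regex LinkRE = new Regex(
              @"\[\[\s*(File|Image|Media)\s*:\s*([^\]\|]+)",
              RegexOptions.IgnoreCase);

          // Returns "Namespace:Name" for every matching link in the wikitext.
          public static List<string> FindFileTargets(string wikiText)
          {
              var targets = new List<string>();
              foreach (Match m in LinkRE.Matches(wikiText))
              {
                  targets.Add(m.Groups[1].Value + ":" + m.Groups[2].Value.Trim());
              }
              return targets;
          }
      }
      ```

      Without "Media" in the alternation, a link such as [[Media:Report.pdf|the report]] would simply not match, which is presumably why getImages() skipped those files.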

       
    • Wetware Bot
      2009-05-19

      Hi CodeDriller,

      Thanks a lot, this tip really helped and now all the media files are downloaded.

      Nonetheless I am experiencing 2 problems:

      1) The downloaded 'pdf' files are all corrupted: when I open them in a regular text editor, their content turns out to be an XHTML document.

      2) Sometimes there are weird special characters at the end of the downloaded file name.

      Any ideas what could be going wrong?

      Cheers.

       
    • CodeDriller
      2009-05-20

      It seems that the DownloadImage() function needs an upgrade too, in order to handle PDF files properly. Please download the new version of that function from CVS: http://dotnetwikibot.cvs.sourceforge.net/viewvc/*checkout*/dotnetwikibot/framework/DotNetWikiBot.cs
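
      Independently of the library, a quick way to tell whether a downloaded file is a real PDF or an (X)HTML page saved under a .pdf name is to look at its first bytes: PDF files start with the signature "%PDF-". The sketch below uses only standard .NET calls; the class and method names are hypothetical and not part of DotNetWikiBot.

      ```csharp
      using System;
      using System.IO;
      using System.Text;

      public static class DownloadCheck
      {
          // Returns true when the data starts with the PDF signature "%PDF-".
          public static bool LooksLikePdf(byte[] data)
          {
              byte[] magic = Encoding.ASCII.GetBytes("%PDF-");
              if (data == null || data.Length < magic.Length) return false;
              for (int i = 0; i < magic.Length; i++)
              {
                  if (data[i] != magic[i]) return false;
              }
              return true;
          }

          // Reads the first five bytes of a file and applies the same check.
          // A file too short to hold the signature is reported as "not a PDF".
          public static bool FileLooksLikePdf(string path)
          {
              byte[] head = new byte[5];
              using (FileStream fs = File.OpenRead(path))
              {
                  int read = fs.Read(head, 0, head.Length);
                  if (read < head.Length) return false;
              }
              return LooksLikePdf(head);
          }
      }
      ```

      A file that fails this check and begins with "<" is most likely a wiki description page rather than the file itself.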

       
    • Wetware Bot
      2009-05-20

      The downloaded pdf files are OK now, many thanks!

      2 small issues:

      1) For this to work I had to add "|Media" again on line 443. Would it be possible to keep this in the new release, or maybe make it an optional parameter somewhere?

      2) Sometimes (for around 10% of downloaded files) a weird special character is still appended to the end of the file name. When I copy/paste this character into WordPad it looks like a vertical bar with a very small right arrow at the top, something a bit like this:

      |->
      |
      |

      Any idea where this special character could come from?

      Cheers.

       
      • CodeDriller
        2009-05-21

        1) I'll try to find some way to implement that.

        2) I have no idea why that happens. Maybe I could say more if I saw your code.

         
    • Wetware Bot
      2009-05-21

      1) OK many thanks for that :-)

      2) After further investigation I found out that the special characters were in the wikitext itself. I didn't spot them before because the Firefox textbox doesn't display them, but when I copied/pasted the text from the Firefox textbox into WordPad they suddenly appeared. I removed these special characters from the wikitext and now it works fine, so it was a data problem, not a code problem. How these characters ended up in the wikitext in the first place remains a mystery for now; maybe some copy/paste operation went pear-shaped.
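
      For anyone hitting the same thing: the thread never identifies the exact character, but invisible Unicode control or formatting characters (U+200E LEFT-TO-RIGHT MARK is one example of the latter) behave this way. A minimal sketch of filtering them out of a title before using it as a file name, using only standard .NET calls, could look like this. The class and method names are hypothetical.

      ```csharp
      using System;
      using System.Globalization;
      using System.Text;

      public static class TitleCleaner
      {
          // Removes characters in the Unicode "Control" (Cc) and "Format" (Cf)
          // categories, which are normally invisible in a text box.
          // Note that this also drops tabs and line breaks.
          public static string StripInvisible(string s)
          {
              var sb = new StringBuilder(s.Length);
              foreach (char c in s)
              {
                  UnicodeCategory cat = char.GetUnicodeCategory(c);
                  if (cat == UnicodeCategory.Control || cat == UnicodeCategory.Format)
                  {
                      continue;
                  }
                  sb.Append(c);
              }
              return sb.ToString();
          }
      }
      ```

      This only hides the symptom in the bot; cleaning the wikitext itself, as was done here, fixes the data at the source.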

      Anyway thank you for your help !

       
    • CodeDriller
      2009-05-27

      A new function was added in version 2.7:

        GetImagesEx(bool withNameSpacePrefix, bool includeFileLinks)

       
    • Wetware Bot
      2009-06-04

      Hello CodeDriller,

      I've tested version 2.7 and the includeFileLinks parameter in GetImagesEx() works great indeed, many thanks for that :-)

      Also, I previously had a problem downloading a file which had an ampersand in its name, but it now works OK with version 2.7. Thank you for this fix as well :-)
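
      The thread does not say what the 2.7 fix actually changed. One common cause of this kind of problem, offered here only as an assumption, is a file name placed into a URL without escaping, so that "&" is read as a query separator. A minimal sketch with the standard Uri.EscapeDataString call follows; the class name, method name and base address are hypothetical.

      ```csharp
      using System;

      public static class FileUrlDemo
      {
          // Appends a percent-encoded file name to a base address.
          // baseUrl is a placeholder, not a real DotNetWikiBot setting.
          public static string BuildFileUrl(string baseUrl, string fileName)
          {
              return baseUrl.TrimEnd('/') + "/" + Uri.EscapeDataString(fileName);
          }
      }
      ```

      With this, a name like "R&D.pdf" is sent as "R%26D.pdf" instead of being split at the ampersand.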

      All the best,
      Wetware Bot

       
      • CodeDriller
        2009-06-04

        Glad to be of assistance.