How to find a specific external link inside a page in order to change the link

Help
2015-04-01
2015-04-03
  • Juergen Thomas

    Juergen Thomas - 2015-04-01

    In many cases, a page contains an external link instead of an interwiki link. To change such a link, I need its exact position inside the page text. Unfortunately, these links come in several forms. Examples (from the German Wikipedia):

    Page "Exponentielle Glättung" in Section "Weblinks"
    http://de.wikibooks.org/wiki/Mathematik:_Statistik:_Gl%C3%A4ttungsverfahren
    (http, underscore instead of space, umlaut encoded)

    Page "Klasseneinteilung (Statistik)" in Section "Lageparameter" (footnote 2)
    https://de.wikibooks.org/wiki/Mathematik:%20Statistik:%20Klassierung%20eines%20metrischen%20Merkmals%20mit%20vielen%20verschiedenen%20Ausprägungen
    (https, spaces encoded, umlaut not encoded)

    More variants are possible: spaces and umlauts may be encoded, and sometimes colons too. Page.GetInternalLinks() returns the links as stored, without any conversion.

    The task is to replace the link [[b:Mathematik: Statistik: Glättungsverfahren]], in all its variants, with [[b:Statistik: Glättungsverfahren]] because the book has been renamed. To do so, I have to call p.text.IndexOf(searchtext) for every variant:

    • interwiki link using space or underscore (the simplest variants)
    • http or https (this problem can be neglected, it's simple)
    • external link using space, underscore, or %20
    • external link using umlauts with or without encoding

    I can't find a way to get a single canonical form of such a URL. HttpUtility.UrlDecode() helps only partially. The preferred form would use spaces (or underscores throughout) and unencoded umlauts:

    • //de.wikibooks.org/wiki/Mathematik: Statistik: Glättungsverfahren
    • //de.wikibooks.org/wiki/Mathematik: Statistik: Klassierung eines metrischen Merkmals mit vielen verschiedenen Ausprägungen

    Is there a more direct way than chaining several if - else if - else if statements? Juergen

     
  • CodeDriller

    CodeDriller - 2015-04-03

    I don't see any universal solution in this case. Programmers spend much of their working time coding "if - else if - else if", so don't be surprised, and get used to it here. :)
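
    One way to fold the variants listed above into a single search might be a regular expression generated from the page title. The following is only a sketch built on plain .NET classes; it is not part of DotNetWikiBot, and the names LinkFinder, TitlePattern and Build are hypothetical. It assumes the links point to de.wikibooks.org, that non-ASCII characters are percent-encoded as UTF-8, and that a case-insensitive match on the title is acceptable. The sample strings are modeled on the examples in this thread.

```csharp
using System;
using System.Text;
using System.Text.RegularExpressions;

static class LinkFinder
{
    // Hypothetical helper: turns a page title into a regex fragment that
    // accepts each space as " ", "_" or "%20", and every other character
    // that is not an ASCII letter or digit either raw or percent-encoded.
    public static string TitlePattern(string title)
    {
        var sb = new StringBuilder();
        foreach (char c in title)
        {
            if (c == ' ')
            {
                sb.Append("(?: |_|%20)");
            }
            else if (c < 128 && char.IsLetterOrDigit(c))
            {
                sb.Append(c);
            }
            else
            {
                // Percent-encode the UTF-8 bytes, e.g. "ä" becomes "%C3%A4".
                var encoded = new StringBuilder();
                foreach (byte b in Encoding.UTF8.GetBytes(c.ToString()))
                    encoded.Append('%').Append(b.ToString("X2"));
                sb.Append("(?:").Append(Regex.Escape(c.ToString()))
                  .Append('|').Append(encoded.ToString()).Append(')');
            }
        }
        return sb.ToString();
    }

    // Matches external links (http, https or protocol-relative) and
    // [[b:...]] interwiki links to the given title.
    public static Regex Build(string title)
    {
        string pattern =
            @"(?:(?:https?:)?//de\.wikibooks\.org/wiki/|\[\[b:)" +
            TitlePattern(title) +
            @"(?=[\s\]|#]|$)"; // the title must end here, not continue
        // IgnoreCase lets "%c3%a4" match as well as "%C3%A4";
        // as a side effect the title itself is matched case-insensitively.
        return new Regex(pattern,
            RegexOptions.IgnoreCase | RegexOptions.CultureInvariant);
    }
}

static class Program
{
    static void Main()
    {
        Regex rx = LinkFinder.Build("Mathematik: Statistik: Glättungsverfahren");
        string[] samples =
        {
            "[http://de.wikibooks.org/wiki/Mathematik:_Statistik:_Gl%C3%A4ttungsverfahren Wikibooks]",
            "[https://de.wikibooks.org/wiki/Mathematik:%20Statistik:%20Glättungsverfahren Wikibooks]",
            "[[b:Mathematik: Statistik: Glättungsverfahren]]",
        };
        foreach (string text in samples)
        {
            Match m = rx.Match(text);
            Console.WriteLine(m.Success
                ? "found at " + m.Index + ", length " + m.Length
                : "not found");
        }
    }
}
```

    With a match in hand, m.Index and m.Length give the span inside p.text that has to be replaced, so the same call can stand in for the separate IndexOf searches. The sketch handles only BMP characters (a char-by-char loop would split surrogate pairs), which should be enough for German titles.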

     
  • Juergen Thomas

    Juergen Thomas - 2015-04-03

    Thank you, I'll go on checking all the variants one by one. Juergen

     
