What is the best way to look for a given link among all links of a page when the link contains spaces or underscores? An example from German Wikibooks (https://de.wikibooks.org):
1- PageList.FillFromAllPages("Wikijunior Europa" ...);
2- Manually remove all pages that do not start with "Wikijunior Europa/"
3- The task is to find all pages in this page list that contain [[File:Flag of Armenia.svg]].
3a- string s = "Datei:Flag of Armenia.svg"; // with spaces
3b- foreach(Page p in PagesList)
3c- p.Load();
3d- listOfLinks = p.GetAllLinks();
3e- if (listOfLinks.Contains(s)) // OK, 2 pages are found
You may ignore for now that s really has to cover four namespace prefixes: Datei/Bild/File/Image. But if s is "Datei:Flag_of_Armenia.svg" with underscores, no page is found at all. What would be the best way to handle this situation?
Possible approaches:
- replace spaces by underscores in listOfLinks and in s before the Contains call
- apply Bot.UrlEncode to listOfLinks and to s before the Contains call
- use a RegEx or LINQ feature instead of Contains (I have little knowledge of RegEx and none of LINQ)
- add a feature to the Bot framework that standardizes links and returns all links in that standardized form (optionally controlled by an additional bool parameter)
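The first approach can be sketched roughly as follows. This is only an illustration, not part of the Bot framework: the class LinkMatcher and its methods Normalize and ContainsLink are hypothetical names, and the sketch assumes the links of a page are available as plain strings. It maps underscores to spaces on both sides of the comparison and, as an extra, treats the four prefixes Datei/Bild/File/Image as the same namespace.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper for illustration; not part of the Bot framework.
public static class LinkMatcher
{
    // The four prefixes named in the post, treated here as equivalent.
    private static readonly string[] FileAliases = { "Datei", "Bild", "File", "Image" };

    // Brings a link title into one canonical form: underscores become spaces,
    // surrounding whitespace is trimmed, and any file prefix becomes "File".
    public static string Normalize(string title)
    {
        string t = title.Replace('_', ' ').Trim();
        int colon = t.IndexOf(':');
        if (colon > 0)
        {
            string ns = t.Substring(0, colon).Trim();
            bool isFileAlias = Array.Exists(FileAliases,
                a => string.Equals(a, ns, StringComparison.OrdinalIgnoreCase));
            if (isFileAlias)
                t = "File:" + t.Substring(colon + 1).Trim();
        }
        return t;
    }

    // True if any link equals the wanted title after both are normalized.
    public static bool ContainsLink(IEnumerable<string> links, string wanted)
    {
        string target = Normalize(wanted);
        return links.Any(link => Normalize(link) == target);
    }
}
```

In the loop from step 3, the check in 3e would then read something like `if (LinkMatcher.ContainsLink(listOfLinks, s))`, assuming listOfLinks is a collection of strings. Normalizing both sides means it no longer matters whether s or the page text uses spaces or underscores.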
I know that a workaround is better in this specific situation: first call PageList.FillFromPagesUsingImage, then remove all pages that do not match the requested page name. But my bot first collects all pages that match a general condition (page name, category, etc.) and then removes all pages that don't match a more specific condition. Moreover, the space vs. underscore problem may appear in many other situations. Therefore I am asking for a more general solution.
Thanks in advance for hints, Juergen
You can do just:
I'll add this correction to FillFromPageLinks().
Thank you for your help and extending FillFromPageLinks(). Juergen
This problem is resolved.