Save() throws WebException (404) if page title ends with a dot.
On my local intranet site this code causes a WebException:
try {
    Page page = new Page(site, "Art.");
    page.Load(); // works fine!
    page.Save("New text", "MyBot V1.0", true); // Saving is done, but after that a WebException is thrown
}
catch (System.Net.WebException e) {
    Console.WriteLine("WebException {0}", e.ToString());
}
catch (DotNetWikiBot.WikiBotException e) {
    Console.WriteLine("WikiBotException {0}", e.ToString());
}
Output from the above code:
Localization file "DotNetWikiBot.i18n.xml" is missimg. <- spelling!
Logged in as MyBot.
Site: Wiktionary (MediaWiki 1.18.0)
Page "Art." loaded successfully.
PageList filled with bot account's watchlist.
WebException System.Net.WebException: The remote server returned an error: (404) Not Found.
at System.Net.HttpWebRequest.GetResponse()
at DotNetWikiBot.Site.PostDataAndGetResultHTM(String pageURL, String postData)
in e:\Wiktionary\Software\DotNetWikiBot_2.97\DotNetWikiBot.cs:line 986
at DotNetWikiBot.Page.Save(String newText, String comment, Boolean isMinorEdit)
in e:\Wiktionary\Software\DotNetWikiBot_2.97\DotNetWikiBot.cs:line 1629
I am using Visual C# 2010 Express, .NET Framework 4
Sorry, can't reproduce that. No exception is thrown when I'm testing, although I use .NET 2.
You can try the most recent version at: http://dotnetwikibot.cvs.sourceforge.net/viewvc/dotnetwikibot/framework/DotNetWikiBot.cs
Thank you for testing. But I got the same exception with version 98beta. Even changing to .NET 2 didn't help.
Then I changed the order of the parameters in the URL in public void GetEditSessionDataEx()
from api.php?action=query&prop=info&format=xml&intoken=edit&titles=" + HttpUtility.UrlEncode(title));
to api.php?action=query&prop=info&format=xml&titles=" + HttpUtility.UrlEncode(title) + "&intoken=edit");
So if there is a dot at the end of the title it is not the last character in the URL.
This worked!
Sorry for my last post, I fooled myself. It still doesn't work, but now it seems to be a server issue.
For testing purposes I added an entry with the title "Art" (without the dot at the end) to my wiki.
Surprisingly, this makes editing the entry "Art." work.
I added some debug code before and after the GetResponse() call in the PostDataAndGetResultHTM() function and watched the Address field of webReq:
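Console.WriteLine("Address: {0}", webReq.Address); // debug: address before the request
HttpWebResponse webResp = null;
for (int errorCounter = 0; true; errorCounter++) {
    try {
        webResp = (HttpWebResponse)webReq.GetResponse();
        Console.WriteLine("Response: {0}", webReq.Address); // debug: address after redirects
        break;
    }
    catch (WebException e) {
        Console.WriteLine("Exception: {0}", webReq.Address); // debug: address that failed
        // ... (the framework's existing error handling continues here)
    }
}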
I get this result if the entry "Art" exists in my wiki and I try to save the entry "Art.":
Address: http://192.168.2.106/mediawiki/index.php?title=Art.&action=submit
Response: http://192.168.2.106/mediawiki/index.php/Art
Page "Art." saved successfully.
I get this result if the entry "Art" does not exist in my wiki and I try to edit the entry "Art.":
Address: http://192.168.2.106/mediawiki/index.php?title=Art.&action=submit
Exception: http://192.168.2.106/mediawiki/index.php/Art
WebException System.Net.WebException: The remote server returned an error: (404) Not Found.
It seems as if the server is somehow stripping the trailing dot from the request address and internally searching the database for an entry without the dot. If that entry doesn't exist, it generates a 404 error.
I am still examining this strange behaviour.
Well, at least I got it working! The short story is, it looks like a .NET problem:
I had to comment out the line
//if (Bot.isRunningOnMono) // Mono bug 636219 evasion
so that webReq.AllowAutoRedirect is set to false for my bot, although it is running on Windows 7 and not using Mono.
Warning! Doing this changes the return value of the PostDataAndGetResultHTM() function. It is no longer the content of the edited page. I didn't need it anyway, but a better solution would be to handle the 302 response in the code and start a GET request with the correct new location address from the POST response.
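The manual handling suggested above could look roughly like the following. This is only a minimal sketch: ManualRedirectSketch, IsRedirect and PostAndGetRedirectTarget are hypothetical names, not part of DotNetWikiBot, and the form content type is an assumption about what the framework sends. It stops at returning the "Location" value, because a follow-up request created from that string would presumably pass through System.Uri again and could lose the trailing dot in the same way.

```csharp
using System;
using System.IO;
using System.Net;
using System.Text;

// Hypothetical helper for illustration; not part of DotNetWikiBot.
public static class ManualRedirectSketch
{
    // True for any 3xx status code (302 Found is the one seen after "action=submit").
    public static bool IsRedirect(HttpStatusCode status)
    {
        int code = (int)status;
        return code >= 300 && code < 400;
    }

    // Sends the POST with automatic redirects disabled and returns the raw
    // "Location" header of a redirect response, or null if the server did not redirect.
    public static string PostAndGetRedirectTarget(string url, string postData, CookieContainer cookies)
    {
        HttpWebRequest request = (HttpWebRequest)WebRequest.Create(url);
        request.Method = "POST";
        request.ContentType = "application/x-www-form-urlencoded"; // assumed content type
        request.CookieContainer = cookies;
        request.AllowAutoRedirect = false; // keep .NET from following the 302 itself

        byte[] body = Encoding.UTF8.GetBytes(postData);
        request.ContentLength = body.Length;
        using (Stream stream = request.GetRequestStream())
        {
            stream.Write(body, 0, body.Length);
        }

        // With AllowAutoRedirect = false, a 3xx response is returned to the caller.
        using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
        {
            return IsRedirect(response.StatusCode) ? response.Headers["Location"] : null;
        }
    }
}
```

The caller would then decide how to fetch the returned address, for example by logging it first to check whether the trailing dot is still present.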
The long story is:
A deep inspection of what is going on at the HTTP protocol level told me this:
After sending the POST request "action=submit", the MediaWiki web server responds with status code 302 (Found) and provides a new address in the "Location" response header. With the default setting (AllowAutoRedirect=true), the .NET HttpWebRequest.GetResponse handles this internally and starts a GET request to the new address. But if this address ends with a dot, the dot is stripped off and a GET request without the dot is started. Depending on whether that page exists, the new request returns 200 (OK) or 404 (Not Found). That's it: a problem in the HttpWebRequest class.
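A quick way to check whether a given runtime alters such an address is to parse it and compare the result with the input. The following is a minimal sketch; TrailingDotCheck and its methods are hypothetical names, and the redirect address used in the usage note is inferred from the debug output above rather than taken from the server. No expected output is given because the result depends on the .NET version in use.

```csharp
using System;

// Hypothetical diagnostic helper for illustration.
public static class TrailingDotCheck
{
    // Returns true if System.Uri reproduces the URL text exactly as given.
    public static bool UriKeepsUrl(string url)
    {
        Uri parsed = new Uri(url);
        return parsed.AbsoluteUri == url;
    }

    // Builds a short report showing the input and what System.Uri made of it.
    public static string Describe(string url)
    {
        return string.Format("Input:  {0}\nParsed: {1}\nUnchanged: {2}",
            url, new Uri(url).AbsoluteUri, UriKeepsUrl(url));
    }
}
```

Calling Console.WriteLine(TrailingDotCheck.Describe("http://192.168.2.106/mediawiki/index.php/Art.")) on the affected machine should show whether the dot survives parsing.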
That's a really deep investigation!
I can't believe it. I spent hours and hours hunting a bug that Microsoft has known about since December 2008, and it still isn't fixed. But there is a workaround, and it works, so there is no need to change the DotNetWikiBot Framework. The code posted on this page uses Reflection to change a bit in the UriParser. I call this code from an init function in my bot. You might think about including this workaround in an init function in the DotNetWikiBot Framework to help others using it.
https://connect.microsoft.com/VisualStudio/feedback/details/386695/system-uri-incorrectly-strips-trailing-dots
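For readers who cannot open the link, a workaround of this kind might look like the sketch below. It is not a copy of the code on the linked page. It assumes the private members commonly cited for this bug on the .NET Framework of that time: the non-public static method UriParser.GetSyntax, the private field m_Flags, and the flag value 0x1000000 (canonicalize as file path). These are internals that can change between versions. The class name UriTrailingDotWorkaround is hypothetical. Where the members are missing, the method does nothing.

```csharp
using System;
using System.Reflection;

// Hypothetical init helper; relies on private .NET internals (assumption, see lead-in).
public static class UriTrailingDotWorkaround
{
    private const int CanonicalizeAsFilePath = 0x1000000; // assumed internal flag value

    // Clears the flag on the http and https parsers.
    // Returns the number of parsers that were actually changed.
    public static int Apply()
    {
        MethodInfo getSyntax = typeof(UriParser).GetMethod(
            "GetSyntax", BindingFlags.Static | BindingFlags.NonPublic,
            null, new Type[] { typeof(string) }, null);
        FieldInfo flagsField = typeof(UriParser).GetField(
            "m_Flags", BindingFlags.Instance | BindingFlags.NonPublic);
        if (getSyntax == null || flagsField == null)
            return 0; // internals not found on this runtime: leave everything untouched

        int changed = 0;
        foreach (string scheme in new string[] { "http", "https" })
        {
            object parser = getSyntax.Invoke(null, new object[] { scheme });
            if (parser == null)
                continue;

            int flags = Convert.ToInt32(flagsField.GetValue(parser));
            if ((flags & CanonicalizeAsFilePath) == 0)
                continue; // flag already cleared

            int cleared = flags & ~CanonicalizeAsFilePath;
            // The field may be an enum type, so convert back before storing.
            object updated = flagsField.FieldType.IsEnum
                ? Enum.ToObject(flagsField.FieldType, cleared)
                : (object)cleared;
            flagsField.SetValue(parser, updated);
            changed++;
        }
        return changed;
    }
}
```

It would be called once at startup, before the first Uri or HttpWebRequest is created, for example UriTrailingDotWorkaround.Apply(); at the top of the bot's Main method.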
That's why so many people don't like Microsoft.