From: Bob A. <apt...@cy...> - 2004-06-08 13:09:16
Hi,

On Mon, 7 Jun 2004 22:23:42 +0200 Jon Åslund <d9...@na...> wrote:

> On Mon, Jun 07, 2004 at 01:34:45PM -0400, Steve Wainstead wrote:
> > Nothing new here except a higher profile for the idea...
> >
> > http://news.netcraft.com/archives/2004/06/04/wikis_the_next_frontier_for_spammers.html
>
> If the goal of the spammer is higher google pagerank you could just
> force every link outside the wiki to a redirect script something like
> this:
>
> http://phpwiki.sourceforge.net/phpwiki/redirect?http://www.veryniceproduct.com
>
> or using google itself like described here:
>
> http://simon.incutio.com/archive/2003/10/13/linkRedirects
> http://simon.incutio.com/archive/2004/05/11/approved
>
> It looks a bit ugly and sometimes you do want to make normal links,
> but I guess it wouldn't hurt if this was enabled by default. It would
> probably mean a lot less spammers if they knew not to even bother with
> phpwikis.

Redirects are fine as long as they don't naively redirect to any URL
(such as
http://phpwiki.sourceforge.net/phpwiki/redirect?http://www.veryniceproduct.com).
An open redirector will be abused worse than the sandbox: spammers will
search for wikis that allow open redirection and use those sites to get
around spam filters. For years the anti-spam community has been working
to get sites such as Yahoo to close their open redirectors due to abuse
(a recent example:
http://rd.yahoo.com/UtcUssn/*http://www.deliveryisguranteed.com).

A redirection system such as Shorl or TinyUrl is less prone to abuse, so
if you decide to go with redirection, please consider encoding the
destination URL in the redirector to prevent trivial abuse.

Another suggestion is to parse out the URLs in a page (at least on pages
that allow anonymous editing) and check their domains against the SURBL
(http://www.surbl.org/), taking care to strip out hostnames and known
redirectors. The implementation guide at
http://www.surbl.org/implementation.html has more specifics.
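
The URL-extraction step could look roughly like the sketch below. It is
in Python rather than PHP and has not been tried against real wiki
markup. The names URL_RE, SCHEME_RE, innermost_url and hostnames_in are
made up for illustration. As a simplification, instead of keeping a list
of known redirectors it treats the last URL embedded in a link as the
real destination, and it does not handle percent-encoded destinations.

```python
import re
from urllib.parse import urlsplit

# http/https URLs up to the next whitespace or common closing delimiter.
URL_RE = re.compile(r"""https?://[^\s<>"'\])]+""", re.IGNORECASE)
SCHEME_RE = re.compile(r"https?://", re.IGNORECASE)


def innermost_url(url):
    """Return the last http(s) URL embedded in `url`.

    Redirector-style links carry their destination inside the link, so
    the last embedded URL is the one worth checking.
    """
    starts = [m.start() for m in SCHEME_RE.finditer(url)]
    return url[starts[-1]:] if starts else url


def hostnames_in(text):
    """Collect the distinct destination hostnames linked from `text`."""
    hosts = []
    for match in URL_RE.findall(text):
        # Drop sentence punctuation that trails the link.
        target = innermost_url(match.rstrip(".,;"))
        try:
            host = urlsplit(target).hostname
        except ValueError:
            continue  # malformed URL, nothing to look up
        if host and host not in hosts:
            hosts.append(host)
    return hosts
```

Feeding it the two redirector examples from this message yields the
hostnames of the destinations, not rd.yahoo.com or
phpwiki.sourceforge.net.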
Basically, if you find a URL like
http://rd.yahoo.com/*http://www.something.hotbarnyardtonermortgage.ac.uk/enlargeyourxerox,
you'll want to reduce that to hotbarnyardtonermortgage.ac.uk and do a
DNS lookup for the A record of
hotbarnyardtonermortgage.ac.uk.sc.surbl.org. If that comes back with
127.0.0.2, the URL is suspect.

The SpamAssassin team has been working on supporting this for the
upcoming 3.x release. SA is written in Perl, but it shouldn't take much
to port the core of their work to PHP. Probably.

Note also that anyone who implements such a thing in a portable fashion
will become a hero of the blog community, because they are suffering
worse from link spamming than the wiki community. At least that's what
the Geeklog people told me when I originally suggested this to them. And
no, I don't have a lot of free time to implement this, and my PHP skills
are rudimentary at best, especially if you want something portable and
reusable.

Another quick and dirty hack to temporarily foil the bots that detect
and mangle the sandbox is to dynamically change the name/URL of the
sandbox to something not so easily guessable (e.g. something different
from SandBox). It kinda goes against the spirit of a Wiki, but there's
little reason to allow the sandbox to be easily linked to or guessed.
Add a serial number to the sandbox URL (alternating between SandBox12,
12SandBox and Sand12Box, where 12 is the serial number; better to md5()
it) and increment that number every time the sandbox is raked. In short,
make the link difficult to guess programmatically.

hth,
--
Bob
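
For anyone who wants a starting point, here is a minimal sketch of the
SURBL check described above, in Python rather than PHP. The function
names and the tiny TWO_LEVEL_TLDS set are assumptions made up for
illustration; a real implementation needs a full list of two-level TLDs
and should take the zone name from the implementation guide instead of
hard-coding the one mentioned in this message. Hostnames that are bare
IP addresses are not handled.

```python
import socket

# Illustrative only: a real list of two-level TLDs is much longer.
TWO_LEVEL_TLDS = {"ac.uk", "co.uk", "org.uk", "com.au"}


def registered_domain(hostname):
    """Strip host labels, e.g. www.something.example.ac.uk -> example.ac.uk."""
    labels = hostname.lower().rstrip(".").split(".")
    keep = 3 if ".".join(labels[-2:]) in TWO_LEVEL_TLDS else 2
    return ".".join(labels[-keep:])


def surbl_query_name(hostname, zone="sc.surbl.org"):
    """Build the DNS name whose A record indicates a listing."""
    return registered_domain(hostname) + "." + zone


def is_listed(hostname, resolve=socket.gethostbyname):
    """Return True if the lookup answers 127.0.0.2.

    `resolve` is injectable so the check can be tested without DNS.
    """
    try:
        answer = resolve(surbl_query_name(hostname))
    except socket.gaierror:
        return False  # no A record (or the lookup failed): treat as unlisted
    return answer == "127.0.0.2"
```

Called on www.something.hotbarnyardtonermortgage.ac.uk, this queries
hotbarnyardtonermortgage.ac.uk.sc.surbl.org, matching the example above.
Treating a failed lookup as "not listed" keeps page saves working when
DNS is down, at the cost of letting some spam through.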