Re: [GD-Consoles] username dictionaries
From: Todd S. <to...@ro...> - 2003-05-27 15:07:09
On Tue, 27 May 2003 18:21:27 +0800 "Research (GameBrains)" <res...@ga...> wrote:

> I'm trying to find a dictionary of vulgar, profane and obscene
> usernames so that we can prevent users from signing up for an
> account using one. This must be a solved problem but I can't
> seem to find any resources for this. I thought perhaps the
> console people that hang out in this forum might be more likely
> to know something about this?

If by "solved problem" you mean "quagmire of madness in which many a good programmer has been lost", or perhaps "a solved problem the same way natural language parsing is a solved problem", then yes.

You'll get to something that handles some of the trivial stuff quite quickly, but you'll never get it all. It's the strong AI problem, and you've got people working against you, trying to see what they can slip by your validator. Not only that, but you have people whose legitimate names may well contain substrings that match against your "bad word" dictionary (Sexton, Crapper...). The best you can hope for is to flag suspicious names for later evaluation by a human.

It gets that much worse if you have to internationalize the thing; "shite" (shitay) is the "-te" form of "suru" ("to do") in Japanese, used all the time in ordinary requests, and "phoque" is French for "seal". Every language in the world used for human discourse has its fair share of the vulgar, the profane and the obscene, and in many cases there's bad phonetic crosstalk with "good" words in other languages.

You also have the problem that if you do this kind of filtering, you've legally taken on a policy, which may have wider implications than you think. For example, if you're filtering what people say in the slightest way (even just username validation), in some of the more litigious parts of the world you might find that it opens you up to liability if some legal dispute (harassment? slander? MP3 trading?) comes up between some of your users, or between one of your users and the outside world.

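To make the "flag suspicious names for later evaluation by a human" idea concrete, here is a rough Python sketch of a triage pass. Everything in it is an assumption made up for illustration: the two-entry word list, the surname list, the digit-to-letter table and the 0-3 scoring are placeholders, not anything from a real product. It never rejects a name by itself; it only orders a batch for a reviewer.

```python
# Hypothetical placeholder lists. Real ones would be far larger,
# maintained by people, and would still miss things.
SUSPECT_SUBSTRINGS = {"sex", "crap"}
KNOWN_SURNAMES = {"sexton", "crapper"}

# A few common digit/symbol substitutions (assumed and incomplete).
LEET_MAP = str.maketrans({"4": "a", "3": "e", "1": "i", "0": "o",
                          "5": "s", "7": "t", "@": "a", "$": "s"})


def letters_only(text):
    """Lowercase the text and keep only alphabetic characters."""
    return "".join(ch for ch in text.lower() if ch.isalpha())


def normalize(name):
    """Undo the simple substitutions above, then keep only letters."""
    return letters_only(name.lower().translate(LEET_MAP))


def suspicion(name):
    """Score a name from 0 (no match) to 3 (match hidden by obfuscation)."""
    plain = normalize(name)
    if not any(word in plain for word in SUSPECT_SUBSTRINGS):
        return 0  # nothing matched, which does not mean the name is fine
    if plain in KNOWN_SURNAMES:
        return 1  # probably a real surname, still worth a glance
    if any(word in letters_only(name) for word in SUSPECT_SUBSTRINGS):
        return 2  # matched without any de-obfuscation
    return 3      # matched only after undoing substitutions


def review_queue(names):
    """Return (score, name) pairs for flagged names, most suspicious first."""
    flagged = [(suspicion(n), n) for n in names if suspicion(n) > 0]
    return sorted(flagged, key=lambda pair: -pair[0])


if __name__ == "__main__":
    batch = ["Sexton", "HamsterStuffer", "S3xyBeast", "crapshoot"]
    for score, name in review_queue(batch):
        print(score, name)
```

Note that "HamsterStuffer" scores 0 here and never reaches the reviewer, which is exactly the kind of miss no substring list will fix.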
Fundamentally, however, your biggest problem is your users: anyone who was going to try to use a "bad word" as their user name is going to try to do the same within whatever limits your system imposes. You'll wind up with standard h4x0r speak, rude combinations of allowed words (how do you plan on blocking something like "HamsterStuffer" or "ManPole" or...?), and words that you won't know are offensive until you get mail from the offended. Do you really think you can easily assemble a dictionary of all the racial slurs in the world?

If you really must filter user names, you're going to need a person to do it, and you're probably going to want a tool that deals with batches of names and categorizes them based on suspiciousness. You'll still have lots of misses, and the human reviewer will make mistakes and be subject to sliding standards based on their mood, but that's about the best you can hope for.

Or you could just assign a name, or give them whatever name you find on the billing address. Most users will hate that, though.

If it's for kids, and you really, really want to sand off all the corners, you could always make the user name something like "adjective adjective noun", where you supply the lists from which to pick in clickable form. That solves the profanity problem (unless people can chat in-game, in which case you're screwed anyways...), at the expense of making initial name selection a trying experience for the user: "Sorry, user name 'happy fluffy bunny' is already in use. Sorry, user name 'fluffy happy bunny' is already in use. Sorry, user name...".

    Todd.

--
Todd Showalter
to...@ro...