From: Tim W. <tw...@re...> - 2003-06-18 16:37:11
Any thoughts about adding some logic to EditPerson to guess the gender
of a person once their first name has been entered? (This could be
done by maintaining a dict of name:{male count, female count} and
using a sensible confidence threshold.)

Overkill, or useful? Obviously, if the user has touched the selector
themselves we wouldn't change it.

Tim.
*/
From: Don A. <don...@at...> - 2003-06-18 21:02:53
There had been some discussion about this a while ago. At that time, an
idea being floated around was to maintain a list of male/female names to
look for a match. It didn't seem too practical at the time, especially
considering different cultures and languages.

If I understand correctly, what you are proposing is to maintain a map
of first names currently in the database, and use this as a way of
matching names to gender. While it wouldn't help much if you were
starting from scratch, it would "learn" as you progress. This could be
especially useful since family lines tend to use the same names over and
over again.

Am I understanding your idea correctly?

Don

Tim Waugh wrote:

>Any thoughts about adding some logic to EditPerson to guess the gender
>of a person once their first name has been entered? (This could be
>done by maintaining a dict of name:{male count, female count} and
>using a sensible confidence threshold.)
>
>Overkill, or useful? Obviously, if the user has touched the selector
>themselves we wouldn't change it.
>
>Tim.
>*/
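A minimal sketch of the kind of map described above, assuming the database can be walked as (first name, gender) pairs. The function name `build_name_stats`, the sample `people` list, and the gender strings are hypothetical stand-ins for illustration, not names taken from the actual code:

```python
from collections import defaultdict

# Hypothetical gender labels; the thread does not say how the
# database actually encodes them.
MALE, FEMALE, UNKNOWN = "male", "female", "unknown"
_INDEX = {MALE: 0, FEMALE: 1, UNKNOWN: 2}


def build_name_stats(people):
    """Count genders per first name.

    `people` is any iterable of (first_name, gender) pairs, standing in
    for whatever the real database yields while it is read in.
    Returns a dict mapping first name -> [n_male, n_female, n_unknown].
    """
    stats = defaultdict(lambda: [0, 0, 0])
    for first_name, gender in people:
        stats[first_name][_INDEX[gender]] += 1
    return stats


# Sample data, invented for the example.
people = [("John", MALE), ("Mary", FEMALE), ("John", MALE), ("Lee", UNKNOWN)]
stats = build_name_stats(people)
```

Because the counts come only from people already entered, an empty database yields an empty map, which matches the "learns as you progress" behaviour.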
From: Tim W. <tw...@re...> - 2003-06-18 22:06:59
On Wed, Jun 18, 2003 at 02:52:09PM -0600, Don Allingham wrote:

> There had been some discussion about this a while ago. At that time, an
> idea being floated around was to maintain a list of male/female names to
> look for a match. It didn't seem too practical at the time, especially
> considering different cultures and languages.
>
> If I understand correctly, what you are proposing is to maintain a map
> of first names currently in the database, and use this as a way of
> matching names to gender. While it wouldn't help much if you were
> starting from scratch, it would "learn" as you progress. This could be
> especially useful since family lines tend to use the same names over and
> over again.
>
> Am I understanding your idea correctly?

Yes, that was the idea I had. You can look it up in the names-to-gender
dict (which was constructed while the database was read in) to get the
counts:

    def guess_gender(name):
        # Names that have not been seen yet have no counts at all.
        (n_male, n_female, n_unknown) = name_to_gender_stats.get(name, (0, 0, 0))
        if n_male + n_female < 5:
            # Too few to be statistically significant?
            return 'unknown'

        # Any doubt?
        if n_male and not (n_female or n_unknown):
            return 'male'
        if n_female and not (n_male or n_unknown):
            return 'female'

        # Go with the odds (optional).
        # Male share of the people whose gender is known; the
        # denominator is at least 5 at this point.
        male_fraction = float(n_male) / (n_male + n_female)
        threshold = 0.1
        if male_fraction > (1 - threshold):
            return 'male'
        if male_fraction < threshold:
            return 'female'
        return 'unknown'

Of course, whenever a person's gender is altered, or a person added or
removed, the names-to-gender dict should be altered accordingly.

Tim.
*/
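One way the bookkeeping mentioned at the end of this message might look, as a sketch only. The class `NameGenderStats` and its method names are hypothetical and are not taken from the checked-in code; the sketch assumes genders are the strings 'male', 'female' and 'unknown':

```python
GENDERS = ("male", "female", "unknown")


class NameGenderStats:
    """Per-name gender counts that follow edits to the database."""

    def __init__(self):
        # first name -> [n_male, n_female, n_unknown]
        self._counts = {}

    def add(self, name, gender):
        """Record a newly added person."""
        counts = self._counts.setdefault(name, [0, 0, 0])
        counts[GENDERS.index(gender)] += 1

    def remove(self, name, gender):
        """Forget a removed person; raises if no such person was recorded."""
        counts = self._counts.get(name, [0, 0, 0])
        i = GENDERS.index(gender)
        if counts[i] == 0:
            raise ValueError("no %s person named %r recorded" % (gender, name))
        counts[i] -= 1
        if not any(counts):
            # Drop names that no longer occur in the database.
            del self._counts[name]

    def change_gender(self, name, old_gender, new_gender):
        """A gender edit is a removal followed by an addition."""
        self.remove(name, old_gender)
        self.add(name, new_gender)

    def get(self, name):
        """Return (n_male, n_female, n_unknown), all zero for unseen names."""
        return tuple(self._counts.get(name, (0, 0, 0)))


stats = NameGenderStats()
stats.add("Robin", "male")
stats.add("Robin", "female")
stats.change_gender("Robin", "female", "male")
```

Treating a gender change as remove-then-add keeps a single code path for updating the counts, so the three kinds of edit cannot drift out of step with each other.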
From: Tim W. <tw...@re...> - 2003-06-24 14:34:56
I've implemented this, and checked it in. It seems to work for me;
anyone see problems with it?

It'll only guess the gender of people who are being added via the "Edit
Person" dialog (i.e. not people already in the database), and only then
if the user has not touched the gender toggle buttons.

Tim.
*/
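The "only if the user has not touched the toggle" rule can be illustrated without any GUI toolkit. This is a sketch under assumptions, not the checked-in implementation: `GenderField`, its method names, and the stand-in `lookup` guesser are all hypothetical.

```python
class GenderField:
    """Model of a gender selector that accepts guesses until the user sets it."""

    def __init__(self, guesser):
        self.value = "unknown"
        self.touched = False      # becomes True once the user picks a gender
        self._guesser = guesser   # callable: first name -> gender string

    def set_by_user(self, gender):
        """Called when the user clicks one of the gender toggle buttons."""
        self.value = gender
        self.touched = True

    def on_first_name_changed(self, first_name):
        """Called when the first-name entry changes."""
        if self.touched:
            return  # never override an explicit choice
        self.value = self._guesser(first_name)


# Stand-in guesser for the example; a real one would consult the
# names-to-gender counts.
def lookup(name):
    return {"John": "male"}.get(name, "unknown")


field = GenderField(lookup)
field.on_first_name_changed("John")
guessed = field.value

field.set_by_user("female")
field.on_first_name_changed("John")
after_user_choice = field.value
```

Here `guessed` is the value filled in automatically, while `after_user_choice` shows that a later name change leaves the user's selection alone.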