[Dedupe-news] Project Dedupe Newsletter #4
From: Lee T. <lti...@gm...> - 2006-07-03 15:37:51
Good afternoon!

Again, thanks for your continued support, and welcome to Project Dedupe's 4th edition!

What's been going on since the last newsletter:

My time has all been focused on the PHP dedupe script, which is of course available on Subversion (link in the navigation panel of the wiki). I really feel this is progressing well and would appreciate anyone's input on the code so far. The files loop.php, prepare.php and function.php are all used in the identification of match_groups; elect_master.php and commit_dupe.php are then used to walk through the match_groups, flagging master records and false positives.

There are a number of configurables, but some things still need to be progressed. If you've got a dataset you wish to try running the script on but can't fathom how to apply these scripts, please give me a shout!

What's next:

I will continue to focus my attention/time on the PHP scripts, but all of the items on the list below still need progressing!

- Some meat should be added to the Web GUI page(s) on the wiki
- Discussion should commence on what toolkit (if any) may be used for developing the Web GUI

I've been very busy this week and slightly sidetracked, so these are still 'to come'...

- Hopefully v0.1 of the file in/out function for Visual Basic will be posted to Subversion
- Hopefully v0.1 of the country extraction function for Visual Basic will be posted to Subversion

If you haven't already, please register on the wiki and write a few lines about yourself on your user page (http://tickett.net/dedupe/index.php?title=User:username&action=edit replacing username with your username). If you could, please include where you're from (hopefully having members from different countries will enable us to develop a smarter application), what interests you about this project and what you may be able to contribute.

Some stats:

- There are now 9 users registered on the wiki (welcome to the 2 new users since the last newsletter!) and 3 users registered as developers on SourceForge (if you need me to register you as a developer on SourceForge so you can post to Subversion etc., please drop me an e-mail)
- The main page has seen over 2,900 hits since launch (it still seems like a lot of people are hitting the site, possibly even bookmarking it, but aren't yet drawn in enough to register/contribute)
- The most recently edited pages are:
  http://tickett.net/dedupe/index.php/Algorithms
  http://tickett.net/dedupe/index.php/Programming_languages
  http://tickett.net/dedupe/index.php/Existing_software

This, and all previously archived newsletters, can be viewed at http://sourceforge.net/mailarchive/forum.php?forum_id=48618

Thanks
Lee (ltickett)

Project Dedupe
http://dedupe.sourceforge.net
http://sourceforge.net/projects/dedupe