A coworker and I are looking to use ChoiceMaker to deduplicate and link records that are stored in CSV files. We've scoured the documentation as well as some of the source code, and it's not clear to us how to create a schema that relates to a flat file. Could you give us a hand here, or point us to some specific documentation? Also, we're assuming that CM Developer isn't available, so we'd need to craft the various files needed by Analyzer and Server by hand. Is that correct?
Hi Del -
I've just added some example record source and record-pair source files to CVS under the example project:
The files contain exactly the same data as the XML-based sources in the same directory. They use exactly the same schema, SimplePersonRecords.schema, without any alteration. (If you need to work with fixed-width fields rather than delimited fields, you would have to modify the schema; let me know, and I'll provide an example. There are also tools - albeit poorly documented - that can convert data between XML and CSV sources, and between database records and text files. Again, let me know if these conversion tools are a pressing need, and I'll provide documentation.)
In other words, you might not need to do very much to work with your CSV data. Let me know if you have questions or if something is not clear after you've looked at the new example sources.
Regarding CM Developer vs CM Analyzer, these aren't separate applications anymore. CM Developer used to be CM Analyzer plus a special license that unlocked the ChoiceMaker compiler. That license is no longer required, so CM Developer has quietly gone away.
Thanks very much for your reply!