[pure-lang-svn] SF.net SVN: pure-lang:[882] pure-csv
From: <ag...@us...> - 2008-09-27 11:36:02
Revision: 882 http://pure-lang.svn.sourceforge.net/pure-lang/?rev=882&view=rev Author: agraef Date: 2008-09-27 11:35:58 +0000 (Sat, 27 Sep 2008) Log Message: ----------- Initial import. Added Paths: ----------- pure-csv/ pure-csv/branches/ pure-csv/releases/ pure-csv/trunk/ pure-csv/trunk/COPYING pure-csv/trunk/Makefile pure-csv/trunk/README pure-csv/trunk/csv.c pure-csv/trunk/csv.pure pure-csv/trunk/examples/ pure-csv/trunk/examples/read-sample1.csv pure-csv/trunk/examples/read-sample2.csv pure-csv/trunk/examples/read-sample3.csv pure-csv/trunk/examples/read-sample4.csv pure-csv/trunk/examples/read-sample5.csv pure-csv/trunk/examples/read-samples.pure pure-csv/trunk/examples/write-sample1.csv pure-csv/trunk/examples/write-sample2.csv pure-csv/trunk/examples/write-sample3.csv pure-csv/trunk/examples/write-sample4.csv pure-csv/trunk/examples/write-samples.pure Added: pure-csv/trunk/COPYING =================================================================== --- pure-csv/trunk/COPYING (rev 0) +++ pure-csv/trunk/COPYING 2008-09-27 11:35:58 UTC (rev 882) @@ -0,0 +1,340 @@ + GNU GENERAL PUBLIC LICENSE + Version 2, June 1991 + + Copyright (C) 1989, 1991 Free Software Foundation, Inc. + 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA + Everyone is permitted to copy and distribute verbatim copies + of this license document, but changing it is not allowed. + + Preamble + + The licenses for most software are designed to take away your +freedom to share and change it. By contrast, the GNU General Public +License is intended to guarantee your freedom to share and change free +software--to make sure the software is free for all its users. This +General Public License applies to most of the Free Software +Foundation's software and to any other program whose authors commit to +using it. (Some other Free Software Foundation software is covered by +the GNU Library General Public License instead.) You can apply it to +your programs, too. 
+ + When we speak of free software, we are referring to freedom, not +price. Our General Public Licenses are designed to make sure that you +have the freedom to distribute copies of free software (and charge for +this service if you wish), that you receive source code or can get it +if you want it, that you can change the software or use pieces of it +in new free programs; and that you know you can do these things. + + To protect your rights, we need to make restrictions that forbid +anyone to deny you these rights or to ask you to surrender the rights. +These restrictions translate to certain responsibilities for you if you +distribute copies of the software, or if you modify it. + + For example, if you distribute copies of such a program, whether +gratis or for a fee, you must give the recipients all the rights that +you have. You must make sure that they, too, receive or can get the +source code. And you must show them these terms so they know their +rights. + + We protect your rights with two steps: (1) copyright the software, and +(2) offer you this license which gives you legal permission to copy, +distribute and/or modify the software. + + Also, for each author's protection and ours, we want to make certain +that everyone understands that there is no warranty for this free +software. If the software is modified by someone else and passed on, we +want its recipients to know that what they have is not the original, so +that any problems introduced by others will not reflect on the original +authors' reputations. + + Finally, any free program is threatened constantly by software +patents. We wish to avoid the danger that redistributors of a free +program will individually obtain patent licenses, in effect making the +program proprietary. To prevent this, we have made it clear that any +patent must be licensed for everyone's free use or not licensed at all. + + The precise terms and conditions for copying, distribution and +modification follow. 
+ + GNU GENERAL PUBLIC LICENSE + TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION + + 0. This License applies to any program or other work which contains +a notice placed by the copyright holder saying it may be distributed +under the terms of this General Public License. The "Program", below, +refers to any such program or work, and a "work based on the Program" +means either the Program or any derivative work under copyright law: +that is to say, a work containing the Program or a portion of it, +either verbatim or with modifications and/or translated into another +language. (Hereinafter, translation is included without limitation in +the term "modification".) Each licensee is addressed as "you". + +Activities other than copying, distribution and modification are not +covered by this License; they are outside its scope. The act of +running the Program is not restricted, and the output from the Program +is covered only if its contents constitute a work based on the +Program (independent of having been made by running the Program). +Whether that is true depends on what the Program does. + + 1. You may copy and distribute verbatim copies of the Program's +source code as you receive it, in any medium, provided that you +conspicuously and appropriately publish on each copy an appropriate +copyright notice and disclaimer of warranty; keep intact all the +notices that refer to this License and to the absence of any warranty; +and give any other recipients of the Program a copy of this License +along with the Program. + +You may charge a fee for the physical act of transferring a copy, and +you may at your option offer warranty protection in exchange for a fee. + + 2. 
You may modify your copy or copies of the Program or any portion +of it, thus forming a work based on the Program, and copy and +distribute such modifications or work under the terms of Section 1 +above, provided that you also meet all of these conditions: + + a) You must cause the modified files to carry prominent notices + stating that you changed the files and the date of any change. + + b) You must cause any work that you distribute or publish, that in + whole or in part contains or is derived from the Program or any + part thereof, to be licensed as a whole at no charge to all third + parties under the terms of this License. + + c) If the modified program normally reads commands interactively + when run, you must cause it, when started running for such + interactive use in the most ordinary way, to print or display an + announcement including an appropriate copyright notice and a + notice that there is no warranty (or else, saying that you provide + a warranty) and that users may redistribute the program under + these conditions, and telling the user how to view a copy of this + License. (Exception: if the Program itself is interactive but + does not normally print such an announcement, your work based on + the Program is not required to print an announcement.) + +These requirements apply to the modified work as a whole. If +identifiable sections of that work are not derived from the Program, +and can be reasonably considered independent and separate works in +themselves, then this License, and its terms, do not apply to those +sections when you distribute them as separate works. But when you +distribute the same sections as part of a whole which is a work based +on the Program, the distribution of the whole must be on the terms of +this License, whose permissions for other licensees extend to the +entire whole, and thus to each and every part regardless of who wrote it. 
+ +Thus, it is not the intent of this section to claim rights or contest +your rights to work written entirely by you; rather, the intent is to +exercise the right to control the distribution of derivative or +collective works based on the Program. + +In addition, mere aggregation of another work not based on the Program +with the Program (or with a work based on the Program) on a volume of +a storage or distribution medium does not bring the other work under +the scope of this License. + + 3. You may copy and distribute the Program (or a work based on it, +under Section 2) in object code or executable form under the terms of +Sections 1 and 2 above provided that you also do one of the following: + + a) Accompany it with the complete corresponding machine-readable + source code, which must be distributed under the terms of Sections + 1 and 2 above on a medium customarily used for software interchange; or, + + b) Accompany it with a written offer, valid for at least three + years, to give any third party, for a charge no more than your + cost of physically performing source distribution, a complete + machine-readable copy of the corresponding source code, to be + distributed under the terms of Sections 1 and 2 above on a medium + customarily used for software interchange; or, + + c) Accompany it with the information you received as to the offer + to distribute corresponding source code. (This alternative is + allowed only for noncommercial distribution and only if you + received the program in object code or executable form with such + an offer, in accord with Subsection b above.) + +The source code for a work means the preferred form of the work for +making modifications to it. For an executable work, complete source +code means all the source code for all modules it contains, plus any +associated interface definition files, plus the scripts used to +control compilation and installation of the executable. 
However, as a +special exception, the source code distributed need not include +anything that is normally distributed (in either source or binary +form) with the major components (compiler, kernel, and so on) of the +operating system on which the executable runs, unless that component +itself accompanies the executable. + +If distribution of executable or object code is made by offering +access to copy from a designated place, then offering equivalent +access to copy the source code from the same place counts as +distribution of the source code, even though third parties are not +compelled to copy the source along with the object code. + + 4. You may not copy, modify, sublicense, or distribute the Program +except as expressly provided under this License. Any attempt +otherwise to copy, modify, sublicense or distribute the Program is +void, and will automatically terminate your rights under this License. +However, parties who have received copies, or rights, from you under +this License will not have their licenses terminated so long as such +parties remain in full compliance. + + 5. You are not required to accept this License, since you have not +signed it. However, nothing else grants you permission to modify or +distribute the Program or its derivative works. These actions are +prohibited by law if you do not accept this License. Therefore, by +modifying or distributing the Program (or any work based on the +Program), you indicate your acceptance of this License to do so, and +all its terms and conditions for copying, distributing or modifying +the Program or works based on it. + + 6. Each time you redistribute the Program (or any work based on the +Program), the recipient automatically receives a license from the +original licensor to copy, distribute or modify the Program subject to +these terms and conditions. You may not impose any further +restrictions on the recipients' exercise of the rights granted herein. 
+You are not responsible for enforcing compliance by third parties to +this License. + + 7. If, as a consequence of a court judgment or allegation of patent +infringement or for any other reason (not limited to patent issues), +conditions are imposed on you (whether by court order, agreement or +otherwise) that contradict the conditions of this License, they do not +excuse you from the conditions of this License. If you cannot +distribute so as to satisfy simultaneously your obligations under this +License and any other pertinent obligations, then as a consequence you +may not distribute the Program at all. For example, if a patent +license would not permit royalty-free redistribution of the Program by +all those who receive copies directly or indirectly through you, then +the only way you could satisfy both it and this License would be to +refrain entirely from distribution of the Program. + +If any portion of this section is held invalid or unenforceable under +any particular circumstance, the balance of the section is intended to +apply and the section as a whole is intended to apply in other +circumstances. + +It is not the purpose of this section to induce you to infringe any +patents or other property right claims or to contest validity of any +such claims; this section has the sole purpose of protecting the +integrity of the free software distribution system, which is +implemented by public license practices. Many people have made +generous contributions to the wide range of software distributed +through that system in reliance on consistent application of that +system; it is up to the author/donor to decide if he or she is willing +to distribute software through any other system and a licensee cannot +impose that choice. + +This section is intended to make thoroughly clear what is believed to +be a consequence of the rest of this License. + + 8. 
If the distribution and/or use of the Program is restricted in +certain countries either by patents or by copyrighted interfaces, the +original copyright holder who places the Program under this License +may add an explicit geographical distribution limitation excluding +those countries, so that distribution is permitted only in or among +countries not thus excluded. In such case, this License incorporates +the limitation as if written in the body of this License. + + 9. The Free Software Foundation may publish revised and/or new versions +of the General Public License from time to time. Such new versions will +be similar in spirit to the present version, but may differ in detail to +address new problems or concerns. + +Each version is given a distinguishing version number. If the Program +specifies a version number of this License which applies to it and "any +later version", you have the option of following the terms and conditions +either of that version or of any later version published by the Free +Software Foundation. If the Program does not specify a version number of +this License, you may choose any version ever published by the Free Software +Foundation. + + 10. If you wish to incorporate parts of the Program into other free +programs whose distribution conditions are different, write to the author +to ask for permission. For software which is copyrighted by the Free +Software Foundation, write to the Free Software Foundation; we sometimes +make exceptions for this. Our decision will be guided by the two goals +of preserving the free status of all derivatives of our free software and +of promoting the sharing and reuse of software generally. + + NO WARRANTY + + 11. BECAUSE THE PROGRAM IS LICENSED FREE OF CHARGE, THERE IS NO WARRANTY +FOR THE PROGRAM, TO THE EXTENT PERMITTED BY APPLICABLE LAW. 
EXCEPT WHEN +OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES +PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED +OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF +MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE ENTIRE RISK AS +TO THE QUALITY AND PERFORMANCE OF THE PROGRAM IS WITH YOU. SHOULD THE +PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING, +REPAIR OR CORRECTION. + + 12. IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING +WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY AND/OR +REDISTRIBUTE THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES, +INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING +OUT OF THE USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED +TO LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY +YOU OR THIRD PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER +PROGRAMS), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE +POSSIBILITY OF SUCH DAMAGES. + + END OF TERMS AND CONDITIONS + + How to Apply These Terms to Your New Programs + + If you develop a new program, and you want it to be of the greatest +possible use to the public, the best way to achieve this is to make it +free software which everyone can redistribute and change under these terms. + + To do so, attach the following notices to the program. It is safest +to attach them to the start of each source file to most effectively +convey the exclusion of warranty; and each file should have at least +the "copyright" line and a pointer to where the full notice is found. 
+ + <one line to give the program's name and a brief idea of what it does.> + Copyright (C) <year> <name of author> + + This program is free software; you can redistribute it and/or modify + it under the terms of the GNU General Public License as published by + the Free Software Foundation; either version 2 of the License, or + (at your option) any later version. + + This program is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with this program; if not, write to the Free Software + Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA + + +Also add information on how to contact you by electronic and paper mail. + +If the program is interactive, make it output a short notice like this +when it starts in an interactive mode: + + Gnomovision version 69, Copyright (C) year name of author + Gnomovision comes with ABSOLUTELY NO WARRANTY; for details type `show w'. + This is free software, and you are welcome to redistribute it + under certain conditions; type `show c' for details. + +The hypothetical commands `show w' and `show c' should show the appropriate +parts of the General Public License. Of course, the commands you use may +be called something other than `show w' and `show c'; they could even be +mouse-clicks or menu items--whatever suits your program. + +You should also get your employer (if you work as a programmer) or your +school, if any, to sign a "copyright disclaimer" for the program, if +necessary. Here is a sample; alter the names: + + Yoyodyne, Inc., hereby disclaims all copyright interest in the program + `Gnomovision' (which makes passes at compilers) written by James Hacker. 
+ + <signature of Ty Coon>, 1 April 1989 + Ty Coon, President of Vice + +This General Public License does not permit incorporating your program into +proprietary programs. If your program is a subroutine library, you may +consider it more useful to permit linking proprietary applications with the +library. If this is what you want to do, use the GNU Library General +Public License instead of this License. Added: pure-csv/trunk/Makefile =================================================================== --- pure-csv/trunk/Makefile (rev 0) +++ pure-csv/trunk/Makefile 2008-09-27 11:35:58 UTC (rev 882) @@ -0,0 +1,41 @@ + +# Package name and version number: +dist = pure-csv-$(version) +version = 0.1 + +# Try to guess the installation prefix (this needs GNU make): +prefix = $(patsubst %/bin/pure,%,$(shell which pure 2>/dev/null)) +ifeq ($(strip $(prefix)),) +# Fall back to /usr/local. +prefix = /usr/local +endif + +# Platform-specific stuff, edit this as needed. +#PIC = -fPIC # uncomment for x86-64 compilation +DLL = .so # .dll on Windows + +DISTFILES = COPYING Makefile README examples/*.pure examples/*.csv \ +csv.c csv.pure + +all: csv$(DLL) + +csv$(DLL): csv.c + gcc -shared -o $@ $< $(PIC) -lpure + +clean: + rm -f *$(DLL) *~ *.a *.o + +install: + test -d "$(DESTDIR)$(prefix)/lib/pure" || mkdir -p "$(DESTDIR)$(prefix)/lib/pure" + cp csv.pure csv$(DLL) "$(DESTDIR)$(prefix)/lib/pure" + +uninstall: + rm -f "$(DESTDIR)$(prefix)/lib/pure/csv.pure" "$(DESTDIR)$(prefix)/lib/pure/csv$(DLL)" + +dist: + rm -rf $(dist) + mkdir $(dist) && mkdir $(dist)/examples + for x in $(DISTFILES); do ln -sf $$PWD/$$x $(dist)/$$x; done + rm -f $(dist).tar.gz + tar cfzh $(dist).tar.gz $(dist) + rm -rf $(dist) Added: pure-csv/trunk/README =================================================================== --- pure-csv/trunk/README (rev 0) +++ pure-csv/trunk/README 2008-09-27 11:35:58 UTC (rev 882) @@ -0,0 +1,89 @@ + +PURE-CSV - Comma Separated Value interface for the Pure programming language 
+======== = ===== ========= ===== ========= === === ==== =========== ======== + +The CSV library provides an interface to read and write comma separated value +files. The reading and writing functions are loosely based on Python's CSV +module (http://docs.python.org/lib/module-csv.html) + +INSTALLATION + +Run 'make' to compile the module and 'make install' (as root) to install it in +the Pure library directory. This requires GNU make. + +The 'make install' step is only necessary for system-wide installation. 'make' +will try to guess your Pure installation directory, if it gets it wrong, you +can specify the installation prefix as follows: 'make install prefix=/usr'. +Make sure that you get this right, otherwise the Pure interpreter won't be +able to find the installed module. + +USAGE + +Data records are represented as lists of strings and numeric data. Dialects +are created using csv_dialect with a list of specifications outlined in +csv.pure. + +- csv_list (s::string, dialect) + csv_list s::string + + Converts a CSV formatted string s to a list of fields according to the + dialect specified. If no dialect is specified, conversion is performed using + RFC 4180 rules (http://www.ietf.org/rfc/rfc4180.txt). Invalidly formatted + CSV causes return of a 'csv_error msg' term (see NOTES below). + +- csv_str (x:xs) + csv_str ((x:xs), dialect) + + Converts a list of fields, which includes only strings, integers, and + doubles to a CSV formatted string according to the dialect specified. If + no dialect is specified, conversion is performed using RFC 4180 rules + (http://www.ietf.org/rfc/rfc4180.txt). Lists that are not strings, integers, + or floats invoke a 'csv_error msg' term (see NOTES below). + +- csv_fgets (f::pointer, dialect) + csv_fgets f::pointer + + Is equivalent to csv_list except that reading is from file f. + +- csv_fputs ((x:xs), dialect, f) + csv_fputs ((x:xs), f) + + Is equivalent to csv_str except that writing is to file f. 
+ +- csv_fget (name::string, dialect) + csv_fget name::string + + Reads a named file and returns a list of records. These procedures should + only be used on data files that are small enough to fit in the computer's + RAM. + +- csv_fput (name::string, recs, dialect) + csv_fput (name::string, recs) + + Writes a list of records to a named file. Each record is converted according + to the rules stated in the csv_str procedure. + +NOTES + +- Errors in the conversion routines (input that does not abide by the + dialect rules; records containing field types other than strings, integers + and floats) cause a special 'csv_error msg' term to be returned, where msg + is a string describing the particular error. To handle error conditions, + your application should either check for these, or define csv_error to + directly handle the error in some way (e.g., provide a default value, or + raise an exception). For instance: + + csv_error msg = throw msg; + +- MS Excel files should be written using "=""0004""" if leading 0s are + significant. Use the same technique if leading space is significant. Use this + quirk only if written files are going to be imported to MS Excel. + +EXAMPLES + +Examples are provided in the examples subdirectory. See "read-samples.pure" for +reading CSV files and "write-samples.pure" for writing. + +September 24, 2008 +Eddie Rucker +er...@bm...
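The RFC 4180 quoting rule that the README refers to for csv_str can be illustrated with a small stand-alone C sketch. This is not code from pure-csv: the function name quote_field_rfc4180 is hypothetical, and the sketch assumes the default dialect only (comma delimiter, double quote used as both quote and escape), whereas csv.c supports configurable multi-character delimiters.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not part of pure-csv: returns a newly allocated
   copy of `field` encoded as a single RFC 4180 field (the caller frees
   it), or NULL if memory allocation fails. */
char *quote_field_rfc4180(const char *field)
{
    assert(field != NULL);

    /* A field needs surrounding quotes only if it contains a comma,
       a double quote, or a line break. */
    int needs_quotes = strpbrk(field, ",\"\r\n") != NULL;

    /* Worst case: every character is a quote that gets doubled, plus
       the two surrounding quotes and the terminating NUL. */
    size_t len = strlen(field);
    char *out = malloc(2 * len + 3);
    if (out == NULL)
        return NULL;

    char *t = out;
    if (needs_quotes)
        *t++ = '"';
    for (const char *p = field; *p != '\0'; ++p) {
        if (*p == '"')
            *t++ = '"';   /* escape an embedded quote by doubling it */
        *t++ = *p;
    }
    if (needs_quotes)
        *t++ = '"';
    *t = '\0';
    return out;
}
```

Under these assumptions, the field a,b is written as "a,b", and the Excel-style field ="0004" is written as "=""0004""", which is the form quoted in the README's NOTES section.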
Added: pure-csv/trunk/csv.c =================================================================== --- pure-csv/trunk/csv.c (rev 0) +++ pure-csv/trunk/csv.c 2008-09-27 11:35:58 UTC (rev 882) @@ -0,0 +1,452 @@ +/* Port of CSV module from Q to Pure + + Author: Eddie Rucker + Date: July 3, 2008 +*/ + +#include <stdlib.h> +#include <stdio.h> +#include <string.h> +#include <pure/runtime.h> + +#define STRSIZE 128 +#define BUFSIZE 1024 + +#define QUOTE_ALL 0 +#define QUOTE_STRINGS 1 +#define QUOTE_EMBEDDED 2 + +#define error_handler(msg) \ + return pure_app(pure_symbol(pure_sym("csv_error")), pure_cstring_dup(msg)) + +/* Return a CSV record as a Pure string. + Input: + fp: File pointer to read from + quote: Pure string representing CSV quotes + + Output: + pure_string representing a CSV record. The string may have embedded '\n's. + Does not check for badly formated records. + + Exceptions: + csv_error msg is invoked f there is a system memory allocation error, + beyond end of file, or a file error is encountered. +*/ +pure_expr *c_csv_fgets(FILE *fp, char *quote) { + char *bf, *tb, *s; + int quote_count = 0, n_quote; + long sz = BUFSIZE, n = 0; + pure_expr *ret; + + if (!(s = bf = (char *)malloc(sz))) { + error_handler("malloc error"); + } + + n_quote = strlen(quote); + while (1) { + s = bf + n; + if (n > sz) { + if (!(tb = realloc(bf, sz <<= 1))) { + free(bf); + error_handler("realloc error"); + } + s = (bf = tb) + n; + } + tb = fgets(s, BUFSIZE, fp); + if (ferror(fp)) { + free(bf); + return 0; + } + if (tb == NULL) { + if (n == 0) + return NULL; + else + return pure_cstring(bf); + } + n += strlen(s); + if (*(bf+n-1) != '\n') + continue; + while (*s) { + if (!strncmp(s, quote, n_quote)) { + ++quote_count; + s += n_quote; + } else + ++s; + } + if (!(quote_count & 1)) + /* let pure handle freeing bf */ + return pure_cstring(bf); + } +} + +/* Convert a string to a number. + Input: + s: string to be converted. 
+ cvt_flag: 0 -> no conversion, 1 -> attempt conversion + + Output: + Pure int, Pure double, or Pure string depending on cvt_flag=1 and + s obeys strtol and strtod specs. +*/ +pure_expr *convert_string(char *s, int cvt_flag) { + long i; + double d; + char *p; + + if (cvt_flag) { + i = strtol(s, &p, 0); + if (*p == 0) + return pure_int(i); + d = strtod(s, &p); + if (*p == 0) + return pure_double(d); + } + return pure_cstring_dup(s); +} + +#define putfld(len) \ + if (n_fld + len >= fld_sz) { \ + if (!(tfld = (char *)realloc(fld, fld_sz <<= 1))) \ + goto done; \ + fld = tfld; \ + } \ + fldp = fld + n_fld; \ + strncpy(fldp, s, len); \ + n_fld += len \ + + +#define putrec(qt) \ + *(fldp + 1) = 0; \ + if (n_rec >= rec_sz - 1) { \ + if (!(trec = (pure_expr **)realloc(rec, (rec_sz+=64)*sizeof(pure_expr)))) \ + goto done; \ + rec = trec; \ + } \ + rec[n_rec++] = convert_string(fld, qt) + +#define free_params \ + free(delimiter); \ + free(escape); \ + free(quote); \ + free(lineterm); \ + free(elems) + +/* Convert a CSV string to a list of fields + input: + dialect: (Conversion flag, field delimeter char, string delimeter char) + s: CSV formatted string + + Output: record of fields + + Exceptions: + Invokes 'csv_error MSG' if the string is badly formatted, or memory error + + Notes: + \r char is treated as white space except inside "" +*/ +pure_expr *c_csvstr_to_list(char *s, pure_expr *dialect) { + size_t n_elems; + pure_expr **elems, **rec, **trec; + int n, st = 0, fld_sz = 256, n_fld, rec_sz = 64, n_ws = 0, n_rec = 0, + n_delimiter, n_escape, n_quote, n_lineterm, skipspace_f, esc_eq_quote, + quoting_style; + char *fld, *tfld, *fldp, errmsg[80], *delimiter, *escape, *quote, *lineterm; + + if (!(pure_is_listv(dialect, &n_elems, &elems) + && n_elems == 6 + && pure_is_cstring_dup(elems[0], &delimiter) + && pure_is_cstring_dup(elems[1], &escape) + && pure_is_cstring_dup(elems[2], &quote) + && pure_is_int(elems[3], &quoting_style) + && pure_is_cstring_dup(elems[4], &lineterm) + &&
pure_is_int(elems[5], &skipspace_f))) + return 0; + + if (!(fld = (char *)malloc(fld_sz))) { + free_params; + error_handler("malloc error"); + } + + if (!(rec = (pure_expr **)malloc(rec_sz*sizeof(pure_expr)))) { + free(fld); + free_params; + error_handler("malloc error"); + } + n_delimiter = strlen(delimiter); + n_escape = strlen(escape); + n_quote = strlen(quote); + esc_eq_quote = !strcmp(escape, quote); + n_lineterm = strlen(lineterm); + fldp = fld; + while (st < 10) { + switch (st) { + case 0: + fldp = fld; + *fldp = 0; + n_fld = 0; + if (!strncmp(s, delimiter, n_delimiter)) { + putrec(QUOTE_ALL); + s += n_delimiter; + } else if (!strncmp(s, quote, n_quote)) { + s += n_quote; + st = 1; + } else if (!*s || *s == EOF || !strncmp(s, lineterm, n_lineterm)) { + putrec(QUOTE_ALL); + st = 10; + } else if (isspace(*s) && skipspace_f) { + ++s; + } else if (!strncmp(s, escape, n_escape)) { + sprintf(errmsg, "column %d: unexpected escape.", n_fld+1); + st = 20; + } else { + putfld(1); + ++s; + st = 4; + } + break; + case 1: + if (!strncmp(s, quote, n_quote)) { + s += n_quote; + st = 2; + } else if (!*s || *s == EOF) { + sprintf(errmsg, "column %d: expected {%s}.", + n_fld+1, quote); + st = 20; + } else if (!strncmp(s, escape, n_escape)) { + s += n_escape; + putfld(n_escape); + ++s; + } else { + putfld(1); + ++s; + } + break; + case 2: + if (!strncmp(s, quote, n_quote) && esc_eq_quote) { + putfld(n_quote); + s += n_quote; + st = 1; + } else if (!strncmp(s, delimiter, n_delimiter)) { + putrec(QUOTE_ALL); + s += n_delimiter; + st = 0; + } else if (!*s || *s == EOF || !strncmp(s, lineterm, n_lineterm)) { + putrec(QUOTE_ALL); + st = 10; + } else if (isspace(*s)) { + ++s; + st = 3; + } else { + sprintf(errmsg, "column %d: expected {%s}.", + n_fld+1, delimiter); + st = 20; + } + break; + case 3: + if (!strncmp(s, delimiter, n_delimiter)) { + putrec(QUOTE_ALL); + s += n_delimiter; + st = 0; + } else if (!*s || *s == '\n' || *s == EOF) { + putrec(QUOTE_ALL); + st = 10; + } else if 
(isspace(*s)) { + ++s; + } else { + sprintf(errmsg, "column %d: expected {%s}.", + n_fld+1, delimiter); + st = 20; + } + break; + case 4: + if (!strncmp(s, quote, n_quote) || !strncmp(s, escape, n_escape)) { + sprintf(errmsg, "column %d: expected {%s}.", + n_fld+1, delimiter); + st = 20; + } else if (!strncmp(s, delimiter, n_delimiter)) { + fldp -= n_ws; + n_fld -= n_ws; + putrec(quoting_style); + s += n_delimiter; + st = 0; + } else if (!*s || *s == EOF || !strncmp(s, lineterm, n_lineterm)) { + putrec(quoting_style); + st = 10; + } else if (isspace(*s)) { + n_ws = n_ws ? n_ws+1 : 1; + putfld(1); + ++s; + } else { + n_ws = 0; + putfld(1); + ++s; + } + break; + } + } + done: + free(fld); + free_params; + if (st == 10) + return pure_listv(n_rec, realloc(rec, sizeof(pure_expr)*n_rec)); + else { + for (n = 0; n < n_rec; ++n) + pure_free(rec[n]); + free(rec); + if (st==20) + error_handler(errmsg); + error_handler("malloc error"); + + } +} + +#define resize_str \ + if (len > sz) { \ + if (!(ts = (char *)realloc(s, sz <<= 1))) { \ + free(s); \ + error_handler("realloc error"); \ + } \ + s = ts; \ + } \ + t = s + mrk + + +#define insert \ + mrk = len; \ + len += strlen(tb); \ + resize_str; \ + strncpy(t, tb, len - mrk) + +/* Convert list to a CSV formated string + Input: + dialect: (Conversion flag, field delimeter char, string delimeter char) + list: record to be converted + + Output: CSV formatted string + + Exceptions: + Invokes csv_error if no more memory is available or if field cannot be + converted. 
+ + Notes: + \r char is treated as white space except inside "" +*/ +pure_expr *c_list_to_csvstr(pure_expr *list, pure_expr *dialect) { + size_t n_elems; + int i, n, k, sz = 256, mrk, quote_cnt, delim_cnt, lineterm_cnt, len = 0, + skipspace_f, n_escape, n_quote, n_delimiter, n_lineterm, quoting_style, + ival; + char *s, *ts, *p, *sval, tb[48], errmsg[80], *escape, *quote, *delimiter, + *lineterm; + double dval; + pure_expr **elems, **xs; + register char *t; + + if (!(pure_is_listv(dialect, &n_elems, &elems) + && n_elems == 6 + && pure_is_string_dup(elems[0], &delimiter) + && pure_is_string_dup(elems[1], &escape) + && pure_is_string_dup(elems[2], &quote) + && pure_is_int(elems[3], &quoting_style) + && pure_is_string_dup(elems[4], &lineterm) + && pure_is_int(elems[5], &skipspace_f) + && pure_is_listv(list, &n_elems, &xs))) { + return 0; + } + + if (!(s = (char *)malloc(sz))) { + free_params; + error_handler("malloc error"); + } + + n_escape = strlen(escape); + n_quote = strlen(quote); + n_delimiter = strlen(delimiter); + n_lineterm = strlen(lineterm); + for (i = 0; i < n_elems; ++i) { + if (pure_is_int(xs[i], &ival)) { + if (!quoting_style) + sprintf(tb, "%s%d%s%s", quote, ival, quote, delimiter); + else + sprintf(tb, "%d%s", ival, delimiter); + insert; + } else if (pure_is_double(xs[i], &dval)) { + if (!quoting_style) + sprintf(tb, "%s%.16g%s%s", quote, dval, quote, delimiter); + else + sprintf(tb, "%.16g%s", dval, delimiter); + insert; + } else if (pure_is_cstring_dup(xs[i], &sval)) { + quote_cnt = 0; + delim_cnt = 0; + lineterm_cnt = 0; + p = sval; + if (skipspace_f && quoting_style == QUOTE_EMBEDDED) + while (isspace(*p) + && strncmp(p, quote, n_delimiter) + && strncmp(p, delimiter, n_delimiter) + && strncmp(p, lineterm, n_lineterm)) + ++p; + k = p - sval; + mrk = len; + while (*p) { + if (!strncmp(p, quote, n_quote)) { + ++quote_cnt; + p += n_quote; + len += n_escape + n_quote; + } else if (!strncmp(p, delimiter, n_delimiter)) { + ++delim_cnt; + p += n_delimiter; + len
+= n_delimiter; + } else if (!strncmp(p, lineterm, n_lineterm)) { + ++lineterm_cnt; + p += n_lineterm; + len += n_lineterm; + } else { + ++len; + ++p; + } + } + len += n_delimiter; + p = sval + k; + if (quoting_style == QUOTE_EMBEDDED + && !(quote_cnt+delim_cnt+lineterm_cnt)) { + resize_str; + k = len - mrk - 1; + strncpy(t, p, k); + t += k; + } else { + /* Add space for surrounding quotes */ + len += n_quote << 1; + resize_str; + strncpy(t, quote, n_quote); + t += n_quote; + while (*p) { + if (!strncmp(p, quote, n_quote)) { + strncpy(t, escape, n_escape); + t += n_escape; + strncpy(t, quote, n_quote); + t += n_quote; + p += n_quote; + } else + *t++ = *p++; + } + strncpy(t, quote, n_quote); + t += n_quote; + } + strncpy(t, delimiter, n_delimiter); + t += n_delimiter; + } else { + sprintf(errmsg, "field %d: invalid conversion type.", + i+1); + free_params; + error_handler(errmsg); + } + } + mrk = (len -= n_delimiter); /* write over last delimiter */ + len += n_lineterm; + resize_str; + strcpy(t, lineterm); + free_params; + return pure_cstring((char *)realloc(s, len+1)); +} Added: pure-csv/trunk/csv.pure =================================================================== --- pure-csv/trunk/csv.pure (rev 0) +++ pure-csv/trunk/csv.pure 2008-09-27 11:35:58 UTC (rev 882) @@ -0,0 +1,178 @@ +/* This file is part of the Pure programming system. + + The Pure programming system is free software; you can redistribute it and/or + modify it under the terms of the GNU General Public License as published by + the Free Software Foundation; either version 2, or (at your option) any + later version. + + The Pure programming system is distributed in the hope that it will be + useful, but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. 
+
+   You should have received a copy of the GNU General Public License
+   along with this program; if not, write to the Free Software
+   Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
+*/
+
+/* The CSV library provides an interface to read and write comma-separated
+   value files. The reading and writing functions are loosely based on
+   Python's csv module (http://docs.python.org/lib/module-csv.html)
+
+   Author: Robert E. Rucker
+   Email: er...@bm...
+   Date: July 15, 2008
+*/
+
+using "lib:csv";
+using system;
+
+/* User may define csv_error for custom error handling. */
+csv_error msg;
+
+private c_csv_fgets c_csvstr_to_list c_list_to_csvstr;
+
+/* Read a string with embedded '\n's within quotes. No error checking! */
+extern expr *c_csv_fgets(FILE *fp, char *quote);
+
+/* Convert a CSV string to a list (record).
+   s:       CSV formatted string to be converted to a list of fields.
+   Dialect: CSV format specification. If none is given, defaults to
+            RFC4180 for Windows and UNIX for all other OSs.
+   NOTE: Rec must contain ONLY strings, integers, and floating point numbers.
+         If a field is some other type, the 'csv_error MSG' rule is invoked.
+*/
+extern expr *c_csvstr_to_list(char *str, expr *dialect);
+
+/* Convert a list (record) to a CSV string.
+   Rec:     list of fields to be converted to CSV format.
+   Dialect: CSV format specification. If none is given, defaults to
+            RFC4180 for Windows and UNIX for all other OSs.
+   NOTE: Rec must contain ONLY strings, integers, and floating point numbers.
+         If a field is some other type, the 'csv_error MSG' rule is invoked.
+*/
+extern expr *c_list_to_csvstr(expr *list, expr *dialect);
+
+/* Public dialect Options
+CSV_DELIMITER:      Field delimiter. Defaults to ",".
+
+CSV_ESCAPE:         Embedded escape character. Defaults to "\"".
+                    Reading: The escape character is dropped and
+                             the next char is inserted into the field.
+                    Writing: The escape character is written into the
+                             output stream.
+
+CSV_QUOTE:          Quote character. Defaults to "\"".
+                    Note: If embedded quotes are doubled, CSV_ESCAPE must equal
+                          CSV_QUOTE. The csv_dialect function will
+                          automatically set CSV_ESCAPE to CSV_QUOTE if
+                          CSV_ESCAPE is not specified.
+
+CSV_QUOTING_STYLE:  Quoting options. Defaults to CSV_QUOTE_STRINGS.
+                    See the CSV_QUOTE_* constants below.
+
+CSV_LINETERMINATOR: Record terminator. Defaults to "\n"; the CSV_RFC4180
+                    dialect uses "\r\n".
+
+CSV_SKIPSPACE:      Skip white space flag. Defaults to true.
+                    Reading/Writing: If true, white space before fields is
+                                     removed. Quoted fields always retain
+                                     white space.
+*/
+const CSV_QUOTE_ALL = 0;
+const CSV_QUOTE_STRINGS = 1;
+const CSV_QUOTE_EMBEDDED = 2;
+
+const CSV_DELIMITER = 0;
+const CSV_ESCAPE = 1;
+const CSV_QUOTE = 2;
+const CSV_QUOTING_STYLE = 3;
+const CSV_LINETERMINATOR = 4;
+const CSV_SKIPSPACE = 5;
+
+/* Defaults are set to RFC 4180 (http://www.ietf.org/rfc/rfc4180.txt) except
+   for the \r\n pair */
+let csv_defaults = [",", "\"", "\"", 1, "\n", 1];
+
+/* Create a dialect based on the list of dialect options given above.
+   See the CSV_RFC4180 and CSV_EXCEL constants below for example usage.
+*/
+csv_dialect opts
+  = zipwith ((lookup) opts2) (0..5) csv_defaults
+      when
+        opts2 = if any (\(x=>y) -> x==CSV_ESCAPE) opts then
+                  opts
+                else
+                  opts + [CSV_ESCAPE=>(lookup opts CSV_QUOTE "\"")];
+      end
+      with
+        lookup [] k::int v = v;
+        lookup ((x=>y):xs) k::int _ = y if k == x;
+        lookup ((x=>y):xs) k::int v = lookup xs k v;
+      end;
+
+const CSV_RFC4180 = csv_dialect [CSV_LINETERMINATOR => "\r\n"];
+const CSV_UNIX = csv_defaults;
+const CSV_EXCEL = csv_dialect [CSV_QUOTING_STYLE => CSV_QUOTE_EMBEDDED];
+const CSV_DEFAULTS
+  = if (substr sysinfo 0 5) == "mingw" then
+      CSV_RFC4180
+    else
+      CSV_UNIX;
+
+/* List to CSV string conversion functions */
+  csv_str rec@[]
+| csv_str rec@(_:_)
+  = c_list_to_csvstr rec CSV_DEFAULTS;
+
+  csv_str (rec@[], dialect@(_:_))
+| csv_str (rec@(_:_), dialect@(_:_))
+  = c_list_to_csvstr rec dialect;
+
+/* File writing functions */
+  csv_fputs (rec@[], f::pointer)
+| csv_fputs (rec@(_:_), f::pointer)
+  = fputs (c_list_to_csvstr rec CSV_DEFAULTS) f;
+
+  csv_fputs (rec@[], dialect@(_:_), f::pointer)
+| csv_fputs (rec@(_:_), dialect@(_:_), f::pointer)
+  = fputs (c_list_to_csvstr rec dialect) f;
+
+/* CSV string to list conversion functions */
+csv_list s::string
+  = c_csvstr_to_list s CSV_DEFAULTS;
+csv_list (s::string, dialect@(_:_))
+  = c_csvstr_to_list s dialect;
+
+/* File reading functions */
+csv_fgets f::pointer
+  = c_csvstr_to_list (c_csv_fgets f (CSV_DEFAULTS!CSV_QUOTE)) CSV_DEFAULTS;
+csv_fgets (f::pointer, dialect@(_:_))
+  = c_csvstr_to_list (c_csv_fgets f (dialect!CSV_QUOTE)) dialect;
+
+/* Read a whole file at one time */
+csv_fget (name::string, dialect@(_:_))
+  = read (csv_fgets (f, dialect)) []
+      with
+        read s acc = fclose f $$ reverse acc if feof f;
+        read s acc = read (csv_fgets (f, dialect)) (s:acc);
+      end
+      when
+        f = fopen name "r";
+      end;
+
+csv_fget name::string
+  = csv_fget (name, CSV_DEFAULTS);
+
+/* Write a whole file at one time */
+csv_fput (name::string, recs, dialect@(_:_))
+  = write recs f
+      with
+        write [] f = fclose f $$ ();
+        write (x:xs) f = csv_fputs (x, dialect, f) $$ write xs f;
+      end
+      when
+        f = fopen name "w";
+      end;
+
+csv_fput (name::string, recs)
+  = csv_fput (name, recs, CSV_DEFAULTS);

Added: pure-csv/trunk/examples/read-sample1.csv
===================================================================
--- pure-csv/trunk/examples/read-sample1.csv (rev 0)
+++ pure-csv/trunk/examples/read-sample1.csv 2008-09-27 11:35:58 UTC (rev 882)
@@ -0,0 +1,7 @@
+,,4,"0004"
+"embedded
+new
+lines", "4", 0.1, "----"
+"this, has an embedded ," ,, 3.2, "0000"
+"embedded ""quotes""", "-10", 0 , " "
+, , , 2.3e-4 , 54-23
\ No newline at end of file

Added: pure-csv/trunk/examples/read-sample2.csv
===================================================================
--- pure-csv/trunk/examples/read-sample2.csv (rev 0)
+++ pure-csv/trunk/examples/read-sample2.csv 2008-09-27 11:35:58 UTC (rev 882)
@@ -0,0 +1,4 @@
+ 4 "0004"
+"this has an embedded tab" 3.2 "0000"
+" this""quotes""" "-10, 0" " "
+ 2.3e-4 54-23
\ No newline at end of file

Added: pure-csv/trunk/examples/read-sample3.csv
===================================================================
--- pure-csv/trunk/examples/read-sample3.csv (rev 0)
+++ pure-csv/trunk/examples/read-sample3.csv 2008-09-27 11:35:58 UTC (rev 882)
@@ -0,0 +1,4 @@
+ 4 '0004'
+'this has an embedded tab' 3.2 '0000'
+'embedded ''quotes''' '-10' 0 ' '
+ 2.3e-4 54-23
\ No newline at end of file

Added: pure-csv/trunk/examples/read-sample4.csv
===================================================================
--- pure-csv/trunk/examples/read-sample4.csv (rev 0)
+++ pure-csv/trunk/examples/read-sample4.csv 2008-09-27 11:35:58 UTC (rev 882)
@@ -0,0 +1,4 @@
+,,4,"0004"
+"%"a,b%"", 3.2, "0000"
+"a","b","%"%"",9
+4,3,,,-9.4

Added: pure-csv/trunk/examples/read-sample5.csv
===================================================================
--- pure-csv/trunk/examples/read-sample5.csv (rev 0)
+++ pure-csv/trunk/examples/read-sample5.csv 2008-09-27 11:35:58 UTC (rev 882)
@@ -0,0 +1,4 @@
+,,4,"0004"
+"this, has an embedded ," ,, 3.2, "0000"
+embedded ""quotes"", "-10", 0 , " "
+," , , 2.3e-4 , "54-23

Added: pure-csv/trunk/examples/read-samples.pure
===================================================================
--- pure-csv/trunk/examples/read-samples.pure (rev 0)
+++ pure-csv/trunk/examples/read-samples.pure 2008-09-27 11:35:58 UTC (rev 882)
@@ -0,0 +1,27 @@
+#!/usr/local/bin/pure -x
+
+using csv;
+using system;
+
+/* Define a few dialects */
+const QUOTE_ALL = csv_dialect [CSV_QUOTING_STYLE=>CSV_QUOTE_ALL];
+const TAB_DELIM = csv_dialect [CSV_DELIMITER=>"\t"];
+const TAB_DELIM_SINGLE_QUOTE = csv_dialect [CSV_DELIMITER=>"\t", CSV_QUOTE=>"'"];
+const ESC_QUOTES = csv_dialect [CSV_ESCAPE=>"%"];
+
+main
+  = puts "Reading 'read-sample1.csv' (standard CSV)" $$
+    puts (str $ csv_fget "read-sample1.csv") $$
+    puts "\nReading 'read-sample1.csv' (standard CSV, no conversions)" $$
+    puts (str $ csv_fget ("read-sample1.csv", QUOTE_ALL)) $$
+    puts "\nReading 'read-sample2.csv' (tab delimited)" $$
+    puts (str $ csv_fget ("read-sample2.csv", TAB_DELIM)) $$
+    puts "\nReading 'read-sample3.csv' (tab delimited, single quoted)" $$
+    puts (str $ csv_fget ("read-sample3.csv", TAB_DELIM_SINGLE_QUOTE)) $$
+    puts "\nReading 'read-sample4.csv' (escaped quotes)" $$
+    puts (str $ csv_fget ("read-sample4.csv", ESC_QUOTES)) $$
+    puts "\nReading 'read-sample5.csv' (Malformed)" $$
+    puts (str $ csv_fget ("read-sample5.csv", CSV_DEFAULTS)) $$
+    puts "\ndone." $$ ();
+
+main;

Property changes on: pure-csv/trunk/examples/read-samples.pure
___________________________________________________________________
Added: svn:executable
   + *

Added: pure-csv/trunk/examples/write-sample1.csv
===================================================================
--- pure-csv/trunk/examples/write-sample1.csv (rev 0)
+++ pure-csv/trunk/examples/write-sample1.csv 2008-09-27 11:35:58 UTC (rev 882)
@@ -0,0 +1,3 @@
+"this"," that ","23","-3",""
+"a ""b""","c c","10","3.2"," "
+"a, b","","0","0","00"

Added: pure-csv/trunk/examples/write-sample2.csv
===================================================================
--- pure-csv/trunk/examples/write-sample2.csv (rev 0)
+++ pure-csv/trunk/examples/write-sample2.csv 2008-09-27 11:35:58 UTC (rev 882)
@@ -0,0 +1,3 @@
+"this y" " that " 23 -3 ""
+"a ""b""" "c c" 10 3.2 " "
+"a, b" "" 0 0 "00"

Added: pure-csv/trunk/examples/write-sample3.csv
===================================================================
--- pure-csv/trunk/examples/write-sample3.csv (rev 0)
+++ pure-csv/trunk/examples/write-sample3.csv 2008-09-27 11:35:58 UTC (rev 882)
@@ -0,0 +1,3 @@
+'this y' ' that ' 23 -3 ''
+'a ''b''' 'c c' 10 3.2 ' '
+'a, b' '' 0 0 '00'

Added: pure-csv/trunk/examples/write-sample4.csv
===================================================================
--- pure-csv/trunk/examples/write-sample4.csv (rev 0)
+++ pure-csv/trunk/examples/write-sample4.csv 2008-09-27 11:35:58 UTC (rev 882)
@@ -0,0 +1,3 @@
+"this"," that ",23,-3,""
+"a %"b%"","c c",10,3.2," "
+"a, b","",0,0,"00"

Added: pure-csv/trunk/examples/write-samples.pure
===================================================================
--- pure-csv/trunk/examples/write-samples.pure (rev 0)
+++ pure-csv/trunk/examples/write-samples.pure 2008-09-27 11:35:58 UTC (rev 882)
@@ -0,0 +1,44 @@
+#!/usr/local/bin/pure -x
+
+/* To run:
+     At the command line> pure -i write-samples.pure
+     At the prompt ==> main;
+   Using your favorite editor open the write samples to see what was wrought.
+*/
+
+using csv;
+using system;
+
+/* Define a few dialects */
+const QUOTE_ALL = csv_dialect [CSV_QUOTING_STYLE=>CSV_QUOTE_ALL];
+const TAB_DELIM = csv_dialect [CSV_DELIMITER=>"\t"];
+const TAB_DELIM_SINGLE_QUOTE = csv_dialect [CSV_DELIMITER=>"\t", CSV_QUOTE=>"'"];
+const ESC_QUOTES = csv_dialect [CSV_ESCAPE=>"%"];
+
+/* Define a few sample data lists */
+let sample1 = [["this", " that ", 23, -3.0, ""],
+               ["a \"b\"", "c c", 10, 3.2, " "],
+               ["a, b", "", 0, 0.0, "00"]];
+
+let sample2 = [["this\ty", " that ", 23, -3.0, ""],
+               ["a \"b\"", "c c", 10, 3.2, " "],
+               ["a, b", "", 0, 0.0, "00"]];
+
+let sample3 = [["this\ty", " that ", 23, -3.0, ""],
+               ["a 'b'", "c c", 10, 3.2, " "],
+               ["a, b", "", 0, 0.0, "00"]];
+
+main
+  = puts "Writing 'write-sample1.csv' (standard CSV)" $$
+    puts (str $ csv_fput ("write-sample1.csv", sample1)) $$
+    puts "\nWriting 'write-sample1.csv' (standard CSV, no conversions)" $$
+    puts (str $ csv_fput ("write-sample1.csv", sample1, QUOTE_ALL)) $$
+    puts "\nWriting 'write-sample2.csv' (tab delimited)" $$
+    puts (str $ csv_fput ("write-sample2.csv", sample2, TAB_DELIM)) $$
+    puts "\nWriting 'write-sample3.csv' (tab delimited, single quoted)" $$
+    puts (str $ csv_fput ("write-sample3.csv", sample3, TAB_DELIM_SINGLE_QUOTE)) $$
+    puts "\nWriting 'write-sample4.csv' (escaped quotes)" $$
+    puts (str $ csv_fput ("write-sample4.csv", sample1, ESC_QUOTES)) $$
+    puts "\ndone." $$ ();
+
+main;

Property changes on: pure-csv/trunk/examples/write-samples.pure
___________________________________________________________________
Added: svn:executable
   + *
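
Editorial illustration, not part of revision 882: the quoting rule that c_list_to_csvstr applies to string fields can be looked at in isolation with the minimal C sketch below. It is written under simplifying assumptions: the quote and delimiter are single characters, and the escape character equals the quote character (the default dialect), so embedded quotes are doubled. The name quote_field and its parameters are hypothetical and do not appear in csv.c. A field is wrapped in quotes only when it contains the quote, the delimiter, or a line break, which roughly corresponds to the CSV_QUOTE_EMBEDDED style.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of csv.c): write `field` into `out` as one
   CSV field. The field is wrapped in `quote` characters only when it contains
   the quote, the delimiter, or a line break. Embedded quotes are doubled,
   i.e. the escape character is assumed to equal the quote character.
   Returns the number of bytes written (excluding the terminating NUL),
   or (size_t)-1 if `out` is too small. */
size_t quote_field(const char *field, char quote, char delim,
                   char *out, size_t outsz)
{
  int needs_quotes = 0;
  size_t n_quotes = 0, len = strlen(field), need, i, k = 0;

  /* First pass: count embedded quotes and decide whether to quote. */
  for (i = 0; i < len; ++i) {
    if (field[i] == quote) {
      ++n_quotes;
      needs_quotes = 1;
    } else if (field[i] == delim || field[i] == '\n' || field[i] == '\r') {
      needs_quotes = 1;
    }
  }

  /* Space needed: the text, one extra byte per embedded quote, the two
     surrounding quotes if the field is quoted, and the terminating NUL. */
  need = len + (needs_quotes ? n_quotes + 2 : 0) + 1;
  if (need > outsz)
    return (size_t)-1;

  /* Second pass: copy the field, doubling each embedded quote. */
  if (needs_quotes)
    out[k++] = quote;
  for (i = 0; i < len; ++i) {
    if (field[i] == quote)
      out[k++] = quote;
    out[k++] = field[i];
  }
  if (needs_quotes)
    out[k++] = quote;
  out[k] = '\0';
  return k;
}
```

Under these assumptions, calling quote_field on the field a "b" with '"' as quote and ',' as delimiter yields "a ""b""", the same text that appears as the first field of the second row in write-sample1.csv. Counting first and copying second keeps the sketch free of reallocation; csv.c itself grows its buffer through its resize_str macro instead.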