From: Kevin K. <kk...@ny...> - 2009-02-11 01:24:59
|
Patrick Dunnigan wrote:
> Here's the background:
>
> I have an enterprise-grade software solution that has been in production use
> for 6+ years, the most important component of which is written in TCL / Expect.
>
> We've been using various methods to connect to databases over the years,
> from ODBC bridges to TCLtoDBI Perl to our own connectivity library to
> connect to SQL Server / Sybase (through freeTDS), Oracle OCI, SQLite, MySQL
> etc. Kind of a home grown ODBC without needing Unix ODBC. Works very well
> and is very fast.
>
> Well we need to port to Windows and I would like to explore the possibility
> of not using the home grown connectivity library. I can use TclODBC for the
> SQL Server piece but it doesn't look like it's in active development and I'd
> like to make sure we stay with something active. TDBC is now popping up so
> I'd like to see if it fits the needs.

Exciting. I want to help.

> I downloaded ActiveState 8.6b1. Tested TclODBC connectivity to SQL Server
> 2005 Enterprise, works fine.

Half the battle. If ODBC is going, tdbc::odbc will work fine.

> I then moved onto tdbc, tried passing in a connection string with no
> success. I then realized that I didn't have enough information on the
> command syntax. I looked for man pages, etc. etc. and didn't locate them.

Yes, I see that the Webification of the man pages is not happening, and I'm
not sure why, but I'll track it down.

> I then downloaded the zip file referenced by the link below, ran the install
> and had errors at every step, couldn't create directories, etc. I edited the
> install script and removed the -force on the file mkdir commands, still had
> errors so I decided to go for help.
>
> The line I decided to stop on was: file mkdir [file join $::tcl_library
> tcl8/8.6/tdbc]. I didn't have confidence that this was the proper path.
> The path it would create in my case would be:
>
> C:\Tcl\lib\tcl8.6\tcl8\8.6\tdbc

Ouch, that's not right; looks as if I've got a regression in the installer.
Let me check it out and get back to you.

> 1. is this the most recent and accurate install script?

Recent, yes. Accurate, obviously not. Let me get it fixed and get back to you.

> 2. How would I pass a SQL Server ODBC connection string to the tdbc
> connection command?

    tdbc::odbc::connection create db $connectionString

and 'db' becomes a TDBC connection handle. Replace 'db' with whatever command
name you'd like it to have.

> 3. You mentioned man pages are there, but I don't see them in the Active
> State 8.6b1 install or on the web. Can you point me to them?

I don't see them on www.tcl.tk, either - let me get with Jeff and Andreas and
see if they can track down the reason they're not showing up. I'll put HTML
ones up for you and post back here again.

--
73 de ke9tv/2, Kevin
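For readers following the thread, here is a minimal sketch of the call shape Kevin describes, wrapped in a complete script. The DSN-less connection string (driver name, server, credentials) is an illustrative assumption, not something verified against any particular ODBC driver version; substitute whatever string your driver actually accepts.

    package require tdbc::odbc

    # Illustrative DSN-less connection string; the key names and values
    # below are placeholders, not verified against any driver.
    set connectionString {DRIVER={SQL Server};SERVER=myserver,1433;DATABASE=test;UID=test;PWD=secret}

    # 'db' becomes the TDBC connection handle (any command name works).
    tdbc::odbc::connection create db $connectionString

    # Run a trivial query and print the rows as a list of dicts.
    puts [db allrows {SELECT 1 AS one}]

    db close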
From: Patrick D. <pat...@ac...> - 2009-02-10 19:41:14
|
I have tdbc odbc loaded, but I am still trying to work out the exact connection syntax. I am expecting to pass the DSN-less connection string but getting an error as you can see below:

    % set conn_string "DRIVER=SQL Server;SERVER=192.168.2.105,1433;DBQ=test;UID=test;PWD=test "
    DRIVER=SQL Server;SERVER=192.168.2.105,1433;DBQ=test;UID=test;PWD=test
    % tdbc::odbc::connection create db $conn_string
    unknown method "init": must be allrows, close, columns, destroy, eval, foreach, prepare, preparecall, resultsets, statements, tables, transaction, typemap, unknown, variable or varname

I know this is the correct connection string, I have tested it out with other software. I am thinking that I am not using the tdbc::odbc::connection command the right way for ODBC connections.

Thanks,

Patrick Dunnigan
Email: pat...@ac...
Visit us at www.activecompliance.com
From: Kevin K. <kk...@ny...> - 2009-02-10 15:40:48
|
Larry W. Virden wrote: > I know I went out to http://tdbc.tcl.tk/ this morning, and found a > minimal amount of info on the wiki. However, I noticed that when I > built Tcl 8.6 cvs head (obtained from the activestate ftp site) there > were 5 man pages tdbc.n, tdbc_connection.n, tdbc_statement.n, > tdbc_resultset.n, tdbc_tokenize.n that are installed when I > configured, built, and then installed the code in tcl/pkgs/tdbc/ . > > I also note that there doesn't appear to be any drivers that are in > the tcl 8.6 source distribution right now. When I poke around on the > tdbc.tcl.tk site, I see a set of hyperlinks to code. However, I'm > uncertain exactly how one gets the code . Windows drivers are at http://preview.tinyurl.com/db2zw8 . Unix users can build from the ZIP files, downloadable by following the "If you're here to get the sources" instructions on the tdbc.tcl.tk front page. It's all still beta, of course. > > Also, is anyone working on oracle or sybase drivers for tdbc yet? > Just curious. I don't know of any. I'm not the right one to do Sybase, since I don't have access to a Sybase system and haven't used it in years. It's certainly needed. (Is "SQL Server Native API" still enough like Sybase that a common code base could be used for both?) A stopgap would be to use the ODBC bridge. A slightly better stopgap would be to do a pure Tcl driver wrapped around sybtcl, much as the SQLite3 driver is wrapped around SQLite3's native binding. Someone at the conference raised a hand when I asked if anyone was planning an Oracle driver, and then I heard nothing further. Really, people, I can't do it all myself! -- 73 de ke9tv/2, Kevin |
From: Kevin K. <kk...@ny...> - 2009-02-10 15:40:45
|
Patrick Dunnigan wrote:
> Hi, I have read tip 308 and various other web pages but can't seem to
> find any comprehensive documentation on the use of TDBC and the small
> bits of example code that I have found seem to be conflicting. I can
> help with the testing effort but need some information on how to use
> TDBC in my scripts.

Unfortunately, there isn't very much tutorial information out there. The manual pages *are* there, and if you install drivers (which you have to do to actually *use* tdbc) their manual pages will be there, too. 308 didn't change very much between TIP and implementation, but I agree that TIP prose isn't the easiest thing in the world to read.

How far have you got? I don't think that an email thread is the best place to start developing a tutorial, but perhaps we can Wikify the missing knowledge and whack it into a decent form. That will be easiest to do starting from specific questions.

In any case, if you haven't already, you need to:

(a) Either have 8.6b1 (or later), or else have 8.5 plus the tcloo extension and the tdbc extension.

(b) Have a SQL database that you can use. Windows users always have Jet (the engine underlying Microsoft Access) available, even if they haven't bought Access. SQLite runs everywhere. And TDBC has a MySQL driver.

(c) Have the driver for the database. Windows users can get binaries for the drivers that are so far available (ODBC, MySQL and SQLite3) from http://preview.tinyurl.com/db2zw8 Unix users at this point still have to build from source. Source ZIPs are available from http://tdbc.tcl.tk/

(d) Once you have a driver built and operational, or if you get stuck, start asking questions! There isn't that much there, but I've probably neglected to document some key piece that's making you stumble.

A summary example will give the basic idea:

    tdbc::sqlite3::connection create db /path/to/my/database.sql

    db allrows {
        CREATE TABLE people(
            name VARCHAR(40),
            phone VARCHAR(16)
        )
    }

    set stmt [db prepare {
        INSERT INTO people(name, phone) VALUES(:name, :phone)
    }]
    foreach {name phone} {
        fred   555-1234
        barney 555-1235
        wilma  555-1234
        betty  555-1235
    } {
        $stmt allrows
    }

    db foreach row {SELECT phone FROM people where NAME='fred'} {
        if {[dict exists $row phone]} {
            puts [dict get $row phone]
        } else {
            puts "fred has no phone"
        }
    }

    db close
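As a follow-on to the summary example, here is a brief sketch of the [db transaction] form that comes up later on this page in the TIP 308 discussion; it assumes 'db' is the sqlite3 connection created in the example above.

    # Both statements commit together, or roll back together if the body
    # throws an error (see the transaction discussion further down this page).
    db transaction {
        db allrows {INSERT INTO people(name, phone) VALUES('pebbles', '555-1236')}
        db allrows {DELETE FROM people WHERE name = 'betty'}
    }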
From: Patrick D. <pat...@ac...> - 2009-02-10 15:24:18
|
Here's the background:

I have an enterprise-grade software solution that has been in production use for 6+ years, the most important component of which is written in TCL / Expect.

We've been using various methods to connect to databases over the years, from ODBC bridges to TCLtoDBI Perl to our own connectivity library to connect to SQL Server / Sybase (through freeTDS), Oracle OCI, SQLite, MySQL etc. Kind of a home grown ODBC without needing Unix ODBC. Works very well and is very fast.

Well we need to port to Windows and I would like to explore the possibility of not using the home grown connectivity library. I can use TclODBC for the SQL Server piece but it doesn't look like it's in active development and I'd like to make sure we stay with something active. TDBC is now popping up so I'd like to see if it fits the needs.

I downloaded ActiveState 8.6b1. Tested TclODBC connectivity to SQL Server 2005 Enterprise, works fine.

I then moved onto tdbc, tried passing in a connection string with no success. I then realized that I didn't have enough information on the command syntax. I looked for man pages, etc. etc. and didn't locate them.

I then downloaded the zip file referenced by the link below, ran the install and had errors at every step, couldn't create directories, etc. I edited the install script and removed the -force on the file mkdir commands, still had errors so I decided to go for help.

The line I decided to stop on was: file mkdir [file join $::tcl_library tcl8/8.6/tdbc]. I didn't have confidence that this was the proper path. The path it would create in my case would be:

    C:\Tcl\lib\tcl8.6\tcl8\8.6\tdbc

Therefore, my questions would be:

1. Is this the most recent and accurate install script?
2. How would I pass a SQL Server ODBC connection string to the tdbc connection command?
3. You mentioned man pages are there, but I don't see them in the ActiveState 8.6b1 install or on the web. Can you point me to them?

Thanks for your response.

Patrick Dunnigan
Email: pat...@ac...
Visit us at www.activecompliance.com

-----Original Message-----
From: Kevin Kenny [mailto:kk...@ny...]
Sent: Tuesday, February 10, 2009 9:53 AM
To: pat...@ac...
Cc: tcl...@li...
Subject: Re: [Tcl-tdbc] TDBC documentation, samples
From: Larry W. V. <lv...@gm...> - 2009-02-10 14:20:31
|
On Tue, Feb 10, 2009 at 8:26 AM, Patrick Dunnigan <pat...@ac...> wrote: > Hi, I have read tip 308 and various other web pages but can't seem to find > any comprehensive documentation on the use of TDBC and the small bits of > example code that I have found seem to be conflicting. I can help with the > testing effort but need some information on how to use TDBC in my scripts. > I know I went out to http://tdbc.tcl.tk/ this morning, and found a minimal amount of info on the wiki. However, I noticed that when I built Tcl 8.6 cvs head (obtained from the activestate ftp site) there were 5 man pages tdbc.n, tdbc_connection.n, tdbc_statement.n, tdbc_resultset.n, tdbc_tokenize.n that are installed when I configured, built, and then installed the code in tcl/pkgs/tdbc/ . I also note that there doesn't appear to be any drivers that are in the tcl 8.6 source distribution right now. When I poke around on the tdbc.tcl.tk site, I see a set of hyperlinks to code. However, I'm uncertain exactly how one gets the code . Also, is anyone working on oracle or sybase drivers for tdbc yet? Just curious. -- Tcl - The glue of a new generation. http://wiki.tcl.tk/ Larry W. Virden http://www.purl.org/net/lvirden/ http://www.xanga.com/lvirden/ Even if explicitly stated to the contrary, nothing in this posting should be construed as representing my employer's opinions. |
From: Patrick D. <pat...@ac...> - 2009-02-10 13:53:45
|
Hi, I have read tip 308 and various other web pages but can't seem to find any comprehensive documentation on the use of TDBC and the small bits of example code that I have found seem to be conflicting. I can help with the testing effort but need some information on how to use TDBC in my scripts.

Thanks

Patrick Dunnigan
Email: pat...@ac...
Visit us at www.activecompliance.com
From: Kevin K. <ke...@ac...> - 2009-01-11 02:47:35
|
Kevin Kenny wrote: > Because building TEA modules on Windows is problematic for so many > people, I've packaged pre-built binaries of the packages: > > tdbc::mysql > tdbc::odbc > tdbc::sqlite3 > > for 32-bit Windows. They are available from SourceForge at: > > https://sourceforge.net/project/showfiles.php?group_id=10894&package_id=305160&release_id=651729 > (or http://preview.tinyurl.com/7j8zws) The buggy INSTALL.tcl script in the release ZIP has now been corrected. -- 73 de ke9tv/2, Kevin |
From: Kevin K. <ke...@ac...> - 2009-01-06 05:34:43
|
Because building TEA modules on Windows is problematic for so many people, I've packaged pre-built binaries of the packages:

    tdbc::mysql
    tdbc::odbc
    tdbc::sqlite3

for 32-bit Windows. They are available from SourceForge at:

https://sourceforge.net/project/showfiles.php?group_id=10894&package_id=305160&release_id=651729
(or http://preview.tinyurl.com/7j8zws)

To install them, download the file 'tdbc1.0b6-win32-drivers.ZIP'. Extract this file in a convenient place, and use 'wish8.6' to run the script, 'INSTALL.TCL' inside the folder that's created. All the necessary files should be copied to your Tcl 8.6 installation.

I hope this gives many more people the opportunity to start testing TDBC.

--
73 de ke9tv/2, Kevin
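A quick way to confirm that the copied files ended up somewhere on the auto_path is to ask for the packages from the Tcl 8.6 shell. This small check assumes nothing beyond a stock tclsh8.6/wish8.6 and the three driver packages named above:

    # Each line prints the package version, or raises an error if the
    # installer did not put the files where Tcl can find them.
    puts [package require tdbc]
    puts [package require tdbc::odbc]
    puts [package require tdbc::sqlite3]
    puts [package require tdbc::mysql]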
From: Kevin K. <ke...@ac...> - 2009-01-04 22:51:00
|
I am pleased to announce that the current baseline of TDBC, tagged '1.0b5' and bearing the ID 46df70f083, is available. This release includes what I believe to be functionally complete versions of the TDBC base classes and tokenizer (matching the versions on the Tcl HEAD), together with the following three drivers:

tdbc::odbc - A bridge that allows connection to any ODBC-compliant database. This module builds on Windows, and is expected to build on any Unix system with the 'unixodbc' package.

tdbc::sqlite3 - A connector that joins TDBC to SQLite3. This connector is written in pure Tcl (using the 'sqlite3' extension), and is expected to run anywhere that SQLite3 runs.

tdbc::mysql - A connector that joins TDBC to MySQL. This connector demands headers and libraries for MySQL 5.1 or better. It is all-new C code, not directly derived from Artur Trzewik's fine 'mysqltcl' binding. It uses the MySQL prepared statement interface, so it is expected to be relatively immune to SQL insertion attacks, without depending on mysql_real_escape_string or relatives.

These modules are all obtainable as a single ZIP archive at http://tdbc.tcl.tk/

All three of these modules can be built and installed on Unix with a typical Posix toolchain, and on Windows with msys/mingw.

Obviously, more work needs to be done for TDBC to achieve world domination. Important further work that is needed:

(1) Revise the 'errorCode' lists returned by the various TDBC commands to make more sense with Tcl's new [try] command. This is a fairly mechanical project. Contact me directly if you're interested in taking it on, and I'll advise about what needs to be done.

(2) All three of these modules need Makefile.vc, for building with Visual Studio on Windows systems.

(3) Binary distributions for the various platforms need to be packaged.

(4) More drivers are needed! I know that various people over the last few months have volunteered to work on drivers, but I haven't been maintaining a registry of who's doing what. I think that the project Wiki at tdbc.tcl.tk would be a good place to do that. In particular, people have requested:

    - pgsql
    - Sql Server native API
    - Oracle

and several less popular databases.

(5) Packaging advice is needed from downstream distributors, particularly as regards the package dependencies. The database drivers usually have complex dependencies; for instance, tdbc::sqlite depends at run time on TDBC itself, and on the 'sqlite3' Tcl extension. (Being a pure Tcl solution, it has no build-time dependencies.) tdbc::mysql depends on mysql 1.5 or better (built with threads), and this is a build-time dependency because it needs #include files and a link library. Clearly, we don't want to bundle all the dependencies with Tcl itself.

Bugs and feature requests can be logged at http://tdbc.tcl.tk/

I welcome broader exposure of all three of these modules, which, while still 'experimental', should be considered as being on track for release.

--
73 de ke9tv/2, Kevin
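On work item (1): until the errorCode lists are revised, scripts can still trap failures with Tcl 8.6's [try] in a driver-agnostic way. The sketch below deliberately assumes nothing about TDBC's errorCode structure, since defining that structure is exactly what the work item proposes; 'db' is assumed to be an open TDBC connection.

    try {
        db allrows {SELECT name, phone FROM people}
    } on error {msg opts} {
        # -errorcode is whatever the driver currently reports; its exact
        # shape is the open question in work item (1).
        puts "query failed: $msg"
        puts "errorCode:    [dict get $opts -errorcode]"
    }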
From: Alexandre F. <ale...@gm...> - 2008-06-20 07:11:23
|
On 6/20/08, Tom Krehbiel <tom...@fr...> wrote: > > Does this sound like something worth TIPing? I *knew* 185 would come back to haunt us at night.... -Alex |
From: <lm...@bi...> - 2008-06-20 01:40:36
|
> Since I tend to think in pictures, I drew a picture which I think
> succinctly describes the issue and the reason it is a complication in
> Tcl. I also believe it shows the various solutions in Tcl.

So while I went to gadfly school I didn't go to Tcl school, so the "various solutions" part is eluding me. Can you help me see it?
--
---
Larry McVoy    lm at bitmover.com    http://www.bitkeeper.com
From: Tom K. <tom...@fr...> - 2008-06-19 23:26:32
|
This isn't directly related to TIP #308 but since NULLs are being discussed here I would like to suggest the following as the solution to the "NULL's in tcl" issue. 1) tcl is EIAS and will remain EIAS then I would 2) add a variable named tcl_null to the predefined tclvars which is initialized to something convenient like \0. With this modest change to tcl it suddenly supports NULL value! a) A test for of $tcl_null eq "" fails b) $tcl_null can be used as a NULL list element c) An application writer can change the definition of tcl_null if it conflicts with the dataset being manipulated d) changes to tcl_null would effect extensions so they would be consistent with the application e) If necessary extensions can protect themselves by putting a trigger on tcl_null to detect runtime changes or inappropriate values. Does this sound like something worth TIPing? Tom K. > Twylite wrote: > > (4) (... says he, drawing a breath ... ) the NULL thing. Yes Kevin, I > > heard that this isn't up for negotiation ;) No, I don't want NULLs in > > Tcl. But the answer to "how do we return a row which may have NULLs and > > may have duplicate column names and is easy to process whether you want > > to treat it as a list of values or reference individual values by column > > name" is also linked to my previous two points - in particular how to > > get both a list & a dict representation of a given row. > > OK. I was a bit strong in my rejection of discussing NULLs. But > I didn't want people demanding that 308 be rejected because it failed > to address 185. The general issue of NULL as a value (a non-value > value?) in Tcl is Out Of Scope. > > [... longish discussion of why a cell in a result set is a triple: > {name type value} redacted...] > > It's not clear to me that the type information has a place in Tcl, > except insofar as a result is NULL. Results of any type have already > in principle been converted to strings (and yes, at least the > ODBC bridge uses pure integers, pure doubles, and so on intelligently > to avoid needless shimmering). Except for "is this NULL", it's > not clear what use a script would make of a column type. In any > case, it really isn't always known; SQLite, for instance, uses > "manifest typing", which comes close to "everything is a Tcl_Obj". > > So instead of {name type value}, we really have {name isnull value} > or {name {Maybe value}}. And, well, Tcl dictionaries are an > extremely natural representation for {name {Maybe value}}. > The data returned are at least complete: given the list of > column names, you can get full information with: > > while {[$resultSet nextdict rowdict]} { > foreach column [$resultSet columns] { > if {[dict exists $rowdict $column]} { > ... it's not null, it's [dict get $rowdict $column] > } else { > ... it's null > } > } > } > > One could even use [dict merge] to supply correct column-dependent > default values for the nulls, something that a putative -nullvalue > option, whether connection-wide or statement-specific, cannot do: > > set row [dict merge {counter 0 middlename NONE age UNKNOWN} $row] > > So the whole thing comes down to a matter of notational convenience. > The iteration shown above visits every cell of the result set, > whether null or not, and allows diagnosis of whether the cell > is null. The only thing that's really awkward is the disambiguation > of duplicate column names. But ODBC and JDBC both manage to > do that anyway (by adding strings like '#1' to the end of > duplicate names). 
And if the user thinks those names are > ugly, there's always the use of 'AS' in SQL. > > So I'm convinced that we already have a solution that's > 'correct' in that it can deal with 100% of the cases. The > notation may not be ideal, but there's been a lot of time > spent quibbling about it, without coming up with a single > solution that will satisfy everyone's sense of æsthetics. > > I think that there's always going to be a slight "impedance > mismatch" between SQL and Tcl. Such a mismatch is not surprising; > SQL was designed in the days when COBOL ruled the earth, > for $env(LC_DEITY)'s sake -- witness, for instance, its > dependence on fixed-length strings and fixed-point > arithmetic. Some things are simply going to be awkward. > (But hybrid object-relational systems are even worse, so > I'm *not* designing a monster like ADO!) > > -- > 73 de ke9tv/2, Kevin > > > ------------------------------------------------------------------------- > Check out the new SourceForge.net Marketplace. > It's the best place to buy or sell services for > just about anything Open Source. > http://sourceforge.net/services/buy/index.php > _______________________________________________ > Tcl-Core mailing list > Tcl...@li... > https://lists.sourceforge.net/lists/listinfo/tcl-core > |
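To make Tom's proposal at the top of this message concrete, here is a sketch of how the suggested tcl_null sentinel might look in use. Note that tcl_null is hypothetical: it is the variable the proposal asks for, not something Tcl or TDBC provides today.

    # Hypothetical: the proposed predefined variable, defaulting to \0 and
    # reconfigurable by the application, as the proposal suggests.
    set tcl_null "\x00"

    # A row with a NULL in the middle column.
    set row [list fred $tcl_null 555-1234]

    foreach value $row {
        if {$value eq $tcl_null} {
            puts "NULL"
        } else {
            puts $value
        }
    }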
From: miguel s. <mig...@gm...> - 2008-06-19 23:22:03
|
Brian Griffin wrote: > Since I tend to think in pictures, I drew a picture which I think > succinctly describes the issue and the reason it is a complication in > Tcl. I also believe it shows the various solutions in Tcl. > > > file:///u/bgriffin/Documents/NULLS_in_Tcl.gif How do I ssh to your machine? :P |
From: Brian G. <bgr...@mo...> - 2008-06-19 23:20:18
|
Larry McVoy wrote: > On Wed, Jun 18, 2008 at 10:23:47PM -0400, Kevin Kenny wrote: > >> Twylite wrote: >> >>> (4) (... says he, drawing a breath ... ) the NULL thing. Yes Kevin, I >>> heard that this isn't up for negotiation ;) No, I don't want NULLs in >>> Tcl. But the answer to "how do we return a row which may have NULLs and >>> may have duplicate column names and is easy to process whether you want >>> to treat it as a list of values or reference individual values by column >>> name" is also linked to my previous two points - in particular how to >>> get both a list & a dict representation of a given row. >>> >> OK. I was a bit strong in my rejection of discussing NULLs. But >> I didn't want people demanding that 308 be rejected because it failed >> to address 185. >> > > Hey, live by the sword, die by the sword. > > It's pretty clear that NULL is useful. Instead of going "blah blah blah > I can't hear you, EIAS, I can't hear you", address the fucking issue. > All of the other dynamic languages have addressed it, and I, for one, > in my non-sanctioned role as gadfly, will bring this up over and over > again. And if it derails any tip, oh, well. > > Address the fucking issue and stop hiding behind this silly mantra of > EIAS. It's a useful concept, agreed. Moderation in all things. In > case you haven't noticed, EIAS hasn't exactly taken the world by storm. > Since I tend to think in pictures, I drew a picture which I think succinctly describes the issue and the reason it is a complication in Tcl. I also believe it shows the various solutions in Tcl. (see attached) -Brian -- # "Don't be ridiculous. Everyone knows there are no Secret # Tcl Illuminati." # -- Donal Fellows ------------------------------------------------------------- -- Mentor Graphics Corp. -- -- 8005 SW Boeckman Road 503.685.7000 tel -- -- Wilsonville, OR 97070 USA 503.685.0921 fax -- ------------------------------------------------------------- -- Technical support ............ mailto:su...@mo... -- -- Sales and marketing info ....... mailto:sa...@mo... -- -- Licensing .................... mailto:li...@mo... -- -- Home Page ........................ http://www.model.com -- ------------------------------------------------------------- |
From: Brian G. <bgr...@mo...> - 2008-06-19 23:16:36
|
Larry McVoy wrote: > On Wed, Jun 18, 2008 at 10:23:47PM -0400, Kevin Kenny wrote: > >> Twylite wrote: >> >>> (4) (... says he, drawing a breath ... ) the NULL thing. Yes Kevin, I >>> heard that this isn't up for negotiation ;) No, I don't want NULLs in >>> Tcl. But the answer to "how do we return a row which may have NULLs and >>> may have duplicate column names and is easy to process whether you want >>> to treat it as a list of values or reference individual values by column >>> name" is also linked to my previous two points - in particular how to >>> get both a list & a dict representation of a given row. >>> >> OK. I was a bit strong in my rejection of discussing NULLs. But >> I didn't want people demanding that 308 be rejected because it failed >> to address 185. >> > > Hey, live by the sword, die by the sword. > > It's pretty clear that NULL is useful. Instead of going "blah blah blah > I can't hear you, EIAS, I can't hear you", address the fucking issue. > All of the other dynamic languages have addressed it, and I, for one, > in my non-sanctioned role as gadfly, will bring this up over and over > again. And if it derails any tip, oh, well. > > Address the fucking issue and stop hiding behind this silly mantra of > EIAS. It's a useful concept, agreed. Moderation in all things. In > case you haven't noticed, EIAS hasn't exactly taken the world by storm. > Since I tend to think in pictures, I drew a picture which I think succinctly describes the issue and the reason it is a complication in Tcl. I also believe it shows the various solutions in Tcl. file:///u/bgriffin/Documents/NULLS_in_Tcl.gif -Brian -- # "Don't be ridiculous. Everyone knows there are no Secret # Tcl Illuminati." # -- Donal Fellows ------------------------------------------------------------- -- Mentor Graphics Corp. -- -- 8005 SW Boeckman Road 503.685.7000 tel -- -- Wilsonville, OR 97070 USA 503.685.0921 fax -- ------------------------------------------------------------- -- Technical support ............ mailto:su...@mo... -- -- Sales and marketing info ....... mailto:sa...@mo... -- -- Licensing .................... mailto:li...@mo... -- -- Home Page ........................ http://www.model.com -- ------------------------------------------------------------- |
From: Brett S. <bre...@ya...> - 2008-06-19 17:09:47
|
I think I'm late in coming into this, my apologies. > > >> (3) Why are the functions "$dbhandle tables" and "$dbhandle columns" > >> a MUST? I was under the impression that not all databases supported > >> this functionality, and that others may require > >> less-than-straightforward implementations? > >> For example, my understanding of PostgreSQL (please feel free to > >> correct me/it) is that one can only get this information by querying > >> the system catalog (tables), which may change depending on the DBMS > >> version. > > I'm willing once again to downgrade to SHOULD, but the fact that you > > have to query a system catalog for the information is nothing unusual. > > Oracle and Sybase also require you to go after the system catalog > > to get the stuff. Certainly the Postgres ODBC and JDBC drivers manage > > to provide comparable functionality. I'd prefer not to relax this > > one until someone demonstrates a database where it's inordinately > > difficult to get the lists. > I'm more concerned about the implications for database extensions when > DBMS versions are upgraded. This may result in an incompatibility that > affects _only these two functions_, but if you rely on their presence or > have no way to determine if they are supported this is the difference > between "works" and "doesn't work". > I'd be interested to hear the feelings of DB extension developers in > this regard? Ok :) You are correct in that to get tables and columns in PG, you need to query the system catalog. However, since I think version 7.4 (current version=8.3) PG comes with the information_schema, which is a schema that includes views into the system catalogs, which is a thin layer to keep the external API stable, if the underlining system catalogs change. So, for example, to get the tables, you would: SELECT table_name FROM information_schema.tables So, there should not be any real concern with compatibility. I believe information_schema is a SQL standard, so supposedly other SQL DBs should have this, but I suspect not. So, on one hand I think [$dbh tables] and [$dbh columns] are useful. However, the problem I have with the current interface (IIUC), is that I am unclear on how tables are returned: are they fully qualified? I think they need to be, and that also needs to be in the TIP, I believe. > > >> (2) I am also requesting a configuration option "-nullvalue val" > >> (where val defaults to {}). In the result set interface the commands > >> "nextlist" and "nextrow -as lists" , if a cell in the row is NULL, > >> the configured nullvalue MUST be stored as its value. > > Several others have requested this, as well. I'm not against > > it in principle. My guess is that it would be most flexible > > to configure it at the level of individual statements, rather > > than connection-wide. > > > > Nevertheless, it can be added compatibly later; can we maybe > > get some alpha experience with -as dicts before resorting to > > it? I suspect that the latter will prove to be the preferred > > way to deal with NULLs. (Also, let's stay away from further > > discussion of NULL in this message.) > It would be more flexible at statement level, but also less usable as it > would have to be set every time. Both would of course be the most > flexible/usable, and the most effort to implement ;) > I would want both. pgtcl does both now and is helpful. 
> The purpose of nullvalue in my mind is not to try to offer another way > of identifying NULLs, but to allow users of the "-as lists" > functionality more flexibility in how they want to represent null data. > For example if I want to show "(null)" in a report instead of "" I need > either a nullvalue, or I must use the "-as dicts" interface in > conjunction with "$resultset columns" and a foreach loop. Besides the > horrible performance of this approach (I know, because I have to use > something similar at the moment) it is ugly and noisy. It also means > that "allrows -as lists" becomes pretty much unusable if you don't want > an empty string as the nullvalue. > I agree. I would be very reluctant in using the "-as dicts" approach. > I really think this needs to be added now. There was also another > independent request to tcl-tdbc for this functionality within the last > 24 hours. > > >> (3) In the section "Connecting to a database" I would like to suggest > >> a recommendation that implementations MAY support generic URI syntax > >> to specify the connection parameters. This approach is common e.g. > >> in web applications and frameworks, and spares the user from the need > >> to learn a different syntax for each supported database. > >> We have implemented such a scheme in our proprietary DBI in response > >> to the pain of pgintcl's conninfo structure ;( > > Uhm. They MAY support URI syntax, or, indeed, anything else. > > SQLite uses file names. ODBC connection strings are well defined > > if Byzantine. These things are nothing if not varied, and > > will in any case NOT be portable from one database to another. > > > > It's not the parsing, it's the mapping of the URI to the semantics > > of the underlying engine that's going to be the hard part. Since the > > names are going to be inherently unportable in any case, where is the > > advantage in unifying them? > The point of TDBC - unless I'm missing something - is to have an > abstraction layer for database access. This confers two advantages: > (a) A single API that developers need to learn, rather than one API per DBMS > (b) Portability, allowing multiple DBMS support and/or easy targeting of > a different DBMS. > > A recommendation that database extensions _attempt_ to handle > connections in a common way _where possible and sensible_ increases > these advantages. > > The ODBC connection string is a good example of this: despite the many > ways to specify a database connection ODBC has defined a single syntax > that does the job. There are core elements that all databases handle > (Driver, Uid, Pwd), core elements for remote databases (Server, Port, > Database), core elements for local databases (Dbq/Database), and > additional database-specific elements. But there is one syntax, one set > of rules that developers need to understand, and one set of core > elements that get you 90% of the way there. > > I suggested URIs because I can represent typical/default connections to > every database I have used as a URI, be it postgreSQL, mysql, mssql, > sqlite, whatever. > > Let me instead make a more generic but still compatible suggestion: a > recommendation in the TIP that dbhandle offer a constructor that takes a > dict of options, and that they support _where possible and relevant_ > certain well-known keys in that dict: driver/provider, username, > password, host, port, database, path, filename, extension. 
> > Using a dict (or other common Tcl representation) distances the DBI user > from the syntax of the DB-specific connection string. This means that > the DBI user doesn't have to know how to correctly quote & escape values > - the DBI handles these DB-specific details. If you're still not > convinced then take some of the common Tcl database extensions out > there, that use their own connection strings, and build the connection > string using a user-provided password (i.e. from a UI). And then, for > fun, make the password something like a{b$c'd"e--f;g\h[i:j . > Why is code in _your_ application figuring out how to quote this > correctly for a particular database, rather than letting the DBI do it? > > Using well-known keys in that dict makes the information _somewhat_ > portable. For example the same URI "username:password@host:port/dbname" > could easily be used to create a pgsql, mysql, mssql, or oracle database > connection to a local or remote server, if they all use the same > well-known keys and provide sensible defaults. That means I'm spending > less time writing a translation layer between my UI or config file and > the DBI to convert from {ui_key value} to {dbi_key value}. > > A _recommendation_, focused on having the DBI do magic quoting and > following the principles of the ODBC connection string (name=value + > common keys) can't hurt, and it can provide advantages in portability, > understandability and reduced bugs in connection string handling. > I agree that there should be some sort of common connection string. I understand Kevin's reasons for not including one, but it seems odd to me to have a common DB API, without having a common connection API. Regards, --brett ------------------------------------------------------------------------- Check out the new SourceForge.net Marketplace. It's the best place to buy or sell services for just about anything Open Source. http://sourceforge.net/services/buy/index.php _______________________________________________ Tcl-Core mailing list Tcl...@li... https://lists.sourceforge.net/lists/listinfo/tcl-core |
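To tie Brett's information_schema point back to TDBC, here is a hedged sketch that runs the query he quotes through the interfaces discussed in this thread. It assumes 'db' is an already-open TDBC connection to a PostgreSQL (7.4+) database, and the 'public' schema name is illustrative.

    # List the tables in the 'public' schema via the stable
    # information_schema views rather than the raw system catalogs.
    set stmt [db prepare {
        SELECT table_name
        FROM information_schema.tables
        WHERE table_schema = :schema
    }]
    set tables [$stmt allrows -as lists {schema public}]
    $stmt close

    puts $tables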
From: Twylite <tw...@cr...> - 2008-06-19 14:43:20
|
And one more silly idea: $resultset nextrow rowdictVarname ?nullcolsVarname? nextrow returns 1 if it got and stored a row, 0 at the end of the resultset. The rowdict variable will contain a dict (columnname, value) where - columnname is unique (if necessary the DBI will add #1, #2, ... suffixes to duplicate column names) - value is $nullvalue if the cell is NULL or the appropriate representation otherwise (per existing nextrow spec) - the dict is built to have the same order as the row/tuples returned by SQL The nullcols variable, if specified, will contain a list of column names for which the corresponding cell in this row is a NULL. Order is not specified. What this gives us: (a) $rowdict is a safe-to-use dict that contains all of the expected keys, and NULLs substituted with a suitable default. It is suitable for "the second 80% case". (b) [dict values $rowdict] is an ordered list of values where NULLs have been substituted with a suitable default. It covers the "first 80% case" (c) [dict remove $rowdict {*}$nullcols] is the dict-with-missing-keys as contemplated by the current nextdict method. It covers the "third 80% case" and offers additional (IMHO clearer) ways to process the row: set nrowdict [dict remove $rowdict {*}$nullcols] if { ! [dict exists $nrowdict $mycol] } { # it's null ... } # alternative if { $mycol in $nullcols } { # it's null ... } Variations: - "$resultset nullcols" method instead of optional nullcolsVarname parameter to nextrow. Impact on allrows/foreach: - Must have variables for both rowdict and (optional) nullcols OR - A flag that determines how rowdict will be populated: as keys with substituted nullvalue, or without keys where the cell is NULL. Regards, Trevor |
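Restating the idea as a loop, purely as a sketch of the *proposed* interface (no shipping TDBC driver implements this nextrow signature):

    while {[$resultset nextrow rowdict nullcols]} {
        # rowdict:  every column present, NULLs replaced by the configured nullvalue
        # nullcols: the names of the columns that were NULL in this row
        puts "values:       [dict values $rowdict]"
        puts "null columns: $nullcols"

        # "third 80% case": recover the dict-with-missing-keys form
        set nrowdict [dict remove $rowdict {*}$nullcols]
    }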
From: Twylite <tw...@cr...> - 2008-06-19 10:03:31
|
Hi, > OK. I was a bit strong in my rejection of discussing NULLs. But > I didn't want people demanding that 308 be rejected because it failed > to address 185. The general issue of NULL as a value (a non-value > value?) in Tcl is Out Of Scope. Agreed. In my mind this discussion is about how to retrieve and represent SQL data in a Tclish way. > It's not clear to me that the type information has a place in Tcl, > except insofar as a result is NULL. Results of any type have already > in principle been converted to strings (and yes, at least the > ODBC bridge uses pure integers, pure doubles, and so on intelligently > to avoid needless shimmering). Except for "is this NULL", it's > not clear what use a script would make of a column type. In any > case, it really isn't always known; SQLite, for instance, uses > "manifest typing", which comes close to "everything is a Tcl_Obj". Conceded. The main purpose is in fact "isnull" as you suggest. > The only thing that's really awkward is the disambiguation > of duplicate column names. But ODBC and JDBC both manage to > do that anyway (by adding strings like '#1' to the end of > duplicate names). And if the user thinks those names are > ugly, there's always the use of 'AS' in SQL. Then perhaps - if dict is the accepted approach (and I'm not saying it is ;p) - it needs to be mandatory for the DBI to ensure the uniqueness of column names in this manner? > The data returned are at least complete: given the list of > column names, you can get full information with: > > while {[$resultSet nextdict rowdict]} { > foreach column [$resultSet columns] { > if {[dict exists $rowdict $column]} { > ... it's not null, it's [dict get $rowdict $column] > } else { > ... it's null > } > } > } Yes, I _could_. I could write a helper function that does this for me. I could even suggest that such a helper function should be in the DBI. And I could suggest that the performance sucks. > So I'm convinced that we already have a solution that's > 'correct' in that it can deal with 100% of the cases. The > notation may not be ideal, but there's been a lot of time > spent quibbling about it, without coming up with a single > solution that will satisfy everyone's sense of æsthetics. > So instead of {name type value}, we really have {name isnull value} > or {name {Maybe value}}. And, well, Tcl dictionaries are an > extremely natural representation for {name {Maybe value}}. Assuming that name is unique. By "extremely natural" I take it to mean that "if {Maybe value} is Nothing then leave name out of the dict, otherwise include name -> Just value" ? Here are the three persistent problems with this approach: (1) To represent your row as an ordered list of values - which is undeniably a common use - requires a nasty block of embedded loops and dict lookups that you have kindly presented above, and which has horrible performance. So yes there IS a way to get the coveted ordered-list-of-values, I'm just saying the way isn't great. (2) If you have duplicate column names, you are screwed. This appears to be an uncommon case, and the DB could provide a work-around by providing unique suffixes to make duplicate names unique. So yes there IS a solution, I'm just saying that it isn't great. (3) The one that everyone keeps ignoring, and is to my mind the biggest problem with the dict approach: a dict-with-missing-key is, as many have pointed out, the closest thing Tcl has to a true NULL. This brings with it some of the problems of NULL handling. 
In particular the problem that SQL returns a list of tuples and if I try to access one by name where the value is NULL, things blow up. By "blow up" I mean "dict get $row colname" throws an error. Now you may think that is good - after all we're dealing with NULL, right? Well I'm dealing with a team of developers who have asked for columns X, Y and Z, and expect to be able to [dict get $result X] (or Y or Z) without it blowing up in their face. Unreasonable? Hardly - SQL did in fact return all those columns, even if they "contained NULL". That's how SQL works. It doesn't leave out data just because its NULL - it tells you the cell is there but undefined. Dict does not tell you the cell is there. The software engineering implication of this is that EVERY SQL statement and the code handling its results must be carefully analyzed to determine if NULLs could possibly be returned, and to do a [dict exists] to check for and handle each potential NULL. This is of course a poor approach as a database schema change could invalidate all this careful analysis in a flash. So instead we will see the following boiler-plate code everywhere (or at least we hope to, or there will be bugs): while {[$resultSet nextdict rowdict]} { if ( [dict size $rowdict] != [llength [$resultset columns]] } { # This resultset may contain unexpected NULLs, blow up sanely # otherwise we get a *ahem* useful error message like 'key "x" not known in dictionary" error "Database row contains NULLs - can't process" } In short the dict interface as it stands requires explicit handling of NULLs and does not permit the _safe_ implicit handling of NULLs which - in my experience and from comments on many on c.l.t and mailing lists - is a common use case. > One could even use [dict merge] to supply correct column-dependent > default values for the nulls, something that a putative -nullvalue > option, whether connection-wide or statement-specific, cannot do: > > set row [dict merge {counter 0 middlename NONE age UNKNOWN} $row] Yes, assuming you know the columns coming back. Much of the database code I deal with doesn't know the columns until it has a resultset. Also a [dict merge] doesn't preserve order, so you must still iterate over "$resultset columns" if order is important to you. > So the whole thing comes down to a matter of notational convenience. Absolutely. Notation to overcome that "impedance mismatch". In the many discussions on this topic three use cases have come up repeatedly: - The 80% case that require simple database operations mainly aimed at reporting, in which rows need to be available as an ordered list of values with an appropriate (preferably configurable) value in place of NULL. - The other 80% case that need to access individual values in the resultset - typically by name - to effect some form of processing, and usually don't want to be bothered by NULL (so substitution if fine). - The other other 80% case of serious database users that need to handle NULL explicitly. They say (for some value of "they") that APIs should be optimized for ease of use for the 80% case. My problem with the TDBC API, as it stands, is that it seems to be optimized for the last 80% case, at the expense of the other two. To improve the situation for the first 80% case we need a configurable nullvalue and decent performance. A helper that converts from "$resultset columns" + rowdict to rowlist with substituted NULLs _may_ suffice, but think needs to be demonstrated. 
To improve the situation for the second 80% case we need a dict- or array-like representation with substituted NULLs, so that elements can be accessed safely. I'm not in the third 80% case so I'm can't claim to know how to optimize it - I assume the dict-with-missing-keys approach is considered appropriate, although I would have thought that something that does not require a bunch of "dict exists" checks would be more pleasant. CAUTION: FLIGHTY THOUGHTS AHEAD Idea: dict {name {isnull value} name {isnull value} ...} which would allow 'lassign [dict get $rowdict mycol] isnull mycol_value' which would make for more readable code (checking 'if { $isnull }' is more readable/understandable than 'if { ! [dict exists $rowdict mycol] }'). Idea: dict {name {isnull bool value val} ...} which allows [dict get $rowdict $name isnull] and [dict get $rowdict $name value]. Its a nested dict so performance may well suck (would need investigation) but the code reads more clearly and the isnull/value are more directly accessible (no lassign/lindex required). Idea: since the developer must either assume (or be confident) that a column is not NULL before doing the [dict get], or check for NULL first, why not a "$resultset isnull colname" method? (I'll answer that: because you may want to take the result row out of the context of a resultset, e.g. using allrows). Why not a "$resultset nulldict varname" to return a dict name -> isnull? Or "$resultset nextrow rowdictVarname ?nulldictVarname?" - the rowdict has substituted NULL values and covers the second 80% case as well as the first ([dict values $rowdict]). There are many possible representations, and I don't think they have been properly considered or investigated because there was too much focus on NULLs in Tcl, and not enough on optimizing for all the common use cases. SUMMARY - Blah, blah, blah. - Implementations should enforce uniqueness of column names (by appending a suffix if required). - Yadda, software engineering, fubar. - You're making it hard for two common use cases that deserve ease-of-use optimization. - Silly idea, silly idea, demand, etc. Regards, Trevor |
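For the "first 80% case" described above, a small helper of the kind being asked for can already be layered on today's interface. This is an illustrative proc, not part of TDBC, and the name row2list is made up here.

    # Convert one row (a dict with NULL columns omitted) plus the result
    # set's column list into an ordered list of values, substituting a
    # caller-chosen null value.
    proc row2list {columns rowdict {nullvalue ""}} {
        set out {}
        foreach column $columns {
            if {[dict exists $rowdict $column]} {
                lappend out [dict get $rowdict $column]
            } else {
                lappend out $nullvalue
            }
        }
        return $out
    }

    # e.g.  puts [row2list [$resultSet columns] $rowdict "(null)"]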
From: Twylite <tw...@cr...> - 2008-06-19 08:30:33
|
Thanks Kevin, please see replies below. >> - There is no support for scoped variables - the variableName MUST be >> in the params dict or the caller's scope and nowhere else? > That is correct. In particular, I didn't want to have to specify > what a scoped variable means when a parameter dictionary is supplied. >> (6) The support for SQL tokenisation says that "'$', ':' and '@' are >> all variable substitutions; the remainder of the token string is a >> variable name." This doesn't match completely with behaviour >> specified for "$dbhandle prepare" which only specified ':'. Further, >> allowing '$' for substitutions complicates (rather than simplifies) >> SQL code in Tcl as additional escaping will be required. >> Which behaviour is correct? > The tokenizer was lifted wholesale from SQLite and contributed by > Richard Hipp. I don't see the need for any substitution character > other than ':', but haven't quite got around to removing the others. Great - can the TIP be updated to clarify that only ':' is used for substitution and scoping is explicitly not permitted (i.e. the variableName MUST be in the params dict or the caller's scope and nowhere else)? >> - the reason that TCL_BREAK and TCL_CONTINUE should cause a commit >> rather than a rollback? > It was "principle of least astonishment." It seems reasonable to > consider exiting a loop to be a "normal" action: > > foreach value $list { > db transaction { > ... do something complicated with $value > if {$done} break > ... do some more stuff > } > } > > seems like something that a user would expect to commit rather than > rollback. Okay, sounds logical to me. >> (3) Why are the functions "$dbhandle tables" and "$dbhandle columns" >> a MUST? I was under the impression that not all databases supported >> this functionality, and that others may require >> less-than-straightforward implementations? >> For example, my understanding of PostgreSQL (please feel free to >> correct me/it) is that one can only get this information by querying >> the system catalog (tables), which may change depending on the DBMS >> version. > I'm willing once again to downgrade to SHOULD, but the fact that you > have to query a system catalog for the information is nothing unusual. > Oracle and Sybase also require you to go after the system catalog > to get the stuff. Certainly the Postgres ODBC and JDBC drivers manage > to provide comparable functionality. I'd prefer not to relax this > one until someone demonstrates a database where it's inordinately > difficult to get the lists. I'm more concerned about the implications for database extensions when DBMS versions are upgraded. This may result in an incompatibility that affects _only these two functions_, but if you rely on their presence or have no way to determine if they are supported this is the difference between "works" and "doesn't work". I'd be interested to hear the feelings of DB extension developers in this regard? Also, is it worth considering a "compatibility" indicator in the DBI so that code can query what "level" of support is available before trying to use it? > (4) Is it possible to determine if a transaction is in progress for a > given database connection? > > No. It would be easy enough to add, and a half-page TIP can add it, > so I don't think this particular issue is worth holding up the vote. Any chance of just adding it to the TIP now? > Prepare is not expected to support multiple statements; in particular, > no provision is made for multiple result sets. 
If a given database > implementation decides to allow for multiple statements (presumably > containing at most one SELECT, I would presume that the statements > would be executed as a group. (Although the tokenizer recogizes ';', > both the database interfaces that I've code it reject it.) Hmm, this one is of some concern. I've often written and encountered multi-part SQL statements. As a trivial example consider a database upgrade script which may need to create/alter roles and tables, insert information about the upgrade into a history table, etc. It is also common to have database install scripts as a single .sql file that is executed "as a single statement" (by which I mean the code outside the DBI doesn't have to parse it, but rather hands the file-as-a-string to the DBI to execute). How do you see code that uses the DBI handling these multi-part statements? >> (2) I am also requesting a configuration option "-nullvalue val" >> (where val defaults to {}). In the result set interface the commands >> "nextlist" and "nextrow -as lists" , if a cell in the row is NULL, >> the configured nullvalue MUST be stored as its value. > Several others have requested this, as well. I'm not against > it in principle. My guess is that it would be most flexible > to configure it at the level of individual statements, rather > than connection-wide. > > Nevertheless, it can be added compatibly later; can we maybe > get some alpha experience with -as dicts before resorting to > it? I suspect that the latter will prove to be the preferred > way to deal with NULLs. (Also, let's stay away from further > discussion of NULL in this message.) It would be more flexible at statement level, but also less usable as it would have to be set every time. Both would of course be the most flexible/usable, and the most effort to implement ;) The purpose of nullvalue in my mind is not to try to offer another way of identifying NULLs, but to allow users of the "-as lists" functionality more flexibility in how they want to represent null data. For example if I want to show "(null)" in a report instead of "" I need either a nullvalue, or I must use the "-as dicts" interface in conjunction with "$resultset columns" and a foreach loop. Besides the horrible performance of this approach (I know, because I have to use something similar at the moment) it is ugly and noisy. It also means that "allrows -as lists" becomes pretty much unusable if you don't want an empty string as the nullvalue. I really think this needs to be added now. There was also another independent request to tcl-tdbc for this functionality within the last 24 hours. >> (3) In the section "Connecting to a database" I would like to suggest >> a recommendation that implementations MAY support generic URI syntax >> to specify the connection parameters. This approach is common e.g. >> in web applications and frameworks, and spares the user from the need >> to learn a different syntax for each supported database. >> We have implemented such a scheme in our proprietary DBI in response >> to the pain of pgintcl's conninfo structure ;( > Uhm. They MAY support URI syntax, or, indeed, anything else. > SQLite uses file names. ODBC connection strings are well defined > if Byzantine. These things are nothing if not varied, and > will in any case NOT be portable from one database to another. > > It's not the parsing, it's the mapping of the URI to the semantics > of the underlying engine that's going to be the hard part. 
Since the > names are going to be inherently unportable in any case, where is the > advantage in unifying them? The point of TDBC - unless I'm missing something - is to have an abstraction layer for database access. This confers two advantages: (a) A single API that developers need to learn, rather than one API per DBMS (b) Portability, allowing multiple DBMS support and/or easy targeting of a different DBMS. A recommendation that database extensions _attempt_ to handle connections in a common way _where possible and sensible_ increases these advantages. The ODBC connection string is a good example of this: despite the many ways to specify a database connection ODBC has defined a single syntax that does the job. There are core elements that all databases handle (Driver, Uid, Pwd), core elements for remote databases (Server, Port, Database), core elements for local databases (Dbq/Database), and additional database-specific elements. But there is one syntax, one set of rules that developers need to understand, and one set of core elements that get you 90% of the way there. I suggested URIs because I can represent typical/default connections to every database I have used as a URI, be it postgreSQL, mysql, mssql, sqlite, whatever. Let me instead make a more generic but still compatible suggestion: a recommendation in the TIP that dbhandle offer a constructor that takes a dict of options, and that they support _where possible and relevant_ certain well-known keys in that dict: driver/provider, username, password, host, port, database, path, filename, extension. Using a dict (or other common Tcl representation) distances the DBI user from the syntax of the DB-specific connection string. This means that the DBI user doesn't have to know how to correctly quote & escape values - the DBI handles these DB-specific details. If you're still not convinced then take some of the common Tcl database extensions out there, that use their own connection strings, and build the connection string using a user-provided password (i.e. from a UI). And then, for fun, make the password something like a{b$c'd"e--f;g\h[i:j . Why is code in _your_ application figuring out how to quote this correctly for a particular database, rather than letting the DBI do it? Using well-known keys in that dict makes the information _somewhat_ portable. For example the same URI "username:password@host:port/dbname" could easily be used to create a pgsql, mysql, mssql, or oracle database connection to a local or remote server, if they all use the same well-known keys and provide sensible defaults. That means I'm spending less time writing a translation layer between my UI or config file and the DBI to convert from {ui_key value} to {dbi_key value}. A _recommendation_, focused on having the DBI do magic quoting and following the principles of the ODBC connection string (name=value + common keys) can't hurt, and it can provide advantages in portability, understandability and reduced bugs in connection string handling. >> (2) I am concerned about the API of the "allrows" and "foreach" >> iterators - the manner in which these commands accept and handle >> parameters is both confusing and (somewhat) limiting. >> I find the kludgy "-as lists|dicts" and ordering of parameters >> particularly distasteful. >> I would also like to see some clarification of the interaction of >> these commands with transactions. > Let me review this one. 
I certainly want to keep the API > involving discrete result sets and 'nextrow' - it's needed for > client-side merges, if nothing else Absolutely! I am in no way suggesting that these should go. I understand 'foreach' / 'allrows' / 'tdbc::execute' to be helpers that drive the underlying dbhandle/statement/resultset to make it easier to retrieve and process results. Regards, Trevor |
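A minimal Tcl sketch of the workaround described above for rendering NULLs as "(null)" with the dict-style row interface; the prepare/execute/columns/nextdict command names follow the draft interface quoted in this thread and are illustrative rather than definitive:

    set stmt [db prepare {SELECT id, middlename, age FROM t_people}]
    set rs   [$stmt execute]
    set cols [$rs columns]
    while {[$rs nextdict row]} {
        set cells {}
        foreach col $cols {
            if {[dict exists $row $col]} {
                # non-NULL: the column's key is present in the row dict
                lappend cells [dict get $row $col]
            } else {
                # NULL: the key is absent, so substitute a display value
                lappend cells "(null)"
            }
        }
        puts [join $cells ", "]
    }

With a connection- or statement-level -nullvalue of "(null)", the same report could be produced directly from nextlist or "allrows -as lists", which is the convenience being argued for above.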
From: <lm...@bi...> - 2008-06-19 04:41:03
|
On Wed, Jun 18, 2008 at 09:27:57PM -0700, Larry McVoy wrote: > On Wed, Jun 18, 2008 at 10:23:47PM -0400, Kevin Kenny wrote: > > Twylite wrote: > > > (4) (... says he, drawing a breath ... ) the NULL thing. Yes Kevin, I > > > heard that this isn't up for negotiation ;) No, I don't want NULLs in > > > Tcl. But the answer to "how do we return a row which may have NULLs and > > > may have duplicate column names and is easy to process whether you want > > > to treat it as a list of values or reference individual values by column > > > name" is also linked to my previous two points - in particular how to > > > get both a list & a dict representation of a given row. > > > > OK. I was a bit strong in my rejection of discussing NULLs. But > > I didn't want people demanding that 308 be rejected because it failed > > to address 185. > > Hey, live by the sword, die by the sword. > > It's pretty clear that NULL is useful. Instead of going "blah blah blah > I can't hear you, EIAS, I can't hear you", address the fucking issue. > All of the other dynamic languages have addressed it, and I, for one, > in my non-sanctioned role as gadfly, will bring this up over and over > again. And if it derails any tip, oh, well. > > Address the fucking issue and stop hiding behind this silly mantra of > EIAS. It's a useful concept, agreed. Moderation in all things. In > case you haven't noticed, EIAS hasn't exactly taken the world by storm. And just so you don't brush this off with "Larry is whining and not doing", if you guys can reach agreement on an answer, I'll happily fund the implementation. I'm not looking for a free ride, I'm looking for tcl to be better. And willing to pay for it. -- --- Larry McVoy lm at bitmover.com http://www.bitkeeper.com |
From: <lm...@bi...> - 2008-06-19 04:27:49
|
On Wed, Jun 18, 2008 at 10:23:47PM -0400, Kevin Kenny wrote: > Twylite wrote: > > (4) (... says he, drawing a breath ... ) the NULL thing. Yes Kevin, I > > heard that this isn't up for negotiation ;) No, I don't want NULLs in > > Tcl. But the answer to "how do we return a row which may have NULLs and > > may have duplicate column names and is easy to process whether you want > > to treat it as a list of values or reference individual values by column > > name" is also linked to my previous two points - in particular how to > > get both a list & a dict representation of a given row. > > OK. I was a bit strong in my rejection of discussing NULLs. But > I didn't want people demanding that 308 be rejected because it failed > to address 185. Hey, live by the sword, die by the sword. It's pretty clear that NULL is useful. Instead of going "blah blah blah I can't hear you, EIAS, I can't hear you", address the fucking issue. All of the other dynamic languages have addressed it, and I, for one, in my non-sanctioned role as gadfly, will bring this up over and over again. And if it derails any tip, oh, well. Address the fucking issue and stop hiding behind this silly mantra of EIAS. It's a useful concept, agreed. Moderation in all things. In case you haven't noticed, EIAS hasn't exactly taken the world by storm. -- --- Larry McVoy lm at bitmover.com http://www.bitkeeper.com |
From: Kevin K. <kk...@ny...> - 2008-06-19 02:23:49
|
Twylite wrote: > (4) (... says he, drawing a breath ... ) the NULL thing. Yes Kevin, I > heard that this isn't up for negotiation ;) No, I don't want NULLs in > Tcl. But the answer to "how do we return a row which may have NULLs and > may have duplicate column names and is easy to process whether you want > to treat it as a list of values or reference individual values by column > name" is also linked to my previous two points - in particular how to > get both a list & a dict representation of a given row. OK. I was a bit strong in my rejection of discussing NULLs. But I didn't want people demanding that 308 be rejected because it failed to address 185. The general issue of NULL as a value (a non-value value?) in Tcl is Out Of Scope. [... longish discussion of why a cell in a result set is a triple: {name type value} redacted...] It's not clear to me that the type information has a place in Tcl, except insofar as a result is NULL. Results of any type have already in principle been converted to strings (and yes, at least the ODBC bridge uses pure integers, pure doubles, and so on intelligently to avoid needless shimmering). Except for "is this NULL", it's not clear what use a script would make of a column type. In any case, it really isn't always known; SQLite, for instance, uses "manifest typing", which comes close to "everything is a Tcl_Obj". So instead of {name type value}, we really have {name isnull value} or {name {Maybe value}}. And, well, Tcl dictionaries are an extremely natural representation for {name {Maybe value}}. The data returned are at least complete: given the list of column names, you can get full information with: while {[$resultSet nextdict rowdict]} { foreach column [$resultSet columns] { if {[dict exists $rowdict $column]} { ... it's not null, it's [dict get $rowdict $column] } else { ... it's null } } } One could even use [dict merge] to supply correct column-dependent default values for the nulls, something that a putative -nullvalue option, whether connection-wide or statement-specific, cannot do: set row [dict merge {counter 0 middlename NONE age UNKNOWN} $row] So the whole thing comes down to a matter of notational convenience. The iteration shown above visits every cell of the result set, whether null or not, and allows diagnosis of whether the cell is null. The only thing that's really awkward is the disambiguation of duplicate column names. But ODBC and JDBC both manage to do that anyway (by adding strings like '#1' to the end of duplicate names). And if the user thinks those names are ugly, there's always the use of 'AS' in SQL. So I'm convinced that we already have a solution that's 'correct' in that it can deal with 100% of the cases. The notation may not be ideal, but there's been a lot of time spent quibbling about it, without coming up with a single solution that will satisfy everyone's sense of æsthetics. I think that there's always going to be a slight "impedance mismatch" between SQL and Tcl. Such a mismatch is not surprising; SQL was designed in the days when COBOL ruled the earth, for $env(LC_DEITY)'s sake -- witness, for instance, its dependence on fixed-length strings and fixed-point arithmetic. Some things are simply going to be awkward. (But hybrid object-relational systems are even worse, so I'm *not* designing a monster like ADO!) -- 73 de ke9tv/2, Kevin |
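A short sketch of the [dict merge] idiom described above, assuming the nextdict interface from the example; the column names and default values are invented for illustration:

    # dict merge gives later arguments priority, so keys present in the row
    # (non-NULL cells) override the defaults, while absent keys (NULL cells)
    # fall back to their column-specific default values.
    set defaults {counter 0 middlename NONE age UNKNOWN}
    while {[$resultSet nextdict row]} {
        set row [dict merge $defaults $row]
        puts "[dict get $row counter]: [dict get $row middlename], age [dict get $row age]"
    }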
From: Kevin K. <kk...@ny...> - 2008-06-19 01:00:22
|
Twylite wrote: > Sorry to get these comments in so late ... never enough time :( > > SECTION 1: REQUEST FOR CLARIFICATIONS > > (1) Substitutions in SQL statements: "The database interface MUST > support substitutions in SQL-code. Each substitution request has the > form :variableName. That is, each substitution request begins with a > literal colon (:), followed by a letter or underscore, followed by zero > or more letters, digits, or underscores." > I take it that: > - The variable name ends with the first non-(letter|digit|underscore) > character, so that if X is 10 then ":X:" becomes "10:"? > - There is no support for scoped variables - the variableName MUST be in > the params dict or the caller's scope and nowhere else? That is correct. In particular, I didn't want to have to specify what a scoped variable means when a parameter dictionary is supplied. Moreover, there's an opening there for scope creep; someone else will want array elements, or ${...} notation, or $env(LC_DEITY) help us, command substitution. I don't want to try to address how all the details of Tcl's notation might interact with SQL's. > (2) In the "$dbhandle transaction <script>" construct, can someone > please explain: > - the reason that TCL_BREAK and TCL_CONTINUE should cause a commit > rather than a rollback? It was "principle of least astonishment." It seems reasonable to consider exiting a loop to be a "normal" action: foreach value $list { db transaction { ... do something complicated with $value if {$done} break ... do some more stuff } } seems like something that a user would expect to commit rather than rollback. The downside of that is that if the code is not in a loop, the "invoked [break] but not in a loop" error gets thrown. But that error isn't thrown until the stack is already unwound; within a transaction, there's no way to detect it. > - With regard to the note "(Note: Scripts inside a transaction command > SHOULD avoid use of the return -code or return -level operations. If a > script returns from a transaction, with any combination of return > options, the transaction SHALL be committed.)" does this refer only to > the use of "return -code" directly within the script, or also within > procs called by the script? It refers to the use of a [return] that will exit the transaction. Returns that land on code within the transaction are fine. > (3) Why are the functions "$dbhandle tables" and "$dbhandle columns" a > MUST? I was under the impression that not all databases supported this > functionality, and that others may require less-than-straightforward > implementations? > For example, my understanding of PostgreSQL (please feel free to correct > me/it) is that one can only get this information by querying the system > catalog (tables), which may change depending on the DBMS version. I'm willing once again to downgrade to SHOULD, but the fact that you have to query a system catalog for the information is nothing unusual. Oracle and Sybase also require you to go after the system catalog to get the stuff. Certainly the Postgres ODBC and JDBC drivers manage to provide comparable functionality. I'd prefer not to relax this one until someone demonstrates a database where it's inordinately difficult to get the lists. > (4) Is it possible to determine if a transaction is in progress for a > given database connection? No. It would be easy enough to add, and a half-page TIP can add it, so I don't think this particular issue is worth holding up the vote. 
> (5) With regards to autocommit support the following statement is made: > "Statements executed against the database when no transaction is in > progress (before the first starttransaction or after all started > transactions have been either committed or rolled back) SHOULD be > auto-committed; that is, each such statement SHOULD be executed as if a > starttransaction command preceded the statement and a commit command > followed it." > If the statement given to "$dbhandle prepare" is in fact multiple > statements (separated by ;) how does autocommit behave? > For that matter, is prepare expected to support multiple statements? > Should this be clarified? Prepare is not expected to support multiple statements; in particular, no provision is made for multiple result sets. If a given database implementation decides to allow for multiple statements (presumably containing at most one SELECT), I would presume that the statements would be executed as a group. (Although the tokenizer recognizes ';', both the database interfaces that I've coded reject it.) > (6) The support for SQL tokenisation says that "'$', ':' and '@' are all > variable substitutions; the remainder of the token string is a variable > name." This doesn't match completely with behaviour specified for > "$dbhandle prepare" which only specified ':'. Further, allowing '$' for > substitutions complicates (rather than simplifies) SQL code in Tcl as > additional escaping will be required. > Which behaviour is correct? The tokenizer was lifted wholesale from SQLite and contributed by Richard Hipp. I don't see the need for any substitution character other than ':', but haven't quite got around to removing the others. > SECTION 2: REQUEST FOR TIP CHANGES/FEATURES > > (1) In the section "Basic mechanics of database interfaces" the > following statements are made: > - "Any of the ensembles MAY support abbreviation of its subcommands > according to the rules defined by Tcl_GetIndexFromObj" > - "In all of the places where this syntax is expected, a database module > MAY support abbreviation of options according to the rules of > Tcl_GetIndexFromObj()" > Given that code that uses the DBI "SHOULD spell out subcommands/options > in full", and that (mis)use of abbreviations can affect applications if > the DBI is extended, and that database implementations are not > restricted to providing only the subcommands specified by the TIP > (potentially requiring longer abbreviations), I would like to request > that ensembles MUST NOT (or at least SHOULD NOT) support abbreviation. In an interactive shell (where the code is not, after all, expected to be preserved), the ability to abbreviate is a bit too convenient to dispense with altogether. For well or ill, the "unique prefix" rule seems to be with us in enough other places without having caused too many headaches. > (2) I am also requesting a configuration option "-nullvalue val" (where > val defaults to {}). In the result set interface, for the commands > "nextlist" and "nextrow -as lists", if a cell in the row is NULL, the > configured nullvalue MUST be stored as its value. > In discussions on c.l.t and tcl-core (and possibly the wiki) numerous > users have mentioned that they find this feature useful, and that it is > supported by a number of existing Tcl database extensions (including > oratcl, fbsql, SQLite and proprietary/in-house DBIs). Several others have requested this, as well. I'm not against it in principle. 
My guess is that it would be most flexible to configure it at the level of individual statements, rather than connection-wide. Nevertheless, it can be added compatibly later; can we maybe get some alpha experience with -as dicts before resorting to it? I suspect that the latter will prove to be the preferred way to deal with NULLs. (Also, let's stay away from further discussion of NULL in this message.) > (3) In the section "Connecting to a database" I would like to suggest a > recommendation that implementations MAY support generic URI syntax to > specify the connection parameters. This approach is common e.g. in web > applications and frameworks, and spares the user from the need to learn > a different syntax for each supported database. > We have implemented such a scheme in our proprietary DBI in response to > the pain of pgintcl's conninfo structure ;( Uhm. They MAY support URI syntax, or, indeed, anything else. SQLite uses file names. ODBC connection strings are well defined if Byzantine. These things are nothing if not varied, and will in any case NOT be portable from one database to another. > The generic syntax for URIs is defined by RFC 3986 (see > http://en.wikipedia.org/wiki/URI_scheme#Generic_syntax for a quick > reference). > One could describe most remote database connections using: > scheme://username:password@host:port/dbname > ... where scheme indicates the database type (pgsql, mysql, oracle, etc.). > Local file-based databases could likewise be defined using: > scheme://username:password@/path/filename.extension > For a constructor (something that takes database connections parameters > and returns a database handle) to support generic URI syntax it should: > - Accept a uri parameter and a dict of options (or args to be treated as > a dict) > - Set defaults for scheme, username, host, port, dbname, etc. as appropriate > - Parse the URI and merge onto the dict of defaults to get a uri_dict > - Merge the dict of options (constructor parameter) onto the uri_dict. > This allows things like passwords to be handled separately (one could > have the URI in a configuration file and solicit the password). > > If this suggestion is accepted I am willing to contribute Tcl code for > generic URI parsing that will support the functionality described above. It's not the parsing, it's the mapping of the URI to the semantics of the underlying engine that's going to be the hard part. Since the names are going to be inherently unportable in any case, where is the advantage in unifying them? > SECTION 3: CONCERNS & COMMENTS > > (1) When retrieving the columns of a table using "$dbhandle columns > tablename" the results are returned as a dict. The order of the columns > should be preserved (for the same reasons that the order of columns in a > resultset row should be preserved). I suggest that EITHER: > - The TIP states the assumption that dict preserves order and requires > that implementations MUST construct the dict in order such that [dict > keys] returns the ordered list of columns; OR > - The "columns" command is changed to return information in some other > form, such as a resultSetHandle or list-of-tuples. [dict] does indeed preserve order. I'll make the appropriate editorial change in the TIP; it was indeed my intention that the keys be returned in sequence. > (2) I am concerned about the API of the "allrows" and "foreach" > iterators - the manner in which these commands accept and handle > parameters is both confusing and (somewhat) limiting. 
> I find the kludgy "-as lists|dicts" and ordering of parameters > particularly distasteful. > I would also like to see some clarification of the interaction of these > commands with transactions. > > Code example: > $mydb foreach -as lists -columnsvariable c \ > {SELECT * FROM t_table WHERE type = :type} \ > [dict create type "book"] r { > puts "([join $c ","]) -> ([join $r ","])" > } > Umm ... to me that doesn't read like a sentence, and I am going to be > hesitant to guess the meaning of some of those parameters. > > On c.l.t. I presented a sample tcl::db::execute that offers the combined > functionality of these two commands with (IMHO) a clearer interface. > See http://groups.google.com/group/comp.lang.tcl/msg/bf16218a45d280e3 > (the code will not work against the current revision of the TIP). > > My suggestion is roughly: > $dbhandle execute ?-option value? ?...? > Where the options are: > -sql stmt (required) SQL statement to execute > -params dict (optional) parameters/substituents for the SQL statement; > if not present the substituents will be obtained from variables in the > caller's scope > -columns varname (optional) if present the variable will be populated > with an ordered list of column names before result rows are processed. > -listvar varname (optional) variable to be populated with an ordered > list of fields in the result row > -dictvar varname (optional) variable to be populated with a dict of > fields in the result row > -script script (optional) if present the results will be retrieved > (into -listvar/-dictvar) one row at a time and the script executed. If > absent then all rows are retrieved into -listvar as a list-of-lists > and/or into -dictvar as a list-of-dicts > > The reasons that I feel this interface is cleaner are basically: > - There are a lot of parameters, and no obvious way to order them to > "read like a sentence", so clarity can be enhanced by tagging each > parameter. > - The interface is more extensible. It would be easy to add a > -transaction or -isolation option, for example. Let me review this one. I certainly want to keep the API involving discrete result sets and 'nextrow' - it's needed for client-side merges, if nothing else - but I agree that the 'foreach' and 'allrows' are perhaps not the best notation that can be achieved. > (3) The "-as lists|dicts" thing. I'm still unhappy about this one, not > only because I think it looks like a kludge, but because it means you > can get you resultset as a list OR a dict but NOT both. > As is well noted there are deficiencies in retrieving rows as lists, and > other deficiencies in retrieving rows as dicts. And there are no doubt > times when you have to deal with both at once, and the current interface > makes that difficult (if not impossible). I have encountered such a > situation but I forget the specifics (which are all-important) and it's > getting a little late (or is that early) to go digging. No doubt a > better database design and SQL statement could have solved the problem, > but the code in question was a report renderer that accepted a SQL > statement as input, so I wasn't at liberty to design around the problem. To me, it's not clear that you'd ever need both at once. In fact, '-as lists' was intended simply to make report renderers a little easier to write. Certainly, '-as dicts' contains exactly the same information, only packaged differently. In any case, the whole discussion flows directly into your item (4), which at your sensible request, I'm addressing in a separate message. 
-- 73 de ke9tv/2, Kevin |
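A minimal sketch of the ':variableName' substitution rules as clarified in this exchange, with substituents drawn either from the caller's scope or from an explicit parameter dictionary; the prepare and execute command forms are written in the style of the draft interface and should be read as illustrative:

    set stmt [db prepare {
        SELECT title FROM t_books WHERE type = :type AND year > :year
    }]

    # Substituents resolved from variables in the caller's scope...
    set type book
    set year 2000
    set rs [$stmt execute]

    # ...or supplied explicitly as a parameter dictionary.  Only the plain
    # :name form is recognised; scoped variables, array elements and
    # command substitution are not.
    set rs [$stmt execute {type book year 2000}]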
From: Twylite <tw...@cr...> - 2008-06-18 23:58:37
|
Sorry to get these comments in so late ... never enough time :( SECTION 1: REQUEST FOR CLARIFICATIONS (1) Substitutions in SQL statements: "The database interface MUST support substitutions in SQL-code. Each substitution request has the form :variableName. That is, each substitution request begins with a literal colon (:), followed by a letter or underscore, followed by zero or more letters, digits, or underscores." I take it that: - The variable name ends with the first non-(letter|digit|underscore) character, so that if X is 10 then ":X:" becomes "10:"? - There is no support for scoped variables - the variableName MUST be in the params dict or the caller's scope and nowhere else? (2) In the "$dbhandle transaction <script>" construct, can someone please explain: - the reason that TCL_BREAK and TCL_CONTINUE should cause a commit rather than a rollback? - With regard to the note "(Note: Scripts inside a transaction command SHOULD avoid use of the return -code or return -level operations. If a script returns from a transaction, with any combination of return options, the transaction SHALL be committed.)" does this refer only to the use of "return -code" directly within the script, or also within procs called by the script? (3) Why are the functions "$dbhandle tables" and "$dbhandle columns" a MUST? I was under the impression that not all databases supported this functionality, and that others may require less-than-straightforward implementations? For example, my understanding of PostgreSQL (please feel free to correct me/it) is that one can only get this information by querying the system catalog (tables), which may change depending on the DBMS version. (4) Is it possible to determine if a transaction is in progress for a given database connection? (5) With regards to autocommit support the following statement is made: "Statements executed against the database when no transaction is in progress (before the first starttransaction or after all started transactions have been either committed or rolled back) SHOULD be auto-committed; that is, each such statement SHOULD be executed as if a starttransaction command preceded the statement and a commit command followed it." If the statement given to "$dbhandle prepare" is in fact multiple statements (separated by ;) how does autocommit behave? For that matter, is prepare expected to support multiple statements? Should this be clarified? (6) The support for SQL tokenisation says that "'$', ':' and '@' are all variable substitutions; the remainder of the token string is a variable name." This doesn't match completely with behaviour specified for "$dbhandle prepare" which only specified ':'. Further, allowing '$' for substitutions complicates (rather than simplifies) SQL code in Tcl as additional escaping will be required. Which behaviour is correct? 
SECTION 2: REQUEST FOR TIP CHANGES/FEATURES (1) In the section "Basic mechanics of database interfaces" the following statements are made: - "Any of the ensembles MAY support abbreviation of its subcommands according to the rules defined by Tcl_GetIndexFromObj" - "In all of the places where this syntax is expected, a database module MAY support abbreviation of options according to the rules of Tcl_GetIndexFromObj()" Given that code that uses the DBI "SHOULD spell out subcommands/options in full", and that (mis)use of abbreviations can affect applications if the DBI is extended, and that database implementations are not restricted to providing only the subcommands specified by the TIP (potentially requiring longer abbreviations), I would like to request that ensembles MUST NOT (or at least SHOULD NOT) support abbreviation. (2) I am also requesting a configuration option "-nullvalue val" (where val defaults to {}). In the result set interface, for the commands "nextlist" and "nextrow -as lists", if a cell in the row is NULL, the configured nullvalue MUST be stored as its value. In discussions on c.l.t and tcl-core (and possibly the wiki) numerous users have mentioned that they find this feature useful, and that it is supported by a number of existing Tcl database extensions (including oratcl, fbsql, SQLite and proprietary/in-house DBIs). (3) In the section "Connecting to a database" I would like to suggest a recommendation that implementations MAY support generic URI syntax to specify the connection parameters. This approach is common e.g. in web applications and frameworks, and spares the user from the need to learn a different syntax for each supported database. We have implemented such a scheme in our proprietary DBI in response to the pain of pgintcl's conninfo structure ;( The generic syntax for URIs is defined by RFC 3986 (see http://en.wikipedia.org/wiki/URI_scheme#Generic_syntax for a quick reference). One could describe most remote database connections using: scheme://username:password@host:port/dbname ... where scheme indicates the database type (pgsql, mysql, oracle, etc.). Local file-based databases could likewise be defined using: scheme://username:password@/path/filename.extension For a constructor (something that takes database connection parameters and returns a database handle) to support generic URI syntax it should: - Accept a uri parameter and a dict of options (or args to be treated as a dict) - Set defaults for scheme, username, host, port, dbname, etc. as appropriate - Parse the URI and merge onto the dict of defaults to get a uri_dict - Merge the dict of options (constructor parameter) onto the uri_dict. This allows things like passwords to be handled separately (one could have the URI in a configuration file and solicit the password). If this suggestion is accepted I am willing to contribute Tcl code for generic URI parsing that will support the functionality described above. SECTION 3: CONCERNS & COMMENTS (1) When retrieving the columns of a table using "$dbhandle columns tablename" the results are returned as a dict. The order of the columns should be preserved (for the same reasons that the order of columns in a resultset row should be preserved). 
I suggest that EITHER: - The TIP states the assumption that dict preserves order and requires that implementations MUST construct the dict in order such that [dict keys] returns the ordered list of columns; OR - The "columns" command is changed to return information in some other form, such as a resultSetHandle or list-of-tuples. (2) I am concerned about the API of the "allrows" and "foreach" iterators - the manner in which these commands accept and handle parameters is both confusing and (somewhat) limiting. I find the kludgy "-as lists|dicts" and ordering of parameters particularly distasteful. I would also like to see some clarification of the interaction of these commands with transactions. Code example: $mydb foreach -as lists -columnsvariable c \ {SELECT * FROM t_table WHERE type = :type} \ [dict create type "book"] r { puts "([join $c ","]) -> ([join $r ","])" } Umm ... to me that doesn't read like a sentence, and I am going to be hesitant to guess the meaning of some of those parameters. On c.l.t. I presented a sample tcl::db::execute that offers the combined functionality of these two commands with (IMHO) a clearer interface. See http://groups.google.com/group/comp.lang.tcl/msg/bf16218a45d280e3 (the code will not work against the current revision of the TIP). My suggestion is roughly: $dbhandle execute ?-option value? ?...? Where the options are: -sql stmt (required) SQL statement to execute -params dict (optional) parameters/substituents for the SQL statement; if not present the substituents will be obtained from variables in the caller's scope -columns varname (optional) if present the variable will be populated with an ordered list of column names before result rows are processed. -listvar varname (optional) variable to be populated with an ordered list of fields in the result row -dictvar varname (optional) variable to be populated with a dict of fields in the result row -script script (optional) if present the results will be retrieved (into -listvar/-dictvar) one row at a time and the script executed. If absent then all rows are retrieved into -listvar as a list-of-lists and/or into -dictvar as a list-of-dicts The reasons that I feel this interface is cleaner are basically: - There are a lot of parameters, and no obvious way to order them to "read like a sentence", so clarity can be enhanced by tagging each parameter. - The interface is more extensible. It would be easy to add a -transaction or -isolation option, for example. (3) The "-as lists|dicts" thing. I'm still unhappy about this one, not only because I think it looks like a kludge, but because it means you can get your resultset as a list OR a dict but NOT both. As is well noted there are deficiencies in retrieving rows as lists, and other deficiencies in retrieving rows as dicts. And there are no doubt times when you have to deal with both at once, and the current interface makes that difficult (if not impossible). I have encountered such a situation but I forget the specifics (which are all-important) and it's getting a little late (or is that early) to go digging. No doubt a better database design and SQL statement could have solved the problem, but the code in question was a report renderer that accepted a SQL statement as input, so I wasn't at liberty to design around the problem. (4) (... says he, drawing a breath ... ) the NULL thing. Yes Kevin, I heard that this isn't up for negotiation ;) No, I don't want NULLs in Tcl. 
But the answer to "how do we return a row which may have NULLs and may have duplicate column names and is easy to process whether you want to treat it as a list of values or reference individual values by column name" is also linked to my previous two points - in particular how to get both a list & a dict representation of a given row. Please, for sanity's sake, make any answers to this point separate from responses to the rest of the mail ;/ The suggestion I am forced to put forward comes directly from relational calculus and the observed deficiencies of both list and dict representations: each row of a table is an ordered list of cells, not just an ordered list of values. Each cell is a tuple (name, type, value). The most straightforward representation of a row that is also a complete representation is therefore a list-of-tuples(name,type,value). And barely days ago on tcl-core I said I wouldn't propose it ;) Let us contemplate for a moment: $resultSetHandle nextrow variableName where nextrow returns 1 if a row was retrieved and 0 if no rows remain in the resultset. The given variable will contain a list-of-tuples. There should be no extra complexity in including the type in the tuple - the type must already be available to render the data in its correct representation (as specified by nextrow). Similarly the column name would already be available if returning a dict, and the value in all cases. If the database has poor support for retrieving column types then the type "varchar" could be used as a default (when in Tcl, EIAvarchar?). A null would be indicated by the type 'null' and the value set to the configurable nullvalue. The information contained in the list-of-tuples is sufficient for all cases: it is a true ordered multiset of attributes, capable of representing NULLs and duplicate column names. I believe the only information it may be missing is the precision and scale of the columns (and I'm not sure that other DBIs typically make this information available per resultset). The deficiency of this approach is not in its completeness, but in ease of use. In that respect - as others have proposed - a couple of utilities could support easy conversions such as: - list-of-tuples(name,type,value) to list-of-values - list-of-tuples(name,type,value) to dict(name,value) - get value from list-of-tuples given name etc. There are variations on this idea that may be more appealing, e.g. - Return a list {colname {type value} colname {type value} ...} which is dict-compatible and, with a helper function, could be easily turned into a list-of-values. - Return a list {colname {value type} colname {value type} ...} could make the helper function more efficient? - Return a list {colname value {} type ...} which is dict-compatible, foreach-friendly and (via a helper) easy to turn into a list-of-values. Actually I'm rather liking the last one, but that may be because sleep is not so much beckoning as threatening, so I will call it there. I believe this nextrow issue deserves some more thought & attention. If we go for correctness now we can improve ease of use and efficiency later. If we go for "-as lists|dicts" now (which probably covers two different 80% cases, making us 160% covered, right?) then there are cracks that can't be fixed later (at least not easily). Regards, Trevor |
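A rough sketch of the last row shape proposed above, {colname value {} type ...}, together with hypothetical helper procs; none of this is in the TIP, and the names row2values and row2dict are invented purely for illustration:

    # Example row: middlename is NULL, so its value slot holds the configured
    # nullvalue (here the empty string) and its type slot is 'null'.
    set row {id 42 {} integer middlename {} {} null age 37 {} integer}

    # Ordered list of values, one per column.
    proc row2values {row} {
        set values {}
        foreach {name value _ type} $row { lappend values $value }
        return $values
    }

    # Name -> value dict (duplicate column names would collapse here, which
    # is exactly the deficiency of the dict form discussed in this thread).
    proc row2dict {row} {
        set d {}
        foreach {name value _ type} $row { dict set d $name $value }
        return $d
    }

    row2values $row   ;# -> 42 {} 37
    row2dict $row     ;# -> id 42 middlename {} age 37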