[Ruby-DBI-devel] thoughts on extending Ruby/DBI

SourceForge Headquarters 1320 Columbia Street Suite 310 San Diego, CA 92101 +1 (858) 422-6466

Hello everybody,

I have some proposals for the DB2 driver as well as the Ruby DBI/DBD 
API, and I would be grateful to hear your opinions. In detail:

1)BLOB/CLOB support
I work on a project that will be using DB2 as a backend to store ruby 
objects (via Marshall.store) in VARCHAR columns and I am hitting on the 
database limit of pagesize (4096 bytes default). Extending the pagesize 
at db creation up to 32K will not solve the problem, but will only 
postpone it for a later time, so the logical solution seems to use large 
objects (LOB).
Looking at existing work, I see that only the postgres driver for ruby 
implements LOB functionality, but then it is hidden from the DBI. Class 
DBI::Binary seems a good starting point, so my first thought is to add 
functionality like the one found on java.sql.[B|C]lob interfaces. I 
think that JDBC is very well designed (although  its datatype strictness 
really gets on my nerves) and we can borrow many ideas and function 
calls. Or, we can borrow ideas from Python/Perl DBI implementations. Or 
better, we can keep it simple and use Ruby in a clever manner to hide 
all this complexity from the user.

2) Parameter binding
Right now, parameter binding uses ugly string concatenation to produce 
ad-hoc SQL statements, which cancels all performance gains from prepared 
(precompiled) statements. Apart from that, it doesn't support 
input/output and output parameters, which means that most existing 
legacy databases that use stored procedures are useless under Ruby. 
Therefore, we need to define some sort of attribute to let the db driver 
know what kind of parameter we use. We could be using 'attribs' 
parameter in DBI::Statement.bind_param to do this. For instance:

statement.bind_param(1, value1, { 'direction' => 'in', 'type' => SQL_BLOB})
statement.bind_param(2, value2, { 'direction' => 'out', 'type' => 
SQL_INTEGER})
statement.bind_param(3, value3, { 'direction' => 'inout', 'type' => 
SQL_VARCHAR})

In this case:
a)  direction => 'in' would mean that parameter is bound for input (no 
expected result from the db statement)
b) 'out' means that parameter is bound for output (value generated by 
database, ruby variable initialized & populated at statement execution time)
c) 'inout' means the same as 'out' except that we use the ruby 
variable's value at execution, dispose the value and populate a new 
value with the statement results

attribute 'type' must be supplied because the low-level driver needs to 
know what kind of data to expect from the database. Therefore, all 
returned ('out'/'inout') values should be instantiated from a base class 
that matches closely the specified SQL datatype.

Furthermore, for BLOB/CLOB functionality, value1 could be:
a) read directly from memory,
b) if it was a string it could represent a filename, then we could bind 
the parameter to the file itself. DB2 and ODBC have SQLBindFileTo(Col|Param)

3) DB2 driver
I admit I am confused as to where to draw the separation line between 
database handling logic and the ruby DBD implementation. Looking at the 
ODBC driver (~120k!) I see that all low-level work is done in C and the 
driver uses Ruby to act as a bridge to ODBC functions. This decision 
means that we have to take care of memory management, garbage collection 
etc. in C. What is your opinion?

See you,

Elias