Hello everybody,
I have some proposals for the DB2 driver as well as the Ruby DBI/DBD
API, and I would be grateful to hear your opinions. In detail:
1)BLOB/CLOB support
I work on a project that will be using DB2 as a backend to store ruby
objects (via Marshall.store) in VARCHAR columns and I am hitting on the
database limit of pagesize (4096 bytes default). Extending the pagesize
at db creation up to 32K will not solve the problem, but will only
postpone it for a later time, so the logical solution seems to use large
objects (LOB).
Looking at existing work, I see that only the postgres driver for ruby
implements LOB functionality, but then it is hidden from the DBI. Class
DBI::Binary seems a good starting point, so my first thought is to add
functionality like the one found on java.sql.[B|C]lob interfaces. I
think that JDBC is very well designed (although its datatype strictness
really gets on my nerves) and we can borrow many ideas and function
calls. Or, we can borrow ideas from Python/Perl DBI implementations. Or
better, we can keep it simple and use Ruby in a clever manner to hide
all this complexity from the user.
2) Parameter binding
Right now, parameter binding uses ugly string concatenation to produce
ad-hoc SQL statements, which cancels all performance gains from prepared
(precompiled) statements. Apart from that, it doesn't support
input/output and output parameters, which means that most existing
legacy databases that use stored procedures are useless under Ruby.
Therefore, we need to define some sort of attribute to let the db driver
know what kind of parameter we use. We could be using 'attribs'
parameter in DBI::Statement.bind_param to do this. For instance:
statement.bind_param(1, value1, { 'direction' => 'in', 'type' => SQL_BLOB})
statement.bind_param(2, value2, { 'direction' => 'out', 'type' =>
SQL_INTEGER})
statement.bind_param(3, value3, { 'direction' => 'inout', 'type' =>
SQL_VARCHAR})
In this case:
a) direction => 'in' would mean that parameter is bound for input (no
expected result from the db statement)
b) 'out' means that parameter is bound for output (value generated by
database, ruby variable initialized & populated at statement execution time)
c) 'inout' means the same as 'out' except that we use the ruby
variable's value at execution, dispose the value and populate a new
value with the statement results
attribute 'type' must be supplied because the low-level driver needs to
know what kind of data to expect from the database. Therefore, all
returned ('out'/'inout') values should be instantiated from a base class
that matches closely the specified SQL datatype.
Furthermore, for BLOB/CLOB functionality, value1 could be:
a) read directly from memory,
b) if it was a string it could represent a filename, then we could bind
the parameter to the file itself. DB2 and ODBC have SQLBindFileTo(Col|Param)
3) DB2 driver
I admit I am confused as to where to draw the separation line between
database handling logic and the ruby DBD implementation. Looking at the
ODBC driver (~120k!) I see that all low-level work is done in C and the
driver uses Ruby to act as a bridge to ODBC functions. This decision
means that we have to take care of memory management, garbage collection
etc. in C. What is your opinion?
See you,
Elias
|