In the Java docs, the description of
Statement.setMaxRows() is "Sets the limit for the
maximum number of rows that any ResultSet object can
contain to the given number.". But, if I execute
Statement.setMaxRows() not only will it limit the
maximum number of rows in the resultset, but it will
also limit the number of rows modified in other
statements such as INSERT, UPDATE, DELETE and SELECT
INTO. I've tested this on all other vendor drivers
(Oracle, DB2, PostgreSQL, MySQL and Informix) and none
of them do this except for the jConnect Sybase drivers.
I would consider this implementation incorrect
according to the specs. I am assuming the
implementation uses the SQL Server command SET ROWCOUNT X.
Would this be fixable? Can the limit just be done on
the client-side code of the JDBC driver, so the
implementation more closely matches the specs?
Logged In: YES
user_id=564978
Wow, I didn't realize this. The only way to fix this is (as
you propose) to limit the result sets on the client side.
The problem with that is that it will really affect
performance when using firehose cursors (if SET ROWCOUNT
were used then the data would not be sent across the
network, this way jTDS would basically "hang" and have to
consume all extra rows).
This is definitely a bug and should be fixed but I'm
thinking that a configurable approach to this (i.e. a URL
parameter that would switch between the current
functionality and the correct but slow one) would be the
best way to go. If we were to do this, I would go with the
current behavior as default, for performance reasons (I
guess this is how all the SQL Server drivers work anyway).
Logged In: YES
user_id=1207902
I've been working on a work around in our product for this
bug. Initially we just stopped using the
Statement.setMaxRows(), but this caused big performance
problems with some of our customers as you have pointed out.
So, we ended up having to parse all SQL statements before
execution to find out if it was a SELECT statement or some
other statement. If it is a SELECT statement then we use the
Statement.setMaxRows(), otherwise we don't. So, for us this
bug is no longer high priority since we now have a work-around.
But, it is good to know that this bug exists. Ideally, the
JDBC driver would parse any statement to be executed and
determine if it is a SELECT statement. If it is, then it
would SET ROWCOUNT X before execution, otherwise it wouldn't.
Logged In: YES
user_id=641437
As a by-product of parsing the statement for JDBC escapes
and parameter markers, jTDS does in fact know that it is
sending a select statement so the solution you propose could
be implemented fairly easily. There is a problem with this
approach and that is that the SQL could be executing a
stored procedure, which contains a select. In this case the
driver would not issue a SET ROWCOUNT and all rows would be
returned.
If the approach of looking for SELECTS is still felt to be
viable then it may be worth having the driver edit the SQL
to include 'TOP nnnn' to limit the rows rather than issuing
a SET ROWCOUNT. This would also avoid the need to reset the
row count later. The TOP approach would work with SQL 7 up
and the most recent versions of Sybase. Note, there are
problems with TOP and prepared statements as, with server
versions below 2005, it is not possible to parameterise the
number of rows in the TOP clause.
Another alternative would be to only issue a SET ROWCOUNT
when the executeQuery method is used and not for execute or
executeUpdate. Obviously this does rely on people using the
correct method and would preclude limits on result sets
returned by execute. On the positive side, it does have the
merit of simplicity!
If the approach of looking for SELECTS is still felt to be
viable then it may be worth having the driver edit the SQL
to include 'TOP nnnn' to limit the rows rather than issuing
a SET ROWCOUNT. This would also avoid the need to reset the
row count later. The TOP approach would work with SQL 7 up
and the most recent versions of Sybase. Note, there are
problems with TOP and prepared statements as, with server
versions below 2005, it is not possible to parameterise the
number of rows in the TOP clause.
Logged In: YES
user_id=1207902
Limiting the SET ROWCOUNT to the executeQuery() and not the
executeUpdate() might be a good idea, but I think that it
should be set for execute(). execute() allows for any type
of statement. In our tool we use execute() for all "Ad-Hoc"
queries, including SELECT and UPDATE (we don't setMaxRows()
for the UPDATE).
Editing the SQL that is sent to the server is probably not a
good idea. Our tool is an "Ad-Hoc" query tool. When a user
types in a query to execute, they don't want it to be
modified, especially if they are optimizing the query with
an explain plan. Also, rewriting an SQL statment you would
assume that you can correctly rewrite any valid T-SQL block
(This would be alot of work). I am not sure why this would
be a good option anyway ... if you detect that the query is
a SELECT then you can SET ROWCOUNT ... why would there be a
reason to re-write the SELECT statement?
I think the best solution is to detect a SELECT in which
case you SET ROWCOUNT, otherwise don't SET ROWCOUNT. And if
there is ever a resultset returned (stored procedure) then
do a timeout in the JDBC driver for the Max Rows. Although
this solution could be a little harder to implement, because
the SQL statement can be a valid T-SQL block. For example,
this is a valid statement that can be executed with the JDBC
driver:
-------- Start T-SQL --------------
/ This is not an UPDATE, DELETE or INSERT /
SELECT * FROM MYTABLE
UPDATE MYTABLE SET mycolumn=1
-------- End T-SQL -------------
Do you set the Max Row count here? Do you have a parser
that can determine if this is a SELECT statement even if you
remove the UPDATE statement?
Logged In: YES
user_id=641437
The parser is fairly sophisticated and will skip any
comments to find the select keyword so it would be possible
to send an initial set row count for the batch example you
gave. The problem is that the update would also be included
in the scope of the set row count and may be
unintentionally affected by the limit. As I understand it
this was your original concern with the current
implementation.
It would be possible to impose a row limit on the client
side and send a cancel to the server to avoid having to
read through to the end of a very large result set. The
problem is that a cancel would terminate the entire batch
so again, in the example you give; it is possible that the
update would never be executed at all.
Maybe the best option is to make the implementation user
selectable as Alin suggests and have a choice between the
current behaviour and a more sophisticated option that uses
row count when dealing with select statements. There could
also be a fall back of imposing the row count limit on the
client side as well.
Are there any plans to work on this bug? No matter what arguments you put forward about impacting performance, the fact remains that jtds doesn't satisfy the contract put forward in the description of Statement.setMaxRows().
Like nielsg, we have an adhoc querying tool and this is causing problems when a user tries to delete/update more rows than our default max rows (usually 5000).
Well, performance is the main reason to use Statement.setMaxRows() in the first place, so performance is what this whole discussion is all about. Of course you are right, jTDS violates the JDBC contract here. But would it really be of any help to have a correct implementation that simply skips remaining rows on the client side? How would that be beneficial, at all?
I'd agree that a more sophisticated solution limiting row count on the server side would be best. But currently I don't see how exactly this could be achieved. Any proposal that came up has its limits and drawbacks, so any ideas would be appreciated.
Cheers,
momo
Seems that the whole problem is even worse. As reported in bug #635, this could lead to invalid data being returned by a query. I'd never expected such behavior, but now we know about it, this problem should be addressed by jTDS.
I see a number of cases that could be handled better without negatively affecting performance. I plan to further improve our SQL parser in 1.3.x, making it possible to sort out more of the problematic cases. As far as I see, we have the following options:
- just execute statements found to not return a resultset
- use set rowcount
- rewrite statements to use top
- cancel, if sure there are no more resultsets
- break up statements/procedures and use cancel
- limit rowcount on the client (additional threads might prevent blocking)
By better analyzing SQL statements, including the source of called stored procedures, we could then choose the best approach.
Cheers,
momo
Now, I also fell into this trap! I had a prepared statement with setMaxRows(1) and wondered, why INSERT statements in the procedure only inserted 1 row ...
Please fix this bug - thank you!
We ran into the same issue: https://github.com/slick/slick/issues/878
To add another data point to this discussion: My tests indicate that Microsoft's official JDBC driver also has this bug.