Menu

#473 Statement.setMaxRows() Implementation out of specs

open
7
2014-08-08
2006-01-29
nielsg
No

In the Java docs, the description of
Statement.setMaxRows() is "Sets the limit for the
maximum number of rows that any ResultSet object can
contain to the given number.". But, if I execute
Statement.setMaxRows() not only will it limit the
maximum number of rows in the resultset, but it will
also limit the number of rows modified in other
statements such as INSERT, UPDATE, DELETE and SELECT
INTO. I've tested this on all other vendor drivers
(Oracle, DB2, PostgreSQL, MySQL and Informix) and none
of them do this except for the jConnect Sybase drivers.
I would consider this implementation incorrect
according to the specs. I am assuming the
implementation uses the SQL Server command SET ROWCOUNT X.

Would this be fixable? Can the limit just be done on
the client-side code of the JDBC driver, so the
implementation more closely matches the specs?

Discussion

  • Alin Sinpalean

    Alin Sinpalean - 2006-03-03

    Logged In: YES
    user_id=564978

    Wow, I didn't realize this. The only way to fix this is (as
    you propose) to limit the result sets on the client side.
    The problem with that is that it will really affect
    performance when using firehose cursors (if SET ROWCOUNT
    were used then the data would not be sent across the
    network, this way jTDS would basically "hang" and have to
    consume all extra rows).

    This is definitely a bug and should be fixed but I'm
    thinking that a configurable approach to this (i.e. a URL
    parameter that would switch between the current
    functionality and the correct but slow one) would be the
    best way to go. If we were to do this, I would go with the
    current behavior as default, for performance reasons (I
    guess this is how all the SQL Server drivers work anyway).

     
  • nielsg

    nielsg - 2006-03-03

    Logged In: YES
    user_id=1207902

    I've been working on a work around in our product for this
    bug. Initially we just stopped using the
    Statement.setMaxRows(), but this caused big performance
    problems with some of our customers as you have pointed out.
    So, we ended up having to parse all SQL statements before
    execution to find out if it was a SELECT statement or some
    other statement. If it is a SELECT statement then we use the
    Statement.setMaxRows(), otherwise we don't. So, for us this
    bug is no longer high priority since we now have a work-around.

    But, it is good to know that this bug exists. Ideally, the
    JDBC driver would parse any statement to be executed and
    determine if it is a SELECT statement. If it is, then it
    would SET ROWCOUNT X before execution, otherwise it wouldn't.

     
  • Mike Hutchinson

    Mike Hutchinson - 2006-03-05

    Logged In: YES
    user_id=641437

    As a by-product of parsing the statement for JDBC escapes
    and parameter markers, jTDS does in fact know that it is
    sending a select statement so the solution you propose could
    be implemented fairly easily. There is a problem with this
    approach and that is that the SQL could be executing a
    stored procedure, which contains a select. In this case the
    driver would not issue a SET ROWCOUNT and all rows would be
    returned.

    If the approach of looking for SELECTS is still felt to be
    viable then it may be worth having the driver edit the SQL
    to include 'TOP nnnn' to limit the rows rather than issuing
    a SET ROWCOUNT. This would also avoid the need to reset the
    row count later. The TOP approach would work with SQL 7 up
    and the most recent versions of Sybase. Note, there are
    problems with TOP and prepared statements as, with server
    versions below 2005, it is not possible to parameterise the
    number of rows in the TOP clause.

    Another alternative would be to only issue a SET ROWCOUNT
    when the executeQuery method is used and not for execute or
    executeUpdate. Obviously this does rely on people using the
    correct method and would preclude limits on result sets
    returned by execute. On the positive side, it does have the
    merit of simplicity!

    If the approach of looking for SELECTS is still felt to be
    viable then it may be worth having the driver edit the SQL
    to include 'TOP nnnn' to limit the rows rather than issuing
    a SET ROWCOUNT. This would also avoid the need to reset the
    row count later. The TOP approach would work with SQL 7 up
    and the most recent versions of Sybase. Note, there are
    problems with TOP and prepared statements as, with server
    versions below 2005, it is not possible to parameterise the
    number of rows in the TOP clause.

     
  • nielsg

    nielsg - 2006-03-05

    Logged In: YES
    user_id=1207902

    Limiting the SET ROWCOUNT to the executeQuery() and not the
    executeUpdate() might be a good idea, but I think that it
    should be set for execute(). execute() allows for any type
    of statement. In our tool we use execute() for all "Ad-Hoc"
    queries, including SELECT and UPDATE (we don't setMaxRows()
    for the UPDATE).

    Editing the SQL that is sent to the server is probably not a
    good idea. Our tool is an "Ad-Hoc" query tool. When a user
    types in a query to execute, they don't want it to be
    modified, especially if they are optimizing the query with
    an explain plan. Also, rewriting an SQL statment you would
    assume that you can correctly rewrite any valid T-SQL block
    (This would be alot of work). I am not sure why this would
    be a good option anyway ... if you detect that the query is
    a SELECT then you can SET ROWCOUNT ... why would there be a
    reason to re-write the SELECT statement?

    I think the best solution is to detect a SELECT in which
    case you SET ROWCOUNT, otherwise don't SET ROWCOUNT. And if
    there is ever a resultset returned (stored procedure) then
    do a timeout in the JDBC driver for the Max Rows. Although
    this solution could be a little harder to implement, because
    the SQL statement can be a valid T-SQL block. For example,
    this is a valid statement that can be executed with the JDBC
    driver:

    -------- Start T-SQL --------------
    / This is not an UPDATE, DELETE or INSERT /
    SELECT * FROM MYTABLE
    UPDATE MYTABLE SET mycolumn=1
    -------- End T-SQL -------------

    Do you set the Max Row count here? Do you have a parser
    that can determine if this is a SELECT statement even if you
    remove the UPDATE statement?

     
  • Mike Hutchinson

    Mike Hutchinson - 2006-03-06

    Logged In: YES
    user_id=641437

    The parser is fairly sophisticated and will skip any
    comments to find the select keyword so it would be possible
    to send an initial set row count for the batch example you
    gave. The problem is that the update would also be included
    in the scope of the set row count and may be
    unintentionally affected by the limit. As I understand it
    this was your original concern with the current
    implementation.

    It would be possible to impose a row limit on the client
    side and send a cancel to the server to avoid having to
    read through to the end of a very large result set. The
    problem is that a cancel would terminate the entire batch
    so again, in the example you give; it is possible that the
    update would never be executed at all.

    Maybe the best option is to make the implementation user
    selectable as Alin suggests and have a choice between the
    current behaviour and a more sophisticated option that uses
    row count when dealing with select statements. There could
    also be a fall back of imposing the row count limit on the
    client side as well.

     
  • grimlock81

    grimlock81 - 2009-11-05

    Are there any plans to work on this bug? No matter what arguments you put forward about impacting performance, the fact remains that jtds doesn't satisfy the contract put forward in the description of Statement.setMaxRows().

    Like nielsg, we have an adhoc querying tool and this is causing problems when a user tries to delete/update more rows than our default max rows (usually 5000).

     
  • momo

    momo - 2009-11-05

    Well, performance is the main reason to use Statement.setMaxRows() in the first place, so performance is what this whole discussion is all about. Of course you are right, jTDS violates the JDBC contract here. But would it really be of any help to have a correct implementation that simply skips remaining rows on the client side? How would that be beneficial, at all?

    I'd agree that a more sophisticated solution limiting row count on the server side would be best. But currently I don't see how exactly this could be achieved. Any proposal that came up has its limits and drawbacks, so any ideas would be appreciated.

    Cheers,
    momo

     
  • momo

    momo - 2012-10-20

    Seems that the whole problem is even worse. As reported in bug #635, this could lead to invalid data being returned by a query. I'd never expected such behavior, but now we know about it, this problem should be addressed by jTDS.

    I see a number of cases that could be handled better without negatively affecting performance. I plan to further improve our SQL parser in 1.3.x, making it possible to sort out more of the problematic cases. As far as I see, we have the following options:
    - just execute statements found to not return a resultset
    - use set rowcount
    - rewrite statements to use top
    - cancel, if sure there are no more resultsets
    - break up statements/procedures and use cancel
    - limit rowcount on the client (additional threads might prevent blocking)
    By better analyzing SQL statements, including the source of called stored procedures, we could then choose the best approach.

    Cheers,
    momo

     
  • m.hilpert

    m.hilpert - 2013-06-06

    Now, I also fell into this trap! I had a prepared statement with setMaxRows(1) and wondered, why INSERT statements in the procedure only inserted 1 row ...

    Please fix this bug - thank you!

     
  • Stefan Zeiger

    Stefan Zeiger - 2014-07-02

    We ran into the same issue: https://github.com/slick/slick/issues/878

    To add another data point to this discussion: My tests indicate that Microsoft's official JDBC driver also has this bug.

     

Log in to post a comment.

Want the latest updates on software, tech news, and AI?
Get latest updates about software, tech news, and AI from SourceForge directly in your inbox once a month.