Menu

#566 Performance problem ResultSet.close() / Reading all rows

open
5
2014-07-02
2008-05-30
No

Problem:
If a result contains a lot of rows (e.g. 50.000) and only a few rows has been read using next() and the result set shall be closed, then the result set is reading all remaining rows from database in its close()-method.
This has caused a heavy performance issue in several applications, because in some cases the number of the needed rows from database is unknown or no limitation of the result using setMaxRows() were set.

Here the code in class JtdsResultSet:

public void close() throws SQLException {
    if (!closed) {
        try {
            if (!getConnection().isClosed()) {
               // Skip to end of result set
               // Could send cancel but this is safer as
               // cancel could kill other statements in a batch.
               while (next());
            }
        } finally {
        ...

Why do you avoid the use of statement.cancel(), i cannot understand how canceling the current statement can affect other statements ?
Or is this used workaround obsolete by now in the current driver version and can be replaced with a optimized implementation ?

The used database system is MS-SQL-Server 2005.

Discussion

  • m.hilpert

    m.hilpert - 2008-07-02

    Logged In: YES
    user_id=667728
    Originator: NO

    I work around this problem by setting max rows for the statement:

    myCallableStatement.setMaxRows(1);

    as you also mentioned. Perhaps you can workaround by setting max rows for each case of your usage. So, if you know you nedd just x rows, setMaxRows(x) and in other cases you don't set max ros.

    Either way, ResultSet.close() should not traverse remaining rows.

     
  • momo

    momo - 2009-09-16

    Unfortunately, this cannot be resolved that easily. Any execute method that can return a result (except executeQuery) is also allowed to return more than a single resultset. Since cancel aborts the whole operation (or batch of operations), any remaining resultsets would also be discarded, which in turn breaks the JDBC spec. Due to the fact that we cannot know if there are additional resultsets before completely reading the server's response, I do not see a way to avoid traversing all the remaining rows. But of course, any idea how to avoid that would be highly appreciated.

    Cheers,
    momo

     
  • Matthew Munz

    Matthew Munz - 2010-09-08

    For my application, this is not just a performance problem, it is a serious bug. This method call can take 15 minutes or more to complete. This is a problem for my application so it kills the process. As a result, there are many large temporary files left hanging around -- it's really unacceptable.

    I'm not sure that I have an elegant solution to ickzon's concerns about how to resolve this issue, but why not provide a configuration option that enables immediate closing of the ResultSet? My application does not use multiple open ResultSets from the same statement so I'm fine with the behavior that ickzon describes. For others that need this functionality (and don't mind the problem described in this bug) they could configure the driver to behave as it does currently.

    Also, if you're worried about breaking the JDBC spec, well, that already is the case. The Javadoc for ResultSet#close says that it closes the ResultSet immediately. I don't think that 15+ minutes fits the definition of "immediate". If a configuration option for closing the ResultSet is provided, then at least I (and others, I assume) can continue to use this (otherwise excellent) driver. I'd really appreciate it if you could make this change.

     
  • Anonymous

    Anonymous - 2014-01-06

    From Java 7 JavaDoc for ResultSet#close():

    Releases this ResultSet object's database and JDBC resources immediately instead of waiting for this to happen when it is automatically closed.

    The same story is with Statement#close():

    Releases this Statement object's database and JDBC resources immediately instead of waiting for this to happen when it is automatically closed. It is generally good practice to release resources as soon as you are finished with them to avoid tying up database resources.

    According to these JavaDoc comments it seems that driver should close underlying resources immediately and reading all rows from ResultSet is not immediate!

    Now on the topic of multiple ResultSets. I don't know how this feature is implemented in jTDS and I haven't used it myself but looking at JavaDoc for Statement#execute(java.lang.String) method:

    Executes the given SQL statement, which may return multiple results. In some (uncommon) situations, a single SQL statement may return multiple result sets and/or update counts. Normally you can ignore this unless you are (1) executing a stored procedure that you know may return multiple results or (2) you are dynamically executing an unknown SQL string.

    The execute method executes an SQL statement and indicates the form of the first result. You must then use the methods getResultSet or getUpdateCount to retrieve the result, and getMoreResults to move to any subsequent result(s).

    It seems that one would need to use Statement#getResultSet() and Statement#getMoreResults(int) methods to obtain current and next ResultSets. Since Statement#getMoreResults(int) allows passing constant Statement#KEEP_CURRENT_RESULT I would expect that one can actually obtain different ResultSet objects in this case, because one moves to another ResultSet without closing or discarding previous ResultSet. Hence ResultSets should be independent of one another and therefore invoking ResultSet#close() on one of them should not depend or impact other ResultSets!

    Finally a suggestion on how to distinguish between a case when there is a single ResultSet object and a case for multiple ResultSets. What if when Statement#executeQuery(java.lang.String) is called it will create JtdsResultSet object with a flag that says that method close() should terminate immediately (i.e. without scanning result set). And in other case (i.e. Statement#execute(java.lang.String)) this flag will be set to false. This way at least simplest case would be abortable immediately.


    Current workaround for this bug that we use is to call Statement#cancel() prior closing ResultSet and Statement objects.

     
  • momo

    momo - 2014-01-07

    According to these JavaDoc comments it seems that driver should close underlying resources immediately and reading all rows from ResultSet is not immediate!

    Here "immediately" specifies the time when a resource should be closed in contrast to "waiting for this to happen when it is automatically closed" - closing itself will of course take some time anyway and in this certain case unfortunately also involves reading all remaining rows.

    I fully understand that running through the whole result may cause immense problems and is anything but a perfect solution. But in almost any problematic case I've seen so far, the application logic was poorly designed and issuing an appropriate query solved the problem of huge amounts of waste rows. Needless to say this wouldn't help if you are stuck to a third party software product.

    It seems that one would need to use Statement#getResultSet() and Statement#getMoreResults(int) methods to obtain current and next ResultSets. Since Statement#getMoreResults(int) allows passing constant Statement#KEEP_CURRENT_RESULT I would expect that one can actually obtain different ResultSet objects in this case, because one moves to another ResultSet without closing or discarding previous ResultSet. Hence ResultSets should be independent of one another and therefore invoking ResultSet#close() on one of them should not depend or impact other ResultSets!

    That's what this discussion is all about - jTDS has to provide a client-side implementation of the describes behavior (e.g. by buffering results to disk) because SQL Server simply sends a single data stream containing all results, one after another, without previously announcing what it will return in response to the last command. So we have to read through the complete first ResultSet just to know whether there is any further result, fetch errors reported by the server and so on. There simply is no server-side function for skipping anything - we could just cancel the whole operation, discarding any remaining results.

    What if when Statement#executeQuery(java.lang.String) is called it will create JtdsResultSet object with a flag that says that method close() should terminate immediately...

    Sounds quite reasonable. I'll see if that can be implemented without causing too much hassle.

    Current workaround for this bug that we use is to call Statement#cancel() prior closing ResultSet and Statement objects.

    This is exactly what I would recommend if you know you are dealing with a lot of unused data. This may not be the most elegant solution but the driver simply cannot distinguish between "close this result and continue with whatever comes next" and "close the result, I'm done".

     

Log in to post a comment.