Would it be a good idea to reduce the default fetch size from 100 to 1 in the driver? This thought was prompted by one of Alin's suggestions in a thread about InputStream and large objects.
I think the original idea of the large fetch size was to free up the Tds object as quickly as possible, to reduce the need to create new connections. Now, if another statement needs to use the connection, the TdsSocket class will cache the result set data anyway. If the connection can be dedicated to the current select then, where long rows are returned, the driver's memory usage will be reduced considerably.
As far as I can see, larger fetch sizes are now only required for cursor result sets when the user wants to reduce the number of round trips, but I think at present (correct me if I am wrong, Alin) only one row can be fetched at a time. Even if the server returns more than one row per fetch, I am not convinced that we ever need more than one PacketRowResult object.
Interestingly, if only one PacketRowResult object is ever required at a time, this offers up the prospect of simplifying the support for large objects under certain circumstances. The existing TdsResultSet class can also be simplified by removing the logic connected with maintaining the row cache.
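To put rough numbers on the memory argument (the row size here is entirely assumed, purely for illustration), the cache footprint is simply rows cached times average row size:

```java
// Back-of-the-envelope illustration only: how much data the driver's row
// cache holds at the default fetch size versus a fetch size of 1.
// The 8 KB row size is an assumption, not a jTDS figure.
class FetchCacheFootprint {
    // bytes held in the row cache = rows cached * average row size
    static long cacheBytes(int fetchSize, long avgRowBytes) {
        return (long) fetchSize * avgRowBytes;
    }

    public static void main(String[] args) {
        long rowBytes = 8 * 1024; // assume 8 KB rows (e.g. wide varchar/image columns)
        System.out.println(cacheBytes(100, rowBytes)); // current default: 819200 bytes cached
        System.out.println(cacheBytes(1, rowBytes));   // proposed default: 8192 bytes cached
    }
}
```

With wide rows the difference is two orders of magnitude per open result set, which is where the "reduced considerably" claim comes from.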
Mike.
Mike,
The idea scared me a bit at first, but come to think of it, it's not so bad. ;o) Indeed we now have two caches, one in ResultSet and one in TdsSocket.
On the other hand, we will probably need this functionality in CursorResultSet if we want to improve the performance of cursor-based ResultSets. The problem is, I'm not entirely sure how sp_cursorfetch works with fetches larger than 1 (i.e. I'm not sure where the cursor ends up positioned; I've seen some really weird behavior when forward and backward fetches are combined).
All in all, simplifying the logic in ResultSet seems like a good thing to do, especially since caching there did not affect performance in any way.
Alin.
Alin,
I don't know if this helps with regard to the cursor issues, but I think with ODBC, if the fetch size (the rowset size in ODBC terms) is set to, say, 10, each fetch will fetch the next 10 rows after the last, a fetch prior will move back to the 10 rows before the first, and a fetch absolute will fetch 10 rows starting from the specified row.
To update or delete in ODBC, you specify the row within the rowset with SQLSetPos(row number in rowset), i.e. 1 to 10 in my example.
Looking at your code in CursorResultSet, I would speculate that you set the row number parameter of the sp_cursor call to the row to update or delete within the rowset (again 1 to 10 in my example). If you allow the user to fetch multiple rows, the next() method will need to invoke an actual fetch whenever the user reaches the end of the current rowset, rather than fetching on each call as now. Going backwards is a challenge, but one could fetch prior and then read forward through the result set until the row before the one you were on.
If fetch sizes > 1 are implemented this way, there is no need to actually cache the rows explicitly in the CursorResultSet, as one is always reading forward through a result set.
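The next() behaviour described above can be sketched as follows. This is not jTDS code: the server round trip (sp_cursorfetch) is simulated with a plain Iterator, and all names are invented for illustration.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Sketch of block-buffered next(): keep a rowset of up to fetchSize rows
// and only go back to the "server" when the current rowset is exhausted.
class BlockFetchCursor {
    private final Iterator<String> server;   // stands in for sp_cursorfetch round trips
    private final int fetchSize;
    private final List<String> rowSet = new ArrayList<>();
    private int pos = 0;                     // position within the current rowset
    int fetchCount = 0;                      // how many simulated round trips were made

    BlockFetchCursor(Iterator<String> server, int fetchSize) {
        this.server = server;
        this.fetchSize = fetchSize;
    }

    /** Advance one row, fetching a new rowset only when the current one is used up. */
    String next() {
        if (pos == rowSet.size()) {          // current rowset exhausted
            rowSet.clear();
            pos = 0;
            fetchCount++;                    // one fetch retrieves up to fetchSize rows
            for (int i = 0; i < fetchSize && server.hasNext(); i++) {
                rowSet.add(server.next());
            }
        }
        return pos < rowSet.size() ? rowSet.get(pos++) : null;
    }
}
```

Reading 10 rows with a fetch size of 5 costs two round trips instead of ten, which is exactly the saving larger fetch sizes are meant to buy.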
In ODBC it is possible to bind several rows to arrays of program variables and populate them all with one fetch. This is not supported by JDBC, so maybe you could restrict fetch sizes > 1 to forward-only result sets, where greater read efficiency is required. This seems like a reasonable compromise to me.
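The compromise could look something like this. Pure policy logic, with the method name and shape invented for illustration; only the ResultSet type constants are standard JDBC.

```java
import java.sql.ResultSet;

// Sketch of the suggested compromise: honour fetch sizes > 1 only for
// forward-only result sets, and fall back to single-row fetches for
// scrollable ones, where cursor positioning gets tricky.
class FetchSizePolicy {
    static int effectiveFetchSize(int requested, int resultSetType) {
        if (requested < 1) {
            requested = 1;                   // never fetch fewer than one row
        }
        // Scrollable cursors fetch one row at a time to keep positioning simple.
        if (resultSetType != ResultSet.TYPE_FORWARD_ONLY) {
            return 1;
        }
        return requested;
    }
}
```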
Mike.
Mike,
Thanks for the ODBC stuff.
However, I think you are missing something here. We will in fact need to cache rows inside the CursorResultSet, because all CursorResultSets use the main Tds, which means that an operation on another cursor will drop all the rows in the TdsSocket cache. We could work around this by having CursorResultSets lock their own Tds instances (just like a Statement does), but I don't know whether that would just add overhead.
Alin.
Alin,
I was thinking that the CursorResultSet could just hang on to the Tds until the currently fetched data is exhausted. A new fetch or position command could just flush the unread data. An update is a bit trickier, as it would temporarily require an additional Tds object, with the residual unread data being cached by TdsSocket. Alternatively, you could just restrict the fetch size to 1 for updatable cursors.
My point is that with a single connection you no longer have to reserve a special Tds for cursor operations, so each CursorResultSet could have a Tds of its own, in the same way as a TdsResultSet.
Mike.
I understand now. It's just that in the current implementation, with a main Tds, it wouldn't work. I think not restricting the fetch size to 1 would give us some important performance benefits (if you look at one of the benchmarks, the i-net one I believe, you'll see that scrollable ResultSets are one of the weak points of jTDS).
Alin.
You have to change the default in the source; it is 100 and can't be changed as a URL parameter. You can limit it on the statement, and possibly your pool implementation can be configured to set that on new statements.
https://github.com/seeburger-ag/jTDS/blob/jTDS_1.3_(stable)-seeburger/src/main/net/sourceforge/jtds/jdbc/JtdsStatement.java#L87
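Until the default changes in the source, a pool could apply the statement-level limit itself. A hypothetical sketch using a dynamic proxy; the wrapper class and its name are invented here and are not part of jTDS, but `Statement.setFetchSize` is standard JDBC.

```java
import java.lang.reflect.Proxy;
import java.sql.Connection;
import java.sql.Statement;

// Hypothetical pool-side helper: wraps a Connection so that every
// Statement it hands out has a chosen fetch size applied, working around
// the fact that the driver default cannot be set via the URL.
class FetchSizeConnectionWrapper {
    static Connection wrap(Connection real, int fetchSize) {
        return (Connection) Proxy.newProxyInstance(
                Connection.class.getClassLoader(),
                new Class<?>[] { Connection.class },
                (proxy, method, args) -> {
                    Object result = method.invoke(real, args);
                    // Apply the pool's fetch size to anything statement-like
                    // the connection creates (Statement, PreparedStatement, ...).
                    if (result instanceof Statement) {
                        ((Statement) result).setFetchSize(fetchSize);
                    }
                    return result;
                });
    }
}
```

A pool would hand out `FetchSizeConnectionWrapper.wrap(conn, 1)` instead of the raw connection, so application code never sees the 100-row default.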