|
From: Fred T. <fr...@us...> - 2016-05-27 13:45:27
|
We do accept contributions and I often fix any issues or side-effects they introduce. Please implement the change as I proposed in the simplest possible way. The final implementation needs some setting to control the amount of statement caching (from default zero up to a given number) but you can leave it out for now as it touches lots of other unrelated source files. You can then send me a diff to review etc. SVN access comes after that. Regards Fred On Fri, May 27, 2016, at 12:11, Daniel Walsh wrote: > I made these changes against 2.3.4 , would I be able to get SVN access > to branch the project and implement a proposed addition ? > > Ideally i'd like to share my results with you and convince you of the > magnitude of the performance gain, so this feature can be re- > integrated into the main codeline. I'll run some test cases > implementing the cache as you proposed and look for side-effects in my > data, if all looks good i'll submit it to my fork for you to review ? > > Daniel > > On 27 May 2016 at 11:49, Fred Toussi > <fr...@us...> wrote: >> In recent versions of HSQLDB, Statement objects are not generally >> mutable and can be shared concurrently. The only mutable aspect is >> the setCursorPropertiesRequest() method that is checked for in >> StatementManager.compile(). Statements that are out-of-date are also >> rejected by the same method. >> >> Fred >> >> On Fri, May 27, 2016, at 11:26, Daniel Walsh wrote: >>> I initially added the Statement Cache to the Statement Manager >>> class, but I realized that sharing a Statement object concurrently >>> between many connections would mean mixing state between concurrent >>> users causing functional errors. To do that correctly I would need >>> to implement a global cache of statements that never get "used" , >>> but rather on retrieval get cloned , so their state isn't shared, >>> and then that Statement follows a regular lifecycle, being returned >>> to the connection, pooled as normally done, closed and destroyed as >>> usual. But the object in the cache would be retained, un-modified >>> after creation , used as a template so-to-speak. >>> >>> And that whole solution as described above seemed to be less >>> performant than just caching the non-modifiable aspect of the >>> Statement objects and let them be constructed natively as is >>> currently done. >>> >>> Am I correct in thinking that sharing statement objects across >>> connections would cause data inconsistencies? It appears that way. >>> >>> Daniel >>> >>> >>> On 27 May 2016 at 11:07, Fred Toussi <fr...@us...> >>> wrote: >>>> If you limit statement caching to PreparedStatement the best place >>>> for the cache would be the StatementManager. >>>> >>>> The StatementManager compile() method checks if the SQL and its >>>> Statement object already exists. The Statement is removed when it >>>> is closed. All Statements are cleared when the connection and the >>>> session are closed. So for your usage, you can put a non-static >>>> ConcurrentMap in org.hsqldb.Database and each time a statement is >>>> compiled, you add to the ConcurrentMap and each time the lookup in >>>> compile() does not find the SQL and its statement, you try >>>> retrieving from the Concurrent map and let the compile() method >>>> perform its other checks. Note there is a different map for each >>>> schema (same SQL compiled with a different current schema can >>>> target different tables). Therefore you need a map of schema name >>>> to ConcurrentMap to be precise. This method doesn't need changes to >>>> resetSession and works with or without a connection pool. >>>> >>>> Fred >>>> >>>> On Fri, May 27, 2016, at 09:49, Daniel Walsh wrote: >>>>> I'll investigate using a Connection Pool , i'm currently using the >>>>> apache BasicDataSource for connection retrieval. >>>>> >>>>> However if every time I retrieve from the pool if resetSession is >>>>> called doesn't that clear the statement manager as you indicate, >>>>> wiping out any performance gains of persisting compiled >>>>> PreparedStatements between separate usages of the connection? >>>>> Essentially I have many concurrent users hitting the same >>>>> statement structures with different values for ? , ? , ? >>>>> parameters. The statements are programatically generated, and >>>>> quite finite, between 100-5000 thousand possible statements >>>>> depending on the scale of the configuration. >>>>> >>>>> For this reason at the level of the class ParserDQL I added a very >>>>> basic hack ( since this class is called in a stateful way , where >>>>> fields are set on it - so i was a bit careful ) , I added a >>>>> private static final ConcurrentMap of SQL String -> StatementQuery >>>>> , on the basis that if the sql strings are identical their >>>>> compiled queries must also be identical, I couldnt think of a >>>>> scenario where they would need to be destroyed and recreated , if >>>>> the DDL changes then the statement would also need to change by >>>>> necessity, therefore the value in the cache would never get used >>>>> again .... so except for the glaring memory leak at this moment it >>>>> would work in principle. >>>>> >>>>> So when compileCursorSpecification is called on this class , from >>>>> digging through the code I knew the sqlString was defined at this >>>>> time, so i do a simple map lookup and return if found, or compile >>>>> , push to map , return if not. >>>>> I identified the StatementQuery as the only costly part, so I >>>>> cache that state-less element and cannot really see any drawbacks >>>>> like you mentioned regarding closing sessions etc. >>>>> >>>>> Thanks for discussing this matter in great depth btw , I'm a large >>>>> fan of HSQLDB and have used it in projects @ my company for >>>>> several years. >>>>> >>>>> Daniel >>>>> >>>>> >>>>> >>>>> On 26 May 2016 at 20:48, Fred Toussi <fr...@us...> >>>>> wrote: >>>>>> There was a global Prepared Statement cache in version 1.8. This >>>>>> was too complex as it had to keep track of closed sessions and >>>>>> statements closing on some sessions and not on others. The cache >>>>>> was moved to individual sessions in version 2.0. >>>>>> >>>>>> You can use a connection pool. There are two options. The >>>>>> org.hsqldb.jdbc.JDBCPool or a generic JDBC pool. >>>>>> >>>>>> The org.hsqldb.Session class has a resetSession() method that >>>>>> calls statementManager.reset(). The JDBCPool calls the >>>>>> resetSession() method. Therefore you need to modify either class >>>>>> to not reset the StatementManager. Third party pools may or may >>>>>> not be resetting the HSQLDB session upon reuse, so you need to >>>>>> check. >>>>>> >>>>>> Fred Toussi >>>>>> >>>>>> >>>>>> >>>>>> On Thu, May 26, 2016, at 19:41, Daniel Walsh wrote: >>>>>>> Hi Devs, >>>>>>> >>>>>>> I have a question I couldn't find an answer for online or by >>>>>>> investigating the HSQLDB code myself. >>>>>>> >>>>>>> Is there a reason currently in the implementation that the >>>>>>> Stateless aspect of a Prepared Statement when compiled isn't >>>>>>> stored in a global non-connection level cache ? >>>>>>> >>>>>>> I have a use case where connections are created and destroyed >>>>>>> regularly and each connection calls 100+ prepared statement >>>>>>> creation and executions during its lifecycle, in a test case >>>>>>> that encompasses many other factors I increased performances by >>>>>>> creating a naive low level cache of the compiled statement to be >>>>>>> used each time a new PreparedStatement with a matching sql >>>>>>> string was created. >>>>>>> >>>>>>> This had a massive benefit to performance because the statements >>>>>>> being compiled are quite complex. >>>>>>> >>>>>>> Is this a feature that hasn't been implemented on purpose ? Or >>>>>>> just an open gap ? >>>>>>> >>>>>>> Thanks, >>>>>>> Daniel Walsh >>>>>>> Software Dev >>>>>>> ---------------------------------------------------------------- >>>>>>> -------------- >>>>>>> Mobile security can be enabling, not merely restricting. >>>>>>> Employees who >>>>>>> bring their own devices (BYOD) to work are irked by the >>>>>>> imposition of MDM >>>>>>> restrictions. Mobile Device Manager Plus allows you to control >>>>>>> only the >>>>>>> apps on BYO-devices by containerizing them, leaving personal >>>>>>> data untouched! >>>>>>> https://ad.doubleclick.net/ddm/clk/304595813;131938128;j[1] >>>>>>> _________________________________________________ >>>>>>> hsqldb-developers mailing list >>>>>>> hsq...@li... >>>>>>> https://lists.sourceforge.net/lists/listinfo/hsqldb-developers >>>>>> >>>>>> ---------------------------------------------------------------- >>>>>> -------------- >>>>>> Mobile security can be enabling, not merely restricting. >>>>>> Employees who bring their own devices (BYOD) to work are irked >>>>>> by the imposition of MDM restrictions. Mobile Device Manager >>>>>> Plus allows you to control only the apps on BYO-devices by >>>>>> containerizing them, leaving personal data untouched! >>>>>> https://ad.doubleclick.net/ddm/clk/304595813;131938128;j >>>>>> _______________________________________________ >>>>>> hsqldb-developers mailing list hsqldb- >>>>>> dev...@li... >>>>>> https://lists.sourceforge.net/lists/listinfo/hsqldb-developers >>>>>> >>>>> ------------------------------------------------------------------ >>>>> ------------ >>>>> What NetFlow Analyzer can do for you? Monitors network bandwidth >>>>> and traffic >>>>> patterns at an interface-level. Reveals which users, apps, and >>>>> protocols are >>>>> consuming the most bandwidth. Provides multi-vendor support for >>>>> NetFlow, >>>>> J-Flow, sFlow and other flows. Make informed decisions using >>>>> capacity >>>>> planning reports. >>>>> https://ad.doubleclick.net/ddm/clk/305295220;132659582;e[2] >>>>> _________________________________________________ >>>>> hsqldb-developers mailing list >>>>> hsq...@li... >>>>> https://lists.sourceforge.net/lists/listinfo/hsqldb-developers >>>> >>>> ------------------------------------------------------------------ >>>> ------------ >>>> What NetFlow Analyzer can do for you? Monitors network bandwidth >>>> and traffic patterns at an interface-level. Reveals which users, >>>> apps, and protocols are consuming the most bandwidth. Provides multi- >>>> vendor support for NetFlow, J-Flow, sFlow and other flows. Make >>>> informed decisions using capacity planning reports. >>>> https://ad.doubleclick.net/ddm/clk/305295220;132659582;e >>>> _______________________________________________ >>>> hsqldb-developers mailing list hsqldb- >>>> dev...@li... >>>> https://lists.sourceforge.net/lists/listinfo/hsqldb-developers >>>> >>> -------------------------------------------------------------------- >>> ---------- >>> What NetFlow Analyzer can do for you? Monitors network bandwidth and >>> traffic >>> patterns at an interface-level. Reveals which users, apps, and >>> protocols are >>> consuming the most bandwidth. Provides multi-vendor support for >>> NetFlow, >>> J-Flow, sFlow and other flows. Make informed decisions using >>> capacity >>> planning reports. >>> https://ad.doubleclick.net/ddm/clk/305295220;132659582;e[3] >>> _________________________________________________ >>> hsqldb-developers mailing list >>> hsq...@li... >>> https://lists.sourceforge.net/lists/listinfo/hsqldb-developers >> >> -------------------------------------------------------------------- >> ---------- >> What NetFlow Analyzer can do for you? Monitors network bandwidth and >> traffic patterns at an interface-level. Reveals which users, apps, >> and protocols are consuming the most bandwidth. Provides multi- >> vendor support for NetFlow, J-Flow, sFlow and other flows. Make >> informed decisions using capacity planning reports. >> https://ad.doubleclick.net/ddm/clk/305295220;132659582;e >> _______________________________________________ >> hsqldb-developers mailing list hsqldb- >> dev...@li... >> https://lists.sourceforge.net/lists/listinfo/hsqldb-developers >> > ---------------------------------------------------------------------- > -------- > What NetFlow Analyzer can do for you? Monitors network bandwidth > and traffic > patterns at an interface-level. Reveals which users, apps, and > protocols are > consuming the most bandwidth. Provides multi-vendor support for > NetFlow, > J-Flow, sFlow and other flows. Make informed decisions using capacity > planning reports. > https://ad.doubleclick.net/ddm/clk/305295220;132659582;e[4] > _________________________________________________ > hsqldb-developers mailing list > hsq...@li... > https://lists.sourceforge.net/lists/listinfo/hsqldb-developers Links: 1. https://ad.doubleclick.net/ddm/clk/304595813%3B131938128%3Bj 2. https://ad.doubleclick.net/ddm/clk/305295220%3B132659582%3Be 3. https://ad.doubleclick.net/ddm/clk/305295220%3B132659582%3Be 4. https://ad.doubleclick.net/ddm/clk/305295220%3B132659582%3Be |