Does the cache database table have a purpose . . .
. . . or does it simply complicate the code?
The mechanism has been around for a long time. Its purpose was to speed up the recall of potentially long lists of WIKINDX elements (e.g., keywords or creators). It was implemented perhaps 15 years ago or more and there have been multiple optimisations and speed improvements since then. The code is more complex because every addition, edit, and deletion of, for example, a keyword, requires the cache to be rewritten.
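To make the trade-off concrete, here is a minimal sketch of the pattern described above: a cached, pre-formatted list that every write must invalidate. It is not WIKINDX code (WIKINDX is written in PHP). The table names, column names, and function names are all hypothetical, and an in-memory SQLite database stands in for the real one. The sketch deletes the cache row on write and rebuilds it on the next read, which is one possible way of "rewriting" the cache.

```python
import json
import sqlite3


def open_db():
    """Create an in-memory database with a creator table and a cache table (hypothetical schema)."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE creator (id INTEGER PRIMARY KEY, surname TEXT, firstname TEXT)")
    db.execute("CREATE TABLE cache (name TEXT PRIMARY KEY, payload TEXT)")
    return db


def build_creator_list(db):
    """Slow path: read every creator and format it for display in a select box."""
    rows = db.execute("SELECT id, surname, firstname FROM creator ORDER BY surname, firstname")
    return [[cid, f"{surname}, {firstname}"] for cid, surname, firstname in rows]


def get_creator_list(db):
    """Return the cached list if one exists; otherwise rebuild it and store it."""
    row = db.execute("SELECT payload FROM cache WHERE name = 'creators'").fetchone()
    if row is not None:
        return json.loads(row[0])
    creators = build_creator_list(db)
    db.execute("INSERT INTO cache (name, payload) VALUES ('creators', ?)", (json.dumps(creators),))
    return creators


def invalidate_creator_cache(db):
    """Drop the cached list so the next read rebuilds it."""
    db.execute("DELETE FROM cache WHERE name = 'creators'")


def add_creator(db, surname, firstname):
    """Every write path has to remember to invalidate: this is the added complexity."""
    db.execute("INSERT INTO creator (surname, firstname) VALUES (?, ?)", (surname, firstname))
    invalidate_creator_cache(db)


def delete_creator(db, creator_id):
    """Deletion needs the same invalidation call as addition."""
    db.execute("DELETE FROM creator WHERE id = ?", (creator_id,))
    invalidate_creator_cache(db)
```

Without the cache table, `get_creator_list` would reduce to `build_creator_list`, and the invalidation calls in every add, edit, and delete path would disappear.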
I'm not convinced that it is still necessary: the lists could always be fetched directly from the appropriate database.
Mark
There's only one way to know: measure before and after the modification. Even a small slowdown is acceptable if it removes a lot of complexity.
It once made a significant difference for very long lists. Now it no longer does. I think it should go in the next full release.
Agreed.
Just some information.
100 iterations . . .
Using cache (12260 creators):
PHP execution time: 0.38562 s
SQL execution time: 0.06524 s
TPL rendering time: 0.01705 s
Total elapsed time: 0.46791 s
Peak memory usage: 16.0322 MB
Memory at close: 13.0279 MB
Database queries: 135
Bypassing cache (12260 creators):
PHP execution time: 5.38126 s
SQL execution time: 4.70736 s
TPL rendering time: 0.02068 s
Total elapsed time: 10.10930 s
Peak memory usage: 16.0316 MB
Memory at close: 13.0273 MB
Database queries: 235
These figures are for creators. What is stored in the cache is the list of creators already processed for display in a select box; there's a fair bit of tidying up, hence the significant difference. A list of keywords does not show such a difference.
On the other hand, a full list of formatted creator names is not widely used and, in any case, with just one iteration on this number of creators the difference is down to 100ths of a second on my system.
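The per-iteration cost can be worked out from the figures posted above. This small sketch assumes the reported totals are cumulative over the 100 iterations. The query counts support that reading, since 235 - 135 = 100 means one extra query per iteration. The variable and function names are illustrative only.

```python
ITERATIONS = 100

# Total elapsed times reported above, in seconds, for 12260 creators.
CACHED_TOTAL = 0.46791
BYPASS_TOTAL = 10.10930


def per_iteration(total_seconds, iterations=ITERATIONS):
    """Average time of a single iteration, assuming the total is cumulative."""
    return total_seconds / iterations


cached = per_iteration(CACHED_TOTAL)
bypass = per_iteration(BYPASS_TOTAL)
extra = bypass - cached  # additional cost of one uncached request

print(f"cached: {cached:.4f} s, bypass: {bypass:.4f} s, extra: {extra:.4f} s")
# prints "cached: 0.0047 s, bypass: 0.1011 s, extra: 0.0964 s"
```

On these numbers, a single uncached request costs a little under a tenth of a second more than a cached one.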
Thus, I still think it is worthwhile to remove the cache table: particularly to simplify the code and its maintenance.
Using a list of creators is a pathological case. As I explained in ticket [#457], I would like to create an HTML/JS selector that lets the user select a creator from a popup that shows a search (plus a creation form). There is no need for a cache in this case.
Such a widget has four beneficial effects:
On the other hand, the user is slowed down a little in his input by the popup/search system. It's worth it for information as rich as the creator.
For me this ticket depends on the completion of [#457].
Related
Bugs and feature requests : #457
OK. I'll hold back.
Mark
You can still remove the cache feature selectively, e.g., keep it for creators and publishers and remove it for the others.
As I remember, these are the two fields that slow down the large test database.
In my recollection it was the writing of the timestamps that was problematic and we've bypassed that now. Whatever we do, it should not be selective as we will never know on any database which fields (creator, keywords, publishers, collections) contain the most values.
On my system, getting a list of creators for editing one of them involves selecting the fields of over 12000 creators, which takes 0.05 seconds, and writing that amount to the cache, which also takes 0.05 seconds. These two operations take place rarely: only if there is no cache or if the cache needs rewriting (e.g., a creator is added/edited/deleted). If there is a cache, selecting it takes 0.001 seconds.
It all comes down to a judgement on whether the added complexity of having the cache is worth it for a saving (on my system) of about 0.1 seconds.
I originally questioned the need for a cache. I'm now beginning to think it is worthwhile.
Let's leave the decision for now.
I said this because I knew more about which fields are cached.
And yes, the cache system is unfortunately essential in the absence of a better design for the selection of this data.
better design = [#457].