
#674 Does the cache database table have a purpose . . .

Unknown
standby
None
6.8.0
Upkeep
Unknown
Unknown
Unknown
Development
2024-02-22
2024-02-18
No

. . . or does it simply complicate the code?

The mechanism has been around for a long time. Its purpose was to speed up the recall of potentially long lists of WIKINDX elements (e.g., keywords or creators). It was implemented perhaps 15 years ago or more and there have been multiple optimisations and speed improvements since then. The code is more complex because every addition, edit, and deletion of, for example, a keyword, requires the cache to be rewritten.

I'm not convinced that it is necessary any longer: the lists could always be fetched directly from the appropriate database.

Mark

Related

Bugs and feature requests : #682

Discussion

  • Mark Grimshaw

    Mark Grimshaw - 2024-02-18
    • Release cycle: Unknown --> Development
     
  • Stéphane Aulery

    There's only one way to know: measure before and after the modification. Even a small slowdown is acceptable if it removes a lot of complexity.
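A before/after measurement of this kind could be sketched as follows. This is a generic timing harness, not part of WIKINDX; the function names `time_calls` and `compare` are hypothetical, and the two callables stand in for "list via cache" and "list direct from the database".

```python
import time


def time_calls(func, iterations=100):
    """Return the total wall-clock time for `iterations` calls of `func`."""
    start = time.perf_counter()
    for _ in range(iterations):
        func()
    return time.perf_counter() - start


def compare(before, after, iterations=100):
    """Time both variants and report totals plus the per-call difference."""
    t_before = time_calls(before, iterations)
    t_after = time_calls(after, iterations)
    return {
        "before_s": t_before,
        "after_s": t_after,
        "per_call_delta_s": (t_after - t_before) / iterations,
    }
```

Running both variants for the same number of iterations on the same data keeps the comparison fair; the per-call delta is the figure to weigh against the complexity removed.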

     
    • Mark Grimshaw

      Mark Grimshaw - 2024-02-18

      It once made a significant difference for very long lists. Now it no longer does. I think it should go in the next full release.

       
      • Stéphane Aulery

        Agreed.

         
  • Mark Grimshaw

    Mark Grimshaw - 2024-02-21

    Just some information.

    100 iterations . . .

    Using cache (12260 creators):
    PHP execution time: 0.38562 s
    SQL execution time: 0.06524 s
    TPL rendering time: 0.01705 s
    Total elapsed time: 0.46791 s
    Peak memory usage: 16.0322 MB
    Memory at close: 13.0279 MB
    Database queries: 135

    Bypassing cache (12260 creators):
    PHP execution time: 5.38126 s
    SQL execution time: 4.70736 s
    TPL rendering time: 0.02068 s
    Total elapsed time: 10.10930 s
    Peak memory usage: 16.0316 MB
    Memory at close: 13.0273 MB
    Database queries: 235

    This is creators. What is stored in the cache represents a list of creators having been processed for display in a select box—there's a fair bit of tidying up hence the significant difference. A list of keywords does not present such a difference.

    On the other hand, a full list of formatted creator names is not widely used and, in any case, with just one iteration on this number of creators the difference is down to 100ths of a second on my system.

    Thus, I still think it is worthwhile to remove the cache table: particularly to simplify the code and its maintenance.
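The per-request cost implied by the figures above can be worked out directly. This sketch assumes the reported times are totals over the 100 iterations and that the cost is spread evenly across them; the variable names are illustrative only.

```python
ITERATIONS = 100

# Totals reported above for 12260 creators.
CACHED_TOTAL_S = 0.46791
BYPASS_TOTAL_S = 10.10930
CACHED_QUERIES = 135
BYPASS_QUERIES = 235

cached_per_iter = CACHED_TOTAL_S / ITERATIONS
bypass_per_iter = BYPASS_TOTAL_S / ITERATIONS

# Extra time and extra queries a single request pays without the cache.
extra_per_iter = bypass_per_iter - cached_per_iter
extra_queries_per_iter = (BYPASS_QUERIES - CACHED_QUERIES) / ITERATIONS

# Relative slowdown over the whole run.
slowdown = BYPASS_TOTAL_S / CACHED_TOTAL_S

print(f"cached:  {cached_per_iter:.4f} s per iteration")
print(f"bypass:  {bypass_per_iter:.4f} s per iteration")
print(f"extra:   {extra_per_iter:.4f} s and {extra_queries_per_iter:.0f} query per iteration")
print(f"slowdown factor: {slowdown:.1f}")
```

Under those assumptions, bypassing the cache costs roughly 0.096 s and one additional query per iteration, even though the run as a whole is about 21.6 times slower.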

     
    • Stéphane Aulery

      Using a list of creators is a pathological case. As I explained in ticket [#457], I would like to create an HTML/JS selector that lets the user pick a creator from a popup showing a search (plus a creation form). No cache is needed in that case.

      Such a widget has four beneficial effects:

      • Reduce the size of an HTML page
      • Speed up the construction of an HTML page
      • The search can be rich, i.e. cover more than the name of the linked data.
      • Allow the creation of linked data while editing or creating another.

      On the other hand, the popup/search system slows the user's input a little. That is worth it for information as rich as a creator.

      For me this ticket depends on the completion of [#457].

       

      Related

      Bugs and feature requests : #457

      • Mark Grimshaw

        Mark Grimshaw - 2024-02-21

        OK. I'll hold back.

        Mark

         
        • Stéphane Aulery

          You can still remove the cache feature selectively, e.g., keep it for creators and publishers and remove it for the others.

          I remember that these are the two fields that slow down the large test base.

           
          • Mark Grimshaw

            Mark Grimshaw - 2024-02-22

            In my recollection it was the writing of the timestamps that was problematic and we've bypassed that now. Whatever we do, it should not be selective as we will never know on any database which fields (creator, keywords, publishers, collections) contain the most values.

            On my system, getting a list of creators for editing one of them involves selecting the fields of over 12000 creators, which takes 0.05 seconds, and writing that amount to cache, which also takes 0.05 seconds. These two operations take place rarely: only if there is no cache or if the cache needs rewriting (e.g., a creator is added/edited/deleted). If there is a cache, selecting it takes 0.001 seconds.

            It all comes down to a judgement on whether the added complexity of having the cache is worth it for a saving (on my system) of about 0.1 seconds.

            I originally questioned the need for a cache. I'm now beginning to think it is worthwhile.

            Let's leave the decision for now.

             
            • Stéphane Aulery

              I said this because I knew more about which fields are cached.

              And yes, the cache system is unfortunately essential in the absence of a better design for the selection of this data.
              better design = [#457].

               


  • Stéphane Aulery

    • status: open --> standby
     
