[bugs:#320] memory leak
Status: open
Group: djview
Created: Wed Jul 08, 2020 07:17 AM UTC by Janusz
Owner: nobody
Cf. https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=964506.

I've seen that, but the valgrind log is unfortunately useless. The lost
memory blocks are just known things. I suspect that the wasted memory
is still referenced in the cache, but not counted towards the maximum
cache size. This is going to be tricky and I lack the time to do this
well...
Leon
On 8/17/20 2:30 AM, Janusz wrote:
I attach the valgrind output for reading 100 pages one by one.
After browsing the first 100 pages one by one with
djview4 https://djvu.szukajwslownikach.uw.edu.pl/linde-t/01/index.djvu&
'free' shows the increase of memory used from 2209292 to 2973452, after
the next 100 pages to 3561568.
Can you reproduce this?
Regards
Janusz
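A minimal sketch of the failure mode suspected above, using a hypothetical refcounted page cache rather than DjVuLibre's actual classes: evicting an entry only shrinks the cache's own accounting, so pages still referenced elsewhere stay allocated while the cache reports itself under budget.

#include <cstddef>
#include <cstdio>
#include <list>
#include <memory>

struct Page { std::size_t bytes; };

class PageCache {
  std::list<std::shared_ptr<Page>> items;
  std::size_t used = 0, limit;
public:
  explicit PageCache(std::size_t l) : limit(l) {}
  void add(const std::shared_ptr<Page> &p) {
    items.push_back(p);
    used += p->bytes;
    while (used > limit) {            // evict oldest entries
      used -= items.front()->bytes;   // drops the cache's accounting...
      items.pop_front();              // ...but frees memory only if this
    }                                 // was the last reference
  }
  std::size_t accounted() const { return used; }
};

int main() {
  PageCache cache(1000000);                 // 1MB budget
  std::list<std::shared_ptr<Page>> stray;   // long-lived references
  for (int i = 0; i < 100; ++i) {
    auto p = std::make_shared<Page>(Page{500000});  // ~500KB per page
    stray.push_back(p);  // e.g. decoded text kept around for searching
    cache.add(p);
  }
  // The cache claims to hold at most 1MB, yet ~50MB of pages are alive.
  std::printf("cache accounts for %zu bytes\n", cache.accounted());
  return 0;
}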
I ran the valgrind "massif" tool to study the allocations while browsing
50 pages (25% magnification, continuous side-by-side, scroll until
reaching 50 pages). No memory leak per se. But I was surprised to see
that the decoded hidden text (which is kept around in order to search
quickly) can take about 500KB per page in this document. For 100 pages,
that's 50MB. This explains pretty much all the allocation growth that I
can see with massif. But that is far from enough to exhaust all your
computer memory in 200 pages....
Any additional hint about better ways to reproduce the problem you describe?
Leon
On 8/18/20 12:37 AM, Janusz wrote:
Is there a way I can help? My qualifications are very limited, but
after retiring, time is not a problem.
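A back-of-envelope check of the ~500KB/page figure above, assuming the hidden-text layer stores roughly one zone per character; the per-page character count and per-zone cost are illustrative guesses, not measurements.

#include <cstddef>
#include <cstdio>

int main() {
  const std::size_t charsPerPage = 3000;  // dense dictionary page (guess)
  const std::size_t bytesPerZone = 170;   // rectangle + text + overhead (guess)
  std::printf("~%zu KB per page\n", charsPerPage * bytesPerZone / 1024);
  std::printf("~%zu MB per 100 pages\n",
              100 * charsPerPage * bytesPerZone / (1024 * 1024));
  return 0;
}

This prints roughly 498KB per page and 48MB per 100 pages, consistent with the growth seen under massif.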
Thanks for your effort. The short answer to your question is unfortunately negative, but just in case I add some comments.
The problem was first noticed in our djview4poliqarp, so the culprit is somewhere in the shared code. The document in question is a dictionary with large, dense pages scanned at 600 DPI, so perhaps this explains the size of the decoded page. I practically never use continuous side-by-side; I browse with PageDown with zoom "wide" and "page". I was quite happy to be able to reproduce it in djview4 by just browsing page by page, as I was unable to reproduce it in djview4poliqarp. Any suggestions on how to try to reproduce it in some other way?
I had an impression, which I've not verified (I had no idea which tool to use), that the leak is stopped, or more probably restarted, by some actions like saving a page fragment. Do you want me to make more such experiments?
Massif visualisation suggested by Joachim Aleszkiewicz.
Attachment: massif.pdf (71.4 kB; application/pdf)
I did not use this fancy visualization but the simple ms_print ascii tool.
Regardless of my efforts, I did not succeed in getting this kind of explosive memory consumption in plain djview.
I tried using continuous, fit width, and go over 50 pages using the space key. No such effect….
In the context of djview4poliqarp, the key thing to check is to make sure that one does not keep around GP<DjVuFile> or GP<DjVuImage> for the previously visualized pages. One way to check this is to compile with -DDEBUGLVL=1 and to look for the DjVuFile destruction message "DjVuFile::~DjVuFile(): destroying...\n"
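A sketch of the kind of check meant here; PageView and showPage are hypothetical stand-ins for the viewer's own widget code, while GP<>, DjVuDocument, DjVuImage, and DjVuDocument::get_page are DjVuLibre names.

// Hypothetical viewer-side code; only the DjVuLibre types are real.
#include "DjVuDocument.h"
#include "DjVuImage.h"
#include "GSmartPointer.h"

class PageView {
  GP<DjVuImage> current;  // the only long-lived page reference we keep
public:
  void showPage(const GP<DjVuDocument> &doc, int pageno) {
    // Release the previous page before fetching the next one. If a
    // stray GP<DjVuFile> or GP<DjVuImage> survives this point, the
    // "DjVuFile::~DjVuFile(): destroying..." message (visible with
    // -DDEBUGLVL=1) never appears for that page, and its decoded
    // data stays pinned in memory.
    current = 0;
    current = doc->get_page(pageno);
    // ... paint from 'current' ...
  }
};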
Hi Janusz,
Here is my massif visualisation when going over the first 50 pages of
your document (fit width, pressing space bar). It shows some growing
memory consumption because we record all the text (allocated in
new_block). But that's nothing like yours. I wonder what's going on.
Leon
I reproduced the problem on Windows 10; however, instead of a crash I got a nice error message "Out of memory. Cannot decode page 290", repeated for page 291. Perhaps the message could be supplemented with some additional useful information?
I reproduced the problem on another installation of Windows 10, on a computer with more RAM. I got "Out of memory" for page 284 despite the fact that there were still several GB of memory free! Moreover, I had twice increased both the pixel cache and the decoded page cache. So instead of, or besides, the memory leak, we have a bug which should be easier to diagnose. What about extending the "out of memory" message with information about what kind of memory is insufficient?
Last edit: Janusz 2020-08-27
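On extending the message: a minimal sketch of a more informative report, assuming the failure surfaces as std::bad_alloc; decodeWithDiagnostics, decodePage, and the cache-size arguments are hypothetical, not djview APIs (the real error path would more likely go through DjVuLibre's GException).

#include <cstddef>
#include <cstdio>
#include <new>
#include <string>

void decodePage(int) { throw std::bad_alloc(); }  // stub for the real decoder

std::string decodeWithDiagnostics(int pageno,
                                  std::size_t pixelCacheBytes,
                                  std::size_t decodedCacheBytes) {
  try {
    decodePage(pageno);
    return {};
  } catch (const std::bad_alloc &) {
    // Report which page failed and what the caches were allowed to hold,
    // so "out of memory" at 50% RAM usage becomes diagnosable.
    char buf[160];
    std::snprintf(buf, sizeof(buf),
                  "Out of memory. Cannot decode page %d (pixel cache "
                  "limit: %zu bytes, decoded page cache limit: %zu bytes).",
                  pageno, pixelCacheBytes, decodedCacheBytes);
    return buf;
  }
}

int main() {
  std::printf("%s\n",
              decodeWithDiagnostics(284, std::size_t(10) << 20,
                                    std::size_t(10) << 20).c_str());
  return 0;
}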
Please let me know exactly which release you're using (which installer).
Hoping that the new one fixes the problem (which I otherwise cannot see).
I am now trying to go over your file page by page, and this seems a lot more reasonable.
I find that having the text at the character level eats a lot more memory than I would like, but I am reaching page 289 with 289MB of memory usage...
I got "out of memory" error for the page 282 on a 4GB RAM laptop, earlier it was for 200 (on the same computer). So the is a small improvement, but I still don't understand the problem. Why the memory demand is not constant for displaying pages one after one? Evidently some memory is not released.
I reproduced the problem from my post of 2020-08-27, but this time there is just a crash instead of the "out of memory" error.
What is interesting, and probably easier to diagnose: the program crashes several seconds after displaying the first page of https://djvu.szukajwslownikach.uw.edu.pl/linde/index.djvu (while downloading the subsequent pages to the cache?); I was unable to reproduce this on Linux.
These experiments were made on an 8GB RAM desktop (Windows 10), and the monitor showed that only about half of it was used.
As for the original problem with https://djvu.szukajwslownikach.uw.edu.pl/linde-t/01/, perhaps the character-level segmentation is the culprit? Recently I browsed several other large documents (about 700 pages each) page after page, and the memory footprint was constant, as it should be.
Last edit: Janusz 2020-11-21
This is getting very strange.
I went over the first 300 pages of https://djvu.szukajwslownikach.uw.edu.pl/linde-t/01/ on Windows 10 without problems. The max memory was about 300MB because of the character-level text. But all was fine.