From: Norm P. <npi...@da...> - 2012-09-21 17:10:34
Attachments:
ROX-Filer.diff
|
Attached is a patch to provide support in ROX-Filer for the application/x-zerosize MIME type. The changes made to the code are mostly just a subset of the changes in a patch submitted to the xdgmime code base by David Faure. (See "xdg/xdgmime: Implement text vs binary fallback; support for application/x-zerosize" at http://cgit.freedesktop.org/xdg/xdgmime/commit/?id=5181175d5fdaa3832b0fd094cda0120b1fe92af6 ) My only contributions were to make a small change in the diritem_restat() function so empty files are no longer assumed to be text/plain, and to make a change to the xdg_mime_get_mime_type_for_file() function identical to a change Faure made to the _xdg_mime_cache_get_mime_type_for_file() function; this fixes an omission in his patch so that this will still work even if there is no support for the mime.cache file. Note that David Faure's patch also adds support for guessing whether a file is binary or text if no other method has been able to identify the MIME type. ROX-Filer has had support for that since 2005 (with a short vacation from 2007-08-18 to 2008-01-21 when the code had been mistakenly removed). I have retained the ROX-Filer code for that support and not implemented David Faure's related changes. Related reading: Some discussion of Faure's patch is available here (although mostly about the part of his patch not used in my patch): "Re: Questions regarding the shared mime spec" http://old.nabble.com/Questions-regarding-the-shared-mime-spec-td32227868.html#a32563181 Bug report regarding omission in Faure's patch: "Bug 55120 - application/x-zerosize isn't supported when not using cache." https://bugs.freedesktop.org/show_bug.cgi?id=55120 |
From: Thomas L. <ta...@gm...> - 2012-09-22 11:25:34
|
On 21 September 2012 16:22, Norm Pierce <npi...@da...> wrote: > Attached is a patch to provide support in ROX-Filer for the > application/x-zerosize MIME type. > > The changes made to the code are mostly just a subset of the changes in a > patch submitted to the xdgmime code base by David Faure. (See > "xdg/xdgmime: Implement text vs binary fallback; support for > application/x-zerosize" at > http://cgit.freedesktop.org/xdg/xdgmime/commit/?id=5181175d5fdaa3832b0fd094cda0120b1fe92af6 > ) My only contributions were to make a small change in the > diritem_restat() function so empty files are no longer assumed to be > text/plain, and to make a change to the xdg_mime_get_mime_type_for_file() > function identical to a change Faure made to the > _xdg_mime_cache_get_mime_type_for_file() function; this fixes an omission > in his patch so that this will still work even if there is no support for > the mime.cache file. > > Note that David Faure's patch also adds support for guessing whether a > file is binary or text if no other method has been able to identify the > MIME type. ROX-Filer has had support for that since 2005 (with a short > vacation from 2007-08-18 to 2008-01-21 when the code had been mistakenly > removed). I have retained the ROX-Filer code for that support and not > implemented David Faure's related changes. Well, I don't think we should diverge too far from upstream, so I've merged in the latest code from there and then applied your patches on top. Are you going to upstream the zero-length fix? If so, you might want to send them this use-after-free fix at the same time: http://repo.or.cz/w/rox-filer.git/commitdiff/97f32448e491de3a824ce7e3723d3db9a80df12c I haven't looked at why caching is off. Should we enable it? -- Dr Thomas Leonard http://0install.net/ GPG: 9242 9807 C985 3C07 44A6 8B9A AE07 8280 59A5 3CC1 GPG: DA98 25AE CAD0 8975 7CDA BD8E 0713 3F96 CA74 D8BA |
From: Norm P. <npi...@da...> - 2012-09-27 23:59:51
|
On Sat, 22 Sep 2012, Thomas Leonard wrote: > . . . > Are you going to upstream the zero-length fix? > I reported that bug upstream last week, and suggested the fix. See the last URL in my previous post. > . . . > I haven't looked at why caching is off. Should we enable it? > Last week I took a look into this, to see if it would be worth trying to get ROX-Filer working with mime.cache. I got it going (with the old xdgmime code that was still in ROX-Filer last week) and did some time trials, to see if the response improved significantly. Although there was some improvement, my conclusion was that it was so minor that it did not warrant the effort required to get it going, risk subjecting users to any new bugs that might be introduced, fix those bugs, and maintain the two sets of functions (one that is used with cache, and one that is used without). My opinion: In theory, cache is wonderful, and worthwhile when speed is improved significantly, but in reality the level of complication introduced, and the problems that can arise, make cache a detriment if the speed improvement is only minor. But I'll give you the results of my investigation, so you can decide for yourself if the speed improvement is worth it. To get ROX-Filer to use the mime.cache file (using the old xdgmime code that was still in ROX-Filer last week), I ran into three issues: 1. xdgmimecache.c needs to be compiled with HAVE_MMAP or _xdg_mime_cache_new_from_file() does nothing but return NULL. 2. The old xdgmime code supported only mime.cache file format version 1.0, but current version is 1.2. 3. Text/binary guessing didn't work with cache because _rox_buffer_looks_like_text() is called from _xdg_mime_magic_lookup_data(), which is called from xdg_mime_get_mime_type_for_data() but not from _xdg_mime_cache_get_mime_type_for_data(). The new xdgmime code should take care of issue #2; the code looks like it should now support version 1.1 or 1.2. The new xdgmime code should also take care of issue #3, assuming that David Faure's 2011 patch is working properly. (Do note, however, that Faure's code has only a simple test for unexpected binary bytes. Your _rox_buffer_looks_like_text() function looked to be doing a more extensive test, which is why I chose to retain it, rather than replacing it with Faure's code.) Anyway, I got ROX-Filer working using a version 1.0 mime.cache file, and proceeded to run some tests. I temporarily modified dir.c so that it prints the time it takes to process dir->recheck_list, which is the closest I can get to timing just the processing done to work through all of the files in one directory. Then, with no instances of ROX-Filer running, I started it from the command line and gave one directory as an argument. I did this multiple times, using three different directories as test subjects. One directory held 2000 files which had neither globs nor magic that matched anything in the shared-mime-info database. I figured that would be a worst-case scenario, since the code would have to work its way through all of the globs and all of the magic signatures for each of the 2000 files. Another directory was more of a real-world case. It was just my /tmp/ directory which contained 124 items, 15 of which were directories, 2 were pipes, and 1 was a socket, leaving 106 actual files of various shapes and sizes (including two symlinks to files). The third contained 2000 .txt files -- actually the same directory as the first, but I had renamed the files by adding the .txt extension. This was to see how fast it was when it actually found a matching glob. Here is a summary of the results (times are in milliseconds) - - - directories: 1: 2000 files, no globs no magic 2: 124 items (106 files) various 3: 2000 .txt files directory first run subsequent scans --------- -------------------------- -------------------------- no cache cache saved % no cache cache saved % -------- ----- ----- -- -------- ----- ----- -- 1: 570 419 151 26 540 367 173 32 2: 92 86 6 7 36 25 11 31 3: 126 123 3 2 78 70 8 10 The four columns under "first run" are times (and percent improvement) when starting from the command line. The four columns under "subsequent scans" are times (and percent improvement) when clicking the rescan button. Each time listed in the table is an average of at least nine (and usually ten or more) samplings. These times are elapsed "real time". I also have numbers for user and sys times, but haven't summarized them. (The resolution for those numbers was apparently only in centiseconds, so not as useful for testing on a small sample size.) As you can see, even with the "worst-case" scenario with 2000 files that had no glob or magic match, the total savings was less than 2/10 of a second. Using the "real world" test on a smaller directory of assorted stuff only saved about 1/100 of a second. And the large directory of 2000 files that matched the .txt glob saved even less than that. Certainly saving 2/10 of a second in a loop that is repeated many times would be a big savings, but for something that only happens once when the user clicks a button, it is not a lot, especially since waiting for this processing is just a small fraction of the total time the user waits for the directory to be displayed. (Possibly this doesn't save any time at all if this processing happens, finishes, and then waits while another process, like maybe X, is busy, but I certainly don't know the code well enough to know if that is the case.) Anyway, bottom line: no, I don't think xdgmime cache should be enabled. Thanks for adding the application/x-zerosize support. Norm Pierce |
From: Thomas L. <ta...@gm...> - 2012-09-28 07:18:33
|
On 28 September 2012 00:59, Norm Pierce <npi...@da...> wrote: [...] >> I haven't looked at why caching is off. Should we enable it? > > Last week I took a look into this, to see if it would be worth trying to > get ROX-Filer working with mime.cache. > > I got it going (with the old xdgmime code that was still in ROX-Filer last > week) and did some time trials, to see if the response improved > significantly. Although there was some improvement, my conclusion was > that it was so minor that it did not warrant the effort required to get it > going, risk subjecting users to any new bugs that might be introduced, fix > those bugs, and maintain the two sets of functions (one that is used with > cache, and one that is used without). > > My opinion: In theory, cache is wonderful, and worthwhile when speed is > improved significantly, but in reality the level of complication > introduced, and the problems that can arise, make cache a detriment if the > speed improvement is only minor. Thanks for the detailed report - sounds like keeping it off is indeed the right thing to do. -- Dr Thomas Leonard http://0install.net/ GPG: 9242 9807 C985 3C07 44A6 8B9A AE07 8280 59A5 3CC1 GPG: DA98 25AE CAD0 8975 7CDA BD8E 0713 3F96 CA74 D8BA |