From: Michael D. <md...@st...> - 2013-07-31 14:59:21
On 07/31/2013 10:38 AM, Jeffrey Spencer wrote:
> Michael,
>
> Pdftocairo is a good tool to know so thanks for that tip.
>
> I still think it is a regression for the current 'stamp' method to be
> used in all cases. I understand that in a complicated figure with a
> bunch of subplots it would be beneficial and create smaller code. I
> don't see how, in single figures, it would often result in reduced
> file sizes.
The case where it has an enormous impact is when the same shape is used
multiple times, for example in a scatter, hexbin or pcolor plot.
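
A rough way to see why reuse matters is a toy byte-count model. This is
only a sketch: the function names and the byte costs (`define_overhead`,
`use_bytes`) are made-up assumptions for illustration, not measurements of
what the pdf backend actually writes.

```python
# Toy model of "stamping": define a path once, then reference it per use.
# All byte costs are illustrative assumptions, not matplotlib measurements.

def inline_size(path_bytes, uses):
    """Total bytes when the full path is written out at every use."""
    return path_bytes * uses


def stamped_size(path_bytes, uses, define_overhead=60, use_bytes=20):
    """Total bytes when the path is defined once and referenced per use."""
    return path_bytes + define_overhead + use_bytes * uses


def stamping_pays_off(path_bytes, uses):
    """True when the stamped form is strictly smaller than the inline form."""
    return stamped_size(path_bytes, uses) < inline_size(path_bytes, uses)


if __name__ == "__main__":
    # A 200-byte marker path used once, twice, and a thousand times.
    for uses in (1, 2, 1000):
        print(uses, inline_size(200, uses), stamped_size(200, uses),
              stamping_pays_off(200, uses))
```

With these assumed costs, a path used once is larger when stamped (only
the overhead is paid), while a marker reused a thousand times is far
smaller.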
> I usually output single figures with one plot, and I don't think any of
> the ones I am currently working on came out smaller in 1.4.x. They all
> resulted in reduced file sizes with mpl 1.1.1. This figure of 3d
> spheres resulted in 60kb instead of roughly 80kb after running
> pdftocairo. Anyway, you said in coming versions a threshold should be
> set before stamping of objects occurs so a fix is on the way eventually.
Yes, but it's too complex a fix to throw in quickly. I think the
overall benefit of stamping is preferable to not doing it at all at this
point.
Mike
>
> Thanks for all the help,
> Jeff
>
>
> On Wed, Jul 31, 2013 at 11:31 PM, Michael Droettboom <md...@st...> wrote:
>
> On 07/30/2013 04:20 PM, Jeffrey Spencer wrote:
>> Michael,
>>
>> Thanks, that is very informative. It answers most of the problems I
>> was having, and I read MEP14, which looks really useful.
>>
>> That being said does the ps backend subset the fonts or use
>> collections for drawing (is the collections feature global or
>> just in the pdf backend)?
>
> The ps backend has the same behavior as pdf on both counts. TTF
> fonts are subsetted, but the fonts that come from TeX come to us
> as Type1 fonts, which matplotlib currently does not know how to
> subset. It also handles collections in the same way (by creating
> a "stamp" and reusing it).
>
>
>> I usually use .eps output and convert to pdf using epstopdf
>> unless the figure has an alpha channel, because it always results in
>> a much smaller file (60kB roughly for this file or plain figure
>> around 10kB) than direct pdf output with the output looking the
>> same. I pretty much always have usetex=True so maybe the pdf file
>> is always embedding the full fonts.
>
> Yes, when usetex=True, matplotlib does not do any font subsetting
> (in any backend). To get around this limitation, one can use the
> `pdftocairo` tool (part of poppler utils), to convert from pdf to
> a pdf with subsetted fonts. With your example, I was able to get
> the pdf down to ~80k. With MEP14, we would basically move such
> functionality into matplotlib itself, but that's sort of a long
> term, semi-back-burner project so it could be a while.
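
For reference, a minimal sketch of scripting that conversion from Python.
The helper names are hypothetical; it assumes `pdftocairo` from poppler
utils is on the PATH and relies on its `-pdf` output option.

```python
import shutil
import subprocess


def pdftocairo_command(src, dst):
    """Build the argument list for a pdf-to-pdf conversion."""
    return ["pdftocairo", "-pdf", src, dst]


def resubset_pdf(src, dst):
    """Run pdftocairo on src, writing dst. Returns False if the tool is missing."""
    if shutil.which("pdftocairo") is None:
        return False
    # check=True raises CalledProcessError if the conversion fails.
    subprocess.run(pdftocairo_command(src, dst), check=True)
    return True
```

Comparing `os.path.getsize` on the input and output files afterwards shows
how much the conversion saved for a given figure.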
>
> It's possible that epstopdf is doing some font subsetting of its
> own. But as you point out, Postscript (as a specification)
> doesn't support alpha, so it's not useful when you need alpha.
>
>
>>
>> Also, does the Cairo backend support usetex=True or subsetting? I
>> know I had read it did not support usetex but that was maybe 2
>> years ago or so. The x,y,z axis look correct with cairo but the
>> IPA Fonts don't render properly. The legend font says it is size
>> 12 but if you zoom in extremely close you can see they are the
>> correct fonts, just way too small. The file size is around 60kB as
>> well so I am guessing it supports subsetting of fonts.
>
> Cairo does support font subsetting, but the matplotlib Cairo
> backend has no support for usetex. I'm surprised this worked for
> you at all. When I run your example with the Cairo backend, the
> IPA characters appear as raw TeX source code, i.e. "\textipa{i}",
> which is what I would expect given that the regular font renderer
> doesn't understand that syntax.
>
>
>>
>> The pgf backend would also subset fonts if output to .pdf I'm
>> assuming because that is the default with pdftex? It results in
>> similar size files to the .eps output for this file (roughly 60kB
>> also).
>
> Yes.
>
>
>>
>> The IPA font uses the package (\usepackage{tipa}) and therefore
>> that is why I think these look different. That package draws
>> these fonts with its own font libraries instead of whatever is
>> selected as the text font. Maybe I'm wrong about this but that is
>> my understanding because even in normal latex code the fonts look
>> different than the standard text.
>
> That is correct. The default font for usetex=True is Computer
> Modern, whereas it is Bitstream Vera Sans in the default font
> rendering. I was referring to the difference between 1.2 and 1.4
> which was using TeX fonts in both cases, but due to a bug in
> 1.3/1.4 was rendering the IPA in serif when you had requested
> sans-serif.
>
> Mike
>
>
>>
>> Cheers,
>> Jeff
>>
>>
>> On Wed, Jul 31, 2013 at 4:43 AM, Michael Droettboom <md...@st...> wrote:
>>
>> There are two different things going on here.
>>
>> Between 1.2.1 and now, there was a bugfix to the font
>> selection routine that inadvertently introduced a bug
>> selecting fonts in the usetex backend. You may notice that
>> on master, the IPA font selected is different. The file size
>> difference can be attributed to the slightly larger font size
>> of the one it selected vs. the one it should have. Note that
>> when usetex is True, the fonts are not subsetted, so you
>> always get the full font embedded in the file (MEP14 work
>> will fix this in the future).
>>
>> See b5c340 for the commit that introduced the bug, and
>> https://github.com/matplotlib/matplotlib/pull/2260 for the
>> fix (which should make it into 1.3.0 final).
>>
>> Between 1.1.1 and 1.2.1 a change was made in how collections
>> are handled. Previously, each path was redrawn individually.
>> In 1.2, if a path is reused multiple times, a "stamp" is
>> created and then it is "used" multiple times. In principle,
>> this generally reduces file sizes by a large amount.
>> However, in the case of this figure with the 3D spheres, each
>> path is used only once, so rather than getting the file size
>> savings of that approach, we only get the overhead. The
>> backend could be smarter by not doing this when the path is
>> only used a small number of times. Such a fix would be
>> welcome, but is probably too large/risky to try to get into
>> the current release cycle. It will have to wait for 1.3.1.
>>
>> Cheers,
>> Mike
>>
>>
>>
>> On 07/30/2013 12:24 PM, Jeffrey Spencer wrote:
>>> K, I have just made the script self-contained but it loads
>>> external data so I have attached that as well. If you want
>>> me to just separate out the plotting commands let me know. I
>>> have also attached my matplotlib rc file which is the same
>>> on all three systems. All the modifications to the
>>> matplotlibrc file are copied to the top and in the first 30
>>> lines or so.
>>>
>>> Of note, the smallest file sizes for pdf are using the pgf
>>> backend around 60kb. Not sure if that helps at all. It is
>>> also around the same size if I export to .eps and then
>>> convert to pdf. About 60kb. The problem with eps in these 3d
>>> figures though is that the back wall, I think, has an alpha channel,
>>> because it just becomes a solid wall in the output. No lines
>>> through it like the other two walls.
>>>
>>>
>>> On Tue, Jul 30, 2013 at 11:23 PM, Jouni K. Seppänen
>>> <jk...@ik...> wrote:
>>>
>>> Jeffrey Spencer <jef...@gm...> writes:
>>>
>>> > I have three different versions of matplotlib that all output
>>> > different file sizes, with matplotlib 1.1.1 providing the smallest.
>>> > This is for the same exact script. I can post the script if that helps.
>>> >
>>> > MPL 1.4.x: 539.32kb, Ubuntu 12.10
>>> > MPL 1.1.1: 172.56kb Ubuntu 12.10
>>> > MPL 1.2.1: 475.9kb, Ubuntu 13.04
>>>
>>> Yes, it would be interesting to know what the plotting commands are.
>>> Just as a guess, since all the sizes are a few hundred kilobytes, it
>>> could be a difference in e.g. font embedding - many TrueType fonts are
>>> of comparable size.
>>>
>>> --
>>> Jouni K. Seppänen
>>> http://www.iki.fi/jks
>>>
>>>
>>> ------------------------------------------------------------------------------
>>> Get your SQL database under version control now!
>>> Version control is standard for application code, but
>>> databases havent
>>> caught up. So what steps can you take to put your SQL
>>> databases under
>>> version control? Why should you start doing it? Read
>>> more to find out.
>>> http://pubads.g.doubleclick.net/gampad/clk?id=49501711&iu=/4140/ostg.clktrk
>>> _______________________________________________
>>> Matplotlib-users mailing list
>>> Mat...@li...
>>> <mailto:Mat...@li...>
>>> https://lists.sourceforge.net/lists/listinfo/matplotlib-users
>>>
>>>
>>>
>>>
>>
>>
>>
>>
>
>