From: Sebastian H. <ha...@ms...> - 2006-09-20 15:50:33
|
Robert Kern wrote:
>> This was not supposed to be a scientific statement -- I'm (again)
>> thinking of our students who do not always appreciate the full complexity
>> of computational numerics and data types and such.
>
> They need to appreciate the complexity of computational numerics if
> they are going to do numerical computation. Double precision does not
> make it any simpler.

This is where we differ.

> We haven't forgotten what newcomers will do; to the contrary, we are quite aware
> that new users need consistent behavior in order to learn how to use a system.
> Adding another special case in how dtypes implicitly convert to one
> another will impede new users being able to understand the whole system.

All I'm proposing can be summarized as: mean(), sum(), var() ... produce output of dtype float64 (except for float96 input, which produces float96).

A further comment: for these operations the input type/precision is almost unrelated to the resulting output precision -- the int case already makes that clear. (This is different for e.g. min() or max().) The proposed alternative implementations seem to need one or more multiplications (or divisions) per value -- this might be noticeably slower ...

Regards,
Sebastian |
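For readers following the proposal, here is a minimal sketch of the behaviour under discussion, assuming a recent NumPy in which the reductions accept an explicit dtype= accumulator (the array contents are arbitrary):

    import numpy as np

    a = np.arange(1000000, dtype=np.float32)   # e.g. large single-precision image data

    print(a.mean().dtype)                  # float32: accumulates in the input precision
    print(a.mean(dtype=np.float64).dtype)  # float64: higher-precision accumulator on request
    print(a.sum(dtype=np.float64).dtype)   # the same keyword works for sum() and var()

The proposal above would make the float64 form the default for float32 input; the dtype keyword remains the explicit override either way.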
From: Martin W. <mar...@gm...> - 2006-09-20 15:03:57
|
Hi list, I just stumbled across the NPY_WRITEABLE flag. Now I'd like to know whether there are ways, either from Python or C, to make an array temporarily immutable. Thanks, Martin. |
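One way to do this from Python with current NumPy is to toggle the array's WRITEABLE flag; a short sketch (the exception type raised on a blocked write has varied between NumPy versions, hence the broad except clause):

    import numpy as np

    a = np.arange(5)

    a.flags.writeable = False        # equivalent: a.setflags(write=False)
    try:
        a[0] = 99
    except (ValueError, RuntimeError) as exc:
        print("write blocked:", exc)

    a.flags.writeable = True         # make the array mutable again
    a[0] = 99

From C, the analogous step is clearing the NPY_WRITEABLE bit in the array's flags before handing the array out, and restoring it afterwards.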
From: Francesc A. <fa...@ca...> - 2006-09-20 14:34:51
|
On Wednesday 20 September 2006 12:48, Travis Oliphant wrote:
> Making sure you get the correct data-type is why there are NPY_INT32 and
> NPY_INT64 enumerated types. You can't code using NPY_LONG and expect
> it will give you the same sizes when moving from 32-bit and 64-bit
> platforms. That's a problem that has been fixed with the bitwidth
> types. I don't understand why you are using the enumerated types at all
> in this circumstance.

Oops. I didn't know that NPY_INT32 and NPY_INT64 were there. I think this solves all my problems. In fact, you were proposing this from the very beginning, but I was confused because I expected to find NPY_INT32 and NPY_INT64 in the NPY_TYPES enumeration and didn't find them there. I didn't realize that NPY_INT32 and NPY_INT64 were defined outside NPY_TYPES as platform-independent constants. Blame it on me.

Sorry for any inconvenience, and thanks once more for your patience!

--
>0,0<   Francesc Altet     http://www.carabos.com/
V   V   Cárabos Coop. V.   Enjoy Data
 "-" |
From: Xavier G. <gn...@ob...> - 2006-09-20 13:15:49
|
IMHO, the only correct way to handle this case is to raise an exception. It does not make sense to compare NaN and "real" numbers. It could be very confusing not to raise an exception. Xavier. > On 19/09/06, Tim Hochberg <tim...@ie...> wrote: > >> A. M. Archibald wrote: >> >>> Mmm. Somebody who's working with NaNs has more or less already decided >>> they don't want to be pestered with exceptions for invalid data. >>> >> Do you really think so? In my experience NaNs are nearly always just an >> indication of a mistake somewhere that didn't get trapped for one reason >> or another. >> > > Well, I said that because for an image porcessing project I was doing, > the easiest thing to do with certain troublesome pixels was to fill in > NaNs, and then at the end replace the NaNs with sensible values. It > seems as if the point of NaNs is to allow you to keep working with > those numbers that make sense while ingoring those that don't. If you > wanted exceptions, why not get them as soon as the first NaN would > have been generated? > > >>> I'd >>> be happy if they wound up at either end, but I'm not sure it's worth >>> hacking up the sort algorithm when a simple isnan() can pull them out. >>> >>> >> Moving them to the end seems to be the worst choice to me. Leaving them >> alone is fine with me. Or raising an exception would be fine. Or doing >> one or the other depending on the error mode settings would be even >> better if it is practical. >> > > I was just thinking in terms of easy removal. > > >> Is that true? Are all of numpy's sorting algorithms robust against >> nontransitive objects laying around? The answer to that appears to be >> no. Try running this a couple of times to see what I mean: >> > > >> The values don't correctly cross the inserted NaN and the sort is incorrect. >> > > You're quite right: when NaNs are present in the array, sorting and > then removing them does not yield a sorted array. For example, > mergesort just output > [ 2. 4. 6. 9. nan 0. 1. > 3. 5. 7. 8. ] > > The other two are no better (and arguably worse). Python's built-in > sort() for lists has the same problem. > > This is definitely a bug, and the best way to fix it is not clear to > me - perhaps sort() needs to always do any(isnan(A)) before starting > to sort. I don't really like raising an exception, but sort() isn't > really very meaningful with NaNs in the array. The only other option I > can think of is to somehow remove them, sort without, and reintroduce > them at the end, which is going to be a nightmare when sorting a > single axis of a large array. Or, I suppose, sort() could simply fill > the array with NaNs; I'm sure users will love that. > > A. M. Archibald > > ------------------------------------------------------------------------- > Take Surveys. Earn Cash. Influence the Future of IT > Join SourceForge.net's Techsay panel and you'll get the chance to share your > opinions on IT & business topics through brief surveys -- and earn cash > http://www.techsay.com/default.php?page=join.php&p=sourceforge&CID=DEVDEV > _______________________________________________ > Numpy-discussion mailing list > Num...@li... > https://lists.sourceforge.net/lists/listinfo/numpy-discussion > > -- ############################################ Xavier Gnata CRAL - Observatoire de Lyon 9, avenue Charles André 69561 Saint Genis Laval cedex Phone: +33 4 78 86 85 28 Fax: +33 4 78 86 83 86 E-mail: gn...@ob... ############################################ |
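A small illustration of the comparison semantics driving this thread, run against a recent NumPy (whose sort now places NaNs at the end; the 2006 behaviour described above was different):

    import numpy as np

    x = np.nan
    print(x < 1.0, x > 1.0, x == x)    # False False False: NaN compares false with everything

    a = np.array([2.0, np.nan, 1.0, 3.0])
    print(np.sort(a))                  # current NumPy: [ 1.  2.  3. nan]
    print(np.sort(a[~np.isnan(a)]))    # filtering first is always well-defined: [ 1.  2.  3.]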
From: Rick W. <rl...@st...> - 2006-09-20 12:11:37
|
On Sep 19, 2006, at 9:45 PM, Tim Hochberg wrote: > Perhaps there's some use for the sort to end behaviour that I'm > missing, > but the raise an exception behaviour sure looks a lot more > appealing to me. FYI, in IDL the NaN values wind up at the end of the sorted array. That's true despite the fact that IDL does respect all the comparison properties of NaNs (i.e. Value>NaN, Value<NaN, and Value==NaN are all false for any value). So clearly the sort behavior was created deliberately. It's also the case that the median for arrays including NaN values is computed as the median of the defined values, ignoring the NaNs. My view is that if the user has NaN values in the array, sort should respect the float exception flags and should only raise an exception if that is what the user has requested. > Here's a strawman proposal: > > Sort the array. Then examine numpy.geterr()['invalid']. If it > is not > 'ignore', then check examine sometrue(isnan(thearray)). If the > latter is true then raise and error, issue a warning or call the > error reporting functioni as appropriate. Note that we always sort > the array to be consistent with the behaviour of the ufuncs that > proceed even when they end up raising an exception. Here's another proposal: Make a first pass through the array, replacing NaN values with Inf and counting the NaNs ("nancount"). Raise an exception at this point if NaNs are not supposed to be allowed. Otherwise sort the array, and then as the last step replace the trailing NaNcount values with NaN. It seems to me that this would give predictable results while respecting the exception flags at little extra cost. Rick |
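A rough Python sketch of Rick's proposal (replace NaNs with +Inf, sort, then restore them at the tail); the function name is made up here and the error-flag check is only hinted at:

    import numpy as np

    def sort_nans_last(a):
        a = np.array(a, dtype=float)   # work on a copy
        mask = np.isnan(a)
        nancount = int(mask.sum())
        # An exception could be raised here if NaNs are not supposed to be allowed.
        a[mask] = np.inf               # NaNs now sort like +Inf
        a.sort()
        if nancount:
            a[-nancount:] = np.nan     # put the NaNs back at the end
        return a

    print(sort_nans_last([3.0, np.nan, 1.0, 2.0]))   # [ 1.  2.  3. nan]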
From: Travis O. <oli...@ie...> - 2006-09-20 10:59:25
|
A. M. Archibald wrote: > Hi, > > What are the rules for datatype conversion in ufuncs? Does ufunc(a,b) > always yield the smallest type big enough to represent both a and b? > What is the datatype of ufunc.reduce(a)? > This is an unintended consequence of making add.reduce() reduce over at least a ("long"). I've fixed the code so that only add.reduce and multiply.reduce alter the default reducing data-type to be long. All other cases use the data-type of the array as the default. Regarding your other question on data-type conversion in ufuncs: 1) If you specify an output array, then the result will be cast to the output array data-type. 2) The actual computation takes place using a data-type that all (non-scalar) inputs can be cast to safely (with the exception that we assume that long long integers can be "safely" cast to "doubles" even though this is not technically true). -Travis |
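A brief illustration of the two rules, as they behave in a present-day NumPy (the default accumulator details have shifted slightly since this exchange):

    import numpy as np

    a = np.arange(5, dtype=np.float32)
    b = np.ones(5, dtype=np.float32)

    out = np.empty(5, dtype=np.float64)
    np.add(a, b, out=out)              # rule 1: the result is cast to the output array's dtype
    print(out.dtype)                   # float64

    # add.reduce on small integer types accumulates in at least the default integer ("long"):
    print(np.add.reduce(np.ones(3, dtype=np.int8)).dtype)       # platform int, e.g. int64 on 64-bit Linux
    print(np.maximum.reduce(np.ones(3, dtype=np.uint8)).dtype)  # uint8: other reductions keep the input dtype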
From: Travis O. <oli...@ie...> - 2006-09-20 10:48:14
|
Francesc Altet wrote: > Hi, > > I'm sending a message here because discussing about this in the bug tracker is > not very comfortable. This my last try before giving up, so don't be > afraid ;-) > > In bug #283 (http://projects.scipy.org/scipy/numpy/ticket/283) I complained > about the fact that a numpy.int32 is being mapped in NumPy to NPY_LONG > enumerated type and I think I failed to explain well why I think this is a > bad thing. Now, I'll try to expose an (real life) example, in the hope that > things will make clearer. > > Realize that you are coding a C extension that receives NumPy arrays for > saving them on-disk for a later retrieval. Realize also that an user is using > your extension on a 32-bit platform. If she pass to this extension an array > of type 'int32', and the extension tries to read the enumerated type (using > array.dtype.num), it will get NPY_LONG. > So, the extension use this code > (NPY_LONG) to save the type (together with data) on-disk. Now, she send this > data file to a teammate that works on a 64-bit machine, and tries to read the > data using the same extension. The extension would see that the data is > NPY_LONG type and would try to deserialize interpreting data elements as > being as 64-bit integer (this is the size of a NPY_LONG in 64-bit platforms), > and this is clearly wrong. > > In my view, this "real-life" example points to a flaw in the coding design that will not be fixed by altering what numpy.int32 maps to under the covers. It is wrong to use a code for the platform c data-type (NPY_LONG) as a key to understand data written to disk. This is and always has been a bad idea. No matter what we do with numpy.int32 this can cause problems. Just because a lot of platforms think an int is 32-bits does not mean all of them do. C gives you no such guarantee. Notice that pickling of NumPy arrays does not store the "enumerated type" as the code. Instead it stores the data-type object (which itself pickles using the kind and element size so that the correct data-type object can be reconstructed on the other end --- if it is available at all). Thus, you should not be storing the enumerated type but instead something like the kind and element-size. > Besides this, if for making your C extension you are using a C library that is > meant to save data in a platform-independent (say, HDF5), then, having a > NPY_LONG will not automatically say which C library datatype maps to, because > it only have datatypes that are of a definite size in all platforms. So, this > is a second problem. > > Making sure you get the correct data-type is why there are NPY_INT32 and NPY_INT64 enumerated types. You can't code using NPY_LONG and expect it will give you the same sizes when moving from 32-bit and 64-bit platforms. That's a problem that has been fixed with the bitwidth types. I don't understand why you are using the enumerated types at all in this circumstance. > Of course there are workarounds for this, but my impression is that they can > be avoided with a more sensible mapping between NumPy Python types and NumPy > enumerated types, like: > > numpy.int32 --> NPY_INT > numpy.int64 --> NPY_LONGLONG > numpy.int_ --> NPY_LONG > > in all platforms, avoiding the current situation of ambiguous mapping between > platforms. > The problem is that C gives us this ambiguous mapping. You are asking us to pretend it isn't there because it "simplifies" a hypothetical case so that poor coding practice can be allowed to work in a special case. I'm not convinced. 
This persists the myth that C data-types have a defined length. This is not guaranteed. The current system defines data-types with a guaranteed length. Yes, there is ambiguity as to which is "the" underlying c-type on certain platforms, but if you are running into trouble with the difference, then you need to change how you are coding because you would run into trouble on some combination of platforms even if we made the change. Basically, you are asking to make a major change, and at this point I'm very hesitant to make such a change without a clear and pressing need for it. Your hypothetical example does not rise to the level of "clear and pressing need." In fact, I see your proposal as a step backwards. Now, it is true that we could change the default type that gets first grab at int32 to be int (instead of the current long) --- I could see arguments for that. But, since the choice is ambiguous and the Python integer type is the c-type long, I let long get first dibs on everything as this seemed to work better for code I was wrapping in the past. I don't see any point in changing this choice now and risk code breakage, especially when your argument is that it would let users think that a c int is always 32-bits. Best regards, -Travis |
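As a concrete (Python-level) version of this advice, a serializer could store a width-explicit description such as dtype.str rather than the enumerated type number; a minimal sketch:

    import numpy as np

    a = np.arange(4, dtype=np.int32)

    descr = a.dtype.str        # e.g. '<i4': byte order, kind and item size, platform-independent
    print(descr, a.dtype.num)  # .num is the enumerated type code and can differ across platforms

    # ... later, possibly on another machine:
    restored = np.dtype(descr)
    assert restored.kind == 'i' and restored.itemsize == 4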
From: Francesc A. <fa...@ca...> - 2006-09-20 10:00:31
|
On Tuesday 19 September 2006 21:41, Bill Baxter wrote:
> I think he meant do an argsort first, then use fancy indexing to get
> the sorted array.
> For a 1-d array that's just
>
> ind = A.argsort()
> Asorted = A[ind]
>
> That should be O(N lg N + N), aka O(N lg N)

I see. Thanks. OTOH, maybe your estimations are right, but the effect of the constants in these O(whatever) estimations can be surprisingly high:

In [3]: from timeit import Timer

In [4]: Timer("b=numpy.argsort(a);c=a[b]", "import numpy; a=numpy.arange(100000,-1,-1)").repeat(3,100)
Out[4]: [1.6653108596801758, 1.670341968536377, 1.6632120609283447]

In [5]: Timer("b=numpy.argsort(a);c=numpy.sort(a)", "import numpy; a=numpy.arange(100000,-1,-1)").repeat(3,100)
Out[5]: [1.6533238887786865, 1.6272940635681152, 1.6253311634063721]

In [6]: Timer("b=numpy.argsort(a);a.sort();c=a", "import numpy; a=numpy.arange(100000,-1,-1)").repeat(3,100)
Out[6]: [0.95492100715637207, 0.90312504768371582, 0.90426898002624512]

so it seems that argsorting first and fancy indexing later on is the most expensive procedure for relatively large arrays (100000 elements).

Interestingly, the figures above seem to indicate that in-place sort is stunningly fast:

In [7]: Timer("a.sort()","import numpy; a=numpy.arange(100000,-1,-1)").repeat(3,100)
Out[7]: [0.32840394973754883, 0.2746579647064209, 0.2770991325378418]

and much faster indeed than fancy indexing:

In [8]: Timer("b[a]","import numpy; a=numpy.arange(100000,-1,-1);b=a.copy()").repeat(3,100)
Out[8]: [0.79876089096069336, 0.74172186851501465, 0.74209499359130859]

i.e. in-place sort seems 2.7x faster than fancy indexing (at least for these datasets).

Mmmm, with this, I really wonder whether a combo that does the argsort() and sort() in one shot makes any sense, at least from the point of view of speed:

In [10]: Timer("b=numpy.argsort(a);a.sort();c=a","import numpy; a=numpy.arange(100000,-1,-1)").repeat(3,100)
Out[10]: [0.98506593704223633, 0.89880609512329102, 0.89982390403747559]

In [11]: Timer("b=numpy.argsort(a)","import numpy; a=numpy.arange(100000,-1,-1)").repeat(3,100)
Out[11]: [0.92959284782409668, 0.85385990142822266, 0.87773990631103516]

So it seems that doing an in-place sort() immediately after an argsort() operation is very efficient (cache effects here?), and that would avoid the need for the combo function (from the point of view of efficiency, I repeat).

Cheers,

--
>0,0<   Francesc Altet     http://www.carabos.com/
V   V   Cárabos Coop. V.   Enjoy Data
 "-" |
From: Robert K. <rob...@gm...> - 2006-09-20 08:01:35
|
Sebastian Haase wrote:
> Robert Kern wrote:
>> Sebastian Haase wrote:
>>> I know that having too much knowledge of the details often makes one
>>> forget what the "newcomers" will do and expect.
>> Please be more careful with such accusations. Repeated frequently, they can
>> become quite insulting.
>>
> I did not mean to insult anyone - what I meant was that I'm for numpy
> becoming an easy platform to use. I have spent and enjoyed part of the last
> four years developing and evangelizing Python as an alternative to
> Matlab- and C/Fortran-based image analysis environments. I often find
> myself arguing for good support of the single precision data format. So
> I find it actually somewhat ironic to see myself arguing now for wanting
> float64 over float32 ;-)

No one is doubting that you want numpy to be easy to use. Please don't assume that the rest of us want otherwise. However, the fact that you *want* numpy to be easy to use does not mean that your suggestions *will* make numpy easy to use. We haven't forgotten what newcomers will do; to the contrary, we are quite aware that new users need consistent behavior in order to learn how to use a system. Adding another special case in how dtypes implicitly convert to one another will impede new users being able to understand the whole system. See A. M. Archibald's question in the thread "ufunc.reduce and conversion" for an example. In our judgement this is a worse outcome than notational convenience for float32 users, who already need to be aware of the effects of their precision choice. Each of us can come to different conclusions in good faith without one of us forgetting the new user experience.

Let me offer a third path: the algorithms used for .mean() and .var() are substandard. There are much better incremental algorithms that entirely avoid the need to accumulate such large (and therefore precision-losing) intermediate values. The algorithms look like the following for 1D arrays in Python:

def mean(a):
    m = a[0]
    for i in range(1, len(a)):
        m += (a[i] - m) / (i + 1)
    return m

def var(a):
    m = a[0]
    t = a.dtype.type(0)
    for i in range(1, len(a)):
        q = a[i] - m
        r = q / (i + 1)
        m += r
        t += i * q * r
    t /= len(a)
    return t

Alternatively, from Knuth:

def var_knuth(a):
    m = a.dtype.type(0)
    variance = a.dtype.type(0)
    for i in range(len(a)):
        delta = a[i] - m
        m += delta / (i + 1)
        variance += delta * (a[i] - m)
    variance /= len(a)
    return variance

If you will code up implementations of these for ndarray.mean() and ndarray.var(), I will check them in and then float32 arrays will avoid most of the catastrophes that the current implementations run into.

>>> We are only talking
>>> about people that will a) work with single-precision data (e.g. large
>>> scale-image analysis) and who b) will tend to "just use the default"
>>> (dtype) --- How else can I say this: these people will just assume that
>>> arr.mean() *is* the mean of arr.
>> I don't understand what you mean, here. arr.mean() is almost never *the* mean of
>> arr. Double precision can't fix that.
>>
> This was not supposed to be a scientific statement -- I'm (again)
> thinking of our students who do not always appreciate the full complexity
> of computational numerics and data types and such.

They need to appreciate the complexity of computational numerics if they are going to do numerical computation. Double precision does not make it any simpler.

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth."
-- Umberto Eco |
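A quick, informal accuracy check of the incremental routines quoted above (it assumes the mean() and var_knuth() definitions from that message are in scope; the test data and the large offset are arbitrary choices meant to stress float32):

    import numpy as np

    data = (np.linspace(0.0, 1.0, 20000) + 1000.0).astype(np.float32)

    ref_mean = data.astype(np.float64).mean()
    ref_var = data.astype(np.float64).var()

    print(abs(float(mean(data)) - ref_mean))       # incremental mean stays close to the float64 reference
    print(abs(float(var_knuth(data)) - ref_var))   # as does the Knuth/Welford variance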
From: Francesc A. <fa...@ca...> - 2006-09-20 07:52:38
|
Hi,

I'm sending a message here because discussing this in the bug tracker is not very comfortable. This is my last try before giving up, so don't be afraid ;-)

In bug #283 (http://projects.scipy.org/scipy/numpy/ticket/283) I complained about the fact that a numpy.int32 is being mapped in NumPy to the NPY_LONG enumerated type, and I think I failed to explain well why I think this is a bad thing. Now I'll try to present a (real life) example, in the hope that it will make things clearer.

Imagine that you are coding a C extension that receives NumPy arrays and saves them on disk for later retrieval. Imagine also that a user is using your extension on a 32-bit platform. If she passes to this extension an array of type 'int32', and the extension reads the enumerated type (using array.dtype.num), it will get NPY_LONG. So the extension uses this code (NPY_LONG) to save the type (together with the data) on disk. Now she sends this data file to a teammate who works on a 64-bit machine and tries to read the data using the same extension. The extension would see that the data is of NPY_LONG type and would try to deserialize it, interpreting the data elements as 64-bit integers (this is the size of a NPY_LONG on 64-bit platforms), and this is clearly wrong.

Besides this, if for making your C extension you are using a C library that is meant to save data in a platform-independent format (say, HDF5), then having a NPY_LONG will not automatically tell you which C library datatype it maps to, because that library only has datatypes with a definite size on all platforms. So this is a second problem.

Of course there are workarounds for this, but my impression is that they could be avoided with a more sensible mapping between NumPy Python types and NumPy enumerated types, like:

numpy.int32 --> NPY_INT
numpy.int64 --> NPY_LONGLONG
numpy.int_  --> NPY_LONG

on all platforms, avoiding the current situation of an ambiguous mapping between platforms.

Sorry for being so persistent, but I think the issue is worth it.

--
>0,0<   Francesc Altet     http://www.carabos.com/
V   V   Cárabos Coop. V.   Enjoy Data
 "-" |
From: <kon...@la...> - 2006-09-20 07:45:32
|
On 19.09.2006, at 20:42, Travis Oliphant wrote:
> Well, I got both ScientificPython and MMTK to compile and import using
> the steps outlined on http://www.scipy.org/Porting_to_NumPy in about 1
> hour (including time to fix alter_code1 to make the process even easier).

Could you please send me those versions? I'd happily put them on my Web site for volunteers to test.

> I was able to install ScientificPython and MMTK for NumPy on my system
> using the patches provided on that page. Is there a test suite that
> can be run?

Not much yet, unfortunately. There is a minuscule test suite for ScientificPython in the latest release, and an even more minuscule one for MMTK that I haven't even published yet because it doesn't test more than that everything can be imported.

> Users of MMTK could really help out here.

I hope so!

Konrad.
--
---------------------------------------------------------------------
Konrad Hinsen
Centre de Biophysique Moléculaire, CNRS Orléans
Synchrotron Soleil - Division Expériences
Saint Aubin - BP 48
91192 Gif sur Yvette Cedex, France
Tel. +33-1 69 35 97 15
E-Mail: hi...@cn...
--------------------------------------------------------------------- |
From: <kon...@la...> - 2006-09-20 07:41:26
|
On 19.09.2006, at 18:48, Christopher Barker wrote:
> Konrad Hinsen wrote:
>> MMTK works fine with Numeric 23.x (and probably many other versions),
>> so I don't see a pressing need to change to NumPy.
>
> Pressing is in the eye of the beholder.

Obviously. It also depends on the context in which one develops or uses software. For me, a pressing need is to finish the two publications I am working on. Next come various administrative tasks that have a deadline. In third place, there's work on a new research project that I started recently. Software development is at best in position number 4, but in that category my personal priorities are adding more unit tests and reworking the MMTK documentation system, the old one being unusable because various pieces of code it relied on are no longer supported.

As with many other scientific software projects, MMTK development is completely unfunded and hardly even recognized by my employer's evaluation committees. Any work on MMTK that does not help me in a research project can therefore not be a priority for me.

> However: I don't think we should underestimate the negative impact of
> the Numeric/numarray split on the usability and perception of the
> community. Also the impact on how much work has been done to
> accommodate it. If you consider matplotlib alone:

I completely agree. The existence of a third choice (NumPy) just makes it worse. For client code like mine there is little chance to escape the split issues. Even if I had the resources to adapt all my code to NumPy immediately, I'd still have to support Numeric because that's what everyone is using at the moment, and many users can't or won't switch immediately. Since the APIs are not fully compatible, I either have to support two versions in parallel or introduce my own compatibility layer.

> In addition, as I understand it, MMTK was NOT working fine for the OP.

The issues he had were already solved; he just had to apply the solutions (i.e. reinstall using a more recent version and appropriate compilation options).

> As robust as they (and Numeric) might be, when you need to run
> something on a new platform (OS-X - Intel comes to mind), or use a new
> LAPACK, or whatever, there are going to be (and have been) issues that
> need to be addressed. No one is maintaining Numeric, so it makes much
> more sense to put your effort into porting to numpy, rather than trying
> to fix or work around Numeric issues.

In the long run, I agree. But on the time scale on which my work conditions force me to plan, it is less work for me to provide patches for Numeric as the need arises.

> PS: this really highlights the strength of having good unit tests: as
> far as I can tell, it's really not that much work to do the port -- the
> work is in the testing. Comprehensive unit tests would make that part
> trivial too.

Yes, testing is the bigger chunk of the work. And yes, unit tests are nice to have - but they don't write themselves, unfortunately.

Konrad.
--
---------------------------------------------------------------------
Konrad Hinsen
Centre de Biophysique Moléculaire, CNRS Orléans
Synchrotron Soleil - Division Expériences
Saint Aubin - BP 48
91192 Gif sur Yvette Cedex, France
Tel. +33-1 69 35 97 15
E-Mail: hi...@cn...
--------------------------------------------------------------------- |
From: Sebastian H. <ha...@ms...> - 2006-09-20 04:53:18
|
Robert Kern wrote:
> Sebastian Haase wrote:
>> I know that having too much knowledge of the details often makes one
>> forget what the "newcomers" will do and expect.
>
> Please be more careful with such accusations. Repeated frequently, they can
> become quite insulting.

I did not mean to insult anyone - what I meant was that I'm for numpy becoming an easy platform to use. I have spent and enjoyed part of the last four years developing and evangelizing Python as an alternative to Matlab- and C/Fortran-based image analysis environments. I often find myself arguing for good support of the single precision data format. So I find it actually somewhat ironic to see myself arguing now for wanting float64 over float32 ;-)

>> We are only talking
>> about people that will a) work with single-precision data (e.g. large
>> scale-image analysis) and who b) will tend to "just use the default"
>> (dtype) --- How else can I say this: these people will just assume that
>> arr.mean() *is* the mean of arr.
>
> I don't understand what you mean, here. arr.mean() is almost never *the* mean of
> arr. Double precision can't fix that.

This was not supposed to be a scientific statement -- I'm (again) thinking of our students who do not always appreciate the full complexity of computational numerics and data types and such. The best I can hope for is a "sound" default for most (practical) cases... I still think that 80-bit vs. 128-bit vs. 96-bit is rather academic for most people ... most people seem to only use float64, and then there are some that use float32 (like us) ...

Cheers,
Sebastian |
From: Charles R H. <cha...@gm...> - 2006-09-20 04:18:55
|
On 9/19/06, Charles R Harris <cha...@gm...> wrote: > > > > On 9/19/06, Charles R Harris <cha...@gm...> wrote: > > > > > > > > On 9/19/06, Sebastian Haase < ha...@ms...> wrote: > > > > > > Travis Oliphant wrote: > > > > Sebastian Haase wrote: > > > >> I still would argue that getting a "good" (smaller rounding errors) > > > answer > > > >> should be the default -- if speed is wanted, then *that* could be > > > still > > > >> specified by explicitly using dtype=float32 (which would also be a > > > possible > > > >> choice for int32 input) . > > > >> > > > > So you are arguing for using long double then.... ;-) > > > > > > > >> In image processing we always want means to be calculated in > > > float64 even > > > >> though input data is always float32 (if not uint16). > > > >> > > > >> Also it is simpler to say "float64 is the default" (full stop.) - > > > instead > > > >> > > > >> "float64 is the default unless you have float32" > > > >> > > > > "the type you have is the default except for integers". Do you > > > really > > > > want float64 to be the default for float96? > > > > > > > > Unless we are going to use long double as the default, then I'm not > > > > convinced that we should special-case the "double" type. > > > > > > > I guess I'm not really aware of the float96 type ... > > > Is that a "machine type" on any system ? I always thought that -- e.g. > > > coming from C -- double is "as good as it gets"... > > > Who uses float96 ? I heard somewhere that (some) CPUs use 80bits > > > internally when calculating 64bit double-precision... > > > > > > Is this not going into some academic argument !? > > > For all I know, calculating mean()s (and sum()s, ...) is always done > > > in > > > double precision -- never in single precision, even when the data is > > > in > > > float32. > > > > > > Having float32 be the default for float32 data is just requiring more > > > typing, and more explaining ... it would compromise numpy usability > > > as > > > a day-to-day replacement for other systems. > > > > > > Sorry, if I'm being ignorant ... > > > > > > I'm going to side with Travis here. It is only a default and easily > > overridden. And yes, there are precisions greater than double. I was using > > quad precision back in the eighties on a VAX for some inherently ill > > conditioned problems. And on my machine long double is 12 bytes. > > > > Here is the 754r (revision) spec: http://en.wikipedia.org/wiki/IEEE_754r > > It includes quads (128 bits) and half precision (16 bits) floats. I > believe the latter are used for some imaging stuff, radar for instance, and > are also available in some high end GPUs from Nvidia and other companies. > The 80 bit numbers you refer to were defined as extended precision in the > original 754 spec and were mostly intended for temporaries in internal FPU > computations. They have various alignment requirements for efficient use, > which is why they show up as 96 bits (4 byte alignment) and sometimes 128 > bits (8 byte alignment). So actually, float128 would not always distinquish > between extended precision and quad precision. I see more work for Travis > in the future ;) > I just checked this out. On amd64 32 bit linux gives 12 bytes for long double, 64 bit linux gives 16 bytes for long doubles, but they both have 64 bit mantissas, i.e., they are both 80 bit extended precision. Those sizes are the defaults and can be overridden by compiler flags. 
Anyway, we may need some way to tell the difference between float128 and quads since they will both have the same length on 64 bit architectures. But that is a problem for the future. Chuck |
From: Robert K. <rob...@gm...> - 2006-09-20 04:10:37
|
Sebastian Haase wrote: > I know that having too much knowledge of the details often makes one > forget what the "newcomers" will do and expect. Please be more careful with such accusations. Repeated frequently, they can become quite insulting. > We are only talking > about people that will a) work with single-precision data (e.g. large > scale-image analysis) and who b) will tend to "just use the default" > (dtype) --- How else can I say this: these people will just assume that > arr.mean() *is* the mean of arr. I don't understand what you mean, here. arr.mean() is almost never *the* mean of arr. Double precision can't fix that. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco |
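A tiny demonstration of Robert's point with hypothetical data: the value 0.1 is already rounded when it is stored in the array, so no choice of accumulator precision recovers "the" mean (recent NumPy assumed):

    import numpy as np

    a = np.full(5, 0.1, dtype=np.float32)

    print(np.float64(a[0]))             # ~0.10000000149011612: what was actually stored
    print(a.astype(np.float64).mean())  # double precision faithfully averages the already-rounded values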
From: A. M. A. <per...@gm...> - 2006-09-20 03:53:41
|
Hi,

What are the rules for datatype conversion in ufuncs? Does ufunc(a,b) always yield the smallest type big enough to represent both a and b? What is the datatype of ufunc.reduce(a)? I ask because I was startled by the following behaviour:

>>> a = array([1,1],uint8); print a.dtype; print maximum.reduce(a).dtype
'|u1'
'<u4'

uint16 behaves similarly, but none of the others seem to involve a conversion (with the possible exception, in SVN, of int32 -> float, possibly only for add.reduce).

Thanks,
A. M. Archibald |
From: Charles R H. <cha...@gm...> - 2006-09-20 03:43:02
|
On 9/19/06, Charles R Harris <cha...@gm...> wrote: > > > > On 9/19/06, Sebastian Haase <ha...@ms...> wrote: > > > > Travis Oliphant wrote: > > > Sebastian Haase wrote: > > >> I still would argue that getting a "good" (smaller rounding errors) > > answer > > >> should be the default -- if speed is wanted, then *that* could be > > still > > >> specified by explicitly using dtype=float32 (which would also be a > > possible > > >> choice for int32 input) . > > >> > > > So you are arguing for using long double then.... ;-) > > > > > >> In image processing we always want means to be calculated in float64 > > even > > >> though input data is always float32 (if not uint16). > > >> > > >> Also it is simpler to say "float64 is the default" (full stop.) - > > instead > > >> > > >> "float64 is the default unless you have float32" > > >> > > > "the type you have is the default except for integers". Do you really > > > want float64 to be the default for float96? > > > > > > Unless we are going to use long double as the default, then I'm not > > > convinced that we should special-case the "double" type. > > > > > I guess I'm not really aware of the float96 type ... > > Is that a "machine type" on any system ? I always thought that -- e.g . > > coming from C -- double is "as good as it gets"... > > Who uses float96 ? I heard somewhere that (some) CPUs use 80bits > > internally when calculating 64bit double-precision... > > > > Is this not going into some academic argument !? > > For all I know, calculating mean()s (and sum()s, ...) is always done in > > double precision -- never in single precision, even when the data is in > > float32. > > > > Having float32 be the default for float32 data is just requiring more > > typing, and more explaining ... it would compromise numpy usability as > > a day-to-day replacement for other systems. > > > > Sorry, if I'm being ignorant ... > > > I'm going to side with Travis here. It is only a default and easily > overridden. And yes, there are precisions greater than double. I was using > quad precision back in the eighties on a VAX for some inherently ill > conditioned problems. And on my machine long double is 12 bytes. > Here is the 754r (revision) spec: http://en.wikipedia.org/wiki/IEEE_754r It includes quads (128 bits) and half precision (16 bits) floats. I believe the latter are used for some imaging stuff, radar for instance, and are also available in some high end GPUs from Nvidia and other companies. The 80 bit numbers you refer to were defined as extended precision in the original 754 spec and were mostly intended for temporaries in internal FPU computations. They have various alignment requirements for efficient use, which is why they show up as 96 bits (4 byte alignment) and sometimes 128 bits (8 byte alignment). So actually, float128 would not always distinquish between extended precision and quad precision. I see more work for Travis in the future ;) Chuck |
From: Sebastian H. <ha...@ms...> - 2006-09-20 03:39:11
|
Charles R Harris wrote: > > > On 9/19/06, *Sebastian Haase* <ha...@ms... > <mailto:ha...@ms...>> wrote: > > Travis Oliphant wrote: > > Sebastian Haase wrote: > >> I still would argue that getting a "good" (smaller rounding > errors) answer > >> should be the default -- if speed is wanted, then *that* could > be still > >> specified by explicitly using dtype=float32 (which would also > be a possible > >> choice for int32 input) . > >> > > So you are arguing for using long double then.... ;-) > > > >> In image processing we always want means to be calculated in > float64 even > >> though input data is always float32 (if not uint16). > >> > >> Also it is simpler to say "float64 is the default" (full stop.) > - instead > >> > >> "float64 is the default unless you have float32" > >> > > "the type you have is the default except for integers". Do you > really > > want float64 to be the default for float96? > > > > Unless we are going to use long double as the default, then I'm not > > convinced that we should special-case the "double" type. > > > I guess I'm not really aware of the float96 type ... > Is that a "machine type" on any system ? I always thought that -- e.g . > coming from C -- double is "as good as it gets"... > Who uses float96 ? I heard somewhere that (some) CPUs use 80bits > internally when calculating 64bit double-precision... > > Is this not going into some academic argument !? > For all I know, calculating mean()s (and sum()s, ...) is always done in > double precision -- never in single precision, even when the data is in > float32. > > Having float32 be the default for float32 data is just requiring more > typing, and more explaining ... it would compromise numpy usability as > a day-to-day replacement for other systems. > > Sorry, if I'm being ignorant ... > > > I'm going to side with Travis here. It is only a default and easily > overridden. And yes, there are precisions greater than double. I was > using quad precision back in the eighties on a VAX for some inherently > ill conditioned problems. And on my machine long double is 12 bytes. > > Chuck > I just did a web search for "long double" http://www.google.com/search?q=%22long+double%22 and it does not look like there is much agreement on what that is - see also http://en.wikipedia.org/wiki/Long_double I really think that float96 *is* a special case - but computing mean()s and var()s in float32 would be "bad science". I hope I'm not alone in seeing numpy a great "interactive platform" to do evaluate data... I know that having too much knowledge of the details often makes one forget what the "newcomers" will do and expect. We are only talking about people that will a) work with single-precision data (e.g. large scale-image analysis) and who b) will tend to "just use the default" (dtype) --- How else can I say this: these people will just assume that arr.mean() *is* the mean of arr. -Sebastian |
From: Bill B. <wb...@gm...> - 2006-09-20 03:26:26
|
On 9/20/06, Charles R Harris <cha...@gm...> wrote: > > I guess I'm not really aware of the float96 type ... > > Is that a "machine type" on any system ? I always thought that -- e.g . > > coming from C -- double is "as good as it gets"... > > Who uses float96 ? I heard somewhere that (some) CPUs use 80bits > > internally when calculating 64bit double-precision... > > > I'm going to side with Travis here. It is only a default and easily > overridden. And yes, there are precisions greater than double. I was using > quad precision back in the eighties on a VAX for some inherently ill > conditioned problems. And on my machine long double is 12 bytes. > And on Intel chips the internal fp precision is 80bits. The D programming language even exposes this 80-bit floating point type to the user. http://www.digitalmars.com/d/type.html http://www.digitalmars.com/d/faq.html#real --bb |
From: Charles R H. <cha...@gm...> - 2006-09-20 03:15:33
|
On 9/19/06, Sebastian Haase <ha...@ms...> wrote: > > Travis Oliphant wrote: > > Sebastian Haase wrote: > >> I still would argue that getting a "good" (smaller rounding errors) > answer > >> should be the default -- if speed is wanted, then *that* could be still > >> specified by explicitly using dtype=float32 (which would also be a > possible > >> choice for int32 input) . > >> > > So you are arguing for using long double then.... ;-) > > > >> In image processing we always want means to be calculated in float64 > even > >> though input data is always float32 (if not uint16). > >> > >> Also it is simpler to say "float64 is the default" (full stop.) - > instead > >> > >> "float64 is the default unless you have float32" > >> > > "the type you have is the default except for integers". Do you really > > want float64 to be the default for float96? > > > > Unless we are going to use long double as the default, then I'm not > > convinced that we should special-case the "double" type. > > > I guess I'm not really aware of the float96 type ... > Is that a "machine type" on any system ? I always thought that -- e.g. > coming from C -- double is "as good as it gets"... > Who uses float96 ? I heard somewhere that (some) CPUs use 80bits > internally when calculating 64bit double-precision... > > Is this not going into some academic argument !? > For all I know, calculating mean()s (and sum()s, ...) is always done in > double precision -- never in single precision, even when the data is in > float32. > > Having float32 be the default for float32 data is just requiring more > typing, and more explaining ... it would compromise numpy usability as > a day-to-day replacement for other systems. > > Sorry, if I'm being ignorant ... I'm going to side with Travis here. It is only a default and easily overridden. And yes, there are precisions greater than double. I was using quad precision back in the eighties on a VAX for some inherently ill conditioned problems. And on my machine long double is 12 bytes. Chuck |
From: Sebastian H. <ha...@ms...> - 2006-09-20 03:07:41
|
Travis Oliphant wrote: > Sebastian Haase wrote: >> I still would argue that getting a "good" (smaller rounding errors) answer >> should be the default -- if speed is wanted, then *that* could be still >> specified by explicitly using dtype=float32 (which would also be a possible >> choice for int32 input) . >> > So you are arguing for using long double then.... ;-) > >> In image processing we always want means to be calculated in float64 even >> though input data is always float32 (if not uint16). >> >> Also it is simpler to say "float64 is the default" (full stop.) - instead >> >> "float64 is the default unless you have float32" >> > "the type you have is the default except for integers". Do you really > want float64 to be the default for float96? > > Unless we are going to use long double as the default, then I'm not > convinced that we should special-case the "double" type. > I guess I'm not really aware of the float96 type ... Is that a "machine type" on any system ? I always thought that -- e.g. coming from C -- double is "as good as it gets"... Who uses float96 ? I heard somewhere that (some) CPUs use 80bits internally when calculating 64bit double-precision... Is this not going into some academic argument !? For all I know, calculating mean()s (and sum()s, ...) is always done in double precision -- never in single precision, even when the data is in float32. Having float32 be the default for float32 data is just requiring more typing, and more explaining ... it would compromise numpy usability as a day-to-day replacement for other systems. Sorry, if I'm being ignorant ... - Sebastian |
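For anyone wondering what their own platform provides, a small probe of the extended type being discussed; the values in the comments are examples only and differ by OS, CPU and compiler:

    import numpy as np

    ld = np.longdouble                 # exposed as float96 or float128 depending on alignment
    print(np.dtype(ld).itemsize)       # e.g. 12 on 32-bit x86 Linux, 16 on x86-64, 8 where long double == double
    print(np.finfo(ld).nmant)          # mantissa bits, e.g. 63 for x87 80-bit extended precision
    print(np.finfo(np.float64).nmant)  # 52, for comparison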
From: Charles R H. <cha...@gm...> - 2006-09-20 02:42:56
|
On 9/19/06, A. M. Archibald <per...@gm...> wrote: > > On 19/09/06, Tim Hochberg <tim...@ie...> wrote: > > > I'm still somewhat mystified by the desire to move the nans to one end > > of the sorted object. I see two scenarios: > > It's mostly to have something to do with them other than throw an > exception. Leaving them in place while the rest of the array is > reshuffled requires a lot of work and isn't particularly better. I > mostly presented it as an alternative to throwing an exception. > > Throwing a Python exception now seems like the most reasonable idea. Well, mergesort can be adapted without a lot of work. Could be used to sort masked arrays too, not that I know why anyone would want that, but then again, I haven't used masked arrays. Agreed, throwing some sort of error still seems the simplest thing to do. |
From: Travis O. <oli...@ie...> - 2006-09-20 02:33:35
|
Sebastian Haase wrote: > I still would argue that getting a "good" (smaller rounding errors) answer > should be the default -- if speed is wanted, then *that* could be still > specified by explicitly using dtype=float32 (which would also be a possible > choice for int32 input) . > So you are arguing for using long double then.... ;-) > In image processing we always want means to be calculated in float64 even > though input data is always float32 (if not uint16). > > Also it is simpler to say "float64 is the default" (full stop.) - instead > > "float64 is the default unless you have float32" > "the type you have is the default except for integers". Do you really want float64 to be the default for float96? Unless we are going to use long double as the default, then I'm not convinced that we should special-case the "double" type. -Travis |
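For reference, the rule stated here ("the type you have is the default, except for integers") is what a present-day NumPy does; a minimal check, assuming current defaults:

    import numpy as np

    print(np.ones(3, dtype=np.float32).mean().dtype)     # float32: floats keep their own precision
    print(np.ones(3, dtype=np.longdouble).mean().dtype)  # longdouble (float96/float128) likewise stays put
    print(np.ones(3, dtype=np.int32).mean().dtype)       # float64: integers promote to double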