numpy-discussion Mailing List for Numerical Python (Page 355)

A package for scientific computing with Python

Brought to you by: charris208, jarrodmillman, kern, rgommers, teoliphant

numpy-discussion — Discussion list for all users of Numerical Python


Re: [Numpy-discussion] How to median filter a masked array?

From: Russell E O. <rowen@u.washington.edu> - 2004-07-14 17:41:09

At 9:50 AM -0700 2004-07-14, Paul F. Dubois wrote:
>The median filter is prepared to take an argument of a numarray 
>array but ignorant of and unprepared to deal with masked  values. 
>Using the __array__ trick, both Numeric.MA and numarray.ma would 
>'know' this and therefore replace the missing values in the filter's 
>argument with the 'fill value' for that type -- a big number in the 
>case of real arrays. You could explicitly choose that value (say 
>using the overall median of the data m) by passing x.filled(m) 
>rather than x to the filter.
>
>If there is no such value, you probably do have to do it in C. If 
>you wrote it in C, how would you treat missing elements? BTW it 
>wouldn't be that hard; just pass both the array and its mask as 
>separate elements to a C routine and use SWIG to hook it up.

I already have routines that handle masked data in C to create 
radial profiles from 2-d integer data (since I could not figure out 
how to do that in numarray). I chose to pass the mask as a separate 
array, since I could not find any C interface for numarray.ma and 
since NaN made no sense for integer data.

That code was pretty straightforward. I wish I could have found a 
simple way to support multiple array types. I thought using C++ with 
prototypes would be the ticket, but absent any examples and after 
looking through the numarray code, I gave up and took the easy way 
out. (I didn't use SWIG, though, I just hand coded everything. Maybe 
that was a mistake.)

I confess that makes me worry about the underpinnings of numarray. It 
seems an obvious candidate to be written in C++ with prototypes. I 
hate to think what the developers have to go through, instead.

In any case, writing a median filter is a bigger deal than taking a 
radial profile, and since one already existed I thought I'd ask.

>I doubt NaN would help you here; you'd still have to figure out what 
>to do in those places. Numeric did not have support for NaN because 
>there were portability problems. Probably still are. And you still 
>are stuck in a lot of cases anyway.

Well, NaN isn't very general in any case, since it's meaningless for 
integer data. So maybe that's a red herring. (Though if NaN had 
worked to mask data I would cheerfully have converted my images to 
floats to take advantage of it!).

What's really wanted is a more unified approach to masked data. I 
suppose it's pie in the sky, but I sure wish most of the numarray 
functions took an optional mask array (or accepted a numarray.ma 
object -- nice for the user, but probably too painful for words under 
the hood).

I don't think there are major issues with what to do with masked 
data. Simply ignoring it works in most cases, e.g. mean, std dev, 
sum, max... In some cases one needs the new mask as output (e.g. 
matrix multiply). Filtering is a bit subtle: can masked data be 
treated the same as data off the edge? I hope so, but I'm not sure.

Anyway, I am grateful for what we do have. Without Numeric or 
numarray I would have to write all my image processing code in a 
different language.

-- Russell
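
As a rough illustration of the "simply ignoring it works in most cases" point, and of the x.filled(m) suggestion quoted above, here is a minimal sketch. It uses NumPy's numpy.ma module as a stand-in for Numeric.MA / numarray.ma (an assumption; neither legacy package is used here), and the sample values are made up.

```python
import numpy as np

# Hypothetical 2-d integer "image"; the 500 stands in for a bad pixel.
image = np.array([[1, 2, 3],
                  [4, 500, 6]])
mask = image > 100          # True marks a pixel to ignore

masked = np.ma.masked_array(image, mask=mask)

# Reductions skip the masked element.
total = masked.sum()
mean = masked.mean()
peak = masked.max()

# For a routine that knows nothing about masks, fill the masked slots
# with a chosen value, here the median of the unmasked data.
m = int(np.median(masked.compressed()))
filled = masked.filled(m)
```

The filled array is an ordinary array, so a mask-unaware filter will treat the substituted value as real data. That is exactly the limitation discussed in this thread.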

Re: [Numpy-discussion] How to median filter a masked array?

From: Peter V. <ve...@em...> - 2004-07-14 17:37:26

On 14 Jul 2004, at 17:47, Russell E Owen wrote:
> I want to 3x3 median filter a masked array (2-d array of ints -- an 
> astronomical image), where the masked data and points off the edge are 
> excluded from the local median calculation. Any suggestions for how to 
> do this efficiently?

I don't think that you can do it very efficiently right now with the 
functions that are available in numarray.

>  I suspect I have to write it in C, which is an unpleasant prospect.

Yes, that is unpleasant, trust me :-) However, in version 1.0 of 
numarray in the nd_image package, I have added some support for writing 
filter functions. The generic_filter() function iterates over the array 
and applies a user-defined filter function at each element. The 
user-defined function can be written in python or in C, and is called 
at each element with the values within the filter-footprint as an 
argument. You would write a function that finds the median of these 
values, excluding the NaNs (or whatever value flags the mask). I 
would suggest prototyping this function in python and moving it to C 
as soon as it works to your satisfaction. See the numarray manual for 
more details.

Cheers, Peter
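
The idea described above (take the median of the values in the footprint while skipping flagged ones) can be prototyped without nd_image. The following is not generic_filter; it is a NumPy-only sketch that flags masked pixels and off-edge points with NaN, stacks the nine shifted views of a 3x3 footprint, and takes np.nanmedian across them. The function name masked_median3x3 and the conversion of the integer image to float are assumptions made for illustration.

```python
import numpy as np

def masked_median3x3(image, mask):
    """3x3 median that ignores masked pixels and points off the edge.

    Returns a float array; a pixel whose whole neighbourhood is masked
    comes back as NaN (NumPy also emits a RuntimeWarning in that case).
    """
    data = np.asarray(image, dtype=float).copy()
    data[np.asarray(mask, dtype=bool)] = np.nan

    # Surround the image with NaN so off-edge points are ignored too.
    padded = np.pad(data, 1, mode="constant", constant_values=np.nan)

    rows, cols = data.shape
    # One shifted view per footprint position: 9 arrays of the image shape.
    windows = np.stack([
        padded[dy:dy + rows, dx:dx + cols]
        for dy in range(3)
        for dx in range(3)
    ])
    return np.nanmedian(windows, axis=0)

# Small check: mask the centre pixel of a 3x3 ramp.
image = np.arange(9).reshape(3, 3)
mask = np.zeros((3, 3), dtype=bool)
mask[1, 1] = True
result = masked_median3x3(image, mask)
```

This trades memory for speed: it holds nine copies of the image at once, which is fine for prototyping but may not suit very large images.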

[Numpy-discussion] ANN matplotlib-0.60.2: python graphs and charts

From: John H. <jdh...@ac...> - 2004-07-14 16:50:47

matplotlib is a 2D plotting library for python.  You can use
matplotlib interactively from a python shell or IDE, or embed it in
GUI applications (WX, GTK, and Tkinter).  matplotlib supports many
plot types: line plots, bar charts, log plots, images, pseudocolor
plots, legends, date plots, finance charts and more.  

What's new since matplotlib 0.50

  This is the first wide release in 5 months and there has been a
  tremendous amount of development since then, with new backends, many
  optimizations, new plotting types and enhanced text
  support. See http://matplotlib.sourceforge.net/whats_new.html for
  details.
 
 * Todd Miller's tkinter backend (tkagg) with good support for
   interactive plotting using the standard python shell, ipython or
   others.  matplotlib now runs on windows out of the box with python
   + numeric/numarray

 * Full Numeric / numarray integration with Todd Miller's numerix
   module.  Prebuilt installers for numeric and numarray on win32.
   Others, please set your numerix settings before building
   matplotlib, as described on
   http://matplotlib.sourceforge.net/faq.html#NUMARRAY

 * Mathtext: you can write TeX style math expressions anywhere in your
   figure.
   http://matplotlib.sourceforge.net/screenshots.html#mathtext_demo.

 * Images - figure and axes images with optional interpolated
   resampling, alpha blending of multiple images, and more with the
   imshow and figimage commands.  Interactive control of colormaps,
   intensity scaling and colorbars -
   http://matplotlib.sourceforge.net/screenshots.html#layer_images

 * Text: freetype2 support, newline separated strings with arbitrary
   rotations, Paul Barrett's cross platform font
   manager.
   http://matplotlib.sourceforge.net/screenshots.html#align_text

 * Jared Wahlstrand's SVG backend (alpha)

 * Support for popular financial plot types -
   http://matplotlib.sourceforge.net/screenshots.html#finance_work2

 * Many optimizations and extension code to remove performance
   bottlenecks.  pcolors and scatters are an order of magnitude
   faster.

 * GTKAgg, WXAgg, TkAgg backends for http://antigrain.com (agg)
   rendering in the GUI canvas.  Now all the major GUIs (WX, GTK, Tk)
   can be used with a common (agg) renderer.

 * Many new examples and demos - see http://matplotlib.sf.net/examples
   or download the src distribution and look in the examples dir.

Documentation and downloads available at
http://matplotlib.sourceforge.net.

John Hunter

[Numpy-discussion] How to median filter a masked array?

From: Russell E O. <rowen@u.washington.edu> - 2004-07-14 15:47:47

I want to 3x3 median filter a masked array (2-d array of ints -- an 
astronomical image), where the masked data and points off the edge 
are excluded from the local median calculation. Any suggestions for 
how to do this efficiently? I suspect I have to write it in C, which 
is an unpleasant prospect.

I tried using NaN for points to mask out, but the median filter seems 
to handle those as "infinity", or something equally inappropriate.

In a related vein, has Python come along far enough that it would be 
reasonable to add support for NaN to numarray -- in the sense that 
statistics calculations, filters, etc. could be convinced to ignore 
NaNs? Obviously this support would be contingent on compiling python 
with IEEE floating point support, but I suspect that's the default on 
most platforms these days.

-- Russell

RE: [Numpy-discussion] a 'for' loop within another 'for' loop?

From: <Seb...@el...> - 2004-07-14 15:40:36

I could not resist proposing another solution:

r = array([0,2,5,6,8])
l = (r[:,NewAxis] + r[NewAxis,:]).flat

 -----Original Message-----
From: Hee-Seng Kye [mailto:ky...@ea...]
Sent: Wednesday, 14 July 2004 4:22
To: num...@li...
Subject: [Numpy-discussion] a 'for' loop within another 'for' loop?

Hi. I wrote a program to calculate sums of every possible combinations of
two indices of a list. The main body of the program looks something like
this: 

r = [0,2,5,6,8]
l = []

for x in range(0, len(r)):
    for y in range(0, len(r)):
        k = r[x]+r[y]
        l.append(k)
print l

1. I've heard that it's not a good idea to have a 'for' loop within another
'for' loop, and I was wondering if there is a more efficient way to do this.

2. Does anyone know if there is a built-in function or module that would do
the above task in NumPy or Numarray (or even in Python)? 

I would really appreciate it if anyone could let me know. 

Thanks for your help!
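
For readers on current NumPy, the NewAxis solution at the top of this message can be sketched as follows. The spelling np.newaxis and the use of ravel() instead of .flat are adaptations (in NumPy, .flat is an iterator rather than a flattened array); the rest follows the two-line solution above.

```python
import numpy as np

r = np.array([0, 2, 5, 6, 8])

# r[:, np.newaxis] has shape (5, 1) and r[np.newaxis, :] has shape (1, 5);
# broadcasting their sum gives a (5, 5) table with entry [x, y] = r[x] + r[y].
table = r[:, np.newaxis] + r[np.newaxis, :]

# Flatten row by row, which matches the order of the nested loops.
l = table.ravel()
```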

Re: [Numpy-discussion] a 'for' loop within another 'for' loop?

From: Paul F. D. <pa...@pf...> - 2004-07-14 12:56:51

 >>> add.reduce(take(r,indices([len(r),len(r)]))).flat
array([ 0,  2,  5,  6,  8,  2,  4,  7,  8, 10,  5,  7, 10, 11, 13,  6, 
8, 11, 12, 14,  8, 10, 13, 14, 16])

Always like a good challenge in the morning. God, it is like the old 
rush of writing APL.

Hee-Seng Kye wrote:

> Hi. I wrote a program to calculate sums of every possible combinations 
> of two indices of a list. The main body of the program looks something 
> like this:
> 
> r = [0,2,5,6,8]
> l = []
> 
> for x in range(0, len(r)):
>     for y in range(0, len(r)):
>         k = r[x]+r[y]
>         l.append(k)
> print l
> 
> 1. I've heard that it's not a good idea to have a 'for' loop within 
> another 'for' loop, and I was wondering if there is a more efficient way 
> to do this.
> 
> 2. Does anyone know if there is a built-in function or module that would 
> do the above task in NumPy or Numarray (or even in Python)?
> 
> I would really appreciate it if anyone could let me know.
> 
> Thanks for your help!

Re: [Numpy-discussion] numarray-1.0 Bug Alert

From: Todd M. <jm...@st...> - 2004-07-14 11:37:37

On Wed, 2004-07-14 at 05:36, Francesc Alted wrote:
> On Tuesday 13 July 2004 19:41, Todd Miller wrote:
> > The real fix for the bug appears to be to redefine the semantics of
> > numarray's PyArrayObject ->data pointer to include ->byteoffset,
> > altering the C-API. 
> 
> Oh well, I'm afraid that I'll be affected by that :(. Just to understand
> that fully, you mean that real data for an array will start in the future at
> narr->data, instead of narr->data+narr->byteoffset as it does now?

That is the current plan.  I was thinking developers could just replace
the new narr->data with (narr->data - narr->byteoffset) if needed.  I'm
assuming the planned changes will cost at most a few edits and package
redistribution, which I understand is still a major pain in the neck; 
let me know if the cost is higher than that for some reason.

Regards,
Todd

Re: [Numpy-discussion] numarray-1.0 Bug Alert

From: Francesc A. <fa...@py...> - 2004-07-14 09:36:23

On Tuesday 13 July 2004 19:41, Todd Miller wrote:
> The real fix for the bug appears to be to redefine the semantics of
> numarray's PyArrayObject ->data pointer to include ->byteoffset,
> altering the C-API. 

Oh well, I'm afraid that I'll be affected by that :(. Just to understand
that fully, you mean that real data for an array will start in the future at
narr->data, instead of narr->data+narr->byteoffset as it does now?

-- 
Francesc Alted

Re: [Numpy-discussion] a 'for' loop within another 'for' loop?

From: Hee-Seng K. <ky...@ea...> - 2004-07-14 06:29:52

Thank you so much.  It works beautifully!

On Jul 14, 2004, at 1:01 AM, Warren Focke wrote:

> l = Numeric.add.outer(r, r).flat
> oughta do the trick.  Should work for numarray, too.
>
> On Tue, 13 Jul 2004, Hee-Seng Kye wrote:
>
>> Hi.  I wrote a program to calculate sums of every possible 
>> combinations
>> of two indices of a list.  The main body of the program looks 
>> something
>> like this:
>>
>> r = [0,2,5,6,8]
>> l = []
>>
>> for x in range(0, len(r)):
>>      for y in range(0, len(r)):
>>          k = r[x]+r[y]
>>          l.append(k)
>> print l
>>
>> 1. I've heard that it's not a good idea to have a 'for' loop within
>> another 'for' loop, and I was wondering if there is a more efficient
>> way to do this.
>>
>> 2. Does anyone know if there is a built-in function or module that
>> would do the above task in NumPy or Numarray (or even in Python)?
>>
>> I would really appreciate it if anyone could let me know.
>>
>> Thanks for your help!
>
>
> -------------------------------------------------------
> This SF.Net email sponsored by Black Hat Briefings & Training.
> Attend Black Hat Briefings & Training, Las Vegas July 24-29 -
> digital self defense, top technical experts, no vendor pitches,
> unmatched networking opportunities. Visit www.blackhat.com
> _______________________________________________
> Numpy-discussion mailing list
> Num...@li...
> https://lists.sourceforge.net/lists/listinfo/numpy-discussion
>

[Numpy-discussion] ANN: Reminder -- SciPy 04 is coming up

From: eric j. <er...@en...> - 2004-07-14 05:08:06

Hey folks,

Just a reminder that SciPy 04 is coming up.  More information is here:

http://www.scipy.org/wikis/scipy04

About the Conference and Keynote Speaker
---------------------------------------------
The 1st annual *SciPy Conference* will be held this year at Caltech, 
September 2-3, 2004.  As some of you may know, we've experienced great 
participation in two SciPy "Workshops" (with ~70 attendees in both 2002 
and 2003) and this year we're graduating to a "conference."  With the 
prestige of a conference comes the responsibility of a keynote address.  
This year, Jim Hugunin has answered the call and will be speaking to 
kick off the meeting on Thursday September 2nd.  Jim is the creator of 
Numeric Python, Jython, and co-designer of AspectJ. Jim is currently 
working on IronPython--a fast implementation of Python for .NET and Mono.

Presenters
-----------
We still have room for a few more standard talks, and there is plenty of 
room for lightning talks. Because of this, we are extending the abstract 
deadline until July 23rd.  Please send your abstract to 
abs...@sc....  Travis Oliphant is organizing the presentations 
this year. (Thanks!)  Once accepted, papers and/or presentation slides 
are acceptable and are due by August 20, 2004. 

Registration
-------------
Early registration ($100.00) has been extended to July 23rd.  Follow the 
links off of the main conference site:

http://www.scipy.org/wikis/scipy04

After July 23rd, registration will be $150.00.  Registration includes 
breakfast and lunch Thursday & Friday and a very nice dinner Thursday 
night.  Please register as soon as possible as it will help us in 
planning for food, room sizes, etc.

Sprints
--------
As of now, we really haven't had much of a call for coding sprints for 
the 3 days prior to SciPy 04.  Below is the original announcement about 
sprints.  If you would like to suggest a topic and see if others are 
interested, please send a message to the list.  Otherwise, we'll forgo 
the sprints session this year.

    We're also planning three days of informal "Coding Sprints" prior to
    the conference -- August 30 to September 1, 2004.  Conference
    registration is not required to participate in the sprints.  Please
    email the list, however, if you plan to attend.  Topics for these
    sprints will be determined via the mailing lists as well, so please
    submit any suggestions for topics to the scipy-user list:

    list signup: http://www.scipy.org/mailinglists/
    list address: sci...@sc...


thanks,
eric

Re: [Numpy-discussion] a 'for' loop within another 'for' loop?

From: Warren F. <fo...@sl...> - 2004-07-14 05:01:49

l = Numeric.add.outer(r, r).flat
oughta do the trick.  Should work for numarray, too.

On Tue, 13 Jul 2004, Hee-Seng Kye wrote:

> Hi.  I wrote a program to calculate sums of every possible combinations
> of two indices of a list.  The main body of the program looks something
> like this:
>
> r = [0,2,5,6,8]
> l = []
>
> for x in range(0, len(r)):
>      for y in range(0, len(r)):
>          k = r[x]+r[y]
>          l.append(k)
> print l
>
> 1. I've heard that it's not a good idea to have a 'for' loop within
> another 'for' loop, and I was wondering if there is a more efficient
> way to do this.
>
> 2. Does anyone know if there is a built-in function or module that
> would do the above task in NumPy or Numarray (or even in Python)?
>
> I would really appreciate it if anyone could let me know.
>
> Thanks for your help!
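
A quick way to convince yourself that add.outer reproduces the nested loops is to run both and compare. This sketch uses NumPy's np.add.outer in place of Numeric.add.outer (an assumption that the two behave alike for this case) and ravel() in place of .flat.

```python
import numpy as np

r = [0, 2, 5, 6, 8]

# Reference result from the nested loops in the question.
looped = []
for x in range(len(r)):
    for y in range(len(r)):
        looped.append(r[x] + r[y])

# Vectorized equivalent: add.outer builds the full table of pairwise sums,
# and ravel() flattens it row by row, matching the loop order.
vectorized = np.add.outer(r, r).ravel()
```

The loop does len(r)**2 Python-level additions, while add.outer does the same arithmetic in compiled code, which is where the speedup for large inputs comes from.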

[Numpy-discussion] a 'for' loop within another 'for' loop?

From: Hee-Seng K. <ky...@ea...> - 2004-07-14 02:22:22

Hi.  I wrote a program to calculate sums of every possible combination 
of two indices of a list.  The main body of the program looks something 
like this:

r = [0,2,5,6,8]
l = []

for x in range(0, len(r)):
     for y in range(0, len(r)):
         k = r[x]+r[y]
         l.append(k)
print l

1. I've heard that it's not a good idea to have a 'for' loop within 
another 'for' loop, and I was wondering if there is a more efficient 
way to do this.

2. Does anyone know if there is a built-in function or module that 
would do the above task in NumPy or Numarray (or even in Python)?

I would really appreciate it if anyone could let me know.

Thanks for your help!

Re: [Numpy-discussion] differencing numarray arrays.

From: Russell E O. <rowen@u.washington.edu> - 2004-07-14 00:04:10

At 1:41 PM -0700 2004-07-13, Mike Zingale wrote:
>thanks, all these responses helped.  I guess I was still a little
>unclear with the slicing abilities in numarray...

Also note that there is a shift function: numarray.nd_image.shift

In your case I suspect slicing is better, but there are times when 
one really does want to shift the data (e.g. when one wants the 
resulting array to be the same shape as the original).

-- Russell

Re: [Numpy-discussion] differencing numarray arrays.

From: Mike Z. <zi...@uc...> - 2004-07-13 20:41:29

thanks, all these responses helped.  I guess I was still a little
unclear with the slicing abilities in numarray.

Mike


On Tue, 13 Jul 2004, Paul Dubois wrote:

> Two of the responses to your question, while correct, might have seemed
> mysterious to a beginner.
>
> a[1:] - a[:-1]
>
> is actually shorthand for:
>
> a[1:, :] - a[:-1, :]
>
> Or to be even more explicit:
>
> n = 8
> a[1:n, 0:n] - a[0:(n-1), 0:n]
>
> If you had wanted the difference in the second index, you have to use
> the more explicit forms.
>
>
>
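
The slicing forms quoted above can be checked directly. This is a small sketch in current NumPy rather than numarray (an assumption; the slicing syntax is the same), including the second-index variant that needs the explicit two-index form.

```python
import numpy as np

a = np.arange(64).reshape(8, 8)

# Difference along the first index:
# b[i-1, j] = a[i, j] - a[i-1, j] for 1 <= i < 8, so b has one row fewer.
b = a[1:] - a[:-1]
b_explicit = a[1:, :] - a[:-1, :]

# Difference along the second index needs the explicit two-index form.
c = a[:, 1:] - a[:, :-1]

# NumPy also offers diff() for the same computation.
b_diff = np.diff(a, axis=0)
```

Note the index shift: row 0 of b holds the difference for i = 1, since there is no row before a[0].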

Re: [Numpy-discussion] differencing numarray arrays.

From: Robert K. <rk...@uc...> - 2004-07-13 20:00:44

Mike Zingale wrote:

> Hi, I am trying to efficiently compute a difference of two 2-d flux
> arrays, as arises quite commonly in finite-difference/finite-volume
> methods.  Ex:
> 
> a = arange(64)
> a.shape = (8,8)
> 
> I want to create a new array, b, such that
> 
> b[i,j] = a[i,j] - a[i-1,j]
> 
> for 1 <= i < 8
>     0 <= i < 8

Try
b = a[1:] - a[:-1]

-- 
Robert Kern
rk...@uc...

"In the fields of hell where the grass grows high
  Are the graves of dreams allowed to die."
   -- Richard Harter

Re: [Numpy-discussion] differencing numarray arrays.

From: Tim H. <tim...@co...> - 2004-07-13 19:58:10

Mike Zingale wrote:

>Hi, I am trying to efficiently compute a difference of two 2-d flux
>arrays, as arises quite commonly in finite-difference/finite-volume
>methods.  Ex:
>
>a = arange(64)
>a.shape = (8,8)
>
>I want to create a new array, b, such that
>
>b[i,j] = a[i,j] - a[i-1,j]
>
>for 1 <= i < 8
>    0 <= i < 8
>  
>
That's supposed to be a j in the second eq., right?

If I understand you right, what you want is:

b = a[1:] - a[:-1]

-tim

>I can obviously do this through loops, but this is quite slow.  In IDL,
>which is often compared to numarray/python, this is simple to do with the
>shift() function, but I cannot find an efficient way to do it with
>numarray arrays.
>
>I tried defining a list
>
>i = range(8)
>im1[1:9] = im1[1:9] - 1
>
>and indexing with im1, but this does not work.
>
>Any suggestions?  For large array, this simple differencing in python is
>very expensive when using loops.
>
>Thanks,
>
>Mike
>
>------------------------------------------------------------------------------
>Michael Zingale
>UCO/Lick Observatory
>UCSC
>Santa Cruz, CA 95064
>
>phone:  (831) 459-5246
>fax:    (831) 459-5265
>e-mail: zi...@uc...
>web:    http://www.ucolick.org/~zingale
>
>``Don't worry head, the computer will do our thinking now''  -- Homer

[Numpy-discussion] differencing numarray arrays.

From: Mike Z. <zi...@uc...> - 2004-07-13 19:53:19

Hi, I am trying to efficiently compute a difference of two 2-d flux
arrays, as arises quite commonly in finite-difference/finite-volume
methods.  Ex:

a = arange(64)
a.shape = (8,8)

I want to create a new array, b, such that

b[i,j] = a[i,j] - a[i-1,j]

for 1 <= i < 8
    0 <= i < 8

I can obviously do this through loops, but this is quite slow.  In IDL,
which is often compared to numarray/python, this is simple to do with the
shift() function, but I cannot find an efficient way to do it with
numarray arrays.

I tried defining a list

i = range(8)
im1[1:9] = im1[1:9] - 1

and indexing with im1, but this does not work.

Any suggestions?  For large array, this simple differencing in python is
very expensive when using loops.

Thanks,

Mike

------------------------------------------------------------------------------
Michael Zingale
UCO/Lick Observatory
UCSC
Santa Cruz, CA 95064

phone:  (831) 459-5246
fax:    (831) 459-5265
e-mail: zi...@uc...
web:    http://www.ucolick.org/~zingale

``Don't worry head, the computer will do our thinking now''  -- Homer

[Numpy-discussion] numarray-1.0 Bug Alert

From: Todd M. <jm...@st...> - 2004-07-13 17:41:56

Overview

There is a bug in numarray's Numeric compatible C-API.  The bug has been
latent for a long time, since numarray-0.3 was released roughly two
years ago.  It is serious because it results in wrong answers for
certain extension functions fed a certain class of arrays.

What's affected

The bug affects numarray's add-on packages or third party
extension functions which use the Numeric compatibility C-API. 
Generally, this means C-code that was either ported from Numeric or was
written with both Numeric and numarray in mind.  This includes the
add-on packages numarray.linear_algebra,  numarray.fft,
numarray.random_array, and numarray.mlab.  More recently, it includes
the ports of core Numeric functions to numarray.numeric.  Because
numarray.ma uses numarray.numeric,  the bug also affects numarray.ma. 
Finally, for numarray-1.0 this bug affects the functions numarray.argmin
and numarray.argmax; these should be the only two functions in core
numarray which are affected.

Detailed Bug Description

The bug is exposed by calling an extension function (written using the
Numeric compatible C-API) with an array that has a non-zero _byteoffset
attribute.  Arrays with non-zero _byteoffset are typically created as a
result of partially indexing higher dimensional arrays or slicing
arrays.  Partially indexing or slicing an array generally results in a
sub-array, a view which often refers to an interior region of the
original array buffer.  Because numarray's PyArrayObject does not
currently include its ->byteoffset in its ->data pointer as the Numeric
compatibility API assumes it does, an extension function sees the base
region of the original array rather than the region belonging to the
sub-array.
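
To make the "interior region" point concrete, here is a sketch in current NumPy, used as a stand-in because numarray's _byteoffset attribute is not available there. In NumPy the address exposed to C code already points at the first element of the view, which is the Numeric-style convention the compatibility API assumes. The variable names are illustrative only.

```python
import numpy as np

a = np.arange(10, dtype=np.int32)
sub = a[3:]                      # a view into the interior of a's buffer

# Address of the first element of each array, as exposed to C code.
base_addr = a.__array_interface__["data"][0]
sub_addr = sub.__array_interface__["data"][0]

# Bytes between the two starting points: 3 elements of 4 bytes each.
offset = sub_addr - base_addr
shares_buffer = np.shares_memory(a, sub)
```

An extension function that started reading at base_addr when handed sub would see a[0], a[1], ... instead of a[3], a[4], ..., which is the kind of wrong answer described above.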

Immediate User Workaround

A simple user level workaround for people that need to use the affected
packages and functions today is one like the following:

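import numarray

def make_safe_for_numeric_api(a):
    # Copy any array that has a non-zero _byteoffset (typically a slice
    # or partially indexed sub-array); pass other arrays through as-is.
    a = numarray.asarray(a)
    if a._byteoffset != 0:
        return a.copy()
    else:
        return a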

The array inputs to an affected extension function need to be wrapped
with calls to make_safe_for_numeric_api().  Since this is intrusive and
a real fix should be released in the near future, this approach is not
recommended.

Long Term Fix

The real fix for the bug appears to be to redefine the semantics of
numarray's PyArrayObject ->data pointer to include ->byteoffset,
altering the C-API.  This should make most existing Numeric compatible
extension functions work without modification or recompilation,  but
will necessitate the re-compilation of some extension functions written
using the native numarray API approaches (the NA_* functions and
macros).   This recompilation will be required because key macros will
change, most notably NA_OFFSETDATA. This fix is not the only possible
one, and other suggestions are welcome,  but changing the semantics of
->data appears to be the best way to facilitate numarray/Numeric
interoperability.  By doing this fix, numarray operates more like
Numeric so fewer changes need to be made in the future to perform ports
of Numeric code to numarray.

Impact of Proposed Fix

Regrettably, the proposed fix will break binary compatibility for
clients of the numarray-1.0 native C-API.  So, extensions built using
the numarray native C-API will need to be rebuilt for numarray-1.1. 
Extensions that have made direct access to PyArrayObject's ->data and
require the original offsetless meaning will also need to change code
for numarray-1.1.  This is something we *really* wanted to avoid... it
just isn't going to happen this time.  

The Plan

The current plan is to fix the Numeric compatible API by changing the
semantics of ->data and release numarray-1.1 relatively soon, hopefully
within 2 weeks.   I'm sorry for any inconvenience this has caused
numarray users.

Regards,
Todd Miller

[Numpy-discussion] PyTables 0.8.1 released

From: Francesc A. <fa...@py...> - 2004-07-13 09:12:23

PyTables is a hierarchical database package designed to efficiently
manage very large amounts of data. PyTables is built on top of the
HDF5 library and the numarray package. It features an object-oriented
interface that, combined with natural naming and C-code generated from
Pyrex sources, makes it a fast, yet extremely easy-to-use tool for
interactively saving and retrieving different kinds of datasets. It
also provides flexible indexed access on disk to anywhere in the data.

The primary purpose of this release is to incorporate updates
related to the newly released numarray 1.0. I've taken the opportunity
to backport some improvements added in PyTables 0.9 (in alpha stage)
as well as to fix the known problems.

Improvements:

- The logic for computing the buffer sizes has been revamped. As a
  consequence, the performance of writing/reading tables with large
  record sizes has improved by a factor of ten or more, now exceeding
  70 MB/s for writing and 130 MB/s for reading (using compression).

- The maximum record size for tables has been raised to 512 KB
  (before it was 8 KB, due to some internal limitations)

- Documentation has been improved in many minor details. As a result
  of a fix in the underlying documentation system (tbook), chapters
  now start on odd pages instead of even ones. So those of you who want
  to print double-sided will probably have better luck now when
  aligning pages ;).  In addition, the HTML documentation has
  improved its look as well.

Bug Fixes:

- Indexing of Arrays with list or tuple flavors (#968131)
  When retrieving single elements from an array with 'List' or
  'Tuple' flavors, an error occurred. This has been
  corrected and now you can retrieve fileh.root.array[2] without
  problems for 'List' or 'Tuple' flavored (E, VL)Arrays.
  
- Iterators on Arrays with list or tuple flavors fail (#968132)
  When using iterators with Array objects with 'List' or
  'Tuple' flavors, an error occurred. This has been
  corrected.

- Last Index (-1) of Arrays doesn't work (#968149)
  When accessing the last element in an Array using the notation
  -1, an empty list (or tuple or array) was returned instead of the
  proper value. This happened in general with all negative
  indices. Fixed.

- Table.read(flavor="List") should return pure lists (#972534)
  However, it used to return a pointer to numarray.records.Record
  instances, as in:

   >>> fileh.root.table.read(1,2,flavor="List") 
    [<numarray.records.Record instance at 0x4128352c>] 
   >>> fileh.root.table.read(1,3,flavor="List") 
    [<numarray.records.Record instance at 0x4128396c>, 
     <numarray.records.Record instance at 0x41283a8c>] 
 
  Now the following records are returned:

   >>> fileh.root.table.read(1,2, flavor="List") 
    [(' ', 1, 1.0)] 
   >>> fileh.root.table.read(1,3, flavor="List") 
    [(' ', 1, 1.0), 
     (' ', 2, 2.0)] 
 
  In addition, when reading a single row of a table, a
  numarray.records.Record pointer was returned:
 
  >>> fileh.root.table[1] 
   <numarray.records.Record instance at 0x4128398c> 
 
  Now, it returns a tuple:

  >>> fileh.root.table[1] 
   (' ', 1, 1.0) 
 
  Which I think is more consistent, and more Pythonic.

- Copy of leaves fails... (#973370)
  Attempting to copy leaves (Table or Array with different flavors) on
  top of themselves caused an internal error in PyTables. This has
  been corrected by silently avoiding the copy and returning the
  original Leaf as a result.

Minor changes:

- When assigning a value to a non-existing field in a table row, now a
  KeyError is raised, instead of the AttributeError that was issued
  before. I think this is more consistent with the type of error.

- Tests have been improved so as to pass the whole suite when compiled
  in 64 bit mode on a Linux/PowerPC machine (namely a dual-G5 Powermac
  running a 64-bit, 2.6.4 Linux kernel and the preview YDL
  distribution for G5, with 64-bit GCC toolchain). Thanks to Ciro
  Cattuto for testing and reporting the modifications that were
  needed.


Where can PyTables be applied?
------------------------------

PyTables is not designed to work as a relational database competitor,
but rather as a teammate. If you want to work with large datasets of
multidimensional data (for example, for multidimensional analysis), or
just provide a categorized structure for some portions of your cluttered
RDBMS, then give PyTables a try. It works well for storing data from data
acquisition systems (DAS), simulation software, network data monitoring
systems (for example, traffic measurements of IP packets on routers),
very large XML files, or for creating a centralized repository for system 
logs, to name only a few possible uses.
 
What is a table?
----------------

A table is defined as a collection of records whose values are stored in
fixed-length fields. All records have the same structure and all values
in each field have the same data type.  The terms "fixed-length" and
"strict data types" may seem strange requirements for a
language like Python that supports dynamic data types, but they serve a
useful function if the goal is to save very large quantities of data
(such as those generated by many scientific applications, for example) in
an efficient manner that reduces demand on CPU time and I/O resources.
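
As a rough illustration of why fixed-length fields help, here is a minimal
sketch using only the standard `struct` module. It is not PyTables code: the
record layout and the `read_row` helper are assumptions made up for this
example. Because every record has the same size, any row can be located by
simple arithmetic on its index.

```python
import struct

# Hypothetical record layout: a 4-byte string, a 32-bit int, a 64-bit float.
# "<" selects little-endian with no padding, so every record has the same size.
RECORD = struct.Struct("<4sid")

rows = [(b"ab", 1, 1.0), (b"cd", 2, 2.0), (b"ef", 3, 3.0)]

# Pack all rows into one contiguous buffer.
buffer = b"".join(RECORD.pack(*row) for row in rows)


def read_row(buf, index):
    """Return row `index` by computing its byte offset directly."""
    return RECORD.unpack_from(buf, index * RECORD.size)


second = read_row(buffer, 1)
```

Note that short strings are padded with null bytes to fill the fixed-width
field, which is the price paid for constant-size records.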

What is HDF5?
-------------

For those who do not know HDF5, it is a general-purpose library and
file format for storing scientific data, developed at NCSA. HDF5
can store two primary objects: datasets and groups. A dataset is
essentially a multidimensional array of data elements, and a group is a
structure for organizing objects in an HDF5 file. Using these two basic
constructs, one can create and store almost any kind of scientific data
structure, such as images, arrays of vectors, and structured and
unstructured grids. You can also mix and match them in HDF5 files
according to your needs.

Platforms
---------

I'm using Linux (Intel 32-bit) as the main development platform, but
PyTables should be easy to compile/install on many other UNIX
machines. This package has also passed all the tests on an UltraSparc
platform with Solaris 7 and Solaris 8. It compiles and passes all
the tests on an SGI Origin2000 with MIPS R12000 processors, with the
MIPSPro compiler and running IRIX 6.5. It runs fine on Linux
64-bit platforms, like an AMD Opteron running SuSE Linux Enterprise
Server or a PowerPC G5 with Linux 2.6.x in 64-bit mode. It has also been
tested on MacOSX platforms (10.2, but it should also work on newer
versions).

Regarding Windows platforms, PyTables has been tested with Windows
2000 and Windows XP (using the Microsoft Visual C compiler), but it
should work with other flavors as well.

An example?
-----------

For online code examples, have a look at

http://pytables.sourceforge.net/html/tut/tutorial1-1.html

and, for newly introduced Variable Length Arrays:

http://pytables.sourceforge.net/html/tut/vlarray2.html

Web site
--------

Go to the PyTables web site for more details:

http://pytables.sourceforge.net/

Share your experience
---------------------

Let me know of any bugs, suggestions, gripes, kudos, etc. you may
have.

Enjoy!

-- 
Francesc Alted

Re: [Numpy-discussion] RecArray.tolist() suggestion

From: Francesc A. <fa...@py...> - 2004-07-13 09:06:29

On Tuesday 13 July 2004 10:28, Francesc Alted wrote:
> On Monday 12 July 2004 23:14, Perry Greenfield wrote:
> > What I'm wondering about is what a single element of a record array
> > should be. Returning a tuple has an undeniable simplicity to it.
> 
> Yeah, this is why I'm strongly biased toward this possibility.
> 
> > On the other hand, we've been using recarrays that allow naming the
> > various columns (which we refer to as "fields").  If one can refer
> > to fields of a recarray, shouldn't one be able to refer to a field
> > (by name) of one of its elements? Or are you proposing that basic
> > recarrays not have that sort of capability (something added by a
> > subclass)?
> 
> Well, I'm not sure about that. But in case most people would like to
> access records by field as well as by index, I would advocate that
> Record instances behave as much as possible like a tuple (or
> dictionary?). That includes creating appropriate __str__() *and*
> __repr__() methods as well as a __getitem__() that supports both field names
> and indices. I'm not sure whether providing a __getattr__() method
> would be ok, but for the sake of simplicity and in order to have (preferably)
> only one way to do things, I would say no.

I've been thinking that one way to make returning a tuple for a single
element of a RecArray compatible with retrieving a field by name is to
extend RecArray.__getitem__ so that it supports key names in addition to
indices. This is best seen with an example:

Right now, one can say:

>>> r=records.array([(1,"asds", 24.),(2,"pwdw", 48.)], "1i4,1a4,1f8")
>>> r._fields["c1"]
array([1, 2])
>>> r._fields["c1"][1]
2

What I propose is to be able to say:

>>> r["c1"]
array([1, 2])
>>> r["c1"][1]
2

Which would replace the notation:

>>> r[1]["c1"]
2

which was recently suggested.

I.e. the suggestion is to realize RecArrays as a collection of columns,
as well as a collection of rows.
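
To make the proposal concrete, here is a minimal pure-Python sketch of a
container whose __getitem__ accepts either a field name or a row index. It is
only an illustration under stated assumptions: `ToyRecArray` is a hypothetical
name, columns come back as plain lists rather than arrays, and none of this
is numarray code.

```python
class ToyRecArray:
    """Toy stand-in for a record array; not the numarray implementation."""

    def __init__(self, rows, names):
        self._names = list(names)
        self._rows = [tuple(row) for row in rows]

    def __getitem__(self, key):
        if isinstance(key, str):
            # Field name: return the whole column.
            if key not in self._names:
                raise KeyError(key)
            col = self._names.index(key)
            return [row[col] for row in self._rows]
        # Integer index: return one row as a plain tuple.
        return self._rows[key]


r = ToyRecArray([(1, "asds", 24.0), (2, "pwdw", 48.0)], ["c1", "c2", "c3"])
column = r["c1"]      # column access by name
element = r["c1"][1]  # one value from that column
row = r[1]            # row access by index, returned as a tuple
```

With this design a single element stays a simple tuple, and named access is
handled entirely at the array level.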

-- 
Francesc Alted

Re: [Numpy-discussion] RecArray.tolist() suggestion

From: Francesc A. <fa...@py...> - 2004-07-13 08:28:13

On Monday 12 July 2004 23:14, Perry Greenfield wrote:
> What I'm wondering about is what a single element of a record array
> should be. Returning a tuple has an undeniable simplicity to it.

Yeah, this is why I'm strongly biased toward this possibility.

> On the other hand, we've been using recarrays that allow naming the
> various columns (which we refer to as "fields").  If one can refer
> to fields of a recarray, shouldn't one be able to refer to a field
> (by name) of one of its elements? Or are you proposing that basic
> recarrays not have that sort of capability (something added by a
> subclass)?

Well, I'm not sure about that. But in case most people would like to
access records by field as well as by index, I would advocate that
Record instances behave as much as possible like a tuple (or
dictionary?). That includes creating appropriate __str__() *and*
__repr__() methods as well as a __getitem__() that supports both field names
and indices. I'm not sure whether providing a __getattr__() method
would be ok, but for the sake of simplicity and in order to have (preferably)
only one way to do things, I would say no.
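
A minimal sketch of what such a tuple-like record could look like in pure
Python follows. `ToyRecord` is a hypothetical name used only for
illustration, and this is not the numarray Record class. Subclassing tuple
means __str__() and __repr__() already print like a tuple, so only
__getitem__() needs extending.

```python
class ToyRecord(tuple):
    """Tuple subclass whose items can also be looked up by field name."""

    def __new__(cls, values, names):
        self = super().__new__(cls, values)
        self._names = tuple(names)
        return self

    def __getitem__(self, key):
        if isinstance(key, str):
            # Translate a field name into its position.
            try:
                key = self._names.index(key)
            except ValueError:
                raise KeyError(key) from None
        return super().__getitem__(key)


rec = ToyRecord((2, "pwdw", 48.0), ("c1", "c2", "c3"))
by_index = rec[0]
by_name = rec["c2"]
text = repr(rec)
```

No __getattr__() is defined here, in line with the preference for a single
way of reaching a field.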

Regards,

-- 
Francesc Alted

RE: [Numpy-discussion] RecArray.tolist() suggestion

From: Russell E O. <rowen@u.washington.edu> - 2004-07-12 23:08:07

At 5:14 PM -0400 2004-07-12, Perry Greenfield wrote:
>What I'm wondering about is what a single element of a record array
>should be. Returning a tuple has an undeniable simplicity to it.
>On the other hand, we've been using recarrays that allow naming the
>various columns (which we refer to as "fields").  If one can refer
>to fields of a recarray, shouldn't one be able to refer to a field
>(by name) of one of its elements? Or are you proposing that basic
>recarrays not have that sort of capability (something added by a
>subclass)?

In my opinion, a single item of a record array should be a 
RecordItem object that is a dictionary that keeps items in field 
order. Thus:
- use the standard dictionary interface to deal with values by name 
(except the keys are always in the correct order).
- one can also get and set all the data at once as a tuple. This is 
NOT a standard dictionary interface, but is essential. Functions such 
as getvalues(), setvalues(dataTuple) should do it.

Adopting the full dictionary interface means one gets a standard, 
mature and fairly complete set of features. ALSO a RecordItem object 
can then be used wherever a dictionary object is needed.
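
As a rough sketch of this idea, the class below builds on the built-in dict,
which preserves insertion order in Python 3.7 and later. The names RecordItem,
getvalues() and setvalues() come from the suggestion above; the
implementation itself is only an assumption for illustration and does not
stop callers from adding extra keys.

```python
class RecordItem(dict):
    """Hypothetical dict that keeps field order and adds tuple get/set."""

    def getvalues(self):
        # All values at once, in field order.
        return tuple(self.values())

    def setvalues(self, data_tuple):
        # Replace all values at once; the number of fields must match.
        if len(data_tuple) != len(self):
            raise ValueError("expected %d values" % len(self))
        for name, value in zip(list(self), data_tuple):
            self[name] = value


item = RecordItem([("c1", 3), ("c2", 33.0), ("c3", "c")])
item.setvalues((4, 44.0, "d"))
values = item.getvalues()
```

Because RecordItem is a real dict subclass, it can be passed to any code
that expects a dictionary.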

I suspect it's also useful to have named field access:
RecordItem.fieldname
but am a bit reluctant to suggest so many different ways of getting 
to the data.

I assume it will continue to be easy to get all data for a field by 
naming the appropriate field. That's a really nice feature. It would 
be even better if a masked array could be used, but I have no idea 
how hard this would be.

Which brings up a side issue: any hope of integrating masked arrays 
into numarray, such that they could be used wherever a numarray array 
could be used? Areas where I particularly find myself needing them 
include nd_image filtering and writing C extensions.

-- Russell

P.S. I submitted several feature requests and bug reports for records 
on sourceforge months ago. I hope they'll not be overlooked during 
the review process.

RE: [Numpy-discussion] RecArray.tolist() suggestion

From: Perry G. <pe...@st...> - 2004-07-12 21:14:29

Francesc Alted wrote:
> 
> As Perry said not too long ago that the numarray crew would ask for
> suggestions for RecArray improvements, I'm going to suggest a couple.
> 
> I find the .tolist() method quite inconvenient when applied to RecArray
> objects as it is now:
> 
> >>> r[2:4]
> array(
> [(3, 33.0, 'c'),
> (4, 44.0, 'd')],
> formats=['1UInt8', '1Float32', '1a1'],
> shape=2,
> names=['c1', 'c2', 'c3'])
> >>> r[2:4].tolist()
> [<numarray.records.Record instance at 0x406a946c>, 
> <numarray.records.Record instance at 0x406a912c>]
> 
> 
> The suggested behaviour would be:
> 
> >>> r[2:4].tolist()
> [(3, 33.0, 'c'),(4, 44.0, 'd')]
> 
> Another thing is that an element of a recarray is currently returned as a
> records.Record object instead of a tuple:
> 
> >>> r[2]
> <numarray.records.Record instance at 0x4064074c>
> 
> The suggested behaviour would be:
> 
> >>> r[2]
> (3, 33.0, 'c')
> 
> I think the latter would be consistent with the convention that a
> __getitem__(int) of a NumArray object returns a python type instead of a
> rank-0 array. In the same way, a __getitem__(int) of a RecArray should
> return a python type (a tuple in this case).
> 
These are good examples of where improvements are needed (we are 
also looking at how best to handle multidimensional arrays and
should have a proposal this week).

What I'm wondering about is what a single element of a record array
should be. Returning a tuple has an undeniable simplicity to it.
On the other hand, we've been using recarrays that allow naming the
various columns (which we refer to as "fields").  If one can refer
to fields of a recarray, shouldn't one be able to refer to a field
(by name) of one of its elements? Or are you proposing that basic
recarrays not have that sort of capability (something added by a
subclass)?

Perry

Re: [Numpy-discussion] How to read data from text files fast?

From: Chris B. <Chr...@no...> - 2004-07-09 16:43:56

Bruce,

Thanks for your feedback.

Bruce Southey wrote:
> While I am not really following your thread, I just wanted to comment that the
> Python Cookbook (at least the printed version) has some ways to count lines in a
> file - assuming that the number of lines provides the size.

The number of lines does not necessarily provide the size. In the 
general case, it doesn't at all. My whole goal here is the general case: 
being able to read a bunch of numbers out of any format of text file. 
This can be used as part of a parser for many file formats. If I was 
shooting for just one format, this would be easier, but not general 
purpose. Now that I have this, I can write a number of file format 
parsers in python with improved performance and easier syntax.
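
To illustrate the general idea (not the implementation discussed in this
thread), here is a naive pure-Python sketch that pulls every number out of
arbitrary text with a regular expression. The name `scan_floats` and the
pattern are assumptions made for this example. It will also pick up digits
embedded in words, so a real parser would need to be stricter.

```python
import re

# Matches integers, decimals and exponent notation such as -1.5e-3.
NUMBER = re.compile(r"[-+]?(?:\d+\.?\d*|\.\d+)(?:[eE][-+]?\d+)?")


def scan_floats(text):
    """Return every number found in `text`, ignoring everything else."""
    return [float(tok) for tok in NUMBER.findall(text)]


sample = "x = 1.5, y = -2e3\n3 4.25; .5"
values = scan_floats(sample)
```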

> Under Unix (but not
> windows),

I am aiming for a portable solution.

> Alternatively if sufficient memory is available, storing the file in memory
> (during the counting of elements) should always be faster than reading it a
> second time from the hard disk.

The primary reason to scan the file ahead of time to count the elements 
is to save the memory of duplicate copies of data. The other reason is 
to make memory management easier, but since I've already solved that 
problem, I'm done.
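
The memory argument can be sketched in pure Python as follows. This is only
an illustration under simplifying assumptions (whitespace-separated numbers,
a seekable stream, a made-up `read_numbers` helper): a first pass counts the
elements, the result is allocated once at its final size, and a second pass
fills it, so no intermediate list of all values is ever held.

```python
import io
from array import array


def read_numbers(stream):
    """Two-pass read: count tokens first, then fill a preallocated array."""
    # First pass: count whitespace-separated tokens without storing them.
    count = sum(len(line.split()) for line in stream)
    stream.seek(0)
    # Allocate the result once, at its final size.
    result = array("d", [0.0]) * count
    i = 0
    # Second pass: convert each token in place.
    for line in stream:
        for tok in line.split():
            result[i] = float(tok)
            i += 1
    return result


data = io.StringIO("1 2 3\n4.5 5\n")
out = read_numbers(data)
```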

thanks,
-Chris

-- 
Christopher Barker, Ph.D.
Oceanographer

NOAA/OR&R/HAZMAT         (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chr...@no...

[Numpy-discussion] Numpy compiling error... Help!

From: Thomas K. <tho...@ho...> - 2004-07-09 15:01:47

Hi

I'm trying to compile/install numpy on a RH9 machine. When doing so I run 
into problems.

I give the command:
python setup.py install

and get a long answer, with this error at the end:
gcc -shared build/temp.linux-i686-2.2/lapack_litemodule.o -L/usr/lib/atlas 
-llapack -lcblas -lf77blas -latlas -lg2c -o 
build/lib.linux-i686-2.2/lapack_lite.so
/usr/bin/ld: cannot find -llapack
collect2: ld returned 1 exit status
error: command 'gcc' failed with exit status 1

Does anyone know what I've done wrong? I've spent a lot of time on this and 
really need help now...

Regards
Thomas

