Re: [Pytables-users] Some experiences with PyTables

2011/12/7 Francesc Alted <fa...@py...>

> What you are saying is correct, except that it is NumPy, not HDF5, that
> drops the trailing null characters.  Look at this:
>
> In [27]: import numpy as np
>
> In [28]: np.array(["aaa"])
> Out[28]:
> array(['aaa'],
>       dtype='|S3')
>
> In [29]: np.array(["aaa\x00\x00"])
> Out[29]:
> array(['aaa'],
>       dtype='|S5')
>
> Of course, this behaviour was discussed a long time ago, when NumPy was
> introduced (around NumPy 1.0 or so, back in 2006), and people (especially
> Travis) found it to be the most convenient for the majority of use cases.
> If you are interested in getting the trailing bytes, you can always do:
>
> In [53]: a = np.array(["aaa\x00\x00"])
>
> In [54]: a[0]
> Out[54]: 'aaa'
>
> In [55]: "".join([chr(i) for i in a.view('b')])
> Out[55]: 'aaa\x00\x00'
>

Hmm, in this case the element to convert is an np.string_ and not an
ndarray, but the solution to get the trailing nulls is even easier:

In [70]: a = np.string_('aaa\x00\x00')

In [71]: a
Out[71]: 'aaa'

In [72]: a.data[:]
Out[72]: 'aaa\x00\x00'
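
The sessions above are from Python 2. As a rough sketch of the same idea on
Python 3 with a recent NumPy (where fixed-width strings are bytes), something
like the following should work. The variable names and the raw_item helper
are hypothetical and exist only for illustration; the sketch also assumes a
contiguous one-dimensional array.

```python
import numpy as np

# Fixed-width byte-string array; the dtype keeps the full 5-byte width.
a = np.array([b"aaa\x00\x00"])
itemsize = a.dtype.itemsize

# Element access strips the trailing NUL bytes.
stripped = a[0]

# The underlying buffer still holds every byte, padding included.
raw = a.tobytes()

# Viewing the same memory as unsigned bytes exposes the NULs as zeros.
codes = a.view(np.uint8).tolist()


def raw_item(arr, i):
    """Return the unstripped bytes of element i (hypothetical helper)."""
    # Slicing keeps an array (no scalar conversion), so nothing is stripped.
    return arr[i:i + 1].tobytes()


print(itemsize, stripped, raw, codes, raw_item(a, 0))
```

Slicing instead of indexing avoids the conversion to a scalar, which is the
step where the trailing nulls get dropped.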

-- 
Francesc Alted