From: Michael F. <mfe...@cr...> - 2015-07-31 14:37:59
Hi -

I've found it useful to allow some Chapel arrays to be read without knowing their size in advance. In particular, non-strided 1-D Chapel arrays that have sole ownership of their domain could be read into, with the read resizing the array to match the data. I've prototyped this for JSON and Chapel-style textual array formats (e.g. [1,2,3,4]). E.g.

  var A:[1..0] int;
  mychannel.read(A);

could read any number of elements into A and adjust its domain accordingly. The alternative is that such an operation is an error.

I think this kind of feature would be an improvement, but I'm not sure everyone will agree. To start the discussion, I have four design questions:

1) Does changing the size of an array when reading it, where possible, seem like the right idea? Or should reading an array always insist that the input has the same size as the existing array (which I believe is the behavior we are stuck with for arrays that share domains)?

2) Should rectangular arrays of any dimension be written in binary in a form that encodes the size of each dimension? (In other words, write the domain first.) Such a feature would make something like (1) possible for multi-dimensional arrays, but might not match what people expect from binary array formats. (I don't think we've documented what you actually get when writing an array in binary yet...)

3) Any suggestions for a Chapel array literal format for multi-dimensional arrays? How would you write such arrays in JSON (and would anyone want to)? At one point there was a proposal to put the domain in array literals, like this:

  var A = [ over {1..10} ];

but that doesn't really answer how to write multi-dimensional array literals. One approach would be to store the array elements in a flat way and just reshape them while reading; e.g.

  var A = [ over {1..2, 1..3}
            11, 12, 13,
            21, 22, 23 ];

where the spacing would not be significant.
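
To make the flat-and-reshape idea concrete, here is a minimal sketch that uses Chapel's existing reshape() function in place of the proposed literal syntax, which is not implemented. The names flat and D are only for illustration, and row-major ordering of the flat elements is an assumption about how such a format would list them.

```chapel
// Six values stored flat, in the order the proposed literal would
// list them after its domain (row-major order is assumed here).
var flat = [11, 12, 13, 21, 22, 23];

// The domain that the hypothetical "over {1..2, 1..3}" clause would name.
const D = {1..2, 1..3};

// reshape() yields a 2-D array over D holding the same values as flat.
var A = reshape(flat, D);

// The first element of the second row is the fourth flat value.
assert(A[2, 1] == 21);

writeln(A);
```

A reader for such a format could do the same thing: read the domain, read the elements into a flat buffer, and reshape once the element count is known to match the domain's size.
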
If we had a reasonable format, we could extend support like (1) to arrays of any dimension that do not share domains, even for some textual formats.

4) I'm finding that each layout or distribution needs to be adjusted separately in order to implement these operations (they are currently implemented in dsiSerialReadWrite, part of the domain map (dmap) interface). But it seems to me that how to read/write an array is reasonably independent of how the array is represented (as long as the I/O code can access the elements somehow). Is there a particular reason why these I/O operations are implemented on a per-dmap basis, rather than once for each type of array (rectangular, sparse, associative, etc.)? I'd like the implementation approach to make it more likely that writing an array of a particular shape and then reading it with a different domain map will work. But I might be confused about how it works now...

Thanks for any thoughts,

-michael