From: Michael F. <mfe...@cr...> - 2015-07-31 14:37:59
Hi -

I've found it useful to allow some Chapel arrays to be read without knowing their size in advance. In particular, non-strided 1-D Chapel arrays that have sole ownership over their domain could be read into where that operation will resize the array to match the data read. I've prototyped this for JSON and Chapel style textual array formats (e.g. [1,2,3,4] ).

E.g.

  var A:[1..0] int;
  mychannel.read(A);

could read into A any number of elements and adjust its domain accordingly. The alternative is that such an operation is an error.

So, I think that this kind of feature would be an improvement, but I'm not sure everyone will agree. To start the discussion, I have four design questions:

1) Does changing the size of a read array when possible seem like the right idea? Or should reading an array always insist that the input has the same size as the existing array (which I believe is behavior we are stuck with for arrays that share domains...)

2) Should any-dimensional rectangular arrays be written in binary in a form that encodes the size of each dimension? (In other words, write the domain first). Such a feature would make something like (1) possible for multi-dimensional arrays but might not match what people expect for binary array formats. (I don't think we've documented what you actually get when writing an array in binary yet...)

3) Any suggestions for a Chapel array literal format for multi-dimensional arrays? How would you write such arrays in JSON (and would anyone want to)? At one point there was a proposal to put the domain in array literals, like this:

  var A = [ over {1..10} ];

but that doesn't really answer how to write multidimensional array literals. One approach would be to store the array elements in a flat way and just reshape them while reading; e.g.

  var A = [ over {1..2, 1..3}
            11, 12, 13,
            21, 22, 23 ];

where the spacing would not be significant.

If we had a reasonable format, we could extend support like (1) to any-dimensional arrays that do not share domains, even for some textual formats.

4) I'm finding that each layout or distribution needs to be adjusted separately in order to implement these operations (they are currently implemented in dsiSerialReadWrite - part of the domain map (dmap) interface). But, it seems to me that how to read/write is reasonably independent of how the array is represented (as long as the I/O code can access the elements somehow). Is there a particular reason why these I/O operations are implemented on a per-dmap basis, rather than once for each type of array (rectangular, sparse, associative, etc)? I'd like the implementation approach to make it more likely that writing an array of a particular shape and then reading it with a different domain map will work. But, I might be confused about how it works now...

Thanks for any thoughts,

-michael
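The flat-with-domain literal floated in question 3 can be illustrated with a small parser. The sketch below is in Python rather than Chapel, purely as a language-neutral illustration. The `over` syntax is only a proposal in this thread, and the names `parse_over_literal` and `reshape_row_major` are hypothetical. It assumes integer elements, inclusive `lo..hi` ranges, and row-major element order (which is what the 2x3 example suggests).

```python
import math
import re


def reshape_row_major(flat, sizes):
    """Reshape a flat list into nested lists, last dimension varying fastest."""
    if len(sizes) == 1:
        return list(flat)
    step = math.prod(sizes[1:])  # number of elements in one slice of dimension 0
    return [
        reshape_row_major(flat[i * step:(i + 1) * step], sizes[1:])
        for i in range(sizes[0])
    ]


def parse_over_literal(text):
    """Parse '[ over {lo..hi, ...} e1, e2, ... ]' into (ranges, nested lists)."""
    m = re.fullmatch(r"\s*\[\s*over\s*\{([^}]*)\}(.*)\]\s*", text, re.DOTALL)
    if m is None:
        raise ValueError("not an 'over' array literal")
    # Each dimension is written as an inclusive range lo..hi.
    ranges = []
    for dim in m.group(1).split(","):
        lo, hi = dim.split("..")
        ranges.append((int(lo), int(hi)))
    # Elements form one flat comma-separated list; whitespace is not significant.
    flat = [int(tok) for tok in m.group(2).split(",") if tok.strip()]
    sizes = [max(hi - lo + 1, 0) for lo, hi in ranges]
    if len(flat) != math.prod(sizes):
        raise ValueError("element count does not match the domain")
    return ranges, reshape_row_major(flat, sizes)
```

Because the domain comes first, the reader knows the final shape before it sees any element. The sketch rejects a literal whose element count does not match the domain; what `[ over {1..10} ]` with no elements was meant to denote is not stated in the thread, so that case is left as an error here.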
From: Brian G. <br...@op...> - 2015-07-31 15:05:32
Thanks Michael! I've needed this many times while doing Project Euler problems.
From: Damian M. <da...@es...> - 2015-07-31 22:50:24
There are a lot of issues here.

On Fri, 31 Jul 2015, Michael Ferguson wrote:

> I've found it useful to allow some Chapel arrays to be read without
> knowing their size in advance. In particular, non-strided 1-D Chapel
> arrays that have sole ownership over their domain could be read into
> where that operation will resize the array to match the data read. I've
> prototyped this for JSON and Chapel style textual array formats

?

> (e.g. [1,2,3,4] ).
>
> E.g.
>   var A:[1..0] int;

... revisiting history here.

This looks like what Algol68 called flexible array bounds. Note that I was not programming when Algol68 was released!! I do remember reading stuff about the issues when I was using this in the early 80s. Because it was so long ago, I do not have references, but unlike when I was younger, we now have access to Google. So fishing for

  algol68 flexible array

Not that it says anything negative

  http://www.cs.virginia.edu/~mpw7t/cs655/pos2.html

There are books by the guy who coined the term 'Software Engineering', Friedrich L Bauer. (He was one of the original Algol68 architects who died only about 4 months ago). Both discuss this topic. One of the books is titled 'Compiler Construction: An Advanced Course', the other 'Algorithmic Language and Program Development'.

From memory, 'flex' bounds were regarded as a big no-no by many people, even though those saying this agreed that there were lots of cases where having them would be nice and make for much cleaner algorithms.

They were the basic mechanism behind Algol68's strings.

>   mychannel.read(A);
>
> could read into A any number of elements and adjust its domain accordingly.
> The alternative is that such an operation is an error.

> So, I think that this kind of feature would be an improvement, but I'm not
> sure everyone will agree. To start the discussion, I have four design
> questions:

> 1) Does changing the size of a read array when possible seem like the
>    right idea? Or should reading an array always insist that the input
>    has the same size as the existing array (which I believe is behavior
>    we are stuck with for arrays that share domains...)

I always prefer consistency. Not sure whether that is a very valid reason.

That said, and as Michael mentioned later, reading is a bit separate to how something is stored.

> 2) Should any-dimensional rectangular arrays be written in binary in a
>    form that encodes the size of each dimension? (In other words, write
>    the domain first). Such a feature would make something like (1)
>    possible for multi-dimensional arrays but might not match what people
>    expect for binary array formats. (I don't think we've documented
>    what you actually get when writing an array in binary yet...)

Remind me what the argument is against demanding that prior to reading the array contents, the domain be read and then used to allocate the array?

> 3) Any suggestions for a Chapel array literal format for
>    multi-dimensional arrays? How would you write such arrays in JSON (and
>    would anyone want to)? At one point there was a proposal to put the
>    domain in array literals, like this:
>
>      var A = [ over {1..10} ];
>
>    but that doesn't really answer how to write multidimensional array
>    literals. One approach would be to store the array elements in a flat
>    way and just reshape them while reading; e.g.
>
>      var A = [ over {1..2, 1..3}
>                11, 12, 13,
>                21, 22, 23 ];
>
>    where the spacing would not be significant.

Do you mean reshape (resize?) them after reading?

> If we had a reasonable format, we could extend support like (1) to
> any-dimensional arrays that do not share domains, even for some textual
> formats.

Not sure what you mean here.

> 4) I'm finding that each layout or distribution needs to be adjusted
>    separately in order to implement these operations (they are currently
>    implemented in dsiSerialReadWrite - part of the domain map (dmap)
>    interface). But, it seems to me that how to read/write is reasonably
>    independent of how the array is represented (as long as the I/O code
>    can access the elements somehow). Is there a particular reason why
>    these I/O operations are implemented on a per-dmap basis, rather than
>    once for each type of array (rectangular, sparse, associative, etc)?
>    I'd like the implementation approach to make it more likely that
>    writing an array of a particular shape and then reading it with a
>    different domain map will work. But, I might be confused about how it
>    works now...

There are a lot of issues there, from design to implementation. They need lots of thought.

Are we confusing language features with the practicality of I/O implementation? Just a thought.

Regards - Damian

Pacific Engineering Systems International, 277-279 Broadway, Glebe NSW 2037
Ph:+61-2-8571-0847 .. Fx:+61-2-9692-9623 | unsolicited email not wanted here
Views & opinions here are mine and not those of any past or present employer
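The "read the domain first, then allocate" approach raised for question 2 can be sketched as a binary layout that stores the bounds ahead of the elements. This is a Python illustration under stated assumptions, not Chapel's actual binary format (which the thread says was undocumented at the time): the layout (rank, then `lo`/`hi` per dimension, then elements, all little-endian 64-bit integers) and the names `write_array` and `read_array` are invented for this example.

```python
import io
import struct


def write_array(stream, ranges, flat):
    """Write rank, then (lo, hi) per dimension, then the flat elements."""
    stream.write(struct.pack("<q", len(ranges)))
    for lo, hi in ranges:
        stream.write(struct.pack("<qq", lo, hi))
    stream.write(struct.pack(f"<{len(flat)}q", *flat))


def read_array(stream):
    """Read the domain first, then exactly as many elements as it implies."""
    (rank,) = struct.unpack("<q", stream.read(8))
    ranges = [struct.unpack("<qq", stream.read(16)) for _ in range(rank)]
    count = 1
    for lo, hi in ranges:
        count *= max(hi - lo + 1, 0)  # an empty range such as 1..0 has size 0
    flat = list(struct.unpack(f"<{count}q", stream.read(8 * count)))
    return ranges, flat


# Round trip through an in-memory buffer.
buf = io.BytesIO()
write_array(buf, [(1, 2), (1, 3)], [11, 12, 13, 21, 22, 23])
buf.seek(0)
ranges, flat = read_array(buf)
```

With this layout the reader can size the array before touching the data, which is what makes resizing on read possible for multi-dimensional arrays. The cost is that the file is no longer a bare run of elements, which is the mismatch with expectations for binary formats that question 2 mentions.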
From: Michael F. <mfe...@cr...> - 2015-08-03 13:13:57
Hi Damian -

Thanks for your response. I'll answer some specific questions below.

On 7/31/15, 6:50 PM, "Damian McGuckin" <da...@es...> wrote:

>There are a lot of issues here.
>
>On Fri, 31 Jul 2015, Michael Ferguson wrote:
>
>> I've found it useful to allow some Chapel arrays to be read without
>> knowing their size in advance. In particular, non-strided 1-D Chapel
>> arrays that have sole ownership over their domain could be read into
>> where that operation will resize the array to match the data read. I've
>> prototyped this for JSON and Chapel style textual array formats
>?
>> (e.g. [1,2,3,4] ).

JSON and Chapel array literals both use the square-bracket syntax. See json.org if you're not familiar with JSON. I'd like the I/O system to be able to read such arrays. In particular, I'd like the I/O system to be able to read arrays written in the same way as a Chapel array literal.

If you're using a Chapel array literal, e.g.:

  var A = [1,2,3,4];

you don't have to specify the domain before-hand. So why should you have to read a domain before the array contents when you are doing I/O?

>>
>> E.g.
>>   var A:[1..0] int;
>
>... revisiting history here.
>
>This looks like what Algol68 called flexible array bounds. Note that I was
>not programming when Algol68 was released!! I do remember reading stuff
>about the issues when I was using this in the early 80s. Because it was so
>long ago, I do not have references, but unlike when I was younger, we
>now have access to Google. So fishing for
>
>  algol68 flexible array
>
>Not that it says anything negative
>
>  http://www.cs.virginia.edu/~mpw7t/cs655/pos2.html
>
>There are books by the guy who coined the term 'Software Engineering',
>Friedrich L Bauer. (He was one of the original Algol68 architects who died
>only about 4 months ago). Both discuss this topic. One of the books is
>titled 'Compiler Construction: An Advanced Course', the other
>'Algorithmic Language and Program Development'.
>
>From memory, 'flex' bounds were regarded as a big no-no by many people,
>even though those saying this agreed that there were lots of cases where
>having them would be nice and make for much cleaner algorithms.

What would be most useful about this bit of history is if we knew *why* flexible array bounds were regarded as a no-no by many people. A lot has changed in languages and their implementation since then, and it would be easy to dismiss as a problem that just needed a slightly better solution.

>
>They were the basic mechanism behind Algol68's strings.
>
>>   mychannel.read(A);
>>
>> could read into A any number of elements and adjust its domain
>> accordingly.
>> The alternative is that such an operation is an error.
>
>> So, I think that this kind of feature would be an improvement, but I'm
>> not sure everyone will agree. To start the discussion, I have four design
>> questions:
>
>> 1) Does changing the size of a read array when possible seem like the
>>    right idea? Or should reading an array always insist that the input
>>    has the same size as the existing array (which I believe is behavior
>>    we are stuck with for arrays that share domains...)
>
>I always prefer consistency. Not sure whether that is a very valid reason.
>
>That said, and as Michael mentioned later, reading is a bit separate to
>how something is stored.

Both ways are getting us consistency with a different thing. I desire consistency between I/O operations and array literals in Chapel. I don't think we can have that and also have it work for all array scenarios. However, I don't think making it an error to read an array of a different shape when the domain is shared would be so bad... we are already doing that kind of thing with the array-as-list operations like push_back. I think it's worth the improvement in productivity, and I don't think we can convince people who have used Python and the like that they can't have e.g.

  myArray.push_back(1);

because it resizes the array.

Similarly, in the I/O scenario I'm working on, I'm reading JSON which is an existing format (that we have no control over). JSON does not use a format for arrays that starts with the array length, and all arrays are variable sized.

>
>> 2) Should any-dimensional rectangular arrays be written in binary in a
>>    form that encodes the size of each dimension? (In other words, write
>>    the domain first). Such a feature would make something like (1)
>>    possible for multi-dimensional arrays but might not match what people
>>    expect for binary array formats. (I don't think we've documented
>>    what you actually get when writing an array in binary yet...)
>
>Remind me what the argument is against demanding that prior to reading
>the array contents, the domain be read and then used to allocate the array?

Three reasons:
 1) Similarity with Chapel's array literal syntax
 2) Some file formats do not include the length (e.g. JSON)
 3) You can't do a whole-array read from JSON or a Chapel array literal-like format without this functionality. So, it makes the I/O code more complicated.

For example, if I have

  record MyRecord {
    var myArray: [1..0] int; // (start with empty array)
  }

I might want to read something in this JSON format:

  {"myArray":[1,2,3]}
  {"myArray":[4,5]}

I'd like to be able to do that with repeated calls to readf that indicate to use JSON format like this:

  var r:MyRecord;

  readf("%jt", r);

where the format string has j for JSON format and t means "read or write anything" (vs e.g. a number or string).

If I had to read the array's size separately I would have several problems in expressing this:
 1) I now have to write a readThis for MyRecord when I didn't before (the readThis would resize the array). I still have to resize the array, I just have to do it in user code
 2) Since when doing the I/O, I don't actually know the size of the array, I have to either:
    a) write my own array element reading code that resizes the array as it goes, or
    b) read the array twice, once to count the number of elements and again to actually read them.

I think all this comes at a serious productivity cost (and a cost that I don't think is worthwhile if all we are hoping for is consistency between array types - which has a nebulous benefit on productivity especially if we can provide a reasonable error message when the implementation behaves differently than someone expects).

>
>> 3) Any suggestions for a Chapel array literal format for
>>    multi-dimensional arrays? How would you write such arrays in JSON (and
>>    would anyone want to)? At one point there was a proposal to put the
>>    domain in array literals, like this:
>
>>      var A = [ over {1..10} ];
>>
>>    but that doesn't really answer how to write multidimensional array
>>    literals. One approach would be to store the array elements in a flat
>>    way and just reshape them while reading; e.g.
>>
>>      var A = [ over {1..2, 1..3}
>>                11, 12, 13,
>>                21, 22, 23 ];
>>
>>    where the spacing would not be significant.
>
>Do you mean reshape (resize?) then after reading?

I don't see the difference, but whether it's during or after reading is an implementation matter that I don't think we need to decide upon now.

>
>> If we had a reasonable format, we could extend support like (1) to
>> any-dimensional arrays that do not share domains, even for some textual
>> formats.
>
>Not sure what you mean here.

We'd be able to read(myArray) even if myArray is multidimensional, if we're using the Chapel array literal format that stores the domain as the first thing. (but only if the array's domain is not shared, since we don't want to resize the array if it means resizing other arrays also).

Cheers,

-michael
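The record-reading scenario above (one JSON object per line, each holding an array of a different length) can be sketched as follows. This is Python used as a language-neutral illustration, not the Chapel `readf("%jt", r)` mechanism itself. The `MyRecord` class mirrors the record in the message, while `read_record` and the one-object-per-line framing are assumptions made for the example.

```python
import io
import json
from dataclasses import dataclass, field


@dataclass
class MyRecord:
    # Starts empty, like `var myArray: [1..0] int;` in the message.
    my_array: list = field(default_factory=list)


def read_record(stream, record):
    """Read one JSON object from the next line into `record`.

    Returns False at end of input (or on a blank line), True otherwise.
    The array is replaced wholesale, so its length follows the data read.
    """
    line = stream.readline()
    if not line.strip():
        return False
    data = json.loads(line)
    record.my_array = [int(x) for x in data["myArray"]]
    return True


source = io.StringIO('{"myArray":[1,2,3]}\n{"myArray":[4,5]}\n')
r = MyRecord()
lengths = []
while read_record(source, r):
    lengths.append(len(r.my_array))
```

The point of the sketch is that the caller never states a length: the same record variable is reused, and its array takes whatever size each input object supplies. That is the behavior the message asks for from a whole-record read.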
From: Rafael L. J. <rla...@um...> - 2015-08-03 16:31:34
Hi everyone,

I hope not to make too much noise in this thread, but several years ago I implemented this for Chapel, in a way that in those days made sense to me, although it had big problems for real use.

The idea was to define the file class and implement in Block, BlockDom and BlockArr the write and read functions, so each one could write/read its contents to a file. Writing was easy; the problem was defining Block and BlockDom for reading, as there is no way to create them and give them concrete values later.

For writing, the interface was this:

  outfile = new file(filename, FileAccessMode.readwrite);
  outfile.open();
  // write out the array itself
  outfile.write(Dist);
  outfile.write(Dom);
  outfile.write(A);
  // close the file
  outfile.close();

And for reading, for example:

  var Dist2 = new dmap(new Block(rank=2, boundingBox=[1..5, 1..5]));
  var Dom2: domain(2) dmapped Distrib = [1..5,1..5];
  var outfile = new file(filename, FileAccessMode.readwrite);
  outfile.open();
  outfile.read(Dist2);
  outfile.read(Dom2);
  outfile.read(A);
  outfile.close();

As I said, the problem was that the distribution and domain need to be predefined with the correct rank value, and later, inside read(Dist2) and read(Dom2), their contents were changed to the values read.

I know that this is not usable, but just to give more ideas. Also this could work in parallel from several locales, but that is another problem.

Greetings,

Rafael

--
Rafael Larrosa Jiménez
Centro de Supercomputación y Bioinformática - http://www.scbi.uma.es
Universidad de Málaga              EMAIL: rla...@um...
Edificio de Bioinnovación          TELEF: + 34951952788
C/ Severo Ochoa 34                 FAX : +34951952792
Parque Tecnológico de Andalucía
29590 Málaga (SPAIN)
From: Damian M. <da...@es...> - 2015-08-03 23:44:02
Michael, I am replying to your email now to keep things ticking over. Please realise that some answers are incomplete. Also, I think you are talking from the perspective of having more Chapel documentation than I have I looked at the 'heap' example recently posted by Mark Clemens and then commented up by Brad. I realized that this syntax var D : domain(1) = { 0 .. -1 }; in a domain exists now. Where is this documented please? So is domain resizing already handled in Chapel formally? I looked for resize in the Chapel 0.97 document and it is not really addressed. On Mon, 3 Aug 2015, Michael Ferguson wrote: > What would be most useful about this bit of history is if we knew > *why* flexible array bounds were regarded as a no-no by many > people. I was simply reminding people to look. I have never used the concept in Algol68 although I use it ever so occassionally in C/C++. > A lot has changed in languages and their implementation since then, and > it would be easy to dismiss as a problem that just needed a slightly > better solution. While I could be somebody who might be in the dismissive camp, there was something about it being back-the-front. That sounds fundamental but it beyond my expertise. >>> mychannel.read(A); >>> >>> could read into A any number of elements and adjust its domain >>> accordingly. Adjusting the domain in a) size has been in use in C/C++ so I think it is a quite sound concept as long as it does not conflict with memory allocation algorithms b) dimensions might have issues with program proofs. I have never used the idea of change in dimensions so I am of little use. > I think it's worth the improvement in productivity, and I don't think we > can convince people who have used Python and the like that they can't > have e.g. > > myArray.push_back(1); > > because it resize the array. I would prefer that Chapel be more aligned with something like C/C++ rather than Python. Just a personal preferance. I love Python and will continue to use it. 
Chapel I see as more of a C++ replacement. >> >>> 2) Should any-dimensional rectangular arrays be written in binary in a >>> form that encodes the size of each dimension? (In other words, write >>> the domain first). Such a feature would make something like (1) >>> possible for multi-dimensional arrays but might not match what people >>> expect for binary array formats. (I don't think we've documented >>> what you actually get when writing an array in binary yet...) >> >> Remind me what the argument is against demanding that prior to reading >> the >> array contents, the domain be read and then used to allocate the array? > > Three reasons: > 1) Similarity with Chapel's array literal syntax But the array literal syntax is known at compile time. What you are proposing is done at run-time. > 2) Some file formats do not include the length (e.g. JSON) Yes. Cannot this be handled by a two pass exercise as you note later. > 3) You can't do a whole-array read from JSON or a Chapel array > literal-like format without this functionality. So, it makes > the I/O code more complicated. I would prefer complexity in the I/O rather than in the language. > For example, if I have > > record MyRecord { > var myArray: [1..0] int; // (start with empty array) > } > > I might want to read something in this JSON format: > {"myArray":[1,2,3]} > {"myArray":[4,5]} > > I'd like to be able to do that with repeated calls to readf > that indicate to use JSON format like this: > > var r:MyRecord; > > readf("%jt", r); > > where the format string has j for JSON format and t means > "read or write anything" (vs e.g. a number or string). Does 'read' really need to know about JSON? Cannot it be a module? I worry that all of this is going to make Chapel's I/O as complex as C++ streams? Do not really want to go there? 
> If I had to read the array's size separately I would have
> several problems in expressing this:
> 1) I now have to write a readThis for MyRecord when I didn't
> before (the readThis would resize the array). I still have
> to resize the array, I just have to do it in user code

Or in the module? Or have I missed something?

> 2) Since when doing the I/O, I don't actually know the size
> of the array, I have to either:
> a) write my own array element reading code that resizes the
> array as it goes, or

Too much overhead.

> b) read the array twice, once to count the number of elements
> and again to actually read them.

That's what I was implying. But quite low down, not really in user code.

> I think all this comes at a serious productivity cost (and a cost
> that I don't think is worthwhile if all we are hoping for is
> consistency between array types - which has a nebulous benefit
> on productivity especially if we can provide a reasonable error
> message when the implementation behaves differently than someone
> expects).

I need to think about that more. I still prefer complexity in the I/O rather than in the language.

>>> 3) Any suggestions for a Chapel array literal format for
>>> multi-dimensional arrays? How would you write such arrays in JSON (and
>>> would anyone want to)? At one point there was a proposal to put the
>>> domain in array literals, like this:
>>>
>>> var A = [ over {1..10} ];
>>>
>>> but that doesn't really answer how to write multidimensional array
>>> literals. One approach would be to store the array elements in a flat
>>> way and just reshape them while reading; e.g.
>>>
>>> var A = [ over {1..2, 1..3}
>>>           11, 12, 13,
>>>           21, 22, 23 ];
>>>
>>> where the spacing would not be significant.

I like that. However, I would like to be careful about the keyword 'over' because I have an idea which wants to use 'over' for something else. Will chat about that separately.
Regards - Damian Pacific Engineering Systems International, 277-279 Broadway, Glebe NSW 2037 Ph:+61-2-8571-0847 .. Fx:+61-2-9692-9623 | unsolicited email not wanted here Views & opinions here are mine and not those of any past or present employer |
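The "store the elements flat and reshape them while reading" idea quoted above can be approximated today without any new literal syntax. The following is only a sketch: it assumes Chapel's standard `reshape` function, and the names `flat` and `D2` are made up for illustration. The `[ over {...} ... ]` form is a proposal from this thread and is not used here.

```chapel
// Flat list of elements, in the order they might appear in a textual format.
var flat = [11, 12, 13, 21, 22, 23];

// Hypothetical target domain, playing the role of `over {1..2, 1..3}`.
const D2 = {1..2, 1..3};

// reshape() lays the flat elements out in row-major order over D2.
var A = reshape(flat, D2);

writeln(A[1, 3]);  // last element of the first row: 13
writeln(A[2, 1]);  // first element of the second row: 21
```

Because the element order is row-major, line breaks and spacing in the flat list carry no meaning, which matches the "spacing would not be significant" remark.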
From: Michael F. <mfe...@cr...> - 2015-08-04 12:35:16
|
Hi Damian -

I'll try to answer your questions below.

On 8/3/15, 7:43 PM, "Damian McGuckin" <da...@es...> wrote:

>
>Michael,
>
>I am replying to your email now to keep things ticking over. Please
>realise that some answers are incomplete. Also, I think you are talking
>from the perspective of having more Chapel documentation than I have.
>
>I looked at the 'heap' example recently posted by Mark Clemens and then
>commented up by Brad. I realized that this syntax
>
> var D : domain(1) = { 0 .. -1 };
>
>in a domain exists now. Where is this documented please? So is domain
>resizing already handled in Chapel formally? I looked for resize in the
>Chapel 0.97 document and it is not really addressed.

Section 19.8.1 says you can use assignment to resize a domain. For example,

    D = { 1..10 }

In addition, we recently implemented operations similar to C++ vectors for 1-D arrays with non-shared domains. Examples of that are here:

    {test/release/}examples/primers/arrayVectorOps.chpl

>
>On Mon, 3 Aug 2015, Michael Ferguson wrote:
>
>> What would be most useful about this bit of history is if we knew
>> *why* flexible array bounds were regarded as a no-no by many
>> people.
>
>I was simply reminding people to look. I have never used the concept in
>Algol68 although I use it ever so occasionally in C/C++.

I had a look at "A History of ALGOL 68" by Lindsey, which only reports that appending to a flexible array was commonly desired but not implemented in the standard library. It didn't say that flexible arrays were "considered harmful" or anything like that.

Flexible arrays in Algol are marked as such - so a Chapel analogue would be if we had something like

    var A:[flex 1..10] int;

which would mean that A could be resized.
But all Chapel arrays can already be resized - it's just that for a long time we had the rule that you had to separately declare the domain in order to do the resizing, like this:

    var D:domain(1) = {1..10};
    var A:[D] int;

    D = {1..100}; // resizes A

That changed with the push_back idea (see the arrayVectorOps example above). Now the array can be resized through an operation on the array if the domain is not shared with other arrays (otherwise, such operations generate an error). E.g.:

    var A:[1..2] int;
    A.push_back(1);

(results in [0,0,1]).

>
>>>> mychannel.read(A);
>>>>
>>>> could read into A any number of elements and adjust its domain
>>>> accordingly.
>
>Adjusting the domain in
>
>a) size has been in use in C/C++ so I think it is a quite sound concept
> as long as it does not conflict with memory allocation algorithms
>
>b) dimensions might have issues with program proofs. I have never used the
> idea of change in dimensions so I am of little use.

I'm only talking about (a). I don't think that changing the dimension in these cases makes any sense. Changing the dimension would change the type of the domain... but changing the size only changes the contents of the domain.

>
>> I think it's worth the improvement in productivity, and I don't think we
>> can convince people who have used Python and the like that they can't
>> have e.g.
>>
>> myArray.push_back(1);
>>
>> because it resizes the array.
>
>I would prefer that Chapel be more aligned with something like C/C++
>rather than Python. Just a personal preference. I love Python and will
>continue to use it. Chapel I see as more of a C++ replacement.

The example, push_back, came from C++. We try to be aware of what languages Chapel users will already be familiar with.

>
>> 3) You can't do a whole-array read from JSON or a Chapel array
>> literal-like format without this functionality. So, it makes
>> the I/O code more complicated.
>
>I would prefer complexity in the I/O rather than in the language.
We're really talking about complexity in the standard library vs complexity in common user code. Cheers, -michael |
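For readers following along, here is a minimal sketch of the domain-assignment resizing described in this message, extended to show two arrays sharing one domain. The names `D`, `A` and `B` are illustrative only, and the sketch assumes nothing beyond the behaviour stated above (assigning to a domain resizes the arrays declared over it).

```chapel
var D: domain(1) = {1..3};
var A: [D] int;
var B: [D] real;   // B shares the domain D with A

A = 7;             // fill the three existing elements of A

D = {1..5};        // assigning to D resizes both A and B

writeln(A);        // values at indices 1..3 are kept; 4..5 are default-initialized
writeln(B);        // B now has five elements as well
```

This sharing is why an array-level operation such as `push_back` (or a resizing read) is only allowed when the array is the sole owner of its domain: growing `A` through `A` alone would otherwise silently change `B` too.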
From: Damian M. <da...@es...> - 2015-08-04 13:19:02
|
On Tue, 4 Aug 2015, Michael Ferguson wrote:

> Section 19.8.1 says you can use assignment to resize a domain.
> For example,
> D = { 1..10 }

Saw stuff like this long ago. My eyes finally hit this:

    If the domain variable being assigned was used to declare arrays, these arrays are reallocated as discussed in 20.11.

I had never noticed the word 'reallocated'. And the important explanatory information is actually in 20.11.

> In addition, we recently implemented operations similar to C++
> vectors for 1-D arrays with non-shared domains. Examples of
> that are here:
>
> {test/release/}examples/primers/arrayVectorOps.chpl

I will read.

>> On Mon, 3 Aug 2015, Michael Ferguson wrote:
>>
>>> What would be most useful about this bit of history is if we knew
>>> *why* flexible array bounds were regarded as a no-no by many
>>> people.
>>
>> I was simply reminding people to look. I have never used the concept in
>> Algol68 although I use it ever so occasionally in C/C++.
>
> I had a look at "A History of ALGOL 68" by Lindsey, which only reports
> that appending to a flexible array was commonly desired but not
> implemented in the standard library. It didn't say that flexible
> arrays were "considered harmful" or anything like that.

Sorry, I gave the wrong impression. There were big overheads in some implementations, although I only ever tried three of them. Also, I think some people mentioned that the way it was done was driven by the needs of string handling, rather than a more optimal method. As I said, long time ago. And then I discovered C++ in the 1980s and my brain thought it could forget all other languages.

> Flexible arrays in Algol are marked as such - so a Chapel analogue
> would be if we had something like
>
> var A:[flex 1..10] int;
>
> which would mean that A could be resized. But all Chapel arrays
> can already be resized

That is NOT obvious to somebody only using some of the language features. And I have been looking at Chapel for a few years now.
Then again, I do not program by reallocating so maybe I was never trying to do resizing. I tend to compute storage needs outside of a set of routines or a whole program and then make an allocation immediately on jumping into a new routine. Old habits die hard. I assume that Chapel arrays come off the heap in a single threaded program.

> - it's just that for a long time we had the rule that you had to
> separately declare the domain in order to do the resizing, like this:
>
> var D:domain(1) = {1..10};
> var A:[D] int;
>
> D = {1..100}; // resizes A

I understand this now from reading 20.11.

I will revisit your original ideas about reading.

Regards - Damian

Pacific Engineering Systems International, 277-279 Broadway, Glebe NSW 2037
Ph:+61-2-8571-0847 .. Fx:+61-2-9692-9623 | unsolicited email not wanted here
Views & opinions here are mine and not those of any past or present employer |
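To make option (a) from earlier in the thread concrete (user code that resizes the array as it reads), here is a rough sketch built only on domain assignment. The iterator `incoming` is a hypothetical stand-in for values arriving from a channel whose length is not known in advance; it is not part of any Chapel I/O API.

```chapel
// Hypothetical stand-in for elements read one at a time from a channel.
iter incoming() {
  yield 4;
  yield 5;
  yield 6;
}

var D = {1..0};          // empty 1-D domain, as in `var A:[1..0] int`
var A: [D] int;

for x in incoming() {
  D = {1..D.size + 1};   // grow by one; existing elements are preserved
  A[D.high] = x;         // store the new value in the new last slot
}

writeln(A);              // the three values, in arrival order
```

Growing by one element per step reallocates on every iteration, which is presumably the overhead objected to above; a library-level implementation could grow in larger steps or use the two-pass approach from option (b) instead.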