From: Brad C. <bra...@hp...> - 2020-12-17 01:15:42

Hello Chapel Community —

This is a final reminder for 2020 that we're in the process of retiring these SourceForge-based Chapel mailing lists in favor of our new Discourse site: https://chapel.discourse.group/

If you are interested in keeping in touch with the Chapel community, please be sure to register there, as these mailing lists will be going away very soon. Since my last message:

* We've made the Discourse site publicly readable, such that you don't need to register to browse its contents (though you still do to post).
* We've posted some instructions on how to use Discourse like a mailing list, which can be used to recreate the experience of these mailing lists for those who aren't attracted to web-based forums: https://chapel.discourse.group/t/welcome-to-the-chapel-programming-language-discourse-page/8
* We've added the ability to sign into the site using your GitHub credentials to avoid having to create a new account from scratch.

Best wishes for the end of 2020 and the start of the new year,
-Brad
From: Brad C. <bra...@hp...> - 2020-10-16 02:07:00

Hi Chapel Community —

This is a second reminder that if you feel like it's been a bit too quiet here recently, remember that we're in the process of replacing the Chapel mailing lists with our new Discourse site: https://chapel.discourse.group/

In particular, steer yourself towards the "Announcements" category (https://chapel.discourse.group/c/announcements) to catch up on recent news like:

* Highlights of today's Chapel 1.23.0 release
* Chapel being named a 2020 Bossie Award winner
* How to watch my keynote on "Compiling Chapel" from PACT'20 last week

We hope to see you there!
-Brad
From: Albrecht, B. <ben...@hp...> - 2020-09-28 13:34:26

Hi Suyash,

I suggest starting on the contributing page: https://chapel-lang.org/contributing.html

It should walk you through the entire process, from getting started with Chapel to identifying contributions you can make to the project.

Thanks,
Ben

From: Suyash Patil <suy...@gm...>
Date: Saturday, September 26, 2020 at 12:35 PM
To: "cha...@li..." <cha...@li...>
Subject: [Chapel-developers] Regarding Contributions

Hi, I am Suyash Patil, a second year undergraduate student, currently pursuing Engineering. I am very eager to start my open source journey with Chapel. I know C, C++ and Git. I want to know what projects are available and what is required to contribute to them.

Regards,
Suyash Patil
From: Suyash P. <suy...@gm...> - 2020-09-26 16:34:33

Hi, I am Suyash Patil, a second year undergraduate student, currently pursuing Engineering. I am very eager to start my open source journey with Chapel. I know C, C++ and Git. I want to know what projects are available and what is required to contribute to them.

Regards,
Suyash Patil
From: Brad C. <bra...@hp...> - 2020-09-24 21:55:38

Dear Chapel mailing list subscribers —

After years of talking about it but failing to act, we've finally started the process of retiring Chapel's SourceForge-hosted mailing lists (the ones where you're receiving this message) in favor of a more modern, ad-free way of supporting discussions within the community via email or the web. Specifically, we've launched a Chapel Discourse site.

If you're not familiar with Discourse, it's a web-based technology for discussions that can be used both from a browser or in more of a mailing list mode. Discourse supports 'topics' sorted into 'categories', where you can think of:

* topic = an email thread or a discussion thread on a web forum
* category = like a mailing list or a tag/folder on a web forum

We invite and encourage everyone subscribed here to register and to join us for further discussion about Chapel at: https://chapel.discourse.group/

Once you've registered, I suggest:

* Taking a look at the 'Categories' tab, which is a good way to get an overview of the site, particularly if you're coming from a mailing list mindset: https://chapel.discourse.group/categories Each top-level category should have a pinned "about this category" post that's intended to describe what it's for and how you can post to it via email, once registered.
* Deciding which categories you want to follow or mute.
* Taking a moment to introduce yourself to the community: https://chapel.discourse.group/t/introduce-yourself/
* Sending us your questions and feedback in the "site feedback" category: https://chapel.discourse.group/c/site-feedback

At some point this fall, we will be disabling the SourceForge mailing lists, but for a time, we'll keep both forums going while people work on converting over.

Looking forward to further Discourse with you,
-Brad (on behalf of the Chapel team at HPE)
From: Rohit S. <roh...@gm...> - 2020-09-10 02:06:16

Hi Engin,

Thanks for replying so quickly! I will visit the Gitter channel.

Thanks,
Rohit.

On Wed, Sep 9, 2020 at 9:37 AM Kayraklioglu, Engin <en...@hp...> wrote:
> Hi Rohit,
>
> Thanks for your interest. Note that some of the items in the project idea list have been taken up by this year's GSoC students:
>
> https://summerofcode.withgoogle.com/organizations/4605282207924224/#projects
>
> For contributing to Chapel, start by reading https://chapel-lang.org/contributing.html
>
> The best way for community interaction is through our Gitter channel: https://gitter.im/chapel-lang/chapel
>
> Engin
>
> On 9/9/20, 9:32 AM, "Rohit Shinde" <roh...@gm...> wrote:
>
> > Hello everyone,
> >
> > I have been learning Chapel out of interest for a while now. I was interested in learning a parallel language. While looking for things to play around with in Chapel, I came across the GSoC page where ideas for projects were listed. I was interested in a couple of them, and I was wondering if it would be a good idea to pick one of them up. The ones I am interested in are:
> >
> > 1. Making an iterator library for Chapel. I am quite familiar with Python's iterators and I think I could build something equivalent for Chapel.
> > 2. Developing modules for Chapel's standard library.
> > 3. String performance improvements.
> > 4. Web libraries.
> > 5. Implementing a parser. I have some experience in this area. Not a whole lot. But it would be fun to hack on it until I get something working.
> >
> > Please let me know what you think. I love Chapel a lot and I would like to contribute to the community.
> >
> > Thanks,
> > Rohit.
From: Kayraklioglu, E. <en...@hp...> - 2020-09-09 16:38:01

Hi Rohit,

Thanks for your interest. Note that some of the items in the project idea list have been taken up by this year's GSoC students:

https://summerofcode.withgoogle.com/organizations/4605282207924224/#projects

For contributing to Chapel, start by reading https://chapel-lang.org/contributing.html

The best way for community interaction is through our Gitter channel: https://gitter.im/chapel-lang/chapel

Engin

On 9/9/20, 9:32 AM, "Rohit Shinde" <roh...@gm...> wrote:

> Hello everyone,
>
> I have been learning Chapel out of interest for a while now. I was interested in learning a parallel language. While looking for things to play around with in Chapel, I came across the GSoC page where ideas for projects were listed. I was interested in a couple of them, and I was wondering if it would be a good idea to pick one of them up. The ones I am interested in are:
>
> 1. Making an iterator library for Chapel. I am quite familiar with Python's iterators and I think I could build something equivalent for Chapel.
> 2. Developing modules for Chapel's standard library.
> 3. String performance improvements.
> 4. Web libraries.
> 5. Implementing a parser. I have some experience in this area. Not a whole lot. But it would be fun to hack on it until I get something working.
>
> Please let me know what you think. I love Chapel a lot and I would like to contribute to the community.
>
> Thanks,
> Rohit.
From: Rohit S. <roh...@gm...> - 2020-09-09 16:31:58

Hello everyone,

I have been learning Chapel out of interest for a while now. I was interested in learning a parallel language. While looking for things to play around with in Chapel, I came across the GSoC page where ideas for projects were listed. I was interested in a couple of them, and I was wondering if it would be a good idea to pick one of them up. The ones I am interested in are:

1. Making an iterator library for Chapel. I am quite familiar with Python's iterators and I think I could build something equivalent for Chapel.
2. Developing modules for Chapel's standard library.
3. String performance improvements.
4. Web libraries.
5. Implementing a parser. I have some experience in this area. Not a whole lot. But it would be fun to hack on it until I get something working.

Please let me know what you think. I love Chapel a lot and I would like to contribute to the community.

Thanks,
Rohit.
From: Brad C. <bra...@hp...> - 2020-09-08 22:20:42

Hi Damian —

Sorry for the belated response. I believe that what you've written here should be fine performance-wise; specifically, that no temporary will be introduced to capture the RHS '[(r, c) in slab] ...' expression.

-Brad

On Tue, 1 Sep 2020, Damian McGuckin wrote:

> On another point in the same code, I try and grab several adjacent rows from the original matrix 'v' and transpose them, and put them into what I call a slab. It is not a tile like you see in the Chapel DGEMM.
>
>     var vslab : [cslice, common] R;
>     // either
>     [(r, c) in vslab.domain] vslab[r, c] = v[c, r];
>     // or
>     [j in cslice] vslab[j, common] = v[j, common];
>
> where R is a general real(?w).
>
> Technically vslab is 'const', so I stabbed in the dark and tried
>
>     const slab : domain(2) = (cslice, common);
>     const vslab = [(r, c) in slab] v[c, r];
>
> It seems to run in the same elapsed time, is genuinely 'const', and looks cleaner.
>
> Does it create any unnecessary data, i.e. does it create a temporary on the right before assigning to vslab, or does it only allocate the cslice*common real(?w) numbers?
>
> Thanks - Damian
>
> Pacific Engineering Systems International, 277-279 Broadway, Glebe NSW 2037
> Ph:+61-2-8571-0847 .. Fx:+61-2-9692-9623 | unsolicited email not wanted here
> Views & opinions here are mine and not those of any past or present employer
From: Ferguson, M. P. P. (C. Developer) <mic...@hp...> - 2020-09-08 13:23:01

Hi Damian -

A tiny bit of follow-up is inline below:

> That expands my understanding greatly. Your excellent explanation belongs in the documentation somewhere.

If you can think of a good place to add it, feel free to make a PR pasting it into the documentation somewhere that makes sense to you, and we can, in the PR review process, get it to something that renders reasonably well and makes sense to me as well.

> > In your example, a task calling `putStage` could, on a non-x86 system, end up storing the value of `stage[i]` in a per-core cache somewhere (let's say, L1 cache, or a write reorder buffer), and put off committing it to memory indefinitely. As a result, the parallel program would have a load imbalance problem since the current value of this variable isn't being communicated to the other tasks.
>
> Understood. If I update a column of a matrix in one core in some parallel task T, I would still hope that if subsequently, i.e. in a serial sense and after the task T has finished, I try and read that same column of the matrix in another core in another parallel task T', I see only the updated column.
>
> As I said, I assume that a program spawning 2 tasks, say T1 and T2, updating columns 1 and 2 respectively of a matrix, can be assured on completion of both T1 and T2 that any subsequent tasks will see only the values written by T1 and T2 into the matrix.

Yes. The MCM chapter of the spec says this:

> Chapel's fork-join constructs introduce additional order dependencies. Operations within a task cannot behave as though they started before the task started. Similarly, all operations in a task must appear to be completed to a parent task when the parent task joins with that task.

https://chapel-lang.org/docs/language/spec/memory-consistency-model.html

I view your questions above as a rephrasing of / corollary to the same idea.

Best,
-michael
From: Damian M. <da...@es...> - 2020-09-06 06:42:29

Michael,

On Fri, 4 Sep 2020, Ferguson, Michael Paul Pratt (Chapel Developer) wrote:

> Here I am interpreting "locking" as a software lock that protects a critical section and causes other tasks trying to enter the critical section to wait. (Like `pthread_mutex_lock`.)

That was my understanding, although I thought it was simpler than a pthread_mutex_lock.

> Atomics don't use locks on any normal configurations of Chapel. Atomics are generally pretty fast and are directly supported by the processor (or possibly the network). They use special CPU instructions to ensure atomicity. One potential source of confusion is that on x86 these might involve the `lock` prefix; but AFAIK that instruction prefix has this name for historical reasons and might more reasonably be named something else like `atomic`.

That expands my understanding greatly. Your excellent explanation belongs in the documentation somewhere.

> You can use a relaxed atomic to get pretty much the same effect as a ``volatile`` in your program and I would expect it to have similarly low overhead.

Ditto. But I need to learn more about all the various types of atomic. And clear documentation on that seems very thin on the ground. Well, it was the last time I looked for C++ content.

> If you do some research on volatile, you might learn that it's not able to control the way the CPU itself optimizes loads/stores (rather, it only prevents the compiler from doing so). While that might be OK on x86, it would not function on other platforms like ARM, for example. (x86 uses TSO - Total Store Order - which is a stronger constraint than other platforms.)

OK. Stopping the compiler doing naughty things is a good start, as I have found when trying to interrogate the Floating Point Control/Status Register.

> In your example, a task calling `putStage` could, on a non-x86 system, end up storing the value of `stage[i]` in a per-core cache somewhere (let's say, L1 cache, or a write reorder buffer), and put off committing it to memory indefinitely. As a result, the parallel program would have a load imbalance problem since the current value of this variable isn't being communicated to the other tasks.

Understood. If I update a column of a matrix in one core in some parallel task T, I would still hope that if subsequently, i.e. in a serial sense and after the task T has finished, I try and read that same column of the matrix in another core in another parallel task T', I see only the updated column.

> Or, worse than that, the write in `putStage` might store something in memory that is neither the previous value nor the new value but some mix of the two. (For example, maybe only the low byte is set to the new value on a platform only able to write 1 byte at a time to memory.) It is hard to see how the algorithm could function correctly in this situation.

Agreed. See my earlier comment.

> These cases are part of the reason that processors support atomics - they allow different tasks/threads/cores to communicate in a reasonable manner.

As I said, I assume that a program spawning 2 tasks, say T1 and T2, updating columns 1 and 2 respectively of a matrix, can be assured on completion of both T1 and T2 that any subsequent tasks will see only the values written by T1 and T2 into the matrix.

> But you don't have to just take it from me

I believe you. Your knowledge is good enough for me.

> The atomic section of the spec is here:
>
> https://chapel-lang.org/docs/master/language/spec/task-parallelism-and-synchronization.html#atomic-variables

I have read that numerous times and found it less than useful.

> However the information you probably need is here:

You are very correct.

> https://chapel-lang.org/docs/language/spec/memory-consistency-model.html#relaxed-atomic-operations

I had seen/read this - many times. Every time I try to read it, my head hurts!

> However the specification does not currently describe acquire, release, or acqRel orderings. I will add an open issue note so that it is clearer that this is missing from the spec and not just something one isn't finding.

Great.

> I will update the link to refer to the right section (which BTW we could not do before with the PDF spec).

Thanks.

> PR #16341 will make the documentation improvements I mentioned here.

Perfect.

Thanks for the insight - Damian

P.S. If you want to know how useful your advice is, read on. For my problem, I will now use atomics. I believe relaxed ordering will do the job, based on my (very limited) understanding of the details of memory consistency. This is for a Jacobi SVD.

My algorithm, now using atomics, with a matrix of order 36, i.e. a 36x36 array of real(w)'s, on my 6-core machine, would have 35 tasks created by a forall over 1..N-1. On average, almost 6 tasks going concurrently all the time. Each such task would, on average, process N/2 columns of the matrix one after the other.

You can alternatively reorganize the algorithm into M (sequential) steps. Each one of those M steps, say the I'th, would involve multiple Jacobi rotations on a full column pair within the matrix. Each such column pair operation can be run in parallel, as each I'th step has only operations which are independent of each other. No need for atomics. There are 70 such groups for the 36x36 matrix, requiring a total of 630 tasks, where there are on average between 1 and 18 tasks per group. Obviously you can merge some of the column pairs to reduce the parallelization load. But the code to do this seriously obscures the logic of the underlying algorithm and hence the program's readability, and makes the code parallel-centric, which goes against all of our programming KPIs.

There are several other reordering algorithms, but I do not understand them well enough to program them in Chapel, nor can I find a remotely readable parallel implementation of them in any reference in any language. Apart from the readability concerns, it is the nominal 600+ tasks for the algorithm which avoids atomics, as opposed to 36 tasks for one which needs atomics. I will run with the code which uses 'atomics'. There are also only 6 extra LOC (lines of code within the algorithm) over the serial version, and the algorithm details are unchanged. The discrepancy just gets worse for even larger matrices.

Cache misses do occur in both approaches because they need to process a column-major matrix by columns. Unavoidable, sadly.

For reasonably sized matrices, the Jacobi algorithm is arguably now the algorithm of preference for SVD. It is far more accurate than a 1970s Golub+Reinsch (Householder reduction-based) SVD such as that which I have provided as a reference point for the GSoC project. On the down side, it has more floating point computations than the Golub+Reinsch approach. But on the plus side, from a parallelization perspective, the Jacobi algorithm (with atomics as I describe) will, I think, have a much reduced parallelism overhead compared to my Golub+Reinsch code.
From: Ferguson, M. P. P. (C. Developer) <mic...@hp...> - 2020-09-04 13:21:45

Hi Damian -

(Just responding to the 2nd part of your mail.) See reply inline below.

> So, to avoid treading on the heels of its predecessor, Task#I+1 is continually reading/polling stage[I], which it must read from memory to ensure it knows what Task#I is doing. Again no locking is required. This is your garden variety C volatile variable. How do I handle it in Chapel? An atomic variable seems like massive overkill, as I assume locking is involved with an atomic. I can access stage[??] through a pair of custom (tiny) external C routines
>
>     void putStage(long *stage, long i, long k) { stage[i] = k; }
>
> and
>
>     long getStage(long *stage, long i) { return stage[i]; }
>
> But that seems like I am sticking my head in the sand and avoiding the underlying problem. And it probably cripples any optimization being done in the code which is calling these routines. I still contend that Chapel needs a 'vol[atile]' declaration concept or something like it. The identifier so declared is much like a 'var[iable]', but it must always be accessed through, i.e. written-to/read-from, memory. But I do not know enough of Chapel's big picture, so I could be talking through my hat! I wish I had known enough of Chapel to be a productive participant in the conversation in 2012 when volatile got removed from Chapel 1.15.0. Then again, maybe I am talking about peek/poke for atomics, which appears to have lesser overhead than other types of memory ordering. But again, I have no idea what the overhead for these really is, but it all seems way too high for what I want.

> An atomic variable seems like massive overkill, as I assume locking is involved with an atomic.

Here I am interpreting "locking" as a software lock that protects a critical section and causes other tasks trying to enter the critical section to wait. (Like `pthread_mutex_lock`.)

Atomics don't use locks on any normal configurations of Chapel. Atomics are generally pretty fast and are directly supported by the processor (or possibly the network). They use special CPU instructions to ensure atomicity. One potential source of confusion is that on x86 these might involve the `lock` prefix; but AFAIK that instruction prefix has this name for historical reasons and might more reasonably be named something else like `atomic`.

You can use a relaxed atomic to get pretty much the same effect as a ``volatile`` in your program, and I would expect it to have similarly low overhead.

If you do some research on volatile, you might learn that it's not able to control the way the CPU itself optimizes loads/stores (rather, it only prevents the compiler from doing so). While that might be OK on x86, it would not function on other platforms like ARM, for example. (x86 uses TSO - Total Store Order - which is a stronger constraint than other platforms.)

In your example, a task calling `putStage` could, on a non-x86 system, end up storing the value of `stage[i]` in a per-core cache somewhere (let's say, L1 cache, or a write reorder buffer), and put off committing it to memory indefinitely. As a result, the parallel program would have a load imbalance problem, since the current value of this variable isn't being communicated to the other tasks.

Or, worse than that, the write in `putStage` might store something in memory that is neither the previous value nor the new value but some mix of the two. (For example, maybe only the low byte is set to the new value on a platform only able to write 1 byte at a time to memory.) It is hard to see how the algorithm could function correctly in this situation.

These cases are part of the reason that processors support atomics - they allow different tasks/threads/cores to communicate in a reasonable manner.

But you don't have to just take it from me - the Linux kernel developers also frown upon using volatile; see https://www.kernel.org/doc/html/latest/process/volatile-considered-harmful.html

> By the way, the online Chapel documentation on atomics does not appear to explain (or link to an explanation of) memory ordering types. In an older PDF document, it refers to C11, which then refers to the C++ definition and other documentation.

The atomic section of the spec is here:

https://chapel-lang.org/docs/master/language/spec/task-parallelism-and-synchronization.html#atomic-variables

However, the information you probably need is here:

https://chapel-lang.org/docs/language/spec/memory-consistency-model.html#relaxed-atomic-operations

However, the specification does not currently describe acquire, release, or acqRel orderings. I will add an open issue note so that it is clearer that this is missing from the spec and not just something one isn't finding.

> If you click on the Chapel Language Specification you land on a page which has no reference whatsoever about atomics. If I actually had any solid grasp of the subject, I would offer to rewrite it, but I do not, which is why I am reading about it in the first place. I find that by the time I have gone through all the links to links, I have long forgotten what my precise original Chapel problem was. Just my 2c. Might be worth putting onto the to-do list.

I will update the link to refer to the right section (which BTW we could not do before with the PDF spec). PR #16341 will make the documentation improvements I mentioned here.

Best,
-michael
From: Damian M. <da...@es...> - 2020-09-04 05:52:33
|
Hi Ben, Michael, I am guessing that the stuff of most interest to yourself is in the latter half of this email but the rest is background. Hopefully I am not rambling too much. On Wed, 2 Sep 2020, Albrecht, Ben wrote: > For example, say task 0 processes column pairs (1..N, 2) serially. After > completing (2,2), task 0 create a new parallel task (task 1) to process > the pairs (2..N, 3). Not quite. Sorry for my poor explanation. Let's assume that Task #0 is the controlling process. It starts: * Task #1 fto processes column PAIRS (row#1, columns#1..N) serially After Task#1 has completed (1,1), it starts * Task #2 with qualifications: - Task#2 is given responsibility for (row#2, columns#1..N) but - initially you want Task#2 to only process (2,1), i.e. column#1 -- because Task#1 is by now busy processing column#2 and we -- do not want Task#2 and come along and mess with that work - Task#2 must check with Task#1 before proceeding to the next column, i.e. -- it can only process column (2,2) if the parent has finished with -- (1,2) it, and it can only process (2,3) if the parent has finished -- with (1,3) and so on until it gets to (2, N). After Task#2 has completed (2,1) it starts * Task #2 with qualifications: - Task#2 is given responsibility for (row#2, columns#1..N) but ... same stuff as above ... To give each spawned task (that is currently processing row I) the job of spawning Task#I+1 for row I+1 seems less than optimal. But maybe I am missing something. Also, Task#I+1 continually needs to check with Task#I whether it can step to the next column, i.e. to ensure that its creator (i.e. Task#I) has completed its own processing of the next column that Task I+1 now wants to process. So, I (maybe wrongly) let Task#0 handle all of the spawning, which it hopefully does in such a way as to be optimal for the underlying system (or locales of systems) using a 'forall'. Additionally, the overhead I saw in my last use of a 'cobegin' was so high that I now stay away from it. 
It was awful. I have yet to experiment seriously with 'coforall'. I very rarely see the need for a begin. > In the task-parallel representation, you are creating the tasks as > needed, rather than spinning them all up at once and having them wait > for work. Even if I start up the tasks as needed, each new task still needs to communicate with its creater to know if it is safe to proceed to the next column. So I cannot see what your approach buys but I am always happy to learn. I think that I did not clearly explain that each task needs to keep monitoring what its creator is doing. > However, this may require some effort to reach desired performance. For > example, a naive implementation of this representation would create N > tasks, rather than creating a number tasks appropriate for your hardware > like the forall approach does. I like the fact that forall creates tasks appropriate to the hardware. > See https://chapel-lang.org/docs/master/primers/taskParallel.html and > https://chapel-lang.org/docs/master/users-guide/index.html#task-parallelism > for more background on task parallelism in Chapel Thanks for that although I found that this did not explain much. But that is probably because I am coming from such a beginners-level base. Also, somebody needs to add something about the overhead in each approach (which is way beyond my knowledge currently). This next two paragraphs needs some insight/input from others as they are related to some past discussions. In my algorithm, Task#I must keep track of where it is up, so let's define an array of int's, stage[0..N] : int where stage[I] reflects the current column that task#I has completed. Each element of that array, i.e. stage[I] is updated by only Task#I so there is no need for the overhead that an atomic variable would involve. Task#I+1 must read stage[I] as it steps through the columns to make sure it does not try to update column 'K' before Task#I has finished with it. 
Task#I+1 should run slightly behind Task#I because it starts later. In practice. a a thread started after another thread may not always run behind a thread started before it. So, to avoid treading on the heals of its predecessor, Task#I+1 is continually reading/polling stage[I] which it must read it from memory to ensure it knows what Task#I is doing. Again no locking is required. This is your garden variety C volatile variable. How do I handle it in Chapel? An atomic variable seems like massive overhill as I assume locking is involved with an atomic. I can access stage[??] through a pair of custom (tiny) external C routines void putStage(long *stage, long i, long k) { stage[i] = k; } and long getStage(long *stage, long i) { return stage[i]; } But that seems like I am sticking my head in the sand and avoiding the underlying problem. And it probably cripples any optimization being done in the code which is calling these routines. I still contend that Chapel needs a 'vol[atile]' declaration concept or something like it. The identifier so declared is much like a 'var[iable]' but it must always be accessed through, i.e. written-to/read-from, memory. But I do not know enough of Chapel's big picture so I could be talking through my hat! I wish I had known enough of Chapel to be a productive participant in the conversation in 2012 when volatile got removed from Chapel 1.15.0. Then again, maybe I am talking about peek/poke for atomics which appears to have lesser overhead than other types of memory ordering. But again, I have no idea what the overhead for these really is but it all seems way too high for what I want. By the way, the online Chapel documentation on atomics does not appear to explain (or link to an explanation of) memory ordering types. In an older PDF document, it refers to C11 which then refers to the C++ definition and other documentation. 
Even in the current primers

    https://chapel-lang.org/docs/primers/atomics.html?highlight=atomic

it says

    For more information on Chapel's atomics, see the Chapel Language Specification.

If you click on "Chapel Language Specification" you land on a page which has no reference whatsoever to atomics. If I actually had any solid grasp of the subject, I would offer to rewrite it, but I do not, which is why I am reading about it in the first place. I find that by the time I have gone through all the links to links, I have long forgotten what my precise original Chapel problem was.

Just my 2c. Might be worth putting onto the to-do list.

Thanks - Damian

Pacific Engineering Systems International, 277-279 Broadway, Glebe NSW 2037
Ph:+61-2-8571-0847 .. Fx:+61-2-9692-9623 | unsolicited email not wanted here
Views & opinions here are mine and not those of any past or present employer |
From: Albrecht, B. <ben...@hp...> - 2020-09-02 18:31:56
|
(resending to the mailing list)

On 9/2/20, 2:28 PM, "Albrecht, Ben" <ben...@hp...> wrote:

Hi Damian,

I don't have much to add on improving this algorithm implementation. At a high level, the load imbalance and data dependence across parallel iterations makes me wonder if a lower-level recursive task-parallel implementation would be more suitable for this algorithm.

For example, say task 0 processes column pairs (1..N, 2) serially. After completing (2,2), task 0 creates a new parallel task (task 1) to process the pairs (2..N, 3). After task 1 processes (3,3), it creates a new parallel task to process (3..N, 4), and so on.

In the task-parallel representation, you are creating the tasks as needed, rather than spinning them all up at once and having them wait for work.

However, this may require some effort to reach desired performance. For example, a naive implementation of this representation would create N tasks, rather than creating a number of tasks appropriate for your hardware like the forall approach does.

See https://chapel-lang.org/docs/master/primers/taskParallel.html and https://chapel-lang.org/docs/master/users-guide/index.html#task-parallelism for more background on task parallelism in Chapel.

Hope that helps.

Thanks,
Ben

On 8/29/20, 12:10 AM, "Damian McGuckin" <da...@es...> wrote:

Hi,

There exists a need to process the N columns of an array 'U' as

    for j in 1..N do
    {
        // treat column j as the anchor column
        for k in j+1..N do   // Stage 'j' processing
        {
            // mess with operations on column 'j' and 'k'
        }
    }

The inner loops can be treated as independent of each other subject to a constraint, i.e. there is a need to guarantee (or somehow enforce) that when stage 'j' wants to process column 'k', it knows (or can check) that processing of that same column 'k' by the prior stage(s), i.e. 'j-1', is completed (or can twiddle its thumbs waiting for that to happen before proceeding). That last statement is obviously recursive, although it does not need to be programmed as such.
One can program this (maybe poorly) as

    var stage : [0..N] atomic int;

    stage[0].write(N); // define that the 'ghost' precursor stage
                       // has completed processing of all columns
    forall j in 1..N-1 do
    {
        // define that this stage has completed NO columns
        stage[j].write(0);

        // this routine must update stage[j] with 'k' when
        // it has finished its own processing of column 'k'
        processStage(j, U, ......);
    }

and then update stage[j] within processStage(j, U, ......) to reflect the column 'last processed', say 'k', in that stage. This then allows the next stage, processStage(j+1, U, ......), to inspect that variable, i.e. stage[j], during its own operation to ensure that it does not attempt to use any column 'i' where 'i > k'.

This approach involves waiting, which is a big no-no, and demands that any distributed implementation update what amounts to a variable in the primary locale. This is less than ideal, although the overhead is yet to be quantified.

There is an upside. Because one would probably process columns in blocks, say of 4 or 8 (or at a pinch 16), the apparent need to test the atomic variable every column drops by that same blocking factor. So, while the waiting is not so critical (even if it is detrimental to the algorithm's readability), the atomic variables in the primary locale are still a worry.

Are there better ways to attack this problem?

And yes, if looking at parallel Jacobi SVD sweeps, there are algorithms that try to parallelize that logic very differently. But they did not have the benefit of Chapel at their disposal at the time they were being developed. And besides, they are quite complex and do even naughtier things to the readability of the algorithm. Avoiding them would really, really, be a desirable thing.

Thanks - Damian

_______________________________________________
Chapel-developers mailing list
Cha...@li...
https://lists.sourceforge.net/lists/listinfo/chapel-developers |
From: Ferguson, M. P. P. (C. Developer)
<mic...@hp...> - 2020-09-02 15:45:27
|
Hi Damian -

We've removed the PDF spec since we wanted to more easily link between spec sections and other documentation. The spec is here:

https://chapel-lang.org/docs/language/spec/index.html

Right now there is one webpage per chapter, but we could consider making an all-in-one-page view of the spec and/or a pre-rendered PDF version if that was useful. Printing one of the chapters to PDF creates something relatively manageable on my system.

-michael

> Where does this live these days please? I find perusing a PDF file more
> useful sometimes than messing around within a browser, especially when my
> mouse hand is busy holding a glass or a cup of refreshment.
>
> Regards - Damian
>
> Pacific Engineering Systems International, 277-279 Broadway, Glebe NSW 2037
> Ph:+61-2-8571-0847 .. Fx:+61-2-9692-9623 | unsolicited email not wanted here
> Views & opinions here are mine and not those of any past or present employer

_______________________________________________
Chapel-developers mailing list
Cha...@li...
https://lists.sourceforge.net/lists/listinfo/chapel-developers |
From: Damian M. <da...@es...> - 2020-09-02 03:53:27
|
Where does this live these days please? I find perusing a PDF file more useful sometimes than messing around within a browser, especially when my mouse hand is busy holding a glass or a cup of refreshment. Regards - Damian Pacific Engineering Systems International, 277-279 Broadway, Glebe NSW 2037 Ph:+61-2-8571-0847 .. Fx:+61-2-9692-9623 | unsolicited email not wanted here Views & opinions here are mine and not those of any past or present employer |
From: Damian M. <da...@es...> - 2020-09-01 08:24:52
|
On Mon, 31 Aug 2020, Brad Chamberlain wrote:

> What link did you use for the .zip? I just used the "Source code (zip)"
> link from:
>
> https://github.com/chapel-lang/chapel/releases/tag/1.22.1
>
> and it seemed to unpack fine for me.

The problem is not the 1.22.1 release. That does not have Rahul's stuff in it.

I went to the master repository and clicked 'Code' and it gave me chapel-master.zip. While it listed the table of contents cleanly, it could not extract things because it had problems with file names which were too long.

I had to update unzip from 6.0.1 on my system. I now have 6.0.5 and it can extract chapel-master.zip cleanly.

Regards - Damian

Pacific Engineering Systems International, 277-279 Broadway, Glebe NSW 2037
Ph:+61-2-8571-0847 .. Fx:+61-2-9692-9623 | unsolicited email not wanted here
Views & opinions here are mine and not those of any past or present employer |
From: Damian M. <da...@es...> - 2020-09-01 07:12:16
|
Brad,

As often happens, you did not solve my problem but gave me enough extra insight to allow me to solve my own problems.

On Mon, 31 Aug 2020, Brad Chamberlain wrote:

> Original:
>
>> forall r in rows do
>> {
>>   const ref ur = u[r, ..];
>>
>>   for j in cslice do
>>   {
>>     t[r, j] = vmDot(common, ur, vslab[j, ..]);
>>   }
>> }
>
> Forall expr:
>
>> forall r in rows do
>> {
>>   const ref ur = u[r, ..];
>>   const x = [j in cslice] vmDot(common, ur, vslab[j, ..]);
>>
>>   t[r, cslice] = x;
>> }
>
> A way to check would be to write the initialization of 'x' as:
>
>> const x = for j in cslice do vmDot(common, ur, vslab[j, ..]);
>
> If this returned the lost 5%, I think that's the answer.

Sadly no. And I should correct myself, it is 6%. Both your suggestion and my attempt labelled "Forall expr" above take about 8.5 seconds for my GEMM of 4000*4000 on my old 6-core E5-1660. On the other hand, the attempt labelled "Original", which has no intermediate copy, takes 8 seconds, or say 7.96, or 8.04, or ...

The problem turns out to be the temporary, which should have been obvious to me. Avoiding the temporary altogether with

    t[r, cslice] = for j in cslice do vmDot(common, ur, vslab[j, ..]);

(which I will call the Single Statement approach) recovers that 6%. Interestingly, the Original approach

    for j in cslice do
    {
        t[r, j] = vmDot(common, ur, vslab[j, ..]);
    }

takes the same time as the Single Statement approach, so I will stick with it.

Thanks for the insight - Damian

Pacific Engineering Systems International, 277-279 Broadway, Glebe NSW 2037
Ph:+61-2-8571-0847 .. Fx:+61-2-9692-9623 | unsolicited email not wanted here
Views & opinions here are mine and not those of any past or present employer |
From: Damian M. <da...@es...> - 2020-09-01 06:37:42
|
On another point in the same code, I try to grab several adjacent rows from the original matrix 'v', transpose them, and put them into what I call a slab. It is not a tile like you see in the Chapel DGEMM.

    var vslab : [cslice, common] R;

    // either
    [(r, c) in vslab.domain] vslab[r, c] = v[c, r];
    // or
    [j in cslice] vslab[j, common] = v[j, common];

where R is a general real(?w). Technically vslab is 'const', so I stabbed in the dark and tried

    const slab : domain(2) = {cslice, common};
    const vslab = [(r, c) in slab] v[c, r];

It seems to run in the same elapsed time, is genuinely 'const', and looks cleaner. Does it create any unnecessary data, i.e. does it create a temporary on the right before assigning to vslab, or does it do it only in cslice*common real(?w) numbers?

Thanks - Damian

Pacific Engineering Systems International, 277-279 Broadway, Glebe NSW 2037
Ph:+61-2-8571-0847 .. Fx:+61-2-9692-9623 | unsolicited email not wanted here
Views & opinions here are mine and not those of any past or present employer |
From: Brad C. <bra...@hp...> - 2020-09-01 02:01:00
|
Hi Damian —

What link did you use for the .zip? I just used the "Source code (zip)" link from:

https://github.com/chapel-lang/chapel/releases/tag/1.22.1

and it seemed to unpack fine for me.

-Brad

On Mon, 31 Aug 2020, Damian McGuckin wrote:

> On Mon, 31 Aug 2020, Damian McGuckin wrote:
>
>> On Sat, 29 Aug 2020, Damian McGuckin wrote:
>>
>>> Can you remind me how to grab a copy of master?
>>
>> Don't worry. I dragged the information from the deep reaches of my brain.
>
> I downloaded the '.zip' file from GitHub.
>
> Unzipped the file but something is wrong with it. It is corrupt, as it tried
> to create files with names which are actually Chapel programs.
>
> Has anybody seen this?
>
> Tried it twice.
>
> Regards - Damian
>
> Pacific Engineering Systems International, 277-279 Broadway, Glebe NSW 2037
> Ph:+61-2-8571-0847 .. Fx:+61-2-9692-9623 | unsolicited email not wanted here
> Views & opinions here are mine and not those of any past or present employer |
From: Brad C. <bra...@hp...> - 2020-09-01 01:56:27
|
Hi Damian —

Taking your three versions:

Original:

> forall r in rows do
> {
>   const ref ur = u[r, ..];
>
>   for j in cslice do
>   {
>     t[r, j] = vmDot(common, ur, vslab[j, ..]);
>   }
> }

Forall expr:

> forall r in rows do
> {
>   const ref ur = u[r, ..];
>   const x = [j in cslice] vmDot(common, ur, vslab[j, ..]);
>
>   t[r, cslice] = x;
> }

Succinct:

> forall r in rows do
> {
>   const x = [j in cslice] vmDot(common, u[r, ..], vslab[j, ..]);
>
>   t[r, cslice] = x;
> }
>
> slows down seriously, about 25+%.

I believe the difference between the final two is a simple case of Chapel not doing loop hoisting optimizations for non-trivial expressions. Specifically, you and I can see that `u[r, ..]` is independent of the value of 'j' so could be evaluated once and re-used for all iterations of the 'j' loop, but the Chapel compiler isn't mature enough to do this yet. So your "Forall expr" version gets an improvement by manually hoisting the evaluation of that expression out of the loop.

The delta between the original and forall expression versions is less obvious, but I would guess that it could be due to the use of nested parallelism (though we'd hope that the impact would be more minimal than 5%, at least for loops with large trip counts). Specifically, by default, '[j in cslice]' will be executed in parallel, but it'll first check to see whether there's already a task per core, and if so, will serialize the loop. Maybe this execution-time check is adding the 5% overhead? A way to check would be to write the initialization of 'x' as:

> const x = for j in cslice do vmDot(common, ur, vslab[j, ..]);

If this returned the lost 5%, I think that's the answer.

-Brad |
From: Damian M. <da...@es...> - 2020-09-01 01:12:15
|
On Mon, 31 Aug 2020, Brad Chamberlain wrote:

> That error message suggests to me that you're not compiling with
> version 1.22.0. Could you run `chpl --version` in that workspace to
> verify?

Oops. Senior's moment. I changed our system last week to have the Chapel compiler as part of everybody's environment. And made a typo. Isn't it nice when your mistakes are there for the whole world to see!

Thanks - Damian

Pacific Engineering Systems International, 277-279 Broadway, Glebe NSW 2037
Ph:+61-2-8571-0847 .. Fx:+61-2-9692-9623 | unsolicited email not wanted here
Views & opinions here are mine and not those of any past or present employer |
From: Brad C. <bra...@hp...> - 2020-08-31 20:11:11
|
Hi Damian —

That error message suggests to me that you're not compiling with version 1.22.0. Could you run `chpl --version` in that workspace to verify?

-Brad

On Mon, 31 Aug 2020, Damian McGuckin wrote:

> I compiled it with 1.22.0 and I get:
>
> $CHPL_HOME/modules/internal/DefaultRectangular.chpl:582: In function 'dsiDim':
> $CHPL_HOME/modules/internal/DefaultRectangular.chpl:583: error: tuple index 0 is out of bounds
> $CHPL_HOME/modules/internal/DefaultRectangular.chpl:583: note: tuple elements start at index 1
> $CHPL_HOME/modules/internal/ChapelArray.chpl:1392: Function 'dsiDim' instantiated as: dsiDim(this: borrowed domain(2,int(64),false), param d = 0)
>
> Sure, the Matrix is indexed from 0 but I can see nothing in the code that
> should stop it compiling with 1.22.0. Besides, 1.22.0 is the release with
> tuples which begin with 0.
>
> Regards - Damian
>
> Pacific Engineering Systems International, 277-279 Broadway, Glebe NSW 2037
> Ph:+61-2-8571-0847 .. Fx:+61-2-9692-9623 | unsolicited email not wanted here
> Views & opinions here are mine and not those of any past or present employer |
From: Rahul G. <u61...@an...> - 2020-08-31 19:29:00
|
Hi Damian,

I just did a fresh install from the zip file I downloaded from GitHub and it built without any issues.

Regards,
Rahul

--
Rahul Ghangas
Advanced Computing (R&D) (Honours)
The Australian National University
Ph- +61 0435040074
Email - rah...@an... , rah...@gm...

> On Sep 1, 2020, at 5:01 AM, Damian McGuckin <da...@es...> wrote:
>
> On Mon, 31 Aug 2020, Damian McGuckin wrote:
>
>> On Sat, 29 Aug 2020, Damian McGuckin wrote:
>>
>>> Can you remind me how to grab a copy of master?
>>
>> Don't worry. I dragged the information from the deep reaches of my brain.
>
> I downloaded the '.zip' file from GitHub.
>
> Unzipped the file but something is wrong with it. It is corrupt, as it tried to create files with names which are actually Chapel programs.
>
> Has anybody seen this?
>
> Tried it twice.
>
> Regards - Damian
>
> Pacific Engineering Systems International, 277-279 Broadway, Glebe NSW 2037
> Ph:+61-2-8571-0847 .. Fx:+61-2-9692-9623 | unsolicited email not wanted here
> Views & opinions here are mine and not those of any past or present employer |
From: Damian M. <da...@es...> - 2020-08-31 19:01:33
|
On Mon, 31 Aug 2020, Damian McGuckin wrote:

> On Sat, 29 Aug 2020, Damian McGuckin wrote:
>
>> Can you remind me how to grab a copy of master?
>
> Don't worry. I dragged the information from the deep reaches of my brain.

I downloaded the '.zip' file from GitHub.

Unzipped the file but something is wrong with it. It is corrupt, as it tried to create files with names which are actually Chapel programs.

Has anybody seen this?

Tried it twice.

Regards - Damian

Pacific Engineering Systems International, 277-279 Broadway, Glebe NSW 2037
Ph:+61-2-8571-0847 .. Fx:+61-2-9692-9623 | unsolicited email not wanted here
Views & opinions here are mine and not those of any past or present employer |