## Diff of /PDL/Book/NiceSlice.pod [ee22fe] .. [99decf]


--- a/PDL/Book/NiceSlice.pod
+++ b/PDL/Book/NiceSlice.pod
@@ -37,6 +37,11 @@
    print dims($data);
    ($nx, $ny, $nz) = dims($data);

+See also the C<shape> function which returns the pdl shape as
+a pdl:
+
+    $datashape = shape($data);
+
 The number of elements in a piddle is equally easy:

    print nelem($data);
@@ -54,22 +59,33 @@
the second.  Behind the scenes, this is implemented by the C<slice>
function.

+    $section = $gal->slice('337:357,178:198');
+
Use the on-line documentation:

-    perldl> help slice
+    pdl> help slice

to explore the full set of options.  C<slice> is probably the most
-frequently used PDL function so we will explore it in some detail.  But
-first we notice that C<slice> is implemented via a named function.
+frequently used PDL function so we will explore it in some detail.
+But first we notice that C<slice> is implemented via a named function.
+
Through the magic of C<PDL::NiceSlice> and source filtering you can
access C<slice> functionality in a form very similar to the vector
array syntax found in many array computation languages such as
-FORTRAN-90 and Matlab.
+FORTRAN-90 and Matlab.  The chief difference is that the argument
+to the C<slice> method call is a I<string> describing the elements
+to be selected.  With the new C<PDL::NiceSlice> syntax, you don't use
+the method or function call and the argument does not need to be
+wrapped up in a string.
+
+In this chapter, we will usually show the C<PDL::NiceSlice> syntax
+but refer to the operation as a slice even though with the new
+syntax there is no longer an explicit C<slice> method being called.
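
For example, the following two statements select the same subregion
(assuming, as above, that C<$gal> is a 2-D image piddle):

    $section = $gal->slice('337:357,178:198');  # explicit method call, string argument
    $section = $gal(337:357,178:198);           # PDL::NiceSlice syntax

Both produce a slice of the same region; only the spelling differs.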

-The argument syntax to C< slice > is just a list of ranges, the simplest
-if of the form C< A:B > to specify the start and end pixels.  This
+The slicing argument syntax is just a list of ranges, the simplest
+of which is of the form C<A:B> to specify the start and end pixels.  This
generalises to arbitrary dimensions:

    $data = zeroes(1000);

@@ -77,29 +93,29 @@

    $data = zeroes(100,100,20);
    $sec = $data(0:20,40:60,1:3);

-Note that PDL, just like Perl and C, uses  B<ZERO OFFSET> arrays. i.e.
+Note that PDL, just like Perl and C, uses B<ZERO OFFSET> arrays. i.e.
the first element is numbered 0, not 1.  Just like Perl you can use
C<-N> to refer to the last elements:

    $data = zeroes(1000);
    $sec = $data(-10:-1);  # Elements 990 to 999 (last)

-One can also specify a step in the slice using the form C< A:B:C > where
-C< C > is the step. Here is an example:
-
-    perldl> $x = sequence(24); # Create a piddle of increasing value
-    perldl> print $x
+One can also specify a step in the slice using the form C<A:B:C> where
+C<C> is the step. Here is an example:
+
+    pdl> $x = sequence(24); # Create a piddle of increasing value
+    pdl> print $x
 [0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23]
-    perldl> print $x(16:22:2)
+    pdl> print $x(16:22:2)
 [16 18 20 22]

 Quite often one wants all the elements along many of the dimensions; one
-can just use  C< : > ' or just omit the specifier altogether:
-
-    perldl> $a = zeroes(10,20,3)
-    perldl> print dims $a(:,5:10,:)
+can just use "C<:>" or just omit the specifier altogether:
+
+    pdl> $a = zeroes(10,20,3)
+    pdl> print dims $a(:,5:10,:)
 10 6 3
-    perldl> print dims $a(,5:10,)
+    pdl> print dims $a(,5:10,)
 10 6 3

 Omitting the range allows specification of just one index along the
 dimension:

@@ -108,7 +124,7 @@

    $col = $z(42,:);   # Column 42 (Dims = 1x200)
    $row = $z(:,42);   # Row 42 (Dims = 100x1)

-Since the second argument to slice is just a Perl character string it is easy to manipulate:
+You also can use perl scalars to construct the slicing specifications:

    $x1 = 2; $x2 = 42;
    $sec = $data($x1:$x2);

@@ -117,19 +133,19 @@

 Here's the biggy:

-    perldl> $x = sequence(24);
-    perldl> print $x;
+    pdl> $x = sequence(24);
+    pdl> print $x;
 [0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23]
-    perldl> $slice = $x(4:20:2);
-    perldl> print $slice;
+    pdl> $slice = $x(4:20:2);
+    pdl> print $slice;
 [4 6 8 10 12 14 16 18 20]

 All very well. But now we modify the slice using the assignment
 operator.

-    perldl> $slice .= 0;
-    perldl> print $slice;
+    pdl> $slice .= 0;
+    pdl> print $slice;
 [0 0 0 0 0 0 0 0 0]
-    perldl> print $x;
+    pdl> print $x;
 [0 1 2 3 0 5 0 7 0 9 0 11 0 13 0 15 0 17 0 19 0 21 22 23]

 Modifying the slice automatically modifies the original data! However it

@@ -149,8 +165,8 @@

 If you monitor the memory consumed by the PDL process on your computer
 (UNIX/Linux users can try the C<top> command) you will see that the
-amount of memory consumed does not go up - B< even when the slice is
-modified > . This is because the way PDL is written allows many of the
+amount of memory consumed does not go up - B<even when the slice is
+modified>.  This is because the way PDL is written allows many of the
 simple operations on slices to be optimised - i.e. a temporary internal
 copy of the slice need not be made. Of course sometimes - for example
 when passing to an external subroutine - this optimisation is not

@@ -165,13 +181,13 @@

 extremely useful when using PDL's advanced B<threading> features (see
 I<PDL threading and the signature> later).

-If a dimension is of size unity it can be removed using C< () >:
+If a dimension is of size unity it can be removed using C<()>:

    $z = zeroes 100,30;
    $col = $z(42,:);    # Column 42 - 2D (Dims = 1x30)
    $col = $z((42),:);  # Column 42 - 1D (Dims = 30)

-And then one can put them back again using  C< * > ':
+And then one can put them back again using "C<*>":

    $col2 = $col(*,:,*); # Dims now = 1x30x1
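
The C<*> can also take a size, replicating the data along the new
dimension (a sketch, reusing the 1-D C<$col> piddle from above):

    $r = $col->slice('*3,:');  # Dims now = 3x30, data repeated along the new dimension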

@@ -184,7 +200,7 @@

At this point we would like to introduce the varied notations for
-calling C< slice > and it's friends.  This is because it will be
+calling C<slice> and its friends.  This is because it will be
commonly seen in PDL code and is very handy.  While at first unfamiliar
to C and FORTRAN users it is not rocket science, PDL users will quickly
become used to it.
@@ -202,7 +218,7 @@
functions, which while useful do clutter Perl's namespace, at some point

-For example here are 3 different ways of calling C< slice > :
+For example here are 3 different ways of calling C<slice>:

$t = slice($z,":,*3,:"); # Function call (old style)
$t = slice$z,":,*3,:"; # Function call (old style)
@@ -219,12 +235,12 @@

=head2 The C<dice> and C<dice_axis> functions - irregular subsets along axes

-As well as take regular slices along axes via the C< slice > function,
-another common requirement is to take B< irregular slices > , by which
+As well as take regular slices along axes via the C<slice> function,
+another common requirement is to take B<irregular slices>, by which
we mean a list of arbitrary coordinates.  This operation is referred to
-in PDL as B< dicing > a piddle.
-
-The C< dice_axis > function performs a dice along a specified axis:
+in PDL as B<dicing> a piddle.
+
+The C<dice_axis> function performs a dice along a specified axis:

    $a = sequence(10,20);
    $b = dice_axis $a, 0, [3,7,9];  # Dice along axis 0

@@ -234,13 +250,13 @@

 For a 2D piddle dicing along axis 0 selects columns, dicing along axis
 1 selects rows.  In general in N-dimensions dicing along a given axis
 reduces the number of elements along that axis, but the number of
-dimensions remains unchanged.  The C< dice > function allows all axes to
+dimensions remains unchanged.  The C<dice> function allows all axes to
 be specified at once:

    $z = zeroes 10,20,50;
    print dims dice $z,[2,3,5],[10,11,12],[30..35,39,40];

-The list of axes in the dice can be specified using Perl's  C< [] > '
+The list of axes in the dice can be specified using Perl's "C<[]>"
 list reference notation or using a 1D piddle:

    $z = sequence 10,20;
@@ -258,7 +274,7 @@
    $z = zeroes(3,4);
    $t = $z->xchg(0,1);  # Axes 0 and 1 swapped, dims now = 4,3

-This is a simple matrix transpose.  The method C<< $z->transpose >> and
+This is a simple matrix transpose.  The method C<< $z->transpose >> and
 the equivalent operator C<~$z> also do this, though they
also make a copy (i.e. return a new piddle) not a slice and can operate
on 1D piddles (i.e. convert a row vector into a column vector).
@@ -269,8 +285,8 @@
    $z = zeroes(3,4,5,6,7);
    $t = $z->xchg(1,3);  # Dims now 3,6,5,4,7

-A different way of switching dimensions around is provided by C<
-$z->mv(A,B) > which justs moves the axis C< A > to posiition C< B > :
+A different way of switching dimensions around is provided by C<< $z->mv(A,B) >>
+which just moves the axis C<A> to position C<B>:

    $z = zeroes(3,4,5,6,7);
    $t = $z->mv(1,3); # Dims now 3,5,6,4,7
@@ -281,7 +297,7 @@
    $t = $z->reorder(4,3,0,2,1); # Dims now 7,6,3,5,4
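
For a simple case, C<reorder> reproduces what C<xchg> does (a sketch,
assuming a fresh 3-D piddle):

    $z  = zeroes(3,4,5);
    $t1 = $z->xchg(0,1);       # Dims now 4,3,5
    $t2 = $z->reorder(1,0,2);  # Same dims: 4,3,5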

Note C<reorder> is our first example of a pure PDL method - it does
-not exist as a function and can only be called using the C<< $z->reorder(...) >> syntax. +not exist as a function and can only be called using the C<<$z->reorder(...)>> syntax.

@@ -291,8 +307,8 @@
is the C<clump> function for combining dimensions together.  Suppose
we have a 3-D datacube piddle:

-    perldl> $a = xvals(5,3,2);
-    perldl> print $a;
+    pdl> $a = xvals(5,3,2);
+    pdl> print $a;

[
[
@@ -312,48 +328,50 @@

But say we wanted to sum over the first B<TWO> dimensions? i.e.  replace
our datacube with a 1-D vector containing the sums of each plane. What
-we need to do is to clump' the first two dimensions together to make
+we need to do is to "clump" the first two dimensions together to make
one dimension, and then use C<sumover>. Surprisingly enough this is
what C<clump> does:

-    perldl> $b = $a->clump(2); # Clump first two dimensions together
-    perldl> print $b;
+    pdl> $b = $a->clump(2); # Clump first two dimensions together
+    pdl> print $b;
[
[0 1 2 3 4 0 1 2 3 4 0 1 2 3 4]
[0 1 2 3 4 0 1 2 3 4 0 1 2 3 4]
]
-    perldl> $c = sumover $b;
-    perldl> print $c;
+    pdl> $c = sumover $b;
+    pdl> print $c;
[30 30]

Now that we know about C<mv> it is also easy to sum over the last two dimensions:

-    perldl> print sumover $a->mv(0,2)->clump(2)
+    pdl> print sumover $a->mv(0,2)->clump(2)
[0 6 12 18 24]
-It is also possible using the special form C< clump(-1) > to clump B< all > the dimensions together:
-
-    perldl> $x = sequence(10,20,30,40);
-    perldl> print dims $x->clump(-1);
+
+It is also possible using the special form C<clump(-1)> to clump B<all>
+the dimensions together:
+
+    pdl> $x = sequence(10,20,30,40);
+    pdl> print dims $x->clump(-1);
240000
-    perldl> print sumover $x->clump(-1); # Same as sum($x)
+    pdl> print sumover $x->clump(-1); # Same as sum($x)
28799880000

-Uncannily this is almost exactly how the C< sum > function is implemented in PDL.
+Uncannily this is almost exactly how the C<sum> function is implemented in PDL.
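
As a sketch (not the actual PDL source), a C<sum> built along these
lines might look like:

    sub mysum {   # hypothetical helper, for illustration only
        my ($pdl) = @_;
        return sumover($pdl->clump(-1));
    }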

After our first look at threading in Chapter 2 we know how to add a
vector to rows of an image:

-    perldl> print $a = pdl([1,0,0],[1,1,0],[1,1,1]);
+    pdl> print $a = pdl([1,0,0],[1,1,0],[1,1,1]);
[
[1 0 0]
[1 1 0]
[1 1 1]
]
-    perldl> print $b = pdl(1,2,3);
+    pdl> print $b = pdl(1,2,3);
[1 2 3]
-    perldl> print $a+$b;
+    pdl> print $a+$b;
[
[2 2 3]
[2 3 3]
@@ -363,7 +381,7 @@
But say we wanted to add the vector to the columns.  You might think to
transpose C<$a>:

-    perldl> print $a->xchg(0,1)+$b;
+    pdl> print $a->xchg(0,1)+$b;
 [
  [2 3 4]
  [1 3 4]

@@ -371,30 +389,30 @@

 ]

 But the result is the transpose of the desired result.  We could of
-course just transpose the result but a cleaner method is to use C< dummy
-> to change the dimensions of C<$b> :
-
-    perldl> print $b->dummy(0); # Result has dims 1x3
+course just transpose the result but a cleaner method is to use C<dummy>
+to change the dimensions of C<$b>:
+
+    pdl> print $b->dummy(0); # Result has dims 1x3
 [
  [1]
  [2]
  [3]
 ]

-C< dummy > just inserts a `dummy dimension' of size unity at the
-specified place. C< dummy(0) > put's it at position 0 - i.e. the first
+C<dummy> just inserts a "dummy dimension" of size unity at the
+specified place.  C<dummy(0)> puts it at position 0 - i.e. the first
 dimension.  The result is a column vector.  Then we easily get what we
 want:

-    perldl> print $a + $b->dummy(0);
+    pdl> print $a + $b->dummy(0);
 [
  [2 1 1]
  [3 3 2]
  [4 4 4]
 ]

-Because of the threading rules the unit dimension makes C<$b >
-implicitly repeat along axis 0.  i.e.  it is as if C<< $b->dummy(0) >>
+Because of the threading rules the unit dimension makes C<$b>
+implicitly repeat along axis 0.  i.e.  it is as if C<< $b->dummy(0) >>
 B<looked> like:

 [

@@ -403,65 +421,65 @@

  [3 3 3]
 ]

-C< dummy > can also be used to insert a dimension of size >1 with the
+C<dummy> can also be used to insert a dimension of size >1 with the
 data B<explicitly> repeating:

-    perldl> print dims $b->dummy(0,10);
+    pdl> print dims $b->dummy(0,10);
 10 3
-    perldl> print $b->dummy(0,10);
+    pdl> print $b->dummy(0,10);
 [
  [1 1 1 1 1 1 1 1 1 1]
  [2 2 2 2 2 2 2 2 2 2]
  [3 3 3 3 3 3 3 3 3 3]
 ]

-=head2 Completely general subsets of data with C< index > , C< which >
-and C< where>
+=head2 Completely general subsets of data with C<index>, C<which>,
+and C<where>

 Our look at advanced slicing concludes with a look at completely
 general subsets, specified using arbitrary conditions.  Let's make a
 piddle of real numbers from 0 to 1:

-    perldl> print $a = sequence(10)/10;
+    pdl> print $a = sequence(10)/10;
 [0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9]

 We can make a conditional array whose values are 1 where a condition is
 true using standard PDL operators.  For example for numbers below 0.25
 or above 0.85:

-    perldl> print $a<0.25 | $a>0.85;
+    pdl> print $a<0.25 | $a>0.85;
 [1 1 1 0 0 0 0 0 0 1]

-We'll use this as an example of an arbitrary condtion. Using C< which >
+We'll use this as an example of an arbitrary condition.  Using C<which>
 we can return a piddle containing the positions of the elements which
 match the condition:

-    perldl> $idx = which($a<0.25 | $a>0.85); print $idx;
+    pdl> $idx = which($a<0.25 | $a>0.85); print $idx;
 [0 1 2 9]

 i.e. elements 0..2 and 9 in the original piddle are the ones we want.

-We can select these using the C< index > function:
-
-    perldl> print $a->index($idx);
+We can select these using the C<index> function:
+
+    pdl> print $a->index($idx);
 [0 0.1 0.2 0.9]

 So here we have an arbitrary, non-contiguous slice.  However thanks to
 the magic of PDL we can still modify this as if it was still a more
 boring kind of slice and have our results affect the original:

-    perldl> $a->index($idx) .= 0; # Set indexed values to zero
-    perldl> print $a;
+    pdl> $a->index($idx) .= 0; # Set indexed values to zero
+    pdl> print $a;
 [0 0 0 0.3 0.4 0.5 0.6 0.7 0.8 0]

-In fact PDL posesses a convenience function called C< where > which
+In fact PDL possesses a convenience function called C<where> which
 actually lets you combine these steps at once:

    $a = sequence(10)/10;
    $a->where($a<0.25 | $a>0.85) .= 0;
    print $a;   # Same result as above

-i.e.  we make a subset of values B< where > a certain condition is true.
+i.e.  we make a subset of values B<where> a certain condition is true.
You can of course use index with explicit values:

# Increment first and last values
@@ -469,19 +487,19 @@
$a = sequence(10);$a->index(pdl(0,9))++;

-What if you had a 2-D array?  C< index > is obviously one-dimensional.
-What happens is an implicit C< clump(-1) > (i.e.  the whole array is
+What if you had a 2-D array?  C<index> is obviously one-dimensional.
+What happens is an implicit C<clump(-1)> (i.e.  the whole array is
viewed as 1-D):

-    perldl> $a = sequence(10,2); - perldl>$a->index(pdl(0,9)) .= 999;
-    perldl> print $a; + pdl>$a = sequence(10,2);
+    pdl> $a->index(pdl(0,9)) .= 999; + pdl> print$a;
[
[999 1 2 3 4 5 6 7 8 9]
[ 10 11 12 13 14 15 16 17 18 999]
]

-You can of course use C< where > too for any number of dimensions:
+You can of course use C<where> too for any number of dimensions:

# e.g.  make a cube with a sphere of 1's in the middle:
    $cube = rvals(100,100,100);

@@ -495,25 +513,25 @@

 fundamental aspect of any array processing language and PDL is no
 exception (as you can tell from the preceding examples).  In PDL those
 functions might be even more important since they are absolutely vital
-in using PDL B< threading > , the fast (and automagic) vectorised
+in using PDL B<threading>, the fast (and automagic) vectorised
 iteration of "elementary operations" over arbitrary slices of
-multidimensional data.  First we have to explain what we mean by B<
-threading > in the context of PDL, especially since the term B<
-threading > already has a distinct meaning in computer science that only
+multidimensional data.  First we have to explain what we mean by B<threading>
+in the context of PDL, especially since the term B<threading>
+already has a distinct meaning in computer science that only
 partly agrees with its usage within PDL.  In the following we will
 explain the general use of PDL threading and highlight the close
 interplay with those slicing and dicing routines that you have just
-become familiar with ( C< slice > , C< xchg > , C< mv > , etc). But
+become familiar with (C<slice>, C<xchg>, C<mv>, etc).  But
 first things first: what is PDL threading?

=head3 Threading

-B< Threading > already has been working under the hood in many examples
+B<Threading> has already been working under the hood in many examples
 we have encountered in previous sections.  It allows for fast processing
-of large amounts of data in a scripting language (such as C< perl >).
-And just to be sure, PDL B< Threading > ist B< not > quite the same as
-threading in the computer science sense. Both concepts are related but
-more about that later.
+of large amounts of data in a scripting language (such as C<perl>).
+And just to be sure, PDL B<Threading> is B<not> the same as threading
+in the computer science sense or in the perl sense.  Both concepts
+are related but more about that later.
=head3 A simple example

@@ -525,7 +543,7 @@

    use PDL::IO::Pic;
    $im = rpic 'beads.jpg';   # image stored in the JPEG format

-The following code snippet calculates the maxima of all rows of this image C< $im > :
+The following code snippet calculates the maxima of all rows of this image C<$im>:

$max =$im->maximum;

@@ -533,60 +551,33 @@
the piddles involved using a little helper routine (see box) to print
out the shape of piddles in the course of computations:

-($max =$im->pdim('Input')->maximum)->pdim('Result');
+    ($max =$im->pdim('Input')->maximum)->pdim('Result');

that generates the following output:

Input Byte D [256,256]
-
Result Byte D [256]

-    \begin_inset Float figure
-    placement H
-    wide false
-    sideways false
-    status open
-
-    \begin_inset Box Boxed
-    position "c"
-    hor_pos "c"
-    has_inner_box 1
-    inner_pos "t"
-    use_parbox 0
-    width "110text%"
-    special "none"
-    height "55theight%"
-    height_special "none"
-    status open
-
-Since is important to keep track of the dimensions of piddles when using
-(and especially when introducing) B< PDL threading > we quickly define a
-shorthand command (a method) that lets us print the dimensions of
-piddles nicely formatted as needed:
-
-    { package PDL;
-    sub pdim { # pretty print type+dimensions and
-                  # allow for optional string arg
-       my ($this) = @_;
-       print (($#_ > 0 ?
-    $_[1]\t
-    :
-    \begin_inset Quotes eld
-
-    \begin_inset Quotes erd
-
-    ).
-
-    $this->info("%T %D
-    \backslash
-    n")); # use info to print type and dims
-
-       return $this;
-
-    }}
+  Since it is important to keep track of the dimensions
+  of piddles when using (and especially when
+  introducing) B<PDL threading> we quickly define a
+  shorthand command (a method) that lets us print
+  the dimensions of piddles nicely formatted as
+  needed:
+
+  { package PDL;
+    sub pdim {
+      # pretty print type+dimensions and
+      # allow for optional string arg
+      my ($this) = @_;
+      print (($#_ > 0 ? "$_[1]\t" : "" ) .
+          $this->info("%T %D\n")); # use info to print type and dims
+      return $this;
+    }
+  }

Two observations: note how we temporarily switched into the package PDL
-so that pdim can be used as a method on piddles and we made the function
+so that C<pdim> can be used as a method on piddles and we made the function
return the piddle argument so that it can be seamlessly integrated into
method invocation chains:

@@ -595,32 +586,32 @@
A small utility routine

So let's dissect what has happened.  If you look at the documentation of
-C< maximum > it says This function reduces the dimensionality of a
+C<maximum> it says This function reduces the dimensionality of a
piddle by one by taking the maximum along the 1st dimension.

-In this respect C< maximum > behaves quite differently from C< max > .
-C< max > will always return a scalar with a value equal to that of the
-largest element of a (possibly multidimensional) piddle.  C< maximum > ,
+In this respect C<maximum> behaves quite differently from C<max>.
+C<max> will always return a scalar with a value equal to that of the
+largest element of a (possibly multidimensional) piddle.  C<maximum>,
however, is by definition an operation that takes the maximum only over
a one-dimensional vector.  If the input piddle is of higher dimension
-this B< elementary operation > is automatically iterated over all
+this B<elementary operation> is automatically iterated over all
one-dimensional subslices of the input piddle.

-And, most imporatantly, this automatic iteration (we call it the B<
-threadloop >) is implemented as fast optimized C loops.  As a
+And, most importantly, this automatic iteration (we call it the
+B<threadloop>) is implemented as fast optimized C loops.  As a
convention, these subslices are by default taken along the first
dimensions of the input piddle.  In our current example the subslices
are one-dimensional and therefore taken along the first dimension.  All
results are placed in an appropriately sized output piddle of N-1
-dimension s, one value for each subslice on which the operation was
+dimensions, one value for each subslice on which the operation was
performed.

Now it should be no surprise that

-    $im3d = sequence short, 5,10,3; # a 3D image (volume)
-    $max = $im3d->pdim('Input')->maximum;
-    print $max->pdim('Result') . "\n";
+    pdl> $im3d = sequence short, 5,10,3; # a 3D image (volume)
+    pdl> $max = $im3d->pdim('Input')->maximum;
+    pdl> print $max->pdim('Result') . "\n";

 generates

Input   Short D [5,10,3]
Result  Short D [10,3]
@@ -649,12 +640,27 @@

Now suppose we do not want to calculate the maxima along the first
dimension but rather along the second (the column maxima).  However, we
-just heard that C< maximum > works by definition on the first dimension.
+just heard that C<maximum> works by definition on the first dimension.
How can we make it do the same thing on the second dimension? Here is
where the dimension manipulation routines come in handy: we use one to
make a piddle in which the original second dimension has been moved to
first place.  Guess how that is done: yes, using C<xchg> we get what
-we want
+we want:
+
+    pdl> $im3d = sequence short, 5,10,3; # a 3D image (volume)
+    pdl> $max = $im3d->xchg(0,1)->pdim('Input')->maximum;
+    pdl> print $max->pdim('Result') . "\n";
+
+generates
+
+    Input   Short D [10,5,3]
+    Result  Short D [5,3]
+    [
+     [ 45  46  47  48  49]
+     [ 95  96  97  98  99]
+     [145 146 147 148 149]
+    ]
+

If you check C<pdim>'s output you see how the originally second
dimension of size 10 has been moved to the first dimension (step 1->2)
@@ -675,7 +681,7 @@

dimension is the z dimension, say in a [8,256,256] shaped piddle just
-use C< mv > to obtain the desired result:
+use C<mv> to obtain the desired result:

    $convolved = $stack->mv(0,2)->conv2d($kernel);

@@ -685,22 +691,22 @@

 involved need to be manipulated so that iteration happens as desired
 (e.g.  over the intended subslices) and the result has the intended
 dimensionality.  Formulating your problem at hand in a way that makes
-use of threading rather than resorting to nested C< for > -loops at the
+use of threading rather than resorting to nested C<for>-loops at the
 perl level can make the difference between a script that is executed
 faster than you can type and one that is crawling along and giving you
 plenty of time to have your long overdue lunch break.

-=head3 Why threading and why call it B< threading > ?
+=head3 Why threading and why call it B<threading>?

 So what are the advantages of relying on threading to perform things you
-can achieve in perl also with explicit C< for > loops and the C< slice >
+can achieve in perl also with explicit C<for>-loops and the C<slice>
 command? There are several (very good) reasons.  The more you use PDL
 for your daily work the quicker you will appreciate this.

 Before we get into the details of the why and how let's admit: PDL is
 by no means the first data language that supports this type of
 automatical implicit looping facility: the authors have in fact been
 inspired by
-several previous data language implementations, most notably C< Yorick >
+several previous data language implementations, most notably C<Yorick>.

 Similar concepts are also implemented in APL and J, although well hidden
 by a wealth of terminology and notation very different from that of most

@@ -710,12 +716,12 @@

 with in a separate chapter) and the conceptual interplay with the
 dimension manipulation routines.

-The first and most important reason to use B< PDL threading > is simply
-B< speed > . The alternative to threading are loops at the perl level.
+The first and most important reason to use B<PDL threading> is simply
+B<speed>.  The alternative to threading is loops at the perl level.
That is certainly a viable alternative, however, if we rewrite our
maximum routine along these lines a quick benchmark test will prove our
-point. First of all, here is the code that does the equivalent of C<
-maximum > on 2D input without using threading
+point. First of all, here is the code that does the equivalent of C<maximum>
+on 2D input without using threading:

    sub mymax { # we only cover the case of 2D input

@@ -729,7 +735,7 @@

       return $result;
}

-We have written it so that C< mymax > can just deal with 2D input
+We have written it so that C<mymax> can just deal with 2D input
piddles.  A routine for the general n-dimensional case would have been
more involved.  Note that we explicitly have to create an output piddle
of the desired type and size.  By comparison, the corresponding
@@ -740,12 +746,12 @@
       return $pdl->maximum;
    }

-In fact, we only wrapped C< maximum > in another subroutine to have the
-same calling overhead as C< mymax > . We are trying to be fair (even
-though we are biased). So let's compare the performance of C< mymax >
-versus C< mythreadmax > . How? Remember that we are using perl, after
+In fact, we only wrapped C<maximum> in another subroutine to have the
+same calling overhead as C<mymax>.  We are trying to be fair (even
+though we are biased).  So let's compare the performance of C<mymax>
+versus C<mythreadmax>.  How? Remember that we are using perl, after
 all, and that there is (almost) always a module that does just what you
-need. Here and now C< Benchmark >
+need.  Here and now, that would be C<Benchmark>.

 Our benchmarking script looks like this:

@@ -759,13 +765,14 @@

 If we run this script it generates

-    PDL thread: 4 wallclock secs ( 2.48 usr + 0.64 sys = 3.12 CPU) @ 12802.88/s (n=40009)
-    Perl loops: 3 wallclock secs ( 1.80 usr + 1.23 sys = 3.03 CPU) @ 12.86/s (n=39)
+    PDL thread: 4 wclk secs (2.48 usr + 0.64 sys = 3.12 CPU) @ 12802.88/s (n=40009)
+    Perl loops: 3 wclk secs (1.80 usr + 1.23 sys = 3.03 CPU) @ 12.86/s (n=39)

 That proves our point: while the example using threading is executed at
 a rate of nearly 13,000 per second using explicit loops has brought down
-the speed to less than 13/s, a very significant difference. Obviously,
-the difference between threading and explicit looping depends somewhat
+the speed to less than 13/second, a very significant difference.
+
+Obviously, the difference between threading and explicit looping depends somewhat
 on the nature of the elementary operation and the piddles in question.
 The difference becomes most striking the more elementary operations are
 involved and the faster an individual elementary operation can be

@@ -774,7 +781,7 @@

 than that required to execute the explicit looping code.
 Another distinct advantage becomes apparent when comparing the code
-required to implement the equivalent of the C< maximum > functionality
+required to implement the equivalent of the C<maximum> functionality
 explicitly in perl code.  We have to write extra code to create the
 right size output piddle, explicitly handle dimension sizes, etc.  All in
 all the code is much less concise and also less general.

@@ -783,7 +790,7 @@

 yourself you increase the probability of introducing errors into your
 code.  When using threading, PDL checks all dimensions for you, makes
 sure it loops over the correct indices internally and keeps you from
-having to do the bookkeeping: after all, B< that > is what computers are good at.
+having to do the bookkeeping: after all, B<that> is what computers are good at.

 Even though PDL threading makes your life much easier in one respect by
 taking care of some of the "messy" details it leaves you with another

@@ -794,32 +801,34 @@

 key to using an array-oriented high level language like PDL
 successfully.  This is what the programmer new to PDL and used to
 low-level programming has to learn: avoid explicit loops where possible
-and try to use automatically performed B< thread loops > instead.
+and try to use automatically performed B<thread loops> instead.

 There is yet another benefit that comes with the threading approach.  By
 looking at places where threading can be efficiently used you are also
 rethinking your problem in a way so that it can be very effectively
-parallelized! The keen reader has probably already observed that those
+parallelized! The keen reader has probably already observed that those
 internal automatic loops of elementary operations over subslices do not
-have to be performed sequentially.
-
-    use Benchmark;
-    use PDL;
-    $a = sequence(10,300);
-    timethese(0, { # run each for at least 3 CPU secs
-                'PDL mulithreaded' => '$pl = maximum$a;',
-                'PDL singlethreaded' => '$a->add_threading_magic(1,10);
-$pt = maximum $a;',
-               });
-
-The command
-
-    $a->add_threading_magic(1,10);
-
-explicitly tells PDL to use 10 (possibly concurrently run depending on
-dimension 1.  For example, on an Cygwin PC with 2 AMD Opteron CPUs this
-script yields the following output:
+have to be performed sequentially.  In fact, as of PDL-2.4.10, there
+is a new capability where PDL now supports automatic parallelization:
+
+   use Benchmark qw(:hireswallclock);
+   use PDL;
+   $a = zeros(2_000_000);
+   $b = zeros(2_000_000);
+
+   timethese(20,{threaded => '$a **= 1.3'});
+
+   set_autopthread_targ(0);  # Set target to 0 for unthreaded
+   timethese(20,{unthreaded => '$b **= 1.3'});
+
+For a Vista/Cygwin system with a quad-core i5 processor we
+see a greater than 2.5X reduction in wall clock time by using
+multiple processor cores.  See documentation for L<PDL::ParallelCPU>
+using C<help PDL::ParallelCPU> in one of the PDL shells, or with
+C<pdldoc PDL::ParallelCPU> from the command line.

=head3 The general case: PDL functions and their signature

@@ -834,12 +843,12 @@
dimension of subslices in an elementary operation of a function in PDL?
We know from the preceding examples that some PDL functions work on a
one-dimensional subvector of the data and generate a zero-dimensional
-result (a scalar) from each of the processed subslices, for example: C<
-maximum > , C< minimum > , C< sumover > , C< prodover > , etc.
-Two-dimensional convolution ( C< conv2d >), on the other hand, consumes
+result (a scalar) from each of the processed subslices, for example: C<maximum>,
+C<minimum>, C<sumover>, C<prodover>, etc.
+Two-dimensional convolution (C<conv2d>), on the other hand, consumes
a 2D subslice in an elementary operation.  But how do we get this
information in general for any given function?  It is easy: you just
-have to check the function's B< signature > !
+have to check the function's B<signature>!

The signature is a string that contains this information in concise
symbolic form: it names the parameters of a function and the dimensions
@@ -849,10 +858,10 @@
about special type conversions that are to be performed at run-time.

Generally, you can find the signature of a function using the perldl
-online help system.  Just type C< sig <funcname> > at the command
+online help system.  Just type C<< sig <funcname> >> at the command
prompt, e.g.:

-    perldl> sig maximum
+    pdl> sig maximum

Signature: maximum(a(n); [o]c())

@@ -860,16 +869,16 @@

a(n); [o]c()
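To make the signature concrete, here is a NumPy sketch of what C<maximum(a(n); [o]c())> does in an elementary operation: consume a 1D vector, emit a scalar, and thread over any extra dimensions. (This is an analogue, not PDL itself; NumPy needs an explicit C<axis> argument, and its dimension order is reversed relative to PDL's fastest-first convention.)

```python
import numpy as np

# Elementary operation: 1D vector in, scalar out.
# Extra dimensions are looped ("threaded") over automatically.
a = np.arange(24).reshape(4, 6)  # four subvectors of length 6
c = a.max(axis=1)                # one scalar per subvector
print(c.shape)  # (4,)
print(c)        # [ 5 11 17 23]
```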

-This signature states that C< maximum > is a function with two arguments
-named C< a > and C< c > . Wait a minute: above it seemed that C< maximum
-> only takes one argument and returns a result! The apparent
-contradiction is resolved by noting that the formal argument C< c > is
-flagged with the C< [o] > option identifying C< c > is an output
-argument.  This seems to suggest that we could C< maximum > also call as
-maximum($im,$result);
+This signature states that C<maximum> is a function with two arguments
+named C<a> and C<c>. Wait a minute: above it seemed that C<maximum> only
+takes one argument and returns a result! The apparent
+contradiction is resolved by noting that the formal argument C<c> is
+flagged with the C<[o]> option identifying C<c> as an output
+argument.  This seems to suggest that we could also call C<maximum> as
+C<maximum($im,$result)>;

This is in fact possible and an intended feature of PDL that is useful
-in B< tight loops > where it helps to avoid unneccesary reallocation of
+in B<tight loops> where it helps to avoid unnecessary reallocation of
variables (see below).  In general, however, we will call functions in
the usual way that can be written symbolically as:

@@ -882,21 +891,21 @@
The other important information supplied by the signature is the
dimensionality of each of these arguments in an elementary operation.
Each formal parameter carries this information in a list of formal
-dimension identifiers enclosed in parentheses.  So indeed C< a(n) >
-marks C< a > as a one-dimensional argument.  Additionally, each
-dimension has a B< named > size in a signature, in this example C< n > .
-C< c() > has an empty list of dimension sizes: it is declared to be
+dimension identifiers enclosed in parentheses.  So indeed C<a(n)>
+marks C<a> as a one-dimensional argument.  Additionally, each
+dimension has a B<named> size in a signature, in this example C<n>.
+C<c()> has an empty list of dimension sizes: it is declared to be
zero-dimensional (a scalar).

If piddles that are supplied as runtime arguments to a function have
more dimensions than specified for their respective formal arguments in
-the signature then these dimensions are treated by PDL as B< extra
-dimensions > and lead to the operation being B< threaded > over the
+the signature then these dimensions are treated by PDL as B<extra dimensions>
+and lead to the operation being B<threaded> over the
appropriate subslices, just what we have seen in the simple examples
above.

As mentioned before a higher dimensional piddle can be viewed as an
-array (again B< not > in the perl array sense) of lower dimensional
+array (again B<not> in the perl array sense) of lower dimensional
subslices.  Anybody who has ever worked with matrix algebra will be
familiar with the concept.  For some of the following examples it will
be useful to illustrate this concept in somewhat more detail.  Let's
@@ -906,90 +915,101 @@

A boring piddle, you say? Yes, boring, but simple enough to clearly see
what is going on in the following.  First we look at it as a 3D array of
-0D subslices.  Since we know the syntax of the C< slice > method already
+0D subslices.  Since we know the syntax of the C<slice> method already
we can write down all 0D subslices, no problem:

-    $pdl->slice("($i),($j),($k)");
+    $pdl(($i),($j),($k));

Well, obviously we have not written down all 3*4*5 = 60 subslices
-literally but rather in a more concise way.  It is understood that C< $i
-> can have any value between 0 and 2, C<$j > between 0 and 3 and C< $k
-> between 0 and 4. To emphasize this we sometimes write
-
-    $pdl->slice("($i),($j),($k)")   $i=0..2; $j=0..3; $k=0..4
-
-With the meaning as above (and '..' C< not > meaning the perl list
-operator).  In that way we enumerate all the sublices.  Quite
+literally but rather in a more concise way.  It is understood that C<$i>
+can have any value between 0 and 2, C<$j> between 0 and 3 and C<$k>
+between 0 and 4. To emphasize this we sometimes write
+
+    $pdl(($i),($j),($k))   $i=0..2; $j=0..3; $k=0..4
+
+With the meaning as above (and '..' C<not> meaning the perl list
+operator).  In that way we enumerate all the subslices.  Quite
analogously, when dealing with an elementary operation that consumes 1D
-slices we want to view C< $pdl > as an C< [ > 4,5 C< ] > array of 1D
-subslices:
-
-    $pdl->slice(":,($i),($j)")   $i=0..3; $j=0..4
-
-And similarly, as a C< [ > 5 C< ] > array of 2D subslices
-
-    $pdl->slice(":,:,($i)")   $i=0..4
-
-You see how we just insert a ':' for each complete dimension we include
-in the subslice. In fig. XXX the situation is illustrated graphically
+slices we want to view C<$pdl> as an C<[4,5]> array of 1D subslices:
+
+    $pdl(:,($i),($j))   $i=0..3; $j=0..4
+
+And similarly, as a C<[5]> array of 2D subslices:
+
+    $pdl(:,:,($i))   $i=0..4
+
+You see how we just insert a ":" for each complete dimension we include
+in the subslice.  In fig. XXX the situation is illustrated graphically
for a 2D piddle.  Depending on the dimensions involved in an elementary
-operation we therefore often group the dimensions (what we call the B<
-shape >) of a piddle in a form that suggests the interpretation as an
+operation we therefore often group the dimensions (what we call the B<shape>)
+of a piddle in a form that suggests the interpretation as an
array of subslices.  For example, given our 3D piddle above that has a
-shape C< [ > 3,4,5 C< ] > we have the following different
+shape C<[3,4,5]> we have the following different
interpretations:

-    ()[3,4,5] a 3,4,5-array of 0D slices
-    (3)[4,5] a 4,5-array of 1D slices (of shape [3])
-    (3,4)[5] a 5-array of 2D slices (of shape [3,4])
-    (3,4,5)[] a 0D array of 3D slices (of shape [3,4,5])
+    ()[3,4,5]  a shape [3,4,5] array of 0D slices
+    (3)[4,5]   a shape [4,5]   array of 1D slices (of shape [3])
+    (3,4)[5]   a shape [5]     array of 2D slices (of shape [3,4])
+    (3,4,5)[]  a        0D     array of 3D slices (of shape [3,4,5])

The dimensions in parentheses suggest that these are used in the
elementary operation (mimicking the signature syntax); in the context of
-threading we call these the B< elementary dimensions > .  The following
-group of dimensions in rectangular brackets are the B< extra dimensions
-> .  Conversely, given the elementary/extra dims notation we can easily
+threading we call these the B<elementary dimensions>.  The following
+group of dimensions in rectangular brackets are the B<extra dimensions>.
+Conversely, given the elementary/extra dims notation we can easily
obtain the shape of the underlying piddle by appending the extradims to
-the elementary dims.  For example, a C< [ > 3,6,1 C< ] > array of 2D
-subslices (3,4)
+the elementary dims.  For example, a C<[3,6,1]> array of 2D
+subslices of shape C<[3,4]>:

(3,4)[3,6,1]

-identifies our piddle's shape as C< [ > 3,4,3,6,1 C< ] >
+identifies our piddle's shape as C<[3,4,3,6,1]>.
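The elementary/extra-dims bookkeeping can be sketched in NumPy (an analogue; PDL lists dimensions fastest-first, so the equivalent NumPy shape is the reversed tuple, which is a convention assumed here):

```python
import numpy as np

# A [3,6,1] "array" of 2D subslices of shape [3,4]:
# appending extra dims to elementary dims gives the full piddle shape.
elementary = [3, 4]
extra = [3, 6, 1]
pdl_shape = elementary + extra            # [3, 4, 3, 6, 1]
p = np.zeros(tuple(reversed(pdl_shape)))  # NumPy stores slowest-first
print(p.shape)  # (1, 6, 3, 4, 3)
```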

Alright, the principles are simple.  But nothing is better than a few
examples.  Again a typical imaging processing task is our starting
point.  We want to convert a colour image to greyscale.  The input image
is represented as a two-dimensional array of triples of RGB colour
-coordinates, or in other words, a piddle of shape C< [3,n,m] > .
+coordinates, or in other words, a piddle of shape C<[3,n,m]>.
Without delving too deeply into the details of digital colour
-representation it suffices to note that commonly a grey value B< i >
+representation it suffices to note that commonly a grey value B<i>
corresponding to a colour represented by a triple of red, green and blue
-intensities B< (r,g,b) > is obtained as a weighted sum:
-
-\begin_inset Formula $-i=\frac{77}{256}r+\frac{150}{256}g+\frac{29}{256}b$
-
-A straight forward way to compute this weighted sum in PDL uses the C<
-inner > function.  This function implements the well-known B< inner
-product > between two vectors.  In a elementary operation C< inner >
+intensities B<(r,g,b)> is obtained as a weighted sum:
+
+=for asciitex
+i=\frac{77}{256} r + \frac{150}{256} g + \frac{29}{256} b
+
+
+       77        150       29
+   i = --- r  +  --- g  +  --- b
+       256       256       256
+
+
+A straightforward way to compute this weighted sum in PDL uses the C<inner>
+function.  This function implements the well-known B<inner product>
+between two vectors.  In an elementary operation C<inner>
computes the sum of the element-by-element product of two
one-dimensional subslices (vectors) of equal length:

-\begin_inset Formula $-c=\sum_{i=0}^{n-1}a_{i}b_{i}$
-
-Now you should already be able to guess C< inner > 's signature:
-
-    perldl> sig inner
+=for asciitex
+c=\sum_{i=0}^{n-1}  a_{i}  b_{i}
+
+
+        __ n-1
+   c = \          a   b
+       /__ i=0     i   i
+
+
+Now you should already be able to guess C<inner>'s signature:
+
+    pdl> sig inner

Signature: inner(a(n); b(n); [o]c())

-C< a(n); b(n); [o]c(); > : two one-dimensional input parameters C< a(n)
-> and C< b(n) > and a scalar output parameter C< c() > .  Since C< a >
-and C< b > both have the same named dimension size C< n > the
+C<a(n); b(n); [o]c()>: two one-dimensional input parameters C<a(n)>
+and C<b(n)> and a scalar output parameter C<c()>.  Since C<a>
+and C<b> both have the same named dimension size C<n> the
corresponding dimension sizes of the actual arguments will have to match
at runtime (which will be checked by PDL!).  We demonstrate the
computation starting with a colour triple that produces a sort of
@@ -1011,21 +1031,20 @@
COLOR Byte D [3,500,300]

The code needs no modification! Let us analyze what is going on.  We
-know that C< $conv > has just the required number of dimensions (namely +know that C<$conv> has just the required number of dimensions (namely
one of size 3).  So this argument doesn't require PDL to perform
-threading.  However, the first argument C< $im > has two B< extra
-dimensions > (shape C< [ > 500,300 C< ] >). In this case threading
+threading.  However, the first argument C<$im> has two B<extra dimensions>
+(shape C<[500,300]>).  In this case threading
works (as you would probably expect) by iterating the inner product over
-the combination of all 1D subslices of C< $im > with the one and only
-subslice of C< $conv > creating a resulting piddle (the greyscale image)
+the combination of all 1D subslices of C<$im> with the one and only
+subslice of C<$conv> creating a resulting piddle (the greyscale image)
that is made up of all results of these elementary operations: a 500x300
-array of scalars, or in other words, a 2D piddle of shape C< [ > 500,300
-C< ] > .
+array of scalars, or in other words, a 2D piddle of shape C<[500,300]>.

We can more concisely express what we have said in words above in our
new way to split piddle arguments in elementary dimensions and extra
-dimensions.  At the top we write C< inner > 's signature and at the
-bottom the C< slice > expressions that show the subslices involved in
+dimensions.  At the top we write C<inner>'s signature and at the
+bottom the C<slice> expressions that show the subslices involved in
each elementary operation:

Piddles $im$conv $grey @@ -1038,9 +1057,9 @@ subslices are used in a elementary operation. It is a way to keep track what is going on behind the scenes when PDL threading is at work. Threading makes it possible that we can call the greyscale conversion -with piddles representing just one RGB pixel (shape C< [ > 3 C< ] >), a -line of RGB pixels (shape C< [ > 3,n C< ] >), RGB images (shape C< [ > -3,m,n C< ] >), volumes of RGB data (shape C< [ > 3,m,n,o C< ] >), etc. +with piddles representing just one RGB pixel (shape C<[3]>), a +line of RGB pixels (shape C<[3,n]>), RGB images (shape C<[3,m,n]>), +volumes of RGB data (shape C<[3,m,n,o]>), etc. All we have to do is wrap the code above into a small subroutine that also does some type conversion to complete it: @@ -1069,11 +1088,10 @@ a(tri = 3); [o] b() There is some new syntax in this signature that we haven't seen before: -writing C< tri = 3 > signifies that in a elementary operation C< rgbtogr -> will work on 1D subslices (we have encountered this before); -additionally, the size of the first dimension (named suggestively C< tri ->) B< must > be three. You get the idea. - +writing C<tri=3> signifies that in a elementary operation C<rgbtogr> +will work on 1D subslices (we have encountered this before); +additionally, the size of the first dimension (named suggestively C<tri>) +B<must> be three. You get the idea. What we have just seen is worth keeping in mind! By using PDL functions in our own subroutines we can make new functions with the ability to thread over subslices. Obviously, this is useful. We will come back to @@ -1092,10 +1110,10 @@ occur in both piddles. However, one piddle can have more extra dimensions than the other without causing a mismatch. That sounds strange? Ok, here is an example. We use one of the fundamental -arithmetic operations in PDL, addition implemente d by the ' C< + > ' +arithmetic operations in PDL, addition implemente d by the "C<+>" operator. 
You know already that in an array-oriented language like PDL addition
is performed element-by-element on scalars.  So the signature
-of ' C< + > ' comes as no surprise
+of "C<+>" comes as no surprise

    a(); b(); [o] c()

@@ -1110,7 +1128,7 @@
    $d = pdl [3,4];
print $a +$b, "\n";

-No big deal.  Extradims for both piddles have shape C< [ > 3 C< ] >
+No big deal.  Extradims for both piddles have shape C<[3]>
obviously matching, resulting in

[2 3 4]
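The same matching rules exist in NumPy under the name broadcasting; here is a sketch of the matching behaviour just described (shapes chosen to mirror the text's C<$a> and C<$c>, with NumPy's reversed dimension order):

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array(1)        # 0D operand: threaded against every element of a
print(a + b)           # [2 3 4]

# one operand may have more extra dims than the other:
c = np.array([[1, 2, 3],
              [4, 5, 6]])  # PDL shape [3,2] ~ NumPy shape (2, 3)
print(a + c)               # a is reused for each row of c
```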
@@ -1127,9 +1145,9 @@
through our new terminology and check that we can formally agree with
what we intuitively expected anyway.

-C< $a > 's B< extradim > (s) has shape C< [ > 3 C< ] > , those of C<$c
-> shape C< [ > 3 2 C< ] > .  The B< corresponding > B< extradim > (s) in
-this case is just the first one for the piddles involved.  It is equal
+C<$a>'s B<extradim>(s) has shape C<[3]>, those of C<$c> shape C<[3,2]>.
+The B<corresponding> B<extradim>(s) in this case is
+just the first one for the piddles involved.  It is equal
to 3 in both input piddles, so clearly matches.

$a$c
@@ -1139,23 +1157,19 @@
Now, here is something we have not explicitly discussed yet: what is the
shape of the automatically created output piddle given the shape of the
extradims of the input piddles involved? Well, the result is created so
-that it has as many extradims as that input B< piddle > (s) with the
+that it has as many extradims as that input B<piddle>(s) with the
piddles.  In our current example that leaves us with a result with
-extradim shape C< [ > 3,2 C< ] > : [o] c() ()[3,2] Remembering that we
+extradim shape C<[3,2]>: C<[o] c() ()[3,2]>. Remembering that we
obtain the shape of the output piddle by appending the shapes of the
extradims to that of the elementary dimensions (here a scalar, i.e.  0D)
-that leaves us with a result piddle of shape C< [
->
-3,2 C< ]
->
-.
-
-In the next example we want to multiply C< $c > with C<$d > so that
-each row of C< $c > is multiplied by the corresponding element of C<$d
-> or expressed in slices
-
-    $result->slice("($i),($j)") =$c->slice("($i),($j)") * $d->slice("($j)") $i=0..2,$j=0..1
+that leaves us with a result piddle of shape C<[3,2]>.
+
+In the next example we want to multiply C<$c> with C<$d> so that
+each row of C<$c> is multiplied by the corresponding element of C<$d>
+or expressed in slices (with niceslice syntax):
+
+    $result(($i),($j)) =$c(($i),($j)) * $d(($j)) $i=0..2,$j=0..1

How do we achieve that by threading?

@@ -1163,8 +1177,8 @@

is not the right way.

-Why? Well, the extradims don't match, C< [ > 3,2 C< ] > does not match
-C< [ > 2 C< ] > since 2 is not equal to 3.  Just to see how PDL checks
+Why? Well, the extradims don't match, C<[3,2]> does not match
+C<[2]> since 2 is not equal to 3.  Just to see how PDL checks
this let us actually execute the command.  The slightly obscure error
message is something like this

@@ -1175,14 +1189,14 @@
This is PDL's way to tell you that the extra dimensions don't match.

So how do we do it? We use one of the dimension manipulation methods
-again.  This time C< dummy > comes in handy.  We want to multiply each
-element in the nth row of C< $c > with the nth element of C<$d > .  So
-we have to repeat each element of C< $d > as many times as there are
-elements in each row of C< $c > .  This is exactly what we can achieve
-by inserting a dummy dimension of size $c-> B< getdim > (0) as dimension
-0 of C< $d > :
-
-    perldl> print $d->dummy(0,$c->getdim(0))->pdim("New dims");
+again.  This time C<dummy> comes in handy.  We want to multiply each
+element in the nth row of C<$c> with the nth element of C<$d>.  So
+we have to repeat each element of C<$d> as many times as there are
+elements in each row of C<$c>.  This is exactly what we can achieve
+by inserting a dummy dimension of size C<< $c->getdim(0) >> as dimension
+0 of C<$d>:
+
+    pdl> print $d->dummy(0,$c->getdim(0))->pdim("New dims");
New dims Double D [3,2]
[
[3 3 3]
@@ -1190,7 +1204,7 @@
]

Using this trick we have a our threaded multiplication do what we want.
-And now the extra dimensions B< match > (!):
+And now the extra dimensions B<match>(!):

$result =$c * $d->dummy(0,$c->getdim(0));
    print $result;

@@ -1202,10 +1216,10 @@
    "($i),(j)"     "($i),($j)"     "($i),($j)"
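The C<dummy> trick has a direct NumPy counterpart: insert a size-1 axis and let broadcasting stretch it. (A sketch with illustrative values, not the book's actual C<$c> and C<$d>; dimension order is NumPy's, i.e. reversed.)

```python
import numpy as np

c = np.array([[1, 2, 3],
              [4, 5, 6]])   # PDL shape [3,2] ~ NumPy (2, 3)
d = np.array([1, 10])       # one factor per row of c

# d[:, np.newaxis] has shape (2, 1); the size-1 axis is "elastic"
# and is reused across each row of c, like a PDL dummy dimension
result = c * d[:, np.newaxis]
print(result)  # [[ 1  2  3] [40 50 60]]
```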

But hang on, we want to verify (somewhat formally) that the right
-subslices of the original C< $d > are used in each elementary operation.
-That is easily achieved by noting that the slice "($i),($j)" of the
-dummied C< $d > is equivalent to the subslice "($j)" of the original 1D
-piddle C< $d > .  So we finally arrive at
+subslices of the original C<$d> are used in each elementary operation.
+That is easily achieved by noting that the slice C<($i),($j)> of the
+dummied C<$d> is equivalent to the subslice C<($j)> of the original 1D
+piddle C<$d>.  So we finally arrive at

$c$d $result "($i),(j)" "($j)" "($i),($j)" @@ -1216,20 +1230,20 @@ But before we try our understanding on such an example we look once again at the way extra dims have to match in a thread loop. In the -previous example, we had to find out about the size of C<$c > 's first
-dimension (using C< getdim(0) >) to make a dummy dimension that would
-fit C< $c > 's extradims in the threaded multiplication. Since similar
+previous example, we had to find out about the size of C<$c>'s first
+dimension (using C<getdim(0)>) to make a dummy dimension that would
+fit C<$c>'s extradims in the threaded multiplication. Since similar
situations occur very often when writing threaded PDL code the matching
rules for extra dimensions allow a dimension size of 1 to match any
-other dimension size: it is the B< elastic > dimension size in a sense
+other dimension size: it is the B<elastic> dimension size in a sense
that it grows in a thread loop as required. As in the thread loop the
corresponding extra dimension is marched through all its positions (e.g.
-C< slice(":,($i)") $i=0..n-1 >) the elastic dimension just uses its one
-and only position 0 repeatedly ( C< slice(":,(0)") $i=0..n-1 >).
+C<slice(":,($i)")$i=0..n-1>) the elastic dimension just uses its one
+and only position 0 repeatedly (C<slice(":,(0)") $i=0..n-1>).

Therefore, an equivalent and more concise way to write the threaded
multiplications makes use of this and the fact that a dummy dimension
of size 1 is created by default if the second argument is omitted (see
-C< help dummy >)
+C<help dummy>)

    print $c->pdim('c') * $d->dummy(0)->pdim('dd');

    $A [l,m]     $B [n,l]     $AB [n,m]
@@ -1246,7 +1260,7 @@

and that means

-    $AB->slice("($i),($j)") = inner $A->slice(":,($j)"), $B->slice("($i),:")   $i=0..n-1, $j=0..m-1
+    $AB(($i),($j)) = inner $A(:,($j)), $B(($i),:)   $i=0..n-1, $j=0..m-1

and that is exactly the definition of the matrix product as we explained
above! Our bit of formalism has sort of "proved" it.  You see that the
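As a cross-check of the threaded inner-product definition of the matrix product, a NumPy sketch (row-major shapes rather than the book's fastest-first notation; the broadcast-and-sum expression spells out the thread loop that C<inner> performs):

```python
import numpy as np

A = np.array([[1, 2],
              [3, 4]])
B = np.array([[5, 6],
              [7, 8]])

# AB[i,k] = sum_j A[i,j] * B[j,k], written as a broadcast + reduction:
# A[:,None,:] supplies A[i,j], B.T[None,:,:] supplies B[j,k]
AB = (A[:, np.newaxis, :] * B.T[np.newaxis, :, :]).sum(axis=-1)
print(np.array_equal(AB, A @ B))  # True
```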