[TCLCORE] expr as "little math language" / TIP 133

SourceForge Headquarters 1320 Columbia Street Suite 310 San Diego, CA 92101 +1 (858) 422-6466

Upon reading TIP #133 some time ago, it occurred to me that the addition of
a few more operations to the [expr] little language could make it a lot
more convenient for coding complicated mathematical expressions. The ones I
thought of are

   =3D     definition
   :=3D    assignment
   ;     "gobbled"
   ,     list construction

which are defined below. I think at least some of these could be suitable
for inclusion in TIP 133, but I also fear that some might call for a
separate TIP as they require greater changes in the implementation of the
[expr] parser. Therefore I have grouped them below in three steps, where
Step 1 are the (I think) most straightforward, Step 2 perhaps slightly
trickier, and Step 3 could be trickier still.

~ Step 1: Assignment and Gobbling

The assignment operation :=3D should be such that

   expr {$a :=3D $b}

is equivalent to

   set $a $b

i.e., expr {"z" :=3D $x*$y} would assign the value of x times y to z as well
as return it. The priority of :=3D should be lower than that of ?, and it
should be "right-associative", i.e.,

   expr {$a :=3D $b :=3D $c}

should be equivalent to

   set $a [set $b $c]

rather than to

   set [set $a $b] $c

since the latter is probably far less common.

The "gobbled" operation ; should be such that

   expr {$a ; $b}

is equivalent to

   expr {$a} ; expr {$b}

i.e., first the left operand is evaluated, then the right operand, and the
value of the right operand is returned. The priority of ; should be lower
than that of :=3D. The ; operation is associative.

Note: The name "gobbled" is taken from Plain METAFONT, where the gobbled
operation can be used in an expression to throw away the thing on the left
hand side (although MF's gobbled is a bit stronger than this ; as it also
throws away the expression on the right hand side). A more Tclish reference
is perhaps that it is the opposite (left and right operands reversed) of
the famous K combinator. I fear neither of these explain what they are good
for though, so here is an example:

  expr {
     "x2" :=3D $x  * $x  ;
     "x4" :=3D $x2 * $x2 ;
     "x8" :=3D $x4 * $x4 ;
     $x8 * $x8
  }

computes x^16 in only four multiplications. (Yes, there is already the pow
function that does this, and in Tcl8.5 there will be the ** operation as
well, but this example generalises also to all other functions than can be
expressed as compositions of several smaller pieces.) Furthermore the
meaning of the above should be immediately clear to the majority of
programmers, even if they haven't had the details of :=3D and ; explained to
them.

~ Step 2: List Construction

The list construction operation , should be such that

   expr {$a , $b}

is equivalent to

   linsert $a end $b

or more intuitively, this appends $b to the list $a. (When $a has a numeric
internal representation, this should of course make use of the obvious
optimization as [list $a $b], but that is semantically irrelevant.) The ,
operation should have priority between :=3D and ;. Furthermore it should be
"left-associative", i.e.,

   expr {1,2,3}

is equivalent to

   expr {(1,2),3}

but not to

   expr {1,(2,3)}

(the first two return the same thing as [list 1 2 3], but the last rather
the same as [list 1 [list 2 3]]).

This changes the parsing of expressions such as atan2($y,$x), fmod($a,$b),
hypot($x,$y), and pow($a,$b), but not necessarily their meaning. What is
inside the parentheses still contains the same information; it has merely
been collected in a list. Conversion to Tcl_Value's is still possible. The
main practical difference to the current state of things may well be that
expressions such as

   atan2($L)

which now produce compile-time errors, would instead produce execution-time
errors if $L is not a list of two numbers (and passing a pair of numbers
could be useful). Another consequence of this is that the introduction of
math functions which take an arbitrary number of arguments (max and min are
naturally defined in that way) might become simpler.

~ Step 3: Definition of Local Constants

The idea here is that =3D should be used for defining symbolic names for
values that can be accessed without the need for a $ as with proper
variables. The best name I've so far come up with for these is "local
constants", reflecting two things about them: (i) They are local to the
[expr] command in which they are defined (even local to a particular
invocation of that command), and (ii) once defined, they may not be
redefined. The "constant" part is however slightly ambiguous, as it could
also be interpreted to mean "must have a fixed numerical value", which is
not the intention. The idea is instead that the above x^16 example should
be possible to write as

  expr {
     x2 =3D $x * $x ;
     x4 =3D x2 * x2 ;
     x8 =3D x4 * x4 ;
     x8 * x8
  }

i.e., without the "" quoting of names to be defined and without the $ to
access their values later on.

Technically this would have to involve several modifications to the [expr]
parser, byte compiler, and possibly some other stuff as well, but probably
not extremely much. The parser already responds to

   expr {whatever}

by `Error: syntax error in expression "whatever": variable references
require preceding $' whereas

   expr {whatever ()}

produces the error `Error: unknown math function "whatever"'. Hence tokens
that could be names of "local constants" are already scanned and at least
the  character following them is looked at. Thus definitions can be
spotted, and before throwing the above syntax error it could be checked if
the token is a defined "local constant".

In principle, the local constants can be stored in a dictionary (name
points to value), but perhaps in bytecoding it is possible to just let them
accumulate on a stack.

Priority-wise, the =3D operation is a bit of a bastard. It probably should
have the same priority as assignment :=3D, since that is what people would
expect, but at the same time the thing on its left is not a general
expression and that makes the entire `name =3D' a kind of unary operation.

~ A Real Example

The examples above are really toy examples, and may fail to convey the full
effect that the operations proposed have on [expr]---in some sense
extending it from a command that evaluates expressions to a command that
acts as a wrapper for a fairly complete "little math language"---so here is
an example from Real Life. (It is probably much too long to make a good
example, but it is also something that I've actually had to write.)

The details of what is computed aren't important (and I've removed all
comments to shorten this posting), but notice that pretty much every
command is a combination of [expr] and [set], where the latter mostly
caches the value of some expression that is going to be used in several
places.

proc tensionize_curveto {postvar joinvar prevar x0 y0 x1 y1 x2 y2 x3 y3} {
   upvar 1 $prevar preL $postvar postL $joinvar joinL
   set a1 [expr {$x1-$x0}]
   set b1 [expr {$y1-$y0}]
   set a2 [expr {$x3-$x2}]
   set b2 [expr {$y3-$y2}]
   set postL [list postdir [list $a1 $b1]]
   set preL [list predir [list $a2 $b2]]
   set t [expr {double(abs($a1)+abs($b1))}]
   if {$t>0} then {
      set dtheta [expr {$t / ($a1*$a1 + $b1*$b1)}]
      lappend postL [expr {atan2($b1,$a1)}] $dtheta
   } else {
      set dtheta 7.0
      lappend postL 0.0 $dtheta
   }
   set t [expr {double(abs($a2)+abs($b2))}]
   if {$t>0} then {
      set dphi [expr {$t / ($a2*$a2 + $b2*$b2)}]
      lappend preL [expr {atan2($b2,$a2)}] $dphi
   } else {
      set dphi 7.0
      lappend preL 0.0 $dphi
   }
   set c [expr {$x3-$x0}]
   set d [expr {$y3-$y0}]
   set theta_re [expr {$a1*$c + $b1*$d}]
   set theta_im [expr {-$a1*$d + $b1*$c}]
   set theta [expr {atan2($theta_im,$theta_re)}]
   set phi_re [expr {$a2*$c + $b2*$d}]
   set phi_im [expr {$a2*$d - $b2*$c}]
   set phi [expr {atan2($phi_im,$phi_re)}]
   set t [expr {hypot($theta_re,$theta_im)}]
   set c_theta [expr {$theta_re / $t}]
   set s_theta [expr {$theta_im / $t}]
   set t [expr {hypot($phi_re,$phi_im)}]
   set c_phi [expr {$phi_re / $t}]
   set s_phi [expr {$phi_im / $t}]
   set f_n1 [expr {$s_theta - $s_phi/16}]
   set f_n2 [expr {$s_phi - $s_theta/16}]
   set f_n3 [expr {$c_theta - $c_phi}]
   set f_n123 [expr {sqrt(2)*$f_n1*$f_n2*$f_n3}]
   set golden [expr {(sqrt(5)-1)/2}]
   set rho_denom [expr {2 + $golden*$c_theta + (1-$golden)*$c_phi}]
   set sigma_denom [expr {2 + (1-$golden)*$c_theta + $golden*$c_phi}]
   set rho [expr {(2 + $f_n123) / $rho_denom / 3}]
   set sigma [expr {(2 - $f_n123) / $sigma_denom / 3}]
   set d_numer [expr {sqrt(2)*(
     (abs($c_theta)*$dtheta + abs($c_phi/16)*$dphi) * abs($f_n2*$f_n3) +
     (abs($c_theta/16)*$dtheta + abs($c_phi)*$dphi) * abs($f_n1*$f_n3) +
     abs($f_n1*$f_n2) * (abs($s_theta)*$dtheta + abs($s_phi)*$dphi))}]
   set d_rho_denom [expr {$golden*abs($s_theta)*$dtheta +\
     (1-$golden)*abs($s_phi)*$dphi}]
   set d_sigma_denom [expr {(1-$golden)*abs($s_theta)*$dtheta +\
     $golden*abs($s_phi)*$dphi}]
   set r_rho [expr {abs($d_numer/(2+$f_n123)) +\
     abs($d_rho_denom/$rho_denom)}]
   set r_sigma [expr {abs($d_numer/(2-$f_n123)) +\
     abs($d_sigma_denom/$sigma_denom)}]
   set t [expr {hypot($c,$d)}]
   if {$a1 !=3D 0 || $b1 !=3D 0} then {
      set alpha [expr {$rho * $t / hypot($a1,$b1)}]
      set r_alpha [expr {$r_rho + $dtheta}]
   } else {
      set alpha infinity
      set r_alpha $r_rho
   }
   if {$a2 !=3D 0 || $b2 !=3D 0} then {
      set beta [expr {$sigma * $t / hypot($a2,$b2)}]
      set r_beta [expr {$r_sigma + $dphi}]
   } else {
      set beta infinity
      set r_beta $r_sigma
   }
   set joinL [list tension $alpha $beta [expr {log10($r_alpha)}]\
     [expr {log10($r_beta)}]]
   set gamma_err [expr {3*($r_alpha+$r_beta)}]
   if {$alpha=3D=3D"infinity"} then {
      set gamma1 infinity
      set gamma2 0
   } elseif {$beta=3D=3D"infinity"} then {
      set gamma1 0
      set gamma2 infinity
   } else {
      set t1 [expr {$theta+$phi - 3*$phi*$alpha}]
      set r1 [expr {($dtheta + abs(1-3*$alpha)*$dphi +\
        3*abs($phi)*$r_alpha*$alpha) / abs($t1)}]
      set t2 [expr {$theta+$phi - 3*$theta*$beta}]
      set r2 [expr {(abs(1-3*$beta)*$dtheta + $dphi +\
        3*abs($theta)*$r_beta*$beta) / abs($t2)}]
      set gamma1 [expr {$beta*$beta*$beta / ($alpha*$alpha*$alpha) *\
        $t1/$t2}]
      set gamma2 [expr {1/$gamma1}]
      set gamma_err [expr {$gamma_err + $r1 + $r2}]
   }
   set gamma_err [expr {log10($gamma_err)}]
   lappend postL $gamma1 $gamma_err
   lappend preL $gamma2 $gamma_err
   return [list $postL $joinL $preL]
}

With the suggested new operations, the above could be coded as

proc tensionize_curveto {postvar joinvar prevar x0 y0 x1 y1 x2 y2 x3 y3} {
   upvar 1 $prevar preL $postvar postL $joinvar joinL
   expr {
      a1 =3D $x1-$x0;
      b1 =3D $y1-$y0;
      a2 =3D $x3-$x2;
      b2 =3D $y3-$y2;
      "postL" :=3D ("postdir", (a1,b1));
      "preL" :=3D ("predir", (a2,b2));
      t0 =3D double(abs(a1)+abs(b1));
      t0>0 ? (
         dtheta =3D $t0 / (a1*a1 + b1*b1);
         "postL" :=3D ($postL, atan2(b1,a1), dtheta)
      ) : (
         dtheta =3D 7.0;
         "postL" :=3D ($postL, 0.0, dtheta)
      );
      t1 =3D double(abs(a2)+abs(b2));
      t1>0 ? (
         dphi =3D t1 / (a2*a2 + b2*b2);
         "preL" :=3D ($preL, atan2(b2,a2), dphi)
      ) : (
         dphi =3D 7.0;
         "preL" :=3D ($preL, 0.0, dphi)
      );
      c =3D $x3-$x0;
      d =3D $y3-$y0;
      theta_re =3D a1*c + b1*d;
      theta_im =3D -a1*d + b1*c;
      theta =3D atan2(theta_im,theta_re);
      phi_re =3D a2*c + b2*d;
      phi_im =3D a2*d - b2*c;
      phi =3D atan2(phi_im,phi_re);
      t2 =3D hypot(theta_re,theta_im);
      c_theta =3D theta_re/t2;
      s_theta =3D theta_im/t2;
      t3 =3D hypot(phi_re,phi_im);
      c_phi =3D phi_re/t3;
      s_phi =3D phi_im/t3;
      f_n1 =3D s_theta - s_phi/16;
      f_n2 =3D s_phi - s_theta/16;
      f_n3 =3D c_theta - c_phi;
      f_n123 =3D sqrt(2)*f_n1*f_n2*f_n3;
      golden =3D (sqrt(5)-1)/2;
      rho_denom =3D 2 + golden*c_theta + (1-golden)*c_phi;
      sigma_denom =3D 2 + (1-golden)*c_theta + golden*c_phi;
      rho =3D (2 + f_n123) / rho_denom / 3;
      sigma =3D (2 - f_n123) / sigma_denom / 3;
      d_numer =3D sqrt(2)*(
        (abs(c_theta)*dtheta + abs(c_phi/16)*dphi) * abs(f_n2*f_n3) +
        (abs(c_theta/16)*dtheta + abs(c_phi)*dphi) * abs(f_n1*f_n3) +
        abs(f_n1*f_n2) * (abs(s_theta)*dtheta + abs(s_phi)*dphi)
      );
      d_rho_denom =3D
         golden*abs(s_theta)*dtheta + (1-golden)*abs(s_phi)*dphi;
      d_sigma_denom =3D
         (1-golden)*abs(s_theta)*dtheta + golden*abs(s_phi)*dphi;
      r_rho =3D abs(d_numer/(2+f_n123)) + abs(d_rho_denom/rho_denom);
      r_sigma =3D abs(d_numer/(2-f_n123)) + abs(d_sigma_denom/sigma_denom);
      t4 =3D hypot(c,d);
      a1 !=3D 0 || b1 !=3D 0 ? (
         alpha =3D rho * t4 / hypot(a1,b1);
         r_alpha =3D r_rho + dtheta
      ) : (
         alpha =3D "infinity";
         r_alpha =3D r_rho
      );
      a2 !=3D 0 || b2 !=3D 0 ? (
         beta =3D sigma * t4 / hypot(a2,b2);
         r_beta =3D r_sigma + dphi
      ) : (
         beta =3D "infinity";
         r_beta =3D r_sigma
      );
      joinL :=3D ("tension", alpha, beta, log10(r_alpha), log10(r_beta));
      "gamma_err" :=3D 3*(r_alpha + r_beta);
      alpha =3D=3D "infinity" ? (
         gamma1 =3D "infinity";
         gamma2 =3D 0
      ) : beta =3D=3D "infinity" ? (
         gamma1 =3D 0;
         gamma2 =3D "infinity"
      ) : (
         t5 =3D theta+phi - 3*phi*alpha;
         r1 =3D (dtheta + abs(1-3*alpha)*dphi + 3*abs(phi)*r_alpha*alpha) /
              abs(t5);
         t6 =3D theta+phi - 3*theta*beta;
         r2 =3D (abs(1-3*beta)*dtheta + dphi + 3*abs(theta)*r_beta*beta) /
              abs(t6);
         gamma1 =3D beta*beta*beta / (alpha*alpha*alpha) * t5/t6;
         gamma2 =3D 1/gamma1;
         "gamma_err" :=3D $gamma_err + r1 + r2
      );
      "gamma_err" :=3D log10($gamma_err);
      "postL" :=3D ($postL, gamma1, $gamma_err);
      "preL" :=3D ($preL, gamma2, $gamma_err);
   }
}

Even if the second form isn't beautiful, it is a lot more readable than the
first form. The ability to introduce auxiliary quantities makes it possible
to practically encode much larger formulae using [expr].

~ Miscellaneous Ideas

With [expr] bodies as large as that above, it would be necessary for
debugging to include all defined local constants and their values in the
errorInfo when the [expr] return with an error. Otherwise the block of code
would be much too hard to decipher.

Another trick that the above (in particular Step 3) could be used for is
supporting named constants. Consider

set cdefs {
   pi =3D 3.141592653589793238;
   e =3D 2.718281828459045235;
   Euler_gamma =3D .5772156649015328606;
}

expr $cdefs {sin($x*pi) + $y*Euler_gamma - e}

Since [expr] concatenates its arguments before parsing them, the
effect of the above is the same as that of

   expr {
      pi =3D 3.141592653589793238;
      e =3D 2.718281828459045235;
      Euler_gamma =3D .5772156649015328606;
      sin($x*pi) + $y*Euler_gamma - e
   }

but in a more reusable way. This does not face the problem with potential
incompatibilities when different packages want to extend [expr] in
different ways that one might have with a global "define expr constant"
command, since the constants are explicitly specified in each command where
they are used.

(It might even be possible to cache interpretations of expressions in cases
such as the above. As long as $cdefs ends with a ; at top level, the two
arguments of  [expr] can be parsed separately. The internal representation
of {sin($x*pi) + $y*Euler_gamma - e} would of course need to specify for
which set of previously defined "local constants" it is valid.)

Lars Hellstr=F6m

[TCLCORE] expr as "little math language" / TIP 133

The Tool Command Language implementation

[TCLCORE] expr as "little math language" / TIP 133