It appears that diff() applies the chain rule using
commutative multiplication, forming "f'(g(x))*g'(x)" as
the derivative of "f(g(x))". That is fine when the
functions and variables involved are scalars, since
scalars always commute. However, I have constructed an
example with nonscalar operands for which diff() returns
the wrong result. I have a patch which makes it return
the correct result for that example.

Here is the example problem. We want to compute the
derivative of ||A.x||^2 w.r.t. x, where A is a matrix
and x is a vector. The derivative of ||x||^2 w.r.t. x is
2 transpose(x). The derivative of transpose(x) is the
identity matrix, and the derivative of A.x w.r.t. x is A.
Applying the chain rule, the derivative of ||A.x||^2 is
2 transpose(A.x) . A.
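
The chain-rule step can be written out explicitly. This is only a sketch, assuming the row-vector convention the text implies (the derivative of a scalar w.r.t. a column vector x is a row vector); the symbols y and f are introduced here purely for illustration.

```latex
\begin{align*}
y &= A x, \qquad f(y) = \|y\|^2 = y^{\mathsf T} y, \\
\frac{\partial f}{\partial y} &= 2\, y^{\mathsf T}, \qquad
\frac{\partial y}{\partial x} = A, \\
\frac{\partial}{\partial x}\,\|A x\|^2
  &= \frac{\partial f}{\partial y}\,\frac{\partial y}{\partial x}
   = 2\,(A x)^{\mathsf T} A .
\end{align*}
```

The order of the factors matters: if A is m x n, then 2 transpose(A.x) is 1 x m and A is m x n, so in general the product is only conformable in this order.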

(C1) declare(A,nonscalar)$
(C2) declare(x,nonscalar)$
(C3) norm2(x):=transpose(x).x$
(C4) gradef('norm2(x),2 .transpose(x))$
(C5) gradef('transpose(x),1)$
(C6) display2d:FALSE$
(C7) diff('norm2(A.x),x);
(D7) 2*A*'TRANSPOSE(A . x) <-- OOPS.

The product of A and transpose(A.x) should use the
noncommutative multiplication operator ".", and the
operands should appear in the order transpose(A.x) . A,
not A times transpose(A.x).
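
The expected result 2 transpose(A.x) . A can be cross-checked component by component, without relying on gradef() or on diff() of a nonscalar. The following is a minimal sketch meant for a fresh session; the concrete 2 x 2 matrix and the component names x1, x2 are illustrative assumptions, not part of the original example.

```maxima
/* Illustrative values only: any numeric matrix would do. */
A : matrix([1, 2], [3, 4])$
x : matrix([x1], [x2])$

/* ||A.x||^2; with the default scalarmatrixp:true the 1x1 product is a scalar. */
n2 : transpose(A . x) . (A . x)$

/* Derivative assembled from ordinary scalar partial derivatives (row vector). */
grad : matrix([diff(n2, x1), diff(n2, x2)])$

/* The result claimed by the chain rule, using noncommutative "." in the right order. */
claimed : 2 * transpose(A . x) . A$

/* If the claim holds, every entry of the difference simplifies to zero. */
ratsimp(grad - claimed);
```

Working the first entry by hand for these values gives 20*x1 + 28*x2 on both sides, so the difference should vanish.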

This particular problem can be fixed by changing line
401 of comm.lisp from "#'MUL2" to "#'NCMUL2". With that
change, diff() returns the correct value for this example.

However, making that change unconditionally would
introduce noncommutative multiplications into expressions
where they don't belong. I guess there should be some
kind of test to determine whether commutative
multiplication is OK. What is the right test, and how
should it be implemented?

Note 1. The gradef('norm2(x),2 .transpose(x))$ in (C4)
shouldn't be needed. Without that gradef(), however,
diff('norm2(x),x); returns transpose(x)+x, which is wrong.

Note 2. It appears that the gradef('norm2(x),...) is
needed because the derivative of transpose(x).A w.r.t. x
is transpose(A), but Maxima thinks it is A.