It appears the chain rule is applied by diff() with
commutative multiplication to form "f'(g(x))*g'(x)" as
the derivative of "f(g(x))".
That is OK if the functions and variables involved are
scalars, so that they always commute. However, I've
constructed an example for which diff() returns the
wrong result. I have a patch which makes it return the
correct result.

Here is the example problem. We want to compute the
derivative of ||A.x||^2 w.r.t. x, where A is a matrix
and x is a vector. The derivative of ||x||^2 is 2
transpose(x). The derivative of transpose(x) is the
identity matrix. Applying the chain rule, the
derivative is 2 transpose(A.x) . A.
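
As a sanity check on the expected answer, here is a small numerical sketch in plain Python (not Maxima). It compares 2 transpose(A.x) . A against a central finite difference of ||A.x||^2. The helper names (matvec, norm2, analytic_gradient, numeric_gradient) and the sample values of A and x are made up for illustration.

```python
def matvec(A, x):
    """Return A.x for a matrix A (list of rows) and a vector x."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]


def norm2(v):
    """Return the squared Euclidean norm ||v||^2."""
    return sum(c * c for c in v)


def analytic_gradient(A, x):
    """Return 2 transpose(A.x) . A as a row vector of length len(x)."""
    y = matvec(A, x)
    return [2 * sum(y[i] * A[i][j] for i in range(len(A)))
            for j in range(len(x))]


def numeric_gradient(A, x, h=1e-6):
    """Central finite-difference gradient of ||A.x||^2 with respect to x."""
    grad = []
    for j in range(len(x)):
        xp, xm = list(x), list(x)
        xp[j] += h
        xm[j] -= h
        grad.append((norm2(matvec(A, xp)) - norm2(matvec(A, xm))) / (2 * h))
    return grad


# Hypothetical sample data: a 3x2 matrix, so A and transpose(A) differ in shape.
A = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
x = [0.5, -1.5]

print(analytic_gradient(A, x))
print(numeric_gradient(A, x))
```

The two printed vectors should agree to within finite-difference error. Because A is not square, the factors in the product cannot be reordered, which is why the order transpose(A.x) . A matters.
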
(C4) gradef('norm2(x),2 .transpose(x))$
(D7) 2*A*'TRANSPOSE(A . x) <-- OOPS.
The multiplication operator for A times transpose(A.x)
should be the noncommutative multiplication, and it
should be transpose(A.x) times A.
This particular problem can be fixed by changing line
401 of comm.lisp from "#'MUL2" to "#'NCMUL2". Then the
return value from diff() is correct.
However making that change would introduce
noncommutative multiplications into expressions where
they don't belong. I guess that there should be some
kind of test to see if commutative is OK. What is the
right test, and how is it implemented?
Note 1. The gradef('norm2(x),2 .transpose(x))$ shouldn't
be needed, but without it diff('norm2(x),x); returns
transpose(x)+x, which is wrong.
Note 2. It appears the reason the gradef('norm2(x),...)
is needed is that the derivative of transpose(x).A
w.r.t. x is transpose(A), but Maxima thinks it is A.
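
The claim in Note 2 can be checked numerically in the same way. This is only a sketch in plain Python. It assumes the convention that entry (j, i) of the derivative is the partial derivative of component j of transpose(x).A with respect to x[i]. The helper names and sample values are made up for illustration.

```python
def row_times_matrix(x, A):
    """Return transpose(x) . A as a list of length len(A[0])."""
    return [sum(x[i] * A[i][j] for i in range(len(x)))
            for j in range(len(A[0]))]


def transpose(M):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*M)]


def numeric_jacobian(f, x, h=1e-6):
    """Central finite-difference Jacobian with J[j][i] = d f_j / d x_i."""
    m = len(f(x))
    J = [[0.0] * len(x) for _ in range(m)]
    for i in range(len(x)):
        xp, xm = list(x), list(x)
        xp[i] += h
        xm[i] -= h
        fp, fm = f(xp), f(xm)
        for j in range(m):
            J[j][i] = (fp[j] - fm[j]) / (2 * h)
    return J


# Hypothetical sample data: a 2x3 matrix, so A and transpose(A) differ in shape.
A = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
x = [0.5, -1.5]

J = numeric_jacobian(lambda v: row_times_matrix(v, A), x)
print(J)             # 3x2, numerically close to transpose(A)
print(transpose(A))
```

Under that convention the Jacobian has the shape of transpose(A), not of A, so the two cannot be interchanged when A is not square.
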