Chain Rule


 

16.1 INTRODUCTION

16.1.1 Building Complex Functions from Basic Ones

In calculus, we can build more general functions from basic functions. One possibility is to add functions and form f(x) + g(x). Another possibility is to multiply functions and form f(x) g(x). A third possibility is to compose functions and form f(g(x)). The composition of functions is non-commutative: in general f(g(x)) ≠ g(f(x)). Indeed, with f(x) = sin(x) and g(x) = x² for example, we have f(g(x)) = sin(x²), which is completely different from g(f(x)) = sin(x)².

Figure 1. Two functions f and g can be combined to the composition f∘g.

16.1.2 The Chain Rule: From Single Variable to Higher Dimensions

How can we express the rate of change of a composite function in terms of the basic functions it is built of? For the sum of two functions, we have the addition rule (f + g)' = f' + g'; for multiplication we have the product rule (fg)' = f'g + fg'. We usually just write f' or g' and do not always write the argument. As you know from single-variable calculus, the derivative of the composite function f(g(x)) is given by the chain rule. This is (f∘g)' = f'(g) g'. Written out in more detail with the argument, we can write (f∘g)'(x) = f'(g(x)) g'(x). We generalize this here to higher dimensions. Instead of f' we just write df. This is the Jacobian matrix we know. Now, the same rule d(f∘g)(x) = df(g(x)) dg(x) holds as before, and this is called the chain rule in higher dimensions. On the right-hand side, we have the matrix product of two matrices.

16.1.3 Dimensions and the Chain Rule

Let us see why this makes sense in terms of dimensions: if g: ℝᵖ → ℝⁿ and f: ℝⁿ → ℝᵐ, then dg(x) is an n × p matrix and df(g(x)) is an m × n matrix, and the product df(g(x)) dg(x) is an m × p matrix, which is the same type of matrix as d(f∘g)(x), because f∘g maps ℝᵖ → ℝᵐ so that d(f∘g)(x) is also an m × p matrix. The name chain rule comes from the fact that it deals with functions that are chained together.

16.2 LECTURE

16.2.1 The Multivariable Chain Rule

Given a differentiable function f: ℝⁿ → ℝᵐ, its derivative at x is the m × n Jacobian matrix df(x). If g: ℝᵖ → ℝⁿ is another function whose range lies in the domain of f, we can combine them and form f∘g: ℝᵖ → ℝᵐ. The matrices df(g(x)) and dg(x) combine to the matrix product df(g(x)) dg(x) at a point x. This matrix is in M(m, p), the space of m × p matrices. The multivariable chain rule is:

Theorem 1. d(f∘g)(x) = df(g(x)) dg(x).
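Theorem 1 can be checked numerically: approximate all three Jacobians by central differences and compare the matrix product with the Jacobian of the composition. This is a minimal sketch; the maps g: ℝ² → ℝ³ and f: ℝ³ → ℝ² below are hypothetical examples chosen for illustration, not taken from the lecture.

```python
import numpy as np

# Hypothetical maps: g goes from R^2 to R^3, f from R^3 to R^2.
def g(x):
    return np.array([x[0] * x[1], x[0] ** 2, np.sin(x[1])])

def f(y):
    return np.array([y[0] + y[2], y[1] * y[2]])

def jacobian(h, x, eps=1e-6):
    """Central-difference approximation of the Jacobian matrix dh(x)."""
    x = np.asarray(x, dtype=float)
    cols = []
    for j in range(len(x)):
        e = np.zeros_like(x)
        e[j] = eps
        cols.append((h(x + e) - h(x - e)) / (2 * eps))
    return np.column_stack(cols)

x = np.array([0.3, 0.7])
lhs = jacobian(lambda t: f(g(t)), x)        # d(f∘g)(x), a 2 x 2 matrix
rhs = jacobian(f, g(x)) @ jacobian(g, x)    # df(g(x)) dg(x), also 2 x 2
print(np.max(np.abs(lhs - rhs)))            # close to 0
```

Note how the shapes chain: dg(x) is 3 × 2, df(g(x)) is 2 × 3, and their product is 2 × 2, the type of matrix a map ℝ² → ℝ² must have as its Jacobian.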

16.2.2 Scalar Functions and the Gradient

For n = m = 1, the single-variable calculus case, we have f∘g(x) = f(g(x)) and (f∘g)'(x) = f'(g(x)) g'(x). In general, df is now a matrix rather than a number. By checking a single matrix entry, we reduce to the case m = 1. In that case, f is a scalar function. While df(x) = [f_{x_1}(x), …, f_{x_n}(x)] is a row vector, we define the column vector ∇f(x) = [f_{x_1}(x), …, f_{x_n}(x)]^T, the gradient. If r(t) is a curve, we write r'(t) instead of dr(t). The symbol ∇ is addressed also as "nabla".1 The special case is:

Theorem 2. d/dt f(r(t)) = ∇f(r(t)) · r'(t).

Proof. d/dt f(r(t)) is the limit for h → 0 of [f(r(t + h)) − f(r(t))]/h. Writing this difference as a telescoping sum in which one coordinate of r changes at a time, the one-dimensional chain rule applied in each coordinate gives in the limit the sum Σᵢ f_{x_i}(r(t)) r_i'(t) = ∇f(r(t)) · r'(t).

Proof of the general case: Let h = f∘g. The (i, j) entry of the Jacobian matrix dh(x) is ∂/∂x_j (f∘g)_i(x). The case of the (i, j) entry reduces, with the scalar function f_i and the curve t ↦ g(x + t e_j), to the case when g is a curve and f is a scalar function. This is the case we have proven already. ◻

16.3 EXAMPLES

Example 1. Assume a ladybug walks on a circle r(t) = [cos(t), sin(t)]^T and T(x, y) is the temperature at the position (x, y); then d/dt T(r(t)) is the rate of change of the temperature the bug experiences. By Theorem 2, this is the dot product of the gradient of T and the velocity: d/dt T(r(t)) = ∇T(r(t)) · r'(t), where the velocity is r'(t) = [−sin(t), cos(t)]^T.
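Example 1 can be verified numerically: compute ∇T(r(t)) · r'(t) from the formula and compare with a finite-difference approximation of d/dt T(r(t)). The temperature field T(x, y) = x² + 2y² used here is a hypothetical stand-in, since the lecture's concrete T is not reproduced above.

```python
import numpy as np

# Hypothetical temperature field and its gradient, computed by hand.
T = lambda x, y: x**2 + 2 * y**2
grad_T = lambda x, y: np.array([2 * x, 4 * y])

r  = lambda t: np.array([np.cos(t), np.sin(t)])    # the bug's circular path
rp = lambda t: np.array([-np.sin(t), np.cos(t)])   # its velocity r'(t)

t = 1.2
chain = grad_T(*r(t)) @ rp(t)                      # ∇T(r(t)) · r'(t)
h = 1e-6                                           # finite-difference check
direct = (T(*r(t + h)) - T(*r(t - h))) / (2 * h)   # d/dt T(r(t)) directly
print(abs(chain - direct))                         # close to 0
```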

Figure 2. If T is a height, the rate of change d/dt T(r(t)) is the gain of height the bug climbs per unit time. It depends on how fast the bug walks and in which direction it walks relative to the gradient.

16.4 ILLUSTRATIONS

16.4.1 Power from Potential: A Chain Rule Connection

The case in which f is a potential energy is extremely important. The chain rule tells us that the rate of change d/dt f(r(t)) of the potential energy at the position r(t) is the dot product ∇f(r(t)) · r'(t) of the force F = ∇f at the point and the velocity r'(t) with which we move. The right-hand side is power: force times velocity. We will use this later in the fundamental theorem of line integrals.

16.4.2 Chaos via Derivatives: Lyapunov Exponents and Entropy in Iterated Maps

If T: ℝ² → ℝ², then T∘T is again a map from ℝ² to ℝ². We can also iterate a map n times and form Tⁿ = T∘T∘⋯∘T. The derivative d(Tⁿ)(x) = dT(Tⁿ⁻¹(x)) ⋯ dT(T(x)) dT(x) is by the chain rule the product of Jacobian matrices along the orbit. The number λ(x) = limₙ (1/n) log ‖d(Tⁿ)(x)‖ is called the Lyapunov exponent of the map T at the point x. It measures the amount of chaos, the "sensitive dependence on initial conditions" of T. These numbers are hard to estimate mathematically. Already for simple examples like the Chirikov map T(x, y) = (x + y + c sin(x), y + c sin(x)) one can measure positive entropy, the average of the Lyapunov exponent over the phase space. A conjecture of Sinai tells that the entropy of the map is positive for large c. Measurements show that this entropy grows like log(c/2). The conjecture is still open.2

16.4.3 Hamilton’s Equations and Energy Conservation

If H(x, y) is a function called the Hamiltonian and x' = H_y(x, y), y' = −H_x(x, y), then d/dt H(x(t), y(t)) = H_x x' + H_y y' = H_x H_y − H_y H_x = 0. This can be interpreted as energy conservation. We see that a Hamiltonian differential equation always preserves the energy H. For the pendulum, H(x, y) = y²/2 − cos(x), we have x' = y, y' = −sin(x), or x'' = −sin(x).
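Energy conservation for the pendulum can be seen numerically: integrate x' = y, y' = −sin(x) with small Runge-Kutta steps and watch H(x, y) = y²/2 − cos(x) stay essentially constant. The step size and initial condition below are arbitrary choices for illustration.

```python
import numpy as np

# Pendulum Hamiltonian H(x, y) = y^2/2 - cos(x).
H = lambda x, y: y**2 / 2 - np.cos(x)

def rk4_step(x, y, dt):
    """One classical Runge-Kutta step for x' = y, y' = -sin(x)."""
    f = lambda x, y: (y, -np.sin(x))
    k1 = f(x, y)
    k2 = f(x + dt/2 * k1[0], y + dt/2 * k1[1])
    k3 = f(x + dt/2 * k2[0], y + dt/2 * k2[1])
    k4 = f(x + dt * k3[0], y + dt * k3[1])
    return (x + dt/6 * (k1[0] + 2*k2[0] + 2*k3[0] + k4[0]),
            y + dt/6 * (k1[1] + 2*k2[1] + 2*k3[1] + k4[1]))

x, y = 1.0, 0.5
E0 = H(x, y)
for _ in range(10000):            # integrate up to time t = 10
    x, y = rk4_step(x, y, 0.001)
print(abs(H(x, y) - E0))          # tiny: the energy is conserved
```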

Figure 3. The map T is a Hénon map. We see some orbits. The map on the right appeared in the first hourly. The torus is filled with a blue "stochastic sea" containing red "stable islands".

16.4.4 The Chain Rule Unlocks Inverses

The chain rule is useful to get derivatives of inverse functions. From f(f⁻¹(x)) = x, the chain rule gives f'(f⁻¹(x)) (f⁻¹)'(x) = 1, which then gives (f⁻¹)'(x) = 1/f'(f⁻¹(x)).
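The inverse rule can be checked numerically on the pair f = exp, f⁻¹ = log, where the formula predicts (f⁻¹)'(x) = 1/exp(log x) = 1/x. A minimal sketch:

```python
import math

# Check (f^-1)'(x) = 1/f'(f^-1(x)) for f = exp, f^-1 = log.
f_inverse = math.log
f_prime = math.exp          # f = exp is its own derivative

x = 2.5
h = 1e-6
numeric = (f_inverse(x + h) - f_inverse(x - h)) / (2 * h)  # (f^-1)'(x) directly
formula = 1 / f_prime(f_inverse(x))                        # chain-rule formula: 1/x
print(abs(numeric - formula))                              # close to 0
```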

16.4.5 Implicit Differentiation: Finding the Mystery Slope

Assume f(x, y) = c is a curve. We can not always solve for y. Still, we can assume y = g(x) with f(x, g(x)) = c near a point on the curve. Differentiation using the chain rule gives f_x(x, g(x)) + f_y(x, g(x)) g'(x) = 0. Therefore g'(x) = −f_x(x, g(x))/f_y(x, g(x)). If (x₀, y₀) is a point on the curve with f_y(x₀, y₀) ≠ 0, the slope of the curve there is g'(x₀) = −f_x(x₀, y₀)/f_y(x₀, y₀). This is called implicit differentiation. We can compute with it the derivative of a function g which is not explicitly known.
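To see implicit differentiation at work, take a hypothetical curve where we happen to be able to solve for y, so that the formula g'(x) = −f_x/f_y can be checked against the explicit slope: f(x, y) = x² + y² = 2 through the point (1, 1).

```python
import math

# Hypothetical curve f(x, y) = x^2 + y^2 = 2; partial derivatives by hand.
fx = lambda x, y: 2 * x
fy = lambda x, y: 2 * y
g = lambda x: math.sqrt(2 - x**2)   # the explicit branch through (1, 1)

x0, y0 = 1.0, 1.0
implicit = -fx(x0, y0) / fy(x0, y0)            # -f_x/f_y = -1 at (1, 1)
h = 1e-6
explicit = (g(x0 + h) - g(x0 - h)) / (2 * h)   # slope of the explicit branch
print(implicit, explicit)                      # both close to -1
```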

16.4.6 Guaranteed Solutions: The Implicit Function Theorem

The implicit function theorem assures that a differentiable implicit function g exists near a root (x₀, y₀) of a differentiable function f.

Theorem 3. If f_y(x₀, y₀) ≠ 0, there exists ε > 0 and a function g with g(x₀) = y₀ and f(x, g(x)) = f(x₀, y₀) for |x − x₀| < ε.

Proof. Let ε > 0 be so small that for fixed x with |x − x₀| < ε, the function y ↦ f(x, y) has the property f_y ≠ 0 in [y₀ − ε, y₀ + ε] and takes values below and above c = f(x₀, y₀) at the endpoints. The intermediate value theorem for y ↦ f(x, y) now assures a unique root g(x) of f(x, y) = c near y₀. The chain rule formula g'(x) = −f_x/f_y above then assures that for h → 0, the differential quotient [g(x + h) − g(x)]/h written down for g has a limit. ◻

P.S. We can get the root of y ↦ f(x, y) by applying Newton steps y → y − f(x, y)/f_y(x, y). Taylor expansion (seen in the next class) shows the error is squared in every step. The Newton step T(x) = x − df(x)⁻¹ f(x) works also in arbitrary dimensions. One can prove the implicit function theorem by just establishing that the Newton map T is a contraction and then use the Banach fixed point theorem to get a fixed point of T, which is a root of f.
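The quadratic convergence of the Newton step can be watched directly on a one-dimensional example, here the hypothetical choice f(x) = x² − 2 with root √2: each step roughly squares the error.

```python
import math

# Newton step x -> x - f(x)/f'(x) for f(x) = x^2 - 2.
f = lambda x: x * x - 2
df = lambda x: 2 * x

x = 2.0
root = math.sqrt(2)
for _ in range(6):
    x = x - f(x) / df(x)
    print(abs(x - root))   # errors shrink roughly like e, e^2, e^4, ...
```

After a handful of steps the iterate agrees with √2 to machine precision, which is the "error is squared in every step" behavior from the Taylor argument.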

Figure 4. The Newton step.
Figure 5. If we apply the map T again and again and plot the points, we get an orbit. Such simple dynamical systems are largely not understood. Which points do not escape to infinity? What is the boundary of this set? Proving that there are regions which stay bounded is hard and needs "hard implicit function theorems". The Newton method allows one to get a grip on proving this, where the Newton step is applied on spaces of functions. Some of the hardest analysis which humans have invented for tackling mathematical problems comes into play in this seemingly simple map T.

Units 16 and 17 are together taught on Wednesday. Homework is all in unit 17.


  1. Etymology tells that the symbol ∇ is inspired by an Egyptian or Phoenician harp.↩︎
  2. To generate orbits, see http://www.math.harvard.edu/~knill/technology/chirikov/.↩︎