Chain Rule

Chain rule over a curve

Suppose $f(x,y)$ is a real-valued function on the plane, and $\mathbf{r}(t)=(x(t),y(t))$ is a parametric curve. There is a recipe to compute the derivative of the composite function $(f\circ\mathbf{r})(t)=f(x(t),y(t))$:

$$\frac{d}{dt}f(x,y)=\frac{\partial f}{\partial x}\frac{dx}{dt}+\frac{\partial f}{\partial y}\frac{dy}{dt}.$$

Example

Particle moving through temperature gradient

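Problem: A particle traverses a parametric curve in the plane given by $\mathbf{r}(t)=(\cos(t),e^{t})$. The ambient temperature is a function of distance from the $y$-axis: $T(x,y)=\frac{1}{x^{2}+3}$. How fast is the temperature changing at $t=4$?

Solution: We compute $\frac{\partial T}{\partial x}=\frac{-2x}{(x^{2}+3)^{2}}$ and $\frac{\partial T}{\partial y}=0$, while $x'(t)=-\sin(t)$. Since $\frac{\partial T}{\partial y}=0$, the $y'(t)$ term drops out. Then:

$$\frac{dT}{dt}=\frac{2\cos(t)\sin(t)}{(\cos^{2}(t)+3)^{2}},\qquad \left.\frac{dT}{dt}\right|_{t=4}=\frac{2\cos(4)\sin(4)}{(\cos^{2}(4)+3)^{2}}\approx 0.084.$$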
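The value at $t=4$ can be sanity-checked numerically. The following is a minimal sketch, not part of the original notes: the function names are made up for illustration, and the central-difference step `h` is an arbitrary choice.

```python
import math

def temperature(x, y):
    """Ambient temperature T(x, y) = 1 / (x^2 + 3) from the example."""
    return 1.0 / (x * x + 3.0)

def curve(t):
    """Particle position r(t) = (cos t, e^t)."""
    return math.cos(t), math.exp(t)

def dT_dt_chain_rule(t):
    """dT/dt from the chain rule: T_x * x'(t) + T_y * y'(t)."""
    x, _ = curve(t)
    T_x = -2.0 * x / (x * x + 3.0) ** 2
    T_y = 0.0                      # T does not depend on y
    dx_dt = -math.sin(t)
    dy_dt = math.exp(t)
    return T_x * dx_dt + T_y * dy_dt

def dT_dt_numeric(t, h=1e-6):
    """Central-difference estimate of d/dt T(r(t)) (h is an assumed step size)."""
    return (temperature(*curve(t + h)) - temperature(*curve(t - h))) / (2.0 * h)

print(round(dT_dt_chain_rule(4.0), 3))  # 0.084
print(round(dT_dt_numeric(4.0), 3))     # 0.084
```

Both routes should agree to several decimal places, which is a quick way to catch a sign error in the partials.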

Derivation of chain rule over a curve

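Let's suppose that some small change $\Delta t$ produces changes $\Delta x$ and $\Delta y$ via the parametrization $\mathbf{r}(t)$. By differentiability of $f$, we have:

$$\Delta f=\frac{\partial f}{\partial x}\Delta x+\frac{\partial f}{\partial y}\Delta y+\Delta x\cdot\varepsilon_{1}+\Delta y\cdot\varepsilon_{2},$$

where $\varepsilon_{1},\varepsilon_{2}\to 0$ as $(\Delta x,\Delta y)\to(0,0)$.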

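Divide both sides by $\Delta t$, and take the limit as $\Delta t\to 0$. Notice that $\varepsilon_{1}\to 0$ and $\varepsilon_{2}\to 0$ because $\Delta t\to 0$ implies $\Delta x,\Delta y\to 0$. It follows that we have:

$$\frac{\Delta f}{\Delta t}\to\frac{df}{dt}=\frac{\partial f}{\partial x}\frac{dx}{dt}+\frac{\partial f}{\partial y}\frac{dy}{dt}+\frac{dx}{dt}\cdot 0+\frac{dy}{dt}\cdot 0.$$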

So the sum of products in this chain rule comes from the linear approximation formula, which in turn comes from the tangency of the trace curves.

Chain rule over several parameters

The chain rule over a curve can be used to derive a more general chain rule for the case when the input variables are functions of several parameters. The formulas can be remembered as "sum of all products" (considering all products that make sense).

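Suppose $f(x,y)$ is a function, and suppose $x(s,t)$ and $y(s,t)$ are both functions of $s$ and $t$. We can "consider" $f$ itself to be a function of $s$ and $t$ via the composition. Then:

$$\frac{\partial f}{\partial s}=\frac{\partial f}{\partial x}\frac{\partial x}{\partial s}+\frac{\partial f}{\partial y}\frac{\partial y}{\partial s},\qquad \frac{\partial f}{\partial t}=\frac{\partial f}{\partial x}\frac{\partial x}{\partial t}+\frac{\partial f}{\partial y}\frac{\partial y}{\partial t}.$$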

These formulas may be derived immediately from the formula for the chain rule over a curve by considering the trace curves: let t=b remain fixed for the first equation, and let s=a remain fixed for the second equation.
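To see the two-parameter formulas in action, here is a small sketch using a made-up example that does not appear in the notes: $f(x,y)=x^{2}y$ with $x=st$ and $y=s+t^{2}$. It compares the "sum of products" with differentiating the substituted expression $s^{3}t^{2}+s^{2}t^{4}$ directly.

```python
def x_of(s, t):
    """Hypothetical inner function x(s, t) = s * t."""
    return s * t

def y_of(s, t):
    """Hypothetical inner function y(s, t) = s + t^2."""
    return s + t * t

def partials_by_chain_rule(s, t):
    """Partials of f(x, y) = x^2 * y with respect to s and t, as a sum of products."""
    x, y = x_of(s, t), y_of(s, t)
    f_x, f_y = 2.0 * x * y, x * x      # partials of f with respect to x and y
    x_s, x_t = t, s                    # partials of x(s, t)
    y_s, y_t = 1.0, 2.0 * t            # partials of y(s, t)
    df_ds = f_x * x_s + f_y * y_s
    df_dt = f_x * x_t + f_y * y_t
    return df_ds, df_dt

def partials_by_substitution(s, t):
    """Same partials from the substituted form f = s^3 t^2 + s^2 t^4."""
    df_ds = 3.0 * s**2 * t**2 + 2.0 * s * t**4
    df_dt = 2.0 * s**3 * t + 4.0 * s**2 * t**3
    return df_ds, df_dt

print(partials_by_chain_rule(1.0, 2.0))    # (44.0, 36.0)
print(partials_by_substitution(1.0, 2.0))  # (44.0, 36.0)
```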

Exercise 10A-01

Chain rule, several parameters

Let $f(x,y,z)=xy+z$ and $x=s^{2}$, $y=st$, and $z=st^{2}$.

  • (a) Generalize the chain rule above to the case of 3 variables and compute the partials $\frac{\partial f}{\partial s}$ and $\frac{\partial f}{\partial t}$.
  • (b) Write out f in terms of s and t and compute the same partials directly.

Example

Polar partials and chain rule

Problem: Suppose $f(x,y)$ is a real-valued function of points in the plane. Let $r$ and $\theta$ be the standard polar coordinates on the plane.

  • (a) Write $\frac{\partial f}{\partial\theta}$ in terms of $\frac{\partial f}{\partial x}$ and $\frac{\partial f}{\partial y}$ and $x$ and $y$.
  • (b) Suppose $f=x^{2}y^{3}$. Find $f_{\theta}$ at $(x,y)=(1,1)$.

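Solution: (a) First compute $\frac{\partial x}{\partial\theta}=-r\sin(\theta)$ and $\frac{\partial y}{\partial\theta}=r\cos(\theta)$. Then:

$$\frac{\partial f}{\partial\theta}=\frac{\partial f}{\partial x}\frac{\partial x}{\partial\theta}+\frac{\partial f}{\partial y}\frac{\partial y}{\partial\theta}=-r\sin(\theta)\frac{\partial f}{\partial x}+r\cos(\theta)\frac{\partial f}{\partial y}=-y\frac{\partial f}{\partial x}+x\frac{\partial f}{\partial y}.$$

(b) Compute $f_x=2xy^{3}$, $f_y=3x^{2}y^{2}$. So $f_{\theta}=-2xy^{4}+3x^{3}y^{2}$ and $f_{\theta}(1,1)=-2+3=1$.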
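The answer to part (b) can be checked by differentiating with respect to the angle numerically. This is only a sketch: the helper names and the step size `h` are assumptions introduced here.

```python
import math

def f(x, y):
    """The function from part (b): f = x^2 * y^3."""
    return x**2 * y**3

def f_theta_formula(x, y):
    """f_theta = -y * f_x + x * f_y, the formula from part (a)."""
    f_x = 2.0 * x * y**3
    f_y = 3.0 * x**2 * y**2
    return -y * f_x + x * f_y

def f_theta_numeric(x, y, h=1e-6):
    """Central difference in theta with r held fixed (h is an assumed step size)."""
    r = math.hypot(x, y)
    theta = math.atan2(y, x)

    def g(angle):
        # f rewritten in polar form at fixed radius r
        return f(r * math.cos(angle), r * math.sin(angle))

    return (g(theta + h) - g(theta - h)) / (2.0 * h)

print(f_theta_formula(1.0, 1.0))  # 1.0
print(abs(f_theta_numeric(1.0, 1.0) - 1.0) < 1e-6)  # True
```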

Exercise 10A-02

Polar partials

Write $\frac{\partial f}{\partial r}$ in terms of $\frac{\partial f}{\partial x}$ and $\frac{\partial f}{\partial y}$ and $x$ and $y$.

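It is easy to generalize the chain rule to an arbitrary number of variables and parameters. Formally, the rule is:

$$\frac{\partial f}{\partial t_{i}}=\frac{\partial f}{\partial x_{1}}\frac{\partial x_{1}}{\partial t_{i}}+\frac{\partial f}{\partial x_{2}}\frac{\partial x_{2}}{\partial t_{i}}+\cdots+\frac{\partial f}{\partial x_{n}}\frac{\partial x_{n}}{\partial t_{i}},$$

where $f(x_{1},x_{2},\dots,x_{n})$ is a function of many variables, and each $x_{\ell}(t_{1},t_{2},\dots,t_{k})$ is a function of many parameters.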
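The general formula translates into a short loop over the intermediate variables. The sketch below is illustrative only: it estimates every partial by central differences (an assumption made here so that the routine works for any smooth inputs), and the test functions are made up rather than taken from the notes.

```python
def partial(func, point, index, h=1e-6):
    """Central-difference partial derivative of func at point w.r.t. coordinate index."""
    shifted_up = list(point)
    shifted_down = list(point)
    shifted_up[index] += h
    shifted_down[index] -= h
    return (func(*shifted_up) - func(*shifted_down)) / (2.0 * h)

def chain_rule_partial(f, xs, params, i):
    """Sum over j of (df/dx_j) * (dx_j/dt_i), evaluated at the given parameters."""
    x_values = [x(*params) for x in xs]
    return sum(
        partial(f, x_values, j) * partial(xs[j], params, i)
        for j in range(len(xs))
    )

# Hypothetical example: f = x1 * x2 * x3 with three inner functions of (t1, t2).
def f(x1, x2, x3):
    return x1 * x2 * x3

xs = [
    lambda t1, t2: t1 + t2,
    lambda t1, t2: t1 * t2,
    lambda t1, t2: t1 - t2,
]
params = (2.0, 1.0)

def composite(t1, t2):
    """f with the inner functions substituted in."""
    return f(*(x(t1, t2) for x in xs))

via_chain_rule = chain_rule_partial(f, xs, params, 0)
direct = partial(composite, params, 0)
print(round(via_chain_rule, 4), round(direct, 4))  # 11.0 11.0
```

Here the composite is $t_{1}^{3}t_{2}-t_{1}t_{2}^{3}$, whose partial in $t_{1}$ at $(2,1)$ is $3\cdot 4\cdot 1-1=11$, matching both printed values.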

To remember the formula, it sometimes helps to draw a tree. The summation has one term for each path from the output variable $z=f(x,y)$ down to the relevant input variable.

Exercise 10A-03

Chain rule: 3 variables, 3 parameters

Find $\frac{\partial u}{\partial s}$ at the point $(r,s,t)=(2,1,0)$ given that:

$$u=x^{4}y+y^{2}z^{3},\qquad x=rse^{t},\qquad y=rs^{2}e^{-t},\qquad z=r^{2}s\sin(t).$$

Directional derivative

The partial derivative $\frac{\partial f}{\partial x}$ gives the rate of change of $z=f(x,y)$ with respect to the variable $x$.

Think of $x$ as a coordinate that moves the input to the function along a line $(x,b)$. The partial $f_x$ gives the rate of change of $f$ with respect to this coordinate.

We can move along other lines, at other speeds. Let $\mathbf{v}$ be any vector in the $xy$-plane. Consider the line $L_{\mathbf{v}}(t)=(a,b)+t\mathbf{v}$ that starts at $(a,b)$ and moves along $\mathbf{v}$ using the parameter $t$. We can take the derivative of $f$ with respect to $t$ at $t=0$, and the result is called the derivative of $f$ with respect to the vector $\mathbf{v}$. Here is the defining formula:

$$D_{\mathbf{v}}f(a,b)=\left.\frac{d}{dt}f(L_{\mathbf{v}}(t))\right|_{t=0}=\lim_{t\to 0}\frac{f(a+th,b+tk)-f(a,b)}{t},$$

where we write $\mathbf{v}=(h,k)$ for the components of $\mathbf{v}$.

Observe that the directional derivatives $D_{\mathbf{e_i}}(f)$ and $D_{\mathbf{e_j}}(f)$ with respect to the standard unit vectors $\mathbf{e_i}=(1,0)$ and $\mathbf{e_j}=(0,1)$ are identical to the partial derivatives $f_x$ and $f_y$. (Plug $\mathbf{e_i}=(1,0)$ into the limit formula above, for example.)

Exercise 10A-04

Linearities of directional derivative

  • (a) Show that $D_{\mathbf{v}}(\lambda f+g)=\lambda\cdot D_{\mathbf{v}}f+D_{\mathbf{v}}g$.
  • (b) Show that $D_{\lambda\cdot\mathbf{v}}f=\lambda\cdot D_{\mathbf{v}}f$.

Lengthening the direction vector

Notice that the derivative of $f$ with respect to a vector $2\mathbf{v}$ is twice the derivative of $f$ with respect to $\mathbf{v}$. This is because the line $L_{2\mathbf{v}}$ is the same line as $L_{\mathbf{v}}$ but traversed twice as quickly.

Calculating directional derivatives

We can use the two partial derivatives $f_x$ and $f_y$ to calculate $D_{\mathbf{v}}f$ for vectors $\mathbf{v}$ other than the coordinate basis vectors $\mathbf{e_i}$ and $\mathbf{e_j}$. Recall $\mathbf{v}=(h,k)$. Use the chain rule instead of the limit definition to compute:

$$D_{\mathbf{v}}f(a,b)=\left.\frac{d}{dt}f(L_{\mathbf{v}}(t))\right|_{t=0}=f_x(a,b)\cdot x'(0)+f_y(a,b)\cdot y'(0)=f_x(a,b)\cdot h+f_y(a,b)\cdot k.$$

Example

Derivative with respect to a vector

Problem: Find the derivative of $f(x,y)=xe^{y}$ at the point $(2,-1)$ with respect to the vector $\mathbf{v}=(2,3)$.

Solution: The partials at $(2,-1)$ are given by $f_x=e^{y}=e^{-1}$ and $f_y=xe^{y}=2e^{-1}$. So we calculate $D_{\mathbf{v}}(f)(2,-1)=e^{-1}\cdot 2+2e^{-1}\cdot 3\approx 2.94$.
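The chain-rule shortcut and the limit definition can be compared on this example. The following is a rough sketch with invented helper names; it uses a symmetric difference quotient with a small fixed `t` in place of the true limit, which is an approximation chosen here for accuracy.

```python
import math

def f(x, y):
    """The example function f(x, y) = x * e^y."""
    return x * math.exp(y)

def grad_f(x, y):
    """Partials (f_x, f_y) = (e^y, x * e^y)."""
    return math.exp(y), x * math.exp(y)

def derivative_along(point, v):
    """D_v f via the chain-rule formula f_x * h + f_y * k."""
    (a, b), (h, k) = point, v
    f_x, f_y = grad_f(a, b)
    return f_x * h + f_y * k

def difference_quotient(point, v, t=1e-6):
    """Symmetric difference quotient along the line (a, b) + t * v (t is assumed small)."""
    (a, b), (h, k) = point, v
    return (f(a + t * h, b + t * k) - f(a - t * h, b - t * k)) / (2.0 * t)

print(round(derivative_along((2.0, -1.0), (2.0, 3.0)), 2))     # 2.94
print(round(difference_quotient((2.0, -1.0), (2.0, 3.0)), 2))  # 2.94
```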

Unit vectors?

Many authors treat directional derivatives as if they only make sense with respect to unit vectors: $D_{\mathbf{u}}(f)$. If you are interested in the analogue of partial derivative for other directions, you will want to take unit vectors. It may be reasonable to reserve the word 'directional' for this case, but it is certainly beneficial to allow the notation $D_{\mathbf{v}}$ to extend to vectors $\mathbf{v}$ that do not have unit length.

Example

Directional derivative

Problem: Find the 'directional' derivative of $f(x,y)=xe^{y}$ at the point $(2,-1)$ in the direction of the vector $\mathbf{v}=(2,3)$.

Solution: The key phrase "in the direction of" suggests that we need the unit vector in the direction of $\mathbf{v}$. The norm of $\mathbf{v}$ is $\sqrt{13}$, so we want $D_{\frac{1}{\sqrt{13}}\mathbf{v}}(f)$. By the linearity rule with $\lambda=\frac{1}{\sqrt{13}}$, we can just multiply the previous result by $\frac{1}{\sqrt{13}}$, obtaining approximately $0.82$.
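The normalization step is easy to mechanize. A minimal sketch, assuming the partials $(e^{-1},2e^{-1})$ at $(2,-1)$ from the example; `unit_vector` is a helper name introduced here.

```python
import math

def unit_vector(v):
    """Scale v to length 1 by dividing by its norm."""
    norm = math.hypot(*v)
    return tuple(component / norm for component in v)

# Partials (f_x, f_y) of f = x * e^y at the point (2, -1).
gradient = (math.exp(-1.0), 2.0 * math.exp(-1.0))

u = unit_vector((2.0, 3.0))
directional = sum(g * c for g, c in zip(gradient, u))
print(round(directional, 2))  # 0.82
```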

Exercise 10A-05

Directional derivative

Suppose $f(x,y,z)=10+yz^{2}+x^{2}z-xy^{2}$. Find the rate of change of $f$ at the point $(1,2,1)$ in the direction of the vector $(0,1,1)$.

Vector fields

A real-valued function $f(x,y)$ may be called a scalar field. The word 'field' indicates a mathematical structure that assigns a concrete object of a certain fixed type to every point in space. A function $f$ may be interpreted as assigning a scalar value, the output $f(x,y)$, to each point in space.

A vector field $\mathbf{F}$ assigns a vector value to each point of space. The data of a vector field can be conveyed as a collection of coordinate functions, where each coordinate function takes a point $p$ in space as input:

$$\mathbf{F}(x,y)=(F_{1}(x,y),F_{2}(x,y)).$$

(In 3D, the vector field has 3 coordinate functions.) We can also represent the same data as:

$$\mathbf{F}(p)=F_{1}(p)\,\mathbf{e_i}+F_{2}(p)\,\mathbf{e_j}.$$

Vector fields in 3D are harder to visualize. Vector fields are often used to represent a fluid flow (the vector gives the flow velocity) or a force field (the vector gives the force). Both types of data require the assignment of a vector to each point in space.
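In code, a 2D vector field is simply a function that returns a pair of components for each input point. The sketch below uses a made-up rotational field $\mathbf{F}(x,y)=(-y,x)$, which is not from the notes, and samples it on a small grid.

```python
def rotation_field(x, y):
    """Hypothetical example field F(x, y) = (F1, F2) = (-y, x)."""
    return (-y, x)

# Sample the field on a 3-by-3 grid of points, top row first.
grid = [(x, y) for y in (1, 0, -1) for x in (-1, 0, 1)]
samples = {point: rotation_field(*point) for point in grid}

for point, vector in samples.items():
    print(point, "->", vector)
```

Each printed pair is one arrow of the field: the vector attached at that point.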

Gradient

Given a function $f(x,y)$, a natural vector field may be formed using its partial derivative functions. This is called the gradient of $f$, and $f$ is called the potential function of this vector field:

$$\nabla f=\left(\frac{\partial f}{\partial x},\frac{\partial f}{\partial y}\right)=(f_x,f_y).$$

In 3D, the gradient of $f(x,y,z)$ has components $\nabla f=(f_x,f_y,f_z)$.

Recall that the partial $f_x$ is large when $f$ increases steeply in the $+x$ direction, and it is large and negative when $f$ increases steeply in the $-x$ direction. These facts generalize with the combined components: $\nabla f$ points in the direction of steepest increase of $f$. This can be understood by recalculating the formula for $D_{\mathbf{u}}(f)$:

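Gradient and directional derivative

$$D_{\mathbf{u}}(f)=f_x\cdot h+f_y\cdot k=\nabla f\cdot\mathbf{u},$$

where $\mathbf{u}=(h,k)$. If we consider unit vectors $\mathbf{u}$ pointing in various directions, we see that the greatest value of $D_{\mathbf{u}}(f)$ occurs when $\mathbf{u}$ aligns with $\nabla f$: since $\nabla f\cdot\mathbf{u}=|\nabla f|\cos\phi$, where $\phi$ is the angle between $\nabla f$ and $\mathbf{u}$, the dot product is largest when $\phi=0$. This means that $\nabla f$ points in the direction in which $D_{\mathbf{u}}(f)$ is largest, i.e. the direction of steepest incline.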
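One way to see this concretely is to scan unit vectors around the circle and record where the dot product peaks. This is an illustrative sketch only: it reuses the gradient $(e^{-1},2e^{-1})$ of $f=xe^{y}$ at $(2,-1)$ from the earlier example, and the 0.1 degree grid is an arbitrary choice.

```python
import math

# Gradient of f = x * e^y at (2, -1), taken from the earlier worked example.
gradient = (math.exp(-1.0), 2.0 * math.exp(-1.0))

def rate_in_direction(angle_deg):
    """D_u f for the unit vector u = (cos a, sin a), computed as grad f . u."""
    a = math.radians(angle_deg)
    return gradient[0] * math.cos(a) + gradient[1] * math.sin(a)

# Try every direction on a 0.1 degree grid and keep the steepest one.
angles = [0.1 * i for i in range(3600)]
best_angle = max(angles, key=rate_in_direction)

# Direction of the gradient itself, as an angle from the +x-axis.
gradient_angle = math.degrees(math.atan2(gradient[1], gradient[0]))

print(round(best_angle, 1), round(gradient_angle, 1))  # 63.4 63.4
```

The largest rate found by the scan is also close to $|\nabla f|$, as the $\cos\phi$ argument predicts.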

Exercise 10B-01

Compute gradient

For each of the following functions, compute the gradient:

  • (a) $f(x,y)=x^{2}-3xy$
  • (b) $f(x,y)=\cos(y-x)$
  • (c) $g(x,y,z,w)=xze^{yw}$.

Exercise 10B-02

Derivative using gradient

Let $g(x,y,z)=z^{2}-xy^{2}$. Find $D_{\mathbf{v}}(g)$ at the point $(2,1,3)$, where $\mathbf{v}=(-1,2,2)$, by first computing the gradient and evaluating a dot product.

The level curves of a function $f(x,y)$ are the sets of points satisfying $f(x,y)=c$ for various $c\in\mathbb{R}$. The gradient vector is perpendicular to the level curves:

Proposition: Gradient $\perp$ level curves

The level curves of $f$ are perpendicular to the gradient vector.

Proof: Let $\mathbf{r}(t)$ parametrize the level curve $f(x,y)=c$, so $f(\mathbf{r}(t))=c$ for all $t$. Then:

$$0=\frac{d}{dt}c=\frac{d}{dt}f(\mathbf{r}(t))=\nabla f\cdot\mathbf{r}'(t),$$

so $\nabla f\perp\mathbf{r}'(t)$.
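The proposition can be spot-checked on a simple case. The sketch below assumes the made-up function $f(x,y)=x^{2}+y^{2}$, whose level curve $f=4$ is the circle of radius 2; neither the function nor the helper names come from the notes.

```python
import math

def grad_f(x, y):
    """Gradient of the hypothetical f(x, y) = x^2 + y^2."""
    return 2.0 * x, 2.0 * y

def level_curve(t, radius=2.0):
    """Parametrization r(t) of the level curve f = radius^2."""
    return radius * math.cos(t), radius * math.sin(t)

def velocity(t, radius=2.0):
    """Tangent vector r'(t) of the parametrization above."""
    return -radius * math.sin(t), radius * math.cos(t)

def dot(a, b):
    return a[0] * b[0] + a[1] * b[1]

# grad f . r'(t) should vanish at every sampled parameter value.
dots = [dot(grad_f(*level_curve(t)), velocity(t)) for t in (0.0, 0.7, 2.0, 4.5)]
print(all(abs(d) < 1e-12 for d in dots))  # True
```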

This result can be generalized to surfaces. The level surfaces of a function $f$ are given by $f(x,y,z)=k$. Any curve $\mathbf{r}(t)$ in such a level surface satisfies $f(\mathbf{r}(t))=k$, and therefore $0=\frac{d}{dt}f(\mathbf{r}(t))=\nabla f\cdot\mathbf{r}'(t)$. This means the tangent plane to a level surface $f(x,y,z)=k$ has normal vector given by $\mathbf{n}=\nabla f$ at any given point.

Example

Tangent plane to a hyperboloid

Problem: Find the tangent plane at $(2,1,3)$ to the one-sheeted hyperboloid given by $4x^{2}+9y^{2}-z^{2}=16$.

Solution: Consider the function $f(x,y,z)=4x^{2}+9y^{2}-z^{2}$, so the hyperboloid is the level surface $f(x,y,z)=16$. The normal vector is then given by the gradient $\nabla f=(8x,18y,-2z)$. At $(2,1,3)$ this takes the value $(16,18,-6)$. The tangent plane is perpendicular to this vector and passes through $(2,1,3)$:

$$\nabla f\cdot\big((x,y,z)-(2,1,3)\big)=0,\qquad 16(x-2)+18(y-1)-6(z-3)=0.$$
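As an optional extra step that the solution above does not require, the plane equation can be expanded and divided by 2 to reach a standard form:

```latex
\begin{aligned}
16(x-2)+18(y-1)-6(z-3) &= 0 \\
16x+18y-6z &= 32 \\
8x+9y-3z &= 16
\end{aligned}
```

Substituting $(2,1,3)$ gives $16+9-9=16$, so the point does lie on this plane.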

Exercise 10B-03

Direction with given slope

You are hiking on a mountain with altitude in meters given by the function:

$$f(x,y)=2500+100(x+y^{2})e^{-0.3y^{2}}.$$

You are at the point $(-1,-1)$. Which two directions could you head (as an angle $\theta$ from the $+x$-axis) in order to ascend at a 20% grade?

If you follow the gradient vectors, you will travel along a path of steepest ascent (or of steepest descent, if you move against them). This is also the path that crosses level curves most quickly.

Problems due 29 Oct 2023, 9:00pm

Problem 10-01

Chain rule over a curve and directional derivative

Let $f(x,y)=x^{2}-3xy$. Let $\mathbf{r}(t)=(\cos(t)-1,\sin(t))$ parametrize a particle traversing a circle which passes through the origin heading north. Notice that it is a unit-speed parametrization.

  • (a) Find the derivative $\frac{d}{dt}f(\mathbf{r}(t))$ for $t=0$ when the particle is at the origin.
  • (b) Find the tangent vector to $\mathbf{r}$ at $t=0$ and call it $\mathbf{v}$. Write a parametric equation of the line $L(t)$ through $(0,0)$ at $t=0$ with velocity vector $\mathbf{v}$. Evaluate the derivative $\frac{d}{dt}f(L(t))$ at $t=0$.
  • (c) Compute $D_{\mathbf{v}}(f)$ at the origin.

Problem 10-02

Chain rule: spherical to Cartesian partials

Write the partials $\frac{\partial f}{\partial\rho}$, $\frac{\partial f}{\partial\theta}$, $\frac{\partial f}{\partial\varphi}$ in terms of Cartesian coordinates and partials. Here $\rho$, $\theta$, $\varphi$ are the spherical coordinate functions.

Problem 10-03

Chain rule: implicit partial differentiation

The set of points $(x,y,z)$ satisfying $x^{2}+y^{2}-2z^{2}+12x-8z-4=0$ is a hyperboloid of one sheet. Near the point $(1,1,1)$, let us consider this equation to determine $z$ as a function of $x$ and $y$.

  • (a) Take the partial derivative of both sides of the equation with respect to $x$. (Take $y$ to be constant, and $z(x,y)$ to be a function, so both sides are functions of $x$ and $y$.) Now solve for $\frac{\partial z}{\partial x}$.
  • (b) Solve the equation for $z$ in terms of $x$ and $y$ and find the partial derivative $\frac{\partial z}{\partial x}$ directly.
  • (c) Make a formal rule that generalizes the above calculation: Let $F(x,y,z)=0$ be an equation that is taken to determine $z$ as a function of $x$ and $y$. Write a formula for $\frac{\partial z}{\partial x}$ in terms of partials of $F$. Indicate where you use the chain rule!

★ Problem 10-04

Write up and submit: A-01, A-02, A-04, A-05; B-01, B-03, B-04.