Geometric Derivation of Euler-Lagrange Equation

Intuitive geometric derivation.


Given a functional S:

 S = \int_{x_1}^{x_2} f(y(x), y'(x), x)\, dx

The Euler-Lagrange equation says that a function y(x) which is a stationary point of the functional S obeys:

 \frac{\partial f}{\partial y} - \frac{d}{dx}\frac{\partial f}{\partial y'} = 0

where y' = \frac{dy}{dx}.

This result is often proven using integration by parts – but the equation expresses a local condition, and should be derivable using local reasoning.

We will explore an alternate derivation below.
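As a quick sanity check of the equation itself: for the arclength functional f = \sqrt{1+(y')^2} (shortest path between two points), \frac{\partial f}{\partial y} = 0, so Euler-Lagrange reduces to \frac{d}{dx}\left(\frac{y'}{\sqrt{1+(y')^2}}\right) = 0 – satisfied by straight lines. A small finite-difference sketch (test points and curves chosen arbitrarily):

```python
import math

# For the arclength functional f(y, y') = sqrt(1 + y'^2), df/dy = 0,
# so the Euler-Lagrange residual is just -d/dx (df/dy'), where
# df/dy' = y' / sqrt(1 + y'^2). We evaluate it with central differences.

def el_residual(yp, x, h=1e-5):
    g = lambda t: yp(t) / math.sqrt(1 + yp(t)**2)   # df/dy'
    return -(g(x + h) - g(x - h)) / (2 * h)

line = el_residual(lambda t: 2.0, 0.5)      # y = 2x + 1: constant slope
parab = el_residual(lambda t: 2 * t, 0.5)   # y = x^2: varying slope

assert abs(line) < 1e-9    # straight line is stationary
assert abs(parab) > 1e-3   # parabola is not
```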

Motivating Example

Physical systems in stable equilibrium will move to a configuration that locally minimizes their potential energy. For example, consider a chain draped over two pulleys (at height H, separated by distance 2L), with excess chain resting on the ground.


The chain will take a shape between the two pulleys that minimizes its gravitational potential energy. The space of possible shapes is interesting: if the chain is taut, it hangs high above the ground and has high energy. If the chain is very saggy, it pulls up lots of chain from the ground, and again has high energy. In between lies the optimal shape.

The potential energy of a given shape of the chain y(x) between the pulleys is:

 S = -\int y\, dm = -\int_{-L}^{+L} y(x) \sqrt{1+y'(x)^2}\; dx

We have integrated only along the section between the pulleys, because we can define U_g = 0 at ground level, and ignore additive constants (the energy of the chain dangling outside the pulleys) and constant factors (\mu, g).

Now we want to find the function y(x) that minimizes S (subject to the boundary conditions y(-L) = y(+L) = -H).

Geometric Derivation

For y to be a stationary point of S, small perturbations to the function y(x) must not alter the value of S (to first order). So consider an infinitesimal segment of y(x) (grey), discretized around x_2. We will perturb y(x) at x_2 by an amount \delta y (changing the curve from blue → red), and examine its effect \delta S on S.


 Recall: S = \int f(y(x), y'(x), x)\, dx

Perturbing this point affects f in two ways:

  1. It alters the value of y(x) at x_2.

  2. It alters the derivatives y'(x) around x_2.

The first contribution is simply:

 \delta S_1 = \frac{\partial f}{\partial y} \delta y

(Here and below we drop the common factor of \Delta x that each contribution to the integral carries; it does not affect where \delta S vanishes.)

The second contribution comes in two parts: the perturbation increases the derivative on the left, and decreases the derivative on the right. Based on the figure, before the perturbation we have:

 y'(x)\big|_{x_1} = \frac{\Delta y_1}{\Delta x} \qquad y'(x)\big|_{x_2} = \frac{\Delta y_2}{\Delta x}

And after perturbation:

 \hat{y}'(x)\big|_{x_1} = \frac{\Delta y_1 + \delta y}{\Delta x} \qquad \hat{y}'(x)\big|_{x_2} = \frac{\Delta y_2 - \delta y}{\Delta x}

So the change in derivatives is:

 \Delta y'(x)\big|_{x_1} = \frac{\delta y}{\Delta x} \qquad \Delta y'(x)\big|_{x_2} = \frac{-\delta y}{\Delta x}
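In a discretized picture this is easy to confirm: nudging one sample of y changes only the two adjacent finite-difference slopes, by \pm\delta y/\Delta x. A minimal sketch (array size and test function arbitrary):

```python
import numpy as np

# Nudging one node of a discretized function changes only the two
# adjacent finite-difference slopes, by +dy/dx (left) and -dy/dx (right).
dx, dy = 0.1, 0.01
y = np.sin(np.arange(8) * dx)        # arbitrary discretized function
s_before = np.diff(y) / dx

k = 4
y[k] += dy                           # the perturbation delta-y at node k
s_after = np.diff(y) / dx
ds = s_after - s_before

assert np.isclose(ds[k - 1],  dy / dx)   # slope on the left increases
assert np.isclose(ds[k],     -dy / dx)   # slope on the right decreases
assert np.allclose(np.delete(ds, [k - 1, k]), 0)  # all others unchanged
```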

This affects S by an amount:

 \begin{aligned} \delta S_2 &= \frac{\partial f}{\partial y'}\Big|_{x_1} \Delta y'(x)\Big|_{x_1} + \frac{\partial f}{\partial y'}\Big|_{x_2} \Delta y'(x)\Big|_{x_2} \\ &= -\left(\frac{\partial f}{\partial y'}\Big|_{x_2} - \frac{\partial f}{\partial y'}\Big|_{x_1}\right) \frac{\delta y}{\Delta x} \end{aligned}

Since x_2 = x_1 + \Delta x, to first order we can write:

 \frac{\partial f}{\partial y'}\Big|_{x_2} - \frac{\partial f}{\partial y'}\Big|_{x_1} = \frac{d}{dx}\frac{\partial f}{\partial y'}\Big|_{x_1} \Delta x

So finally, the effect of changing the derivatives is:

 \delta S_2 = -\frac{d}{dx}\frac{\partial f}{\partial y'}\Big|_{x_1} \delta y

And the net effect of perturbing the point is:

 \delta S = \delta S_1 + \delta S_2 = \left(\frac{\partial f}{\partial y} - \frac{d}{dx}\frac{\partial f}{\partial y'}\Big|_{x_1}\right) \delta y

Requiring \delta S = 0 for small perturbations \delta y anywhere along the function therefore implies the Euler-Lagrange equation:

 \frac{\partial f}{\partial y} - \frac{d}{dx}\frac{\partial f}{\partial y'} = 0

And now both terms have meaning:

  1. \frac{\partial f}{\partial y}: the amount a small perturbation affects f by directly changing the value y(x).

  2. -\frac{d}{dx}\frac{\partial f}{\partial y'}: the net amount a perturbation affects f by changing the derivatives y'(x) in the neighborhood of x (the derivative to the left increases and the derivative to the right decreases, so the net effect depends on how \frac{\partial f}{\partial y'} changes over the span dx).

At a stationary point, these effects must exactly cancel.
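The bookkeeping above can be checked numerically: discretize S for the chain integrand f = y\sqrt{1+(y')^2}, nudge a single node, and compare the measured change in S against \left(\frac{\partial f}{\partial y} - \frac{d}{dx}\frac{\partial f}{\partial y'}\right)\delta y\, \Delta x (the factor \Delta x reappears here because each term of the discretized sum carries it). Grid size and test shape are arbitrary:

```python
import numpy as np

# Discretize S = sum_i f(y_i, s_i) * dx with forward-difference slopes
# s_i = (y_{i+1} - y_i) / dx, for the chain integrand f = y*sqrt(1+y'^2).

def S(y, dx):
    s = np.diff(y) / dx
    return np.sum(y[:-1] * np.sqrt(1 + s**2)) * dx

N, dx = 20, 0.1
x = np.arange(N + 1) * dx
y = 2.0 + np.sin(x)                 # arbitrary smooth, non-stationary shape

k, dy = N // 2, 1e-7                # perturb a single interior node
yp = y.copy()
yp[k] += dy
dS_direct = S(yp, dx) - S(y, dx)

# Prediction from the local argument (with the overall factor of dx):
s = np.diff(y) / dx
f_y  = np.sqrt(1 + s[k]**2)              # df/dy at node k
f_yp = y[:-1] * s / np.sqrt(1 + s**2)    # df/dy' on each segment
dS_pred = (f_y - (f_yp[k] - f_yp[k - 1]) / dx) * dy * dx

assert abs(dS_direct - dS_pred) / abs(dS_pred) < 1e-3
```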

Final Comments

For completeness, we derive the solution to the example, and extend it to the case of a fixed-length chain.


In this case, measuring y as the height above the ground (flipping the sign of y and dropping the overall minus sign, neither of which changes the stationary points, so that the boundary conditions become y(\pm L) = +H):

 f(y, y', x) = y \sqrt{1+(y')^2}

Applying Euler-Lagrange directly, we find:

 \begin{aligned} \frac{\partial f}{\partial y} &= \frac{d}{dx} \frac{\partial f}{\partial y'} \\ \sqrt{1 + (y')^2} &= \frac{d}{dx} \left( \frac{y y'}{\sqrt{1+(y')^2}} \right) \end{aligned}

We could solve this directly, but the simpler approach is to use a theorem (the Beltrami identity):

Thm: If \frac{\partial f}{\partial x} = 0, then f - y' \frac{\partial f}{\partial y'} is constant along solutions of the Euler-Lagrange equation.


Proof:

 \begin{aligned} \frac{d}{dx} \left( f - y' \frac{\partial f}{\partial y'} \right) &= y'\frac{\partial f}{\partial y} + y''\frac{\partial f}{\partial y'} + \frac{\partial f}{\partial x} - \frac{d}{dx} \left( y' \frac{\partial f}{\partial y'} \right) \\ &= y'\frac{\partial f}{\partial y} + y''\frac{\partial f}{\partial y'} - \left( y'' \frac{\partial f}{\partial y'} + y' \frac{d}{dx}\frac{\partial f}{\partial y'} \right) \\ &= y' \left( \frac{\partial f}{\partial y} - \frac{d}{dx} \frac{\partial f}{\partial y'} \right) \\ &= 0 \end{aligned}

where the last step applies the original form of the Euler-Lagrange equation (which is why the result holds only along stationary solutions).
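The steps before the last one are just the chain rule, so the identity \frac{d}{dx}\left(f - y'\frac{\partial f}{\partial y'}\right) = y'\left(\frac{\partial f}{\partial y} - \frac{d}{dx}\frac{\partial f}{\partial y'}\right) holds for any curve, stationary or not, whenever \frac{\partial f}{\partial x} = 0. A finite-difference check on an arbitrary test curve (the curve y = x^2/2 + 2 and sample points are arbitrary choices):

```python
import math

# Check d/dx (f - y' * df/dy') == y' * (df/dy - d/dx df/dy') for the
# chain integrand f = y*sqrt(1 + y'^2) along an arbitrary, non-stationary
# curve y = x^2/2 + 2 (the identity needs df/dx = 0, nothing more).

def y(x):  return x * x / 2 + 2
def yp(x): return x

def f(x):     return y(x) * math.sqrt(1 + yp(x)**2)
def dfdy(x):  return math.sqrt(1 + yp(x)**2)                 # df/dy
def dfdyp(x): return y(x) * yp(x) / math.sqrt(1 + yp(x)**2)  # df/dy'

def ddx(g, x, h=1e-5):                                       # central difference
    return (g(x + h) - g(x - h)) / (2 * h)

errs = []
for x in (-1.0, 0.3, 1.7):
    lhs = ddx(lambda t: f(t) - yp(t) * dfdyp(t), x)
    rhs = yp(x) * (dfdy(x) - ddx(dfdyp, x))
    errs.append(abs(lhs - rhs))

assert max(errs) < 1e-6
```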

Since our particular f(y, y', x) is independent of x, we can apply this theorem:

 \begin{aligned} f - y' \frac{\partial f}{\partial y'} &= c \\ y\sqrt{1 + (y')^2} - \frac{y(y')^2}{\sqrt{1+(y')^2}} &= c \\ \frac{y}{\sqrt{1+(y')^2}} &= c \\ y' = \frac{dy}{dx} &= \sqrt{\frac{y^2}{c^2} - 1} \\ \int dx &= \int \frac{dy}{\sqrt{\frac{y^2}{c^2} - 1}} \\ x(y) &= c \,\mathrm{arcosh}\!\left(\frac{y}{c}\right) + b \\ y(x) &= c \cosh\!\left(\frac{x-b}{c}\right) \end{aligned}

For constants c, b chosen to satisfy boundary conditions. This gives us the familiar catenary curve.
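We can spot-check numerically that this family satisfies the Euler-Lagrange equation derived above, by evaluating the residual \frac{\partial f}{\partial y} - \frac{d}{dx}\frac{\partial f}{\partial y'} with finite differences (the values of c, b, and the sample points are arbitrary):

```python
import math

# Spot-check that y(x) = c*cosh((x-b)/c) makes the residual
#   df/dy - d/dx (df/dy') = sqrt(1+y'^2) - d/dx( y*y'/sqrt(1+y'^2) )
# vanish, using central differences.

c, b, h = 1.5, 0.3, 1e-5

def y(x):  return c * math.cosh((x - b) / c)
def yp(x): return math.sinh((x - b) / c)

def dfdy(x):  return math.sqrt(1 + yp(x)**2)
def dfdyp(x): return y(x) * yp(x) / math.sqrt(1 + yp(x)**2)

residuals = [abs(dfdy(x) - (dfdyp(x + h) - dfdyp(x - h)) / (2 * h))
             for x in (-1.0, 0.0, 0.7, 2.0)]
assert max(residuals) < 1e-6
```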


We know the catenary is also the shape formed if we hang a fixed-length chain between two fixed endpoints. This is no coincidence.

The fixed-length chain problem is a constrained minimization problem, with the same potential energy functional S, but the additional constraint that the arclength of y(x) is some specified constant.

We specify constrained problems by (L, s) – half the endpoint separation, and the chain arclength.

And unconstrained problems by (L, H) – half the pulley separation, and the pulley height.

We will show that the solution to this problem is also a catenary, by showing that:

Theorem: For a given constrained problem (L, s), there is a corresponding unconstrained problem (L, H) that contains the constrained problem as a sub-problem. Therefore the solution of the constrained problem must also be a catenary.

Lemma: For any d, l \in \mathbb{R} with 2d < l, we can always find some unconstrained configuration (L, H) such that the length of chain between x = -d and x = +d is exactly l.

Proof: Refer to the family of unconstrained solutions we found. With a centered choice of coordinates, b = 0, and the family takes the form:

 y_c(x) = c \cosh\!\left(\frac{x}{c}\right)

The boundary condition y(L) = H determines the parameter c. Now, the arclength between x = -d and x = +d is 2c \sinh\!\left(\frac{d}{c}\right), a monotone decreasing function of c ranging from +\infty (as c \to 0^+) down to 2d (as c \to \infty).

So any value of l in this range is achievable for some c_0 – and further, there is some (L, H) that gives rise to this c_0 (in particular, (d, c_0 \cosh\!\left(\frac{d}{c_0}\right)) works).
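The arclength claim is easy to verify numerically: since \sqrt{1+y_c'^2} = \cosh(x/c), integrating over [-d, d] gives 2c\sinh(d/c), which we can confirm decreases toward the lower bound 2d as c grows (d and the grid of c values are arbitrary):

```python
import math

# Arclength of y_c(x) = c*cosh(x/c) between x = -d and x = +d:
# integrate sqrt(1 + y'^2) = cosh(x/c) numerically (midpoint rule) and
# compare with the closed form 2*c*sinh(d/c); then check monotonicity.

def arclength(c, d, n=10000):
    w = 2 * d / n
    return sum(math.cosh((-d + (i + 0.5) * w) / c) for i in range(n)) * w

d = 1.0
cs = [0.3, 0.5, 1.0, 2.0, 5.0, 50.0]
lengths = [2 * c * math.sinh(d / c) for c in cs]

for c, l in zip(cs, lengths):
    assert abs(arclength(c, d) - l) / l < 1e-4   # numeric matches closed form

# monotone decreasing in c, approaching the lower bound 2d from above
assert all(a > b for a, b in zip(lengths, lengths[1:]))
assert all(l > 2 * d for l in lengths)
assert abs(lengths[-1] - 2 * d) < 1e-3           # c = 50: already close to 2d
```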

Result: Now, given a constrained problem (L, s), construct an unconstrained problem (L', H') such that the length of chain between x = -L and x = +L is exactly s (which the Lemma guarantees we can do). Over this span, the unconstrained problem is identical to the constrained problem: arclength s, with endpoints separated by 2L. If one problem's optimal solution (say A) had lower potential energy over the span than the other's (B), then B could assume the form of A over the span without violating any constraints, resulting in lower energy. This contradicts optimality, so both problems must have identical solutions over the span – and the constrained optimal solution is also a catenary.

So we have also derived the solution to the constrained problem. (no Lagrange multipliers necessary!)