Gauss's Principle of Least Constraint

This note is about a formalization of classical mechanics due to Gauss. Gauss noticed that, roughly, the constrained motion of masses is as close as possible to their unconstrained motions, while still satisfying the constraints. For example, a pendulum bob would naturally fall straight down, but is constrained to a circle by its string -- so its true acceleration will be as close as possible to straight down, while still remaining on the string. This generalizes to essentially any constrained system. Specifically, to find the true accelerations of masses in a constrained system, we first find the accelerations as if they were unconstrained, and then “project” to the closest acceleration that satisfies the constraints. \[a_{\text{true}} = \argmin_{a \in \text{ConsistentAccelerations}} ||a - a_{\text{unconstrained}} ||^2_M\]

This formulation (which also holds in generalized coordinates) is another nice extremal principle in physics, and is the basis of some recent fast rigid-body physics simulations (MuJoCo), using convex optimization.

In this note, we state, prove, and discuss Gauss's principle, assuming minimal prior knowledge. We use it for a double-pendulum simulation just for fun.

Introduction & Motivation

Predicting the motion of a system of masses is, of course, the general problem of classical mechanics; here we focus on how to deal with constraints. Directly applying Newtonian mechanics can get very messy when dealing with kinematic constraints, since it requires solving for all the intermediary constraint forces. Often, we are not interested in the forces required to enforce the constraints; we just want to know the eventual movement of the masses. (For example, for a ball rolling along a curved track, we are not interested in the forces required to keep the ball on the track, just in the eventual motion of the ball.) Re-formulations of mechanics (such as Lagrangian mechanics, and Gauss's principle) give us “implicit” ways of dealing with constraints.

Organization. We first set up the problem and notation, then state Gauss's Principle and give several examples. Next, we formally define the notion of "consistent accelerations" required for Gauss's Principle. Finally, we give a proof of Gauss's Principle. We conclude with remarks, historical references, and open problems.

Setup and Definitions

A constrained system (to us) is one where the configuration $X$ is constrained to lie on a manifold. For example, we may have the constraint $x_1 = -x_2$, or if two masses are connected by a rod of length $\ell$, the constraint $||\vec x_1 - \vec x_2||_2 = \ell$. Here, we will consider only constraints that do no net work on the system (this is the case for most natural physical constraints).

Given a configuration $X$ on the manifold, let $T_X$ be the set of possible velocities of the system at configuration $X$, which are consistent with the constraints. Notice that $T_X$ is a subspace, and is in fact exactly the tangent space of the manifold at $X$. (We will see examples soon). Our condition that constraint forces do no net work is exactly the condition that (in inertial coordinates) the collection of constraint forces $\Fconst$ is orthogonal to $T_X$.

The state of a system is given by the positions and velocities $(X, \dot{X})$.

The mass matrix is the matrix $M$ such that $\frac{1}{2}||\dot{q}||^2_M = \frac{1}{2}\dot{q}^TM\dot{q}$ is the kinetic energy of the system, where $\dot{q}$ are the generalized velocities. (Such a matrix always exists when the mapping $q_i(r_1, \dots r_n)$ between generalized coordinates $q$ and spatial coordinates $r$ is time-independent).
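As a small illustration of this definition, the sketch below computes the mass matrix of a single pendulum in the angle coordinate $\theta$, by pulling the spatial mass matrix back through the Jacobian $J = \partial r / \partial \theta$. The function name `generalized_mass_matrix` and all numerical values are assumptions made for this example; they are not part of the original note.

```python
import numpy as np

def generalized_mass_matrix(J, M_spatial):
    """Pull the spatial mass matrix back to generalized coordinates: J^T M J."""
    return J.T @ M_spatial @ J

# Hypothetical pendulum: mass m on a rod of length ell, angle theta from the downward vertical.
m, ell, theta, theta_dot = 2.0, 1.5, 0.4, 0.9

# Spatial position r(theta) = ell * (sin(theta), -cos(theta)); J = dr/dtheta is a 2x1 matrix.
J = ell * np.array([[np.cos(theta)], [np.sin(theta)]])
M_spatial = m * np.eye(2)

M_q = generalized_mass_matrix(J, M_spatial)  # 1x1 matrix, equal to m * ell**2

# The kinetic energy agrees whether computed in generalized or spatial coordinates.
q_dot = np.array([theta_dot])
r_dot = J @ q_dot
ke_generalized = 0.5 * q_dot @ M_q @ q_dot
ke_spatial = 0.5 * r_dot @ M_spatial @ r_dot
```

Here the generalized mass matrix is the familiar moment of inertia $m\ell^2$, and the two kinetic energies coincide.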

Our goal is: Given the constraint manifold and the current state $(X, \dot{X})$ of a system, find the accelerations $\ddot{X}$. This determines the entire trajectory $X(t)$.

This goal is the analog of Newton's law ($F=ma$) – it specifies the local evolution of a system.

Gauss's Principle

Consider a constrained system subject to internal and external forces, where the constraint forces do no net work. Describing the system in generalized coordinates, let $A_U$ be the generalized accelerations of the unconstrained system (with non-constraint forces only). Suppose the system is in state $(X, \dot{X})$, of generalized positions and velocities. Then, the true generalized accelerations of the constrained system are \[A_{\text{true}} = \argmin_{A \in \consaccel_{X, \dot{X}}} ||A - A_U||^2_M \]

Here $||\cdot||_M$ is the norm induced by the mass matrix $M$ (that is, $||A||^2_M = A^TMA$), and $\consaccel_{X, \dot{X}}$ is the set of accelerations consistent with the constraints at the current state.

For example, if we describe our system in inertial, spatial coordinates, and there is an external (non-constraint) force $\vec F_i$ acting on mass $m_i$, then the true accelerations satisfy:

\[\{\vec a_1, \dots, \vec a_n\} = \argmin_{\{\vec a_1, \dots, \vec a_n\} \in \consaccel} \sum_i m_i ||\vec a_i - \frac{\vec F_i}{m_i}||^2\]
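When the consistent accelerations form an affine set (as argued later in this note), the minimization is a weighted least-squares problem. The following is a minimal sketch under that assumption: the set is parametrized as $\{Qu + b\}$, and the names `gauss_project`, `Q`, `b`, and `u` are chosen for illustration only. The usage example is a hypothetical mass sliding on a frictionless straight incline at angle `alpha`.

```python
import numpy as np

def gauss_project(M, A_U, Q, b):
    """Minimize ||Q u + b - A_U||_M^2 over u; return (u, Q u + b)."""
    # Normal equations of the M-weighted least-squares problem:
    #   (Q^T M Q) u = Q^T M (A_U - b)
    u = np.linalg.solve(Q.T @ M @ Q, Q.T @ M @ (A_U - b))
    return u, Q @ u + b

# Hypothetical example: point mass on a straight incline tilted by alpha.
g, m, alpha = 9.81, 3.0, 0.5
M = m * np.eye(2)
A_U = np.array([0.0, -g])                          # unconstrained: free fall
Q = np.array([[np.cos(alpha)], [-np.sin(alpha)]])  # unit vector pointing down the slope
b = np.zeros(2)                                    # a straight track has no centripetal term

u, a_true = gauss_project(M, A_U, Q, b)
```

The recovered `u[0]` is the textbook value $g \sin\alpha$ along the slope, obtained without ever computing the normal force.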

Examples

Pendulum

Model the configuration space as the position of the mass $(x, y) \in \R^2$, constrained to lie on the manifold $C = \{(x, y): x^2 + y^2 = 1\} \subset \R^2$. The mass matrix is simply $M = \begin{bmatrix} m &0\\ 0 &m \end{bmatrix}$. The unconstrained acceleration of the mass is simply $g$, straight down by gravity.

At a given state (position and velocity $v$) of the mass, let us consider the space of consistent accelerations. The fixed velocity determines the radial/centripetal acceleration ($a_r$, in blue), so the only freedom is in the tangential accelerations ($a_T$). Thus, the consistent accelerations are exactly those with the given radial acceleration, and arbitrary tangential acceleration. (This will be discussed further below.) The projection of $g$ onto the consistent accelerations (as dictated by Gauss's Principle) is equivalent to considering the tangential component of gravity.
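The pendulum projection can be checked numerically. The sketch below assumes a unit-length pendulum, with the state given by a hypothetical angle `phi` from the downward vertical and angular velocity `phi_dot`; these names and values are illustrative and not from the original note.

```python
import numpy as np

g, m = 9.81, 1.0
phi, phi_dot = 0.7, 1.3  # hypothetical state: angle from the downward vertical, angular velocity

pos = np.array([np.sin(phi), -np.cos(phi)])     # point on the unit circle
tangent = np.array([np.cos(phi), np.sin(phi)])  # unit tangent at pos
vel = phi_dot * tangent

M = m * np.eye(2)
A_U = np.array([0.0, -g])      # unconstrained acceleration: gravity
b = -(vel @ vel) * pos         # centripetal part, fixed by the current velocity
Q = tangent.reshape(2, 1)      # the only free direction is tangential

# Project A_U onto {Q u + b} in the M-norm.
u = np.linalg.solve(Q.T @ M @ Q, Q.T @ M @ (A_U - b))
a_true = Q @ u + b
```

The tangential acceleration `u[0]` equals $-g \sin\phi$, the usual pendulum equation, while the radial component of `a_true` stays at the centripetal value fixed by the velocity.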

Note that we could not have directly used the angle of a pendulum as a generalized coordinate when applying Gauss's Principle, since we can't express the unconstrained acceleration of the pendulum in terms of only this coordinate. Thus, Gauss's Principle requires us to work in an ambient space that includes even "off-shell" trajectories of the constrained system. However, it is still possible to use generalized coordinates in our representation, as we will see below.

Double Pendulum

In practice, it is still possible to use generalized coordinates in our computations, though we will need to map them back to spatial coordinates. This is useful for simulations, because we can use coordinates that implicitly respect constraints. For example, for the double pendulum, we can represent the configuration space by the two angles $\vec \theta = (\theta_1, \theta_2)$. The set of consistent accelerations in $\theta$-space is simply all of $\R^2$, since the angular accelerations can be arbitrary. Thus, we just need to translate this set back into spatial coordinates. The spatial coordinates $\vec r = (\vec r_1, \vec r_2)$ of the masses are functions of $\theta$: $\vec r_i(\theta_1, \theta_2)$. Now, at a given state $S = (\theta_1, \theta_2, \dot{\theta_1}, \dot{\theta_2})$, we have $$ \ddot{\vec r} = Q_S \ddot{\vec\theta} + \vec b_S $$ for some matrix $Q_S$ and vector $\vec b_S$ depending on the state. This follows simply by differentiating $\vec r(\vec \theta)$ twice in time: $Q_S$ is the Jacobian of $\vec r$ with respect to $\vec\theta$, and $\vec b_S$ collects the terms quadratic in the velocities. Thus, we can express the set of all kinematically feasible accelerations as the affine space $$\consaccel_S = \{Q_S \ddot{\vec\theta} + \vec b_S : \ddot{\vec \theta} \in \R^2\}$$ This is extremely convenient, because we can now apply Gauss's Principle to directly find $\ddot{\vec\theta}$, the generalized accelerations: $$A = \argmin_{A \in \consaccel_S} ||A - A_\text{unconstrained}||^2_M \iff \ddot{\vec\theta}_{true} = \argmin_{\ddot{\vec\theta} \in \R^2} ||Q_S\ddot{\vec\theta} + \vec b_S - A_\text{unconstrained}||^2_M $$ where $M$ is the (diagonal) mass matrix in spatial coordinates. This is, presumably, something like what MuJoCo does.
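A minimal sketch of this computation follows. It is not the implementation behind the demo, and it is not MuJoCo's algorithm. It assumes point masses on massless rods hung from a fixed pivot, angles measured from the downward vertical, and gravity as the only non-constraint force. The function name `double_pendulum_accel` and its parameters are hypothetical.

```python
import numpy as np

def double_pendulum_accel(theta, theta_dot, m=(1.0, 1.0), ell=(1.0, 1.0), g=9.81):
    """Angular accelerations from Gauss's Principle (weighted least squares in spatial coords)."""
    (th1, th2), (w1, w2) = theta, theta_dot
    (m1, m2), (l1, l2) = m, ell
    t1 = np.array([np.cos(th1), np.sin(th1)])   # d/dth1 of (sin th1, -cos th1)
    t2 = np.array([np.cos(th2), np.sin(th2)])
    n1 = np.array([-np.sin(th1), np.cos(th1)])  # d/dth1 of t1
    n2 = np.array([-np.sin(th2), np.cos(th2)])

    # Spatial coordinates r = (r_1, r_2) in R^4, with
    #   r_1 = l1 * (sin th1, -cos th1),  r_2 = r_1 + l2 * (sin th2, -cos th2).
    Q = np.zeros((4, 2))                        # Jacobian dr/dtheta
    Q[0:2, 0] = l1 * t1
    Q[2:4, 0] = l1 * t1
    Q[2:4, 1] = l2 * t2
    b1 = l1 * w1**2 * n1                        # velocity-dependent (centripetal) terms
    b = np.concatenate([b1, b1 + l2 * w2**2 * n2])

    M = np.diag([m1, m1, m2, m2])               # spatial mass matrix
    A_U = np.array([0.0, -g, 0.0, -g])          # unconstrained: both masses in free fall
    return np.linalg.solve(Q.T @ M @ Q, Q.T @ M @ (A_U - b))

# Both rods aligned at 0.3 rad and at rest: the system starts out like a single pendulum.
acc = double_pendulum_accel((0.3, 0.3), (0.0, 0.0))
```

Integrating these accelerations with any standard time-stepping scheme gives a simulation; the normal equations solved here coincide with the usual Lagrangian equations of motion for the double pendulum.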

Just for fun, here is an example implementation of the above (plus some damping for stability). Try some random configurations. On the left is a phase-space plot of $\theta$ vs $\dot{\theta}$ ($\theta_1$ in blue, $\theta_2$ in red).

Sliding Blocks

Take the ambient space as the $x$ coordinate of mass $m_1$, and the $x, y$ coordinates of mass $m_2$: $(x_1, x_2, y_2) \in \R^3$. The 2-dimensional constraint manifold (for an appropriate choice of origin) is defined by $\{(x_1, x_2, y_2): \tan \theta = y_2/(x_1-x_2)\}$, expressing the constraint that the block remain on the wedge. The mass matrix is again diagonal, since we are using inertial coordinates: $M = \text{diag}(m_1, m_2, m_2)$.
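To carry this example through, here is a sketch under some stated assumptions: $m_1$ is the wedge and $m_2$ the block, all surfaces are frictionless, and gravity (acting on $y_2$ only) is the sole non-constraint force. The numerical values and variable names are hypothetical. Since the constraint $y_2 = (x_1 - x_2)\tan\theta$ is linear, differentiating it twice gives $\ddot y_2 = (\ddot x_1 - \ddot x_2)\tan\theta$, with no velocity-dependent term.

```python
import numpy as np

g = 9.81
m1, m2 = 4.0, 1.0          # wedge mass, block mass (hypothetical values)
theta = np.deg2rad(30.0)   # wedge angle
t = np.tan(theta)

M = np.diag([m1, m2, m2])       # ambient coordinates (x_1, x_2, y_2)
A_U = np.array([0.0, 0.0, -g])  # unconstrained: only the block's y_2 feels gravity

# Consistent accelerations are Q @ u, with free parameters u = (x_1'', x_2'')
# and y_2'' = (x_1'' - x_2'') * tan(theta) from the twice-differentiated constraint.
Q = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [t,   -t]])

u = np.linalg.solve(Q.T @ M @ Q, Q.T @ M @ A_U)
a_true = Q @ u                  # (x_1'', x_2'', y_2'')

# Textbook recoil acceleration of the wedge, in this sign convention.
wedge_accel_textbook = -m2 * g * np.sin(theta) * np.cos(theta) / (m1 + m2 * np.sin(theta) ** 2)
```

The projection reproduces the textbook wedge acceleration, and horizontal momentum conservation ($m_1 \ddot x_1 + m_2 \ddot x_2 = 0$) falls out automatically.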

And More...

(Left as an exercise for the reader). Note that we can handle internal non-constraint forces, such as springs -- this is just incorporated into the unconstrained accelerations.

Consistent Accelerations

We define $\consaccel$ in order to move from constraints on the positions of masses to constraints on their accelerations. This can be done by differentiating the manifold constraint at the current state.

Formally, given the current state $(X, \dot{X})$ of a constrained system, define the “consistent accelerations” as the following set. Consider all trajectories $\t X(t)$ which are kinematically feasible (lie on the constraint manifold), and agree with the current state at $t = 0$: $(\t X(0) = X, \dot{\t X}(0) = \dot{X})$. The consistent accelerations are the set of all accelerations for these feasible trajectories: \[\consaccel_{X, \dot X} := \{\ddot{\t X}(0)\}\]

For example, consider the consistent accelerations of a pendulum at its bottommost point:

The true acceleration at this point is purely radial (upwards, $a_{true}$ in blue), and the consistent accelerations (dashed green) correspond to further tangential accelerations. Note that these tangential accelerations can be identified with the tangent space of the constraint (tangential displacements).

This follows directly from differentiating the constraints. More formally: let $X_0(t)$ and $X_1(t)$ be two consistent trajectories (on the manifold), which agree in position and velocity at $t=0$. Let matrix $C_{X}$ be the (linear) constraints defining the manifold at point $X$ (ie, the rows of $C_X$ span the orthogonal complement of the tangent space $T_X$). That is, $C_{X} \dot{X_i}(t) = 0$ along the trajectory. Differentiating, \begin{align*} \frac{d}{dt}(C_{X} \dot{X_i}(t)) &= \vec 0\\ \dot{C_{X}} \dot{X_i}(t) + C_X \ddot{X_i}(t) &= \vec 0 \end{align*} Since $X_0$ and $X_1$ agree in position and velocity $(\dot{X_i})$ at $t=0$, the terms $\dot{C_{X}} \dot{X_i}(0)$ coincide, so subtracting the last relation for the two trajectories $i=0$ and $i=1$ yields \[C_X( \ddot{X_1}(0) - \ddot{X_0}(0)) = \vec 0\] Thus, the difference in accelerations is a consistent velocity: $\ddot{X_1}(0) - \ddot{X_0}(0) \in T_X$. This shows that $\consaccel_{X, \dot X}$ is contained in an affine shift of $T_X$. (In fact, it is easy to show that this containment is an equality, but we will not need this fact).
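As a concrete instance of this argument (a worked sketch, using the unit-circle pendulum from the Examples section), take $X = (x, y)$ with $x^2 + y^2 = 1$. Differentiating the constraint once and then twice in time gives \begin{align*} x\dot{x} + y\dot{y} &= 0 \quad \text{(so } C_X = \begin{bmatrix} x & y \end{bmatrix}\text{)}\\ x\ddot{x} + y\ddot{y} &= -(\dot{x}^2 + \dot{y}^2) \end{align*} The second line fixes only the radial component of the acceleration, so \[\consaccel_{X, \dot X} = \{-||\dot{X}||^2 (x, y) + u\,(-y, x) : u \in \R\}\] which is the tangent line $T_X = \text{span}\{(-y, x)\}$ shifted by the centripetal acceleration determined by the current velocity.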

Proof of Gauss's Principle

We will prove this for the case of inertial, spatial coordinates. This is without loss of generality, because the objective function is invariant under changes of coordinates (because, if $A, M$ are accelerations and mass matrices in generalized coordinates, and $a, m$ are their counterparts in spatial coordinates, then $||A||_M = ||a||_m$, and furthermore the constraint manifold transforms naturally).

Here are several equivalent ways of proceeding:

Approach 1: Suppose the true acceleration is $A_{true}$, and consider a consistent acceleration $A$. By the lemma of the previous section (consistent accelerations lie in an affine shift of $T_X$), any consistent acceleration $A$ must have the form $A = A_{true} + \delta$ for some $\delta \in T_X$ (where $T_X$ is the tangent space to the constraint manifold, ie the “consistent velocities”).

We will show that any consistent acceleration $A = A_{true} + \delta$ has an objective value at least as large: $||A - A_U||_M^2 \geq ||A_{true} - A_U||_M^2$.

Consider the gradient of the objective \begin{align*} \grad_A ||A - A_U||_M^2 &= 2(M(A - A_U))^T \end{align*} The key observation is that at the true accelerations $A = A_{true}$, this gradient is exactly twice the constraint forces $\Fconst$ on the system. That is, \[ \grad_A ||A - A_U||_M^2 |_{A = A_{true}} = 2(M(A_{true} - A_U))^T = 2\Fconst^T \] This is because $MA_{true} - MA_U = F_{net} - F_{\text{external}} = \Fconst$, where $F_{\text{external}}$ denotes all the non-constraint forces. Further, since constraint forces do no work, we know that $\Fconst \perp T_X$.

Thus, at the true acceleration, the gradient (proportional to $\Fconst$) is orthogonal to all possible consistent perturbations ($\delta \in T_X$). By convexity of the objective and of the set $\consaccel$, this first-order condition is sufficient for optimality. So the true acceleration minimizes the objective among the set of consistent accelerations.

Approach 2: Consider the minimizer $A^*$ in $$ A^* = \argmin_{A \in \consaccel_{X, \dot{X}}} ||A - A_U||^2_M $$ By the lemma of the previous section, we know the set $\consaccel$ is an affine shift of $T_X$. Thus, this minimization is simply the projection of $A_U$ onto an affine shift of $T_X$, under the $M$-norm. This projection $A^*$ is characterized uniquely by $M(A^* - A_U) \perp T_X$. But, as we saw above, the true acceleration $A_{true}$ satisfies this, since $M(A_{true} - A_U)$ is the constraint force. Thus, $A^* = A_{true}$.

Approach 3: In fact, we can directly show that any consistent acceleration $A$ must have larger objective: $$ \forall A \in \consaccel_{X, \dot{X}}: ~\|A - A_U\|_M^2 \geq \|A_{true} - A_U\|_M^2. $$

By the lemma of the previous section, any consistent acceleration $A$ must have the form $A = A_{true} + \delta$ for some $\delta \in T_X$. Then, using the fact that $\Fconst \perp \delta$, \begin{align*} \|A - A_{U}\|_M^2 &= \|A_{\rm true} + \delta - A_{U}\|_M^2\\ &= \|A_{\rm true} - A_{U}\|_M^2 + \|\delta\|_M^2 + 2\delta^TM(A_{\rm true} - A_{U})\\ &= \|A_{\rm true} - A_{U}\|_M^2 + \|\delta\|_M^2 + 2\delta^T \Fconst\\ &= \|A_{\rm true} - A_{U}\|_M^2 + \|\delta\|_M^2\\ &\geq \|A_{\rm true} - A_{U}\|_M^2 \end{align*}

Note that it was important we used inertial coordinates in the proof, so we could apply $F_{net} = MA$.
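The Pythagorean identity at the heart of Approach 3 can be sanity-checked numerically. The sketch below uses random data (an arbitrary diagonal mass matrix, a random tangent space spanned by the columns of `B`, and a random consistent acceleration `A0`); all names are hypothetical. Here `A_star` stands for the point satisfying the orthogonality condition $M(A - A_U) \perp T_X$, which the proof shows the true acceleration satisfies.

```python
import numpy as np

rng = np.random.default_rng(0)
n, k = 6, 2                                  # ambient dimension, tangent-space dimension

M = np.diag(rng.uniform(0.5, 3.0, size=n))   # diagonal mass matrix (inertial coordinates)
B = rng.normal(size=(n, k))                  # columns span T_X
A0 = rng.normal(size=n)                      # some consistent acceleration
A_U = rng.normal(size=n)                     # unconstrained acceleration

def sq_norm_M(v):
    """Squared M-norm: v^T M v."""
    return v @ M @ v

# Point in A0 + T_X with M (A - A_U) orthogonal to T_X.
w_star = np.linalg.solve(B.T @ M @ B, B.T @ M @ (A_U - A0))
A_star = A0 + B @ w_star
F_const = M @ (A_star - A_U)                 # plays the role of the constraint force

# Any other consistent acceleration is A_star + delta with delta in T_X.
delta = B @ rng.normal(size=k)
lhs = sq_norm_M(A_star + delta - A_U)
rhs = sq_norm_M(A_star - A_U) + sq_norm_M(delta)
```

Up to floating-point error, `B.T @ F_const` vanishes and `lhs` equals `rhs`, so every perturbation within $T_X$ increases the objective by exactly $\|\delta\|_M^2$.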

Concluding Remarks

It would be nice to express Gauss's Principle intrinsically (via the configuration manifold), instead of relying on an embedding of the constraint manifold in $\R^k$. However, there is an obstacle: To talk about the "unconstrained motion" of masses, it is not enough to only know about their "on-shell" trajectories -- we need to be able to describe their trajectories with no constraints applied. This is the issue we ran into in the pendulum example, where we could not use the angle as a generalized coordinate.
Question: Can the lemma relating consistent velocities to consistent accelerations be stated intrinsically (without relying on an embedding in $\R^k$)? This is unclear, since a priori the velocity-space and acceleration-space are different structures.
There is a heuristic technique used in game physics (Verlet Integration) to deal with position constraints. The technique is essentially "update positions as if unconstrained, and then project back onto the consistent positions." This is similar to Gauss's Principle, though the projection is happening on positions, and not accelerations. I believe this can be shown to be equivalent in some cases, under reasonable assumptions.
Related extremal principles (not technically equivalent, but all involve similar ideas):

Lagrange's Principle of Least Action

That the true trajectory of a system is that which minimizes the Action, among all consistent trajectories.

Thomson's Theorem in Electrostatics

That the true distribution of charges is that which minimizes the electric field energy, among all distributions with the same boundary conditions. For example, the voltages in a network of capacitors are such that the potential energy is minimized. Relatedly, the true currents in a network of resistors form the minimum-energy flow, subject to the flow constraints.

See, e.g., the [Feynman Lectures, Vol II Ch 19] for a discussion of the above.

References

There is surprisingly little written about this, as far as I could find. Here are some good references:

Principles of Least Action and of Least Constraint [Ekkehard Ramm] is a short survey of the history of some extremal principles, including least-constraint.
The original paper by Gauss, "Über ein neues allgemeines Grundgesetz der Mechanik". He states his principle, and gives a proof by one example.
The MuJoCo computation page, describing how MuJoCo uses a variant of Gauss's Principle for fast physics simulation.
Page 254 of [Whittaker], a book on analytical dynamics, gives a proof of a version of Gauss's Principle (described as the "least-curvature principle").
The Udwadia–Kalaba equation is essentially just Gauss's Principle (reading the wikipedia page is not recommended).

Acknowledgements. Thanks to Thibaut Horel for suggestions on the presentation and proof, and Darius Shi for catching various errors.

Questions, comments, suggestions are welcome: preetum@cs.harvard.edu
Last Updated: January 9, 2018.