Newton's identities
From Wikipedia, the free encyclopedia
In mathematics, Newton's identities relate two different ways of describing the roots of a polynomial. They were found by Isaac Newton in about 1666, apparently in ignorance of earlier work (1629) by Albert Girard. These useful identities have immediate applications in many areas of mathematics, including Galois theory, invariant theory, group theory, and combinatorics, as well as applications outside mathematics, such as general relativity. They can be considered as applications of ideas in computational algebraic geometry, particularly Gröbner bases.
Mathematical statement
Consider the polynomial

    (x - x1)(x - x2)...(x - xn) = x^n - a1 x^(n-1) + a2 x^(n-2) - ... + (-1)^n an,

where the xα are the roots and the aj are the coefficients. Define the power sums

    tj = x1^j + x2^j + ... + xn^j.
Then the original form of Newton's identities is the recurrence

    t1 = a1,
    t2 = a1 t1 - 2 a2,
    t3 = a1 t2 - a2 t1 + 3 a3,
    t4 = a1 t3 - a2 t2 + a3 t1 - 4 a4,

and so on, where the pattern is obvious. For an elementary proof see the book by Tignol cited below.
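The recurrence translates directly into a short program. As a sketch (the function name and the list convention a = [a1, ..., an] are mine, not from the article), here it is in Python, computing the power sums from the coefficients without ever finding the roots:

```python
def power_sums(a, m):
    """Power sums t_1, ..., t_m of the roots of
    x^n - a_1 x^(n-1) + a_2 x^(n-2) - ... + (-1)^n a_n,
    computed from the coefficients a = [a_1, ..., a_n] by Newton's
    recurrence, without finding the roots themselves."""
    n = len(a)
    e = [1] + list(a) + [0] * max(0, m - n)   # e[j] = a_j, with a_j = 0 for j > n
    t = [0]                                   # t[0] is a placeholder; t[k] = t_k
    for k in range(1, m + 1):
        # t_k = a_1 t_{k-1} - a_2 t_{k-2} + ... +/- a_{k-1} t_1 +/- k a_k
        s = sum((-1) ** (i - 1) * e[i] * t[k - i] for i in range(1, k))
        t.append(s + (-1) ** (k - 1) * k * e[k])
    return t[1:]
```

For example, the roots 1, 2, 3 give coefficients a = [6, 11, 6], and the recurrence recovers t1 = 6, t2 = 14, t3 = 36.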
From these formulae we can readily obtain more useful formulae, expressing the power sums directly in terms of the coefficients:

    t1 = a1,
    t2 = a1^2 - 2 a2,
    t3 = a1^3 - 3 a1 a2 + 3 a3,
    t4 = a1^4 - 4 a1^2 a2 + 2 a2^2 + 4 a1 a3 - 4 a4.

Unfortunately, these formulae have the disadvantage that the pattern is no longer obvious.
Finally, we can solve to obtain expressions giving the coefficients in terms of the power sums:

    a1 = t1,
    a2 = (t1^2 - t2)/2,
    a3 = (t1^3 - 3 t1 t2 + 2 t3)/6,
    a4 = (t1^4 - 6 t1^2 t2 + 3 t2^2 + 8 t1 t3 - 6 t4)/24,

and so forth, where we can see a partial pattern (factorials in the denominators, first and last terms), but again the general pattern is probably not obvious. But if you know about the theory of finite groups, take a hard second look! (Spoiler in a subsequent section.)
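Going in the other direction is just as mechanical: solving the k-th Newton identity for ak gives k ak = a(k-1) t1 - a(k-2) t2 + ... + (-1)^(k-1) tk. A minimal Python sketch (names are mine; exact rational arithmetic via the fractions module keeps the factorial denominators free of rounding error):

```python
from fractions import Fraction

def coefficients(t):
    """Coefficients a_1, ..., a_n of x^n - a_1 x^(n-1) + ... + (-1)^n a_n,
    recovered from the power sums t = [t_1, ..., t_n] by solving the k-th
    Newton identity for a_k:  k a_k = sum_{i=1..k} (-1)^(i-1) a_{k-i} t_i."""
    a = [Fraction(1)]                 # a_0 = 1 by convention
    for k in range(1, len(t) + 1):
        s = sum((-1) ** (i - 1) * a[k - i] * t[i - 1] for i in range(1, k + 1))
        a.append(s / k)               # exact division, no rounding
    return a[1:]
```

With the power sums t = [6, 14, 36] of the roots 1, 2, 3, this recovers the coefficients a = [6, 11, 6].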
Newton seems to have left it there, and hence missed some lovely discoveries.
Application to the characteristic polynomial of a matrix
When the polynomial

    x^n - a1 x^(n-1) + a2 x^(n-2) - ... + (-1)^n an

is the characteristic polynomial of a linear operator or matrix A, the roots xα are the eigenvalues of the operator or matrix. Then the power sums tj are the traces of the powers of the operator or matrix:

    tj = tr(A^j).
Since the traces of the matrix powers can be computed easily without knowledge of the eigenvalues, Newton's identities make it possible to obtain the coefficients of the characteristic polynomial. These ideas combine into an algorithm costing n matrix-matrix multiplications, which with the naive multiplication algorithm is n^4 + O(n^3) arithmetic operations.
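As a sketch of that algorithm in pure Python (small matrices as lists of rows; the helper names are mine), one computes the traces of A, A^2, ..., A^n and then solves Newton's identities for the coefficients:

```python
from fractions import Fraction

def mat_mul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace(A):
    return sum(A[i][i] for i in range(len(A)))

def char_poly_coeffs(A):
    """Coefficients a_1, ..., a_n of det(xI - A) = x^n - a_1 x^(n-1) + ...
    + (-1)^n a_n, obtained from the traces t_j = tr(A^j) via Newton's
    identities -- no eigenvalue computation required."""
    n = len(A)
    P, t = A, []
    for _ in range(n):                # traces of A, A^2, ..., A^n
        t.append(trace(P))
        P = mat_mul(P, A)
    a = [Fraction(1)]                 # solve k a_k = sum (-1)^(i-1) a_{k-i} t_i
    for k in range(1, n + 1):
        s = sum((-1) ** (i - 1) * a[k - i] * t[i - 1] for i in range(1, k + 1))
        a.append(s / k)
    return a[1:]
```

For A = [[2, 1], [1, 2]], whose eigenvalues are 1 and 3, this yields a1 = 4 and a2 = 3, i.e. det(xI - A) = x^2 - 4x + 3.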
Relation with invariant theory
Invariant theory is concerned with polynomial invariants of various algebraic or geometric objects in mathematics, including polynomial invariants of quadratic forms, and more generally, polynomial invariants of tensors. From the latter, we obtain connections with the representation theory of groups.
A fundamental topic in invariant theory is symmetric polynomials, which arise when we express the coefficients of a polynomial in terms of its roots. That is, multiplying out

    (x - x1)(x - x2)...(x - xn) = x^n - a1 x^(n-1) + a2 x^(n-2) - ... + (-1)^n an,

we have

    a1 = x1 + x2 + ... + xn,
    a2 = x1 x2 + x1 x3 + ... (the sum of products of pairs of distinct roots),
and so forth. In particular, we can use such expressions to obtain the characteristic polynomial of a linear operator if we know its eigenvalues.
Now the point for the theory of invariants is that if we consider the aj to be polynomials in the roots, then for a given n they form an algebraic basis for the space of symmetric polynomial functions of the roots. That is, every polynomial function of the roots which is invariant under every permutation of the roots, such as exchanging x1, x2, is given by a specific algebraic combination (by addition, multiplication, and multiplication by a scalar) of the functions aj. For this reason, the aj are called the elementary symmetric polynomials.
The point is that the expressions above give a different basis for the space of symmetric polynomials, namely

    t1 = x1 + x2 + ... + xn,
    t2 = x1^2 + x2^2 + ... + xn^2,
    t3 = x1^3 + x2^3 + ... + xn^3,
and so forth, where tj is the sum of the jth powers of the roots (known as a power-sum symmetric function of the roots). The fact that we can obtain in this way two distinct ways of representing all symmetric functions of the roots of a polynomial, without knowing the roots themselves, is of fundamental importance for Galois theory.
We can obtain finer decompositions by writing general symmetric polynomials as sums of homogeneous polynomials, that is, symmetric polynomials in which all the terms have the same degree. A convenient basis for the homogeneous symmetric polynomials is given by the Schur polynomials, which correspond to the partitions of an integer; these can be enumerated by Ferrers diagrams (the "unlabeled" combinatorial concept corresponding to Young diagrams). For example, corresponding to the partition 4+2+1 = 7, we have the Schur polynomial

    s(4,2,1)(x1, x2, x3) = det [ x1^6  x1^3  x1 ; x2^6  x2^3  x2 ; x3^6  x3^3  x3 ] / det [ x1^2  x1  1 ; x2^2  x2  1 ; x3^2  x3  1 ],

where the exponents in the numerator are (4,2,1) + (2,1,0) = (6,3,1) and the denominator is the Vandermonde determinant. This is a homogeneous symmetric polynomial of degree seven in three variables. Amazingly enough, the determinant in the denominator cancels out when everything is fully expanded, so the quotient really is a polynomial (indeed one with nonnegative integer coefficients).
Each partition of seven into at most three parts corresponds to such a Schur polynomial, and these form a basis: the space of homogeneous symmetric polynomials of degree seven in three variables has dimension eight (one for each such partition), with each such polynomial uniquely expressible as a linear combination of Schur polynomials. Each of these Schur polynomials can in turn be expressed as an algebraic combination of (a polynomial function of) the three elementary symmetric polynomials in three variables, a1,a2,a3.
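The Schur polynomial s(4,2,1) can be evaluated as a ratio of two determinants (the bialternant formula): the numerator alternant uses exponents (4,2,1) + (2,1,0) = (6,3,1), and the denominator is the Vandermonde determinant with exponents (2,1,0). A small pure-Python sketch (names mine) evaluates both at integer points and checks that the denominator does divide exactly:

```python
def det3(M):
    """Determinant of a 3x3 matrix by cofactor expansion."""
    (a, b, c), (d, e, f), (g, h, i) = M
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def schur_421(x1, x2, x3):
    """Evaluate s_(4,2,1)(x1, x2, x3) as a ratio of alternants: the
    numerator uses exponents (4,2,1) + (2,1,0) = (6,3,1); the denominator
    is the Vandermonde determinant with exponents (2,1,0)."""
    num = det3([[x ** 6, x ** 3, x] for x in (x1, x2, x3)])
    den = det3([[x ** 2, x, 1] for x in (x1, x2, x3)])
    assert num % den == 0             # the Vandermonde divides exactly
    return num // den
```

At distinct integer points (where the Vandermonde is nonzero) the quotient is always an integer, as the cancellation guarantees; at (1, 2, 3) it is 1434.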
Relation with symmetric groups
The alert reader with some knowledge of the theory of finite groups, particularly the Pólya enumeration formula, will have noticed two striking facts in the discussion above:
- the Schur polynomials are defined in terms of integer partitions (or Ferrers diagrams), which correspond to the conjugacy classes of the symmetric group,
- up to alternating signs, the expressions found above giving the coefficients in terms of the power sums are exactly the cycle index polynomials of the symmetric groups.
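The second fact is easy to check by machine: the cycle index of Sn is a sum over cycle types (partitions of n), where the partition with mj parts of size j gets coefficient 1/(prod_j j^mj mj!). A Python sketch (names mine) enumerating partitions and computing these coefficients:

```python
from fractions import Fraction
from math import factorial

def partitions(n, max_part=None):
    """Yield the partitions of n as weakly decreasing tuples."""
    if max_part is None:
        max_part = n
    if n == 0:
        yield ()
        return
    for first in range(min(n, max_part), 0, -1):
        for rest in partitions(n - first, first):
            yield (first,) + rest

def cycle_index_terms(n):
    """Cycle index of the symmetric group S_n: for each cycle type
    (partition of n) the coefficient 1 / (prod_j j^(m_j) m_j!), where
    m_j is the number of cycles of length j."""
    terms = {}
    for p in partitions(n):
        denom = 1
        for j in set(p):
            mj = p.count(j)
            denom *= j ** mj * factorial(mj)
        terms[p] = Fraction(1, denom)
    return terms
```

For n = 4 the coefficients are 1/24, 1/4, 1/8, 1/3, 1/4 for the cycle types (1,1,1,1), (2,1,1), (2,2), (3,1), (4); over the common denominator 24 these are the numerators 1, 6, 3, 8, 6, which, up to the alternating signs, reproduce the degree-4 expression for the coefficients in terms of the power sums found above.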
Relation with computational algebraic geometry
Algebraic geometry is primarily concerned with the algebraic problem of finding the roots of systems of polynomials in many variables and the geometric interpretation of this problem in terms of algebraic varieties. A fundamental computational technique for algebraic geometry, which has far-reaching implications in many other fields of mathematics, including differential equations, is the notion of a Gröbner basis. For our purposes we do not need to know precisely what these are; we need only know that Buchberger's algorithm for obtaining a Gröbner basis generalizes two of the most fundamental algorithms in mathematics:
- Gauss's algorithm for obtaining the row reduction of a matrix,
- the division algorithm for dividing a polynomial in one variable by another such polynomial.
The resulting Gröbner basis of an ideal in a ring of multivariable polynomials is analogous to a vector basis for a subspace of a vector space, and is ideally suited for computations involving ideals in polynomial rings, which are the basic objects of algebraic geometry. Indeed, the famous Nullstellensatz of David Hilbert establishes a perfect correspondence between a fundamental algebraic concept, the ideals of a nice kind of ring, and an equally fundamental geometric concept, varieties in an affine space (or even in a projective space).
Gröbner bases are defined with respect to a choice of term order (which specifies "priority" among monomials in carrying out the generalized division algorithm), and one of the most useful choices for algebraic geometry is an elimination order. In particular, using an elimination order we can solve a system of equations such as the identities giving the power sums in terms of the coefficients, to obtain equations giving the coefficients in terms of the power sums. Needless to say, we obtain the expressions found above.
See also
- elementary symmetric polynomial
- Schur polynomial
- fluid solutions, an article giving an application of Newton's identities to computing the characteristic polynomial of the Einstein tensor in the case of a perfect fluid, and similar articles on other types of exact solutions in general relativity.
References
- Tignol, Jean-Pierre (2001). Galois' theory of algebraic equations. Singapore: World Scientific. ISBN 978-981-02-4541-2.
— A readable, historically oriented introduction to Galois theory.
- Cameron, Peter J. (1999). Permutation Groups. Cambridge: Cambridge University Press. ISBN 978-0-521-65378-7.
— A fascinating introduction to permutation groups, including the Pólya cycle index, oligomorphic permutation groups, and the connection with mathematical logic.
- Bergeron, F.; Labelle, G.; Leroux, P. (1998). Combinatorial species and tree-like structures. Cambridge: Cambridge University Press. ISBN 978-0-521-57323-8.
— A fairly readable monograph, which may leave the incorrect impression that this seemingly abstract approach has no practical advantages over more conventional approaches to generating function techniques.
- Sturmfels, Bernd (1992). Algorithms in Invariant Theory. New York: Springer-Verlag. ISBN 978-0-387-82445-1.
— A delightful monograph explaining the application of Gröbner bases to the invariant theory of finite groups, including Schur polynomials and an approach to Galois theory using invariants.
- Cox, David; Little, John; O'Shea, Donal (1992). Ideals, Varieties, and Algorithms. New York: Springer-Verlag. ISBN 978-0-387-97847-5.
— A beautiful introduction to algebraic geometry using Gröbner bases, including a chapter on invariants of finite groups.
- Tucker, Alan (1980). Applied Combinatorics, 5/e. New York: Wiley. ISBN 978-0-471-73507-6.
— One of the most elementary and readable textbooks discussing Pólya's enumeration formula and cycle index polynomials.