Formal Summation and Dirichlet L-functions

Recall the classical Riemann zeta function:

\zeta(s) = \sum_{n\geq 1}\frac{1}{n^s}

and the Dirichlet L-functions for a character \chi: \mathbb Z \to \mathbb C:

L(s,\chi) = \sum_{n\geq 1}\frac{\chi(n)}{n^s},

defined for \Re(s) > 1.  These functions can be analytically continued to the entire complex plane (except for a pole at s=1 in the case of \zeta(s)). In particular, the values at non-positive integers carry great arithmetic significance and enjoy many remarkable properties.

For instance, L(-k,\chi) is an algebraic number and is in fact equal to -B_{k+1,\chi}/(k+1), where the B_{k,\chi} are the generalized Bernoulli numbers. Moreover, these values satisfy p-adic congruences and integrality properties such as the Kummer congruences (and their generalizations to the L-functions).

The standard proof of these proceeds by showing that L(s,\chi) satisfies a functional equation that relates L(1-s,\chi) to L(s,\overline{\chi}) and then computing the values L(n,\chi) for n \geq 1 integral using analytic techniques (such as Fourier analysis).

The p-adic properties are then proved by working directly with the definition of the generalized Bernoulli numbers instead of the L-functions. However, the L-functions are clearly the fundamental objects here, and it would be nice to have a way to work directly with the values at negative integers (without using the functional equation).

One might be tempted to extend the series definition to the negative integers and say:

\zeta(-k)  "="  1^k + 2^k + 3^k + \dots

For instance, consider the following (bogus) computation:

\zeta(0)\frac{t^0}{0!} = 1^0\frac{t^0}{0!} + 2^0\frac{t^0}{0!} + \dots

\zeta(-1)\frac{t^1}{1!} = 1^1\frac{t^1}{1!} + 2^1\frac{t^1}{1!} + \dots

\zeta(-2)\frac{t^2}{2!} = 1^2\frac{t^2}{2!} + 2^2\frac{t^2}{2!} + \dots

. . .

and let us “sum” the columns first:

\sum_{k\geq 0}\zeta(-k)\frac{t^{k}}{k!} = e^{t} + e^{2t} + \dots = \frac{e^t}{1 - e^t}

which, remarkably enough, is the right generating function for \zeta(-k) (once the polar term -1/t is discarded)! We have exchanged the order of two divergent summations and ended up with the right answer.
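As a sanity check on this formal computation, here is a minimal sketch in Python. It expands e^t/(1 - e^t), the formal sum of the geometric series e^t + e^{2t} + \dots, as a Laurent series using exact rational arithmetic and reads off the candidate values of \zeta(-k) from the coefficient of t^k/k!. The function names are my own and purely illustrative. The polar term -1/t is simply skipped.

```python
from fractions import Fraction
from math import factorial


def series_quotient(num, den, order):
    """Coefficients of num(t)/den(t) up to t^order, assuming den[0] != 0."""
    q = []
    for n in range(order + 1):
        # Solve sum_{j<=n} q[j] * den[n-j] = num[n] for q[n].
        acc = num[n] - sum(q[j] * den[n - j] for j in range(n))
        q.append(acc / den[0])
    return q


def zeta_at_nonpositive(kmax):
    """Candidate values zeta(0), zeta(-1), ..., zeta(-kmax) from e^t/(1 - e^t).

    Write e^t/(1 - e^t) = -(1/t) * e^t / ((e^t - 1)/t), so the coefficient of
    t^k is minus the coefficient of t^(k+1) in Q(t) = e^t / ((e^t - 1)/t).
    """
    order = kmax + 1
    num = [Fraction(1, factorial(j)) for j in range(order + 1)]      # e^t
    den = [Fraction(1, factorial(j + 1)) for j in range(order + 1)]  # (e^t - 1)/t
    q = series_quotient(num, den, order)
    return [-factorial(k) * q[k + 1] for k in range(kmax + 1)]


if __name__ == "__main__":
    for k, value in enumerate(zeta_at_nonpositive(5)):
        print(f"zeta(-{k}) = {value}")
```

The first few values agree with the classical ones: \zeta(0) = -1/2, \zeta(-1) = -1/12, \zeta(-2) = 0 and \zeta(-3) = 1/120.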

In fact, it is possible to rigorously justify this procedure of divergent summation and moreover, one can use it to prove a lot of arithmetic properties of these values rather easily (like the Kummer congruence). I learnt the basic method from some lecture notes of Prof. Akshay Venkatesh here: Section 3, Analytic Class Number formula and L-functions.

I then discovered that one could use these techniques to compute the explicit values (along the outline above) and prove some more stuff. I wrote this up in an article (that also explains the basic technique and should be (almost) self contained) here: Divergent Series and Dirichlet L-functions.


Schur’s Lemma and the Schur Orthogonality Relations.

Both Schur’s Lemma and the Schur Orthogonality relations are part of the basic foundation of representation theory. However, the connection between them is not always emphasized and the Orthogonality relations are proven more computationally.

The standard proofs of the relations never made sense to me. However, there is a very direct way to derive them from Schur’s Lemma (which makes perfect sense to me!) and simple facts about projections on vector spaces. More importantly, it gives a categorical interpretation of the inner product. I think this approach should be emphasized far more than it currently is and I hope this post will go a tiny way towards fixing that.

Throughout this post, V,W will denote irreducible representations of a finite group G over a field k such that |G| \neq 0 \in k. In other words, the characteristic of our field does not divide the order of the group.  For a representation N, I will denote its character by \chi_N.

Also, Hom(V,W) will denote the representation of G on the linear maps between V and W (where g acts by f \mapsto g\circ f\circ g^{-1}), while Hom_G(V,W) will denote the vector space of linear maps that respect the G-action. In fact, Hom_G(V,W) = Hom(V,W)^G. That is, the linear maps fixed by G are precisely the maps that respect the G-action, as can be easily verified.

Recall that the Schur orthogonality relations state the following:

Theorem 1 [Schur Orthogonality Relations]:

\langle\chi_V,\chi_W \rangle := \frac{1}{|G|}\sum_{g\in G}\overline{\chi_V(g)}\chi_W(g) = \begin{cases} 1 & V \cong W\\ 0 & V \not\cong W \end{cases}         


It expresses the inner product of the characters of two irreducible representations simply in terms of whether or not the two representations are isomorphic. We do not lose anything by restricting to irreducible representations since the general case follows from the bilinearity of the inner product.

As mentioned above, we will prove this using just Schur’s Lemma and a simple observation about projections of vector spaces. Recall Schur’s Lemma:

Lemma 1 [Schur’s Lemma]:

Hom_G(V,W) = \begin{cases} k & V \cong W\\ 0 & V \not\cong W \end{cases}

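Proof: The proof of this is very simple and follows from the idea that the kernel and image of a map between representations are themselves representations. Since V,W were assumed to be irreducible, Schur’s lemma pops out immediately. (For the case V \cong W, one also uses that k is algebraically closed, so that any G-endomorphism of V has an eigenvalue.)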


Note the remarkable similarity with the orthogonality relations. The second observation we will need is the following:

Lemma 2: For any representation M of G, the operator P = \frac{1}{|G|}\sum_{g\in G}g from M \to M is in fact a projection of M onto M^G, the subspace fixed by G. Also, for any projection P, the trace of P is equal to the dimension of its image.

Proof: The proof is again very easily verified. One simply has to check that P fixes M^G and that the image is fixed by G. The second statement can be proved by taking a basis of M^G and extending it to M.
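To see Lemma 2 in action, here is a small sketch in Python for one example of my own choosing: the symmetric group S_n acting on k^n by permutation matrices (with k = \mathbb Q). The helper names are hypothetical. The code builds the averaging operator P, checks that P^2 = P, and computes its trace, which should be the dimension of the fixed subspace (the line spanned by the all-ones vector).

```python
from fractions import Fraction
from itertools import permutations


def perm_matrix(p):
    """Matrix sending the basis vector e_j to e_{p[j]}."""
    n = len(p)
    return [[1 if p[j] == i else 0 for j in range(n)] for i in range(n)]


def averaging_operator(n):
    """P = (1/|G|) * sum of g over G = S_n acting on Q^n by permutation matrices."""
    group = list(permutations(range(n)))
    P = [[Fraction(0)] * n for _ in range(n)]
    for p in group:
        M = perm_matrix(p)
        for i in range(n):
            for j in range(n):
                P[i][j] += Fraction(M[i][j], len(group))
    return P


def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def trace(A):
    return sum(A[i][i] for i in range(len(A)))


if __name__ == "__main__":
    P = averaging_operator(3)
    print("P is a projection:", matmul(P, P) == P)
    print("trace(P) =", trace(P))
```

For n = 3 every entry of P is 1/3, so P is the projection onto the line of constant vectors and its trace is 1.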

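As simple as the above two lemmas are, they are all we will need. Let us apply the second observation taking M = Hom(V,W), so that M^G = Hom_G(V,W), and take the trace of the projection operator P. On one hand, Lemma 2 says that the trace of P is \dim Hom_G(V,W). On the other hand, since Hom(V,W) \cong V^*\otimes W as representations, its character is \overline{\chi_V}\chi_W and so the trace of P is \frac{1}{|G|}\sum_{g\in G}\overline{\chi_V(g)}\chi_W(g) = \langle\chi_V,\chi_W\rangle. Comparing the two and applying Schur’s Lemma proves Theorem 1.

In fact, we have established the following general theorem (not requiring that V,W be irreducible any longer):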

Theorem 2:

\langle \chi_V,\chi_W\rangle = \dim Hom_G(V,W)

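Our proof shows it for irreducible representations, but note that both sides are additive in V and in W with respect to direct sums, so the general case follows by decomposing V and W into irreducibles.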
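A quick numerical illustration of Theorems 1 and 2, using the character table of S_3 (a standard example that I am supplying here, along with the hypothetical helper `inner`). Since all the characters involved are real, the complex conjugate is omitted in the code.

```python
from fractions import Fraction

# Conjugacy classes of S_3: identity, transpositions, 3-cycles.
CLASS_SIZES = [1, 3, 2]

# Character values on those classes.
CHARACTERS = {
    "trivial": [1, 1, 1],
    "sign": [1, -1, 1],
    "standard": [2, 0, -1],
}

# Permutation representation on 3 points (number of fixed points).
# It decomposes as trivial + standard.
PERMUTATION = [3, 1, 0]


def inner(chi, psi):
    """<chi, psi> = (1/|G|) * sum over g of chi(g) * psi(g), for real characters."""
    order = sum(CLASS_SIZES)
    return sum(Fraction(size * a * b, order)
               for size, a, b in zip(CLASS_SIZES, chi, psi))


if __name__ == "__main__":
    for name_v, chi_v in CHARACTERS.items():
        for name_w, chi_w in CHARACTERS.items():
            print(name_v, name_w, inner(chi_v, chi_w))
    # dim End_G of the permutation representation
    print("permutation, permutation:", inner(PERMUTATION, PERMUTATION))
```

The irreducible characters come out orthonormal, while the permutation representation has inner product 2 with itself. This matches \dim Hom_G(M,M) = 2 for M the sum of two non-isomorphic irreducibles.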


Compared to the standard proof (in say Serre), this proof is very conceptual and factors the orthogonality relations into Theorem 2 and Schur’s Lemma. In fact, Theorem 2 is of great interest by itself. For instance, one can prove the Frobenius Reciprocity theorem extremely easily from what we have shown.

The proof runs as follows: Let f: H \to G be a map of groups. Note that representations of H,G over k are the same as modules over k[H],k[G]. Let f^*: Rep_H \to Rep_G be the induced representation functor and f_*: Rep_G \to Rep_H be the restriction functor.

It is easy to show that in fact f^*,f_* correspond to the usual pullback and pushforward on modules and the usual adjunction (Tensor-Hom in this case) shows:

Hom_G(f^*V,W) = Hom_H(V,f_*W) .

Taking dimensions on both sides and applying Theorem 2, we end up with:

\langle f^*\chi_V, \chi_W\rangle = \langle \chi_V,f_*\chi_W \rangle

which is one way of writing the Frobenius reciprocity theorem.
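Here is a small check of this identity in Python, for an example of my own choosing: G = S_3 and H the subgroup of order 2 generated by a transposition, with f the inclusion. The induced character is computed from the standard formula (f^*\chi)(g) = \frac{1}{|H|}\sum_{x \in G,\ x^{-1}gx \in H}\chi(x^{-1}gx), and restriction is just evaluation on H. All names are illustrative, and the characters are real so no conjugation is needed.

```python
from fractions import Fraction
from itertools import permutations

G = list(permutations(range(3)))   # S_3, a tuple p means i -> p[i]
H = [(0, 1, 2), (1, 0, 2)]         # subgroup generated by the transposition (0 1)


def compose(p, q):
    """(p o q)(i) = p[q[i]]."""
    return tuple(p[q[i]] for i in range(len(p)))


def inverse(p):
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)


def chi_trivial(g):
    return 1


def chi_sign(g):
    inversions = sum(1 for i in range(len(g)) for j in range(i + 1, len(g))
                     if g[i] > g[j])
    return -1 if inversions % 2 else 1


def chi_standard(g):
    """Character of the 2-dimensional standard representation of S_3."""
    return sum(1 for i, image in enumerate(g) if i == image) - 1


def induce(chi_H):
    """Induced character from H to G via the conjugation formula."""
    def chi_G(g):
        total = 0
        for x in G:
            conjugate = compose(inverse(x), compose(g, x))
            if conjugate in H:
                total += chi_H(conjugate)
        return Fraction(total, len(H))
    return chi_G


def inner(group, chi, psi):
    return sum(Fraction(chi(g) * psi(g)) for g in group) / len(group)


if __name__ == "__main__":
    for chi_H in (chi_trivial, chi_sign):
        for chi_W in (chi_trivial, chi_sign, chi_standard):
            lhs = inner(G, induce(chi_H), chi_W)   # <f^* chi_V, chi_W> on G
            rhs = inner(H, chi_H, chi_W)           # <chi_V, f_* chi_W> on H
            print(chi_H.__name__, chi_W.__name__, lhs, rhs)
```

In each case the two inner products agree, as Frobenius reciprocity predicts.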


Closing Remarks:

Ultimately, I think the value of this approach is twofold:

  1. This proof clarifies that what we should really be interested in is Hom_G(V,W).
  2. We have given a categorical interpretation to the inner product. That is, the inner product is the dimension of the Hom-set in the category of representations of G. Even more, Hom(V,W) is itself a representation and therefore the category of representations is enriched over itself. This eventually leads to the idea of Tannakian categories.


Noether Normalization, Spreading out and the Nullstellensatz.

Hilbert’s Nullstellensatz plays a central role in algebraic geometry. It can be seen as the fundamental link between the modern theory of schemes and the classical theory of algebraic varieties over fields. Since this is one of the first results a novice in algebraic geometry learns and is often proved very algebraically, one often does not gain a good understanding of the proof till much later.

I would like to fix my own understanding of the result and its geometric nature in this post. I will go through a few proofs of the theorem and point out the geometric ideas behind them. The proof of Hilbert’s Nullstellensatz is usually broken up into the following two steps: 1) Prove the weak Nullstellensatz and 2) Derive the strong Nullstellensatz using the Rabinowitsch trick or other means. I will be focusing solely on the first step in this post. Nothing here is new except perhaps the presentation and the mistakes.

The weak Nullstellensatz is a statement about solving polynomial equations in multiple variables over a field. The one variable version of the problem is well understood (think Galois theory) and says that any polynomial f(x) over a field k will have all its solutions in some finite extension of k. The Nullstellensatz says that this result propagates to multiple variables. That is:


Theorem 1 [Weak Nullstellensatz]: Let f_1(x_1,\dots,x_n), \dots, f_m(x_1,\dots,x_n) be a set of polynomials in R = k[x_1,\dots,x_n]. Then exactly one of the following is true:

  1. There exist polynomials Q_1,\dots,Q_m \in R such that \sum_{i}f_iQ_i = 1.
  2. There exist values (a_1,\dots,a_n) \in \overline{k}^n such that f_i(a_1,\dots,a_n) = 0 for all i = 1,2,\dots,m.

If both conditions were simultaneously true, you could derive a contradiction by plugging in the (a_1,\dots,a_n) into the equation in (1). Therefore, it suffices to assume that 1) is false and prove 2). This leads to the following reformulation.


Theorem 1.2: Let I = (f_1,f_2,\dots,f_m) be the ideal generated by the polynomials in R = k[x_1,\dots,x_n]. If I is a proper ideal, then there is a finite field extension k' of k and a homomorphism of k-algebras A = R/I \to k'.

This reformulation is equivalent to Theorem 1 and furthermore, by replacing I by a maximal ideal \mathfrak m containing it, we can assume that A is a finitely generated field over k.

One can think of this statement as saying that any maximal ideal of R = k[x_1,\dots,x_n] is in fact of the form (x_1-a_1,\dots,x_n-a_n) after base changing to the algebraic closure and therefore, the closed points of a finite type scheme over a field are precisely the points of a corresponding algebraic variety. This is the connection between scheme theory and classical algebraic geometry that I alluded to above.

We can also compare this statement to the one-dimensional case and interpret the weak Nullstellensatz as saying that any solutions to a system of polynomial equations over a field in fact lie in an algebraic extension of the field.

We can make one final simplification before moving on to proofs. It is easily seen that it is sufficient to prove the theorem for algebraically closed fields since we can then recover the above formulation by passing to the algebraic closure of k. Therefore, from now on, I will assume that k is algebraically closed. What will be important for us is that k is infinite and perfect.

Of the above formulations, the most important to us will be the idea that we want to locate rational points (in the algebraically closed case) on finite type schemes over a field. There is a common strategy to all our proofs that runs so:

Let X be the given finite type k-scheme on which we want to find a point with values in an algebraic closure. The idea then is to find a Y that receives a finite map \pi: X \to Y with a rational point in the image. Usually, Y will be some well understood variety like affine space or some open subset of it.

Given such a map, we simply take a rational point on the base and base change X to it. The fiber will be finite over k and hence the residue fields of its closed points will be finite field extensions of k. Each such closed point is simultaneously a point of X. This point on X is what we were looking for all along!


Noether Normalization


One of the standard ways to prove the theorem is to use something called Noether Normalization, whose proof is also often couched in commutative algebra. I will give a geometric interpretation of the proof here and use it to quickly prove the Nullstellensatz.


Theorem 2 [Noether Normalization]: Let A be a finitely generated ring over the field k and let X = Spec A be the corresponding scheme. Then, there is a finite, surjective map \pi: X \to \mathbb A^d_k. Note that this forces d = \dim X.

Proof: We can embed X in some \mathbb A^n (essentially, pick generators for A). Embed \mathbb A^n further into \mathbb P^n in the standard way and let \overline X be the projective closure of X.

We can assume that X is not all of \mathbb A^n since otherwise take n = d and \pi to be the embedding map. Therefore, \overline X is not all of \mathbb P^n and we can find a rational point P at infinity that is not on \overline X. (Exercise! Hint: One way is to homogenize the explicit defining equations of X). Similarly, we can find a rational hyperplane H (i.e., defined over k) in \mathbb P^n that does not contain P.

Project \overline X to H through P. The map is proper and hence has closed image. Since P was chosen to be at infinity, this restricts to a map f: X\to \mathbb A^n \cap H \cong \mathbb A^{n-1}. If the map is surjective, take \pi to be f, otherwise inductively continue the process till it is surjective.

The map f is finite since it is quasi-finite and proper and therefore, we have established that there is some finite map \pi: X \to \mathbb A^d.

Remark 0: The entire content of the theorem is in ensuring that the finite map is surjective. Otherwise, one could simply take a closed embedding into some affine space!

Remark 1: If one writes out the maps in the above proof explicitly, you recover the usual algebraic proof with some minor changes (and linear substitutions throughout). There is of course a fair bit of choice involved with the point P and the hyperplane H but one also has to choose P (essentially) in the standard proof. Taking H to be as simple as possible makes the computation easy.

Remark 2: Our life above was simplified by assuming k to be infinite. If k is finite, the only change to be made is to take a Veronese embedding of \mathbb P^n first before choosing P and H. This corresponds to taking a hypersurface of high degree instead of a hyperplane for H.
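To make the projection in the proof concrete, here is a small worked example of my own choosing (it is not taken from any particular reference): the hyperbola xy = 1 in \mathbb A^2. The coordinate projection to the x-axis is not finite, but projecting from a suitable point at infinity is.

```latex
% Worked example (illustrative choice): X = V(xy - 1) \subset \mathbb{A}^2_k.
% The coordinate projection (x,y) \mapsto x is not finite: its image misses x = 0.
\overline{X} = V(xy - z^2) \subset \mathbb{P}^2, \qquad
\overline{X} \cap \{z = 0\} = \{[1:0:0],\ [0:1:0]\}.
% Choose P = [1:1:0], a rational point at infinity that is not on \overline{X}.
% On \mathbb{A}^2, projection from P is the linear map (x,y) \mapsto u := x - y.
% Substituting x = u + y into xy - 1 gives a polynomial that is monic in y:
(u + y)\,y - 1 \;=\; y^2 + u\,y - 1 \;=\; 0 .
% Hence y is integral over k[u], and so is x = u + y. The map
\pi : X \to \mathbb{A}^1, \qquad (x,y) \mapsto x - y
% is finite and surjective, with d = 1 = \dim X.
```

The substitution x = u + y is exactly the kind of linear change of variables that appears in the usual algebraic proof.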


Deriving the Nullstellensatz: Proving the weak Nullstellensatz is a very short road from here and can be done in a couple of slightly different ways. Recall that R = k[x_1,\dots,x_n].

First: Suppose L = R/\mathfrak m is a field that is finitely generated as a ring over k. Then, by Noether normalization, we can find a finite map Spec L \to \mathbb A^d where d = \dim Spec L. However, the dimension of the spectrum of any field is 0 and so d = 0, \mathbb A^d = Spec k and L is finite over k as required.


Second: More in line with the outlined strategy, one can proceed so: Let A = R/I be a finitely generated ring over k. By Noether normalization, we can find a finite map k[y_1,\dots,y_d] \to A.

Since our finite map is surjective, take any rational point on the base \mathbb A^d and let X' be the fiber of X over this point. Since finiteness is maintained under pull backs, X' is in fact finite over the field k and, being non-empty, its points correspond to finite field extensions of k.

That is, there is a solution of the polynomials defining X in a finite field extension of k: to be precise, in the residue field of a point of X'.

Unwinding the proof, we are essentially doing the following: Since the x_i are finite over k[y_1,\dots,y_d], they each satisfy some monic polynomial with coefficients in k[y_1,\dots,y_d]. We can substitute in values of k for the variables y_1,\dots,y_d and end up with a system of monic equations for the x_i over k.

Crucially, these equations will necessarily have a common solution in A since the normalization map was surjective. However, since the fibers are finite, the solutions will all lie in some finite extension as required!


Remark: This last proof is really constructive in the sense that one can follow the procedure to find solutions for the system of polynomial equations we started with. It depends of course on Noether normalization being constructive but our proof is easily seen to provide an algorithm.



Transcendence basis and Noether Normalization


We do not in fact need the entire power of Noether normalization to prove the Nullstellensatz. One can make do with a suitable weaker version that only applies to field extensions and is in fact equivalent to the existence of a transcendence basis for a finitely generated field extension.

Recall that for a field k, all finitely generated field extensions L/k can be broken up in the form k \to k(y_1,\dots,y_d) \to L where the y_1,\dots, y_d are algebraically independent over k and L/k(y_1,\dots,y_d) is algebraic. Elements y_1,\dots,y_d in the above decomposition are called a transcendence basis.

Further, if k is perfect, then we can choose the transcendence basis so that L/k(y_1,\dots,y_d) is separable and so, by the primitive element theorem, we can in fact write L = k(y_1,\dots,y_d)[t]/(f(t)) where f(t) is a monic polynomial over k(y_1,\dots,y_d). Recall our standing assumption that k is perfect.

This can be seen as a (very) weak version of Noether normalization in the following way: We can find an affine algebraic variety X/k such that its function field K(X) = L. Then, the existence of a transcendence basis amounts to saying that the generic point of X maps to the generic point of Spec k[y_1,\dots,y_d] such that the resulting field extension is finite.

In fact, this isn’t that far off from Noether normalization for the following reason. We can spread out the map on generic points to a map over some open subset of affine space in the following way:

Consider the coefficients of f(t) (as defined a couple of paragraphs above). They are a finite set of rational functions in the variables y_1,\dots,y_d. We can assume they have a common denominator h(y_1,\dots,y_d) \in k[y_1,\dots,y_d]. Then, let R = k[y_1,\dots,y_d,1/h(y)] and S = R[t]/(f(t)). It is easy to see that S is a model for L over R in the sense that S\otimes_R k(y_1,\dots,y_d) = L. It is equally clear that S continues to be finite over R.

Since R corresponds to an open subset of Spec k[y_1,\dots,y_d] = \mathbb A^d and S to a variety with function field L, we might as well take X = Spec S. Thus, we have found a finite surjective map from X onto an open subscheme of \mathbb A^d as promised.
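A tiny example of spreading out may help. The specific field L below is my own illustrative choice, with d = 1 and a single denominator to invert.

```latex
% Illustrative example: d = 1 and L = k(y)[t]/(t^2 - 1/y).
f(t) = t^2 - \frac{1}{y}, \qquad h(y) = y, \qquad
R = k[y, 1/y], \qquad S = R[t]/(t^2 - 1/y).
% S is a free R-module with basis \{1, t\}, hence finite over R, and
S \otimes_R k(y) \;=\; k(y)[t]/(t^2 - 1/y) \;=\; L .
% Note that t is not integral over k[y] itself, which is why h must be inverted.
% Geometrically, Spec S \to Spec R = \mathbb{A}^1 \setminus \{0\} is finite and surjective.
% The fiber over a rational point y = a with h(a) = a \neq 0 is
S \otimes_R R/(y - a) \;=\; k[t]/(t^2 - 1/a).
```

The fiber over y = a is finite over k, which is the kind of specialization used in the next section.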

We can use these ideas to prove the Nullstellensatz.


The Nullstellensatz once again


Let A = k[x_1,\dots,x_n]/\mathfrak m be a finitely generated ring over k that also happens to be a field. By our above result on transcendence basis, there is an isomorphism A \cong k(y_1,\dots,y_d)[t]/(f(t)) = L.

Now, L is the union (=direct limit) of rings of the form k[y_1,\dots,y_d,1/h(y)][t]/(f(t)) for appropriate h(y) \in k[y_1,\dots,y_d] and f(t) a monic polynomial over k[y_1,\dots,y_d,1/h(y)]. This follows directly from our discussion on spreading out in the last section since any function field is the union of the rings corresponding to distinguished open sets.

Since A is finitely generated over k, we can find some h_0(y) such that all the x_i map to S = k[y_1,\dots,y_d,1/h_0(y)][t]/(f(t)). However, since k is infinite (recall our standing assumption), we can find (a_1,\dots,a_d) \in k^d such that h_0(a_1,\dots,a_d) \neq 0. Thus, we can find a map S \to k[t]/( f(t,a_1,\dots,a_d) ) = k' where we evaluate f(t) at the values y_i = a_i.

However, this tells us that there is a map A \to S \to k' which forces A to itself be a finite field extension of k (since k' is finite over k and A was assumed to be a field).


Remark: This proof has unmistakable similarities to our second proof two sections ago. The key difference is that we have substituted Noether Normalization with the existence of a transcendence basis.

Both the proofs proceed by realizing our finitely generated scheme X = Spec A as something finite over something well understood. In the previous section, we could take the base to be all of \mathbb A^d and further demand that the map be surjective. This allowed us to pick any rational point on the base to find a point on X.

However, in the proof in this section, we can only take our base to be an open subset of \mathbb A^d (in our proof, this corresponds to R = k[y_1,\dots,y_d,1/h_0(y)]) and accordingly, we need to take a point that lies in this open set. Since surjectivity is basically obtained from our construction of spreading out, this is not too hard to do if we insist our field be infinite. After this, the two proofs proceed in exactly the same way.


Doing away with Transcendence Bases


By modifying our last proof a little, we can in fact even do away with any reliance on transcendence bases. Instead, we can proceed by induction on the number of variables in our algebra.

Let A = k[x_1,\dots,x_n]/\mathfrak m be a field as usual. We want to show that A is a finite field extension of k. Equivalently, we want to show that the x_k are algebraic over k (thinking of the x_k as elements in A). We will do this by induction on n.

The case of n = 1 is tautological. So let us suppose that n \geq 2. Denote by \alpha_k the image of x_k in A. I will henceforth write A = k(\alpha_1,\dots,\alpha_n). For contradiction, we can suppose that \alpha_1 is transcendental over k.

By our inductive hypothesis, this implies that the \alpha_2,\dots,\alpha_n are algebraic over k(\alpha_1). By a similar spreading out argument as before, we can in fact suppose that the image of A is of the form k[\alpha_1,1/h(\alpha_1)](\alpha_2,\dots,\alpha_n) where h is a polynomial over k.

Therefore, we have shown that there is a finite map Spec A \to U where U is an open subscheme of \mathbb A^1_k and we can conclude the proof in a couple of ways.

We can specialize \alpha_1 to a rational value and proceed as before or alternatively, we can end with the slick observation that if a field is integral over a ring B, then B is itself a field. However, in our case B = k[\alpha_1,1/h(\alpha_1)] and this is certainly not a field!


Remark: The idea here is that any transcendental extension of k will necessarily contain k(x). This is equivalent to the first step of building a transcendence basis.

However, once we know that it contains k(x), we can spread out as before and obtain a map X \to U \subset \mathbb A^1. Since finiteness can be preserved by spreading out, we can use our induction hypothesis to establish finiteness and then proceed as before.

This proof can be surprising because we seem to get the finiteness hypothesis for free. However, considering the first non-trivial case of n = 2 and one polynomial equation, we see that this is really somehow the obvious thing to do. Treat one of the variables as constant and solve for the other variable! Then, figure out an appropriate specialization so that a solution exists. The appropriate specialization is equivalent to finding a value of x in k at which h(x) does not vanish…




The Groupoid Cardinality of Finite Semi-Simple Algebras

A groupoid is a category where all the morphisms are isomorphisms and groupoid cardinality is a way to assign a notion of size to groupoids. Roughly, the idea is that one should weigh an object inversely by the number of automorphisms it has (and we only count each isomorphic object as one object).

It is important to count only one object from each isomorphism class since we want the notion of groupoid cardinality to be invariant under equivalences of groupoids (in the sense of category theory) and every category is equivalent to its skeleton. For further motivation for the idea of a groupoid cardinality, see Qiaochu Yuan’s post on them.

This seems like quite a strange thing to do but it turns out to be quite a useful notion. One of my favorite facts about elliptic curves is that the groupoid cardinality of the supersingular elliptic curves in characteristic p is (p-1)/24! See the Eichler-Deuring mass formula.

Another interesting computation along these lines is that the groupoid cardinality of the groupoid of finite sets is e. One can ask this question of various groupoids and the answer is often interesting. I will ask it today of semi-simple finite algebras of order n. By an algebra, I will always implicitly mean commutative in this post.
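The computation for finite sets is short enough to check directly: there is one isomorphism class of sets of each size n, with n! automorphisms, so the groupoid cardinality is \sum_{n\geq 0} 1/n!. A minimal sketch in Python (the function name is hypothetical):

```python
from fractions import Fraction
from math import e, factorial


def finite_sets_cardinality(max_size):
    """Partial sum of 1/|Aut(set of size n)| = 1/n! for n = 0..max_size."""
    return sum(Fraction(1, factorial(n)) for n in range(max_size + 1))


if __name__ == "__main__":
    approx = float(finite_sets_cardinality(20))
    print(approx, "vs e =", e)
```

The partial sums converge to e very quickly, since the tail after n terms is bounded by a small multiple of 1/(n+1)!.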

By a classification theorem, we know that semi-simple finite algebras are simply products of finite fields.  In fact, if n = ab with a,b coprime and SSA(k) denotes the set of semi-simple finite algebras of order k (up to isomorphism), then SSA(n) = SSA(a)\times SSA(b) where the map simply takes algebras M,N \to M\times N.

Therefore the groupoid cardinality of SSA(n) is the product of the groupoid cardinality of SSA(a) and SSA(b). We will now restrict to the case n = p^m. Computing a few small examples, one sees that the groupoid cardinality is always 1! As surprising as this is, it is not too hard to prove. However, the proof that I know of this fact is not very illuminating (at least to me). Please let me know if there is any way to make this result seem obvious.

It will help to establish some notation about semi simple algebras of prime power order p^m. It is easy to see that these algebras correspond to (unordered) partitions of m in the following manner:

To a partition m = k_1+k_2+\dots + k_r, we associate the algebra \mathbb F_{p^{k_1}}\times \mathbb F_{p^{k_2}}\times\dots\times\mathbb F_{p^{k_r}}. In fact, let us group the identical k_i’s together and write the partition as m = \sum_{k\geq 1}s_kk where s_k is the number of times k appears, and call the corresponding algebra of type s = (s_1,s_2,\dots).

Since a finite field of order p^k has k automorphisms and there are no isomorphisms between fields of different sizes, we see that the number of automorphisms of an algebra of type s is \prod_{k\geq 1}k^{s_k}s_k!. The k^{s_k} corresponds to automorphisms of the individual fields while the s_k! corresponds to permutations of the isomorphic factors.

For instance, in the case of \mathbb F_{2}\times\mathbb F_{2}\times\mathbb F_{4}, there is an automorphism that corresponds to switching the first two factors and 2 automorphisms of the last factor giving a total of 2\times 2 = 4 automorphisms.

Thus, we need to compute the following beast:

\sum_{s}\prod_{k\geq 1}\frac{1}{k^{s_k}s_k!}                       (1)

where the sum is over partitions s = (s_1,s_2,\dots) such that \sum s_kk = m. While this might seem imposing at first sight, there is a trick familiar to people with experience in generating functions that makes the computation straightforward.

The key observation is to realize that:

\prod_{k\geq 1}(1+x^k+x^{2k} +\dots) = \sum_{r\geq 0}p(r)x^r

Here, p(r) denotes the number of partitions of r. This follows from a naive expansion of the left hand side and is a formal identity of power series. I believe the observation goes all the way back to Euler. In our case, we only need to consider a slight modification. Consider the power series:

\prod_{k \geq 1}(\sum_{i\geq 0}\frac{x^{ki}}{k^ii!}) = \prod_{k\geq 1}e^{x^k/k} = e^{-\log(1-x)} = \frac{1}{1-x}.

However, expanding the first product, one sees that the coefficient of x^m is simply our (1), which is therefore equal to 1, as we see by expanding the geometric series!
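The sum (1) is easy to check by brute force for small m. The following Python sketch (with hypothetical helper names) enumerates the partitions of m, computes the automorphism count \prod_k k^{s_k}s_k! of the corresponding algebra, and adds up the reciprocals exactly.

```python
from collections import Counter
from fractions import Fraction
from math import factorial


def partitions(m, largest=None):
    """Yield the partitions of m as non-increasing tuples of positive integers."""
    if largest is None:
        largest = m
    if m == 0:
        yield ()
        return
    for first in range(min(m, largest), 0, -1):
        for rest in partitions(m - first, first):
            yield (first,) + rest


def automorphism_count(partition):
    """prod over k of k^{s_k} * s_k!, where s_k is the multiplicity of k."""
    count = 1
    for k, s_k in Counter(partition).items():
        count *= k ** s_k * factorial(s_k)
    return count


def groupoid_cardinality(m):
    """Sum of 1/|Aut| over semi-simple algebras of order p^m, one per partition of m."""
    return sum(Fraction(1, automorphism_count(p)) for p in partitions(m))


if __name__ == "__main__":
    # The example F_2 x F_2 x F_4 corresponds to the partition 2 + 1 + 1.
    print(automorphism_count((2, 1, 1)))
    print([groupoid_cardinality(m) for m in range(1, 9)])
```

One way to read the result: \prod_k k^{s_k}s_k! is also the order of the centralizer of a permutation of cycle type s in the symmetric group S_m, so each term of (1) is the fraction of permutations in S_m with that cycle type, and these fractions add up to 1.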

The Weak Mordell-Weil Theorem

Let A be an abelian variety over a field K. A basic object to investigate is the group A(K). Let K be a number field. In light of the Birch and Swinnerton-Dyer conjecture and its relation to the class number formula, one should think of A(K) as the analog of the class group of a global field.

Thus, one might conjecture some finiteness properties of this group. It is not true that A(K) is finite, as can be seen by looking at some examples of elliptic curves, but it is true that A(K) is finitely generated as an abelian group and this is the content of the Mordell-Weil Theorem.

The proof is usually broken up into two parts:

  • Weak Mordell-Weil Theorem: A(K)/nA(K) is finite for any integer n.
  • Descent using a Height Function: Deduce the full theorem from the above using a measure of size on the points of A(K).

I will focus on the first part in this section and prove it in a motivated (but sophisticated) fashion. This proof will also differ from the standard proofs in trading classical algebraic number theory results (finiteness of the class group and finite generation of the unit group) for class field theory. This will greatly simplify the second half of the proof.

I will make free use of general theory about Abelian Varieties, Algebraic geometry and Galois Cohomology. The point of this post is not to fill in the details but to show a framework that makes the proof seem natural.

The Kummer Sequence:

Recall that K is a number field and A is an abelian variety over it. I will denote by A[n] the kernel of the multiplication by n map on A. I will often identify A,A[n] with the \overline K points of A,A[n] respectively. Also denote by G_K the Galois group of \overline K over K.

Now, we can consider the exact sequence of G_K modules:

0 \to A[n](\overline K) \to A(\overline K) \xrightarrow{\times n} A(\overline K) \to 0

Taking Galois invariants, we get the long exact sequence:

0\to A[n](K) \to A(K) \xrightarrow{\times n} A(K) \to H^1(G_K, A[n](\overline K)) \to \dots

and truncating the sequence, we have an injection:

0 \to A(K)/nA(K) \to H^1(G_K,A[n](\overline K)).

So, we now see that to show that A(K)/nA(K) is finite, we simply need to show that H^1(G_K, A[n](\overline K)) is finite. Unfortunately, this is not true! However, we will show that A(K)/nA(K) actually lands in a subgroup of the cohomology group and this subgroup can easily be shown to be finite.


Ramification of Galois Cohomology groups:


A little more precisely, let S be a finite set of primes of K and G_{K,S} the Galois group of the maximal extension of K that is unramified away from S. In other words, for a prime \mathfrak p not in S with inertia group I_{\mathfrak p} \subset G_K, I_{\mathfrak p} is in the kernel of the quotient map G_K \to G_{K,S} and this characterizes G_{K,S}.

We want to show that there is a finite set of primes S such that the image of A(K)/nA(K) lands in H^1(G_{K,S}, A[n](\overline K)). (Note that this is a canonical subgroup of H^1(G_{K}, A[n](\overline K)) via the inflation map, provided S is large enough that the action of G_K on A[n](\overline K) factors through G_{K,S}.) This is by far the hardest part of the proof:


Controlling the image of the boundary map:


To do this, let us examine the first boundary map in the Kummer sequence more closely. For an element x \in A(K), the boundary map takes x to a cocycle f_x in the following way:

Pick a y \in A(\overline K) such that ny=x and define f_x(g) = gy - y. One can check that this is independent of the choice of y and that it is indeed a cocycle.

Now we see that for the image to land in H^1(G_{K,S}, A[n](\overline K)), we need to be able to find, for each x \in A(K), a y \in A(\overline K) with ny = x such that the inertia groups at primes away from S fix y. Equivalently, we want y to lie in an extension of K unramified away from S. However, this follows immediately from the following observation:

Here we take S to contain the primes of bad reduction of A and the primes dividing n. Away from S, the multiplication by n map on a (smooth) model of A over \mathcal O_{K,v} is unramified. This can be checked over the special and generic fiber of \mathcal O_{K,v} since the multiplication by n map acts on the tangent spaces by literally multiplying by n (which is a unit away from S).

In particular, the fiber of a point P in A(K)  lies in an extension unramified outside of S and the fiber consists precisely of points Q such that nQ = P. This is precisely what we were required to prove.

We are done with the harder part and only need to show:


Finiteness of H^1(G_{K,S}, A[n](\overline K)):


Since there are only finitely many points in A[n](\overline K) and A[n] is etale away from S, we can find a finite extension L of K, unramified away from S, over which all of these points are defined, so that G_{L,S} acts trivially on A[n](\overline K). That is, H^1(G_{L,S}, A[n](\overline K)) = Hom(G_{L,S}, A[n](\overline K)). Moreover, this is finite by class field theory since we are looking for abelian extensions of L with bounded degree and ramification only at a finite number of predetermined points (and hence bounded ramification at these points).

Now, the rest of the proof follows easily on considering the inflation-restriction sequence:

 0 \to H^1(Gal(L/K), A[n](\overline K)) \to H^1(G_{K,S}, A[n](\overline K)) \to H^1(G_{L,S}, A[n](\overline K))

The first group is finite since both the group and the module are finite, while the third group is finite as discussed above. This establishes finiteness in the middle as required.



Closing Thoughts:


I find that this proof clarifies the role of class field theory versus the Elliptic Curve machinery. Furthermore, the usual proofs proceed by using both the finiteness of the class group and finite generation of the unit group in an essential way. This makes the analysis fairly complicated but it does have the advantage of not using any class field theory.

The proof here instead trades some complexity by using class field theory instead of more elementary algebraic number theory. This makes the final part of the proof almost trivial and the idea clearer.



Congruent Numbers and Elliptic Curves

A congruent number n is a positive integer that is the area of a right triangle with rational side lengths. In equations, we are required to find positive rational numbers a,b,c such that:

\displaystyle a^2+b^2 = c^2    and    \displaystyle n = \frac12 ab.                       (1)

The story of congruent numbers is a very old one, beginning with Diophantus. The Arabs and Fibonacci knew of the problem in the following form:

Find three rational numbers whose squares form an arithmetic progression with common difference k.

This is equivalent to finding integers X,Y,Z,T with T\neq 0 such that Y^2 - X^2 = Z^2 - Y^2 = kT^2, which reduces to finding a right triangle with rational sides

\displaystyle \frac{Z+X}{T}, \frac{Z-X}{T}, \frac{2Y}{T}

with area k. This is the congruent number problem for k. The Arabs knew several examples of congruent numbers and Fibonacci asserted that no square is a congruent number. Since we can scale triangles to assume that n is square free, this is equivalent to saying that 1 is not a congruent number.
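As a small illustration of this dictionary, here is a sketch in Python with exact rational arithmetic. The three rational numbers are taken to be X/T, Y/T, Z/T, and the function name triangle_from_progression is chosen here for illustration only. The example uses the squares 1/4, 25/4, 49/4, whose common difference is 6.

```python
from fractions import Fraction


def triangle_from_progression(X, Y, Z, T):
    """Sides of the right triangle attached to the squares
    (X/T)^2, (Y/T)^2, (Z/T)^2 in arithmetic progression."""
    a = Fraction(Z + X, T)
    b = Fraction(Z - X, T)
    c = Fraction(2 * Y, T)  # hypotenuse
    return a, b, c


# (1/2)^2, (5/2)^2, (7/2)^2 form an arithmetic progression.
X, Y, Z, T = 1, 5, 7, 2
k = Fraction(Y * Y - X * X, T * T)           # common difference
a, b, c = triangle_from_progression(X, Y, Z, T)
area = a * b / 2

print(k, (a, b, c), area)
```

In this example the triangle that comes out is the familiar one with sides 4, 3, 5, and its area equals the common difference of the progression.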

As with many other problems in number theory, the proof of this statement had to wait four centuries for Fermat. The problem led Fermat to discover his method of infinite descent.

In more recent times, the problem has been fruitfully translated into one about elliptic curves. We perform a rational transformation of the defining equations (1) for a congruent number in the following way. Set x = n(a+c)/b and y = 2n^2(a+c)/b^2. A calculation shows that:

\displaystyle y^2 = x^3 - n^2x.                                          (2)

and y \neq 0. Indeed, if y = 0, then a=-c and b = 0, but then n = \frac12 ab = 0. Conversely, given x,y satisfying (2) with y \neq 0, we find a = (x^2-n^2)/y, b = 2nx/y and c = (x^2+n^2)/y, and one can check that these numbers satisfy (1) (after replacing a, b, c by their absolute values if necessary).
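The two substitutions are easy to test on an example. The sketch below uses the standard form of the maps, namely x = n(a+c)/b, y = 2n^2(a+c)/b^2 in one direction and a = (x^2-n^2)/y, b = 2nx/y, c = (x^2+n^2)/y in the other. The function names are chosen here for illustration.

```python
from fractions import Fraction


def triangle_to_point(a, b, c, n):
    """Map a right triangle (a, b, c) of area n to a point on y^2 = x^3 - n^2 x."""
    a, b, c = Fraction(a), Fraction(b), Fraction(c)
    x = n * (a + c) / b
    y = 2 * n * n * (a + c) / (b * b)
    return x, y


def point_to_triangle(x, y, n):
    """Map a point (x, y) with y != 0 on y^2 = x^3 - n^2 x back to a triangle."""
    x, y = Fraction(x), Fraction(y)
    a = (x * x - n * n) / y
    b = 2 * n * x / y
    c = (x * x + n * n) / y
    return a, b, c


def on_curve(x, y, n):
    """Check equation (2) exactly."""
    return y * y == x ** 3 - n * n * x


# The 3-4-5 triangle has area 6.
x, y = triangle_to_point(3, 4, 5, 6)
print((x, y), on_curve(x, y, 6), point_to_triangle(x, y, 6))
```

For the 3-4-5 triangle this gives the point (12, 36) on E_6, and the inverse map recovers the triangle.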

The projective closure of (2) defines an elliptic curve that we will call E_n. We are interested in finding rational points on it that do not satisfy y=0. I will prove that n is a congruent number precisely when E_n has positive rank.

The proof is an interesting use of Dirichlet’s Theorem on Arithmetic Progressions and some neat ideas about Elliptic Curves and their reductions modulo primes. I will essentially assume the material in Silverman’s first book and the aforementioned Dirichlet’s Theorem.

So far, we know that n is a congruent number if and only if E_n has rational points (x,y) with y \neq 0. Recall that for an elliptic curve in the standard Weierstrass form (as in (2)), y = 0 if and only if (x,y) is a 2-torsion point.

Therefore, our problem reduces to showing that the only torsion of E_n is 2-torsion for all n: the rational points with y \neq 0 are then exactly the points of infinite order. In fact, we always have non-trivial 2-torsion and the points are given by (0,0), (n,0),(-n,0) and the point at infinity.

Denote the m-torsion by E_n[m]. The rough outline of the proof is as follows:

  1. E_n[m] maps injectively into the reduction of E_n modulo a prime p for all but finitely many primes.
  2. The number of \mathbb F_p points of E_n is independent of n and equal to p+1 whenever p \equiv 3 \pmod 4 (and p does not divide 2n).
  3. This would imply that m|p+1 for a set of primes p of density 1/2, but by Dirichlet’s Theorem on Arithmetic Progressions, the set of such primes has density 1/\varphi(m), which is less than 1/2 as soon as \varphi(m) > 2 (in particular for every m > 6).

Step 1:

Recall that E_n[m](\mathbb Q) is finite (indeed, E_n[m](\overline{\mathbb Q}) has exactly m^2 elements). Also, recall that if (x_1,y_1,z_1) and (x_2,y_2,z_2) are two points in the projective plane (over any field k), then they are equal if and only if

\displaystyle x_1y_2-x_2y_1, x_1z_2-x_2z_1, y_1z_2-y_2z_1    (3)

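are all 0. Thinking of our E_n as embedded in the projective plane, reduction modulo p is simply reduction of each of the co-ordinates (after scaling them to be coprime integers). Therefore, if we pick any finite set of \mathbb Q points on E_n, then for each pair of distinct points at least one of the quantities in (3) is a non-zero integer, and for any prime p greater than all of the prime divisors of these non-zero integers, the reductions of the points remain distinct.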

Since E_n[m] is a group, this is sufficient to show that the reduction map is injective for all but finitely many primes p (once we fix m).

Step 2:

Fix a prime p \equiv 3 \pmod 4 not dividing 2n. One could prove this by an explicit calculation involving quadratic characters, but there is a neater way assuming some knowledge about the endomorphism ring R_n = \mathrm{End}_{\overline{\mathbb F}_p}(E_n) of the reduction of E_n. The relevant facts are the following:

  1. Over a finite field \mathbb F_q, the endomorphism ring is either an order in a quadratic imaginary field or an order in a quaternion algebra.
  2. Further, over \mathbb F_p with p>5, the latter case occurs precisely when the number of elements on the curve is p+1 and the curve is called supersingular.
  3. In either case, for any endomorphism f, there is a dual \hat f such that N(f) =  f\circ\hat f is multiplication by the degree of f, which is always non-negative.

A few words about the above statements: They are in fact true for all elliptic curves and not just E_n. A reference for all of the above is the chapter on Finite Fields in Silverman’s first book on Elliptic Curves.

(3) is true over any field. The function N occurring in (3) is the usual norm function on a number field or a quaternion algebra.

One can show using the Tate module that the dimension of R_n\otimes \mathbb Q over \mathbb Q is at most 4. Since there is always a Frobenius element \varphi such that N(\varphi) = p, this rules out the case R_n = \mathbb Z, where N takes only square values. This proves (1).

A sufficient (and necessary) condition for R_n being 4 dimensional is that \hat\varphi be inseparable, but this implies a_p = \varphi + \hat\varphi \equiv 0 \pmod p. As a consequence of the Hasse-Weil inequality, which says |a_p| \leq 2\sqrt p, this shows that a_p = 0 for p>5. One further proves that |E_n(\mathbb F_p)| = p+1-a_p, which shows (2).

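Now, given the three statements above, it is easy to complete step 2. Note that (x,y,z) \to (-x,iy,z) is always an endomorphism of E_n, where i = \sqrt{-1} \in \overline{\mathbb F}_p. It is defined over \mathbb F_p(i), which equals \mathbb F_p precisely when p\equiv 1 \pmod 4, so in general it lives in the endomorphism ring over \overline{\mathbb F}_p, which is the ring we work with. Its square is (x,y,z) \to (x,-y,z), that is, multiplication by -1. Therefore, if R_n were two dimensional, it would have to be an order in \mathbb Q(i) containing i, namely \mathbb Z[i].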

As mentioned before, we also have the Frobenius \varphi with norm p. This shows that \varphi is a prime in the ring. However, the norm of any prime element in \mathbb Z[i] is either 2, a square, or a prime congruent to 1 modulo 4, and p \equiv 3 \pmod 4 is none of these. This forces R_n to be 4-dimensional and we are done by (2).
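The conclusion of step 2 can also be sanity-checked by brute force for small primes. The following is only a numerical sketch, not part of the argument, and the function name count_points is chosen here for illustration.

```python
def count_points(n, p):
    """Number of points of y^2 = x^3 - n^2 x over F_p,
    including the point at infinity (p an odd prime)."""
    # square_roots[r] = number of y in F_p with y^2 = r
    square_roots = [0] * p
    for y in range(p):
        square_roots[y * y % p] += 1

    total = 1  # the point at infinity
    for x in range(p):
        rhs = (x * x * x - n * n * x) % p
        total += square_roots[rhs]
    return total


for n in (1, 5, 6):
    for p in (7, 11, 19, 23, 31):  # primes congruent to 3 mod 4, not dividing 2n
        print(n, p, count_points(n, p), p + 1)
```

For these primes the count agrees with p+1 for every n tried. For a prime congruent to 1 modulo 4 the count generally differs from p+1, as one sees already for n = 1 and p = 5.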

Step 3:

Let T be the torsion subgroup of E_n(\mathbb Q) and let m be its order. Since T contains the four 2-torsion points, m is a multiple of 4. If E_n(\mathbb Q) has any torsion point P that is not 2-torsion, then T is strictly larger than E_n[2], so m \geq 8.

By step (1), T injects into E_n(\mathbb F_p) for all but finitely many primes p and therefore, since E_n(\mathbb F_p) is a group of size p+1 (for p\equiv 3\pmod 4 by step (2)), m|p+1 for all but finitely many primes p \equiv -1 \pmod 4.

Since the set of primes of the form 4k+3 has (Dirichlet) density 1/2 and a finite set of primes does not contribute to (Dirichlet) density, we can put this differently in the following way:

The  Dirichlet density of primes p \equiv -1\pmod m is at least 1/2.

However, we know that the density of primes of the form km-1 is exactly 1/\varphi(m), which is at most 1/4 for every multiple m \geq 8 of 4. This provides the required contradiction and completes the proof.
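A more hands-on way to see the same bound is to take the greatest common divisor of p+1 over a range of primes p \equiv 3 \pmod 4. The sketch below assumes the standard (stronger) fact that the torsion of E_n(\mathbb Q) injects into E_n(\mathbb F_p) for every odd prime p not dividing n, so that the torsion order divides each such p+1. The names is_prime and torsion_bound, and the cutoff 200, are choices made here for illustration.

```python
from math import gcd


def is_prime(k):
    """Trial division, adequate for the small range used here."""
    if k < 2:
        return False
    d = 2
    while d * d <= k:
        if k % d == 0:
            return False
        d += 1
    return True


def torsion_bound(n, limit=200):
    """gcd of p + 1 over primes p = 3 (mod 4), p < limit, p not dividing 2n.

    Under the injectivity assumption stated above, the order of the
    torsion subgroup of E_n(Q) divides this number.
    """
    g = 0
    for p in range(3, limit, 4):  # 3, 7, 11, ... are the residues 3 mod 4
        if is_prime(p) and (2 * n) % p != 0:
            g = gcd(g, p + 1)
    return g


for n in (1, 5, 6, 21):
    print(n, torsion_bound(n))
```

In each of these cases the bound is 4, which matches the four 2-torsion points already exhibited.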

The statements about the densities of various primes above all follow from Dirichlet’s Theorem on Arithmetic Progressions.


I would like to end by talking about a conjectural algorithm to detect congruent numbers. It is called Tunnell’s algorithm and is based on the idea above: A number n is congruent if and only if E_n(\mathbb Q) has positive rank.

The algorithm is easy to describe (and execute) but its correctness depends on a very deep conjecture about elliptic curves (the Birch and Swinnerton-Dyer conjecture).

The algorithm is as follows: For a square free integer n, define:

{\begin{matrix}A_{n}&=&\#\{(x,y,z)\in {\mathbb  {Z}}^{3}|n=2x^{2}+y^{2}+32z^{2}\}\\B_{n}&=&\#\{(x,y,z)\in {\mathbb  {Z}}^{3}|n=2x^{2}+y^{2}+8z^{2}\}\quad \\C_{n}&=&\#\{(x,y,z)\in {\mathbb  {Z}}^{3}|n=8x^{2}+2y^{2}+64z^{2}\}\\D_{n}&=&\#\{(x,y,z)\in {\mathbb  {Z}}^{3}|n=8x^{2}+2y^{2}+16z^{2}\}.\end{matrix}}

Tunnell proved unconditionally that if n is an odd congruent number, then 2A_n = B_n and if n is an even congruent number, then 2C_n = D_n. The converse is true assuming the Birch and Swinnerton-Dyer conjecture.
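The counts are easy to compute by brute force. Here is a minimal sketch for square free n, where count_reps and passes_tunnell are names chosen here for illustration. Keep in mind that failing the test proves unconditionally that n is not congruent, while passing it shows that n is congruent only under the Birch and Swinnerton-Dyer conjecture.

```python
from math import isqrt


def count_reps(n, a, b, c):
    """Number of (x, y, z) in Z^3 with a*x^2 + b*y^2 + c*z^2 == n."""
    count = 0
    x_max = isqrt(n // a)
    y_max = isqrt(n // b)
    for x in range(-x_max, x_max + 1):
        for y in range(-y_max, y_max + 1):
            rest = n - a * x * x - b * y * y
            if rest < 0 or rest % c != 0:
                continue
            z_squared = rest // c
            z = isqrt(z_squared)
            if z * z == z_squared:
                # z = 0 contributes one solution, z != 0 contributes +z and -z
                count += 1 if z == 0 else 2
    return count


def passes_tunnell(n):
    """Tunnell's necessary condition for a square free n to be congruent."""
    if n % 2 == 1:
        return 2 * count_reps(n, 2, 1, 32) == count_reps(n, 2, 1, 8)
    return 2 * count_reps(n, 8, 2, 64) == count_reps(n, 8, 2, 16)


for n in (1, 2, 3, 5, 6, 7):
    print(n, passes_tunnell(n))
```

The numbers 1, 2 and 3 fail the test, so they are not congruent, while 5, 6 and 7 pass it, consistent with the fact that they are the smallest congruent numbers.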

More precisely, denote by L(s)= L(E_n,s) the L-function of E_n over \mathbb Q. Recall that the Birch and Swinnerton-Dyer conjecture states that the order of vanishing of L(s) at s = 1 is the rank of E_n(\mathbb Q).

Therefore, assuming the conjecture, n is a congruent number if and only if L(E_n,1) = 0. What Tunnell showed was that:

L(E_n,1) = \begin{cases}\gamma(2A_n-B_n) & n \text{ is odd}\\\gamma(2C_n-D_n) & n \text{ is even}\end{cases}

where \gamma is a non-zero factor. Note that E_n always has complex multiplication by \mathbb Z[i], since (x,y,z) \to (-x,iy,z) is an automorphism of order 4 (defined over \mathbb Q(i)).

The unconditional direction of Tunnell’s criterion follows from the following theorem of Coates-Wiles in 1976:

Theorem[Coates-Wiles(1976)] If an elliptic curve over \mathbb Q has complex multiplication by the ring of integers of an imaginary quadratic field with class number 1 and has positive rank over \mathbb Q, then the corresponding L-function vanishes at s=1.


Descent on Vector Spaces and Cohomology

It is quite often of interest to study the properties of some variety {X/\mathbb Q}. However, it is generally much easier to study varieties over algebraically closed fields and so we need some way of translating a property of {X_{\overline{\mathbb Q}}/\overline{\mathbb Q}} to {X/\mathbb Q} . This idea is known as descent and in this post, I would like to say a little bit about the simplest example of descent – over vector spaces.

Let {L/K} be a Galois extension of fields and {V} a vector space over {K} . Consider {W = V\otimes_K L } . The Galois group {G = \mathop{Gal}(L/K)} acts on {W} through the second factor, and one can consider the {K} -vector space {W^G} of vectors fixed by {G} .

Theorem 1 (Descent of Vector Spaces) The natural map {W^G\otimes_K L \rightarrow W} is an isomorphism. In particular, if {W} is finite dimensional, then {\dim_K W^G = \dim_L W} .

It is not hard to prove this theorem directly but I would like to relate it to another theorem. This is also well known and is a generalization of the famous Hilbert’s Theorem 90. Let {{GL}_n(L)} be the group of invertible {n\times n} matrices over {L} and consider the cohomology {H^1(G,{GL}_n(L))} . This is not a group unless {n=1} since {{GL}_n(L)} is non-commutative in general. However, it is a pointed set and we have the following theorem:

Theorem 2 (Hilbert’s 90) \displaystyle H^1(G,{GL}_n(L)) = \{0\}.

Hilbert stated the above theorem (in a disguised form) for {n=1} and {L/K} a finite cyclic extension. Noether generalized the theorem to arbitrary extensions. I do not know who is responsible for the generalization to general linear groups but I saw this theorem first in Serre’s “Galois Cohomology”.

In this post, I will show that the above theorems are equivalent in the following sense:

Let {W} be an {L} -vector space. We will say that a group {G} (acting on {L} ) acts semi-linearly on it if the action is additive and \sigma(lv) = \sigma(l)\sigma(v) \text{ for all }\sigma \in G, l \in L, v \in W.

The typical example is when {G = \mathop{Gal}(L/K)} acts co-ordinate wise on {W = L^n} or equivalently {W = V\otimes_K L} for a {K} -vector space {V} . We will show that this is essentially the only example by proving:

Theorem 3 There is a bijection:

\displaystyle H^1(G,{GL}_n(L)) \longleftrightarrow \frac{\{\text{n-dimensional L- vector spaces with semilinear G-action}\}}{\text{isomorphisms}}

Proof: I will use {x^g} to denote {g} acting on {x} throughout:

Let us first establish the maps. Given a 1-cocycle {\eta:G \rightarrow {GL}_n(L)} , let the corresponding vector space {W_\eta} be {L^n} with the action for {(g,w) \in G\times W} being given by {(g,w) \rightarrow \eta_g(w^g)} where {w^g} stands for the action of {g} co-ordinate wise. It is easy to verify that this is well defined:

Given {l\in L} , {\eta_g((lw)^g) = \eta_g(l^gw^g) = l^g\eta_g(w^g)} . This shows that the action is semi-linear. Then, given a one-cocycle cohomologous to {\eta} (that is, given {\tau_g = A^{-1}\eta_g A^g} ), we have the following isomorphism of vector spaces with group actions given by:

\displaystyle W_\eta \rightarrow W_\tau, w \rightarrow A^{-1}w.

That is, {A\tau_g((A^{-1}w)^g) = \eta_g(w^g)} as can be easily checked. This establishes that {\eta \rightarrow W_\eta} is well defined.

To construct the inverse, let {W} be a {L} -vector space with a semilinear {G} -action. Fix a basis {e_1,\dots, e_n} . Denote the column vector corresponding to this basis as {[e]} . Define {\eta_g} to be the unique transformation such that {\eta_g[e]^g = [e]} . To check that this is a 1-cocycle, note that:

\displaystyle \eta_g\eta_h^g[e]^{gh} = \eta_g(\eta_h[e]^h)^g = \eta_g[e]^g = [e]

and hence by uniqueness {\eta_{gh} = \eta_g\eta_h^g} .

Finally, it is easy to verify that these maps really are inverses. \Box
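To make the correspondence concrete, here is a small sketch in the simplest case {n=1} , with {K = \mathbb Q} , {L = \mathbb Q(i)} and {G} generated by complex conjugation {\sigma} . These choices, the class GaussQ and the function twisted_sigma are all assumptions made here for illustration. A 1-cocycle is then determined by a single element {a = \eta_\sigma} with {a a^\sigma = 1} , and the twisted action on {L} is {w \rightarrow a w^\sigma} .

```python
from fractions import Fraction


class GaussQ:
    """Element re + im*i of Q(i), with exact rational coordinates."""

    def __init__(self, re, im=0):
        self.re, self.im = Fraction(re), Fraction(im)

    def __eq__(self, other):
        return (self.re, self.im) == (other.re, other.im)

    def __add__(self, other):
        return GaussQ(self.re + other.re, self.im + other.im)

    def __mul__(self, other):
        return GaussQ(self.re * other.re - self.im * other.im,
                      self.re * other.im + self.im * other.re)

    def conj(self):
        """The non-trivial element sigma of Gal(Q(i)/Q)."""
        return GaussQ(self.re, -self.im)

    def inverse(self):
        norm = self.re ** 2 + self.im ** 2
        return GaussQ(self.re / norm, -self.im / norm)


def twisted_sigma(a, w):
    """Action of sigma on W_eta = L, where eta_sigma = a."""
    return a * w.conj()


a = GaussQ(Fraction(3, 5), Fraction(-4, 5))  # cocycle value, a * conj(a) == 1
b = GaussQ(1) + a                            # candidate fixed vector
A = b.conj()                                 # then a == A^{-1} * A^sigma

print(a * a.conj() == GaussQ(1))             # cocycle condition
print(twisted_sigma(a, b) == b)              # b is fixed by the twisted action
print(A.conj() * A.inverse() == a)           # a is a coboundary
```

The vector b = 1 + a is fixed by the twisted action, so the fixed space is the {\mathbb Q} -line spanned by b, in line with Theorem 1, and a is exhibited as a coboundary, in line with Theorem 2.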

To see that we really have shown that Theorem 1 and Theorem 2 are equivalent, note that Theorem 1 is clearly true for {L^n = K^n\otimes_KL} . Then, if {H^1(G,{GL}_n(L)) = \{0\}} , there is a unique (up to isomorphism) {n} -dimensional {L} -vector space with a semi-linear action and we can check Theorem {1} on this unique vector space.

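Conversely, if Theorem {1} is true, then every {n} -dimensional {W} with a semi-linear action is isomorphic to {W^G\otimes_K L \cong L^n} with the co-ordinate wise action, so by Theorem 3 {H^1(G,{GL}_n(L))} is a one-element set.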

Proofs for Theorem 1 and Theorem 2 can be found in many places. Serre’s Galois Cohomology is a good place to read about Group Cohomology generally and Theorem 2 in particular.

UPDATE: I later discovered that the content of this post is Exercise 1.9 in Poonen’s “Rational Points on Varieties”. He also proves Theorem 1 as Lemma 1.3.10.