Monday, 7 November 2011

What is Mathematics?

This blog post shares its title with one of the great books about mathematics, by Richard Courant and Herbert Robbins (while not in itself making any claims to greatness, naturally!). That book is the ideal bridge between school mathematics and university mathematics, and should still be read by anyone who is considering studying the subject at university. While most school mathematics is a question of learning how to solve specific types of problems by rote, higher mathematics is not at all like that: that is both the point of Courant and Robbins' book, and the reason for this post.

The nature of mathematics has been an issue for philosophical discussion ever since the ancient Greeks, particularly where the question is about the relationship between mathematics and the real world. I'm not going to try to summarise, let alone add to, the huge body of literature on the subject. What I want to talk about is how to classify mathematics as a part of human learning. Today, it is usually classified as a "science", at least by educators: it is a science as far as GCSE and A-levels are concerned, though in higher education there is often a separate faculty which contains mathematics and IT.

What is a science? - Inductive Reasoning

To say anything about whether mathematics is a science requires some idea of what a science is. The exact details of this are matters of debate, with interesting contributions from people like David Hume, Karl Popper, Thomas S. Kuhn, Paul Feyerabend, and Michael Polanyi (whose Personal Knowledge I was reading while working on a second draft of this post, and who inspired a number of revisions). All these authors are well worth reading if you are interested in the subject. I think it would be fair to say that most of those who consider themselves scientists will believe in some version of the "scientific method": a cycle of theory, prediction (i.e. the design of an experiment with an idea of what should happen if the theory is correct), and experiment, leading to a revised theory if the existing one falls short, which is the source of new predictions, and so on.

The point of this kind of scientific method is that it is supposed to support the use of inductive reasoning, which basically boils down to (OK, is oversimplified to) the idea that it is possible to use the past to predict the future. If something holds for some class of situations in the past, it at least suggests that it will hold in similar future situations: every time I have dropped a ball from my hand, it has fallen towards the centre of the Earth; the next time I drop it, it should do the same. This is in fact a statistical argument. The problem with this is that it is not possible to prove that induction will work (as pointed out by Hume), even when the type of statements which it is used to support are restricted.

Mathematics and Deductive Reasoning

But nothing like this is involved in mathematics. There is no experimentation, no expectation that it is possible to prove existing results incorrect (or incomplete), and no inductive reasoning of the kind which is involved in scientific research. Instead, mathematics is the domain of deductive reasoning, and its results are sure and certain (with some possible exceptions, which I will come to later).

In mathematics, results are established using deductive reasoning. The idea is that this consists of a set of rules which can be applied to prove a conclusion from a collection of hypotheses, usually known as "axioms" in this context. Things are rather more complex than this makes it sound. The rules are pretty convincing (for example, one, known as modus ponens from medieval logic textbooks, says that if it's possible to prove one statement A and another statement which has the form "A implies B", then it's possible to prove B too). But they really work best applied in a very formal way, using symbols which indicate the logical relationship between statements. And then establishing any useful theorem is extremely complex, time-consuming, and hard to follow. So in practice mathematicians tend to use relatively informal arguments to establish results, which could in theory be turned into formal demonstrations in symbolic logic - at least, hopefully they could. Such arguments often use well established proof methods, such as reductio ad absurdum (proof by contradiction - also a medieval term), where the opposite of the theorem's conclusion is assumed, and then this is shown to lead to a contradiction, implying that the result of the theorem must hold. In the end, a proof of a theorem is correct if it convinces the worldwide community of mathematicians, and there have been movements to make this notion different from the then-accepted norm, such as the intuitionists.
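
To give a flavour of what "a very formal way" looks like, here is a minimal sketch in the Lean 4 proof assistant (my choice of system for illustration; the theorem names are made up for this example). The first statement is modus ponens; the second is proof by contradiction, which Lean provides as `Classical.byContradiction`.

```lean
-- Modus ponens: from a proof of A and a proof of "A implies B", obtain a proof of B.
theorem modus_ponens (A B : Prop) (ha : A) (hab : A → B) : B :=
  hab ha

-- Proof by contradiction: if assuming "not P" leads to a contradiction, then P holds.
theorem proof_by_contradiction (P : Prop) (h : ¬P → False) : P :=
  Classical.byContradiction h
```

It is worth noting that the second rule relies on classical logic, which is exactly the kind of step the intuitionists declined to accept.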

As well as the complexities of formalising the idea of proof, another nuance in the brief definition of deductive reasoning is the role of axioms. Basically, mathematicians are free to define any collection of axioms they want, and see if they lead to interesting results. Most mathematicians work with already well established collections of axioms. These often take the form of definitions for convenience - it is much easier to state a theorem "If A is an X, then..." rather than "If A satisfies the axioms X1,X2,X3,X4,...", by using the definition "A is an X if it satisfies the axioms X1,X2,X3,X4,...". The number of axioms is unlimited, and can even be infinite (in which case, there would be a rule used to define the axioms, along the lines of, "For any mathematical property P, if P is true for 0 and, whenever P is true for n it is also true for n+1, then P is true for all numbers"; this is an axiom from number theory known as the principle of induction).
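
The principle of induction quoted above can be written out formally. The following is a sketch in Lean 4 (my choice of system; the names `P`, `base` and `step` are only for illustration). One caveat: in Lean the principle is derived from the way the natural numbers are defined rather than assumed as an axiom, so this illustrates the statement, not its axiomatic status in number theory.

```lean
-- If P holds for 0, and P n implies P (n + 1) for every n, then P holds for all n.
example (P : Nat → Prop) (base : P 0) (step : ∀ n, P n → P (n + 1)) : ∀ n, P n := by
  intro n
  induction n with
  | zero => exact base
  | succ k ih => exact step k ih
```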

An example would be to work with groups, which are one of the most important mathematical structures; the term "group" has a formal definition, and this is turned into a simple axiom in many theorems: "If G is a group, and ..." as opposed to "If G is a set with a binary operation . with the properties that G has an identity, all members of G have inverses, and . is associative, and ..." - long enough to be cumbersome even without expanding the meanings of the subsidiary definitions of operations, identities, inverses, and associativity. (Another condition, closure, is frequently added to this definition, but this can be, and often is, sensibly subsumed into the meaning of "operation".) And that's without even considering what a set is: set theory is formulated so it defines what can be done with sets rather than what a set is (e.g. it tells you that the union of two sets is a set), and this would hardly fit easily into the statement of a theorem.
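
For a small finite set, the conditions packed into the word "group" can simply be checked one by one. The following Python sketch does this by brute force; the function names and the two example operations (addition and multiplication modulo 5) are my own choices for illustration, and closure is checked explicitly rather than being subsumed into the meaning of "operation".

```python
from itertools import product


def is_group(elements, op):
    """Check the group axioms by brute force on a finite set."""
    elements = list(elements)
    # Closure: combining two members must give another member.
    if any(op(a, b) not in elements for a, b in product(elements, repeat=2)):
        return False
    # Associativity: (a . b) . c == a . (b . c) for every triple.
    if any(op(op(a, b), c) != op(a, op(b, c))
           for a, b, c in product(elements, repeat=3)):
        return False
    # Identity: some e with e . a == a == a . e for every a.
    identities = [e for e in elements
                  if all(op(e, a) == a and op(a, e) == a for a in elements)]
    if not identities:
        return False
    e = identities[0]
    # Inverses: every a has some b with a . b == e == b . a.
    return all(any(op(a, b) == e and op(b, a) == e for b in elements)
               for a in elements)


def add_mod_5(a, b):
    return (a + b) % 5


def mul_mod_5(a, b):
    return (a * b) % 5


print(is_group(range(5), add_mod_5))  # True
print(is_group(range(5), mul_mod_5))  # False: 0 has no multiplicative inverse
```

Even for this toy case the check runs to twenty-odd lines, which gives some idea of why "If G is a group" is the preferred phrasing.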

Axioms

What determines a mathematician's choice of axioms? There is one basic rule: they should be consistent. The reason for this is that it is possible to prove that from inconsistent axioms any statement can be proved. If it is possible to give a model - basically an example - which satisfies the axioms, then they must be consistent (this is another theorem from logic). For many axioms, particularly those collected into definitions, there are huge numbers of models already known. Where a mathematician is seeking to solve a particular problem, and develops axioms related to this problem, the motivation behind the problem may well give an obvious model for the axioms.
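
The claim that inconsistent axioms prove everything is itself a short formal deduction. As a sketch in Lean 4 (my choice of system; the theorem name is made up for this example): given a proof of A and a proof of "not A", any statement B whatsoever follows.

```lean
-- From a statement and its negation, any statement B can be proved.
theorem explosion (A B : Prop) (ha : A) (hna : ¬A) : B :=
  absurd ha hna
```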

A small warning - where the mathematical objects become really fundamental, it is hard or impossible to prove that they are consistent. There is a circularity about the most fundamental mathematical objects: logic is used to reason about sets, and sets are employed in many aspects of logic, including the definition of "model" which is used in the theorem alluded to in the preceding paragraph. This means, too, that there is no way to prove that mathematics as a whole is consistent, and indeed, that consistency of a set of axioms is only relative to the presumed consistency of more fundamental definitions and axioms such as those of set theory. In fact, the axioms defining set theory (in the form they are normally seen) are the end results of an attempt to restrict the idea of a set to avoid the paradoxes which were first seen at the end of the nineteenth century (for example, the "set which contains all sets which are not members of themselves" - is it a member of itself or not?).

Consistency is the only real requirement, though a mathematician would look for other desirable properties of a new set of axioms with which he or she wishes to work. They should be productive: it should be possible to derive lots of useful (and beautiful, in mathematical terms) theorems. The re-use of ideas in other contexts is important in mathematics, and productivity of axioms is one of the things which makes it possible to do this. The definition of a group is a prime example, as it has been re-used in hundreds if not thousands of contexts, and lies behind many important applications of mathematics in physics. Note the word "useful" - from any collection of consistent axioms, it is possible to derive infinite numbers of trivialities (such as "1=1"), and mathematicians are not interested.

But perhaps even more important to many mathematicians is the idea of beauty. It is hard to explain why some mathematics is elegant or beautiful, but it is certainly a value which is recognised by those who work in the field. Sciences do in fact have a fairly similar concept. Basically, I think beauty is about how smoothly the mathematics seems to flow, so that the truth of what is being proven seems obvious, at least in retrospect.

But What is the Point of Deductive Reasoning?


One philosophical criticism of mathematics is that it never tells you anything which is not effectively included in the axioms involved. But working out the implications of a set of axioms is not always easy to do, even if a specific desired endpoint is set: this is illustrated by the time it took for mathematicians to find a proof of Fermat's Last Theorem. In general, it isn't possible to look at a statement and say whether it is provably true, false, or unprovable from a given set of axioms.

What axioms do is to define a certain kind of structure, which can then be investigated with deductive reasoning. If we make an axiom that some aspect of the universe has a particular structural characteristic, then all the mathematics which has been devised which describes that type of structure becomes available to deduce information about the aspect and to predict other properties of the aspect which can (in principle) be tested. Mathematics also includes the study of the absence of structure, that is, randomness: statistics.

Why Does Science Work?


One of the big questions is basically this: why is mathematics as applied in science so good at describing the universe? There isn't necessarily any connection between our thoughts and the behaviour of a black hole thousands of light years away: it is entirely conceivable (and possibly even likely) that our brains will never be capable of understanding the way the universe works at its most fundamental level. The main reason for doubting our ability is that we are insignificant parts of the universe, and therefore indubitably less complex than its whole; it would be even more unlikely that we can understand it than it would be for an amoeba to understand human culture. But this means that the success of science desperately needs an explanation, and I think that defining mathematics as the study of structure provides one.

It is important to realise that we don't directly understand the universe. One of the things that human beings are unbelievably good at is the perception of patterns (it's why we see pictures in the cracks in a wall or in the clouds, for example), and if we can see one in a particular phenomenon, that is a way to find a structure which models that phenomenon. This structure can be investigated using mathematics. Of course, there is no guarantee that the pattern is correct, as is the case with an optical illusion. What mathematics as an analysis of structure makes possible is the use of deductive reasoning to work out what a possible structure in the real world would mean - in effect, it is what enables the design of experiments to check whether the structure is valid or not. When a pattern is false, the mathematical predictions will (in all likelihood) turn out to be false, and the perceived structure will have to be abandoned for another possibility.

One obvious example of how this works is that of symmetry. In mathematics, a symmetry is going to be linked to a kind of group, which describes how objects with that symmetry can be changed (e.g. by reflection) in a way that appears not to change the object. So given an apparent symmetry in the universe, the whole of group theory can be used to understand it better. Such groups and symmetries are used in the theory which underlies particle physics (known as the Standard Model), and which leads to the predictions which are being tested by the Large Hadron Collider.
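
To make the link between symmetry and groups concrete, here is a small Python sketch (the representation and all names are my own assumptions for illustration). The corners of a square are labelled 0 to 3 in order around the edge, each symmetry is written as a tuple saying where each corner ends up, and the full set of symmetries is built up from one quarter turn and one reflection.

```python
def compose(p, q):
    """Apply q first, then p; a tuple maps corner i to corner p[i]."""
    return tuple(p[q[i]] for i in range(len(q)))


def generate(generators):
    """Collect every symmetry reachable by repeatedly applying the generators."""
    identity = tuple(range(len(generators[0])))
    group = {identity}
    frontier = [identity]
    while frontier:
        g = frontier.pop()
        for h in generators:
            new = compose(h, g)
            if new not in group:
                group.add(new)
                frontier.append(new)
    return group


rotation = (1, 2, 3, 0)    # quarter turn: corner i moves to corner i + 1
reflection = (1, 0, 3, 2)  # mirror swapping corners 0 <-> 1 and 2 <-> 3

square_symmetries = generate([rotation, reflection])
print(len(square_symmetries))  # 8
```

The eight results are the four rotations and four reflections of the square, and combining any two of them gives another member of the set, which is what makes them a group.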

A second is evolutionary theory. In Darwinian evolution, survival of the fittest is related to probability: an individual which is better fitted for their environment is more likely to survive and produce descendants. And there is a great deal of mathematics about probability and statistics, so this can then be used to look at evolution. This mathematics was used to come up with an evolutionary description of altruism, explaining why it can be better in the long term for individual organisms to act against their personal interests in order to improve the odds for survival of close relatives who will have similar genes.

While mathematics is an important part of science, its processes and methods are different. The differences arise from the need to connect to the real world in scientific work; mathematics can be entirely abstract. And many mathematicians prefer it that way, too, even if it is an attitude which has a less than sympathetic reception from those people who tend to fund research, who want work which can be useful now rather than if and when an application is found for it.

Modern Mathematics and Science

There is a way in which scientific method may creep into mathematics, as I mentioned earlier. Some of the most complex results proved since the mid-seventies rely on computers to look through large numbers of cases and check that a result is valid in each case. This was done, for example, in the proof of the four colour theorem, which states that a map consisting of regions drawn on a flat piece of paper can always be coloured in with at most four colours so that no adjacent regions use the same colour. This is proved by looking at the ways in which a map can be turned into a simpler map which can only be coloured with four colours if the original map can be. The proof can be completed by looking at all the maps which cannot be simplified in this way, and seeing that they can be coloured in; the large but finite number is such that computers were used to check all of them. This means that the proof is basically dependent on the computer software and hardware: all the calculations it carries out must be accurate if the result is to be proved. I don't know what methods were used to establish the accuracy of the software and the hardware involved in this particular case, but to ensure that there isn't a fundamental design flaw in all the hardware available to test the result is hard when computer chips have become as complex as they now are. (There are methods to verify the accuracy of software and hardware: but bugs and manufacturing flaws are still hard to pick up.) This problem has led some to question whether such results can truly be considered proved; and certainly no mathematician is likely to consider the proofs as beautiful.
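
The flavour of checking cases by computer can be shown on a toy scale. The Python sketch below is emphatically not the method used in the actual proof; it simply tries every possible assignment of colours to one small, made-up map (a central "hub" region surrounded by a ring of five regions), and all the names in it are hypothetical.

```python
from itertools import product


def find_colouring(regions, borders, n_colours):
    """Try every assignment of colours; return the first valid one, or None."""
    for assignment in product(range(n_colours), repeat=len(regions)):
        colour = dict(zip(regions, assignment))
        # Valid if no two regions sharing a border have the same colour.
        if all(colour[a] != colour[b] for a, b in borders):
            return colour
    return None


# A hub region touching each of five regions which form a ring around it.
regions = ["hub", "a", "b", "c", "d", "e"]
ring = [("a", "b"), ("b", "c"), ("c", "d"), ("d", "e"), ("e", "a")]
borders = ring + [("hub", r) for r in "abcde"]

print(find_colouring(regions, borders, 3))              # None
print(find_colouring(regions, borders, 4) is not None)  # True
```

Three colours fail here because the ring of five already needs three, leaving none for the hub; four succeed. The trust placed in the answer is trust that the loop, the interpreter and the chip underneath all did what they were supposed to, which is the worry described above.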

Polanyi describes mathematics as being like physics when it can be applied to models of the real world, and like engineering when it can provide solutions directly applicable to the real world, but that it has more than that in it. I would say that the relationship is not really like that. Mathematics is a method which is immensely important to science, basically forming the whole of a scientist's analytical toolkit. The development of modern science really took off at the point when scientists began to seriously use mathematics for this purpose, rather than basing their thought on non-mathematical philosophical ideas (such as the idea that objects will travel in circular paths if not impeded, as a circle is the perfect shape). But mathematics is not, and can never be, a science.

Art?


So, we have established something which mathematics is not. If it isn't a type of science, maybe it is an art. In some ways, it feels like an art when practised at a high enough level. There is a philosophical question of what type of existence a mathematical object has, and indeed whether it can be said to exist before someone thinks about it, but it is pretty clear that mathematics is not really created from nothing when someone proves a theorem. Mathematicians feel that they discover a result, rather than inventing it, and definitely would not want to believe that it is the act of thinking about it which determines whether a proposed theorem is true or false. Perhaps mathematics is more of a craft than an art. But that is only in the way that most satisfying human activities can be viewed as a craft: there is pleasure to be had from the creation of something beautiful, whether it is a proof of a theorem, or a statue, or a design for an experiment, or a piece of furniture - or even a physical move in a game of football. Similarly, mathematicians may make use of a shared toolkit of techniques, and a shared technical vocabulary, but this is again something shared with huge numbers of other activities which could be labelled a craft. (Indeed, I would go so far as to say that these two properties could be used to define what a craft is.) The technical vocabulary in mathematics is exceptionally highly developed, with symbolic notation and a huge number of words which are given specific technical meanings, as well as an accepted way to add to their number (by creating a definition).

I would say, therefore, that any claim that mathematics is an art or a craft does not capture any of its properties which mark it out from other activities that humans undertake. However, just as with science, mathematics is fundamentally important to many art forms, for similar reasons due to the human appreciation of structure and patterns. Music is perhaps the art form most clearly using mathematics, probably because it is in the main abstract when words are not being sung. It is structured on several levels - form (such as verse/chorus or sonata form), rhythm (both a pulse and patterns such as the "Scotch snap" or the stretching of tempo in waltz time) and pitch (such as the use of a key, or with a twelve note series) for example. Similarly, pictorial art often has structural elements based on perspective (which is all about geometry). This use of patterns is important to our enjoyment of art and, in some cases, to our ability to connect with it in the first place. Of course, art is not just about getting these technicalities right. The spark of creativity which marks art of any quality from the rest needs to be part of any art object as well. Not only that, but great artists will adapt and develop the existing technical ideas, and make something new, which can then act as the ideas which are used by the next generation.

As with science, mathematics may underlie much of the technical side of art. But it is not an art, and this is even clearer than the fact that it is not a science.

In a Category of Its Own?


My contention, then, is that mathematics is neither science nor art, and only a craft in a sense common to many activities of disparate kinds, but that, instead, it is sui generis. In fact, I would go so far as to say that to call it a science or an art is a category error. For mathematics is a part of the theory of every science and every art or craft, but no science, art or craft forms the basis for any mathematics whatsoever. The inspiration, perhaps, but not the basis. To me, pattern recognition and structure are such important parts of what it means to be human that the study of these things deserves to be in a category of its own.