After working with Lean for a while, I've started to appreciate that there is a wider range of what it means to be constructive -- being "fully constructive" seems to me to be like fanatical efforts to make everything X, for example "everything is a function" or "everything is an object" and so on.
In the Lean approach, anything not marked `noncomputable` in the Type universe and above needs to be constructible, in the sense that there is a program that can compute such an object using a functional programming language (where the function is guaranteed to terminate), but proofs don't need to be constructible in the same way. Proofs are still functions, but they're free to use things like the axiom of choice in the course of a proof. The way things are set up, even if you use non-computable things while making your inferences, this "forbidden" data will not leak out.
It seems eminently reasonable to me that you can argue a program computes the correct thing because it doesn't compute the wrong thing, and unless you're specifically studying different toposes/logics, I'm not sure what constructive logic would really give you (and it's not like mathematicians slavishly keep track of where they used the axiom of choice or its consequence the LEM).
How this all looks in Lean practice is that it's stronger to give a definition and prove it satisfies a property than to prove a "there exists" statement. mathlib authors tend to go for the first, unless the result is inherently noncomputable. There's also a facility for lifting propositions up to boolean values, which more or less amounts to giving a constructive proof -- this is used pervasively, for example with finite sets, which are constructive in the sense that if you have one there's an actual finite list backing it.
I guess in short: it's a useful middle ground being able to prove something exists without having to construct it, but if you want to "have" the thing that purportedly exists you need to construct it.
Just pick a different one! There are definitely formalisation efforts out there that use constructive mathematics; for example Coq is so called after the logic, the calculus of constructions, that it uses (https://en.wikipedia.org/wiki/Calculus_of_constructions).
Hi! Yes! Just pick a different one! The problem is that constructivists have been formalising mathematics for decades and have not really managed to break through into the mainstream mathematical community with their efforts. The difference with Lean's maths library is that we absolutely reject constructivism, which makes Lean far less suitable for certain kinds of computations but conversely far far better equipped for proving all the theorems in an undergraduate mathematics curriculum and then going on to tackle research level mathematical questions of the kind which most mathematicians recognise as "normal mathematics". My argument (I should say that I am the author of the blog post) is that this resolutely non-classical approach is far more likely to arouse the interest of mainstream mathematicians, who rejected constructivism 100 years ago and have never looked back. I can certainly see a role for constructive mathematics, however I know from experience that most working mathematicians reject it and hence the pragmatic approach is to reject it when formalising if we are to start appealing to the mathematical masses.
Calling this 'non-classical' is interesting—it's definitely non-classical from the point of view of formalisation, but I'm not sure formalisation has been around long enough to have classics ….
As of course you know but non-mathematicians may not, from a non-formalising mathematician's point of view, it's the non-constructive logic that's classical, to the extent that it's actually sometimes called classical logic (https://en.wikipedia.org/wiki/Classical_logic).
AFAIK the advantage of constructivism is algorithms.
Coq allows you to extract lisp/ML/Haskell programs.
Perhaps we will be seeing the next generation of fully homomorphic encryption programs, or topological data analysis programs, automatically generated by Coq.
Sure, but you would not need to prove it in a constructive way.
See, that is one of the problems with constructive mathematics. It not only requires you to write down the algorithm, but also to prove it constructively, although a non-constructive proof would be perfectly fine. In other words: constructivism is often overkill.
I think we're at cross-purposes; I am talking about mathematics for the purpose of real-life useful applications (e.g. cryptography, data analysis, etc), in which case you need algorithms.
Nope, we are not at cross-purposes. You just don't understand what I am saying.
Let's say you need an algorithm for a real-life situation, and you manage to write one down.
Ok then, you are good to go, just use it!
But ... how do you know that the algorithm you have written down is correct?
Now, a constructive mathematician would require you to also provide a constructive proof for your algorithm. A mainstream mathematician would additionally also be fine with a non-constructive proof for your algorithm. So, who is more real-life?
Yes, but if I told IBM that I had a proof that a particular algebraic property or relationship related to homomorphic encryption holds true, I'm sure they would much prefer that it was constructive, rather than non-constructive, because they can produce a usable algorithm from the constructive proof.
The point is that by exploring mathematics using only constructive proofs, we are guaranteed to produce (possibly useful) algorithms.
If you already have an algorithm, and can prove it works non-constructively then fine, but a collection of N non-constructive proofs is not nearly so useful for real-life applications as a collection of N constructive ones.
Your argument is wrong in so many ways, but let me just show you one: A constructive proof guarantees that there is an algorithm, but it doesn't say anything about the runtime of that extracted algorithm. Now, what good is an algorithm that is much too slow to be practically useful?
In general, my hypothesis is that if you restrict yourself to constructive mathematics, you are limiting yourself. And that is true whether your chosen field is algorithms, or not.
Now, if you offer me for a given algorithm a constructive proof, that's great. But if you needed 10 years to give me that proof, and a non-constructive one can be had in 10 minutes, I might not be so enthusiastic if I had to pay your salary, or if I needed that result really urgently, or both. But on the other hand, IBM probably doesn't care.
Hey, there's no need to by so impolite. My argument is not wrong, it's just a different argument to yours (which I fully understand, and is not difficult to grasp). Your example doesn't discount mine in any way; a slow algorithm is still better than no algorithm at all. The point about proving the correctness of an existing algorithm is moot, I'm talking about mathematical proofs which in themselve lead to algorithms. I'm not claiming we should abandon non-constructive mathematics, just that formalizing mathematics constructively leads to more useful algorithms. Similar to the argument that some programmers make about constraining themselves to think purely functionally in order to produce less buggy code.
Goodnight.
You probably have no experience with constructive proof checkers, else you wouldn't be missing Auggie's point.
> I'm sure they would much prefer that it was constructive, rather than non-constructive
One point against constructivism is that the proof could be less readable.
> can produce a usable algorithm from the constructive proof
No. Between the automated tactics and inductive propositions constructed to carry all kinds of redundant state helpful in proving your properties, your proof objects won't be anywhere near production level. And constructive proof checkers don't intend that to happen anyway: you pretty much define an efficient function in a DSL, then use a different DSL to prove that the function fulfils the properties. The proof object is pretty much a black box with unclear efficiency that wasn't meant for humans to read.
I don't know! But I don't think so. Also think about other properties of an algorithm, like runtime, etc. You could have a non-constructive proof for the upper bound on the runtime, but not a constructive one.
Here is an example: after inputing positive integers a, b and c, output all natural numbers n for which a^n + b^n = c^n.
My algorithm:
If a + b == c then output 1.
If a^2 + b^2 == c^2 then output 2.
Stop.
This algorithm is correct, because Fermat's last theorem holds. Now, try to come up with a simpler algorithm than that, and one that in addition has a constructive proof.
Fermat's last theorem holds constructively. This is because we can take the entire proof an rewrite it using the "classical fragment" of constructive logic[1]. I don't know if the proof of FLT (appears) to use the axiom of choice or not, but if it does we use Shoenfield's absoluteness theorem to remove any uses of the axiom of choice from the proof.
And we are done, because the statement of the theorem in the "classical fragment" of constructive logic is identical to the constructive statement as it does not involve any existentials or disjunctions. However if the theorem did involve a limited use of these connectives, we could use Goedel's Dialectica interpretation[3] to turn them in to their constructive counterparts, as long as the theorem is Pi_2 or lower, which is going to be the case for the vast majority of theorems about software.
Hi Russell, thank you for clarifying this! So what you are saying is, if the algorithm and its correctness theorem have a certain form (which is usually the case), we can rewrite any non-constructive proof of it in a constructive way?
That is very cool! So, even less reason to demand a constructive proof, a non-constructive one is then often good enough even for hard-core constructivists.
Do you know if this has been implemented in any actual (interactive) theorem proving system, for example Lean? It sounds like a great feature to me to allow non-constructive proofs in certain situations, but to have the theorem hold even in a constructive setting.
I should mention that constructive mathematics, at least when done in dependent type theory, really starts to shine when things that normal mathematicians think of as propositions, are viewed as data. For example, normal mathematicians think of list membership as a predicate, but in dependent type theory, list membership is a value of an inductive family that is effectively an index of where that value occurs in that list (and in particular there could be multiple different values if that item occurs in the list at multiple different locations).
And dependent type theory is gleaming when things that normal mathematicians think of as propositions are not only values, but functions! For example a list l1 being a subset of a list l2 is a function from memberships of l1 (which, remember is data) to memberships of l2. Now you get partial evaluation when working on your proofs and things just magically unify.
There is a practical difference between Principia Mathematica taking 370 pages to prove 1 + 1 = 2 versus dependent type theory's proof `refl_equal 2`. (I'm exaggerating, but only a little).
Personally, dependent types never worked for me the way I thought they would when I first learnt about them (I was quite enthusiastic about them and about constructive mathematics initially). I found them too verbose, and too nitpicky. Your list membership is a great example: Usually, I am not concerned with where in the list does the element occur, list membership as a predicate is perfectly natural and usually adequate (but admittedly, if that's the only use case, better use a set than a list). I think an elegant mathematical argument wins not only by what is said, but also by what can be left unsaid.
But yes, being able to work with functions as data is also very important! That's why I am pursuing a combination of simple type theory and set theory as the foundation of a theorem prover, as I have outlined here: http://www.practal.com
Yes, I realize this arguments cuts both ways. One way is that you can write all these theorems in constructive logic, so there is no need for classical axioms. And the other way, is that users should freely use classical axioms because they can be mechanically removed.
Myself, I've never seen an automated theorem prover that will perform this sort of translation automatically. I have, on rare occasion, gone through by hand to remove uses of classical axioms in Coq proofs of theorems than ought not to be using them. Usually the uses are pretty superficial.
How difficult would it be to write a Coq plugin to perform these translations?
The thing I like about Coq is that it appears to have better tooling than the other theorem provers (coqtop, proof-general for emacs), and you can extract code for other programming languages. I haven't looked in great detail at the alternatives though.
The Four Colour Theorem in Coq seems pretty mainstream I'd say... Also I guess you know that you don't personally have to reject LEM to use Agda or whatever, you just can't do everything in it or at least everything the classical way.
This is a milestone result, and its formalisation taught us many things, for example that the systems are capable of handling long proofs about elementary objects.
However finite graph theory is not remotely mainstream mathematics. Take your favourite super-prestigious maths prize, for example the Abel Prize or the Fields Medal. Now look at everyone who has won this prize in the last 10 years. That is the definition of mainstream mathematics. And as you will see if you do this, the areas which these people are working on are a million miles away from finite graph theory. This is precisely the problem. Computer scientists sometimes have a very twisted view of exactly what kind of mathematics is regarded as important in 2021. The experiment I outline above will give you some idea of what is mainstream, and believe me, it's not the four colour theorem.
Finite graph theory is completely mainstream mathematics. There's a kind of elite provicialism that you sometimes see, where only the mathematics that's done at Harvard or wins the big prizes counts, and this is a good example. There are probably more people employed in math departments working on finite graph theory than there are working on the Langlands Program. Robertson and Seymour were too old to be eligible, but the fact that someone like them could never win the Fields Medal, while someone in other fields could, is a statement about who is well-connected with the prize committee, and not a general statement about mathematics.
And I say this as someone with no interest in graph theory, or combinatorics in general.
Buzzard's post suggests to me that when he says "mainstream," he means what the community at large thinks of as important and field-defining (cf. the sentence with important in it). He is totally correct here; in this sense finite graph theory is not mainstream mathematics.
Further, I don't think it is "elite provincialism" to point out this readily verifiable fact. Besides looking at the big prizes, we can look at any of the top journals (Annals, IHES, Inventiones, Journal of the AMS, Acta, whatever). You're not going to find a lot of finite graph theory there.
Now, I think we should separate the question whether it is the case that he's right about what mathematicians value from the question of whether it ought to be the case that the world is this way. As I just said, the answer to the first question is yes. I'd argue that the answer to the second question is also yes, but that's a whole different discussion.
Pointing to the top journals is again a statement about who is well-connected, so if anything it is further proof of my claim of elite provincialism. Mathematics has politics, like any other field, and it's politics that determine that the proof of the Robertson-Seymour Theorem appeared in the "Journal of Combinatorial Theory" and not JAMS.
The irony is that I suspect every single mathematician, elite or not, knows what the four-color theorem is, far more than could tell you what chromatic homotopy is, or state anything about the Langlands program beyond "Uh, it's something about number theory? And groups? Maybe?"
I think the (implicit) charge that the editors of top journals are keeping good papers out for snobbish reasons is mostly unfounded. I don't deny that their tastes shape what gets published, of course. But in most circumstances, they are not subject matter experts, and the first step of evaluating any paper is that is not an obvious desk-reject (for poor writing, crankery, etc.) is to send it to a group of relevant experts for "quick opinions." Only if these opinions are sufficiently positive is the paper sent for a full referee report (again by an expert), and in most (but not all) cases the referee's suggestion is followed. So the picture of editors just arbitrarily killing papers they don't like is not accurate.
Further, it's not like no finite combinatorics/graph theory gets published in these journals. Just not a lot, because it's not sufficiently interesting/valuable to the broader community. (Annals of Math almost published Hales's proof of the Kepler conjecture, after all; eventually a proof appeared in another top journal.)
Also, re: connections, you can easily check the author affiliations for papers in these journals. There are plenty of people from universities that are not so well known. It's hardly an "old boys club."
There was a notorious case that a paper by Wehring solving the largest outstanding question in lattice theory was rejected by JAMS despite glowing referee reports (and being pretty short). So it's pretty much an example of editors arbitrarily killing a paper. (It appeared in Advances in Mathematics.)
You can see a letter to the editor of the Bulletin about it. Look at the affiliations of the people protesting the decision, and the affiliations of the editors defending it: http://www.ams.org/notices/200706/tx070600694p.pdf
I don't really see a problem with this decision, though I am sympathetic to Wehrung. As the editors note, there are a ton of great papers JAMS doesn't publish (due to severe page count constraints at the journal). Their reasoning is not "arbitrary"; it was spelled out quite clearly for the authors in the rejection notice they got. Many other papers in "trendy" subjects face the same fate.
What do you think they say? "You're not well-connected, so we're rejecting you"? How do you think political decisions get defended in academia? For insiders, you listen to the referees. For outsiders, you invoke some vague criterion of lack of fit or lack of interest. And people can come along and say "If this field is important, why doesn't it appear in JAMS"? It's a self-fulfilling prophecy.
You seem determined to believe that the rejection was for "political" reasons despite having no evidence for this claim. The quoted reasoning was not "lack of fit" or "lack of interest," it was lack of "interaction with other areas of mathematics." Going back to my first comment, there's a general sense in the community that this sort of interaction among mathematical subfields (or with physics) is prized in research, and it's one of a few criteria that form the sociologically dominant view of what constitutes "good mathematics." You might disagree with these criteria, but it's not hard to see why that paper might lose out to other excellent papers when judged by them. This doesn't look like politics (favoring "insiders") to me; it looks like the consistent application of a value judgment about what good research is.
Again, whether you think the criteria should be the way they are is a separate conversation.
I suppose it's not exactly mainstream, but Kronheimer and Mrowka are among those trying to prove the four color theorem using instanton homology and gauge theory. It's certainly not finite graph theory, though (their idea of a 3-regular graph is the singular locus of a certain kind of orbifold!)
I am working on some hobby projects, similar but very much more elementary to what you are doing. May I please kindly get your e-mail address (or you may drop me a note at ashok dot khanna at hotmail dot com). I would love to connect and follow your work.
I apologise as I was not able to find your e-mail address on your blog post - sorry if I missed it.
It's mathematics where certain kinds of proof by contradiction are not allowed -- if you want to prove that something exists because you want to use it, then you have to make it, you can't just say "let's assume it didn't exist and go on from there to deduce that 0 = 1 which is definitely wrong, so our assumption is wrong, so it exists, so let's use it".
Here is the reason why it's obvious for mathematicians. If you're trying to prove a theorem (e.g. the theorem that class numbers of imaginary quadratic fields tend to infinity), and then someone comes up with a proof which assumes that a certain generalisation of the Riemann hypothesis is true, and then someone else comes up with a proof which assumes that the exact same generalisation is false, then we mathematicians say "great, the theorem is now proved". This actually happened.
However if your boss asks you to write some code which does a calculation, and the next day you show up in their office with two USB sticks and say "this code is guaranteed to produce the correct answer if the Riemann hypothesis is true, and this other code is guaranteed to produce the correct answer if the Riemann hypothesis is false" then you are going to lose your job, because all you did was prove that the code exists, which is less helpful than it could be.
For me one of the biggest problems with the area of formalisation of mathematics is that for decades it has been dominated by computer scientists, whose view of what is important and beautiful in mathematics does not really coincide with that of the working mathematician. This is what I am fighting to change.
Oh wow! Was it an interesting theorem that got proved this way? Did it get proved in another, less controversial manner as well? Could you provide a pointer to it?
I remember when encountering my first (or one of my first) proof by contradiction (irrationality of sqrt(2), say) feeling unsatisfied because I didn't know how one could know that it was _your_ assumption that was wrong and not one of the more fundamental ones of mathematics.
A way to think about whether something is constructive is that everything is decidable, in the sense that there is some procedure that will give you the answer in finite time.
In Kevin's example where you have two programs, one which depends on the Riemann hypothesis being true and the other which depends on it being false, is that you can't just execute the programs -- they might have undefined behavior since they depend on the truth of some statement during their executions, and there's no general algorithm to decide whether undefined behavior is happening. They depend on undecided facts, so you can't put them together into a single program that decides the result. (Maybe a slogan is "truth can be contingent, but data cannot be.")
In Lean syntax, this is an example type signature of something that takes two programs (h and h') that respectively take in the truth/proof of a proposition or its negation, along with a proof that there is at most one element of α, and produces that element. I don't think there's a way to write this definition in Lean without marking the definition noncomputable:
Yup, I do accept the law of the excluded middle. Unreservedly. From a logical point of view, I cannot really imagine what it would mean NOT to be true.
Yes, Kripke semantics makes sense of constructive logic. Topos theory, too. But I really think of all of these embedded in classical logic, and assuming that the law of excluded middle doesn't hold for general reasoning just doesn't make any kind of sense to me.
Thanks Kevin. It seems like having both approaches is more flexible and thus more general (which can hopefully only be a good thing?). One could always replace a proof by contradiction later with a direct proof, but there's definitely value in using it in the meantime (at least in my uneducated view).
p.s. could you let me know your e-mail so I can keep for future reference, as I'm also working on a hobby project for automated theorem solving.
If you are familiar with "Classical" logic, constructive math omits the law of excluded middle (e.g. "A or not A" must be true). In ZFC, the axiom of choice is omitted.
This leads to proofs that are not only a verification of some idea but also gives you the mechanism to compute the evidence. This is often described as "the computational content of proofs" and is especially valuable in the context of the Curry-Howard isomorphism: for every proof, there is a corresponding program (and vice-versa). This is the basis for formal verification of constructive proofs. When creating a proof in Coq, you are actually writing the corresponding program. And you can even extract the program in another general purpose programming language (usually OCaml in the context of Coq, but Haskell is also supported).
Constructive mathematics rejects the law of excluded middle.
In a non-constructive mathematics it is true that 'either A or not A is true' and that allows you to prove existence of objects with certain properties by disproving their non-existence (if it cannot not-exist then it has to exist, by the law of excluded middle). Constructive mathematics, on the other hand, says that that's not enough and to prove that there exist objects with some properties you have to prove their existence.
What do you mean? (sorry, english is not my first language).
The way I understand it is that if you have the law of excluded middle you cannot discard 'non-constructive' proofs as invalid since they follow straight from the axioms.
Constructive mathematicians do not assert that excluded middle is true since there is no constructive proof of (P not P). Nor do they assert that it is false, since that statement, (not (P or not P)), is constructively refutable i.e. there is a constructive proof of (not not (P or not P)). So there are no instances in constructive mathematics where excluded middle is false. But there are constructive settings where excluded middle holds, e.g. Boolean topoi, but you have to prove excluded middle is validated.
So we can't claim constructive mathematicians reject excluded middle, since they reject the rejection of excluded middle. I admit this will be confusing for people used to classical logic at first, even those for whom English is their native language.
> The way I understand it is that if you have the law of excluded middle you cannot discard 'non-constructive' proofs as invalid since they follow straight from the axioms.
This is accurate, my above point is just a pedantic one about semantics.
“Not affirm” means they don’t take it as an axiom (“we don’t take this to be obviously true”)
“reject” would mean that if, given their set of axioms, it could be proven, they would try and tweak their axioms in order to make it unprovable or provably false in their system (“we think this cannot and should not be true”)
So, “not reject” means they would happily use it in their proofs, if it could be proven from their set of axioms (“interesting. That gives us a powerful tool to do proofs”)
Roughly, that proving a statement of the form "For all a,b,c,..., there exists x,y,z such that some statement involving a,b,c,...,x,y, and z is true." requires actually constructing suitable x,y, and z given suitable a,b,c, etc to derive them from.
With the proviso that the non-constructive statement '∃ x such that P(x)' can also be understood constructively as '¬∀ x ¬P(x)' which does not require constructing a suitable x, so classical proofs can be modeled without adding any extra axioms to the logic. (Clearly the two statements are classically, though not constructively equivalent).
Similarly, all classical uses of LEM (A ∨ B) can be understood as constructive statements of the form
¬(¬A ∧ ¬B).
A lot of responses are going to say something like "it is mathematics without using the law of the excluded middle, or proof by contradiction" or something like that. They are not technically wrong, but I'd like to offer a different perspective.
The constructive logic used in constructive mathematics is classical logic with two new connectives added: the constructive existential, and the constructive disjunction (which could be defined in terms of the constructive existential).
The constructive disjunction and constructive existential connective has semantics given by the BKH interpretation[1].
A proof of the constructive disjunction A + B is either a proof of A or a proof of B. A proof of the constructive existential Σn:N. P(n) is a pair <n,p> where n is a natural number and p is a proof of P(n).
So constructive mathematics isn't classical mathematics hindered, but constructive mathematics is mathematics enhanced with a more expressive language. Naturally constructive mathematics tends to focus on theorems written in this more expressive language that make use of these new constructive connectives, rather than focusing on the classical fragment which is already well studied.
For the non-layman: we can see that the above interpretation is correct because the fundamental logical rules for universal quantification, implication, conjunction, false and true are all identical between classical and constructive logic. And, according to classical mathematicians at least, this set of logical combinatorics is complete. Thus the law of excluded middle holds, even in constructive logic, so long as you use the classical disjunction, which is defined in terms of conjunction, implication, and false. Similarly you can prove that proof by contraction, i.e. double negation elimination ¬ ¬A -> A is a theorem of constructive logic whenever A is a formula that does not use the constructive disjunction or constructive existential (where negation is defined in terms of implication and false).
For the all of the above, I'm particularly focused on first-order number theory, however all of the above carries over to higher-order theories, and I think it even holds for set theory (or at least some large part of set theory such as Zermelo set theory; I forget what happens with the replacement axiom). Higher-order constructive mathematics does come with the caveat that the Axiom of Choice is not a theorem of constructive mathematics even when it is phrased only using the classical existential.
AIUI, the Lean system does support constructive mathematics. You can add axioms and reasoning principles that make the system non-constructive, but they're entirely optional.
There does not seem to be any reason for mathematicians to be turned about by these engines using constructive mathematics. One can use excluded middle as an axiom and do all the classical mathematics one likes. What becomes much easier if the engine is constructive is telling which theorems depend on excluded middle and which do not.
This is in fact how Lean works - the basic logic is constructive and excluded middle is an "extra" axiom. But even that is not as elegant as just writing non-constructive statements in the negative fragment of the logic, where (AIUI) one can already reason classically with no need for extra axioms.
I know we've discussed on HN before whether the elegance of the negative fragment extends to practice, but something about Lean is that it doesn't have LEM as an axiom. It's a theorem following from (a version of) the axiom of choice, the existence of quotient types, proof irrelevance (proofs of the same theorem are equal), and proposition extensionality (logically equivalent propositions are equal). Lean doesn't let you construct things that materially depend on LEM however -- you can only use it to prove that already constructed things have desired properties.