Peano Nats in Interactive SMT
This is further explorations on the theme of using SMT as a core facility for python interactive proving.
https://www.philipzucker.com/sqrt2/ In this post I tried proving the irrationality of sqrt(2) using SMT. There was a nice proof I followed, but these two facts I couldn’t get to prove automatically.
mul_divides = ForAll([x,y,z], Implies(mul(x,y) == z, divides(z,x)))
two_prime = ForAll([x,y], Implies(divides(mul(x,y),2), Or(divides(x,2), divides(y,2))))
It appears to me that SMTs support of Int and Real is a trap for deeper theorems. This is not surprising. SMT solvers are a barrel of decision procedures strapped together via an egraph and SAT solver. There aren’t smart decision procedures for general mathematical questions. Properties of primes are notoriously tough https://en.wikipedia.org/wiki/Goldbach%27s_conjecture . Some properties of course are quite trivial so hope springs eternal.
My assumption is that if you want to tackle these properties, you need to retreat into more general syntactic approaches, using algebraic datatypes or undefined sorts. This is partially because the ematcher does not work well on Real or Int expressions. It normalizes them into something that can’t be matched on like uninterpreted sorts can be matched.
Because I want to use the high automation of z3 for simple questions though, I can defined reflection principles that just say that there is a homomorphism between the Nats and the built in Ints greater than 0.
Kernel
Here is our proof kernel. I use the built in prove
method which basically checks if the negation of the statement is satisfiable. Lemmas can be offered via the by keyword.
Temporarily admitting lemmas can be useful to just keep moving.
Some other features that might be nice in lemma
- Rip off the quantifiers for better countermodels when it fails
- Shorter timeout
- Check unsat cores and recommend which
by
were unnecessary. Or worse yet, if the lemma in question is not in the sat core, somehow you have unsatisfiable assumptions, which is indicative of an inconsistent axiomatization.
The distinction between theorems
and formula
is razor thin, which is nice for a fast blog post but not nice for a library or larger develpoment. It is only syntactic discipline. I should only be using named lemmas in the by
justifications and the stdout should never say countermodel
. It should be a sequence of proved
from z3 import *
B = BoolSort()
Z = IntSort()
def lemma(thm, by=[], admit=False):
if not admit:
prove(Implies(And(by), thm))
else:
print("Admitted!!", thm)
return thm
Nat Definitions
The basic defini9tions of Nat. I do use z3’s built in ADT capabilities. The distinction between this and undefined sorts with injectivity is interesting. ADTs disallow loopy expressions. This corresponds to an occurs check somewhere inside z3.
A (conservative extension) definition mechanism is that you are allowed to defined new symbols/functions if it has a defining equation that is terminating. I am assuming and not checking that these definitions are well founded. In other words, these definitions are axioms.
The importance of conservative extensions for the foundations of mathematics
Nat = Datatype("Nat")
Nat.declare('zero')
Nat.declare('succ', ('pred', Nat))
Nat = Nat.create()
n,m,k = Consts('n m k', Nat)
reify = Function('reify', Nat, Z)
reify_def = ForAll([n], reify(n) == If(Nat.is_zero(n), 0, reify(Nat.pred(n)) + 1))
x,y = Ints("x y")
reflect = Function('reflect', Z, Nat)
reflect_def = ForAll([x], reflect(x) == If(x <= 0, Nat.zero, Nat.succ(reflect(x-1))))
More interesting perhaps is the induction axiom schema. https://en.wikipedia.org/wiki/Axiom_schema An axiom schema is a infinite family of axiomns.
We could represent this in different possible ways.
- a stream or generator enumerating the axioms. This is too british museum
- a trusted function that takes in some data to pick the axiom out of the schema you’d like
- Fully internalizing this parametrization into z3 with perhaps some kind of axiom schema datatype parametrizing the axiom schema.
- Using some higher order logic
ArraySort(Nat,B)
to encode the schema asForAll([P],Implies(And(P(Nat.zero), ForAll([n], Implies(P(n), P(Nat.succ(n))))), ForAll([n], P(n))))
which can be directly asserted.
Anyway, I take the second option. It’s not so bad.
#def recurse(name,base,step):
# f = Function(name, Nat, IntSort())
# n = FreshInt()
# return f, ForAll([n], If(Nat.is_zero(n), base, step(n, f(Nat.pred(n),base,step))))
def induct(P):
return Implies(And(P(Nat.zero), ForAll([n], Implies(P(n), P(Nat.succ(n))))),
#-----------------------------------------------------------
ForAll([n], P(n)))
We can use this axiom schema to get some crucial properties, like that reflect . reify ~> id
and that the definition of a homomoprhism with respect to zero and succ.
reflect_reify = lemma(ForAll([n], reflect(reify(n)) == n), [reflect_def, reify_def, induct(lambda n: reflect(reify(n)) == n)])
reify_ge_0 = lemma(ForAll([n], reify(n) >= 0), [reify_def, induct(lambda n: reify(n) >= 0)])
zero_homo = lemma(reify(Nat.zero) == 0, [reify_def])
succ_homo = lemma(ForAll([n], reify(Nat.succ(n)) == 1 + reify(n)), [reify_def, induct(lambda n: reify(Nat.succ(n)) == 1 + reify(n))])
proved
proved
proved
proved
We can also define plus via reification and reflection. Immediately we can derive some computation rules not referring to Int. Maybe we should have taken these as primitive and then made plus_def
the theorem?
plus = Function('plus', Nat, Nat, Nat)
plus_def = ForAll([n, m], plus(n, m) == reflect(reify(n) + reify(m)))
plus_0_x = lemma(ForAll([n], plus(Nat.zero, n) == n), [reflect_reify, zero_homo, plus_def])
plus_succ_x = lemma(ForAll([n,m], plus(Nat.succ(n), m) == Nat.succ(plus(n,m))), [reflect_reify, reflect_def, reify_def, succ_homo, plus_def])
plus_homo_case_succ = lemma(ForAll([n,m], Implies(reify(plus(n,m)) == reify(n) + reify(m) ,
reify(plus(Nat.succ(n),m)) == reify(Nat.succ(n)) + reify(m) )), [reify_def, plus_succ_x])
#plus_homo = lemma(ForAll([n,m], reify(reflect(reify(n) + reify(m))) == reify(n) + reify(m)), [reflect_reify, reflect_def, reify_def, induct(lambda n: ForAll([m], reify(reflect(reify(n) + reify(m))) == reify(n) + reify(m)))])
plus_homo = lemma(ForAll([n,m], reify(plus(n,m)) == reify(n) + reify(m)),
[plus_def, reflect_reify, reflect_def, reify_def, plus_homo_case_succ, induct(lambda n: ForAll([m], reify(plus(n,m)) == reify(n) + reify(m)))], admit=False)
#plus_l_expand = lemma(ForAll([n, m, k], plus(plus(n, m), k) == reflect(reify(reflect(reify(n) + reify(m))) + reify(k))), [plus_def])
proved
proved
proved
proved
plus_comm and plus_assoc are immediate from properties of Int. This is not how I started. I started by trying to prove plus_assoc and then realizing that really the concept of plus_homo was an important intermediary. This is often how these things go.
#_2 = lemma(ForAll([n,m,k], reflect(reify(reflect(reify(n) + reify(m))) + reify(k)) == reflect(reify(n) + reify(m) + reify(k))), [reflect_reify, plus_homo])
plus_comm = lemma(ForAll([n, m], plus(n, m) == plus(m, n)), [plus_def])
plus_assoc = lemma(ForAll([n, m, k], plus(plus(n, m), k) == plus(n, plus(m, k))), [plus_def, plus_homo])
one = Nat.succ(Nat.zero)
two = Nat.succ(Nat.succ(Nat.zero))
proved
proved
Second verse same as the first for multiplication.
mul = Function("mul", Nat,Nat, Nat)
mul_def = ForAll([n,m], mul(n,m) == reflect(reify(n) * reify(m)))
mul_0_x = lemma(ForAll([m], mul(Nat.zero,m) == Nat.zero), [reflect_def, reify_def, mul_def])
#mul_succ_x = lemma(ForAll([n,m], mul(Nat.succ(n), m) == reflect(reify(Nat.succ(n))*reify(m))), [plus_homo, mul_def, plus_def, succ_homo, reflect_reify, reflect_def, reify_def])
mul_succ_x_lemma = lemma(ForAll([n,m], reflect(reify(Nat.succ(n)) * reify(m)) == reflect((1 + reify(n)) * reify(m))), [succ_homo])
mul_succ_x_lemma2 = lemma(ForAll([n,m], reflect((1 + reify(n)) * reify(m)) == reflect(reify(m) + reify(n) * reify(m))))
# Hmmm. Need to think about how to dignify this principal.
reify_reflect = lemma(ForAll([x], Implies(x >= 0, reify(reflect(x)) == x)), [reify_def, reflect_def], admit=True)
mul_succ_x_lemma3 = lemma(ForAll([n,m], reflect(reify(m) + reify(n) * reify(m)) == reflect(reify(m) + reify(reflect(reify(n) * reify(m))))),
[reflect_reify, reify_reflect, reflect_def, reify_def])
mul_succ_x_lemma4 = lemma(ForAll([n,m], reflect(reify(m) + reify(reflect(reify(n) * reify(m)))) == plus(m, mul(n,m))),
[plus_def, mul_def])
#mul_succ_x = lemma(ForAll([n,m], mul(Nat.succ(n), m) == plus(m, mul(n,m))), [mul_def, reify_reflect, mul_succ_x_lemma, mul_succ_x_lemma2, mul_succ_x_lemma3, mul_succ_x_lemma2, plus_homo, reflect_reify, plus_def])
mul_succ_x = lemma(ForAll([n,m], mul(Nat.succ(n), m) == plus(m, mul(n,m))), [mul_def, plus_def, mul_succ_x_lemma, mul_succ_x_lemma2, mul_succ_x_lemma3, mul_succ_x_lemma4])
mul_comm = lemma(ForAll([n,m], mul(n,m) == mul(m,n)), [mul_def])
proved
proved
proved
Admitted!! ForAll(x, Implies(x >= 0, reify(reflect(x)) == x))
proved
proved
proved
proved
Could cvc5 show some of the things that z3 couldn’t in my post? Can I get the even odd style proof to go through
https://coq.inria.fr/doc/v8.10/stdlib/Coq.Init.Nat.html
even and odd kind of induct by 2. Maybe I need to derive this principle first?
lt = Function("lt", Nat, Nat, B)
lt_def = ForAll([n,m], lt(n,m) == reify(n) < reify(m))
max_ = Function("max", Nat, Nat, Nat)
max_def = ForAll([n,m], max_(n,m) == If(lt(n,m), m, n))
even = Function("even", Nat, B)
even_def = ForAll([n], even(n) == If(Nat.is_zero(n), True,
If(Nat.is_zero(Nat.pred(n)), False,
even(Nat.pred(Nat.pred(n))))))
odd = Function("odd", Nat, B)
odd_def = ForAll([n], odd(n) == If(Nat.is_zero(n), False,
If(Nat.is_zero(Nat.pred(n)), True,
odd(Nat.pred(Nat.pred(n))))))
# Sanity checking
even_zero = lemma(even(Nat.zero), [even_def])
not_even_one = lemma(Not(even(one)), [even_def])
odd_even = lemma(odd(one), [odd_def])
two_even = lemma(even(two), [even_def])
even_two = lemma(ForAll([n], even(Nat.succ(Nat.succ(n))) == even(n)), [even_def, induct(lambda n: even(Nat.succ(Nat.succ(n))) == even(n))])
odd_two = lemma(ForAll([n], odd(Nat.succ(Nat.succ(n))) == odd(n)), [odd_def, induct(lambda n: odd(Nat.succ(Nat.succ(n))) == odd(n))])
even_one = lemma(ForAll([n], even(n) == Not(even(Nat.succ(n)))), [even_def, even_two, induct(lambda n: even(n) == Not(even(Nat.succ(n))))])
odd_one = lemma(ForAll([n], odd(n) == Not(odd(Nat.succ(n)))), [odd_def, odd_two, induct(lambda n: odd(n) == Not(odd(Nat.succ(n))))])
# I am not sure why I needed these lemmas.
even_or_odd = lemma(ForAll([n], even(n) != odd(n)), [even_def, odd_def, even_two, even_one, odd_one, odd_two, induct(lambda n: even(n) != odd(n))])
proved
proved
proved
proved
proved
proved
proved
proved
proved
plus_x_x_even = lemma(ForAll([n], even(plus(n,n))), [plus_succ_x, even_def, even_two, even_one, induct(lambda n: even(plus(n,n)))], admit=True)
#mul_2_even = lemma(ForAll([n], even(mul(two, n))), by=[mul_succ_x])
failed to prove
---------------------------------------------------------------------------
Z3Exception Traceback (most recent call last)
File ~/.local/lib/python3.10/site-packages/z3/z3.py:7130, in Solver.model(self)
7129 try:
-> 7130 return ModelRef(Z3_solver_get_model(self.ctx.ref(), self.solver), self.ctx)
7131 except Z3Exception:
File ~/.local/lib/python3.10/site-packages/z3/z3core.py:4065, in Z3_solver_get_model(a0, a1, _elems)
4064 r = _elems.f(a0, a1)
-> 4065 _elems.Check(a0)
4066 return r
File ~/.local/lib/python3.10/site-packages/z3/z3core.py:1475, in Elementaries.Check(self, ctx)
1474 if err != self.OK:
-> 1475 raise self.Exception(self.get_error_message(ctx, err))
Z3Exception: b'there is no current model'
During handling of the above exception, another exception occurred:
Z3Exception Traceback (most recent call last)
/tmp/ipykernel_390381/4126509453.py in ?()
----> 1 plus_x_x_even = lemma(ForAll([n], even(plus(n,n))), [plus_succ_x, even_def, even_two, even_one, induct(lambda n: even(plus(n,n)))])
2 #mul_2_even = lemma(ForAll([n], even(mul(two, n))), by=[mul_succ_x])
/tmp/ipykernel_390381/3613725135.py in ?(thm, by, admit)
5 def lemma(thm, by=[], admit=False):
6 if not admit:
----> 7 prove(Implies(And(by), thm))
8 else:
9 print("Admitted!!", thm)
10 return thm
~/.local/lib/python3.10/site-packages/z3/z3.py in ?(claim, show, **keywords)
9087 if r == unsat:
9088 print("proved")
9089 elif r == unknown:
9090 print("failed to prove")
-> 9091 print(s.model())
9092 else:
9093 print("counterexample")
9094 print(s.model())
~/.local/lib/python3.10/site-packages/z3/z3.py in ?(self)
7128 """
7129 try:
7130 return ModelRef(Z3_solver_get_model(self.ctx.ref(), self.solver), self.ctx)
7131 except Z3Exception:
-> 7132 raise Z3Exception("model is not available")
Z3Exception: model is not available
div = Function("div", Nat, Nat, B)
div_def = ForAll([m,n], div(m,n) == Exists([k], mul(k,n) == m))
mul_1_x_1 = lemma(ForAll([n], mul(one, n) == plus(n,Nat.zero)), [mul_0_x, mul_succ_x, plus_0_x, plus_succ_x])
mul_1_x = lemma(ForAll([n], mul(one, n) == n), [mul_1_x_1, plus_0_x, plus_comm])
mul_x_1 = lemma(ForAll([n], mul(n, one) == n), [mul_1_x, mul_comm])
div_x_1 = lemma(ForAll([n], div(n, one)), [div_def, mul_x_1])
mul_two_x = lemma(ForAll([n], mul(two, n) == plus(n,n)), [mul_succ_x, mul_1_x, plus_0_x, plus_succ_x, plus_comm])
even_plus_0_0 = lemma(even(plus(Nat.zero, Nat.zero)), [even_def, plus_0_x])
even_succ_succ = lemma(ForAll([n], Implies(even(plus(n,n)), even(plus(Nat.succ(n), Nat.succ(n))))), [even_def, even_two, plus_succ_x, plus_comm])
even_plus_x_x = lemma(ForAll([n], even(plus(n,n))), [even_plus_0_0, even_succ_succ, induct(lambda n: even(plus(n,n)))])
mul_two_even = lemma(ForAll([n], even(mul(two, n))), [mul_two_x, even_plus_x_x])
proved
proved
proved
proved
proved
proved
proved
proved
proved
_1 = lemma(ForAll([n], even(mul(two,mul(n,n)))), by=[mul_two_even])
_2 = lemma(ForAll([m,n], Implies(mul(two,mul(n,n)) == mul(m,m), even(mul(m,m)))), by = [_1])
mul_odd_odd = lemma(ForAll([n], odd(n) == odd(mul(n,n))), [], admit=True)
mul_even_even = lemma(ForAll([n], even(n) == even(mul(n,n))), [], admit=True)
_3 = lemma(ForAll([m,n], Implies(mul(two,mul(n,n)) == mul(m,m), even(m))), by = [_2, mul_even_even])
# Ugh. It still feels far away. Still, it seems like I can chip away at it. Nothing is stopping me.
proved
Admitted!! ForAll(n, odd(n) == odd(mul(n, n)))
Admitted!! ForAll(n, even(n) == even(mul(n, n)))
proved
proved
Bits and Bobbles
It really does have the flavor of using an interactive assistant. I stated many things which didn’t go through or hung until I went through more carefully and first proved the relevant lemmas.
Z3 handles logical equivalences quite well. This is kind of a whole pile of stuff that you do manually at first in coq etc. We immeditately jump into meatier things.
It has poor induction support. Vampire has some?
I could internalize induction as a higher order statement about ArraySort(Nat,B)
predicates. It might not do much better even with that.
You can export smtlib formula and consume them in vampire or cvc5. A sledgehammer/why3 spread spectrum approach is possible.
Being so shallow makes it had to have a theorem database to try and search over.
No good simp. Z3 simplify I think can only use it’s built in rules. Could build an external one, or interop with egglog
My inductions we mostly just restating the theorem. I should really deduplicate that somehow. Maybe enforce induction schema to use z3 Lambda instead of python. Eh.
For these equational proofs, if we tossed them into twee, they give quite pretty proofs
I’ve been at this for quite some time on and off.
- https://www.philipzucker.com/programming-and-interactive-proving-with-z3py/
- https://www.philipzucker.com/more-stupid-z3py-tricks-simple-proofs/
- https://www.philipzucker.com/proving-some-inductive-facts-about-lists-using-z3-python/
Is what I’m doing now any better than then?
For equational proofs it;s nice to write t1 == t2 == t3 == t4. The calc combinator can do this. Lean has a calc and so does dafny. Also agda had a style like this.
def calc(vars, *args, by=[]):
lemmas = []
for lhs,rhs in zip(args[:-1],args[1:]) :
lemmas.append(lemma(ForAll(vars, lhs == rhs), by))
return lemma(ForAll(vars, args[0] == args[-1]), by=lemmas)
#calc([n,m,k], one + (a + b), m, k)
counterexample
[]
counterexample
[]
proved
∀n, m, k : n = k
# maps into IntSort() put into linarith
linarith = []
# I_homo as a structure/typeclass.
# put ring axioms into ring? Meh.
# ring = []
Adding pattern matching to z3 post hoc. Very useful if we want to start defining introspective tactics in python. simp
, unify
, apply
for example.
type(n).__mro__
AstRef.childs = property(lambda self: self.children())
AstRef.head = property(lambda self: self.decl())
AstRef.__match_args__ = ("head", "childs")
match plus(n,n):
case AstRef(head=mul, childs=foo):
print(foo)
print(x)
print(mul)
mul
[n, n]
<__main__.Foo object at 0x7594f3f77eb0>
plus
plus
What about euclidean geometry?
Point = Datatype("Point")
Point.declare("mk-point", ("x", Z), ("y", Z))
Point = Point.create()
DSegment = Datatype("DSegment")
DSegment.declare("mk-dseg", ("a", Point), ("b", Point))
DSegment = DSegment.create()
# line is equivalence class of segments.
# we can't really give canonical coordinates to lines.
# ax + by = c is scale invariant
# y = mx + b can't represent vertical lines.
# y = lam (a - b) + b could use other points
Maybe more intriguing is to do projective geomtry first though
Here we can use the same kind of reify/reflect into coordinates, which are a pretty dominant method for geometric proving as I understand it (Wu’s method, grobner, etc).
Point = DeclareSort("Point")
R = RealSort()
x,y,z,p,q,w = Consts("x y z p q w", R)
mkpoint = Function("mk-point", R, R, R, Point)
point_equal = ForAll([x,y,z,p,q,w], And(x*q - y*p == 0, y*w - z*q == 0, z*p - x*w == 0) == (mkpoint(x,y,z) == mkpoint(p,q,w)))
X = Function("X", Point, R)
Y = Function("Y", Point, R)
Z = Function("Z", Point, R)
mkpoint_XYZ = ForAll([p], mkpoint(X(p), Y(p), Z(p)) == p)
# right to left is trivial.
# left to right
coord_unique = ForAll([p],And(X(p) == X(q), Y(p) == Y(q), Z(p) == Z(p)) == (p == q))
# also could do exists lam, v = lam * w as equality.
class Foo():
pass
Bar = Foo
x = Bar()
match x:
case Bar():
print("bar")
case Foo():
print("foo")
case _:
print(type(x))
Fn = z3.ExprRef
match plus(n,m):
case Fn(plus, [y,z]):
print(n,m)
def match_(t, p, vars):
subst = {}
todo = [(t,p)]
while todo:
t,p = todo.pop()
match t,p:
case x, Fn(v, []) if p in vars:
if v in subst:
if subst[v].eq(x):
pass
else:
return None
else:
subst[v] = x
case Fn(f, args1), Fn(f2, args2):
if f1 == f2 and len(args1) == len(args2):
for a1,a2 in zip(args1, args2):
match_(a1, a2, vars)
else:
return None
case _:
print("no match")
return subst
def kbo():
pass
def resolve():
pass
def auto():
pass
bar
n m
#type(ForAll([n,m], plus(n,m) == plus(m,n)))
#PForAll = QuantifierRef(x) if x.is_forall()
# https://www.hillelwayne.com/post/python-abc/
from abc import ABC
class PForAll(ABC):
@classmethod
def __subclasshook__(cls, C):
return isinstance(C, QuantifierRef) and C.is_forall()
z3.z3.QuantifierRef