A cute little calculus exists for intuitionistic propositional logic.

If you naively follow the sequent calculus rules of intuitionistic logic, you may end up looping.

Specializing the left rule for implication to the various connectives, we can find variants of this rule where we are clearly making progress. This Contraction-Free Sequent Calculi for Intuitionisitc Logic by Roy Dyckhoff https://www.cs.cmu.edu/~fp/courses/15317-f08/cmuonly/dyckhoff92.pdf is a canonical reference, but also there are a ton of notes from CMU courses.

Doing this in python over the z3 AST is interesting from a number of angles. It seems like a nice platform to experiment with intuitionistic proving leveraging SAT and SMT technology.

Here was can write a couple of example problems. We can easily ask z3 if they are classically valid formulas.

from z3 import *
a,b,c,d = Bools('a b c d')
examples = [
    a,
    Implies(a,a),
    Or(a, Not(a)), # excluded middle https://en.wikipedia.org/wiki/Law_of_excluded_middle
    Or(a,b),
    Implies(a,Implies(b,a)),
    Implies(And(a,b), And(b,a)),
    Implies(And(a,b), a),
    Implies(Implies(Implies(a,b),a),a),     # Pierce's law https://en.wikipedia.org/wiki/Peirce%27s_law
    Implies(Not(Not(a)), a),
]
def prove_classical(p):
    s = Solver()
    s.add(Not(p))
    if s.check() == unsat:
        return True, None
    else:
        return False, s.model()

for e in examples:
    valid, countermodel = prove_classical(e)
    if valid:
        print(f"{e}, Classically Valid")
    else:
        print(f"{e}, Classically Invalid. Countermodel {countermodel}")

a, Classically Invalid. Countermodel [a = False]
Implies(a, a), Classically Valid
Or(a, Not(a)), Classically Valid
Or(a, b), Classically Invalid. Countermodel [b = False, a = False]
Implies(a, Implies(b, a)), Classically Valid
Implies(And(a, b), And(b, a)), Classically Valid
Implies(And(a, b), a), Classically Valid
Implies(Implies(Implies(a, b), a), a), Classically Valid
Implies(Not(Not(a)), a), Classically Valid

We can write the LJT prover in python. It is rather similar in form to the ground prolog interpreter. This isn’t really an accident, since they are both implement a sequent calculus search. https://www.philipzucker.com/ground_lambda_prolog/

It is much tighter to just return a boolean rather than more data. We can use the short any and all combinators in particular.

Python has pretty bad support for immutable data structures. The easiest thing to do to add to a list is just a full copy, which can be achieve by mylist + [appended_element]. The freeing thing about python is that it’ll be slow no matter what, so might as well do what is easiest.

A tracing list can give some clues into the execution without too much annotation pain or magic.

There are a number of conceptual insights in the prover. Generally speaking, a big part of the game for theorem proving (or any search problem) is to notice redundancies in possible proofs and get rid of them. In this case, it is noticed that not all the rules of the sequent calculus need backtracking. We can eagerly apply those rules in an “inversion” phase of the prover.

If you need to prove A /\ B, you’ll need to prove A and you’ll need to prove B. There’s not another option. That is why the right And is in the right_inv function.

However, if you need to prove A \/ B, you can prove A or you can prove B. You don’t need to prove both. And you need to make a guess which you need to prove. This is why the right Or rule is in the search function.

A constant source of mild pain is that not rules should be read as a -> _|_ rules.

I was tearing my hair out debugging this. Might be correct.

Source https://www.cs.cmu.edu/~crary/317-f23/homeworks/hw7-handout.pdf

def prove(goal):
    trace = []
    def right_inv(ctx, invctx, goal):
        #print("right",ctx, invctx, goal)
        trace.append(("right",ctx, invctx, goal))
        if is_true(goal): # R true
            return True
        elif is_and(goal): # R and
            return all(right_inv(ctx,invctx,subgoal) for subgoal in goal.children())
        elif is_implies(goal): # R impl
            return right_inv(ctx, invctx + [goal.arg(0)], goal.arg(1))
        elif is_not(goal): # treat not as `a -> False``
            return right_inv(ctx, invctx, Implies(goal.arg(0), BoolVal(False)))
        elif is_false(goal) or is_or(goal) or is_const(goal): # not right invertible
            return left_inv(ctx, invctx, goal)
        else:
            raise Exception(f"Unexpected formula in right_inv {goal}")
    def left_inv(ctx, invctx, goal):
        #print("left", ctx, invctx, goal)
        trace.append(("left",ctx, invctx, goal))
        if len(invctx) == 0:
            return search(ctx, goal)
        c = invctx[-1]
        invctx = invctx[:-1]
        if is_false(c):
            return True
        elif is_and(c): # L and
            return left_inv(ctx, invctx + c.children(), goal)
        elif is_or(c): # L or
            return all(left_inv(ctx, invctx + [c], goal) for c in c.children())
        elif is_implies(c): # Some specializations of L impl
            hyp,conc = c.children()
            if is_false(hyp):
                return left_inv(ctx, invctx, goal)
            elif is_true(hyp):
                return left_inv(ctx, invctx + [conc], goal)
            elif is_and(hyp):
                # curry all
                for x in hyp.children():
                    conc = Implies(x, conc)
                return left_inv(ctx, invctx + [conc], goal)
            elif is_or(hyp):
                return left_inv(ctx, invctx + [Implies(c,conc) for c in hyp.children()], goal)
            elif is_implies(hyp) or is_const(hyp) or is_not(hyp):
                return left_inv(ctx + [c], invctx, goal)
            else:
                raise Exception(f"Unexpected implication in left_inv {c}")
        elif is_const(c):
            return left_inv(ctx + [c], invctx, goal)
        elif is_not(c): # Turn not into intuitinistic c.arg(0) -> bottom
            return left_inv(ctx, invctx + [Implies(c.arg(0), BoolVal(False))], goal)
        else:
            raise Exception(f"Unexpected formula in left_inv {c}")
    def search(ctx, goal):
        #print("search", ctx, goal)
        trace.append(("search",ctx, goal))
        if any(c.eq(goal) for c in ctx): # a slightly extened prop rule. is_const()
            return True
        if is_or(goal):
            if any(right_inv(ctx, [], c) for c in goal.children()):
                return True
        for i, c in enumerate(ctx):
            if is_implies(c):
                hyp,conc = c.children()
                if any(c.eq(hyp) for c in ctx): # hmm. maybe this is ok. Left implprop
                    ctx1 = ctx[:i] + ctx[i+1:]
                    if left_inv(ctx1, [conc], goal):
                        return True
                if is_not(hyp):
                    hyp = Implies(hyp.arg(0), BoolVal(False))
                if is_implies(hyp): # left impl impl
                    a,b = hyp.children()
                    ctx1 = ctx[:i] + ctx[i+1:]
                    return right_inv(ctx1, [Implies(b, conc), a], b) and \
                           left_inv(ctx1, [conc], goal)
        return False
    return right_inv([],[],goal), trace
    
assert not prove(Not(a))[0]
assert not prove(a)[0]
assert prove(Implies(a,a))[0]
assert not prove(Or(a, Not(a)))[0]
assert prove(Not(Not(Or(a, Not(a)))))[0]
assert prove(Implies(a,Implies(b,a)))[0]
assert prove(Implies(And(a,b), And(b,a)))[0]
assert prove(Implies(And(a,b), a))[0]
assert not prove(Implies(Implies(Implies(a,b),a),a))[0]
assert prove(Not(Not(Implies(Implies(Implies(a,b),a),a))))[0]

Some simple examples

prove(Implies(a,a))

(True,
 [('right', [], [], Implies(a, a)),
  ('right', [], [a], a),
  ('left', [], [a], a),
  ('left', [a], [], a),
  ('search', [a], a)])

prove(Implies(a,Implies(b,a)))

(True,
 [('right', [], [], Implies(a, Implies(b, a))),
  ('right', [], [a], Implies(b, a)),
  ('right', [], [a, b], a),
  ('left', [], [a, b], a),
  ('left', [b], [a], a),
  ('left', [b, a], [], a),
  ('search', [b, a], a)])

prove(Implies(And(a,b), And(b,a)))

(True,
 [('right', [], [], Implies(And(a, b), And(b, a))),
  ('right', [], [And(a, b)], And(b, a)),
  ('right', [], [And(a, b)], b),
  ('left', [], [And(a, b)], b),
  ('left', [], [a, b], b),
  ('left', [b], [a], b),
  ('left', [b, a], [], b),
  ('search', [b, a], b),
  ('right', [], [And(a, b)], a),
  ('left', [], [And(a, b)], a),
  ('left', [], [a, b], a),
  ('left', [b], [a], a),
  ('left', [b, a], [], a),
  ('search', [b, a], a)])

The law of excluded middle is not an intuitionistic theorem https://en.wikipedia.org/wiki/Law_of_excluded_middle

prove(Or(a, Not(a)))

(False,
 [('right', [], [], Or(a, Not(a))),
  ('left', [], [], Or(a, Not(a))),
  ('search', [], Or(a, Not(a))),
  ('right', [], [], a),
  ('left', [], [], a),
  ('search', [], a),
  ('right', [], [], Not(a)),
  ('right', [], [], Implies(a, False)),
  ('right', [], [a], False),
  ('left', [], [a], False),
  ('left', [a], [], False),
  ('search', [a], False)])

Pierce’s law https://en.wikipedia.org/wiki/Peirce%27s_law which has something to do with continuation passing is not an intuitionistic theorem.

pierce = Implies(Implies(Implies(a,b),a),a)
prove(pierce)

(False,
 [('right', [], [], Implies(Implies(Implies(a, b), a), a)),
  ('right', [], [Implies(Implies(a, b), a)], a),
  ('left', [], [Implies(Implies(a, b), a)], a),
  ('left', [Implies(Implies(a, b), a)], [], a),
  ('search', [Implies(Implies(a, b), a)], a),
  ('right', [], [Implies(b, a), a], b),
  ('left', [], [Implies(b, a), a], b),
  ('left', [a], [Implies(b, a)], b),
  ('left', [a, Implies(b, a)], [], b),
  ('search', [a, Implies(b, a)], b)])

The double negation translation turns a provable classical formulas into provable intuitionistic ones https://en.wikipedia.org/wiki/Double-negation_translation . The simplest form for propositional formulas just puts two negations out front. Now excluded middle and pierces law are provable.

prove(Not(Not(Or(a, Not(a)))))

(True,
 [('right', [], [], Not(Not(Or(a, Not(a))))),
  ('right', [], [], Implies(Not(Or(a, Not(a))), False)),
  ('right', [], [Not(Or(a, Not(a)))], False),
  ('left', [], [Not(Or(a, Not(a)))], False),
  ('left', [], [Implies(Or(a, Not(a)), False)], False),
  ('left', [], [Implies(a, False), Implies(Not(a), False)], False),
  ('left', [Implies(Not(a), False)], [Implies(a, False)], False),
  ('left', [Implies(Not(a), False), Implies(a, False)], [], False),
  ('search', [Implies(Not(a), False), Implies(a, False)], False),
  ('right', [Implies(a, False)], [Implies(False, False), a], False),
  ('left', [Implies(a, False)], [Implies(False, False), a], False),
  ('left', [Implies(a, False), a], [Implies(False, False)], False),
  ('left', [Implies(a, False), a], [], False),
  ('search', [Implies(a, False), a], False),
  ('left', [a], [False], False),
  ('left', [Implies(a, False)], [False], False)])

prove(Not(Not(pierce)))

(True,
 [('right', [], [], Not(Not(Implies(Implies(Implies(a, b), a), a)))),
  ('right',
   [],
   [],
   Implies(Not(Implies(Implies(Implies(a, b), a), a)), False)),
  ('right', [], [Not(Implies(Implies(Implies(a, b), a), a))], False),
  ('left', [], [Not(Implies(Implies(Implies(a, b), a), a))], False),
  ('left', [], [Implies(Implies(Implies(Implies(a, b), a), a), False)], False),
  ('left', [Implies(Implies(Implies(Implies(a, b), a), a), False)], [], False),
  ('search', [Implies(Implies(Implies(Implies(a, b), a), a), False)], False),
  ('right', [], [Implies(a, False), Implies(Implies(a, b), a)], a),
  ('left', [], [Implies(a, False), Implies(Implies(a, b), a)], a),
  ('left', [Implies(Implies(a, b), a)], [Implies(a, False)], a),
  ('left', [Implies(Implies(a, b), a), Implies(a, False)], [], a),
  ('search', [Implies(Implies(a, b), a), Implies(a, False)], a),
  ('right', [Implies(a, False)], [Implies(b, a), a], b),
  ('left', [Implies(a, False)], [Implies(b, a), a], b),
  ('left', [Implies(a, False), a], [Implies(b, a)], b),
  ('left', [Implies(a, False), a, Implies(b, a)], [], b),
  ('search', [Implies(a, False), a, Implies(b, a)], b),
  ('left', [a, Implies(b, a)], [False], b),
  ('left', [Implies(a, False)], [a], a),
  ('left', [Implies(a, False), a], [], a),
  ('search', [Implies(a, False), a], a),
  ('left', [], [False], False)])

Bits and Bobbles

Obvious performance improvements:

Try to loopify instead of recursive call.
Full copying of context isn’t great. python has terrible immutable data structure support though.

Threading proof objects. Really only search has tough info in it. Everything else is reconstructible easily. That’s the point. Pretty printing proof trees.

Choice of who owns the lists. In rust for example

Pushing an popping

https://research-repository.st-andrews.ac.uk/bitstream/handle/10023/8824/BernPaper14.pdf?sequence=1 Intuitionistic decision procedures since Gentzen 2014

https://www.iltp.de/results.html intuitionsitc logic prover comparsion and databank

I actually did add nanoCop-i to knuckjeldragger

https://www.cs.cmu.edu/~crary/317-f12/recitations/g4ip.pl

https://www.cs.cmu.edu/~fp/courses/15317-f08/cmuonly/dyckhoff92.pdf

https://www.cs.cmu.edu/~crary/317-f23/homeworks/hw7-handout.pdf

https://www.cs.cmu.edu/~fp/courses/15317-s23/recitations/rec09.pdf

https://www.cs.cmu.edu/~crary/317-f23/lectures/12-inversion.pdf

https://www.cs.cmu.edu/~fp/courses/atp/handouts/atp.pdf

https://web2.qatar.cmu.edu/cs/15317/lectures/11-g4ip.pdf

https://www.csl.sri.com/users/sgl/Work/Reports/2019-07-G4-SMT-Tableaux19.pdf A proof-theoretic perspective on SMT-solving for intuitionistic propositional logic intuit Use SAT solver to prune ? Use UNSAT proof as clues?

https://ceur-ws.org/Vol-2271/paper1.pdf Shaving with Occam’s Razor: Deriving Minimalist Theorem Provers for Minimal Logic Paul Tarau

https://github.com/ptarau/TypesAndProofs/blob/master/third_party/dyckhoff_orig.pro

Using sets to avoid bothering to add in redundant subsumed stuff. SAT guidance. Sat models (prune unpropvable branches. A classical countermodel is also a intuitionistic countermodel) and UNSAT certs. How to use an UNSAT cert is much less clear.

https://research-repository.st-andrews.ac.uk/bitstream/handle/10023/8824/BernPaper14.pdf?sequence=1 If goal becomes bottom, can revert to classical logic. simplification not p in context we can replace p everywhere by bottom.

It is an interesting question which of the theories of z3 in the ground case are fine intuitionsitcally. One the meta LJT has refined to pure thoery literals, can we trust the conlcusion of just the theory solvers? For decidable theories, it feels like yes. Really, for decidable theories everything classical is fine anyway? It is just the “non classical” booleans DeclareSort("Prop") we should do this proof procedure on. The theory of arrays is a theory of functions with extensionality baked in. That does seem suspect.

SMT modulo ILP

“Prop” is kind of like quote intuit_quote(p)

Ok yeah, I guess I never got this working.

import kdrag.solvers as solvers
s = solvers.NanoCopISolver()
s.add(Not(pierce))
s.check()

---------------------------------------------------------------------------

KeyboardInterrupt                         Traceback (most recent call last)

Cell In[36], line 4
      2 s = solvers.NanoCopISolver()
      3 s.add(Not(pierce))
----> 4 s.check()


File ~/Documents/python/knuckledragger/kdrag/solvers/__init__.py:675, in NanoCopISolver.check(self)
    672 if "timeout" in self.options:
    673     cmd.extend(str(self.options["timeout"] // 1000 + 1))
--> 675 self.res = subprocess.run(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    677 if b"intuitionistically Satisfiable" in self.res.stdout:
    678     return smt.sat


File /usr/lib/python3.10/subprocess.py:505, in run(input, capture_output, timeout, check, *popenargs, **kwargs)
    503 with Popen(*popenargs, **kwargs) as process:
    504     try:
--> 505         stdout, stderr = process.communicate(input, timeout=timeout)
    506     except TimeoutExpired as exc:
    507         process.kill()


File /usr/lib/python3.10/subprocess.py:1154, in Popen.communicate(self, input, timeout)
   1151     endtime = None
   1153 try:
-> 1154     stdout, stderr = self._communicate(input, endtime, timeout)
   1155 except KeyboardInterrupt:
   1156     # https://bugs.python.org/issue25942
   1157     # See the detailed comment in .wait().
   1158     if timeout is not None:


File /usr/lib/python3.10/subprocess.py:2021, in Popen._communicate(self, input, endtime, orig_timeout)
   2014     self._check_timeout(endtime, orig_timeout,
   2015                         stdout, stderr,
   2016                         skip_check_and_raise=True)
   2017     raise RuntimeError(  # Impossible :)
   2018         '_check_timeout(..., skip_check_and_raise=True) '
   2019         'failed to raise TimeoutExpired.')
-> 2021 ready = selector.select(timeout)
   2022 self._check_timeout(endtime, orig_timeout, stdout, stderr)
   2024 # XXX Rewrite these to use non-blocking I/O on the file
   2025 # objects; they are no longer using C stdio!


File /usr/lib/python3.10/selectors.py:416, in _PollLikeSelector.select(self, timeout)
    414 ready = []
    415 try:
--> 416     fd_event_list = self._selector.poll(timeout)
    417 except InterruptedError:
    418     return ready


KeyboardInterrupt: 

https://en.wikipedia.org/wiki/Double-negation_translation

Glivenko Kuroda Godel-Gentzen

def glivenko(p):
    return Not(Not(p))

def double_negate(p):
    if is_not(p):
        return Not(double_negate(p.arg(0)))
    elif is_and(p):
        return And([double_negate(c) for c in p.children()])
    elif is_or(p):
        return Not(And([Not(double_negate(c)) for c in p.children()]))
    elif is_implies(p):
        return Implies(double_negate(p.arg(0)), double_negate(p.arg(1)))
    elif is_quantifier(p):
        if p.is_forall():
            vs, body = open_binder(p)
            return ForAll(vs, double_negate(body))
        elif p.is_exists():
            vs, body = open_binder(p)
            return Not(ForAll(vs, Not(double_negate(body))))
        else:
            raise Exception(f"Unexpected formula {p}")
    elif is_const(p):
        return Not(Not(p))
    else:
        raise Exception(f"Unexpected formula {p}")

Dialectica

def dialectica(p): if is_and(p): for a in p.children(): params = dialectica(a) zip(*params) elif is_or(p): z = FreshInt() for i,q in enumerate(p.children()): qd,exs,alls = dialectica(q) Implies(z == i, dialectica(q)) return exs+[z],alls elif is_implies(p): hyp,conc = p.children() hypd, exs, alls = dialectica(hyp) concd, exs2, alls2 = dialectica(conc) d = Implies(And(hypd, f(x,w) == alls, g(x) == exs2), concd) return d, exs + exs2 + [f,g], alls+alls2 g = FreshConst(ArraySort()) f = FreshConst(ArraySort())

Returning proof objects. Return form vs Using set for ctx.

def right_inv(ctx, invctx, goal):
    print(f"right {ctx} ;{invctx} |- {goal}")
    if is_true(goal): # R true
        return True
    elif is_and(goal): # R and
        return all(right_inv(ctx,invctx,subgoal) for subgoal in goal.children())
    elif is_implies(goal): # R impl
        return right_inv(ctx, invctx + [goal.arg(0)], goal.arg(1))
    else:
        return left_inv(ctx, invctx, goal)

for e in examples:
    print(e)
    right_inv([],[],e)

Implies(a, a)
right [] ;[] |- Implies(a, a)
right [] ;[a] |- a
After [] ; [a] |- a
Implies(a, Implies(b, a))
right [] ;[] |- Implies(a, Implies(b, a))
right [] ;[a] |- Implies(b, a)
right [] ;[a, b] |- a
After [] ; [a, b] |- a
Implies(And(a, b), And(b, a))
right [] ;[] |- Implies(And(a, b), And(b, a))
right [] ;[And(a, b)] |- And(b, a)
right [] ;[And(a, b)] |- b
After [] ; [And(a, b)] |- b
Implies(And(a, b), a)
right [] ;[] |- Implies(And(a, b), a)
right [] ;[And(a, b)] |- a
After [] ; [And(a, b)] |- a
Implies(Implies(Implies(a, b), a), a)
right [] ;[] |- Implies(Implies(Implies(a, b), a), a)
right [] ;[Implies(Implies(a, b), a)] |- a
After [] ; [Implies(Implies(a, b), a)] |- a
Implies(Not(Not(a)), a)
right [] ;[] |- Implies(Not(Not(a)), a)
right [] ;[Not(Not(a))] |- a
After [] ; [Not(Not(a))] |- a
Or(a, Not(a))
right [] ;[] |- Or(a, Not(a))
After [] ; [] |- Or(a, Not(a))
Or(a, b)
right [] ;[] |- Or(a, b)
After [] ; [] |- Or(a, b)

def search(ctx, goal):
    print(f"Search {ctx} |- {goal}")
    return False


def left_inv(ctx, invctx, goal):
    print(f"left {ctx} ; {invctx} |- {goal}")
    if len(invctx) == 0:
        return search(ctx, goal)
    c = invctx[0]
    invctx = invctx[:-1]
    if is_false(c):
        return True
    elif is_and(c):
        return left_inv(ctx, invctx + c.children(), goal)
    elif is_or(c):
        return all(left_inv(ctx, invctx + [c], goal) for c in c.children())
    elif is_implies(c):
        hyp,conc = c.children()
        if is_false(hyp):
            return left_inv(ctx, invctx, goal)
        elif is_true(hyp):
            return left_inv(ctx, invctx + [conc], goal)
        elif is_and(hyp):
            # curry all
            for x in hyp.children():
                conc = Implies(x, conc)
            return left_inv(ctx, invctx + [conc], goal)
        elif is_or(hyp):
            return all(left_inv(ctx, invctx + [c], goal) for c in hyp.children())
        elif is_implies(hyp) or is_const(hyp):
            return left_inv(ctx + [c], invctx, goal)
        else:
            raise Exception(f"Unexpected implication in left_inv {c}")
    elif is_const(c):
        return left_inv(ctx + [c], invctx, goal)
    elif is_not(c): # bottom -> c.arg(0)
        return left_inv(ctx, invctx + [Implies(BoolVal(False), c.arg(0))], goal)
    else:
        raise Exception(f"Unexpected formula in left_inv {c}")
                
for e in examples:
    print(e)
    right_inv([],[],e)

Implies(a, a)
right [] ;[] |- Implies(a, a)
right [] ;[a] |- a
left [] ; [a] |- a
left [a] ; [] |- a
Search [a] |- a
Implies(a, Implies(b, a))
right [] ;[] |- Implies(a, Implies(b, a))
right [] ;[a] |- Implies(b, a)
right [] ;[a, b] |- a
left [] ; [a, b] |- a
left [a] ; [a] |- a
left [a, a] ; [] |- a
Search [a, a] |- a
Implies(And(a, b), And(b, a))
right [] ;[] |- Implies(And(a, b), And(b, a))
right [] ;[And(a, b)] |- And(b, a)
right [] ;[And(a, b)] |- b
left [] ; [And(a, b)] |- b
left [] ; [a, b] |- b
left [a] ; [a] |- b
left [a, a] ; [] |- b
Search [a, a] |- b
Implies(And(a, b), a)
right [] ;[] |- Implies(And(a, b), a)
right [] ;[And(a, b)] |- a
left [] ; [And(a, b)] |- a
left [] ; [a, b] |- a
left [a] ; [a] |- a
left [a, a] ; [] |- a
Search [a, a] |- a
Implies(Implies(Implies(a, b), a), a)
right [] ;[] |- Implies(Implies(Implies(a, b), a), a)
right [] ;[Implies(Implies(a, b), a)] |- a
left [] ; [Implies(Implies(a, b), a)] |- a
left [Implies(Implies(a, b), a)] ; [] |- a
Search [Implies(Implies(a, b), a)] |- a
Implies(Not(Not(a)), a)
right [] ;[] |- Implies(Not(Not(a)), a)
right [] ;[Not(Not(a))] |- a
left [] ; [Not(Not(a))] |- a
left [] ; [Implies(False, Not(a))] |- a
left [] ; [] |- a
Search [] |- a
Or(a, Not(a))
right [] ;[] |- Or(a, Not(a))
left [] ; [] |- Or(a, Not(a))
Search [] |- Or(a, Not(a))
Or(a, b)
right [] ;[] |- Or(a, b)
left [] ; [] |- Or(a, b)
Search [] |- Or(a, b)

def search(ctx, goal):
    print(f"Search {ctx} |- {goal}")
    if any(c.eq(goal) for c in ctx): # a slightly extened prop rule. is_const()
        return True
    if is_or(goal):
        if any(right_inv(ctx, [], c) for c in goal.children()):
            return True
    for i, c in enumerate(ctx):
        if is_implies(c):
            hyp,conc = c.children()
            if any(c.eq(hyp) for c in ctx): # hmm. maybe this is ok. Left implprop
                ctx1 = ctx[:i] + ctx[i+1:]
                if right_inv(ctx1, [], c.arg(1)):
                    return True
            if is_implies(hyp): # left impl impl
                ctx1 = ctx[:i] + [hyp.arg(0)] + ctx[i+1:]
                if search(ctx1, goal):
                    return True

#assert search([a], a)
"""
assert search([a,b], a)
assert not search([a], b)

def prove(goal):
    return right_inv([],[],goal)

for e in examples:
    print(e, prove(e))
"""
#prove(examples[0])
left_inv([], [a], a)

left [] ; [a] |- a
left [a] ; [] |- a
Search [a] |- a

True

def left_inv(ctx, invctx, goal):
        ctx = ctx.copy()
        invctx = invctx.copy()
        while invctx:
            c = invctx.pop()
            if is_false(c):
                return True
            elif is_and(c):
                invctx.extend(c.children())
            elif is_or(c):
                return all(left_inv(ctx, invctx + [c], goal) for c in c.children())
            elif is_implies(c):
                hyp,conc = c.children()
                if is_false(hyp):
                    continue
                elif is_true(hyp):
                    invctx.append(conc)
                elif is_and(hyp):
                    for x in hyp.children():
                        conc = Implies(x, conc)
                    invctx.append(conc)
                elif is_or(hyp):
                    return all(left_inv(ctx, invctx + [c], goal) for c in hyp.children()) # is this right?
                elif is_implies(hyp) or is_const(hyp):
                    ctx.append(c)
                    continue
                else:
                    raise Exception(f"Unexpected implication in left_inv {c}")
            elif is_const(c):
                ctx.append(c)
                continue
            elif is_not(c): # bottom -> c.arg(0)
                invctx.append(c.arg(0))
            else:
                raise Exception(f"Unexpected formula in left_inv {c}")
        return search(ctx, goal)

def prove(goal):
    while True: # right inversion
        if is_true(goal):
            return True
        elif is_and(goal):
            for c in goal.children():
                if not prove(c):
                    return False
            return True
        elif is_implies(goal):
            goal = goal.get_arg(1)
        else:
            break

ILTP. I am mangling the tptp grammar in a way I’m not sure is right. I put a priorty on or and and.

Again, I think tptp defined the priority of forall and exists very uniintuitively, making them very tight binding

I could allow things other than unit_formula into <=> etc.

import lark 

# https://github.com/inpefess/tptp-lark-parser/blob/master/tptp_lark_parser/resources/TPTP.lark

grammar = """
start                : fof*
fof       : "fof" "(" NAME "," FORMULA_ROLE "," fof_formula ")"  "." 
FORMULA_ROLE         :  "axiom" | "conjecture"

?fof_formula         :  fof_unit_formula "<=>" fof_unit_formula -> iff
                     | fof_unit_formula "=>" fof_unit_formula -> implies
                     | fof_unit_formula "<=" fof_unit_formula -> reverse_implies
                     | fof_or_formula
?fof_or_formula       : fof_and_formula  ("|" fof_and_formula)*
?fof_and_formula      : fof_unit_formula ("&" fof_unit_formula)*
?fof_unit_formula     : "~" fof_unit_formula -> not_formula 
                      | term "!=" term      -> diseq
                      | term "=" term  ->      eq
                      | "(" fof_formula ")" 
                      | term  -> predicate
                      | "!" "[" fof_variable_list "]" ":" fof_unit_formula -> forall
                      | "?" "[" fof_variable_list "]" ":" fof_unit_formula -> exists

arguments            : term ("," term)*

term                :  NAME -> const
            |        | variable -> var
                     |  NAME "(" arguments ")" -> fun_app


FOF_QUANTIFIER       : "!" | "?"
fof_variable_list    : variable ("," variable)*

variable : UPPER_WORD

NONASSOC_CONNECTIVE  : "<=>" | "=>" | "<="  // | "<~>" | "~|" | "~&"

NAME                 : LOWER_WORD
UPPER_WORD           : UPPER_ALPHA ALPHA_NUMERIC*
LOWER_WORD           : LOWER_ALPHA ALPHA_NUMERIC*
NUMERIC              : "0".."9"
LOWER_ALPHA          : "a".."z"
UPPER_ALPHA          : "A".."Z"
ALPHA_NUMERIC        : (LOWER_ALPHA | UPPER_ALPHA | NUMERIC | "_") 

%import common.WS
%ignore WS

"""

parser = lark.Lark(grammar, start="start", parser="lalr")
t = parser.parse("fof(a, axiom, ![X,Y] : (foo(X) & bar(Y))).")
examples = [
    "fof(a, axiom, ![X,Y] : (foo(X) & bar(Y))).",
    "fof(b, axiom, ![X,Y] : foo(X) <=> biz(Y)).",
]
for e in examples:
    t = parser.parse(e)
print(parser.parse("fof(b, axiom, ![X,Y] : (foo(X) <=> biz(Y))).").pretty())

start
  fof
    b
    axiom
    forall
      fof_variable_list
        variable X
        variable Y
      iff
        predicate
          fun_app
            foo
            arguments
              var
                variable X
        predicate
          fun_app
            biz
            arguments
              var
                variable Y

%%file /tmp/g4ip.pl
% https://www.cs.cmu.edu/~crary/317-f12/recitations/g4ip.pl

% G4IP in Prolog

:- op(840, xfy, =>).   % implies, right assoc
:- op(830, xfy, \/).   % or, right assoc
:- op(820, xfy, /\).   % and, right assoc
:- op(800,  fy, ?).    % atom, prefix

% Top-level predicates.

prove(A) :-
  seqR([], [], A).

refute(A) :-
  \+ seqR([], [], A). % negation as failure

% Auxiliary predicates.

append_([], Ys, Ys).
append_([X|Xs], Ys, [X|Zs]) :-
  append_(Xs, Ys, Zs).

memberchk_(X, [X|_]) :- !.
memberchk_(X, [_|Ys]) :- memberchk_(X, Ys).

% Implementation.

% break down asynchronous propositions first -- right then left
%   G = context of synchronous props
%   O = context of props not yet processed
% choose synchronous propositions -- right then left
%   G = context of synchronous props not yet processed
%   H = context of unused synchronous propositions

% breaking asynchronous things down on the right
seqR(O, G, A /\ B) :-
  seqR(O, G, A),
  seqR(O, G, B).
seqR(O, G, A => B) :-
  seqR([A | O], G, B).
seqR(_, _, tt).

% synchronous prop encountered on the right -- switching to the left
seqR(O, G, A \/ B) :-
  seqL(O, G, A \/ B).
seqR(O, G, ff) :-
  seqL(O, G, ff).
seqR(O, G, ?A) :-
  seqL(O, G, ?A).

% breaking asynchronous things down on the left
seqL([A /\ B | O], G, C) :-
  seqL([A,B | O], G, C).
seqL([A \/ B | O], G, C) :-
  seqL([A | O], G, C),
  seqL([B | O], G, C).
seqL([tt | O], G, C) :-
  seqL(O, G, C).
seqL([ff | _], _, _).
seqL([(D /\ E) => F | O], G, C) :-
  seqL([D => (E => F) | O], G, C).
seqL([tt => F | O], G, C) :-
  seqL([F | O], G, C).
seqL([ff => _ | O], G, C) :-
  seqL(O, G, C).
seqL([(D \/ E)=>F | O], G, C) :-
  seqL([D=>F, E=>F | O], G, C).

% synchronous left encountered -- move to gamma context
seqL([(?A) => D | O], G, C) :-
  seqL(O, [(?A) => D | G], C).
seqL([(D => E) => F | O], G, C) :-
  seqL(O, [(D => E) => F | G], C).
seqL([?A | O], G, C) :-
  seqL(O, [?A | G], C).

% context has been processed -- choose a synchronous rule
seqL([], G, C) :- chooseR(G, C).
seqL([], G, C) :- chooseL(G, [], C).

% break down synchronous prop on the right
chooseR(G, A \/ B) :- seqR([], G, A).
chooseR(G, A \/ B) :- seqR([], G, B).
chooseR(G, ?P) :- memberchk_(?P, G).
% chooseR(G, ff) :- fail. % to force a goal to fail, don't include a rule.

% break down synchronous prop on the left
chooseL([?P | G], H, C) :-
  chooseL(G, [?P | H], C).
chooseL([(?P) => B | G], H, C) :-
  append_(G, H, I),
  memberchk_(?P, I),
  !,
  seqL([B], I, C).
chooseL([(?P) => B | G], H, C) :-
  chooseL(G, [(?P) => B | H], C).
chooseL([(D => E) => B | G], H, C) :-
  append_(G, H, I),
  seqR([E => B, D], I, E),
  seqL([B], I, C).
chooseL([(D => E) => B | G], H, C) :-
  chooseL(G, [(D => E) => B | H], C).
% chooseL([], H, C) :- fail.

% Tests.

% prove( ?a => ?a ).
% prove( ?a => (?b => ?a) ).
% prove( (?a => ?b) => (?a => (?b => ?c)) => (?a => ?c) ).
% prove( ?a /\ ?b => ?b /\ ?a ).
% prove( ?a \/ ?b => ?b \/ ?a ).
% prove( (?a \/ ?c) /\ (?b => ?c) => (?a => ?b) => ?c ).
% refute( (?a => ?b \/ ?c) => (?a => ?b) \/ (?a => ?c) ).
% prove( ((?a => ?b) \/ (?a => ?c)) => (?a => ?b \/ ?c) ).
% refute( ((?a => ?b) => ?c) => ((?a \/ ?b) /\ (?b => ?c)) ).
% prove( ((?a \/ ?b) /\ (?b => ?c)) => ((?a => ?b) => ?c) ).
% prove( (?a => ?b) => (?b => ?c) => (?c => ?d) => (?a => ?d) ).
% prove( (?a => ?b) => (?a => ?c) => ?a => ?b ).
% prove( (?a => ?b) => (?a => ?c) => ?a => ?c ).
% prove( ?a => (?a => ?b) => (?a => ?c) => ?b ).
% prove( ?a => (?a => ?b) => (?a => ?c) => ?c ).
% prove( (?a => ?b => ?c) => ?a => ?b => ?c ).
% prove( (?a => ?b => ?c) => ?b => ?a => ?c ).
% prove( ?a => ?b => (?a => ?b => ?c) => ?c ).
% prove( ?b => ?a => (?a => ?b => ?c) => ?c ).
% prove( (?a => ?b) => ?a => ?b ).
% prove( ((?a => ?b) => ?c) => ((?a => ?b) => ?c) ).
% prove( (((?a => ?b) => ?c) => ?d) => (((?a => ?b) => ?c) => ?d) ).
% prove( ((((?a => ?b) => ?c) => ?d) => ?e)
%                        => (((?a => ?b) => ?c) => ?d) => ?e ).
% prove( (((((?a => ?b) => ?c) => ?d) => ?e) => ?f)
%                        => ((((?a => ?b) => ?c) => ?d) => ?e) => ?f ).
% prove( (((((?a => ?b) => ?c) => ?d) => ?e) => ?f)
%                       => (((((?a => ?b) => ?c) => ?d) => ?e) => ?f)
%                       \/ (((((?a => ?b) => ?c) => ?d) => ?e) => ?f) ).
% prove( ((?a => ?b) => ?c) => ?d => ?d \/ ?d ).

Writing /tmp/g4ip.pl

! swipl /tmp/g4ip.pl -g "prove((?a => ?a))."

[31mWarning: /tmp/g4ip.pl:83:
Warning:    Singleton variables: [B]
[0m[31mWarning: /tmp/g4ip.pl:84:
Warning:    Singleton variables: [A]
[0mWelcome to SWI-Prolog (threaded, 64 bits, version 9.3.15)
SWI-Prolog comes with ABSOLUTELY NO WARRANTY. This is free software.
Please run ?- license. for legal details.

For online help and background, visit https://www.swi-prolog.org
For built-in help, use ?- help(Topic). or ?- apropos(Word).

?- ^C

|