Modeling Separation Logic with Python Dicts and Z3
Separation logic has made me feel uneasy. I felt a lot better about it once I realized it’s all about partial maps.
Separation logic is a logic (I suppose this is already uneasy point #1) that is useful for reasoning about heaps and memory.
It defines some logical connectives, separating conjunction and wand, that are brothers of the familiar and
and implies
that somehow intrinsically talk about the heap. Isn’t it weird for the notion of heap to be built into the connectives?
Well, if you already accept temporal logic, which defines logical connectives that talk about time, maybe it isn’t so weird. Also, in general the separating connectives aren’t talking about a heap persay, but some kind of monoid thingo (See O’Hearn review article https://cacm.acm.org/magazines/2019/2/234356-separation-logic/fulltext ).
Another perhaps confusing point is: what is the essence of seperation logic? Do you need to already have deeply internalized Hoare logic for reasoning about programs? I think not. The assertion language of separation logic is already interesting. It is a language to describe classes of heaps. Talking about how heaps change throughout a program is a separate complication that I won’t touch today. Somehow I could never get Chris to agree to this point, which puzzles me.
Another thing that confused me is: what is a heap? My first answer is that it is a region of memory, say addresses 0x10000
to 0x20000
at which we are storing data. This is a somewhat machine-centric perspective. Another answer that may be said by someone who thinks about Java and such things is that the heap is an abstract graph of Objects in scope.
But I think an answer that helps when it comes to separation logic is that a heap is a partial map aka a dictionary. We want to think about a mappings when there may be valid and invalid keys / the domain of the map is changing. This is related to the previous two perspectives. While we may preallocate a big hunk of memory for your heap, it is not legal for you to access pieces you have not malloc or to access regions you’ve freed. In this sense, the machine heap’s address space is changing.
So the assertions of separation logic hold in some heap states and not in others.
To make this concrete, there are a couple of interesting programs we can write. The below is in some sense an implementation of the $\models$ relation here http://www.cs.cmu.edu/afs/cs.cmu.edu/project/fox-19/member/jcr/www15818As2011/ch2.pdf
Modelling Separation using Python Dicts
Python dictionaries are very convenient partial maps. We can evaluate separation logical assertions on particular heaps to see if they are true or false (whether the dictionary is a model of the statement).
Heap predicates are modelled as python functions Dict -> Bool
. For example the empty heap predicate is
def emp(heap):
return len(heap) == 0
emp( {} ) # True
emp( {1 : 2} ) # False
emp( {1 : 2, 3 : 4} ) # False
The assertion that the heap is a singleton is called points to. You see the funny kind of curried shape parametrized predicates need to take here.
def pto(k,v):
def res(heap):
return len(heap) == 1 and k in heap and heap[k] == v
return res
pto(1,3)({}) # False
pto(1,3)({1 : 3}) # True
pto(1,3)( {1 : 3, 4 : 5} ) # False. Perhaps a little surprising. 1 Points to 3 means ONLY 1 -> 3 and nothing extra
The assertion language is modelled as combinators on the predicates. The regular logical connectives just pass the heap along
def ands(p1,p2):
return lambda heap: p1(heap) and p2(heap)
def ors(p1,p2):
return lambda heap: p1(heap) or p2(heap)
def impls(p1,p2):
return lambda heap: not p1(heap) or p2(heap)
ands(emp,emp)({}) # True
ands(pto(1,3) pto(4,5))( {} ) # False for any dictionary
ors(emp, pto(1,3))( {} ) # True
ors(emp, pto(1,3))( {1:3} ) # True
We can actually factor out lifting functions for boolean connectives in general, although we lose short circuiting behavior.
def lift0(b): # lift a boolean
return lambda heap: b
def lift1(boolop): # lift a unary boolean operation
return lambda p: lambda heap: boolop(p(x))
def lift2(boolop2): # lift a binary boolean operation
return lambda p1,p2: lambda heap: boolop2(p1(heap), p2(heap))
nots = lift1(lambda b : not b)
trues = lift0(True)
falses = lift0(False)
Ok, the more interesting one is separating conjunctions *
. Separating conjunction is talk about the disjoint merge of two dictionaries. Disjoint merge is defined if the dictionaries do not share any keys. Then it is clearly how to combine the two dictionaries. The new keys are the disjoint union of old keys.
def disjoint_merge(a, b):
assert not any(k in b for k in a)
return {**a, **b}
disjoint_merge({1:4}, {1:2})
Separating conjunction p1 * p2
says that there exists a way to break up the dictionary such that p1
holds of a
and p2
holds of b
. It is not a priori obvious how to break the dictionary up, so we need to search for a breakup. all_splits
is a generator for all ways to break up a list into two lists.
def all_splits(l):
n = len(l)
for i in range(2**n):
left = []
right = []
for b in range(n):
if (i >> b) & 1 == 0:
left.append(l[b])
else:
right.append(l[b])
yield left, right
Using this function we can define sep
which is a higher order predicate combinator. Woof.
def sep(p1,p2):
def res(heap):
for (l,r) in all_splits(list(heap.keys())):
if p1( { k : heap[k] for k in l } ) and p2( { k : heap[k] for k in r } ):
return True
return False
return res
sep(pto(1,2), pto(2,3))({ 1:2, 2:3 }) # True
sep(pto(1,2), pto(2,3))({ 1:2, 2:3 , 4:5 }) # False
sep( pto(1,2), pto(1,2) )({ 1 : 2}) # False
Using Z3
We can also model heaps using Z3 Arrays and interpret Separation logic assertions into the first order logical language of Z3. CVC4, a different SMT solver, actually has built in support for separation logic.
To me there are a couple choices
- Model partial map as Array(Key, Option(Value)) using Z3 support for algebraic data types
- Split Partial Map into Array(Key, Value) and
valid : Array(Key,Bool)
- Others?
I have a suspicion the latter is preferable. Note that they aren’t quite equivalent, since we need there to exist a default value in the codomain for the second.
I model separation logic assertions as python functions from Array(Key, Value), Array(Key,Bool) -> Z3Bool
The disjointunion
relation holds between three valid
domain arrays Array(Key, Bool). It states that the disjoint union of a
and b
is c
. It is useful for expressing separating conjunction
from z3 import *
import functools
def disjointunion(a,b,c):
k = FreshConst(a.domain())
return ForAll([k], And( Or(a[k], b[k]) == c[k],
Not(And(a[k],b[k]))
))
# set diff might be an alternative useful formulation
def setdiff(c,a):
return lambda k: And(c[k], Not(a[k])), ForAll([k], Implies(a[k], c[k]))
def pto(k,v):
def res(heap,valid):
k1 = FreshConst(heap.domain())
return And(
ForAll([k1], valid[k1] == (k1 == k)),
heap[k] == v
)
return res
def emp(heap,valid):
k1 = FreshConst(heap.domain())
return ForAll([k1], valid[k1] == False)
def sep(f,g):
def res(heap,valid):
a = FreshConst(valid.sort())
b = FreshConst(valid.sort())
# Should we existentially quantify a and b? We're assuming they'll reach a top level existential
return Exists([a,b], And(disjointunion(a,b,valid), f(heap,a), g(heap,b)))
return res
def lift(x):
def res(heap,valid):
return x
return res
def cell(k, vals):
return functools.reduce(sep, [ pto(k+n, v) for n,v in enumerate(vals)])
# http://www.cs.cmu.edu/afs/cs.cmu.edu/project/fox-19/member/jcr/www15818As2011/cs818A3-11.html
# If the heap is extended with a disjoint part in which p1 holds, then p2
# holds for the extended heap.
def wand(p1,p2):
def res(heap,valid):
v2 = FreshConst(valid.sort())
v3 = FreshConst(valid.sort())
# should we skolemize the sitantial into setdiff?
# Or should we define h2 in terms of h3 and h1
# Exists([h3], Implies(p1(setdiff(h3,h)), p2(h3)))
# ForAll([v2], Implies(p1(v2), p2(disjointunionfun()) ))
return ForAll([v2], Exists([v3], Implies( And( p1(heap,v2) , disjointunion(valid,v2,v3)) , p2(heap, v3) )))
return res
I made a cute little helper function for visualizing the heap models. Using a BitVec(8) domain let’s Z3 actually find models. Using Int domains you will have less / no luck,, since I suspect the quantifiers in the above formulation will kill you for SAT queries. You might be ok for UNSAT queries though.
import networkx as nx
def vis_check(pred, N = 3):
s = Solver()
K = BitVecSort(N)
valid = Array("valid", K, BoolSort())
heap = Array("heap", K, K)
s.add(pred(heap,valid))
if s.check() == sat:
m = s.model()
vals = range(2**N)
try:
edges = [ ( i , m.eval(heap[i]).as_long() ) for i in vals] #if bool(m.eval(valid[i])) ]
colors = ["green" if bool(m.eval(valid[i])) else "white" for i in vals ]
#print(colors)
G = nx.DiGraph()
G.add_nodes_from(vals)
G.add_edges_from(edges)
nx.draw(G, with_labels=True, node_color = colors, font_weight='bold')
except:
pass
return m
else:
print("unsat")
Some examples
vis_check(sep(pto(0,1), pto(0,1))) # UNSAT
vis_check(sep(pto(0,1), pto(2,0)))
m = vis_check( cell(x, [3, 7] ) )
K = BitVecSort(3)
x, y = Consts("x y" ,K)
nil = 0
m = vis_check( sep(pto(nil,nil), sep(cell(x, [3, y] ) , cell(y, [2, nil] ))))
Bits and Bobbles
-
Wand is tougher to do with python dicts. We need to invent all possible extenstions. Hmm.. Wand appears to be very useful for weakest precondition also.
-
Volume 6 of Software Foundations. ICFP paper https://softwarefoundations.cis.upenn.edu/slf-current/index.html
-
The Z3 model is kind of like pretending I’m on a tagged architecture. Isn’t that kind of interesting?
-
Reynolds summer school notes are very nice. http://www.cs.cmu.edu/afs/cs.cmu.edu/project/fox-19/member/jcr/www15818As2011/cs818A3-11.html
-
Use garbage collector for dynamic separation logic assertions. The garbage collector has a graph of the current heap. https://docs.python.org/3/library/gc.html
-
Examine what different systems do to examine memory. Frama-C Viper, Dafny, Verifast, Infer
-
Weakest precondition.
-
Hoare logic command is analog of - turnstile. - Is there something physicsy here? Seperation is spacy and temporal is timey. Could you make some kind of relativity thing where predciates about spacelike separated events using seperating conjuction? Relativity style thinking related to comp sci is not new. Lamport diagrams are inspired by slash the same sort of thing as spacetime diagrams and he says this I think.
A curious different approach to the python dictionary thing. I’m not sure it actually works. The idea being that we shouldn’t have to search so much with all_splits
. pto
Knows exactly what pieces to take out of the heap. So model predicates as Dict -> [Dict]
of all possible remainder heaps. If the empty heap is left at the end you have a model?
# a different represdentation. def pred(heap) -> option heap. Doing the set difference
# I guess it is still a search, but we can orient the search a different way. Kind of odd these two perspective are rellated.
def pto(k,v):
def res(heap):
if k in heap and heap[k] == v:
del k in heap # should be doing it pure though
return heap
return None
return res
def emp(heap): # hmnm. This is kind of intruguing. it's the identity function. Am I doing a yoneda? sep and emp form a monoid.
return heap
def sep(p1,p2): # sep is kliesli compose for option
def res(heap):
heap1 = p1(heap)
if heap != None:
return p2(heap)
return None
return res
def ands(p1,p2):
def res(heap):
heap1 = p1(heap)
if heap1 != None:
heap2 = p2(heap)
if heap2 != None:
return heap1 == heap2 # is this right? Kind of in parallel.
return None
return res
def ors(p1, p2): # hmm. Looks like we want it to return [heaps]
def res(heap):
heaps1 = p1(heap)
heaps2 = p2(heap)
return heaps1 + heaps2
# yield from heaps1
return res
- Could perhaps prove general laws about seperating connectives by creating abstract Z3 uninterpeted functions as predicates. A sketch.
# PROVE LAWS
p1 = Function("p1", Array(), Array(), BoolSort())
p2 = Function("p1", Array(), Array(), BoolSort())
heap =
valid =
prove(sep(p1,p2)(heap, valid) == sep(p2,p1)(heap,valid))
def prove_sep( pa, pb ):
heap
valid
prove(Forall([heap,valid], pa(heap,valid) == pb(heap,valid)))