Sauerkraut
Serializing Python's Control State
As pickle
serializes Python’s data state, sauerkraut
serializes Python’s control state.
Concretely, this library equips Python with the ability to stop a running Python function, serialize it to the network or disk, and resume later on. This is useful in different contexts, such as dynamic load balancing and checkpoint/restart. Sauerkraut is designed to be lightning fast for latency-sensitive HPC applications. Internally, FlatBuffers
are created to store the serialized state.
This library is still in development, but the source code is available on GitHub.
Example
The following example captures the state of the function fun1
in the call skt.copy_frame
. It writes this state to a binary file on disk, then reads it back from disk and continues execution of the function from the next line of code.
import sauerkraut as skt
calls = 0
def fun1(c):
global calls
calls += 1
g = 4
for i in range(3):
if i == 0:
print("Copying frame")
# copy_frame makes a copy of the current control + data states.
# When we later run the frame (run_frame), execution will continue
# directly after the call.
# Therefore, line will return twice:
# The first time when the frame is copied,
# the second time when the frame is resumed.
frm_copy = skt.copy_frame()
# Because copy_frame will return twice, we need to differentiate
# between the different returns to avoid getting stuck in a loop!
if calls == 1:
# The copied frame does not see this write to g
g = 5
calls += 1
# This variable is not serialized,
# as it's not live
hidden_inner = 55
return frm_copy
else:
# When running the copied frame, we take this branch.
# This line will run 3 times
print(f'calls={calls}, c={c}, g={g}')
calls = 0
return 3
frm = fun1(13)
serframe = skt.serialize_frame(frm)
with open('serialized_frame.bin', 'wb') as f:
f.write(serframe)
with open('serialized_frame.bin', 'rb') as f:
read_frame = f.read()
code = skt.deserialize_frame(read_frame)
skt.run_frame(code)