Rewriting Peven's Engine in Julia
Why Julia
After releasing the first version of Peven, I struggled to extend the engine design in Python to arbitrary tasks or executors. Ultimately, even setting up judge and actor executors left me with a pile of shims and custom classes that bloated the Petri net formalism. I anticipated that every transition might require a custom adapter, base class, or method override.
I find that Julia fixes most of this at the language level. With multiple dispatch and symbolic types, I can define the net as an abstract structure of places, arcs, transitions, and tokens. An added bonus is a clear separation of concerns across repos. A separate engine helps me keep the runtime lean and stable. I came to the conclusion that the Python layer doesn't need to understand what a Petri net is, and that the engine doesn't need to understand what an LLM call is. I'm sure this will cause me some grief down the road, but I find this separation to be useful in terms of organizing my own thoughts.
Also, as you might've discovered in my writing, I just love Julia. I studied Applied Math and CS at Brown, except the CS was just more math (AB-pruning > DBs, am I right?). I think Julia comes more naturally to me than Python. It's the ultimate math language, and Peven's engine is pure math. Peek at the engine code: it's so streamlined. I'm positive I couldn't have created something so beautiful even if I were a Python god.
Representation
The net itself is a symbolic object. Enablement is incremental. After a transition fires, only the transitions in its influence set get rechecked: those that read from a place the firing changed, which includes the transitions that share input places with it. This idea comes from LoLA 2. The cost of a firing is bounded by local connectivity, not net size, so there is no need to recheck every transition across a large net. Markings are copy-on-write, so concurrent firings can share a base state without stepping on each other.
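Here is a minimal sketch of the incremental recheck, written in Python so it is easy to run. It is not the engine's code. The `Net` class, its method names, and the place names are all made up for illustration, every arc is assumed to have weight 1, and a plain dict copy stands in for copy-on-write. The sketch also assumes the influence set is every transition that reads from a place the firing consumed from or produced into.

```python
from collections import defaultdict


class Net:
    """Toy Petri net; every arc has weight 1 (an assumption of this sketch)."""

    def __init__(self, transitions):
        # transitions: {name: (input_places, output_places)}
        self.transitions = transitions
        # Reverse index: place -> transitions that read from it.
        self.consumers = defaultdict(set)
        for name, (inputs, _outputs) in transitions.items():
            for place in inputs:
                self.consumers[place].add(name)

    def is_enabled(self, marking, name):
        inputs, _outputs = self.transitions[name]
        return all(marking.get(place, 0) >= 1 for place in inputs)

    def influence_set(self, name):
        # Transitions reading from any place this firing touches.
        inputs, outputs = self.transitions[name]
        touched = set(inputs) | set(outputs)
        return set().union(*(self.consumers[place] for place in touched))

    def fire(self, marking, enabled, name):
        """Return (new_marking, new_enabled, rechecked); inputs are not mutated."""
        assert name in enabled, "only enabled transitions may fire"
        inputs, outputs = self.transitions[name]
        new_marking = dict(marking)  # plain copy, standing in for copy-on-write
        for place in inputs:
            new_marking[place] -= 1
        for place in outputs:
            new_marking[place] = new_marking.get(place, 0) + 1
        # Only the influence set is rechecked; everything else keeps its status.
        rechecked = self.influence_set(name)
        new_enabled = set(enabled) - rechecked
        new_enabled |= {t for t in rechecked if self.is_enabled(new_marking, t)}
        return new_marking, new_enabled, rechecked


net = Net({
    "gen": (["todo"], ["draft"]),
    "judge": (["draft"], ["scored"]),
    "archive": (["scored"], ["done"]),
    "other": (["inbox"], ["outbox"]),
})
marking = {"todo": 1, "inbox": 1}
enabled = {t for t in net.transitions if net.is_enabled(marking, t)}
new_marking, new_enabled, rechecked = net.fire(marking, enabled, "gen")
```

Firing `gen` rechecks only `gen` and `judge`. `archive` and `other` are never looked at, and the original `marking` is left untouched for any other firing that still holds it.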
Keying runs with runKey means that one net can execute many runs simultaneously. Firing is scoped per runKey, so a transition is enabled for a given run when that run's tokens satisfy the input arcs, independent of every other run.
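A small Python sketch of per-run scoping, under the assumption that the marking is simply partitioned by run key. The nested-dict layout, the `is_enabled` helper, and the place names `task` and `rubric` are hypothetical, and `run_key` stands in for the engine's runKey.

```python
from collections import defaultdict

# marking[run_key][place] -> token count; each run owns its own slice.
marking = defaultdict(lambda: defaultdict(int))

JUDGE_INPUTS = ("task", "rubric")  # hypothetical input places of one transition


def is_enabled(marking, run_key, inputs):
    """A transition is enabled for a run only if that run's tokens cover its inputs."""
    run_marking = marking.get(run_key, {})
    return all(run_marking.get(place, 0) >= 1 for place in inputs)


# run-1 has both inputs; run-2 is still waiting on its rubric.
marking["run-1"]["task"] += 1
marking["run-1"]["rubric"] += 1
marking["run-2"]["task"] += 1

enabled_runs = [rk for rk in sorted(marking) if is_enabled(marking, rk, JUDGE_INPUTS)]
```

The same transition is enabled for `run-1` and not for `run-2`, and run-1's rubric token can never satisfy run-2's input arc.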
Firing semantics
The previous Python engine serialized firings per (transition, runKey) with an in_flight set, and Python is limited by the GIL anyway. Julia supports parallelism natively, so the same (transition, runKey) pair can have as many firings in flight as its tokens allow, for as long as it remains enabled. Concretely, if Transition A is a judge that fires once per token and ten tokens are sitting in Place A, it can fire on all ten in parallel, each firing with its own firing_id. A retry is a new attempt under the same firing_id, and inputs stay reserved across retries, so a failed attempt can't leak tokens back into the marking before its replacement finishes.
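The bookkeeping can be sketched in Python as follows. This only illustrates the firing_id, attempt, and reservation accounting. The threads are there to show the shape of it, not to claim real parallelism. `Firing`, `run_firing`, `flaky_judge`, and the three-attempt limit are all assumptions of the sketch rather than the engine's API.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Optional


@dataclass
class Firing:
    firing_id: int
    token: str                    # input reserved for this firing across attempts
    attempts: int = 0
    result: Optional[str] = None


def run_firing(firing, executor_fn, max_attempts=3):
    """Retry under the same firing_id; the reserved token never returns to the place."""
    while firing.attempts < max_attempts:
        firing.attempts += 1
        try:
            firing.result = executor_fn(firing.token, firing.attempts)
            return firing
        except RuntimeError:
            continue  # next attempt reuses the same firing_id and token
    return firing  # all attempts failed; result stays None


def flaky_judge(token, attempt):
    # Hypothetical executor that fails once for one token.
    if token == "t3" and attempt == 1:
        raise RuntimeError("transient failure")
    return f"scored:{token}"


# Ten tokens sitting in Place A, one firing per token.
place_a = [f"t{i}" for i in range(10)]
firings = []
while place_a:
    # Reserving a token removes it from the place before any attempt runs.
    firings.append(Firing(firing_id=len(firings) + 1, token=place_a.pop(0)))

with ThreadPoolExecutor(max_workers=10) as pool:
    done = list(pool.map(lambda f: run_firing(f, flaky_judge), firings))

retried = [(f.firing_id, f.token, f.attempts) for f in done if f.attempts > 1]
```

All ten firings are launched together. The one that fails keeps its firing_id and its token, records a second attempt, and Place A stays empty the whole time.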
Bundle-first scheduling
The default firing rule is greedy: if a token exists, consume it eagerly. That works for most nets, but it breaks when you need to correlate tokens across multiple input places within a batch.
The motivating case is benchmark construction. When you build a benchmark you're building tasks and rubrics side by side. Each piece of data has a corresponding rubric, or evaluation, or scoring function. If the generator for task A finishes after the generator for task B, a greedy judge transition will happily pair task A's output with task B's rubric. That would be bad.
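To make the failure concrete, here is a tiny Python sketch of a greedy judge. It assumes each input place behaves like a FIFO queue and that rubrics arrive in submission order while task outputs arrive in completion order. Both are assumptions made for the illustration, and `greedy_fire` is a made-up name.

```python
from collections import deque

tasks = deque()
rubrics = deque()

# Rubrics arrive in submission order: A, then B.
rubrics.extend([("A", "rubric for A"), ("B", "rubric for B")])
# Task outputs arrive in completion order: B's generator finished first.
tasks.extend([("B", "output of B"), ("A", "output of A")])


def greedy_fire(tasks, rubrics):
    """Consume whatever token is at the head of each place, with no correlation."""
    pairs = []
    while tasks and rubrics:
        pairs.append((tasks.popleft(), rubrics.popleft()))
    return pairs


pairs = greedy_fire(tasks, rubrics)
# (task key, rubric key) for every pairing whose keys disagree.
mismatched = [(task[0], rubric[0]) for task, rubric in pairs if task[0] != rubric[0]]
```

Both firings come out mismatched: B's output is scored against A's rubric, and A's output against B's.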
BundleRef(transition_id, runKey, selected_key, ordinal) is the fix. A bundle joins tokens across a transition's input places on a selected_key, and the transition fires only when every token for that bundle is present. Bundles are also snapshot-safe: grab returns nothing if the bundle is stale and take throws, so you can't act on a stale marking snapshot.
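A rough Python sketch of the keyed join follows. Only the four BundleRef field names come from the text above, with `run_key` standing in for runKey. The marking layout, `ready_bundles`, the `StaleBundle` exception, and the signatures of `grab` and `take` are guesses made for illustration. The sketch also assumes "stale" means that a token the bundle refers to is no longer in the marking.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BundleRef:
    transition_id: str
    run_key: str
    selected_key: str
    ordinal: int


class StaleBundle(Exception):
    """Raised when a bundle's tokens are no longer in the marking."""


def ready_bundles(marking, transition_id, run_key, inputs):
    """marking: {place: {selected_key: token}}. A bundle is ready only when
    every input place holds a token for the same selected_key."""
    keys = set.intersection(*(set(marking[place]) for place in inputs))
    return [BundleRef(transition_id, run_key, key, i)
            for i, key in enumerate(sorted(keys))]


def grab(marking, inputs, ref):
    """Non-destructive peek: None if the bundle is stale."""
    if any(ref.selected_key not in marking[place] for place in inputs):
        return None
    return {place: marking[place][ref.selected_key] for place in inputs}


def take(marking, inputs, ref):
    """Consume the bundle's tokens; raise if the bundle is stale."""
    if grab(marking, inputs, ref) is None:
        raise StaleBundle(ref)
    return {place: marking[place].pop(ref.selected_key) for place in inputs}


inputs = ("tasks", "rubrics")
marking = {
    "tasks": {"B": "output of B", "A": "output of A"},  # completion order
    "rubrics": {"A": "rubric for A", "B": "rubric for B", "C": "rubric for C"},
}
refs = ready_bundles(marking, "judge", "run-1", inputs)
first = take(marking, inputs, refs[0])
stale_peek = grab(marking, inputs, refs[0])  # tokens already consumed
```

Arrival order no longer matters: A's output can only be bundled with A's rubric, and rubric C waits until its task shows up. Once a bundle has been taken, the old ref is stale, so `grab` returns `None` and a second `take` raises.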
I was pretty locked in on Julia going in and I'm more bullish now. The engine is smaller, faster, and more honest about what it's doing than the Python version ever was.