November 1, 2025
Quickstrom is a property-based testing tool for web applications, using QuickLTL for specifying the intended behavior. QuickLTL is a linear temporal logic (LTL) over finite traces, especially suited for testing. As with many other logic systems, when a formula evaluates to false — like when a counterexample to a safety property is found or a liveness property cannot be shown to hold — the computer says no. That is, you get “false” or “test failed”, perhaps along with a trace. Understanding complex bugs in stateful systems then comes down to staring at the specification alongside the trace, hoping you can somehow pin down what went wrong. It’s not great.
Instead, we should have helpful error messages explaining
why a property does not hold; which parts of the specification
failed and which concrete values from the trace were involved. Not
false, unsat, or even
assertion error: x != y. We should get the full story. I
started exploring this space a few years ago when I worked actively on
Quickstrom, but for some reason it went on the shelf half-finished. Time
to tie up the loose ends!
The starting point was Picostrom, a minimal Haskell version of the checker in Quickstrom, and Error Reporting Logic (ERL), a paper introducing a way of rendering natural-language messages to explain propositional logic counterexamples. I ported it to Rust mostly to see what it turned into, and extended it with error reporting supporting temporal operators. The code is available at codeberg.org/owi/picostrom-rs under the MIT license.
Between the start of my work and picking it back up now, A Language for Explaining Counterexamples was published, which looks closely related, although it’s focused on model checking with CTL. If you’re interested in other related work, check out A Systematic Literature Review on Counterexample Explanation in Model Checking.
All right, let’s dive in!
A quick recap on QuickLTL is in order before we go into the Picostrom code. QuickLTL operates on finite traces, making it suitable for testing. It’s a four-valued logic, meaning that a formula evaluates to one of these values:
It extends propositional logic with temporal operators, much like LTL:
must hold in the next state, demanding a next state is available. This forces the evaluator to draw a next state.
must hold in the next state, defaulting to if no next state is available.
must hold in the next state, defaulting to if no next state is available.
must hold in the current or a future state. It demands at least states, evaluating on all available states, finally defaulting to .
must hold in the current and all future states. It demands at least states, evaluating on all available states, finally defaulting to .
You can think of as unfolding into a sequence of nested , wrapping an infinite sequence of , connected by :
Or even better, let’s define it inductively with a coinductive base case:
And similarly, can be defined as:
This is essentially how the evaluator expands these temporal operators, but for error reporting reasons, not exactly.
Finally, there are atoms, which are domain-specific
expressions embedded in the AST, evaluating to
or
.
The AST is parameterized on the atom type, so you can plug in an atom
language of choice. An atom type must implement the Atom
trait, which in simplified form looks like this:
trait Atom {
type State;
fn eval(&self, state: &Self::State) -> bool;
fn render(
&self,
mode: TextMode,
negated: bool,
) -> String;
fn render_actual(
&self,
negated: bool,
state: &Self::State,
) -> String;
}For testing the checker, and for this blog post, I’m using the following atom type:
enum TestAtom {
Literal(u64),
Select(Identifier),
Equals(Box<TestAtom>, Box<TestAtom>),
LessThan(Box<TestAtom>, Box<TestAtom>),
GreaterThan(Box<TestAtom>, Box<TestAtom>),
}
enum Identifier {
A,
B,
C,
}The first step, like in ERL, is transforming the formula into negation normal form (NNF), which means pushing down all negations into the atoms:
enum Formula<Atom> {
Atomic {
negated: bool,
atom: Atom,
},
// There's no `Not` variant here!
...
}This makes it much easier to construct readable sentences, in addition to another important upside which I’ll get to in a second. The NNF representation is the one used by the evaluator internally.
Next, the eval function takes an
Atom::State and a Formula, and produces a
Value:
enum Value<'a, A: Atom> {
True,
False { problem: Problem<'a, A> },
Residual(Residual<'a, A>),
}A value is either immediately true or false, meaning that we don’t
need to evaluate on additional states, or a residual, which
describes how to continue evaluating a formula when given a next state.
Also note how the False variant holds a
Problem, which is what we’d report as
.
The True variant doesn’t need to hold any such information,
because due to NNF, it can’t be negated and “turned into a problem.”
I won’t cover every variant of the Residual type, but
let’s take one example:
pub enum Residual<'a, A: Atom> {
// ...
AndAlways {
start: Numbered<&'a A::State>,
left: Box<Residual<'a, A>>,
right: Box<Residual<'a, A>>,
},
// ...
}When such a value is returned, the evaluator checks if it’s possible
to stop at this point, i.e. if there are no demanding operators
in the residual. If not possible, it draws a new state and calls
step on the residual. The step function is
analogous to eval, also returning a Value, but
it operates on a Residual rather than a
Formula.
The AndAlways variant describes an ongoing evaluation of
the
operator, where the left and right residuals
are the operands of
in the inductive definition I described earlier. The start
field holds the starting state, which is used when rendering error
messages. Similarly, the Residual enum has variants for
,
,
,
,
,
and a few others.
When the stop function deems it possible to stop
evaluating, we get back a value of this type:
enum Stop<'a, A: Atom> {
True,
False(Problem<'a, A>),
}Those variants correspond to
and
. In the false case, we get a
Problem which we can render. Recall how the
Value type returned by eval and
step also had True and False
variants? Those are the definite cases.
The Problem type is a tree structure, mirroring the
structure of the evaluated formula, but only containing the parts of it
that contributed to its falsity.
enum Problem<'a, A: Atom> {
And {
left: Box<Problem<'a, A>>,
right: Box<Problem<'a, A>>,
},
Or {
left: Box<Problem<'a, A>>,
right: Box<Problem<'a, A>>,
},
Always {
state: Numbered<&'a A::State>,
problem: Box<Problem<'a, A>>,
},
Eventually {
state: Numbered<&'a A::State>,
formula: Box<Formula<A>>,
},
// A bunch of others...
}I’ve written a simple renderer that walks the Problem
tree, constructing English error messages. When hitting the atoms, it
uses the render and render_actual methods from
the Atom trait I showed you before.
The mode is very much like in the ERL paper,
i.e. whether it should be rendered in deontic (e.g. “x should equal 4”)
or indicative (e.g. “x equals 4”) form:
enum TextMode {
Deontic,
Indicative,
}The render method should render the atom according to
the mode, and render_actual should render relevant parts of
the atom in a given state, like its variable assignments.
With all these pieces in place, we can finally render some error messages! Let’s say we have this formula:
If we run a test and never see such a state, the rendered error would be:
Probably false: eventually B must equal 3 and C must equal 4, but it was not observed starting at state 0
Neat! This is the kind of error reporting I want for my stateful tests.
You can trace why some subformula is relevant by using implication. A common pattern in state machine specs and other safety properties is:
So, let’s say we have this formula:
If or are false, the error includes the antecedent:
Definitely false: B must be greater than 5 and in the next state, C must be less than 10 since A is greater than 0, […]
Let’s consider a conjunction of two invariants. We could of course combine the two atomic propositions with conjunction inside a single , but in this case we have the formula:
An error message, where both invariants fail, might look the following:
Definitely false: it must always be the case that A is less than 3 and it must always be the case that B is greater than C, but A=3 in state 3 and B=0 in state 3
If only the second invariant () fails, we get a smaller error:
Definitely false: it must always be the case that B is greater than C, but B=0 and C=0 in state 0
And, crucially, if one of the invariants fail before the other we also get a smaller error, ignoring the other invariant. While single-state conjunctions evaluate both sides, possibly creating composite errors, conjunctions over time short-circuit in order to stop tests as soon as possible.
Let’s say we have a failing safety property like the following:
The textual error might be:
Definitely false: in the next state, it must always be the case that B is greater than C, but B=13 and C=15 in state 6
But with some tweaks we could also draw a diagram, using the
Problem tree and the collected states:
Or for a liveness property like , where there is no counterexample at a particular state, we could draw a diagram showing how we give up after some time:
These are only sketches, but I think they show how the
Problem data structure can be used in many interesting
ways. What other visualizations would be possible? An interactive state
space explorer could show how problems evolve as you navigate across
time. You could generate spreadsheets or HTML documents, or maybe even
annotate the relevant source code of some system-under-test? I think it
depends a lot on the domain this is applied to.
It’s been great to finally finish this work! I’ve had a lot of fun working through the various head-scratchers in the evaluator, getting strange combinations of temporal operators to render readable error messages. I also enjoyed drawing the diagrams, and almost nerd-sniped myself into automating that. Maybe another day. I hope this is interesting or even useful to someone out there. LTL is really cool and should be used more!
The code, including many rendering tests cases, is available at codeberg.org/owi/picostrom-rs.
A special thanks goes to Divyanshu Ranjan for reviewing a draft of this post.