💳 Flow Tokens - Why token optimization is a flow problem, not a vocabulary problem

on June 21, 2026

The rise of AI agents has brought a topic back to the center of many software architectures: the cost of context.

When an agent executes a workflow, it doesn't just consume a user prompt. It also consumes a large amount of ancillary data:

execution logs
CLI command outputs
tool results
debugging traces
chunks from RAG
raw or semi-structured files

In practice, this context often represents the main cost of the workflow, well before the reasoning phase of the model.

The question then becomes:

How can we reduce token consumption without degrading the quality of the signal transmitted to the model?

A common intuition is to optimize vocabulary: shorten words, abbreviate certain business concepts, artificially condense texts.

This intuition is usually wrong.

Token optimization is not a lexical problem.

It's a flow problem.

The real problem: an uncontrolled context

In many systems, data enters the AI chain in the form of a huge string of characters.

Let's take a classic example.

An agent must analyze the result of a Symfony or Composer command.

The output may contain:

repetitive debug lines
timestamps
ANSI sequences
stack traces
progress lines
genuinely useful information

For a human, the separation between noise and signal is immediate.

For an LLM, everything is initially equivalent.

Each character can become a token.

Each token has a cost.

Each cost consumes a portion of the context window.

The problem is therefore not just the size of the text.

The problem is the lack of structure in the way data flows.

Token optimization is stream discipline

The thesis of this article is simple:

Token optimization is not word shortening. Token optimization is stream discipline.

In other words:

What matters is not how the words are spelled, but how the data moves through the system.

A flow-oriented architecture lets you:

clean the data
segment the data
measure its cost
compress repetitions
apply an explicit budget
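
To make "measure" and "budget" concrete, here is a minimal sketch. It assumes a rough heuristic of about 4 bytes per token, which is not a real tokenizer and not necessarily the estimator used in the demo. The function names `estimateTokens` and `applyBudget` are hypothetical.

```php
<?php
declare(strict_types=1);

/**
 * Rough token estimate: about 4 bytes per token.
 * Assumption: a heuristic for illustration only, not a real tokenizer.
 */
function estimateTokens(string $text): int
{
    return (int) ceil(strlen($text) / 4);
}

/**
 * Keep chunks in order until adding the next one would exceed the budget.
 *
 * @param list<string> $chunks
 * @return list<string>
 */
function applyBudget(array $chunks, int $maxTokens): array
{
    $kept = [];
    $used = 0;
    foreach ($chunks as $chunk) {
        $cost = estimateTokens($chunk);
        if ($used + $cost > $maxTokens) {
            break; // budget reached: everything after this point is dropped
        }
        $kept[] = $chunk;
        $used += $cost;
    }
    return $kept;
}

// Usage: 420 bytes split into 100-byte chunks, with a 60-token budget.
$chunks = str_split(str_repeat('flow stream pipeline ', 20), 100);
$kept = applyBudget($chunks, 60);
echo count($kept), " chunks kept\n";
```

The budget is explicit: the caller decides how many tokens the context may cost, and the pipeline enforces it.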

This change of perspective is important.

We are no longer trying to compress words.

We are trying to control a flow.

Flow as an orchestration model

This is precisely the angle explored with Flow.

Flow is not just a task execution engine.

Flow can be seen as a layer for orchestrating the movement of data.

Data can circulate:

as a pipe
as a stream
as chunks
as a measurable context
as a budgeted context

This model is particularly interesting for agentic workflows, because an agent rarely consumes strongly typed business objects.

It primarily consumes text.

This text then becomes a resource that needs to be managed.

Demonstration: flow-pipe

To explore this idea, I built a Symfony demo available here:

flow-pipe repository

The project exposes a console command:

php bin/console app:flow-token-demo \
  --input=flow-engine-log --show-chunks

The goal is not to call an LLM.

The goal is to locally simulate a context processing pipeline.

The pipeline follows eight stages:

Load a source
Remove the ANSI sequences
Eliminate the noise
Normalize the whitespace
Cut into chunks
Compress
Apply a budget
Produce a usable output
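
The three cleaning stages can be sketched as plain functions. This is an illustration and not the code of the demo: the noise patterns (debug lines, progress bars, leading timestamps) and the sample log are assumptions made up for the example.

```php
<?php
declare(strict_types=1);

/** Remove ANSI CSI escape sequences such as colors and cursor moves. */
function stripAnsi(string $text): string
{
    return preg_replace('#\x1B\[[0-?]*[ -/]*[@-~]#', '', $text) ?? $text;
}

/** Drop lines that carry no signal. The patterns below are hypothetical. */
function removeNoise(string $text): string
{
    $kept = [];
    foreach (preg_split('/\R/', $text) as $line) {
        // Strip a leading timestamp such as "[2026-06-21 10:00:00] ".
        $line = preg_replace('/^\[?\d{4}-\d{2}-\d{2}[ T]\d{2}:\d{2}:\d{2}\]?\s*/', '', $line) ?? $line;
        // Skip debug lines and progress bars like " 3/10 [===>   ]".
        if (preg_match('/^\s*(DEBUG|\[debug\])/i', $line)
            || preg_match('/^\s*\d+\/\d+\s*\[[=>\- ]*\]/', $line)) {
            continue;
        }
        $kept[] = $line;
    }
    return implode("\n", $kept);
}

/** Collapse runs of spaces and tabs, trim each line, drop empty lines. */
function normalizeWhitespace(string $text): string
{
    $lines = [];
    foreach (preg_split('/\R/', $text) as $line) {
        $line = trim(preg_replace('/[ \t]+/', ' ', $line) ?? $line);
        if ($line !== '') {
            $lines[] = $line;
        }
    }
    return implode("\n", $lines);
}

// Usage with a made-up log excerpt.
$raw = "\x1B[32m[2026-06-21 10:00:00] DEBUG   cache warmup\x1B[0m\n"
     . " 3/10 [===>      ]\n"
     . "[2026-06-21 10:00:01] ERROR    Job   failed: timeout\n";

echo normalizeWhitespace(removeNoise(stripAnsi($raw))), "\n";
// prints "ERROR Job failed: timeout"
```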

A declarative pipeline

The pipeline can be expressed in declarative form:

source |> strip_ansi |> remove_noise |> normalize_whitespace
  |> chunk:300 |> compress |> budget:1000 |> sink

This notation allows the pipeline to be read from left to right.

Each step transforms the output of the previous one.

The benefit is twofold:

improved readability
improved extensibility

Adding a new operation does not require modifying a central conditional structure.

Each operation becomes a standalone component.
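
A minimal sketch of that idea, assuming a hypothetical registry that maps each operation name to a callable. The operations (`trim`, `upper`, `reverse`) are toy stand-ins, and `defaultRegistry` and `compilePipeline` are names invented for this example, not part of the demo.

```php
<?php
declare(strict_types=1);

/** Hypothetical registry: operation name => callable taking and returning a string. */
function defaultRegistry(): array
{
    return [
        'trim'    => fn (string $s): string => trim($s),
        'upper'   => fn (string $s): string => strtoupper($s),
        'reverse' => fn (string $s): string => strrev($s),
    ];
}

/** Turn "a |> b |> c" into a single callable that applies the steps left to right. */
function compilePipeline(string $expression, array $registry): callable
{
    $steps = [];
    foreach (explode('|>', $expression) as $token) {
        $name = trim($token);
        if (!isset($registry[$name])) {
            throw new InvalidArgumentException("Unknown operation: $name");
        }
        $steps[] = $registry[$name];
    }

    return fn (string $input): string => array_reduce(
        $steps,
        fn (string $carry, callable $step): string => $step($carry),
        $input
    );
}

$run = compilePipeline('trim |> upper |> reverse', defaultRegistry());
echo $run('  flow  '), "\n"; // prints "WOLF"
```

Adding an operation means adding one entry to the registry. There is no switch or if/else chain to edit.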

Three meanings of the pipe

In this demonstration, the symbol |> appears at three levels.

1. Expression DSL

First level: the pipe represents a declarative language readable by humans.

source |> compress |> sink

2. The PHP 8.5 pipe operator

PHP 8.5 natively introduces the pipe operator.

It allows for a more explicit composition of callables.

Example:

yield $step |> (fn ($step) => new ClosureJob(...));

The code then reads much like the DSL.

3. Runtime composition with Flow

Third level: the actual composition of jobs in Flow.

Each transformation becomes a job applied to a shared context.

This is the layer that actually executes the pipeline.
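
To illustrate what "a job applied to a shared context" can look like, here is a sketch that does not use Flow itself. `PipelineContext`, `job`, and `runJobs` are hypothetical names for this example and should not be read as Flow's API.

```php
<?php
declare(strict_types=1);

/** Hypothetical shared context: the text plus a trace of what each job did. */
final class PipelineContext
{
    /** @var list<string> */
    public array $trace = [];

    public function __construct(public string $text)
    {
    }
}

/** Wrap a string transformation as a job that updates the context and records sizes. */
function job(string $name, callable $transform): callable
{
    return function (PipelineContext $ctx) use ($name, $transform): PipelineContext {
        $before = strlen($ctx->text);
        $ctx->text = $transform($ctx->text);
        $ctx->trace[] = sprintf('%s: %d -> %d bytes', $name, $before, strlen($ctx->text));
        return $ctx;
    };
}

/** Apply the jobs in order to the same context. */
function runJobs(PipelineContext $ctx, callable ...$jobs): PipelineContext
{
    foreach ($jobs as $job) {
        $ctx = $job($ctx);
    }
    return $ctx;
}

$ctx = runJobs(
    new PipelineContext('  a   b  '),
    job('trim', 'trim'),
    job('squeeze', fn (string $s): string => preg_replace('/ +/', ' ', $s) ?? $s),
);

echo $ctx->text, "\n";               // prints "a b"
echo implode("\n", $ctx->trace), "\n";
```

Because every job sees the same context, the cost of each stage can be measured along the way.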

Inspiration: Pratt parsing

The parsing part of the DSL is inspired by Pratt parsing.

Instead of having a monolithic parser, each operation has:

its name
its parsing logic
its execution logic

This approach avoids an overloaded central parser.

The pipeline becomes extensible by design.
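
Here is a sketch of that separation. It is not a full Pratt parser (there are no binding powers or precedence rules); it only shows each operation owning its name, its parsing logic, and its execution logic. The `Operation` interface, `ChunkOperation`, and `parseStep` are hypothetical names for this example.

```php
<?php
declare(strict_types=1);

/** Hypothetical contract: one object per operation. */
interface Operation
{
    public function name(): string;

    /** Parse the raw argument after ":" (or null) into a configured step. */
    public function parse(?string $arg): callable;
}

final class ChunkOperation implements Operation
{
    public function name(): string
    {
        return 'chunk';
    }

    public function parse(?string $arg): callable
    {
        // Parsing logic: the operation validates its own argument.
        if ($arg === null || preg_match('/^[1-9]\d*$/', $arg) !== 1) {
            throw new InvalidArgumentException('chunk expects a positive size, e.g. chunk:300');
        }
        $size = (int) $arg;

        // Execution logic: split the text into pieces of at most $size bytes.
        return fn (string $text): array => $text === '' ? [] : str_split($text, $size);
    }
}

/** The central parser only splits "name:arg" and delegates to the matching operation. */
function parseStep(string $token, Operation ...$operations): callable
{
    [$name, $arg] = array_pad(explode(':', trim($token), 2), 2, null);
    foreach ($operations as $operation) {
        if ($operation->name() === $name) {
            return $operation->parse($arg);
        }
    }
    throw new InvalidArgumentException("Unknown operation: $name");
}

$chunk = parseStep('chunk:3', new ChunkOperation());
print_r($chunk('abcdefgh')); // pieces: "abc", "def", "gh"
```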

Result

Let's take the flow-engine-log fixture.

Before transformation:

17,398 characters
~4,488 tokens estimated

After pipeline:

334 characters
~88 tokens estimated

That's approximately:

a 98% reduction
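
The ratio can be checked from the figures above with a few lines of PHP. The helper name `reductionPercent` is made up for this check.

```php
<?php
declare(strict_types=1);

/** Percentage saved when going from $before to $after, rounded to one decimal. */
function reductionPercent(int $before, int $after): float
{
    if ($before <= 0) {
        throw new InvalidArgumentException('The initial size must be positive');
    }
    return round((1 - $after / $before) * 100, 1);
}

printf("characters: %.1f%%\n", reductionPercent(17398, 334)); // characters: 98.1%
printf("tokens:     %.1f%%\n", reductionPercent(4488, 88));   // tokens:     98.0%
```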

The important point is not just the ratio.

The important point is that the domain vocabulary remains intact.

The concepts:

flow
stream
pipeline
source
sink

have not been shortened.

What has disappeared is:

the noise
the repetitions
lines without value

In other words:

The signal remained.

The noise has disappeared.
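
As an illustration of removing repetitions without touching the vocabulary, here is one possible approach: collapse consecutive identical lines into a single line with a counter. The compress step in the demo may work differently; `compressRepeats` is a hypothetical name.

```php
<?php
declare(strict_types=1);

/** Collapse runs of identical consecutive lines into "line (xN)". */
function compressRepeats(string $text): string
{
    $out = [];
    $previous = null;
    $count = 0;

    // Emit the current run, if there is one.
    $flush = function () use (&$out, &$previous, &$count): void {
        if ($previous !== null) {
            $out[] = $count > 1 ? "$previous (x$count)" : $previous;
        }
    };

    foreach (explode("\n", $text) as $line) {
        if ($line === $previous) {
            $count++;
            continue;
        }
        $flush();
        $previous = $line;
        $count = 1;
    }
    $flush();

    return implode("\n", $out);
}

echo compressRepeats("retry\nretry\nretry\nflow started\nretry"), "\n";
// prints:
// retry (x3)
// flow started
// retry
```

The words themselves are never shortened. Only the redundancy is.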

Signal vs. noise

A second fixture, flow-lexicon, is intentionally hard to compress.

The observed reduction is small.

And that's a good thing.

This means that the content already contains mostly signal.

A good pipeline doesn't try to compress everything.

It only seeks to remove what adds no value.

And then what?

This demonstration is intentionally synchronous and local.

But it opens up an interesting direction.

A natural evolution would be to move towards:

non-blocking streams
stream_select()
fibers
process pipes
PTY / TTY
an event loop

This would allow us to process streams as they are produced, rather than after the fact.

In other words:

No longer waiting for a process to finish before analyzing its output.

Reading and transforming the stream in real time.
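
A minimal sketch of that direction, using only `proc_open()`, a non-blocking pipe, and `stream_select()`. It leaves out fibers and the event loop, assumes a POSIX-like system where `stream_select()` works on process pipes, and `streamLines` is a hypothetical helper, not part of the demo.

```php
<?php
declare(strict_types=1);

/**
 * Run a command and hand each output line to $onLine as soon as it is available.
 * Returns the number of lines seen.
 *
 * @param list<string> $command
 */
function streamLines(array $command, callable $onLine): int
{
    $process = proc_open($command, [1 => ['pipe', 'w']], $pipes);
    if (!is_resource($process)) {
        throw new RuntimeException('Unable to start the process');
    }

    $stdout = $pipes[1];
    stream_set_blocking($stdout, false);

    $buffer = '';
    $lines = 0;
    while (!feof($stdout)) {
        $read = [$stdout];
        $write = $except = null;
        // Wait up to 200 ms for data instead of blocking until the process exits.
        if (stream_select($read, $write, $except, 0, 200000) === false) {
            break;
        }
        if ($read === []) {
            continue; // timeout: nothing to read yet
        }
        $buffer .= (string) fread($stdout, 8192);
        // Emit every complete line currently in the buffer.
        while (($pos = strpos($buffer, "\n")) !== false) {
            $onLine(substr($buffer, 0, $pos));
            $buffer = substr($buffer, $pos + 1);
            $lines++;
        }
    }
    if ($buffer !== '') {
        $onLine($buffer); // last line without a trailing newline
        $lines++;
    }

    fclose($stdout);
    proc_close($process);

    return $lines;
}

// Usage: print each line of a child PHP process as it arrives.
streamLines([PHP_BINARY, '-r', 'echo "step 1\nstep 2\n";'], function (string $line): void {
    echo "got: $line\n";
});
```

Each line could be passed through the cleaning and budgeting stages as it arrives, instead of accumulating the whole output first.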

Conclusion

The cost of agentic workflows is not just a model problem.

It's an orchestration problem.

A high-performing agent is not one that reads everything.

It is one that receives a clean, structured, and constrained context.

True optimization, therefore, does not consist of shortening words.

It consists of controlling the flow.

Control the stream, not the spelling.
