Darkwood Blog

💳 Flow Tokens - Why token optimization is a flow problem, not a vocabulary problem

on June 21, 2026

The rise of AI agents has brought a topic back to the center of many software architectures: the cost of context.

When an agent executes a workflow, it doesn't just consume a user prompt. It also consumes a large amount of ancillary data:

  • execution logs
  • CLI command outputs
  • tool results
  • debugging traces
  • chunks from RAG
  • raw or semi-structured files

In practice, this context often represents the main cost of the workflow, well before the reasoning phase of the model.

The question then becomes:

How can we reduce token consumption without degrading the quality of the signal transmitted to the model?

A common intuition is to optimize vocabulary: shorten words, abbreviate certain business concepts, artificially condense texts.

This intuition is usually wrong.

Token optimization is not a lexical problem.

It's a flow problem.

The real problem: an uncontrolled context

In many systems, data enters the AI chain as one huge string of characters.

Let's take a classic example.

An agent must analyze the result of a Symfony or Composer command.

The output may contain:

  • repetitive debug lines
  • timestamps
  • ANSI sequences
  • stack traces
  • progress lines
  • genuinely useful information

For a human, the separation between noise and signal is immediate.

For an LLM, everything is initially equivalent.

Each character can become a token.

Each token has a cost.

Each cost consumes a portion of the context window.

The problem is therefore not just the size of the text.

The problem is the lack of structure in the way data flows.
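
To make the noise versus signal distinction concrete, here is a minimal PHP sketch that cleans a captured command output. The function names (`stripAnsi`, `removeNoise`), the timestamp format, and the noise patterns are assumptions made for this example; they are not taken from the flow-pipe code.

```php
<?php

declare(strict_types=1);

/** Remove ANSI CSI escape sequences (colors, cursor moves) from a string. */
function stripAnsi(string $text): string
{
    // ESC [ then parameter bytes, intermediate bytes, and one final byte.
    return preg_replace('/\x1B\[[0-9;?]*[ -\/]*[@-~]/', '', $text) ?? $text;
}

/** Strip leading timestamps, then drop empty, DEBUG, and progress lines. */
function removeNoise(string $text): string
{
    $kept = [];
    foreach (preg_split('/\R/', $text) ?: [] as $line) {
        // Assumed timestamp shape: "[2026-06-21 10:00:00] ".
        $line = preg_replace('/^\[\d{4}-\d{2}-\d{2}[ T]\d{2}:\d{2}:\d{2}\]\s*/', '', $line) ?? $line;
        $trimmed = trim($line);
        if ($trimmed === '') {
            continue;
        }
        // Assumed noise: DEBUG lines, "50%" counters, "[====>  ]" progress bars.
        if (preg_match('/^(DEBUG\b|\d{1,3}%|\[[=>\- ]+\])/', $trimmed) === 1) {
            continue;
        }
        $kept[] = $trimmed;
    }

    return implode("\n", $kept);
}

$raw = "\e[32m[2026-06-21 10:00:00] DEBUG cache warmup\e[0m\n"
    . " 50% [=====>     ]\n"
    . "[2026-06-21 10:00:01] ERROR Class App\\Foo not found\n";

echo removeNoise(stripAnsi($raw)), "\n"; // ERROR Class App\Foo not found
```

Only the error line survives. What counts as noise is a per-source decision, so the patterns would need to be adapted to each tool's output.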

Token optimization is stream discipline

The thesis of this article is simple:

Token optimization is not word shortening. Token optimization is stream discipline.

In other words:

What matters is not how the words are spelled, but how the data moves through the system.

A flow-oriented architecture makes it possible to:

  • clean the data
  • segment the data
  • measure its cost
  • compress repetitions
  • apply an explicit budget
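
Three of these operations (measuring, compressing repetitions, budgeting) can be sketched in a few lines of PHP. This is an illustration under stated assumptions: the "4 characters per token" ratio is a rough heuristic rather than a real tokenizer, and the function names are hypothetical, not the ones used in flow-pipe.

```php
<?php

declare(strict_types=1);

/** Rough token estimate: about 4 characters per token (heuristic, not a tokenizer). */
function estimateTokens(string $text): int
{
    return (int) ceil(strlen($text) / 4);
}

/** Collapse runs of consecutive identical lines into one line with a repeat count. */
function compressRepetitions(array $lines): array
{
    $out = [];
    $previous = null;
    $count = 0;
    foreach ($lines as $line) {
        if ($line === $previous) {
            $count++;
            continue;
        }
        if ($previous !== null) {
            $out[] = $count > 1 ? "$previous (x$count)" : $previous;
        }
        $previous = $line;
        $count = 1;
    }
    if ($previous !== null) {
        $out[] = $count > 1 ? "$previous (x$count)" : $previous;
    }

    return $out;
}

/** Keep lines in order until the next one would exceed the token budget. */
function applyBudget(array $lines, int $maxTokens): array
{
    $kept = [];
    $used = 0;
    foreach ($lines as $line) {
        $cost = estimateTokens($line);
        if ($used + $cost > $maxTokens) {
            break;
        }
        $kept[] = $line;
        $used += $cost;
    }

    return $kept;
}

$lines = ['retrying connection', 'retrying connection', 'retrying connection', 'connected to db', 'migration done'];
$compressed = compressRepetitions($lines);
$budgeted = applyBudget($compressed, 10);

print_r($budgeted); // "retrying connection (x3)" and "connected to db"
```

The budget here simply truncates in order. A real pipeline might instead rank chunks by relevance before cutting, which is a separate design choice.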

This change of perspective is important.

We are no longer trying to compress words.

We are trying to control a flow.

Flow as an orchestration model

This is precisely the angle explored with Flow.

Flow is not just a task execution engine.

Flow can be seen as a layer for orchestrating the movement of data.

Data can circulate:

  • as a pipe
  • as a stream
  • as chunks
  • as a measurable context
  • as a budgeted context

This model is particularly interesting for agentic workflows, because an agent rarely consumes strongly typed business objects.

It primarily consumes text.

This text then becomes a resource that needs to be managed.

Demonstration: flow-pipe

To explore this idea, I built a Symfony demo available here:

flow-pipe repository

The project exposes a console command:

php bin/console app:flow-token-demo \
  --input=flow-engine-log --show-chunks

The goal is not to call an LLM.

The goal is to locally simulate a context processing pipeline.

The pipeline follows eight stages:

  1. Load a source
  2. Remove the ANSI sequences
  3. Eliminate the noise
  4. Normalize whitespace
  5. Split into chunks
  6. Compress
  7. Apply a budget
  8. Produce a usable output

A declarative pipeline

The pipeline can be expressed in declarative form:

source |> strip_ansi |> remove_noise |> normalize_whitespace
  |> chunk:300 |> compress |> budget:1000 |> sink

This notation allows the pipeline to be read from left to right.
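
As a rough illustration of how such an expression can be turned into ordered steps, the sketch below splits the string on `|>` and separates an optional argument after `:`. It is a plain string split written for this article's notation, not the parser used in flow-pipe, and `parsePipeline` is a hypothetical name.

```php
<?php

declare(strict_types=1);

/**
 * Split a pipeline expression into ordered steps.
 * "chunk:300" becomes ['name' => 'chunk', 'arg' => '300']; steps without ":" get a null arg.
 */
function parsePipeline(string $expression): array
{
    $steps = [];
    foreach (explode('|>', $expression) as $segment) {
        $segment = trim($segment);
        if ($segment === '') {
            throw new InvalidArgumentException('Empty step in pipeline expression.');
        }
        [$name, $arg] = array_pad(explode(':', $segment, 2), 2, null);
        $steps[] = ['name' => $name, 'arg' => $arg];
    }

    return $steps;
}

$steps = parsePipeline('source |> strip_ansi |> chunk:300 |> budget:1000 |> sink');
foreach ($steps as $step) {
    echo $step['name'], $step['arg'] !== null ? " ({$step['arg']})" : '', "\n";
}
```

Arguments stay as raw strings at this stage; turning `"300"` into an integer is left to whichever operation owns that step.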

Each step transforms the output of the previous one.

The benefit is twofold:

  • improved readability
  • improved extensibility

Adding a new operation does not require modifying a central conditional structure.

Each operation becomes a standalone component.

Three meanings of the pipe

In this demonstration, the symbol |> appears at three levels.

1. Expression DSL

First level: the pipe represents a declarative language readable by humans.

source |> compress |> sink

2. PHP 8.5 pipe operator

PHP 8.5 natively introduces the pipe operator.

It allows for a more explicit composition of callables.

Example:

yield $step |> (fn ($step) => new ClosureJob(...));

The code then reads much like the DSL.
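
For readers who have not used the operator yet: `$value |> $callable` calls the callable on the right with the value on the left as its single argument. The sketch below reproduces that behavior with a hypothetical `pipe()` helper so that it also runs on PHP 8.1 to 8.4; the native PHP 8.5 form is shown in a comment. Neither the helper nor the sample string comes from flow-pipe.

```php
<?php

declare(strict_types=1);

/** Hypothetical helper: feed $value through each callable in order, like a chain of |>. */
function pipe(mixed $value, callable ...$stages): mixed
{
    foreach ($stages as $stage) {
        $value = $stage($value);
    }

    return $value;
}

$clean = pipe(
    "  Build   FAILED  ",
    trim(...),
    strtolower(...),
    fn (string $s): string => preg_replace('/\s+/', ' ', $s) ?? $s,
);

// With PHP 8.5 the same chain can be written with the native operator
// (an arrow function on the right-hand side must be wrapped in parentheses):
//
// $clean = "  Build   FAILED  "
//     |> trim(...)
//     |> strtolower(...)
//     |> (fn (string $s): string => preg_replace('/\s+/', ' ', $s) ?? $s);

echo $clean, "\n"; // build failed
```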

3. Runtime composition with Flow

Third level: the actual composition of jobs in Flow.

Each transformation becomes a job applied to a shared context.

This is the layer that actually executes the pipeline.
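
The idea of jobs applied to a shared context can be sketched without any framework. In the example below, `PipelineContext` and `job()` are hypothetical stand-ins written for this article; they do not reproduce Flow's actual classes or API.

```php
<?php

declare(strict_types=1);

/** Hypothetical shared context; not a class from Flow. */
final class PipelineContext
{
    /** @param list<string> $log */
    public function __construct(
        public string $text,
        public array $log = [],
    ) {
    }
}

/** Wrap a string transformation as a job that also records its effect on the context. */
function job(string $name, callable $transform): Closure
{
    return function (PipelineContext $context) use ($name, $transform): PipelineContext {
        $before = strlen($context->text);
        $context->text = $transform($context->text);
        $context->log[] = sprintf('%s: %d -> %d chars', $name, $before, strlen($context->text));

        return $context;
    };
}

$jobs = [
    job('normalize_whitespace', fn (string $t): string => trim(preg_replace('/[ \t]+/', ' ', $t) ?? $t)),
    job('dedupe_lines', fn (string $t): string => implode("\n", array_unique(explode("\n", $t)))),
];

$context = new PipelineContext("ok\nok\nfailed   step  3\nok");
foreach ($jobs as $run) {
    $context = $run($context);
}

echo $context->text, "\n";
print_r($context->log);
```

Because every job receives and returns the same context object, each step can leave a measurement behind, which is what makes the flow observable as well as executable.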

Inspiration: Pratt parsing

The parsing part of the DSL is inspired by Pratt parsing.

Instead of having a monolithic parser, each operation has:

  • its name
  • its parsing logic
  • its execution logic

This approach avoids an overloaded central parser.

The pipeline becomes extensible by design.
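
The sketch below illustrates only the "each operation owns its name, parsing, and execution" idea, using a simple registry lookup. It is not a full Pratt parser and not the flow-pipe implementation; the `Operation` interface, the two classes, and `runPipeline()` are hypothetical, and the budget is counted in characters to keep the example short.

```php
<?php

declare(strict_types=1);

/** Hypothetical contract: an operation owns its name and its argument parsing. */
interface Operation
{
    public function name(): string;

    /** Turn the raw DSL argument (the part after ":") into a ready-to-run callable. */
    public function parse(?string $rawArgument): callable;
}

final class ChunkOperation implements Operation
{
    public function name(): string
    {
        return 'chunk';
    }

    public function parse(?string $rawArgument): callable
    {
        $size = (int) ($rawArgument ?? '300');
        if ($size < 1) {
            throw new InvalidArgumentException('chunk size must be positive');
        }

        // Execution logic: split every incoming chunk into pieces of at most $size bytes.
        return function (array $chunks) use ($size): array {
            $out = [];
            foreach ($chunks as $chunk) {
                foreach (str_split($chunk, $size) as $piece) {
                    $out[] = $piece;
                }
            }

            return $out;
        };
    }
}

final class BudgetOperation implements Operation
{
    public function name(): string
    {
        return 'budget';
    }

    public function parse(?string $rawArgument): callable
    {
        $maxChars = (int) ($rawArgument ?? '1000');

        // Execution logic: keep chunks in order while the character budget allows it.
        return function (array $chunks) use ($maxChars): array {
            $kept = [];
            $used = 0;
            foreach ($chunks as $chunk) {
                if ($used + strlen($chunk) > $maxChars) {
                    break;
                }
                $kept[] = $chunk;
                $used += strlen($chunk);
            }

            return $kept;
        };
    }
}

/** Look up each step by name; adding an operation means registering one more object. */
function runPipeline(string $expression, array $operations, array $chunks): array
{
    $registry = [];
    foreach ($operations as $operation) {
        $registry[$operation->name()] = $operation;
    }
    foreach (explode('|>', $expression) as $segment) {
        [$name, $arg] = array_pad(explode(':', trim($segment), 2), 2, null);
        if (!isset($registry[$name])) {
            throw new InvalidArgumentException("Unknown operation: $name");
        }
        $chunks = $registry[$name]->parse($arg)($chunks);
    }

    return $chunks;
}

$result = runPipeline('chunk:4 |> budget:10', [new ChunkOperation(), new BudgetOperation()], ['abcdefghijklmnop']);
print_r($result); // "abcd" and "efgh"
```

Note that `runPipeline()` contains no `if` or `switch` per operation: the central code only dispatches, and all operation-specific knowledge stays inside the operation classes.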

Result

Let's take the flow-engine-log fixture.

Before transformation:

  • 17,398 characters
  • ~4,488 estimated tokens

After pipeline:

  • 334 characters
  • ~88 estimated tokens

That's approximately:

a 98% reduction
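
The ratio can be checked directly from the four figures above. The helper name `reductionPercent` is introduced here only for the calculation.

```php
<?php

declare(strict_types=1);

/** Percentage reduction from $before to $after, rounded to one decimal. */
function reductionPercent(int $before, int $after): float
{
    return round((1 - $after / $before) * 100, 1);
}

printf("characters: %.1f%%\n", reductionPercent(17398, 334)); // characters: 98.1%
printf("tokens:     %.1f%%\n", reductionPercent(4488, 88));   // tokens:     98.0%
```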

The important point is not just the ratio.

The important point is that the domain vocabulary remains intact.

The concepts:

  • flow
  • stream
  • pipeline
  • source
  • sink

have not been shortened.

What has disappeared is:

  • the noise
  • the repetitions
  • lines without value

In other words:

The signal remained.

The noise has disappeared.

Signal vs. noise

A second fixture, flow-lexicon, is intentionally hard to compress.

The observed reduction is small.

And that's a good thing.

This means that the content already contains mostly signal.

A good pipeline doesn't try to compress everything.

It seeks to eliminate only what adds no value.

And then what?

This demonstration is intentionally synchronous and local.

But it opens up an interesting direction.

A natural evolution would be to move towards:

  • non-blocking streams
  • stream_select()
  • fibers
  • process pipes
  • PTY / TTY
  • an event loop

This would make it possible to process streams as they are produced, not after the fact.

In other words:

no longer waiting for a process to finish before analyzing its output.

Reading and transforming the stream in real time.
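
A first step in that direction is incremental, line-by-line processing with a generator, sketched below. The function name `streamSignal` and the noise predicate are assumptions for illustration. The example still uses blocking `fgets()` on an in-memory stream; non-blocking reads with `stream_select()` on process pipes would be the next step and are not shown here.

```php
<?php

declare(strict_types=1);

/**
 * Yield cleaned lines from a readable stream as soon as each line is read,
 * instead of buffering the whole output first.
 *
 * @param resource $stream
 */
function streamSignal($stream, callable $isNoise): Generator
{
    while (($line = fgets($stream)) !== false) {
        $line = rtrim($line, "\r\n");
        if ($line === '' || $isNoise($line)) {
            continue;
        }
        yield $line;
    }
}

// Demo with an in-memory stream; a pipe returned by proc_open() is read the same way.
$stream = fopen('php://memory', 'w+');
fwrite($stream, "DEBUG boot\nERROR disk full\n\nDEBUG retry\nWARN slow query\n");
rewind($stream);

$isDebug = fn (string $line): bool => str_starts_with($line, 'DEBUG');
foreach (streamSignal($stream, $isDebug) as $line) {
    echo $line, "\n"; // prints the ERROR line, then the WARN line
}
fclose($stream);
```

Because the generator is lazy, a consumer can stop early (for example once a token budget is reached) without reading the rest of the stream.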

Conclusion

The cost of agentic workflows is not just a model problem.

It's an orchestration problem.

A high-performing agent is not one that reads everything.

It is one that receives a clean, structured, and constrained context.

True optimization, therefore, does not consist of shortening words.

It consists of controlling the flow.

Control the stream, not the spelling.

Resources

  • Guillaume Moigneu about tokenizing context discipline
  • flow-pipe
  • slidewire presentation
  • PHP 8.5 pipe operator

Darkwood 2026, all rights reserved