⚙️ Message-oriented vs. Data-oriented orchestration - from data to knowledge
on April 17, 2026
For intellectual property reasons, the case study used in this article is not the original subject, although it is closely related. For any further information, please contact Omer, who will be happy to answer. Apologies for any inconvenience.
In this article, we explore two fundamental approaches to software orchestration:
- Message-Oriented Orchestration: via Symfony Messenger, in synchronous or asynchronous mode
- Data-Oriented Orchestration: via Navi (synchronous) and Flow (asynchronous)
The case study is based on a classic yet instructive problem: text mining applied to a set of Git repositories.
For the practical demonstration, I reuse the EIT (Information Extraction from Texts) tutorial on document classification that I completed in 2007/2008 with Matthieu Beyou during computer science tutorials.
The data comes from Omer (a former colleague) and is available on his site https://git.arkalo.ovh (via the API).
The goal is not to produce the best machine learning model, but to understand how the form of orchestration influences the complexity, readability, and scalability of the system.
The problem: transforming repositories into usable knowledge
The dataset consists of a list of Git repositories defined in a repos.json file, built from the repositories listed on https://git.arkalo.ovh/explore/repos.
You can extract this information using the Composio connector for Gitea (https://composio.dev/toolkits/gitea). See my previous article for the implementation: https://blog.darkwood.com/fr/article/relacher-les-connecteurs-des-outils-au-langage.
Each repository becomes a canonical document built from:
- repository name
- description
- README
- metadata (owner, topics…)
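As an illustration, building such a canonical document could look like the sketch below. It assumes PHP 8.1+. The `CanonicalDocument` class, the `buildCanonicalDocument()` function and the array keys (`full_name`, `name`, `description`, `readme`, `owner`, `topics`) are hypothetical names chosen for this example, not the project's actual code, and the sample entry is made up.

```php
<?php
declare(strict_types=1);

// Hypothetical value object: one text blob per repository.
final class CanonicalDocument
{
    public function __construct(
        public readonly string $id,
        public readonly string $text,
    ) {}
}

/**
 * Builds one canonical document from a decoded repos.json entry.
 * The array keys used here are assumptions about the file layout.
 */
function buildCanonicalDocument(array $repo): CanonicalDocument
{
    $parts = [
        $repo['name'] ?? '',
        $repo['description'] ?? '',
        $repo['readme'] ?? '',
        $repo['owner'] ?? '',
        implode(' ', $repo['topics'] ?? []),
    ];

    // Drop empty fields, then join the rest into a single text.
    $parts = array_filter(array_map('trim', $parts), static fn (string $p): bool => $p !== '');

    return new CanonicalDocument($repo['full_name'] ?? ($repo['name'] ?? ''), implode("\n", $parts));
}

// Made-up sample entry, for illustration only.
$doc = buildCanonicalDocument([
    'full_name' => 'example/demo',
    'name' => 'demo',
    'description' => 'A small demo API',
    'readme' => '',
    'owner' => 'example',
    'topics' => ['php', 'api'],
]);

echo $doc->text, PHP_EOL;
```

Keeping the document as a single immutable text makes every later stage a function of that text only.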
This document is then transformed via a classic text mining pipeline:
- Preprocessing (cleaning, tokenization)
- Feature Extraction
- TF-IDF Weighting
- Similarity between documents
- Classification / clustering
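The first stage of this list can be sketched as a small pure function. This is a minimal, ASCII-only version written for this article: the `tokenize()` name, the stop-word list and the rule that drops one-character tokens are all assumptions, not the tutorial's original preprocessing.

```php
<?php
declare(strict_types=1);

/**
 * Lowercases the text, replaces every run of non-alphanumeric ASCII
 * characters with a space, splits on whitespace, then removes stop
 * words and one-character tokens.
 *
 * @param list<string> $stopWords
 * @return list<string>
 */
function tokenize(string $text, array $stopWords = ['the', 'a', 'an', 'of', 'and', 'for', 'to', 'in']): array
{
    $text = strtolower($text);
    $text = preg_replace('/[^a-z0-9]+/', ' ', $text) ?? '';
    $tokens = preg_split('/\s+/', $text, -1, PREG_SPLIT_NO_EMPTY) ?: [];

    return array_values(array_filter(
        $tokens,
        static fn (string $t): bool => strlen($t) > 1 && !in_array($t, $stopWords, true),
    ));
}

print_r(tokenize('A Symfony bundle for the orchestration of data-pipelines!'));
```

A real corpus with accented text would need a Unicode-aware variant, which is left out here to keep the sketch dependency-free.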
This pipeline is directly inspired by historical approaches:
- TF-IDF: weight = tf * log(N / df)
- Cosine similarity between documents
- Supervised Naive Bayes Classification
- Unsupervised k-means clustering
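To make the weighting formula concrete, here is a minimal sketch of weight = tf * log(N / df), where tf is the raw count of a term in a document, N the number of documents and df the number of documents containing the term. The `tfIdf()` function name and the toy corpus are assumptions for illustration. Other TF-IDF variants (smoothed or normalized) exist and are not shown.

```php
<?php
declare(strict_types=1);

/**
 * Computes TF-IDF vectors with weight = tf * log(N / df).
 *
 * @param array<string, list<string>> $corpus document id => tokens
 * @return array<string, array<string, float>> document id => (term => weight)
 */
function tfIdf(array $corpus): array
{
    $n = count($corpus);

    // df: number of documents that contain each term at least once.
    $df = [];
    foreach ($corpus as $tokens) {
        foreach (array_unique($tokens) as $term) {
            $df[$term] = ($df[$term] ?? 0) + 1;
        }
    }

    $vectors = [];
    foreach ($corpus as $id => $tokens) {
        $vectors[$id] = [];
        // tf: raw count of the term inside the document.
        foreach (array_count_values($tokens) as $term => $tf) {
            $vectors[$id][$term] = $tf * log($n / $df[$term]);
        }
    }

    return $vectors;
}

// Toy corpus, made up for the example.
$vectors = tfIdf([
    'a' => ['symfony', 'messenger', 'bus'],
    'b' => ['symfony', 'flow', 'data'],
    'c' => ['data', 'flow', 'pipeline'],
]);

print_r($vectors['a']);
```

A term that appears in every document gets a weight of zero, since log(N / N) = 0: it carries no discriminating information.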
What interests us here is not the algorithm, but the way to orchestrate it.
If you want to read further, the Resources section at the bottom of the article lists a number of references on data mining applied to computer science.
Business pipeline (independent of orchestration)
First and foremost, the core business needs to be isolated.
Repository → Document → Tokens → Features → TF-IDF → Similarity → Results
This pipeline represents a data transformation.
Each step:
- takes a piece of data
- produces new data
- without strong dependence on an external context
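These three properties are what make the steps composable. A minimal sketch, assuming toy stages in place of the real ones (the `pipeline()` helper and the three closures are hypothetical):

```php
<?php
declare(strict_types=1);

/**
 * Chains stages left to right: the output of one stage becomes
 * the input of the next one.
 */
function pipeline(callable ...$stages): callable
{
    return static fn (mixed $input): mixed => array_reduce(
        $stages,
        static fn (mixed $carry, callable $stage): mixed => $stage($carry),
        $input,
    );
}

// Toy stages standing in for the real ones; each is a pure function.
$toDocument = static fn (array $repo): string => $repo['name'] . ' ' . $repo['description'];
$toTokens = static fn (string $doc): array => explode(' ', strtolower($doc));
$toFeatures = static fn (array $tokens): array => array_count_values($tokens);

$run = pipeline($toDocument, $toTokens, $toFeatures);

print_r($run(['name' => 'Flow', 'description' => 'data flow orchestration']));
```

Nothing in this sketch knows how it is triggered, which is exactly what leaves the orchestration choice open.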
This is precisely where the two approaches diverge.
Approach 1 Message-Oriented - Orchestration via Symfony Messenger
In the Message-Oriented implementation, the pipeline is not expressed as a continuous data transformation.
It is encapsulated in a message, then executed via the Symfony bus.
Execution Model
Command → Message Bus → Handler → PipelineService → Stages
In concrete terms:
- a CLI command triggers the execution
- a message is sent
- a handler takes care of the execution
- the core business remains centralized in a shared service
RunMessengerPipelineMessage
→ RunMessengerPipelineHandler
→ PipelineService
Separation of responsibilities
This implementation adheres to a key project constraint:
The core business is strictly shared between the two approaches
So:
- Messenger contains no business logic
- it only orchestrates the execution
Actual Pipeline Executed
The handler triggers a deterministic pipeline:
1. ingest
2. preprocess
3. feature build
4. classification
5. clustering
Each step is executed in a common application service (PipelineService).
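The chain Command → Message Bus → Handler → PipelineService can be sketched without any framework. In the block below, `InMemoryBus` is a deliberately naive stand-in written for this article and is not Symfony Messenger's API. The `run()` method, its `$reposFile` parameter and the `$executed` log are also assumptions; only the class names `RunMessengerPipelineMessage`, `RunMessengerPipelineHandler` and `PipelineService` come from the project.

```php
<?php
declare(strict_types=1);

// Shared business service. Stage bodies are placeholders that only log their name.
final class PipelineService
{
    /** @var list<string> */
    public array $executed = [];

    public function run(string $reposFile): void
    {
        // $reposFile is unused in this sketch; the real service would read it.
        foreach (['ingest', 'preprocess', 'feature build', 'classification', 'clustering'] as $stage) {
            $this->executed[] = $stage;
        }
    }
}

final class RunMessengerPipelineMessage
{
    public function __construct(public readonly string $reposFile) {}
}

final class RunMessengerPipelineHandler
{
    public function __construct(private readonly PipelineService $pipeline) {}

    public function __invoke(RunMessengerPipelineMessage $message): void
    {
        // No business logic here: the handler only delegates.
        $this->pipeline->run($message->reposFile);
    }
}

// Stand-in for a message bus: maps a message class to its handler.
final class InMemoryBus
{
    /** @param array<class-string, callable> $handlers */
    public function __construct(private readonly array $handlers) {}

    public function dispatch(object $message): void
    {
        ($this->handlers[$message::class])($message);
    }
}

$service = new PipelineService();
$bus = new InMemoryBus([
    RunMessengerPipelineMessage::class => new RunMessengerPipelineHandler($service),
]);

// What the CLI command would do.
$bus->dispatch(new RunMessengerPipelineMessage('repos.json'));

echo implode(' -> ', $service->executed), PHP_EOL;
```

Even in this reduced form, three types (message, handler, bus) exist only to reach one method call.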
Concepts introduced by Messenger
The orchestration explicitly introduces:
- a message class
- a dedicated handler
- a dependence on the bus
- a dispatch layer
Command → Message → Handler → Service
These elements are specific to Messenger and do not exist in the data-oriented model
Observability and debugging
Messenger offers a natural debugging model:
- message inspection
- middleware
- bus logging
- extensibility towards async / queue
Debug = message level + middleware
Nature of the overhead
In this MVP, the overhead is conceptually measurable:
- introduction of an artificial message
- indirection via a handler
- the need to structure the execution around the bus
But this overhead is located in the orchestration adapter, not in the business logic.
Summary
This approach transforms the pipeline into:
a distributed work unit
It favors:
- Symfony standardization
- extensibility towards async
- integration with the ecosystem
At the cost of an additional layer of indirection.
Conceptual Example
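use Symfony\Component\Messenger\Attribute\AsMessageHandler;
use Symfony\Component\Messenger\MessageBusInterface;

final class ComputeTfIdfMessage
{
    public function __construct(public readonly DocumentId $id) {}
}

#[AsMessageHandler]
final class ComputeTfIdfHandler
{
    // DocumentRepository and TfIdfCalculator are illustrative type names, not the project's actual classes.
    public function __construct(
        private readonly DocumentRepository $repository,
        private readonly TfIdfCalculator $tfidf,
        private readonly MessageBusInterface $bus,
    ) {}

    public function __invoke(ComputeTfIdfMessage $message): void
    {
        $document = $this->repository->get($message->id);
        $vector = $this->tfidf->compute($document);
        // The next stage is triggered by dispatching another message.
        $this->bus->dispatch(new ComputeSimilarityMessage($vector));
    }
}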
Benefits
- strong decoupling
- resilience (retry, queue)
- native parallelization
- Symfony standard
Structural Limitations
The problem quickly becomes apparent:
➡️ the message becomes an artificial envelope
We manipulate:
- IDs
- persistent states
- indirect transitions
Yet the underlying problem is simply:
data → transformation → data
This introduces:
- boilerplate
- implicit dependencies
- a loss of overall readability
Approach 2 Data-Oriented - Orchestration via Navi (synchronous) and Flow (asynchronous)
In the Data-Oriented implementation, the pipeline is expressed as an ordered sequence of actions applied to a context.
There is no message.
There is no dispatch.
There is only:
- a piece of data
- a context
- a sequential transformation
Execution Model
Command → WorkflowRunner → Actions → PipelineService → Data
In concrete terms:
- a command triggers a workflow
- the WorkflowRunner executes a list of actions
- each action transforms a Context
- the business services are identical to those used with Messenger
WorkflowRunner
→ PipelineStageAction[]
→ Context
→ PipelineService
Pipeline Structure
The pipeline is explicitly defined as a sequence:
[IngestAction,
PreprocessAction,
FeatureBuildAction,
ClassificationAction,
ClusteringAction]
Each action:
- takes a Context
- applies a transformation
- returns a new Context
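The contract "takes a Context, returns a new Context" can be sketched as follows. This is not the Navi or Flow API: the `with()` method, the interface signature and the placeholder data are assumptions made for illustration, and only two of the five actions are shown.

```php
<?php
declare(strict_types=1);

// Immutable context: with() returns a modified copy instead of mutating.
final class Context
{
    /** @param array<string, mixed> $data */
    public function __construct(public readonly array $data = []) {}

    public function with(string $key, mixed $value): self
    {
        return new self(array_merge($this->data, [$key => $value]));
    }
}

interface PipelineStageAction
{
    public function __invoke(Context $context): Context;
}

final class IngestAction implements PipelineStageAction
{
    public function __invoke(Context $context): Context
    {
        // Placeholder data instead of reading repos.json.
        return $context->with('documents', ['Flow data orchestration', 'Messenger bus']);
    }
}

final class PreprocessAction implements PipelineStageAction
{
    public function __invoke(Context $context): Context
    {
        $tokens = array_map(
            static fn (string $doc): array => explode(' ', strtolower($doc)),
            $context->data['documents'],
        );

        return $context->with('tokens', $tokens);
    }
}

final class WorkflowRunner
{
    /** @param list<PipelineStageAction> $actions */
    public function run(array $actions, Context $context): Context
    {
        foreach ($actions as $action) {
            $context = $action($context);
        }

        return $context;
    }
}

$initial = new Context();
$result = (new WorkflowRunner())->run([new IngestAction(), new PreprocessAction()], $initial);

print_r($result->data['tokens']);
```

Because each action returns a copy, the initial context is left untouched and every intermediate state can be kept for inspection.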
Nature of the Context
The Context becomes the central object:
- it contains the pipeline state
- it evolves at each stage
- it is inspectable
Context₀ → Context₁ → Context₂ → ... → Contextₙ
Concepts introduced by Flow
This approach introduces:
- explicit actions
- a runner
- an evolving context
Data → Action → Data
Unlike Messenger:
- no message
- no handler
- no bus
Observability and debugging
The debugging process changes completely in nature:
Debug = sequence of actions + context snapshots
Benefits:
- visible execution order
- inspectable intermediate state
- deterministic pipeline
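Snapshot-based debugging needs very little machinery. The sketch below uses plain arrays and closures rather than the project's classes, and the `runWithSnapshots()` helper is a hypothetical name.

```php
<?php
declare(strict_types=1);

/**
 * Runs the actions in order and keeps a snapshot of the context
 * after each one.
 *
 * @param array<string, callable> $actions action name => transformation (array in, array out)
 * @return array{0: array, 1: array<string, array>} final context, then snapshots by action name
 */
function runWithSnapshots(array $actions, array $context): array
{
    $snapshots = [];
    foreach ($actions as $name => $action) {
        $context = $action($context);
        // PHP arrays have value semantics, so this stores the state at this point in time.
        $snapshots[$name] = $context;
    }

    return [$context, $snapshots];
}

[$final, $snapshots] = runWithSnapshots([
    'ingest' => static fn (array $c): array => $c + ['documents' => ['Flow data', 'Messenger bus']],
    'preprocess' => static fn (array $c): array => $c + ['tokens' => array_map('strtolower', $c['documents'])],
], []);

foreach ($snapshots as $name => $snapshot) {
    echo $name, ': ', implode(', ', array_keys($snapshot)), PHP_EOL;
}
```

Reading the snapshots in order gives the execution order and the intermediate state at once, without going through a bus or middleware.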
Nature of readability
The pipeline can be directly read as a stream:
ingest → preprocess → features → classification → clustering
Without structural transformation.
Structural Overhead
The cost introduced is different:
- need for a Context
- abstraction via actions
But:
- no envelope
- no bus detours
- no break in the data flow
Summary
This approach transforms the pipeline into:
a series of data transformations
It favors:
- immediate readability
- direct transformation of data
- absence of envelope
- deterministic pipeline
- ease of testing
Limitations
- less suitable for complex distributed systems
- requires strict discipline regarding the purity of the transformations
- tooling is less standard than Messenger
Direct Comparison
| Criterion | Message-Oriented | Data-Oriented |
| --- | --- | --- |
| Mental model | Events / Messages | Data streams |
| Readability | fragmented | linear |
| Overhead | high (messages, handlers) | low |
| Scalability | excellent | depends on the design |
| Debug | indirect | direct |
| Business coupling | weak but diffuse | strong but explicit |

| Aspect | Messenger | NaviFlow |
| --- | --- | --- |
| Central unit | Message | Context |
| Orchestration | Bus + Handler | Runner + Actions |
| Flow | indirect | direct |
| Debug | message-centric | data-centric |
| Overhead | message + handler | action + context |
| Pipeline | encapsulated | explicit |
Key point: the illusion of complexity
In the case of text mining, each step is:
- pure
- deterministic
- functional
Examples:
- TF-IDF → simple mathematical formula
- Cosine similarity → normalized dot product
There is no natural need for messages.
The introduction of Messenger is therefore an architectural decision, not a business necessity.
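To show how small such a step is, here is a minimal sketch of cosine similarity as a normalized dot product over sparse vectors (term => weight). The `cosineSimilarity()` name and the convention of returning 0.0 for an all-zero vector are choices made for this example.

```php
<?php
declare(strict_types=1);

/**
 * Cosine similarity between two sparse vectors (term => weight):
 * dot product divided by the product of the Euclidean norms.
 *
 * @param array<string, float> $a
 * @param array<string, float> $b
 */
function cosineSimilarity(array $a, array $b): float
{
    $dot = 0.0;
    foreach ($a as $term => $weight) {
        // Terms missing from $b contribute nothing to the dot product.
        $dot += $weight * ($b[$term] ?? 0.0);
    }

    $norm = static fn (array $v): float => sqrt(array_sum(array_map(static fn ($w) => $w * $w, $v)));
    $denominator = $norm($a) * $norm($b);

    // Convention chosen here: an all-zero vector is similar to nothing.
    return $denominator > 0.0 ? $dot / $denominator : 0.0;
}

$same = cosineSimilarity(['flow' => 1.0, 'data' => 1.0], ['flow' => 1.0, 'data' => 1.0]);
$disjoint = cosineSimilarity(['flow' => 1.0], ['bus' => 1.0]);

printf("same: %.3f, disjoint: %.3f\n", $same, $disjoint);
```

The function has no state and no side effects: same input, same output, which is what makes a message envelope optional here.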
Main Insight
Message-oriented transforms data into events.
Data-oriented transforms data into data.
In a system like this:
- Message-Oriented adds a layer
- Data-Oriented reveals the model
Implications for Symfony
Symfony is evolving towards:
- async
- workers
- sidekicks (FrankenPHP)
- distributed orchestration
But this raises a fundamental question:
👉 Does everything have to be orchestrated via messages?
The answer depends on the problem.
When to use each approach
Message-Oriented
- distributed workflows
- long tasks
- resilient systems
- business events
Data-Oriented
- analytical pipelines
- data transformations
- deterministic systems
- intensive calculations
Source code
The project's source code is free and can be viewed here: https://github.com/matyo91/omer-quotes
Conclusion
This project demonstrates one simple thing:
👉 Orchestration is not neutral
Two functionally identical implementations can produce:
- radically different systems
- opposing cognitive costs
- divergent evolutionary capacities
In the case of text mining:
- Message-Oriented makes things more complex
- Data-Oriented clarifies them
Symfony Messenger orchestrates a pipeline as a unit of work.
Darkwood Flow orchestrates a pipeline as a data transformation.
Next
The next step in the project is to:
- extend the pipeline (clustering, classification)
- integrate more advanced models
- expose an API
- compare actual performance
But most importantly:
👉 continue to question the form of the orchestration.
Resources
Thanks to those who contributed to this article
- Omer's git repository for data and inspiration: https://git.arkalo.ovh
- Leverage Messenger to Improve Your Architecture - Tugdual Saunier for the article outline: https://speakerdeck.com/tucksaun/tirez-profit-de-messenger-pour-ameliorer-votre-architecture
- Polytech Paris Sud (formerly IFIPS) 2008: Information Extraction from Texts Project - Document Classification by François Yvon and Alexandre Allauzen, carried out as a tutorial during the 2007/2008 academic year using the Perl language with Matthieu Beyou https://www.linkedin.com/in/matthieu-beyou-9a425a32/
Examples from friends on EIT topics and mathematical models applied to AI
- Claude just changed sales calls forever! (free skill) - Alexandra Spalato | AI Automation: https://www.youtube.com/watch?v=FuVIGGWwYKY
- Demystifying AI: A practical guide for PHP developers - Iana IATSUN - PHP Forum 2024: https://www.youtube.com/watch?v=u-yrK_-_p9g
- Embeddings in PHP: Symfony AI in practice: https://speakerdeck.com/lyrixx/embeddings-symfony-ai-en-pratique
- Stack Overflow tags - automatic prediction using machine learning algorithms - Marco Berta: https://www.youtube.com/watch?v=fFKXFDDjEJU
- API Platform Conference 2025 - Gregory Planchat - L'Event Storming dans nos projets API Platform : https://www.youtube.com/watch?v=zyxsibA7by4
- Help! I'm being asked to use AI! - Drupal Camp Grenoble 2026 - Alexandre Balmes: https://speakerdeck.com/pocky/au-secours-on-me-demande-dutiliser-de-lia-drupal-camp-grenoble-2026
- DeepMind's New AI Just Changed Science Forever - Two Minute Papers: https://www.youtube.com/watch?v=Io_GqmbNBbY
- Langflow Models Are Smart. Data Is Everything. Building Context-Rich AI Systems with Unstructured: https://www.youtube.com/watch?v=fNLUv6Pvc6w
- I am a legend: hacking hearthstone with machine learning - Elie Bursztein, Celine Bursztein: https://elie.net/talk/i-am-a-legend
- Tell me something about myself that I don't yet know by Nathalie | A Voice That Carries: https://x.com/Bonzai_Star/status/2031432381471797589
The Future of AI as Seen by Yann LeCun
- Nobody realizes what Yann LeCun has just created - Grand Angle Nova: https://www.youtube.com/watch?v=P-wAr687qxg
- For those who are curious: Inaugural lecture by Yann LeCun - Deep Learning and Beyond: The New Challenges of AI - École nationale des ponts et chaussées: https://www.youtube.com/watch?v=Z208NMP7_-0
- What is knowledge made of? - Arthur Sarrazin (https://www.linkedin.com/in/arthursarazin): https://srzarthur.substack.com/p/what-is-knowledge-made-of