⚙️ NoLife Models - Towards a local infrastructure for AI runtimes with Symfony
on May 13, 2026
For years, using an AI model meant calling a remote API.
The workflow was relatively simple:
- send a prompt;
- wait for a response;
- display text.
But in recent months, a new ecosystem has been emerging around local models.
An ecosystem composed of:
- model catalogues;
- local runtimes;
- benchmark systems;
- structured exports;
- observability;
- governance;
- orchestration.
And gradually, AI is starting to look more like a software infrastructure than a simple chatbot.
It is in this context that **NoLife Models** was born: a Symfony 8 project designed to explore this new layer of local infrastructure around AI models.
The problem: local models are proliferating
Today, there is an enormous number of models:
- Qwen
- Llama
- Granite
- Gemma
- Mistral
- Phi
- DeepSeek
- etc.
And each one has:
- different sizes;
- different quantizations;
- different capabilities;
- different context windows;
- different behaviors.
The problem is therefore no longer:
“How do I use a model?”
But rather:
“Which model to use, in what context, on what runtime, with what performance?”
models.dev: towards a standardized model catalog
One particularly interesting project in this evolution is **models.dev**, an open-source database of AI models.
The idea is simple: transform the models into structured objects that can be used by tools.
We are no longer just talking about a “model name”.
We now speak of:
- modalities;
- reasoning flags;
- pricing;
- capabilities;
- context windows;
- providers;
- runtime compatibility.
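To make this concrete, such a catalog entry could be modeled as a small value object. This is only a sketch: the field names are illustrative assumptions, not the actual models.dev schema.

```php
<?php

// Illustrative value object for a catalog entry.
// Field names are assumptions, not the real models.dev schema.
final readonly class ModelDescriptor
{
    /** @param list<string> $modalities e.g. ["text", "vision"] */
    public function __construct(
        public string $name,
        public array $modalities,
        public bool $supportsReasoning,
        public int $contextWindow,
        public string $provider,
    ) {}

    /** Build a descriptor from a decoded catalog entry, with safe defaults. */
    public static function fromArray(array $data): self
    {
        return new self(
            $data['name'],
            $data['modalities'] ?? ['text'],
            $data['reasoning'] ?? false,
            $data['context_window'] ?? 0,
            $data['provider'] ?? 'unknown',
        );
    }
}
```

Once models are plain objects like this, tools can filter, sort, and compare them instead of juggling bare model names.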
This is very similar to what Docker Hub represented for containers.
Models: Exploring models as software artifacts
Around this idea, exploration interfaces are also emerging, such as **Models**, an AI model explorer built on top of models.dev.
The topic then becomes being able to:
- filter;
- compare;
- explore;
- understand the actual capabilities of the models.
Exactly as we started to do with:
- containers;
- packages;
- cloud runtimes;
- OCI images.
Docker ecosystem ↔ AI runtime ecosystem
The analogy with Docker is becoming increasingly relevant.
| Docker ecosystem | AI runtime ecosystem |
| ---------------- | -------------------------- |
| Docker Hub | models.dev |
| Images | GGUF / model weights |
| dockerd | Ollama / Kronk / vLLM |
| Kubernetes | orchestration agents |
| Observability | inference tracing |
| OCI runtime | runtime abstraction layers |
The subject is therefore no longer simply:
“running an LLM.”
It becomes:
“governing a model infrastructure.”
Ollama: the local HTTP runtime
The runtime that probably popularized this approach is Ollama, which exposes models as local HTTP services.
A few endpoints are sufficient:
- `/api/tags`
- `/api/generate`
- `/api/chat`
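Each of these endpoints speaks plain JSON over HTTP. As a sketch, here is how a non-streaming `/api/generate` request body can be built and a response decoded in PHP; the response fields shown (`response`, `done`, `eval_count`) follow Ollama's documented API, but treat them as assumptions if your version differs.

```php
<?php

// Build a non-streaming request body for Ollama's /api/generate endpoint.
$payload = json_encode([
    'model'  => 'llama3',                 // any locally installed model tag
    'prompt' => 'Say hello in one word.',
    'stream' => false,                    // ask for a single JSON response
], JSON_THROW_ON_ERROR);

// A representative (abridged) response body from Ollama:
$responseBody = '{"model":"llama3","response":"Hello","done":true,"eval_count":2}';

$result = json_decode($responseBody, true, flags: JSON_THROW_ON_ERROR);
echo $result['response']; // the generated text
```

POSTing `$payload` to `http://localhost:11434/api/generate` with any HTTP client is all it takes to drive a local model.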
And immediately:
- any application;
- any language;
- any orchestrator;
can begin to interact with local models.
This simplicity is extremely powerful.
NoLife Models
It is precisely this observation that motivated the creation of NoLife Models.
NoLife Models is a local application built with:
- Symfony 8
- Symfony UX
- Twig
- Live Components
- Turbo
- HttpClient
The project makes it possible to:
- explore a catalog of models;
- list the installed Ollama models;
- compare several models;
- launch benchmarks;
- export the results.
A runtime-oriented architecture
The central point of the project is probably this abstraction:
```php
interface LocalModelRuntimeInterface
{
    /** @return list<LocalModel> */
    public function listLocalModels(): array;

    public function generate(
        GeneratePromptCommand $command,
    ): ModelInferenceResult;
}
```
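As an illustration, an Ollama-backed adapter for this port might look roughly like the sketch below. The class and method names are hypothetical, not the project's actual code, and the HTTP call uses plain PHP streams to stay self-contained.

```php
<?php

// Hypothetical Ollama adapter sketch for the runtime port.
// Only parseTags() is pure; listLocalModels() needs a running Ollama.
final class OllamaRuntime
{
    public function __construct(
        private string $baseUrl = 'http://localhost:11434',
    ) {}

    /** @return list<string> model tags extracted from an /api/tags JSON body */
    public static function parseTags(string $json): array
    {
        $data = json_decode($json, true, flags: JSON_THROW_ON_ERROR);

        return array_map(
            static fn (array $model): string => $model['name'],
            $data['models'] ?? [],
        );
    }

    /** @return list<string> */
    public function listLocalModels(): array
    {
        return self::parseTags((string) file_get_contents($this->baseUrl.'/api/tags'));
    }
}
```

In the real project the adapter would return `LocalModel` objects and implement `generate()` as well; the point is that the domain only ever sees the interface, never the runtime behind it.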
This decision completely transforms the architecture.
The domain no longer depends on:
- Ollama;
- OpenAI;
- any specific provider.
The domain depends solely on an inference contract.
And this immediately opens the way to:
- LM Studio;
- vLLM;
- OpenAI-compatible runtimes;
- embedded runtimes;
- future adapters.
Symfony 8 + Hexagonal Architecture
The project follows a DDD/hexagonal approach.
UserInterface
↓
Application
↓
Domain ports
↓
Infrastructure adapters
Responsibilities are separate:
| Layer | Role |
| -------------- | ------------------------ |
| Domain | contracts + models |
| Application | orchestration |
| Infrastructure | Ollama adapter, exports |
| UserInterface | Symfony UX + controllers |
The runtime then becomes a simple interface implementation.
Compare models locally
One of the most interesting elements of the project is the comparison engine.
Same prompt. Same runtime surface. Same configuration.
But several models.
This makes it possible to compare:
- latency;
- loading time;
- generation speed;
- reasoning;
- quality of responses;
- hallucinations;
- OCR;
- vision.
Most importantly:
Mathematical benchmarks alone are not sufficient.
Quality remains a matter of human interpretation.
Benchmarks and reproducibility
The project also makes it possible to launch benchmark suites.
The idea:
- execute several prompts;
- on several models;
- with controlled parameters;
- and produce structured results.
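The loop described above can be sketched as follows. The `$runFn` callable stands in for the real runtime call, and the row shape is an assumption for illustration:

```php
<?php

// Sketch of a benchmark loop: several prompts x several models,
// timed with hrtime(), producing structured result rows.
function runBenchmark(array $models, array $prompts, callable $runFn): array
{
    $rows = [];
    foreach ($models as $model) {
        foreach ($prompts as $prompt) {
            $start = hrtime(true);              // nanosecond monotonic clock
            $output = $runFn($model, $prompt);  // the actual inference call
            $elapsedMs = (hrtime(true) - $start) / 1e6;

            $rows[] = [
                'model'      => $model,
                'prompt'     => $prompt,
                'output'     => $output,
                'latency_ms' => round($elapsedMs, 2),
            ];
        }
    }

    return $rows;
}
```

Because the rows are plain structured data, they feed directly into the export layer described below.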
This gradually brings the project closer to:
- an evaluation system;
- a runtime observatory;
- an observability layer.
Exports and Governance
Exports play a very important role:
- JSON
- CSV
- Markdown
Because an inference without artifacts becomes difficult to audit.
Exports make it possible to:
- trace;
- reproduce;
- compare;
- archive;
- analyze.
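A minimal export helper can be sketched under the assumption that benchmark results are plain associative arrays (the `model` / `latency_ms` row shape is illustrative):

```php
<?php

// Sketch: turn benchmark rows into CSV and Markdown exports.
// The row shape (model / latency_ms) is an illustrative assumption.
function exportCsv(array $rows): string
{
    if ($rows === []) {
        return '';
    }
    $lines = [implode(',', array_keys($rows[0]))]; // header row
    foreach ($rows as $row) {
        $lines[] = implode(',', array_map('strval', $row));
    }

    return implode("\n", $lines);
}

function exportMarkdown(array $rows): string
{
    if ($rows === []) {
        return '';
    }
    $headers = array_keys($rows[0]);
    $out = '| '.implode(' | ', $headers)." |\n";
    $out .= '|'.str_repeat(' --- |', count($headers))."\n";
    foreach ($rows as $row) {
        $out .= '| '.implode(' | ', array_map('strval', $row))." |\n";
    }

    return $out;
}

// The JSON export is simply json_encode($rows, JSON_PRETTY_PRINT).
```

Each export is a reproducible artifact: the same rows always produce the same file, which is exactly what makes inference runs auditable.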
And it is precisely at this point that we move beyond simple “prompt engineering”.
Kronk: Another direction
Another particularly interesting project is Kronk.
Kronk should not be seen as:
“a better Ollama”.
The philosophy is different.
Ollama exposes models as local HTTP services.
Kronk pushes inference directly into the application process.
The inference then becomes:
- embedded;
- programmable;
- integrated into the application runtime.
With:
- GGUF;
- llama.cpp;
- Yzma;
- streaming;
- OpenAI compatible APIs.
The model is gradually ceasing to be a simple HTTP endpoint.
It becomes:
a software dependency.
Towards a local AI infrastructure
The most interesting thing about this evolution is probably that we are seeing the same patterns reappear as in the cloud:
- catalogues;
- runtimes;
- observability;
- orchestration;
- policies;
- exports;
- traces;
- governance.
But this time applied to local inference.
Conclusion
NoLife Models is not designed as:
- a chatbot;
- an OpenAI wrapper;
- a simple Ollama UI.
The project explores a broader question:
What does a local AI runtime infrastructure look like?
With:
- catalogues;
- runtimes;
- benchmarks;
- exports;
- observability;
- runtime abstractions.
We are probably still at the beginning of this ecosystem.
But the primitives are already starting to appear.
And that becomes extremely interesting to observe.
GitHub Repositories
- 🐙 An open-source database of AI models: https://github.com/anomalyco/models.dev
- ☺️ Your personal engine for running open source models locally: https://github.com/ardanlabs/kronk
Sources
- 🚀 I created an app to compare your local LLMs with Ollama: https://www.youtube.com/watch?v=YzxE3jQqItI
- 🧩 Extract Insights from Videos with Docling + OpenRAG: https://www.youtube.com/watch?v=Y0b1TANWZ-Y
- 🤯 AI Model explorer based on models.dev: https://github.com/dgageot/modles
- 😮 Baby steps with Kronk: https://k33g.hashnode.dev/baby-steps-with-kronk-1
- 😋 How to cook a little coding agent with Docker Model Runner and Docker Agent (and sbx): https://k33g.org/20260419-little-coder-agent.html
- 😍 fabpot Activity: https://github.com/symfony/models-dev/commits?author=fabpot
🔗 Links of the week
- Symfony Level Up #9 Sylvain Blondeau: https://symfonylevelup.substack.com/p/symfony-level-up-9
- US giants are pushing the boundaries even further (too far?) and China is leading the way: https://www.youtube.com/watch?v=L4LCSXvA7LU
- Oussama: For massive job cuts! https://www.youtube.com/watch?v=GLfPVWRns-U
- From €0 to €10,000/month with AI: the exact method I wish I had: https://www.youtube.com/watch?v=sRtQmFEhlBE
- Fouloscopie: How to discuss effectively? https://www.youtube.com/watch?v=8J1opDS1otY
- MACI #158 - Discover CKE, our managed Kubernetes - With Antoine Blondeau and Gilles Biannic: https://www.youtube.com/watch?v=FtAF5kN_8pY
- Github Open Source Friday with Spec-Kit: https://www.youtube.com/watch?v=2IArMAhkJcE
- Generate Images Locally with Docker Model Runner and Open WebUI: https://www.docker.com/blog/blog-generate-images-locally-dmr-open-webui/
- Digital Defence Commission - DEF'LAN 2026 | LIVE: https://www.youtube.com/watch?v=OW4VCl6P-l4
- Why TTS Models Now Look Like LLMs — Samuel Humeau, Mistral: https://www.youtube.com/watch?v=3jGAU2sbAyY
- Give Your Chat Agent a Voice — Luke Harries, ElevenLabs: https://www.youtube.com/watch?v=DCZZ3AJKzuc
- Voice AI: when is the “Her” moment? — Neil Zeghidour, Gradium AI: https://youtu.be/P_RI1kCkRbo?is=w2jQToL-6ua941SI
- Context Is the New Code — Patrick Debois, Tessl: https://www.youtube.com/watch?v=bSG9wUYaHWU
- Here's one of the engineers explaining how they use LLMs to generate $30B+ every year: https://x.com/thejayden/status/2052847766754250815?s=46
- Why even Apple's legendary logistics can't withstand the RAMpocalypse | OctogoneTech #8: https://www.youtube.com/watch?v=gjYbOViRy_k
- Can France still create tech giants? (With Carlos Diaz): https://www.youtube.com/watch?v=74TpWDkYpdE
- Suraj vs The Future | With ChatGPT: https://www.youtube.com/watch?v=bMmEEa8-6fU
- The 3 Most Important Claude Features Beginners Don't Know About: https://www.youtube.com/watch?v=tkpdPvx65A0
- How to Improve Video Streaming in Next.js - Adaptive Bitrate Streaming Tutorial | ImageKit: https://www.youtube.com/watch?v=MKbdkWfVZ1w
- Skills for AI agents specializing in French bureaucracy: https://github.com/romainsimon/paperasse
- Building a Chess Coach — Anant Dole and Asbjorn Steinskog, Take Take Take: https://www.youtube.com/watch?v=FlzpEGHNVKQ
- Become a no-code & AI Product Builder with Uncode School: https://www.youtube.com/watch?v=8Ikwj_SNSNI
- Anthropic just buried generalist AI (and nobody saw it coming): https://www.youtube.com/watch?v=qqhQDBClm1Y
- Your Agent Can Now Train Models — Merve Noyan, Hugging Face: https://www.youtube.com/watch?v=OV56RddyFuU
🎶 Music credit
- A little footwork from New York New Jersey. ⚽ #FIFAWorldCup: https://vm.tiktok.com/ZNRGDjFGx/