
⚙️ NoLife Models - Towards a local infrastructure for AI runtimes with Symfony

on May 13, 2026

For years, using an AI model meant calling a remote API.

The workflow was relatively simple:

  • send a prompt;
  • wait for a response;
  • display text.

But in recent months, a new ecosystem has been emerging around local models.

An ecosystem composed of:

  • model catalogues;
  • local runtimes;
  • benchmark systems;
  • structured exports;
  • observability;
  • governance;
  • orchestration.

And gradually, AI is starting to look more like a software infrastructure than a simple chatbot.

It is in this context that NoLife Models was born:

  • NoLife Models GitHub
  • SlideWire presentation repository

A Symfony 8 project designed to explore this new layer of local infrastructure around AI models.

The problem: local models are multiplying

Today, there is a gigantic number of models:

  • Qwen
  • Llama
  • Granite
  • Gemma
  • Mistral
  • Phi
  • DeepSeek
  • etc.

And each one possesses:

  • different sizes;
  • different quantizations;
  • different capabilities;
  • different context windows;
  • different behaviors.

The problem is therefore no longer:

“How do we use a model?”

But rather:

“Which model to use, in what context, on what runtime, with what performance?”

models.dev: towards a standardized model catalog

One particularly interesting project in this evolution is:

  • models.dev

The idea is simple: transform the models into structured objects that can be used by tools.

We are no longer just talking about a “model name”.

We now speak of:

  • modalities;
  • reasoning flags;
  • pricing;
  • capabilities;
  • context windows;
  • providers;
  • runtime compatibility.

This is very similar to what Docker Hub represented for containers.

Models: Exploring models as software artifacts

Around this idea, exploration interfaces are also emerging, such as:

  • models

The topic then becomes:

  • filter;
  • compare;
  • explore;
  • understand the actual capabilities of the models.

Exactly as we started to do with:

  • containers;
  • packages;
  • cloud runtimes;
  • OCI images.

Docker ecosystem ↔ AI runtime ecosystem

The analogy with Docker is becoming increasingly relevant.

| Docker ecosystem | AI runtime ecosystem       |
| ---------------- | -------------------------- |
| Docker Hub       | models.dev                 |
| Images           | GGUF / model weights       |
| dockerd          | Ollama / Kronk / vLLM      |
| Kubernetes       | orchestration agents       |
| Observability    | inference tracing          |
| OCI runtime      | runtime abstraction layers |

The subject is therefore no longer simply:

“to run an LLM”.

The topic becomes:

“governing a model infrastructure.”

Ollama: the local HTTP runtime

The runtime that probably popularized this approach is:

  • Ollama

Ollama exposes models as local HTTP services.

A few endpoints are sufficient:

/api/tags
/api/generate
/api/chat

And immediately:

  • any application;
  • any language;
  • any orchestrator;

can begin to interact with local models.

This simplicity is extremely powerful.
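That simplicity can be sketched in a few lines. Here is a minimal, hypothetical helper that builds the JSON body for a non-streaming /api/generate call — the field names (model, prompt, stream) are Ollama's documented ones, but the function itself is only an illustration, not code from the project:

```php
<?php

// Build the JSON body for a non-streaming /api/generate request.
// The keys (model, prompt, stream) are Ollama's documented fields;
// the helper name itself is only illustrative.
function buildGeneratePayload(string $model, string $prompt): string
{
    return json_encode([
        'model'  => $model,
        'prompt' => $prompt,
        'stream' => false,
    ], JSON_THROW_ON_ERROR);
}

// Posting it is then a plain HTTP call against the local runtime,
// e.g. with Symfony HttpClient or file_get_contents on
// http://localhost:11434/api/generate.
```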

NoLife Models

It is precisely this observation that motivated the creation of:

  • NoLife Models

NoLife Models is a local application built with:

  • Symfony 8
  • Symfony UX
  • Twig
  • Live Components
  • Turbo
  • HttpClient

The project enables:

  • explore a catalog of models;
  • list the installed Ollama models;
  • compare several models;
  • run benchmarks;
  • export the results.

A runtime-oriented architecture

The central point of the project is probably this abstraction:

interface LocalModelRuntimeInterface
{
    /** @return list<LocalModel> */
    public function listLocalModels(): array;

    public function generate(
        GeneratePromptCommand $command
    ): ModelInferenceResult;
}

This decision completely transforms the architecture.

The domain no longer depends on:

  • Ollama;
  • OpenAI;
  • a specific provider.

The domain depends solely on:

  • an inference contract.

And this immediately opens the way to:

  • LM Studio;
  • vLLM;
  • OpenAI-compatible runtimes;
  • embedded runtimes;
  • future adapters.
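For Ollama, the adapter side of this contract mostly consists of translating HTTP responses into domain objects. Below is a minimal sketch of the /api/tags mapping, assuming a deliberately simplified LocalModel — the project's real value object certainly carries more fields, and the mapping is kept separate from the HTTP call so it stays easy to unit-test:

```php
<?php

// Simplified domain object, assumed shape: the real LocalModel in the
// project likely carries more fields (quantization, family, digest, ...).
final class LocalModel
{
    public function __construct(
        public readonly string $name,
        public readonly int $sizeBytes,
    ) {}
}

// Pure mapping from Ollama's /api/tags JSON body to domain objects.
// "name" and "size" are fields Ollama actually returns for each model.
/** @return list<LocalModel> */
function mapTagsResponse(string $json): array
{
    $data = json_decode($json, true, 512, JSON_THROW_ON_ERROR);

    return array_map(
        fn (array $m) => new LocalModel($m['name'], $m['size']),
        $data['models'] ?? [],
    );
}
```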

Symfony 8 + Hexagonal Architecture

The project follows a DDD/hexagonal approach.

UserInterface
    ↓
Application
    ↓
Domain ports
    ↓
Infrastructure adapters

Responsibilities are cleanly separated:

| Layer          | Role                     |
| -------------- | ------------------------ |
| Domain         | contracts + models       |
| Application    | orchestration            |
| Infrastructure | Ollama adapter, exports  |
| UserInterface  | Symfony UX + controllers |

The runtime then becomes a simple interface implementation.

Compare models locally

One of the most interesting elements of the project is the comparison engine.

Same prompt. Same runtime surface. Same configuration.

But several models.

This allows for comparison:

  • latency;
  • loading time;
  • generation speed;
  • reasoning;
  • quality of responses;
  • hallucinations;
  • OCR;
  • vision.

Most importantly:

Mathematical benchmarks alone are not sufficient.

Quality remains a matter of human interpretation.
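The quantitative side, at least, can be computed mechanically: Ollama's generate responses include eval_count (the number of generated tokens) and eval_duration (in nanoseconds), from which a tokens-per-second figure follows directly. A small helper, with an illustrative name:

```php
<?php

// Tokens per second from Ollama's generate response fields:
// eval_count is the number of generated tokens, eval_duration
// is the generation time in nanoseconds.
function tokensPerSecond(int $evalCount, int $evalDurationNs): float
{
    if ($evalDurationNs <= 0) {
        return 0.0;
    }

    return $evalCount / ($evalDurationNs / 1_000_000_000);
}
```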

Benchmarks and reproducibility

The project also allows for the launching of benchmark suites.

The idea:

  • execute several prompts;
  • on several models;
  • with controlled parameters;
  • and produce structured results.
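The steps above can be sketched as a plain loop, with the runtime reduced to a callable for the sake of the example (the real project goes through LocalModelRuntimeInterface and richer result objects):

```php
<?php

// Run every prompt against every model and collect structured rows.
// $generate stands in for the runtime port; it returns the raw
// output for one (model, prompt) pair.
/** @return list<array{model: string, prompt: string, output: string}> */
function runBenchmark(array $models, array $prompts, callable $generate): array
{
    $rows = [];
    foreach ($models as $model) {
        foreach ($prompts as $prompt) {
            $rows[] = [
                'model'  => $model,
                'prompt' => $prompt,
                'output' => $generate($model, $prompt),
            ];
        }
    }

    return $rows;
}
```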

This gradually brings the project closer to:

  • an evaluation system;
  • a runtime observatory;
  • an observability layer.

Exports and Governance

Exports play a very important role:

  • JSON
  • CSV
  • Markdown

Because an inference without artifacts becomes difficult to audit.

Exports allow:

  • trace;
  • reproduce;
  • compare;
  • archive;
  • analyze.
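With structured rows in hand, the exports become almost mechanical. Here is a sketch of the Markdown case — the column names are illustrative, not the project's actual format:

```php
<?php

// Render benchmark rows as a Markdown table, one row per result.
/** @param list<array{model: string, tokensPerSec: float}> $rows */
function toMarkdown(array $rows): string
{
    $lines = ['| Model | Tokens/s |', '| ----- | -------- |'];
    foreach ($rows as $row) {
        $lines[] = sprintf('| %s | %.1f |', $row['model'], $row['tokensPerSec']);
    }

    return implode("\n", $lines) . "\n";
}
```

The JSON and CSV variants are the same idea with json_encode and fputcsv respectively.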

And it is precisely at this point that we move beyond simple “prompt engineering”.

Kronk: Another direction

Another particularly interesting project is:

  • Kronk

Kronk should not be seen as:

“a better Ollama”.

The philosophy is different.

Ollama exposes models as local HTTP services.

Kronk pushes inference directly into the application process.

The inference then becomes:

  • embedded;
  • programmable;
  • integrated into the application runtime.

With:

  • GGUF;
  • llama.cpp;
  • Yzma;
  • streaming;
  • OpenAI compatible APIs.

The model is gradually ceasing to be a simple HTTP endpoint.

It becomes:

a software dependency.

Towards a local AI infrastructure

The most interesting thing about this evolution is probably that we are seeing the same patterns reappear as in the cloud:

  • catalogues;
  • runtimes;
  • observability;
  • orchestration;
  • policies;
  • exports;
  • traces;
  • governance.

But this time applied to local inference.

Conclusion

NoLife Models is not designed as:

  • a chatbot;
  • an OpenAI wrapper;
  • a simple Ollama UI.

The project explores a broader question:

What does a local AI runtime infrastructure look like?

With:

  • catalogues;
  • runtimes;
  • benchmarks;
  • exports;
  • observability;
  • runtime abstractions.

We are probably still at the beginning of this ecosystem.

But the primitives are already starting to appear.

And that becomes extremely interesting to observe.

GitHub Repositories

  • 🐙 An open-source database of AI models: https://github.com/anomalyco/models.dev
  • ☺️ Your personal engine for running open source models locally: https://github.com/ardanlabs/kronk

Sources

  • 🚀 I created an app to compare your local LLMs with Ollama: https://www.youtube.com/watch?v=YzxE3jQqItI
  • 🧩 Extract Insights from Videos with Docling + OpenRAG: https://www.youtube.com/watch?v=Y0b1TANWZ-Y
  • 🤯 AI Model explorer based on models.dev: https://github.com/dgageot/modles
  • 😮 Baby steps with Kronk: https://k33g.hashnode.dev/baby-steps-with-kronk-1
  • 😋 How to cook a little coding agent with Docker Model Runner and Docker Agent (and sbx): https://k33g.org/20260419-little-coder-agent.html
  • 😍 fabpot activity: https://github.com/symfony/models-dev/commits?author=fabpot

🔗 Links of the week

  • Symfony Level Up #9 Sylvain Blondeau: https://symfonylevelup.substack.com/p/symfony-level-up-9
  • US giants are pushing the boundaries even further (too far?) and China is leading the way: https://www.youtube.com/watch?v=L4LCSXvA7LU
  • Oussama: For massive job cuts! https://www.youtube.com/watch?v=GLfPVWRns-U
  • From €0 to €10,000/month with AI: the exact method I wish I had: https://www.youtube.com/watch?v=sRtQmFEhlBE
  • Fouloscopie: How to discuss effectively? https://www.youtube.com/watch?v=8J1opDS1otY
  • MACI #158 - Discover CKE, our managed Kubernetes - With Antoine Blondeau and Gilles Biannic: https://www.youtube.com/watch?v=FtAF5kN_8pY
  • Github Open Source Friday with Spec-Kit: https://www.youtube.com/watch?v=2IArMAhkJcE
  • Generate Images Locally with Docker Model Runner and Open WebUI https://www.docker.com/blog/blog-generate-images-locally-dmr-open-webui/
  • Digital Defence Commission - DEF'LAN 2026 | LIVE: https://www.youtube.com/watch?v=OW4VCl6P-l4
  • Why TTS Models Now Look Like LLMs — Samuel Humeau, Mistral: https://www.youtube.com/watch?v=3jGAU2sbAyY
  • Give Your Chat Agent a Voice — Luke Harries, ElevenLabs: https://www.youtube.com/watch?v=DCZZ3AJKzuc
  • Voice AI: when is the “Her” moment? — Neil Zeghidour, Gradium AI: https://youtu.be/P_RI1kCkRbo?is=w2jQToL-6ua941SI
  • Context Is the New Code — Patrick Debois, Tessl: https://www.youtube.com/watch?v=bSG9wUYaHWU
  • Here's one of the engineers explaining how they use LLMs to generate $30B+ every year: https://x.com/thejayden/status/2052847766754250815?s=46
  • Why even Apple's legendary logistics can't withstand the RAMpocalypse | OctogoneTech #8: https://www.youtube.com/watch?v=gjYbOViRy_k
  • Can France still create tech giants? (With Carlos Diaz): https://www.youtube.com/watch?v=74TpWDkYpdE
  • Suraj vs The Future | With ChatGPT: https://www.youtube.com/watch?v=bMmEEa8-6fU
  • The 3 Most Important Claude Features Beginners Don't Know About: https://www.youtube.com/watch?v=tkpdPvx65A0
  • How to Improve Video Streaming in Next.js - Adaptive Bitrate Streaming Tutorial | ImageKit: https://www.youtube.com/watch?v=MKbdkWfVZ1w
  • Skills for AI agents specializing in French bureaucracy: https://github.com/romainsimon/paperasse
  • Building a Chess Coach — Anant Dole and Asbjorn Steinskog, Take Take Take: https://www.youtube.com/watch?v=FlzpEGHNVKQ
  • Become a no-code & AI Product Builder with Uncode School: https://www.youtube.com/watch?v=8Ikwj_SNSNI
  • Anthropic just buried generalist AI (and nobody saw it coming): https://www.youtube.com/watch?v=qqhQDBClm1Y
  • Your Agent Can Now Train Models — Merve Noyan, Hugging Face: https://www.youtube.com/watch?v=OV56RddyFuU

🎶 Music credit

  • A little footwork from New York New Jersey. ⚽ #FIFAWorldCup: https://vm.tiktok.com/ZNRGDjFGx/
