
⚙️ NoLife Models - Towards a local infrastructure for AI runtimes with Symfony

on May 13, 2026

For years, using an AI model meant calling a remote API.

The workflow was relatively simple:

  • send a prompt;
  • wait for a response;
  • display text.

But in recent months, a new ecosystem has been emerging around local models.

An ecosystem composed of:

  • model catalogues;
  • local runtimes;
  • benchmark systems;
  • structured exports;
  • observability;
  • governance;
  • orchestration.

And gradually, AI is starting to look more like a software infrastructure than a simple chatbot.

It is in this context that NoLife Models was born:

  • NoLife Models GitHub
  • SlideWire presentation repository

A Symfony 8 project designed to explore this new layer of local infrastructure around AI models.

The problem: local models are multiplying

Today, there is a gigantic number of models:

  • Qwen
  • Llama
  • Granite
  • Gemma
  • Mistral
  • Phi
  • DeepSeek
  • etc.

And each one possesses:

  • different sizes;
  • different quantizations;
  • different capabilities;
  • different context windows;
  • different behaviors.

The problem is therefore no longer:

“How do we use a model?”

But rather:

“Which model to use, in what context, on what runtime, with what performance?”

models.dev: towards a standardized model catalog

One particularly interesting project in this evolution is:

  • models.dev

The idea is simple: transform the models into structured objects that can be used by tools.

We are no longer just talking about a “model name”.

We now speak of:

  • modalities;
  • reasoning flags;
  • pricing;
  • capabilities;
  • context windows;
  • providers;
  • runtime compatibility.

This is very similar to what Docker Hub represented for containers.

Models: Exploring models as software artifacts

Around this idea, exploration interfaces are also emerging, such as:

  • models

The topic then becomes:

  • filter;
  • compare;
  • explore;
  • understand the actual capabilities of the models.

Exactly as we started to do with:

  • containers;
  • packages;
  • cloud runtimes;
  • OCI images.

Docker ecosystem ↔ AI runtime ecosystem

The analogy with Docker is becoming increasingly relevant.

| Docker ecosystem | AI runtime ecosystem       |
| ---------------- | -------------------------- |
| Docker Hub       | models.dev                 |
| Images           | GGUF / model weights       |
| dockerd          | Ollama / Kronk / vLLM      |
| Kubernetes       | orchestration agents       |
| Observability    | inference tracing          |
| OCI runtime      | runtime abstraction layers |

The subject is therefore no longer simply:

“to run an LLM”.

The topic becomes:

“governing a model infrastructure.”

Ollama: the local HTTP runtime

The runtime that probably popularized this approach is:

  • Ollama

Ollama exposes models as local HTTP services.

A few endpoints are sufficient:

/api/tags
/api/generate
/api/chat

And immediately:

  • any application;
  • any language;
  • any orchestrator;

can begin to interact with local models.

This simplicity is extremely powerful.
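That simplicity can be sketched in a few lines. Here is a minimal, hypothetical helper that builds the JSON body for a non-streaming /api/generate call — the field names (model, prompt, stream) are Ollama's documented ones, but the function itself is only an illustration, not code from the project:

```php
<?php

// Build the JSON body for a non-streaming /api/generate request.
// The keys (model, prompt, stream) are Ollama's documented fields;
// the helper name itself is only illustrative.
function buildGeneratePayload(string $model, string $prompt): string
{
    return json_encode([
        'model'  => $model,
        'prompt' => $prompt,
        'stream' => false,
    ], JSON_THROW_ON_ERROR);
}

// Posting it is then a plain HTTP call against the local runtime,
// e.g. with Symfony HttpClient or file_get_contents on
// http://localhost:11434/api/generate.
```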

NoLife Models

It is precisely this observation that motivated the creation of:

  • NoLife Models

NoLife Models is a local application built with:

  • Symfony 8
  • Symfony UX
  • Twig
  • Live Components
  • Turbo
  • HttpClient

The project enables:

  • explore a catalog of models;
  • list the installed Ollama models;
  • compare several models;
  • run benchmarks;
  • export the results.

A runtime-oriented architecture

The central point of the project is probably this abstraction:

interface LocalModelRuntimeInterface
{
    /** @return list<LocalModel> */
    public function listLocalModels(): array;

    public function generate(
        GeneratePromptCommand $command
    ): ModelInferenceResult;
}

This decision completely transforms the architecture.

The domain no longer depends on:

  • Ollama;
  • OpenAI;
  • a specific provider.

The domain depends solely on:

  • an inference contract.

And this immediately opens the way to:

  • LM Studio;
  • vLLM;
  • OpenAI-compatible runtimes;
  • embedded runtimes;
  • future adapters.
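For Ollama, the adapter side of this contract mostly consists of translating HTTP responses into domain objects. Below is a minimal sketch of the /api/tags mapping, assuming a deliberately simplified LocalModel — the project's real value object certainly carries more fields, and the mapping is kept separate from the HTTP call so it stays easy to unit-test:

```php
<?php

// Simplified domain object, assumed shape: the real LocalModel in the
// project likely carries more fields (quantization, family, digest, ...).
final class LocalModel
{
    public function __construct(
        public readonly string $name,
        public readonly int $sizeBytes,
    ) {}
}

// Pure mapping from Ollama's /api/tags JSON body to domain objects.
// "name" and "size" are fields Ollama actually returns for each model.
/** @return list<LocalModel> */
function mapTagsResponse(string $json): array
{
    $data = json_decode($json, true, 512, JSON_THROW_ON_ERROR);

    return array_map(
        fn (array $m) => new LocalModel($m['name'], $m['size']),
        $data['models'] ?? [],
    );
}
```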

Symfony 8 + Hexagonal Architecture

The project follows a DDD/hexagonal approach.

UserInterface
    ↓
Application
    ↓
Domain ports
    ↓
Infrastructure adapters

Responsibilities are cleanly separated:

| Layer          | Role                     |
| -------------- | ------------------------ |
| Domain         | contracts + models       |
| Application    | orchestration            |
| Infrastructure | Ollama adapter, exports  |
| UserInterface  | Symfony UX + controllers |

The runtime then becomes a simple interface implementation.

Compare models locally

One of the most interesting elements of the project is the comparison engine.

Same prompt. Same runtime surface. Same configuration.

But several models.

This allows for comparison:

  • latency;
  • loading time;
  • generation speed;
  • reasoning;
  • quality of responses;
  • hallucinations;
  • OCR;
  • vision.

Most importantly:

Mathematical benchmarks alone are not sufficient.

Quality remains a matter of human interpretation.
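The quantitative side, at least, can be computed mechanically: Ollama's generate responses include eval_count (the number of generated tokens) and eval_duration (in nanoseconds), from which a tokens-per-second figure follows directly. A small helper, with an illustrative name:

```php
<?php

// Tokens per second from Ollama's generate response fields:
// eval_count is the number of generated tokens, eval_duration
// is the generation time in nanoseconds.
function tokensPerSecond(int $evalCount, int $evalDurationNs): float
{
    if ($evalDurationNs <= 0) {
        return 0.0;
    }

    return $evalCount / ($evalDurationNs / 1_000_000_000);
}
```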

Benchmarks and reproducibility

The project also allows for the launching of benchmark suites.

The idea:

  • execute several prompts;
  • on several models;
  • with controlled parameters;
  • and produce structured results.
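The steps above can be sketched as a plain loop, with the runtime reduced to a callable for the sake of the example (the real project goes through LocalModelRuntimeInterface and richer result objects):

```php
<?php

// Run every prompt against every model and collect structured rows.
// $generate stands in for the runtime port; it returns the raw
// output for one (model, prompt) pair.
/** @return list<array{model: string, prompt: string, output: string}> */
function runBenchmark(array $models, array $prompts, callable $generate): array
{
    $rows = [];
    foreach ($models as $model) {
        foreach ($prompts as $prompt) {
            $rows[] = [
                'model'  => $model,
                'prompt' => $prompt,
                'output' => $generate($model, $prompt),
            ];
        }
    }

    return $rows;
}
```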

This gradually brings the project closer to:

  • an evaluation system;
  • a runtime observatory;
  • an observability layer.

Exports and Governance

Exports play a very important role:

  • JSON
  • CSV
  • Markdown

Because an inference without artifacts becomes difficult to audit.

Exports allow:

  • trace;
  • reproduce;
  • compare;
  • archive;
  • analyze.
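With structured rows in hand, the exports become almost mechanical. Here is a sketch of the Markdown case — the column names are illustrative, not the project's actual format:

```php
<?php

// Render benchmark rows as a Markdown table, one row per result.
/** @param list<array{model: string, tokensPerSec: float}> $rows */
function toMarkdown(array $rows): string
{
    $lines = ['| Model | Tokens/s |', '| ----- | -------- |'];
    foreach ($rows as $row) {
        $lines[] = sprintf('| %s | %.1f |', $row['model'], $row['tokensPerSec']);
    }

    return implode("\n", $lines) . "\n";
}
```

The JSON and CSV variants are the same idea with json_encode and fputcsv respectively.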

And it is precisely at this point that we move beyond simple “prompt engineering”.

Kronk: Another direction

Another particularly interesting project is:

  • Kronk

Kronk should not be seen as:

“a better Ollama”.

The philosophy is different.

Ollama exposes models as local HTTP services.

Kronk pushes inference directly into the application process.

The inference then becomes:

  • embedded;
  • programmable;
  • integrated into the application runtime.

With:

  • GGUF;
  • llama.cpp;
  • Yzma;
  • streaming;
  • OpenAI compatible APIs.

The model is gradually ceasing to be a simple HTTP endpoint.

It becomes:

a software dependency.

Towards a local AI infrastructure

The most interesting thing about this evolution is probably that we are seeing the same patterns reappear as in the cloud:

  • catalogues;
  • runtimes;
  • observability;
  • orchestration;
  • policies;
  • exports;
  • traces;
  • governance.

But this time applied to local inference.

Conclusion

NoLife Models is not designed as:

  • a chatbot;
  • an OpenAI wrapper;
  • a simple Ollama UI.

The project explores a broader question:

What does a local AI runtime infrastructure look like?

With:

  • catalogues;
  • runtimes;
  • benchmarks;
  • exports;
  • observability;
  • runtime abstractions.

We are probably still at the beginning of this ecosystem.

But the primitives are already starting to appear.

And that becomes extremely interesting to observe.

GitHub Repositories

  • 🐙 An open-source database of AI models: https://github.com/anomalyco/models.dev
  • ☺️ Your personal engine for running open source models locally: https://github.com/ardanlabs/kronk

Sources

  • 🚀 I created an app to compare your local LLMs with Ollama: https://www.youtube.com/watch?v=YzxE3jQqItI
  • 🧩 Extract Insights from Videos with Docling + OpenRAG: https://www.youtube.com/watch?v=Y0b1TANWZ-Y
  • 🤯 AI Model explorer based on models.dev: https://github.com/dgageot/modles
  • 😮 Baby steps with Kronk: https://k33g.hashnode.dev/baby-steps-with-kronk-1
  • 😋 How to cook a little coding agent with Docker Model Runner and Docker Agent (and sbx): https://k33g.org/20260419-little-coder-agent.html
  • 😍 fabpot activity: https://github.com/symfony/models-dev/commits?author=fabpot

🔗 Links of the week

  • Symfony Level Up #9 Sylvain Blondeau: https://symfonylevelup.substack.com/p/symfony-level-up-9
  • US giants are pushing the boundaries even further (too far?) and China is leading the way: https://www.youtube.com/watch?v=L4LCSXvA7LU
  • Oussama: For massive job cuts! https://www.youtube.com/watch?v=GLfPVWRns-U
  • From €0 to €10,000/month with AI: the exact method I wish I had: https://www.youtube.com/watch?v=sRtQmFEhlBE
  • Fouloscopie: How to discuss effectively? https://www.youtube.com/watch?v=8J1opDS1otY
  • MACI #158 - Discover CKE, our managed Kubernetes - With Antoine Blondeau and Gilles Biannic: https://www.youtube.com/watch?v=FtAF5kN_8pY
  • Github Open Source Friday with Spec-Kit: https://www.youtube.com/watch?v=2IArMAhkJcE
  • Generate Images Locally with Docker Model Runner and Open WebUI https://www.docker.com/blog/blog-generate-images-locally-dmr-open-webui/
  • Digital Defence Commission - DEF'LAN 2026 | LIVE: https://www.youtube.com/watch?v=OW4VCl6P-l4
  • Why TTS Models Now Look Like LLMs — Samuel Humeau, Mistral: https://www.youtube.com/watch?v=3jGAU2sbAyY
  • Give Your Chat Agent a Voice — Luke Harries, ElevenLabs: https://www.youtube.com/watch?v=DCZZ3AJKzuc
  • Voice AI: when is the “Her” moment? — Neil Zeghidour, Gradium AI: https://youtu.be/P_RI1kCkRbo?is=w2jQToL-6ua941SI
  • Context Is the New Code — Patrick Debois, Tessl: https://www.youtube.com/watch?v=bSG9wUYaHWU
  • Here's one of the engineers explaining how they use LLMs to generate $30B+ every year: https://x.com/thejayden/status/2052847766754250815?s=46
  • Why even Apple's legendary logistics can't withstand the RAMpocalypse | OctogoneTech #8: https://www.youtube.com/watch?v=gjYbOViRy_k
  • Can France still create tech giants? (With Carlos Diaz): https://www.youtube.com/watch?v=74TpWDkYpdE
  • Suraj vs The Future | With ChatGPT: https://www.youtube.com/watch?v=bMmEEa8-6fU
  • The 3 Most Important Claude Features Beginners Don't Know About: https://www.youtube.com/watch?v=tkpdPvx65A0
  • How to Improve Video Streaming in Next.js - Adaptive Bitrate Streaming Tutorial | ImageKit: https://www.youtube.com/watch?v=MKbdkWfVZ1w
  • Skills for AI agents specializing in French bureaucracy: https://github.com/romainsimon/paperasse
  • Building a Chess Coach — Anant Dole and Asbjorn Steinskog, Take Take Take: https://www.youtube.com/watch?v=FlzpEGHNVKQ
  • Become a no-code & AI Product Builder with Uncode School: https://www.youtube.com/watch?v=8Ikwj_SNSNI
  • Anthropic just buried generalist AI (and nobody saw it coming): https://www.youtube.com/watch?v=qqhQDBClm1Y
  • Your Agent Can Now Train Models — Merve Noyan, Hugging Face: https://www.youtube.com/watch?v=OV56RddyFuU

🎶 Music credit

  • A little footwork from New York New Jersey. ⚽ #FIFAWorldCup: https://vm.tiktok.com/ZNRGDjFGx/
