Multi-LLM providers¶
OSS v0.1 ships two providers: Mistral (EU-sovereign) and Ollama (self-hosted). The plugin surface is a single abstract base class, so alternate providers can be implemented without touching any call site in the router layer.
The interface¶
```python
# backend/app/services/llm/base.py
import abc
from collections.abc import AsyncIterator


class BaseLLMProvider(abc.ABC):
    provider_name: str   # recorded verbatim in audit log
    provider_model: str  # recorded verbatim in audit log
    data_region: str     # "EU" | "US" | "self-hosted"

    @abc.abstractmethod
    async def stream_chat(
        self,
        messages: list[dict[str, str]],
        temperature: float = 0.3,
    ) -> AsyncIterator[str]:
        ...

    async def health_check(self) -> bool:
        return True
```
Three contracts on every implementation:
- `stream_chat` yields content-only tokens as strings. Provider-specific SSE / chunk formats are handled inside the implementation.
- `provider_name` and `provider_model` are recorded in every audit entry — the Art. 53 upstream-GPAI trail in the dossier derives from them.
- `data_region` is declared for the dossier's provider manifest so a deployer can produce an Art. 10 data-flow statement if asked.
Shared OpenAI-compatible streamer¶
Both Mistral and Ollama expose /v1/chat/completions in OpenAI's SSE
format, so both providers inherit from
OpenAICompatibleProvider:
```python
# backend/app/services/llm/openai_compatible.py
class OpenAICompatibleProvider(BaseLLMProvider):
    def __init__(self, *, provider_name, provider_model,
                 data_region, api_base, api_key=None, timeout=120.0):
        ...
```
Writing a new OpenAI-compatible provider is a short subclass:

```python
class LocalAIProvider(OpenAICompatibleProvider):
    def __init__(self, *, api_base, model, api_key=None):
        super().__init__(
            provider_name="localai",
            provider_model=model,
            data_region="self-hosted",
            api_base=api_base,
            api_key=api_key,
        )
```
Provider selection per organisation¶
`organizations` has two columns added in the initial migration:

- `llm_provider TEXT` — `"mistral"`, `"ollama"`, or NULL (env default).
- `llm_model TEXT` — e.g. `"mistral-large-latest"`, `"gemma3:4b"`, or NULL (provider default).
`get_llm_provider(org)` (`backend/app/services/llm/factory.py`) looks up
the per-org settings and falls back to `DEFAULT_LLM_PROVIDER` and the
provider-specific model defaults from config.
Commercial providers (not in OSS)¶
The following live in the closed commercial plugin repo
Lex-Custis/pro-plugins:
- Anthropic Claude — streaming via Anthropic's messages API.
- OpenAI — gpt-4o / gpt-4o-mini, via the stock OpenAI SDK.
- Azure OpenAI — same endpoint shape, deployment-name routing.
- AWS Bedrock — Claude, Llama, Mistral via Bedrock.
- GCP Vertex — Gemini, Claude via Vertex.
These are gated commercial plugins. The ABC and factory are open — writing your own Anthropic plugin against the OSS ABC is legal and encouraged for self-hosters. The commercial repo exists so we can maintain and test them as a service.
Compliance implications per provider¶
| Provider | Data region | GDPR adequacy | Art. 53 disclosure |
|---|---|---|---|
| Mistral (hosted) | EU (France) | Yes (intra-EU) | mistral.ai/legal |
| Ollama (self-hosted) | Whatever you host it on | Depends on your deploy | Upstream model's disclosure — pin it per model |
| Anthropic (commercial) | US | Requires SCCs or DPF | anthropic.com/legal |
| OpenAI (commercial) | US (default); EU residency on Enterprise | Requires SCCs or DPF | openai.com/policies |
The dossier's provider_manifest.json records which provider +
model served each inference in the period, so a regulator can see at
a glance whether EU-sovereign was the rule or the exception.
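As an illustration of what that at-a-glance view could be built from, one way to tally manifest entries per provider/model pair (the entry shape is an assumption based on the audit fields above, not the actual manifest schema):

```python
from collections import Counter


def summarize_manifest(entries: list[dict[str, str]]) -> dict[str, int]:
    """Inference counts per provider/model pair for the reporting period."""
    return dict(Counter(
        f"{e['provider_name']}/{e['provider_model']}" for e in entries
    ))
```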
Adding your own provider¶
- Implement `BaseLLMProvider` (or extend `OpenAICompatibleProvider`).
- Register in `factory.py::_build()`.
- Add provider-specific config keys to `config.py` + `.env.example`.
- Add a unit test covering the streaming token yield.
- Cover the name in the dossier's `_UPSTREAM_DISCLOSURES` map so Art. 53 manifests are complete.
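The streaming unit test in the checklist can be as small as this sketch (the stub provider is invented here so the test needs no network; a real test would exercise your actual class):

```python
import asyncio


class StubProvider:
    """Stand-in provider; yields a fixed token sequence."""
    provider_name = "stub"
    provider_model = "stub-1"
    data_region = "self-hosted"

    async def stream_chat(self, messages, temperature=0.3):
        for tok in ("Hel", "lo"):
            yield tok


def test_stream_chat_yields_tokens() -> None:
    async def collect() -> list[str]:
        return [t async for t in StubProvider().stream_chat(
            [{"role": "user", "content": "hi"}]
        )]

    tokens = asyncio.run(collect())
    assert tokens == ["Hel", "lo"]   # content-only string tokens
    assert "".join(tokens) == "Hello"
```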
A PR that follows this pattern will merge fast in the OSS repo.