Multi-LLM providers¶
OSS v0.1 ships two providers: Mistral (EU-sovereign) and Ollama (self-hosted). The plugin surface is a single abstract base class, so alternate providers can be implemented without touching any call site in the router layer.
The interface¶
```python
# backend/app/services/llm/base.py
import abc
from collections.abc import AsyncIterator


class BaseLLMProvider(abc.ABC):
    provider_name: str   # recorded verbatim in audit log
    provider_model: str  # recorded verbatim in audit log
    data_region: str     # "EU" | "US" | "self-hosted"

    @abc.abstractmethod
    async def stream_chat(
        self,
        messages: list[dict[str, str]],
        temperature: float = 0.3,
    ) -> AsyncIterator[str]:
        ...

    async def health_check(self) -> bool:
        return True
```
Three contracts on every implementation:
- `stream_chat` yields content-only tokens as strings. Provider-specific SSE / chunk formats are handled inside the implementation.
- `provider_name` and `provider_model` are recorded in every audit entry — the Art. 53 upstream-GPAI trail in the dossier derives from them.
- `data_region` is declared for the dossier's provider manifest so a deployer can produce an Art. 10 data-flow statement if asked.
Shared OpenAI-compatible streamer¶
Both Mistral and Ollama expose /v1/chat/completions in OpenAI's SSE
format, so both providers inherit from
OpenAICompatibleProvider:
```python
# backend/app/services/llm/openai_compatible.py
class OpenAICompatibleProvider(BaseLLMProvider):
    def __init__(self, *, provider_name, provider_model,
                 data_region, api_base, api_key=None, timeout=120.0):
        ...
```
Writing a new OpenAI-compatible provider is a short subclass:

```python
class LocalAIProvider(OpenAICompatibleProvider):
    def __init__(self, *, api_base, model, api_key=None):
        super().__init__(
            provider_name="localai",
            provider_model=model,
            data_region="self-hosted",
            api_base=api_base,
            api_key=api_key,
        )
```
Provider selection per organisation¶
`organizations` has two columns added in the initial migration:

- `llm_provider TEXT` — `"mistral"`, `"ollama"`, or NULL (env default).
- `llm_model TEXT` — e.g. `"mistral-large-latest"`, `"gemma3:4b"`, or NULL (provider default).
`get_llm_provider(org)` (`backend/app/services/llm/factory.py`) looks up
the per-org settings and falls back to `DEFAULT_LLM_PROVIDER` and the
provider-specific model defaults from config.
Commercial providers (not in OSS)¶
The following live in the closed commercial plugin repo
Lex-Custis/pro-plugins:
- Anthropic Claude — streaming via Anthropic's messages API.
- OpenAI — gpt-4o / gpt-4o-mini, via the stock OpenAI SDK.
- Azure OpenAI — same endpoint shape, deployment-name routing.
- AWS Bedrock — Claude, Llama, Mistral via Bedrock.
- GCP Vertex — Gemini, Claude via Vertex.
These are gated commercial plugins. The ABC and factory are open — writing your own Anthropic plugin against the OSS ABC is legal and encouraged for self-hosters. The commercial repo exists so we can maintain and test them as a service.
Compliance implications per provider¶
| Provider | Data region | GDPR adequacy | Art. 53 disclosure |
|---|---|---|---|
| Mistral (hosted) | EU (France) | Yes (intra-EU) | mistral.ai/legal |
| Ollama (self-hosted) | Whatever you host it on | Depends on your deploy | Upstream model's disclosure — pin it per model |
| Anthropic (commercial) | US | Requires SCCs or DPF | anthropic.com/legal |
| OpenAI (commercial) | US (default); EU residency on Enterprise | Requires SCCs or DPF | openai.com/policies |
The dossier's provider_manifest.json records which provider +
model served each inference in the period, so a regulator can see at
a glance whether EU-sovereign was the rule or the exception.
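As an illustration of what that at-a-glance view could be built from, one way to tally manifest entries per provider/model pair (the entry shape is an assumption based on the audit fields above, not the actual manifest schema):

```python
from collections import Counter


def summarize_manifest(entries: list[dict[str, str]]) -> dict[str, int]:
    """Inference counts per provider/model pair for the reporting period."""
    return dict(Counter(
        f"{e['provider_name']}/{e['provider_model']}" for e in entries
    ))
```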
Adding your own provider¶
- Implement `BaseLLMProvider` (or extend `OpenAICompatibleProvider`).
- Register in `factory.py::_build()`.
- Add provider-specific config keys to `config.py` + `.env.example`.
- Add a unit test covering the streaming token yield.
- Cover the name in the dossier's `_UPSTREAM_DISCLOSURES` map so Art. 53 manifests are complete.
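The streaming unit test in the checklist can be as small as this sketch (the stub provider is invented here so the test needs no network; a real test would exercise your actual class):

```python
import asyncio


class StubProvider:
    """Stand-in provider; yields a fixed token sequence."""
    provider_name = "stub"
    provider_model = "stub-1"
    data_region = "self-hosted"

    async def stream_chat(self, messages, temperature=0.3):
        for tok in ("Hel", "lo"):
            yield tok


def test_stream_chat_yields_tokens() -> None:
    async def collect() -> list[str]:
        return [t async for t in StubProvider().stream_chat(
            [{"role": "user", "content": "hi"}]
        )]

    tokens = asyncio.run(collect())
    assert tokens == ["Hel", "lo"]   # content-only string tokens
    assert "".join(tokens) == "Hello"
```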
A PR that follows this pattern will merge fast in the OSS repo.