Vedran Balagović

Fractional CTO · Technical Advisor

Open to fractional CTO & advisory engagements

The technical co-founder you can hire by the week.

I help founders make the right architecture and AI calls: I set technical direction, derisk the hard decisions and lead the team, and when it counts, I can build the product myself. 10+ years of technical leadership, a studio grown to 15 engineers, technical co-founder several times over.

What I do

How I can help

From setting technical direction to writing the code — most engagements are one or a blend of these.

01

Fractional / interim CTO

Own the technical direction: architecture, roadmap, hiring and process, weekly delivery cadence, and clear technical communication with founders and investors.

Architecture · Roadmap · Hiring & process

02

Technical advisory & due diligence

Architecture and scalability reviews, AI feasibility & cost analysis, MVP rescue, and technical due diligence for founders and investors weighing a build or a deal.

Architecture review · Scalability · Tech DD

03

AI / LLM integration

Where AI genuinely belongs in your product — and where it doesn't. RAG, on-device speech, MCP servers and agent tooling, with cost-aware usage tied to the business.

RAG · Speech / ASR · MCP & agents

04

Build it end-to-end

When you need execution, not just advice: I take a product from architecture to a shipped first version — full-stack, AI, infrastructure and mobile, on real startup timelines.

Full-stack · Infra & DevOps · Mobile

10+ years building
15 engineers led
250+ GitHub stars
6+ products shipped

Proof

Things I've designed, built and shipped

Founder, technical co-founder and builder across AI infrastructure, data, and mobile.

Neural Draft

Founder & CEO

The backend for AI-built sites. A single API that gives AI-generated apps a production backend — CMS, blogging, social scheduling, booking, and Stripe-connected commerce — plus an MCP server so tools like Claude Code, Cursor and Continue build against it with automatic brand consistency, and a zero-dependency TypeScript SDK. Multi-tenant, multi-language.

Multi-tenant SaaS · MCP server · TypeScript SDK · LLM pipelines · Stripe commerce

Lex Custis

Open source

EU-sovereign compliance engine for the EU AI Act. Self-host it, point it at your LLM, and it becomes the regulator's proof layer: a tamper-evident HMAC hash-chained audit log, one-click Annex IV dossier export, an Article 73 incident workflow, and PII / prompt-injection / bias checks. Multi-tenant, multi-LLM.

AGPL-3.0 · FastAPI + Postgres · Hash-chain audit · EU AI Act
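For readers unfamiliar with the idea, here is a minimal sketch of how an HMAC hash-chained audit log can work in general. It is an illustration under stated assumptions, not the Lex Custis implementation: the function names, the `GENESIS` anchor and the in-memory list are all hypothetical, and a real deployment would persist entries (for example in Postgres) and manage the key separately.

```python
import hashlib
import hmac
import json

# Hypothetical anchor value used as the "previous MAC" of the first entry.
GENESIS = "0" * 64


def _mac(key: bytes, prev_mac: str, payload: dict) -> str:
    # Canonical JSON so the same payload always serializes to the same bytes.
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    msg = prev_mac.encode() + b"|" + body.encode()
    return hmac.new(key, msg, hashlib.sha256).hexdigest()


def append_entry(log: list, key: bytes, payload: dict) -> dict:
    # Each entry's MAC covers its payload and the previous entry's MAC,
    # so editing or removing an earlier entry invalidates everything after it.
    prev_mac = log[-1]["mac"] if log else GENESIS
    entry = {"payload": payload, "prev": prev_mac, "mac": _mac(key, prev_mac, payload)}
    log.append(entry)
    return entry


def verify_chain(log: list, key: bytes) -> bool:
    # Recompute every MAC from the start and compare in constant time.
    prev_mac = GENESIS
    for entry in log:
        expected = _mac(key, prev_mac, entry["payload"])
        if entry["prev"] != prev_mac or not hmac.compare_digest(entry["mac"], expected):
            return False
        prev_mac = entry["mac"]
    return True
```

Appending two entries and then editing the first one's payload makes `verify_chain` return `False`, which is the tamper-evidence property in miniature. Because the MAC is keyed, someone without the key cannot simply recompute the chain after an edit.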

gospeech

Private R&D

Local speech-to-text with speaker diarization — no Python, no cloud, a single binary. Wraps a pure-C Qwen3-ASR engine (PyTorch-matching quality, hand-tuned SIMD, CUDA) with Pyannote 3.0 diarization via sherpa-onnx, plus optional local-LLM speaker identification.

Qwen3-ASR · Go + C / CGO · Diarization · On-device

AutoValuer

Builder

Used-car valuations computed from 266k+ live listings across 11 European markets, refreshed 4× daily — with photo-based AI condition checks and fair-price ranges. Powered by the CarsDataset API (370+ brands, CSV/SQL export). A study in scraping, normalization and data engineering at scale.

Scraping at scale · 11 markets · AI vision · REST API

Littlemind

Builder · Mobile

A 60-second daily check-in app for parents of kids 4–14; an AI companion surfaces patterns in mood, sleep and behavior and suggests evidence-based techniques. Flutter, iOS & Android, 10 languages — built and shipped to both stores solo.

Flutter · AI insights · 10 languages

motchi.art

Builder

An AI character-animation engine — turns inputs into character animations through an AI-driven pipeline, for creators and studios who want to skip manual animation work.

AI animation · Generative pipeline

Depth

Hands-on with the hard parts of AI

Not just calling APIs — building the pipelines, models and tooling underneath them.

Speech & ASR

Private R&D

On-device Qwen3-ASR transcription with speaker diarization (gospeech) — pure-C inference engine, SIMD/CUDA kernels, streaming with encoder cache, and local-LLM speaker identification. PyTorch-matching quality, fully offline.

Qwen3-ASR · sherpa-onnx · llama.cpp

RAG & retrieval

Private R&D

Retrieval-augmented generation systems end to end — document chunking, embeddings, vector search, re-ranking and grounded, cited answers. The plumbing that makes LLM features trustworthy in production.

Embeddings · Vector search · Grounding
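To make the stages concrete, here is a toy sketch of chunking, embedding, similarity search and a grounded, citable prompt. It is a simplified illustration, not code from the private work: the bag-of-words `embed` function is a stand-in for a real embedding model, re-ranking is left out, and every function name here is hypothetical.

```python
import math
import re
from collections import Counter


def chunk(text: str, size: int = 12) -> list[str]:
    # Split a document into fixed-size word windows (no overlap, for brevity).
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]


def embed(text: str) -> Counter:
    # Stand-in for a real embedding model: a sparse bag-of-words vector.
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(query: str, chunks: list[str], k: int = 2) -> list[tuple[int, float, str]]:
    # Brute-force vector search: score every chunk and keep the top k.
    q = embed(query)
    scored = [(i, cosine(q, embed(c)), c) for i, c in enumerate(chunks)]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:k]


def grounded_prompt(query: str, hits: list[tuple[int, float, str]]) -> str:
    # Number each source so the model's answer can cite it.
    sources = "\n".join(f"[{i}] {text}" for i, _, text in hits)
    return (
        "Answer using only the sources below and cite them by number.\n"
        f"{sources}\n\nQuestion: {query}"
    )
```

In a production system the brute-force loop would typically be replaced by a vector index, and a re-ranking step would sit between `retrieve` and `grounded_prompt`.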

Agent tooling

Public + private

MCP servers (Neural Draft), an LLM compliance engine (Lex Custis), and Promptual — a GPT-4o prompt-improver browser extension. The tooling that lets AI agents act safely and usefully.

MCP · GPT-4o · Cost-aware AI

Open source

250+ stars across the Flutter ecosystem

Developer tooling used by hundreds of engineers.

See all repositories on GitHub →

Experience

A decade of building & leading

2025 — Present

Founder & CEO · Neural Draft

AI infrastructure · UK / Remote

Built an AI-backend platform from zero to production: multi-tenant architecture, LLM generation pipelines, an MCP server and TypeScript SDK, and a cost-aware AI usage strategy.

2025 — Present

CTO & Co-Founder · Axerate

US / Remote

Technical co-founder owning architecture and execution. Delivered the MVP under real startup constraints and set up CI/CD, code review and a weekly release cadence from day one.

2016 — 2025

CTO · QED d.o.o.

Software consultancy & product studio · Croatia

Grew a dev company from 2 to 15 engineers. Acted as fractional CTO for multiple startups across mobile, web and backend; cut time-to-market with internal tooling and reusable frameworks.

Earlier

Co-Founder · Emploio

HR Tech

Co-founded a recruitment-technology startup. Designed core system architecture, data pipelines and matching logic.

Toolbox

Technology stack

Mobile & Frontend

Flutter · Dart · Vue.js · React · TypeScript

Backend & APIs

Laravel · Python · Go · Node.js · REST · Microservices

AI / LLM

OpenAI · Claude · RAG · Vector search · Fine-tuning · Qwen3-ASR · MCP · llama.cpp

Infra & DevOps

AWS · GCP · Scaleway · Firebase · Docker · Nginx · CI/CD · Monitoring

Data & Scraping

Scraping pipelines · Normalization · DuckDB

Databases

PostgreSQL · MySQL · DuckDB · MongoDB · Firestore

Writing & community

Sharing what I learn

Let's talk

Need a CTO, advisor, or someone to build it?

Fractional / interim CTO · technical advisory & due diligence · AI strategy · end-to-end MVP build. Tell me what you're working on and where you're stuck, and I'll tell you the fastest sound path forward.