Release v0.1.4 is out — 100% open source

The ultra-fast AI Gateway written in Rust.

Melis unifies OpenAI, Anthropic, Google Vertex, OCI GenAI and Ollama behind a single OpenAI-compatible contract. Stateless, sub-2ms overhead, under 32Mi RSS — built for production LLMOps.


Apache-style open source · Stateless & horizontally scalable · Kubernetes native

Run Melis in one command

bash

docker run -d \
  --name melis-gateway \
  -p 9090:9090 \
  -v "$(pwd)/config.yaml:/app/config.yaml:ro" \
  -v "$(pwd)/routes.yaml:/app/routes.yaml:ro" \
  -e MELIS_SERVER_PORT=9090 \
  melis-gateway:latest

One contract — every major provider

OpenAI · Anthropic · Google Vertex AI · OCI GenAI · Ollama · DeepSeek · Llama 3

Features

Everything an LLM platform team needs — in one tiny Rust binary.

Move load balancing, circuit breaking, token compression and rate limiting out of your application code and into a high-performance infrastructure layer.

OpenAI-compatible contract

Exposes POST /v1/chat/completions. Melis transpiles payloads on the fly to each upstream provider's schema.
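
As a rough illustration of what such a translation involves, the Python sketch below maps an OpenAI-style chat payload onto an Anthropic-Messages-style body, where the system prompt is a top-level field and `max_tokens` is required. It is not Melis's Rust implementation: the function name `to_anthropic_style` and the `default_max_tokens` fallback are assumptions made for this example, and only plain-string message content is handled.

```python
from typing import Any


def to_anthropic_style(payload: dict[str, Any], default_max_tokens: int = 1024) -> dict[str, Any]:
    """Translate an OpenAI-style chat payload into an Anthropic-Messages-style body."""
    messages = payload["messages"]

    # OpenAI carries system prompts inside the message list;
    # the Anthropic Messages format expects a single top-level "system" string.
    system_parts = [m["content"] for m in messages if m["role"] == "system"]
    turns = [
        {"role": m["role"], "content": m["content"]}
        for m in messages
        if m["role"] in ("user", "assistant")
    ]

    body: dict[str, Any] = {
        "model": payload["model"],
        "messages": turns,
        # max_tokens is optional for OpenAI but required by Anthropic,
        # so fall back to an assumed default when the caller omits it.
        "max_tokens": payload.get("max_tokens", default_max_tokens),
    }
    if system_parts:
        body["system"] = "\n\n".join(system_parts)
    if "temperature" in payload:
        body["temperature"] = payload["temperature"]
    return body


if __name__ == "__main__":
    print(to_anthropic_style({
        "model": "example-model",  # hypothetical model name
        "messages": [
            {"role": "system", "content": "Be brief."},
            {"role": "user", "content": "Hello from Melis!"},
        ],
    }))
```

A production translator would also have to map streaming chunks, tool calls and error shapes back into the OpenAI response format; this sketch covers the request direction only.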

Sub-2ms overhead

Non-blocking async Rust core. Internal processing under 2ms with a memory footprint below 32Mi RSS.

Weighted multi-provider routing

Native support for openai, anthropic, google_vertex_ai, oci_genai and ollama with configurable traffic weights.
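
To make "configurable traffic weights" concrete, here is a minimal Python sketch of weighted random selection between providers. The 80/20 split and the `pick_provider` helper are hypothetical; how Melis itself declares and applies weights is not shown on this page.

```python
import random
from collections import Counter


def pick_provider(weights: dict[str, float], rng: random.Random) -> str:
    """Pick one provider name with probability proportional to its weight."""
    names = list(weights)
    return rng.choices(names, weights=[weights[n] for n in names], k=1)[0]


if __name__ == "__main__":
    # Hypothetical split: roughly 80% of requests to openai, 20% to ollama.
    weights = {"openai": 80, "ollama": 20}
    rng = random.Random(42)  # seeded only to make the demo repeatable
    print(Counter(pick_provider(weights, rng) for _ in range(10_000)))
```

Because each request is drawn independently, the observed split converges on the configured weights as traffic grows.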

Adaptive context trimming

Tokenizes inputs locally and trims repetitive metadata before requests leave for the provider, protecting your token budget.
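
One way to picture this kind of trimming is the Python sketch below. It is an illustration under loud assumptions, not Melis's algorithm: token counts are approximated by whitespace splitting (real tokenizers differ), "repetitive" is taken to mean exact duplicate messages, and the helper names `approx_tokens` and `trim_messages` are invented for the example.

```python
def approx_tokens(text: str) -> int:
    """Crude token estimate: whitespace-separated words (an assumption, not a real tokenizer)."""
    return len(text.split())


def trim_messages(messages: list[dict], budget: int) -> list[dict]:
    """Drop exact duplicates, then the oldest non-system turns, until under budget."""
    # Step 1: remove messages that repeat an earlier (role, content) pair.
    seen = set()
    unique = []
    for m in messages:
        key = (m["role"], m["content"])
        if key in seen:
            continue
        seen.add(key)
        unique.append(m)

    def total(msgs: list[dict]) -> int:
        return sum(approx_tokens(m["content"]) for m in msgs)

    # Step 2: drop the oldest non-system message, never touching the final turn.
    while total(unique) > budget:
        idx = next(
            (i for i, m in enumerate(unique[:-1]) if m["role"] != "system"),
            None,
        )
        if idx is None:
            break  # nothing left that is safe to drop
        del unique[idx]
    return unique
```

The system prompt and the latest message survive because they usually carry the instructions and the actual question; older turns are the cheapest context to lose.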

Enterprise resiliency

Distributed token-bucket rate limiting and circuit breaking with exponential backoff, orchestrated via Redis.
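
For readers new to these terms, the Python sketch below shows a single-process token bucket and a capped exponential backoff schedule. It is a teaching sketch only: Melis keeps its counters in Redis, whereas this version holds them in memory, and the class name, parameters and default values are all assumptions.

```python
import time
from typing import Callable


class TokenBucket:
    """In-memory token bucket: holds up to `capacity` tokens, refilled at a steady rate."""

    def __init__(self, capacity: float, refill_per_sec: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.capacity = capacity
        self.refill_per_sec = refill_per_sec
        self.clock = clock  # injectable so the bucket can be tested without sleeping
        self.tokens = capacity
        self.updated = clock()

    def allow(self, cost: float = 1.0) -> bool:
        """Return True and spend `cost` tokens if enough are available."""
        now = self.clock()
        elapsed = now - self.updated
        # Refill for the time that has passed, never exceeding capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_sec)
        self.updated = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False


def backoff_delay(attempt: int, base: float = 0.5, cap: float = 30.0) -> float:
    """Exponential backoff: base * 2^attempt seconds, capped at `cap`."""
    return min(cap, base * 2 ** attempt)
```

A distributed variant has to perform the refill-and-spend step atomically in the shared store so that several gateway pods cannot spend the same token.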

Cloud-native by design

Hot-reload routes.yaml, Prometheus /metrics, OpenTelemetry tracing and Kubernetes-compliant probes.

Architecture

Stateless. Horizontal. Production-grade.

Melis instances keep no state of their own, so they scale horizontally inside Kubernetes. Volatile cluster state, blocklists and token-bucket counters live in an external high-speed Redis layer.

Pure reverse proxy

Sits between your apps and any LLM provider.

Zero-impact migrations

Swap providers in routes.yaml with zero application code changes.

Hot-reload config

routes.yaml reloads within seconds without dropping active connections.
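
How the reload is triggered is not described here. One common approach, sketched below in Python purely as an assumption, is to poll the file's modification time and re-read it only when it changes. The `ReloadingFile` class is hypothetical and is not part of Melis.

```python
import os


class ReloadingFile:
    """Re-read a file only when its modification time changes."""

    def __init__(self, path: str) -> None:
        self.path = path
        self._mtime_ns = None
        self.text = ""
        self.poll()  # initial load

    def poll(self) -> bool:
        """Return True if the file changed since the last poll and was re-read."""
        mtime_ns = os.stat(self.path).st_mtime_ns
        if mtime_ns == self._mtime_ns:
            return False
        with open(self.path, encoding="utf-8") as f:
            new_text = f.read()
        # Swap in the new content only after a successful read, so a failed
        # read leaves the previous configuration in place.
        self.text = new_text
        self._mtime_ns = mtime_ns
        return True
```

Calling `poll()` on a timer gives reloads "within seconds". A real gateway would also parse and validate the new routes before swapping them in, which is how in-flight requests can keep using the old ones.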

[ App Python (FastAPI) ] ──┐
                           ├──► [ Melis AI Gateway Pod ] ──► [ OpenAI / Claude / Gemini ]
[ App Java (Quarkus)   ] ──┘                │
                                            ▼
                               [ Redis ] ◄──┴──► [ Prometheus / OTel ]

<2ms

Gateway overhead

<32Mi

Memory RSS

100%

Open source

Declarative routing

Swap providers without touching application code.

Move from a costly OpenAI setup to a local Ollama Llama 3 model by editing a single YAML file. Melis intercepts, translates and streams responses natively.

routes.yaml

yaml

routes:
  - path: "/v1/chat/completions"
    method: "POST"
    provider: "ollama"         # Swapped from "openai" instantly
    model: "llama3.2"          # Overrides the payload target model
    token_optimization:
      strategy: "adaptive_trimming"
      compress_above_tokens: 4096

Your app — unchanged

python

from openai import OpenAI

client = OpenAI(base_url="http://melis:9090/v1", api_key="sk-anything")

resp = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello from Melis!"}],
)
print(resp.choices[0].message.content)
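
The app above still asks for `gpt-4o`, yet the route sends the request to Ollama with `llama3.2`. The Python sketch below shows the kind of override the route's `provider` and `model` keys imply. The route is written as a plain dict to avoid a YAML dependency, and `apply_route` is a hypothetical helper, not a Melis API.

```python
def apply_route(route: dict, payload: dict) -> tuple[str, dict]:
    """Return the target provider and a copy of the payload with the route's overrides."""
    upstream = dict(payload)  # shallow copy: the caller's payload is left untouched
    if "model" in route:
        # The route's model replaces whatever model the client asked for.
        upstream["model"] = route["model"]
    return route["provider"], upstream


if __name__ == "__main__":
    route = {"provider": "ollama", "model": "llama3.2"}
    payload = {
        "model": "gpt-4o",
        "messages": [{"role": "user", "content": "Hello from Melis!"}],
    }
    provider, upstream = apply_route(route, payload)
    print(provider, upstream["model"])
```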

Deploy

First-class citizen of modern cloud infra.

Run as a standalone Docker container, or deploy natively to Kubernetes with ConfigMaps, Secrets and Horizontal Pod Autoscaling.

Docker standalone

bash

docker run -d \
  --name melis-gateway \
  -p 9090:9090 \
  -v "$(pwd)/config.yaml:/app/config.yaml:ro" \
  -v "$(pwd)/routes.yaml:/app/routes.yaml:ro" \
  -e MELIS_SERVER_PORT=9090 \
  melis-gateway:latest

Kubernetes native

bash

kubectl apply -f k8s/configmap.yaml
kubectl apply -f k8s/secret.yaml
kubectl apply -f k8s/deployment.yaml
kubectl apply -f k8s/service.yaml
kubectl apply -f k8s/hpa.yaml

Observability

Turn the third-party AI black box into a transparent stream of metrics.

Scrape /metrics from Prometheus, ship traces with OpenTelemetry and build Grafana dashboards your SRE team will actually trust.

Token volumetrics

Real-time tracking of input vs output tokens per API key or client.

Network performance

Isolated latency profiles — gateway overhead vs provider round-trip.

Resiliency lifecycles

Live circuit breaker status, failure ratios and fallback activations.

curl /metrics

text

# HELP melis_request_duration_seconds Gateway overhead per request
# TYPE melis_request_duration_seconds histogram
melis_request_duration_seconds_bucket{provider="openai",le="0.002"} 18421
melis_tokens_total{provider="anthropic",direction="input"}   1284912
melis_tokens_total{provider="anthropic",direction="output"}   421038
melis_circuit_breaker_state{provider="google_vertex_ai"} 0
melis_ratelimit_drops_total{client="tenant-a"} 12
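
Output like this is easy to consume outside Prometheus as well. The Python sketch below parses the sample lines and derives an output-to-input token ratio per provider. It is a simplified reader written for this example: it ignores optional timestamps and escaped characters in label values, and `parse_metrics` and `token_ratio` are hypothetical helpers.

```python
import re

# name{label="value",...} value   (label block optional)
SAMPLE_RE = re.compile(
    r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(?P<labels>[^}]*)\})?\s+(?P<value>\S+)$'
)
LABEL_RE = re.compile(r'(\w+)="([^"]*)"')


def parse_metrics(text: str) -> list[tuple[str, dict, float]]:
    """Parse exposition-format text into (name, labels, value) tuples."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and HELP/TYPE comments
        m = SAMPLE_RE.match(line)
        if m is None:
            continue  # skip anything this simplified pattern cannot read
        labels = dict(LABEL_RE.findall(m.group("labels") or ""))
        samples.append((m.group("name"), labels, float(m.group("value"))))
    return samples


def token_ratio(samples: list[tuple[str, dict, float]], provider: str) -> float:
    """Output tokens divided by input tokens for one provider."""
    totals = {
        labels["direction"]: value
        for name, labels, value in samples
        if name == "melis_tokens_total" and labels.get("provider") == provider
    }
    return totals["output"] / totals["input"]


if __name__ == "__main__":
    text = '''
melis_tokens_total{provider="anthropic",direction="input"}   1284912
melis_tokens_total{provider="anthropic",direction="output"}   421038
'''
    print(token_ratio(parse_metrics(text), "anthropic"))
```

For dashboards, the same ratio is better computed in Prometheus or Grafana; a script like this is mainly useful for quick checks against a running gateway.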

Forever open source

Built in the open. Run it anywhere.

Melis is — and will always be — 100% open source. No paid tier, no vendor lock-in, no "open core" surprises. Fork it, deploy it, contribute back.

github.com/gomesrocha/melis · Release notes v0.1.4

Ship your AI features behind a real gateway.

Drop Melis in front of any OpenAI SDK and gain routing, resiliency, observability and cost control — without rewriting a line of application code.
