Track Costs
Every request returns token counts and dollar costs — in headers or in code.
# Point any SDK at the gateway — costs come back in headers
curl http://localhost:7680/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "X-Majordomo-Key: your-key" \
  -d '{"model": "gpt-4o", "messages": [{"role": "user", "content": "Hello!"}]}'
# Response headers:
# X-Majordomo-Input-Cost: 0.000125
# X-Majordomo-Output-Cost: 0.000250
# X-Majordomo-Total-Cost: 0.000375

# Or in code, via the Python library:
from majordomo_llm import get_llm_instance
llm = get_llm_instance("openai", "gpt-4o")
response = await llm.get_response("Explain async/await in Python.")
print(f"Cost: ${response.total_cost:.6f}")
print(f"Tokens: {response.input_tokens} in / {response.output_tokens} out")
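The arithmetic behind those cost fields is straightforward: token counts times a per-token price. A minimal sketch, assuming illustrative per-million-token prices (the figures below are examples, not the gateway's live pricing table):

```python
# Illustrative (input, output) USD prices per 1M tokens -- not live pricing.
PRICES = {"gpt-4o": (2.50, 10.00)}

def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Dollar cost of one request from its token counts."""
    in_price, out_price = PRICES[model]
    return (input_tokens * in_price + output_tokens * out_price) / 1_000_000

# 50 input + 25 output tokens on gpt-4o:
print(estimate_cost("gpt-4o", 50, 25))  # 0.000375
```

With those assumed prices, 50 input and 25 output tokens come to the same $0.000375 total shown in the response headers above.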
Get Structured Output
Define a Pydantic model, get typed results — one API across all providers.
from pydantic import BaseModel
from majordomo_llm import get_llm_instance
class MovieReview(BaseModel):
    title: str
    rating: float
    summary: str
llm = get_llm_instance("anthropic", "claude-sonnet-4-20250514")
response = await llm.get_structured_json_response(
    response_model=MovieReview,
    user_prompt="Review the movie Inception.",
)
print(f"{response.content.title}: {response.content.rating}/10")
print(f"Cost: ${response.total_cost:.6f}")
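Conceptually, structured output amounts to asking the model for JSON and validating the result against your schema. A minimal sketch of that validation step using only the standard library (not the library's actual internals):

```python
import json
from dataclasses import dataclass

@dataclass
class MovieReview:
    title: str
    rating: float
    summary: str

def parse_review(raw: str) -> MovieReview:
    """Validate a raw JSON string into a typed MovieReview."""
    data = json.loads(raw)
    return MovieReview(
        title=str(data["title"]),
        rating=float(data["rating"]),
        summary=str(data["summary"]),
    )

review = parse_review(
    '{"title": "Inception", "rating": 8.8, "summary": "A heist inside dreams."}'
)
print(f"{review.title}: {review.rating}/10")  # Inception: 8.8/10
```

Pydantic adds coercion, nested models, and error reporting on top of this idea, which is why the library exposes it through a `response_model` parameter rather than hand-rolled parsing.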
Add Automatic Failover
Cascade across providers — if one fails, the next picks up automatically.
from majordomo_llm import LLMCascade
cascade = LLMCascade([
    ("anthropic", "claude-sonnet-4-20250514"),  # Primary
    ("openai", "gpt-4o"),                       # Fallback
    ("gemini", "gemini-2.5-flash"),             # Last resort
])
response = await cascade.get_response("Summarize this quarter's metrics.")
print(response.content)  # Whichever provider succeeded first
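The cascade pattern itself is a loop over providers that only gives up once every fallback is exhausted. A sketch of the idea (not `LLMCascade`'s actual internals):

```python
from collections.abc import Awaitable, Callable

async def first_success(
    providers: list[Callable[[str], Awaitable[str]]], prompt: str
) -> str:
    """Try each provider in order; return the first successful response."""
    last_error: Exception | None = None
    for call in providers:
        try:
            return await call(prompt)
        except Exception as err:  # provider down, rate-limited, etc.
            last_error = err
    raise last_error or RuntimeError("no providers given")
```

Ordering the list from preferred to last-resort provider means the fallbacks cost nothing while the primary is healthy.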
Route Through a Gateway
Deploy the gateway, then point any client, library, or framework at it.
# Deploy the gateway, then point any HTTP client at it
curl http://localhost:7680/v1/messages \
  -H "Content-Type: application/json" \
  -H "x-api-key: $ANTHROPIC_API_KEY" \
  -H "X-Majordomo-Key: your-key" \
  -H "X-Majordomo-App-Name: my-app" \
  -d '{"model": "claude-sonnet-4-20250514", "max_tokens": 1024,
      "messages": [{"role": "user", "content": "Hello!"}]}'

# Or via the Python library, pointed at the gateway:
from majordomo_llm import get_llm_instance
llm = get_llm_instance(
    "anthropic", "claude-sonnet-4-20250514",
    base_url="http://localhost:7680",
    default_headers={"X-Majordomo-Key": "your-key"},
)
response = await llm.get_response("Hello!")

# Or via a framework adapter (Pydantic AI):
from pydantic_ai import Agent
from majordomo_frameworks.pydantic_ai import create_model, build_extra_headers
model = create_model("openai", "gpt-4o", gateway_url="http://localhost:7680")
agent = Agent(model)
result = await agent.run(
    "Summarize this document.",
    model_settings={"extra_headers": build_extra_headers(
        majordomo_key="your-key", app_name="my-agent",
    )},
)
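Whichever route you take through the gateway, the cost headers come back as strings. A small helper to pull them into floats (the header names follow the examples above; the function itself is illustrative, not part of the library):

```python
def parse_cost_headers(headers: dict[str, str]) -> dict[str, float]:
    """Extract X-Majordomo-*-Cost response headers into a dict of floats."""
    prefix, suffix = "X-Majordomo-", "-Cost"
    return {
        name[len(prefix):-len(suffix)].lower(): float(value)
        for name, value in headers.items()
        if name.startswith(prefix) and name.endswith(suffix)
    }

costs = parse_cost_headers({
    "X-Majordomo-Input-Cost": "0.000125",
    "X-Majordomo-Output-Cost": "0.000250",
    "X-Majordomo-Total-Cost": "0.000375",
    "Content-Type": "application/json",
})
print(costs)  # {'input': 0.000125, 'output': 0.00025, 'total': 0.000375}
```

Feeding these values into your own metrics pipeline gives per-app cost tracking without touching application code.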