Uplink API Documentation

OpenAI-compatible access to 170+ models, arbitrage-aware routing, workflows, and tenant analytics delivered through a single Cloudflare Worker. This guide covers authentication, common patterns, and every public endpoint.

Platform Overview

Uplink is a drop-in replacement for the OpenAI REST API that routes traffic across Groq, Together.ai, OpenRouter, and bespoke research providers. The worker continuously synchronizes pricing and health signals so the arbitrage engine can deliver 20-40% cost savings without sacrificing quality.

Our Endpoints Use Our Endpoints™: agent mode calls the search endpoints, workflows orchestrate chat completions, and every capability is built by composing other capabilities. This keeps behavior consistent and means each feature we add strengthens the others.

Production Base URL

https://api.frnds.cloud

Staging (Wrangler)

https://api.frnds.cloud

OpenAI Compatibility

Supports /v1/chat/completions, streaming, tool calling, and JSON mode. V3 endpoints add arbitrage metadata and admin surfaces.

Quickstart

Authenticate with any valid tenant key and hit the v3 chat endpoint. The worker automatically selects the cheapest healthy provider for the requested capability.

Using curl

curl https://api.frnds.cloud/v3/chat/completions \
  -H "Authorization: Bearer ak_your_key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "llama-3.3-70b-versatile",
    "messages": [{"role": "user", "content": "Summarize Uplink."}],
    "arbitrage_mode": "auto",
    "stream": false
  }'

Using Python SDK

from uplink_client import UplinkClient

client = UplinkClient(
    base_url="https://api.frnds.cloud",
    api_key="ak_your_key"
)

# Basic chat
async with client:
    response = await client.chat({
        "model": "llama-3.3-70b-versatile",
        "messages": [{"role": "user", "content": "Summarize Uplink."}]
    })
    print(response.choices[0].message.content)

# Agent mode with auto-research
async with client:
    response = await client.agent_chat({
        "model": "llama-3.3-70b-versatile",
        "messages": [{"role": "user", "content": "What are the latest AI breakthroughs?"}],
        "tools": ["search", "extract"],
        "max_iterations": 5
    })
    print(response.choices[0].message.content)

Using JavaScript SDK

import { UplinkClient } from '@frnd/uplink-sdk'

const client = new UplinkClient({
  baseUrl: 'https://api.frnds.cloud',
  apiKey: 'ak_your_key'
})

// Basic chat
const response = await client.chat({
  model: 'llama-3.3-70b-versatile',
  messages: [{ role: 'user', content: 'Summarize Uplink.' }]
})
console.log(response.choices[0].message.content)

// Agent mode with auto-research
const agentResponse = await client.agentChat({
  model: 'llama-3.3-70b-versatile',
  messages: [{ role: 'user', content: 'What are the latest AI breakthroughs?' }],
  tools: ['search', 'extract'],
  max_iterations: 5
})
console.log(agentResponse.choices[0].message.content)

Install SDKs:

  • Python: pip install uplink-client (or from sdk/python/)
  • JavaScript: npm install @frnd/uplink-sdk (or from sdk/javascript/)

Streaming works with "stream": true; partial tokens emit as data: events that mirror the OpenAI spec and include arbitrage metadata on the first chunk.
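
For clients not using an SDK, the `data:` framing can be consumed with a few lines of parsing. The sketch below is illustrative (parseSSELine is not part of any Uplink SDK); it assumes the OpenAI-style chunk shape described above.

```javascript
// Illustrative SSE line parser (not part of any Uplink SDK).
// Each streamed line looks like: data: {"choices":[{"delta":{"content":"..."}}]}
function parseSSELine(line) {
  if (!line.startsWith('data: ')) return null        // skip comments and blank lines
  const payload = line.slice('data: '.length).trim()
  if (payload === '[DONE]') return null              // end-of-stream sentinel
  const event = JSON.parse(payload)
  // The first chunk may also carry arbitrage metadata alongside choices.
  return event.choices?.[0]?.delta?.content ?? ''
}

const sample = 'data: {"choices":[{"delta":{"content":"Hel"}}]}'
console.log(parseSSELine(sample)) // "Hel"
```

Feed each line of the response body through a parser like this and concatenate the returned deltas to rebuild the full completion.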

⚠️ Common Mistakes to Avoid

  • Wrong API URL: Always use https://api.frnds.cloud (not .workers.dev or other variants)
  • Missing Headers: Include both Authorization: Bearer ak_your_key AND Content-Type: application/json
  • Malformed JSON: Ensure your request body is valid JSON - use -d '{...}' with curl
  • HTTP Method: Most endpoints use POST for mutations, GET for reads
  • API Key Format: Keys start with ak_ prefix (e.g., ak_dev_abc123)

πŸ›‘οΈ Smart Model Deprecation

Production-safe model migration. When AI providers deprecate models, Uplink automatically routes requests to the best modern alternative based on capabilities and cost - without breaking your application.

How It Works

The system maintains a deprecation registry that maps old model names to their recommended replacements. When you request a deprecated model, Uplink:

  • Automatically resolves to the best alternative (cost + capability optimized)
  • Returns full transparency via response headers and metadata
  • Logs deprecation warnings to help you plan migrations
  • Ensures zero downtime during provider model transitions
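
Conceptually, the resolution step is a registry lookup. The sketch below is illustrative, not the actual worker code; the two mappings shown are the documented Llama 3.2 vision redirects.

```javascript
// Illustrative resolution flow (not the actual worker code). The two entries
// mirror the documented Llama 3.2 vision redirects.
const DEPRECATION_REGISTRY = {
  'llama-3.2-90b-vision-preview': 'meta-llama/llama-4-scout-17b-16e-instruct',
  'llama-3.2-11b-vision-preview': 'meta-llama/llama-4-scout-17b-16e-instruct'
}

function resolveModel(requested) {
  const replacement = DEPRECATION_REGISTRY[requested]
  if (!replacement) return { model: requested, deprecated: false }
  // Deprecated: route to the replacement and surface the mapping so it can be
  // echoed in response headers and metadata.
  return { model: replacement, deprecated: true, original: requested }
}

console.log(resolveModel('llama-3.2-90b-vision-preview').model)
// "meta-llama/llama-4-scout-17b-16e-instruct"
```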

Example: Deprecated Model Request

curl https://api.frnds.cloud/v1/chat/completions \
  -H "Authorization: Bearer ak_your_key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "llama-3.2-90b-vision-preview",
    "messages": [{"role": "user", "content": "Describe this image"}]
  }'

Response Headers

HTTP/1.1 200 OK
X-Model-Deprecated: true
X-Model-Original: llama-3.2-90b-vision-preview
X-Model-Resolved-To: meta-llama/llama-4-scout-17b-16e-instruct
X-Deprecation-Source: MODEL_MAPPING
Content-Type: application/json

Response Metadata

{
  "choices": [
    {
      "message": {
        "role": "assistant",
        "content": "I can see..."
      }
    }
  ],
  "_deprecation": {
    "original_model": "llama-3.2-90b-vision-preview",
    "resolved_model": "meta-llama/llama-4-scout-17b-16e-instruct",
    "resolution_source": "MODEL_MAPPING",
    "deprecated_since": "2025-04-07",
    "recommended_alternative": "meta-llama/llama-4-scout-17b-16e-instruct",
    "reasoning": "Model explicitly mapped via MODEL_MAPPING"
  }
}

Migration Best Practices

  • Monitor Headers: Check X-Model-Deprecated header in your production logs
  • Plan Updates: When you see deprecation notices, plan to update to the recommended alternative
  • Test First: Try the new model in staging before updating production code
  • No Rush: Automatic resolution keeps your app working - migrate on your timeline
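
A minimal way to act on the first recommendation is to check the deprecation headers on every response. The helper below is hypothetical (not an SDK function); the header names come from the Response Headers example above, and it accepts any Headers-like object with a .get() method.

```javascript
// Hypothetical helper: log a migration hint when a response carries the
// deprecation headers shown above. Accepts any Headers-like object with .get().
function logDeprecationWarning(headers) {
  if (headers.get('X-Model-Deprecated') !== 'true') return null
  const original = headers.get('X-Model-Original')
  const resolved = headers.get('X-Model-Resolved-To')
  const warning = `model ${original} is deprecated; migrate to ${resolved}`
  console.warn(warning)
  return warning
}

// Works with fetch() Response.headers too; a Map stands in here.
const headers = new Map([
  ['X-Model-Deprecated', 'true'],
  ['X-Model-Original', 'llama-3.2-90b-vision-preview'],
  ['X-Model-Resolved-To', 'meta-llama/llama-4-scout-17b-16e-instruct']
])
logDeprecationWarning(headers)
```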

💡 Developer Experience First

Deprecations happen across the AI industry. Uplink ensures you're never caught off-guard. Get deprecation notices via three channels (logs, headers, metadata) so you can migrate on your schedule, not the provider's.

✨ Forgiving Model Aliases

The most developer-friendly AI API. Stop memorizing exact model names. Uplink accepts 80+ naming variations - dashes, underscores, with or without provider prefixes - and intelligently maps them to the correct model.

Why This Matters

Different AI providers use different naming conventions. You might remember a model as "llama-3.3-70b" but the provider requires "meta-llama/llama-3.3-70b-versatile". Or you write "whisper_v3" but the API expects "whisper-large-v3". Uplink handles all these variations automatically.

Supported Naming Styles

  • Dashes vs Underscores: llama-3.3-70b ↔ llama_3_3_70b
  • Provider Prefixes: meta-llama/llama-3.3-70b ↔ llama-3.3-70b
  • Version Variations: llama3.3-70b ↔ llama-3.3-70b
  • Short Names: whisper-v3 ↔ whisper-large-v3
  • Provider Typos: moonshot/kimi-k2 ↔ moonshotai/kimi-k2

Examples: All These Work

Llama 3.3 70B

// All of these resolve to the same model:
"llama-3.3-70b-versatile"                    βœ…
"meta-llama/llama-3.3-70b-versatile"         βœ…
"llama3.3-70b-versatile"                     βœ…
"llama_3_3_70b_versatile"                    βœ…
"llama-3.3-70b"                              βœ…

Moonshot Kimi K2

// All of these resolve to the same model:
"moonshot/kimi-k2-instruct"                  βœ…
"moonshotai/kimi-k2-instruct"                βœ…
"kimi-k2"                                    βœ…
"kimi_k2_instruct"                           βœ…
"kimi-k2-instruct"                           βœ…

Whisper (Voice)

// All of these resolve to the same model:
"whisper-large-v3"                           βœ…
"whisper-v3"                                 βœ…
"whisper_large_v3"                           βœ…
"whisper-large-v3-turbo"                     βœ…

How It Works

Uplink's alias system uses intelligent normalization:

  1. Normalize separators: Convert all dashes/underscores to a standard format
  2. Strip prefixes: Remove provider prefixes for matching
  3. Version matching: Map short versions to full model names
  4. Provider fallback: Try variations if exact match fails
  5. Error prevention: Return helpful suggestions if no match found
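
The first three steps can be sketched as a normalization pass followed by a canonical-name lookup. This is an illustrative approximation, not the production resolver; the alias table holds only a few of the 80+ supported variations, taken from the examples above.

```javascript
// Illustrative approximation of steps 1-3 (not the production resolver).
// CANONICAL holds only a handful of the 80+ supported variations.
function normalizeModelName(name) {
  return name
    .toLowerCase()
    .replace(/^[^/]+\//, '')          // step 2: strip provider prefix
    .replace(/_/g, '-')               // step 1: unify separators
    .replace(/(\d)\.(\d)/g, '$1-$2')  // treat "3.3" and "3-3" alike
}

const CANONICAL = {
  'llama-3-3-70b-versatile': 'llama-3.3-70b-versatile',
  'llama-3-3-70b': 'llama-3.3-70b-versatile',  // step 3: short name -> full name
  'whisper-v3': 'whisper-large-v3',
  'whisper-large-v3': 'whisper-large-v3'
}

function resolveAlias(name) {
  return CANONICAL[normalizeModelName(name)] ?? null  // null -> suggest alternatives (step 5)
}

console.log(resolveAlias('meta-llama/llama_3_3_70b_versatile'))
// "llama-3.3-70b-versatile"
```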

Usage Example

# Use whatever naming style you prefer:
curl https://api.frnds.cloud/v1/chat/completions \
  -H "Authorization: Bearer ak_your_key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "llama_3_3_70b",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'

# This works exactly the same as:
curl https://api.frnds.cloud/v1/chat/completions \
  -H "Authorization: Bearer ak_your_key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "meta-llama/llama-3.3-70b-versatile",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'

🎯 Write Code The Way You Think

No more "model not found" errors from typos or naming confusion. Uplink accepts model names the way YOU write them. Focus on building features, not memorizing syntax.

πŸ‘οΈ Llama 4 Vision Models

Latest multimodal AI. Access Llama 4 Scout and Maverick - Groq's newest vision-language models with 128K context, tool calling, and superior image understanding.

Available Models

Llama 4 Scout 17B

meta-llama/llama-4-scout-17b-16e-instruct

17B params, 16 experts. Recommended for most vision tasks.

Llama 4 Maverick 17B

meta-llama/llama-4-maverick-17b-128e-instruct

17B params, 128 experts. Best for complex reasoning.

Key Capabilities

  • Multimodal Input: Process text + images in the same request
  • 128K Context: Long context window for document analysis
  • Tool Calling: Full function calling support with vision
  • JSON Mode: Structured output for vision tasks
  • Streaming: Real-time responses for vision analysis

Vision Example: Image Analysis

curl https://api.frnds.cloud/v1/chat/completions \
  -H "Authorization: Bearer ak_your_key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "meta-llama/llama-4-scout-17b-16e-instruct",
    "messages": [
      {
        "role": "user",
        "content": [
          {
            "type": "text",
            "text": "Describe this image in detail"
          },
          {
            "type": "image_url",
            "image_url": {
              "url": "https://example.com/image.jpg"
            }
          }
        ]
      }
    ]
  }'

Vision + Tools Example

from uplink_client import UplinkClient

client = UplinkClient(api_key="ak_your_key")

# Analyze image and extract structured data
response = await client.chat({
    "model": "meta-llama/llama-4-scout-17b-16e-instruct",
    "messages": [
        {
            "role": "user",
            "content": [
                {
                    "type": "text",
                    "text": "Extract all text from this receipt and calculate the total"
                },
                {
                    "type": "image_url",
                    "image_url": {"url": "https://example.com/receipt.jpg"}
                }
            ]
        }
    ],
    "tools": [
        {
            "type": "function",
            "function": {
                "name": "calculate_total",
                "description": "Calculate total from line items",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "items": {"type": "array"},
                        "tax": {"type": "number"}
                    }
                }
            }
        }
    ],
    "response_format": {"type": "json_object"}
})

print(response.choices[0].message.content)

Automatic Migration from Llama 3.2 Vision

If you're using the old vision models, they automatically redirect to Llama 4:

  • llama-3.2-90b-vision-preview → meta-llama/llama-4-scout-17b-16e-instruct
  • llama-3.2-11b-vision-preview → meta-llama/llama-4-scout-17b-16e-instruct

Your existing code continues to work with improved performance and lower costs.

Use Cases

  • Document Analysis: Extract data from invoices, receipts, forms
  • Image Understanding: Describe, classify, or analyze images
  • Visual Reasoning: Answer questions about images
  • OCR with Context: Extract text with semantic understanding
  • Multimodal RAG: Combine vision with document search

🚀 Production-Ready Vision AI

Llama 4 Scout and Maverick are deployed on Groq's ultra-fast infrastructure with 128K context. Perfect for document processing, image analysis, and multimodal applications that need both speed and accuracy.

🚀 SDK v2.0 - Built-in OAuth & Developer Onboarding

NEW! The Uplink SDK now includes complete OAuth authentication, sub-tenant management, and payment gateway integration. Build AI-powered apps with zero configuration.

Package

@frnd/uplink-sdk

Version

2.0.0

Platform

Node.js 18+

Installation

NPM (Recommended):

npm install @frnd/uplink-sdk

Quick Install (curl):

curl -fsSL https://api.frnds.cloud/sdk/install.sh | bash

Direct from URL:

# Install from Cloudflare Pages
npm install -g https://api.frnds.cloud/sdk/uplink-sdk-2.0.0.tgz

# Or from GitHub
npm install -g github:your-org/uplink-worker#main

Quick Start: 3-Step Setup

What's New: The SDK handles OAuth authentication, creates isolated sub-tenants, validates payments, and stores credentials automatically. No manual API key management needed!

Step 1: Initialize Developer Account (one-time)

import { initializeDeveloperAccount } from '@frnd/uplink-sdk'

// Opens browser for OAuth, creates sub-tenant, saves credentials
const result = await initializeDeveloperAccount({
  appName: 'My Awesome App',
  plan: 'pro', // starter, pro, or enterprise
  developerEmail: 'dev@example.com'
})

console.log('✓ Sub-tenant:', result.subTenant.id)
console.log('✓ API Key:', result.apiKey)
// Credentials saved to .uplink/config.json

Step 2: Use Auto-Configured Client

import { getUplinkClient } from '@frnd/uplink-sdk'

// Automatically loads credentials from .uplink/config.json
const uplink = await getUplinkClient()

const response = await uplink.chat({
  model: 'llama-3.3-70b-versatile',
  messages: [{ role: 'user', content: 'Hello!' }]
})

console.log(response.choices[0].message.content)

Step 3: Deploy Your App

That's it! Your app now has:

  • ✅ OAuth authentication
  • ✅ Isolated sub-tenant with quotas
  • ✅ Automatic credential management
  • ✅ Zero configuration for end users

Payment Gateway Integration

Monetize your app with built-in Stripe/Paddle support:

import { initializeDeveloperAccount } from '@frnd/uplink-sdk'

const result = await initializeDeveloperAccount({
  appName: 'My Paid App',
  plan: 'pro',
  requirePayment: true,
  paymentConfig: {
    provider: 'stripe',
    priceId: 'price_xxxxx', // Your Stripe price ID
    successUrl: 'https://myapp.com/success',
    cancelUrl: 'https://myapp.com/cancel'
  }
})

// Payment is validated before sub-tenant creation ✓

Custom Payment Validation:

const result = await initializeDeveloperAccount({
  appName: 'My App',
  requirePayment: true,
  paymentConfig: {
    provider: 'custom',
    validateFn: async (paymentToken) => {
      // Your custom validation logic
      return await myPaymentGateway.verify(paymentToken)
    }
  }
})

💰 Pricing SDK - Real-Time Cost Tracking

Enterprise Feature: The Pricing SDK provides real-time cost calculations and infrastructure pricing data. Perfect for building your own pricing models and showing users accurate cost estimates.

Setup:

import { createPricingSDK } from '@frnd/uplink-sdk'

const pricing = createPricingSDK({
  apiKey: 'your-api-key',
  baseUrl: 'https://api.frnds.cloud',
  useActualCosts: true,  // Use real usage data
  lookbackDays: 30        // 30-day average
})

Calculate Operation Costs:

// Get cost for a specific operation
const chatCost = await pricing.getOperationCost('chat', {
  inputTokens: 1000,
  outputTokens: 500
})

console.log(`Chat cost: $${chatCost.toFixed(6)}`)

// Calculate document processing cost
const docCost = await pricing.getOperationCost('document', {
  documentSizeKB: 500
})

// Calculate embedding cost
const embeddingCost = await pricing.getOperationCost('embedding', {
  inputTokens: 2000
})

Calculate Total User Costs:

// Calculate monthly cost for a user's usage
const monthlyCost = await pricing.calculateUserCost({
  documents: 100,       // Documents processed
  chats: 1000,          // Chat requests
  storageGB: 5,         // Storage used
  voiceMinutes: 60      // Voice transcription
})

console.log(`Monthly cost: $${monthlyCost.totalCost}`)
console.log(`Breakdown:`)
console.log(`  LLM: $${monthlyCost.breakdown.llm}`)
console.log(`  Storage: $${monthlyCost.breakdown.storage}`)
console.log(`  Voice: $${monthlyCost.breakdown.voice}`)

Get Infrastructure Pricing:

// Get full pricing configuration
const config = await pricing.getPricingConfig()

console.log('LLM Pricing:')
console.log(`  Input: $${config.llm.inputTokenCost} per 1M tokens`)
console.log(`  Output: $${config.llm.outputTokenCost} per 1M tokens`)

console.log('Embeddings:')
console.log(`  Cost: $${config.embeddings.costPerMillionTokens} per 1M tokens`)

console.log('Storage:')
console.log(`  Vectorize: $${config.vectorize.storageCostPer100MDimensions} per 100M dims`)
console.log(`  R2: $${config.r2.storageCostPerGB} per GB/month`)

Build Custom Pricing Tiers:

// Generate optimal pricing tiers based on usage patterns
const tiers = await pricing.generateTiers({
  targetMargin: 0.25,           // 25% margin
  expectedMonthlyUsers: 1000,
  avgDocumentsPerUser: 50,
  avgChatsPerUser: 500
})

console.log('Recommended Tiers:')
tiers.forEach(tier => {
  console.log(`${tier.name}: $${tier.price}/month`)
  console.log(`  Includes: ${tier.quotas.documents} docs, ${tier.quotas.chats} chats`)
  console.log(`  Margin: ${tier.margin}%`)
})

Accuracy

Based on real infrastructure costs with volume discounts

Updates

Real-time pricing synchronized from providers

Granularity

Per-operation, per-user, per-tenant cost tracking

API Client Features

The SDK includes a full-featured OpenAI-compatible client:

Chat Completions:

const response = await uplink.chat({
  model: 'llama-3.3-70b-versatile',
  messages: [
    { role: 'system', content: 'You are a helpful assistant.' },
    { role: 'user', content: 'Hello!' }
  ],
  temperature: 0.7,
  max_tokens: 1000
})

Streaming:

for await (const chunk of uplink.streamChat({
  model: 'llama-3.3-70b-versatile',
  messages: [{ role: 'user', content: 'Tell me a story...' }]
})) {
  process.stdout.write(chunk.choices[0]?.delta?.content || '')
}

Multi-Provider Arbitrage:

// Use canonical model name for automatic routing
const response = await uplink.chatWithArbitrage({
  model: 'llama-3.3-70b', // No provider suffix
  messages: [{ role: 'user', content: 'Hello!' }]
}, {
  mode: 'cost', // 'auto', 'cost', 'speed', 'quality'
  max_cost: 0.001
})

// Check savings
console.log(`Provider: ${response._arbitrage.provider_used}`)
console.log(`Saved: ${response._arbitrage.savings_percentage}%`)

Magic API (Ultra-Simple):

import { Magic } from '@frnd/uplink-sdk'

const magic = new Magic('https://api.frnds.cloud')

// Simple question
const answer = await magic.ask('What is quantum computing?')

// Web search
const results = await magic.search('latest AI developments 2024')

// Extract content
const content = await magic.extract('https://example.com')

// Conversational chat
const chat = magic.chat()
await chat.say('Hello!')
await chat.say('How do I use async/await?')
console.log(chat.history)

TypeScript Support

Full TypeScript support with comprehensive types:

import {
  UplinkClient,
  initializeDeveloperAccount,
  type OnboardingOptions,
  type OnboardingResult,
  type UplinkCredentials,
  type UplinkPricingConfig
} from '@frnd/uplink-sdk'

const options: OnboardingOptions = {
  appName: 'My App',
  plan: 'pro',
  requirePayment: true
}

const result: OnboardingResult = await initializeDeveloperAccount(options)
const credentials: UplinkCredentials = result.credentials

Configuration Management

The SDK stores credentials in .uplink/config.json:

{
  "UPLINK_URL": "https://api.frnds.cloud",
  "API_KEY": "ak_sub_xyz789...",
  "TENANT_ID": "sub_abc123",
  "APP_NAME": "My App",
  "PLAN": "pro",
  "CREATED_AT": "2025-11-01T12:00:00.000Z",
  "MASTER_API_KEY": "ak_master123...",
  "MASTER_TENANT_ID": "tenant_parent456"
}

Load Credentials:

import { loadConfig } from '@frnd/uplink-sdk'

const config = loadConfig() // Loads from .uplink/config.json
if (!config) {
  console.error('Run: npx @frnd/uplink-sdk init')
  process.exit(1)
}

console.log(`API Key: ${config.API_KEY}`)
console.log(`Tenant: ${config.TENANT_ID}`)

Security: Never commit .uplink/config.json to git. Add it to .gitignore immediately.

Plans & Quotas

Plan        Monthly Tokens  Daily Tokens  RPM  Concurrent  Storage
Starter     100K            5K            10   2           1 GB
Pro         1M              50K           60   10          10 GB
Enterprise  10M             500K          300  50          100 GB

Additional Resources

✨ Magic EVS - Zero-Config Document RAG

NEW! Upload files and chat with them in 2 API calls - no chunking, no embeddings, no configuration required.

Quick Example

# 1. Upload a file
curl -X POST https://api.frnds.cloud/v3/evs/magic/upload \
  -H "Authorization: Bearer ak_your_key" \
  -F "file=@employee-handbook.pdf"

# Response includes the source ID
# {"success":true,"sourceId":"magic_employee_handbook_pdf_123",...}

# 2. Ask questions
curl "https://api.frnds.cloud/v3/evs/magic/search?q=How%20many%20PTO%20days?" \
  -H "Authorization: Bearer ak_your_key"

Accuracy

98-100% - Hybrid search + progressive retrieval

Latency

50ms P50 - Multi-stage with early exit

Setup Time

1 API call - Zero configuration

Magic EVS Endpoints

POST /v3/evs/magic/upload

Upload a file with automatic chunking and indexing. Supports PDF, TXT, MD, JSON, CSV.

Field     Type    Description
file      File    File to upload (multipart/form-data)
chat_id   string  Associate with specific conversation
tags      string  Comma-separated tags (e.g., "hr,policies")
category  string  Document category for filtering

GET /v3/evs/magic/search

Semantic search across uploaded documents with automatic query optimization.

curl "https://api.frnds.cloud/v3/evs/magic/search?q=vacation%20policy&limit=10" \
  -H "Authorization: Bearer ak_your_key"

POST /v3/evs/magic/chat

Ask questions about your documents - RAG automatically retrieves context.

curl -X POST https://api.frnds.cloud/v3/evs/magic/chat \
  -H "Authorization: Bearer ak_your_key" \
  -H "Content-Type: application/json" \
  -d '{
    "message": "What is the PTO policy?",
    "source_id": "magic_handbook_pdf_123"
  }'

POST /v3/evs/magic/batch

Upload multiple files at once with automatic grouping.

curl -X POST https://api.frnds.cloud/v3/evs/magic/batch \
  -H "Authorization: Bearer ak_your_key" \
  -F "file1=@policy1.pdf" \
  -F "file2=@policy2.pdf" \
  -F "file3=@handbook.md" \
  -F "group_id=company_docs"

Why Magic EVS? Upload a file, get instant search and chat. No manual chunking, no embedding management, no vector DB setup. 98%+ accuracy with 50ms P50 latency. Production-ready RAG from a single upload call.

πŸŽ™οΈ Voice API - Speech-to-Text & Text-to-Speech

NEW! Production-ready voice capabilities with multi-provider support for transcription (STT) and synthesis (TTS). Build voice assistants, conversational AI, and audio workflows with minimal code.

Speech-to-Text

Groq Whisper models (large-v3, large-v3-turbo) with multi-language support

Text-to-Speech

Multiple providers: Groq PlayAI, OpenAI, ElevenLabs with 30+ voices

Voice Sessions

Stateful conversations with audio turn management and history

Supported Providers & Models

Speech-to-Text (STT)

Provider        Model                   Languages       Features
Groq (default)  whisper-large-v3        100+ languages  High accuracy, timestamps, segments
Groq            whisper-large-v3-turbo  100+ languages  Faster, lower latency

Text-to-Speech (TTS)

Provider        Model                                             Voices                                              Formats
Groq (default)  playai-tts                                        19 English voices (Fritz, Arista, Atlas, etc.)      WAV
Groq            playai-tts-arabic                                 4 Arabic voices (Ahmad, Amira, Khalid, Nasser)      WAV
OpenAI          tts-1, tts-1-hd                                   6 voices (alloy, echo, fable, onyx, nova, shimmer)  MP3, Opus, AAC, FLAC
ElevenLabs      eleven_monolingual_v1, eleven_multilingual_v1/v2  Bella, Rachel, Domi (+ custom voices)               MP3

Quick Start Examples

Transcribe Audio

curl -X POST https://api.frnds.cloud/v1/voice/transcribe \
  -H "Authorization: Bearer ak_your_key" \
  -F "audio=@recording.mp3" \
  -F "model=whisper-large-v3-turbo" \
  -F "language=en"

Synthesize Speech

curl -X POST https://api.frnds.cloud/v1/voice/synthesize \
  -H "Authorization: Bearer ak_your_key" \
  -H "Content-Type: application/json" \
  -d '{
    "text": "Hello, how can I help you today?",
    "voice": "Fritz-PlayAI",
    "provider": "groq",
    "speed": 1.0,
    "format": "wav"
  }'

Voice Conversation Session

# 1. Create session
SESSION_ID=$(curl -X POST https://api.frnds.cloud/v1/voice/sessions \
  -H "Authorization: Bearer ak_your_key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "llama-3.3-70b-versatile",
    "ttsProvider": "groq",
    "voice": "Fritz-PlayAI",
    "speed": 1.0,
    "language": "en"
  }' | jq -r '.sessionId')

# 2. Process audio turn (user speaks, assistant responds with audio)
curl -X POST https://api.frnds.cloud/v1/voice/sessions/$SESSION_ID/turn \
  -H "Authorization: Bearer ak_your_key" \
  -F "audio=@user_question.mp3"

# 3. Process text turn (typed input, audio response)
curl -X POST https://api.frnds.cloud/v1/voice/sessions/$SESSION_ID/text \
  -H "Authorization: Bearer ak_your_key" \
  -H "Content-Type: application/json" \
  -d '{"text": "What is the weather today?"}'

# 4. Get session info
curl https://api.frnds.cloud/v1/voice/sessions/$SESSION_ID \
  -H "Authorization: Bearer ak_your_key"

Voice Sessions Explained

Voice sessions provide stateful conversation management with:

  • Conversation History - Maintains context across multiple turns
  • Audio Caching - Optional caching of synthesized responses
  • Turn Management - Handles audio upload, transcription, LLM processing, and TTS in one call
  • Streaming Support - Server-Sent Events (SSE) for real-time updates
  • Metadata Tracking - Duration, turn count, timestamps, and cost estimates

Session Configuration

Setting             Type     Default       Description
voice               string   Fritz-PlayAI  Voice ID for TTS synthesis
speed               number   1.0           Speech speed (0.25-4.0)
silenceThresholdMs  number   1000          Silence detection threshold
turnTimeoutMs       number   5000          Maximum turn processing time
maxResponseLength   number   500           Max characters in assistant response
enableCache         boolean  true          Cache synthesized audio
language            string   en            Primary language code
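
A client might build the session body by merging these defaults with per-session overrides. This is a sketch under the assumption that the table above lists the server defaults; buildSessionConfig is our own helper, not an SDK export.

```javascript
// Sketch: build a POST /v1/voice/sessions body by merging the documented
// defaults with per-session overrides. buildSessionConfig is our helper,
// not an SDK export.
const SESSION_DEFAULTS = {
  voice: 'Fritz-PlayAI',
  speed: 1.0,
  silenceThresholdMs: 1000,
  turnTimeoutMs: 5000,
  maxResponseLength: 500,
  enableCache: true,
  language: 'en'
}

function buildSessionConfig(overrides = {}) {
  return { ...SESSION_DEFAULTS, ...overrides }
}

const config = buildSessionConfig({ voice: 'Arista-PlayAI', speed: 1.25 })
console.log(config.voice, config.turnTimeoutMs) // Arista-PlayAI 5000
```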

Supported Audio Formats

Input (Transcription): WAV, MP3, WebM, Ogg, Opus, AAC, FLAC

Output (Synthesis): Varies by provider

  • Groq: WAV only
  • OpenAI: MP3, Opus, AAC, FLAC
  • ElevenLabs: MP3

Pricing

Operation            Provider         Rate
Transcription (STT)  Groq Whisper     $0.05 per minute
Synthesis (TTS)      Groq PlayAI      $0.50 per 1M characters (est.)
Synthesis (TTS)      OpenAI tts-1     $0.015 per 1K characters
Synthesis (TTS)      OpenAI tts-1-hd  $0.030 per 1K characters

Note: ElevenLabs pricing varies by plan. Check your ElevenLabs dashboard for current rates.
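
As a worked example of the rates above, a quick back-of-the-envelope estimator (estimateVoiceCost is illustrative, not an SDK call; it uses the Groq STT and OpenAI tts-1 rates from the table):

```javascript
// Back-of-the-envelope estimator using the table's rates: Groq STT at
// $0.05/minute and OpenAI tts-1 at $0.015 per 1K characters.
// estimateVoiceCost is illustrative, not an SDK call.
function estimateVoiceCost({ transcriptionMinutes = 0, ttsCharacters = 0 }) {
  const sttCost = transcriptionMinutes * 0.05
  const ttsCost = (ttsCharacters / 1000) * 0.015
  return +(sttCost + ttsCost).toFixed(6)
}

// 10 minutes of transcription plus 2,000 synthesized characters:
console.log(estimateVoiceCost({ transcriptionMinutes: 10, ttsCharacters: 2000 }))
// 0.53
```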

Why Uplink Voice? Multi-provider support with automatic failover. Session-based conversation management with context retention. Supports 100+ languages for transcription and 30+ voices for synthesis. Production-ready with cost tracking and usage analytics.

Authentication

All API endpoints require an API key. Keys are tenant-scoped and can be rotated without downtime. Three authentication patterns are accepted for backward compatibility:

Method                Format                        Use Case
Authorization header  Authorization: Bearer ak_xxx  Recommended for all OpenAI-compatible clients.
Query string          ?api_key=ak_xxx               Ollama-compatible integrations and simple demos.
Request body          { "api_key": "ak_xxx" }       Legacy SDKs that cannot set headers.

Uplink enforces global and per-key rate limiting, KV-backed quota tracking, and tenant suspension checks before any request is proxied to a provider.
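
The three accepted patterns can be sketched as plain request transformations (no network involved). withApiKey is a hypothetical helper for illustration; only the field placements mirror the table above.

```javascript
// Hypothetical helper showing where the key goes in each documented pattern.
// It builds plain request descriptions; no network call is made.
function withApiKey(request, key, method = 'header') {
  const req = { ...request, headers: { ...(request.headers || {}) } }
  if (method === 'header') {
    req.headers['Authorization'] = `Bearer ${key}`        // recommended
  } else if (method === 'query') {
    req.url = `${req.url}?api_key=${encodeURIComponent(key)}`
  } else if (method === 'body') {
    req.body = { ...(req.body || {}), api_key: key }      // legacy SDKs only
  }
  return req
}

const base = { url: 'https://api.frnds.cloud/v1/chat/completions' }
console.log(withApiKey(base, 'ak_xxx').headers.Authorization) // "Bearer ak_xxx"
```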

πŸ” Self-Service Signup & Signin

Uplink provides secure email-verified authentication endpoints for self-service user onboarding with automatic tier-based API token generation.

POST /public/signup

Create a new account and receive verification code via email (expires in 15 minutes).

curl -X POST https://api.frnds.cloud/public/signup \
  -H "Content-Type: application/json" \
  -d '{
    "email": "user@example.com",
    "first_name": "Jane",
    "last_name": "Doe",
    "phone": "+1234567890"
  }'

POST /public/verify-email

Verify email with 6-digit code and receive dev token (7 days, 5 requests, 50K tokens).

curl -X POST https://api.frnds.cloud/public/verify-email \
  -H "Content-Type: application/json" \
  -d '{
    "email": "user@example.com",
    "code": "123456"
  }'

POST /public/signin

Sign in to existing account and receive verification code via email.

curl -X POST https://api.frnds.cloud/public/signin \
  -H "Content-Type: application/json" \
  -d '{
    "email": "user@example.com"
  }'

POST /public/verify-signin

Verify signin code and retrieve your active API token.

curl -X POST https://api.frnds.cloud/public/verify-signin \
  -H "Content-Type: application/json" \
  -d '{
    "email": "user@example.com",
    "code": "654321"
  }'

Security: All codes are time-limited (15 min), single-use, and require email verification. Failed attempts are logged for security monitoring. Tokens are only shown once at creation.

📱 Telegram Login Widget

NEW! Passwordless signup and signin using Telegram. Automatic account creation with 30-day developer tier (100 requests/day).

POST /auth/telegram/verify

Verify Telegram Login Widget authentication data and create/sign-in user.

curl -X POST https://api.frnds.cloud/auth/telegram/verify \
  -H "Content-Type: application/json" \
  -d '{
    "id": 123456789,
    "first_name": "John",
    "last_name": "Doe",
    "username": "johndoe",
    "photo_url": "https://...",
    "auth_date": 1698765432,
    "hash": "abc123..."
  }'

Response

{
  "success": true,
  "message": "Welcome back, John!",
  "session_id": "sess_abc123...",
  "token": "ak_dev_xyz789...",
  "user": {
    "id": "user_xyz789",
    "telegram_id": 123456789,
    "tier": "telegram_signup",
    "first_name": "John",
    "username": "johndoe"
  }
}

POST /api/link-telegram

Link Telegram account to an existing email-based account. Requires authenticated session.

curl -X POST https://api.frnds.cloud/api/link-telegram \
  -H "Content-Type: application/json" \
  -H "Cookie: uplink_session=sess_..." \
  -d '{
    "id": 123456789,
    "first_name": "John",
    "username": "johndoe",
    "auth_date": 1698765432,
    "hash": "abc123..."
  }'

Response

{
  "success": true,
  "message": "Telegram account linked successfully",
  "linked_methods": ["email", "telegram"]
}

Benefits of Telegram Auth

  • Instant signup - No email verification needed
  • Secure - Cryptographically signed by Telegram
  • Generous limits - 30-day trial with 100 requests/day
  • Multi-auth - Link both email and Telegram to one account

User Dashboard & Session Management

NEW! Authenticated dashboard for viewing usage, managing tokens, and upgrading tiers. Uses HttpOnly cookies for security.

Create Session

POST /auth/session

Create an authenticated session using your API token. Returns HttpOnly cookie for dashboard access.

curl -X POST https://api.frnds.cloud/auth/session \
  -H "Content-Type: application/json" \
  -d '{
    "token": "ak_dev_your_token_here"
  }'

Response

{
  "success": true,
  "message": "Session created",
  "session_id": "sess_abc123..."
}

Get Dashboard Data

GET /api/dashboard

Retrieve user information, usage statistics, and limits. Requires authenticated session (cookie).

curl https://api.frnds.cloud/api/dashboard \
  -H "Cookie: uplink_session=sess_..."

Response

{
  "user": {
    "id": "user_xyz789",
    "email": "user@example.com",
    "first_name": "Jane",
    "last_name": "Doe",
    "tier": "email_verified",
    "created_at": "2025-10-20T10:30:00Z",
    "expires_at": "2025-10-27T10:30:00Z",
    "telegram_id": 123456789,
    "telegram_username": "janedoe",
    "linked_auth_methods": ["email", "telegram"],
    "can_upgrade": true,
    "needs_email": false
  },
  "token": "ak_dev_your_token_here",
  "usage": {
    "total_requests": 45,
    "total_tokens": 12500
  },
  "limits": {
    "max_requests": 100,
    "max_tokens": 100000,
    "expires_in_days": 7
  }
}

Create Checkout Session

GET /api/create-checkout

Create Stripe checkout session for tier upgrade. Requires authenticated session and verified email.

curl "https://api.frnds.cloud/api/create-checkout?tier=tier2" \
  -H "Cookie: uplink_session=sess_..."

Response

{
  "url": "https://checkout.stripe.com/c/pay/cs_...",
  "session_id": "cs_..."
}

Sign Out

POST /api/signout

Destroy session and sign out. Clears HttpOnly cookie.

curl -X POST https://api.frnds.cloud/api/signout \
  -H "Cookie: uplink_session=sess_..."

SDK Usage

JavaScript/TypeScript

import { UplinkClient } from 'uplink-client'

const client = new UplinkClient({
  baseUrl: 'https://api.frnds.cloud'
})

// Create session (cookies handled automatically in browser)
await client.createSession('ak_dev_your_token')

// Get dashboard data
const dashboard = await client.getDashboard()
console.log(`Usage: ${dashboard.usage.total_requests}/${dashboard.limits.max_requests}`)

// Create checkout for upgrade
if (dashboard.user.can_upgrade) {
  const checkout = await client.createCheckout('tier2')
  window.location.href = checkout.url
}

Python

from uplink_client import UplinkClient

client = UplinkClient(
    base_url="https://api.frnds.cloud"
)

# Create session (note: cookie handling requires requests session)
session_resp = await client.create_session("ak_dev_your_token")

# Get dashboard data
dashboard = await client.get_dashboard()
print(f"Usage: {dashboard.usage.total_requests}/{dashboard.limits.max_requests}")

# Create checkout for upgrade
if dashboard.user.can_upgrade:
    checkout = await client.create_checkout("tier2")
    print(f"Checkout URL: {checkout.url}")

Security Notes

  • HttpOnly cookies - Session tokens not accessible via JavaScript
  • 30-day sessions - Automatic expiration
  • Email required for upgrades - Telegram-only users must add email first
  • Stripe webhooks - Automatic tier upgrades after payment

Pricing & Billing

Uplink offers tiered subscription plans with included tokens and pay-as-you-go overage billing.

Billing Tiers

Tier Price Included Tokens Overage Rate Rate Limits
Free $0/month 100K tokens $0.15 per 1M tokens 60 req/hour, 1 req/min
Hobby $9/month 500K tokens $0.12 per 1M tokens (20% savings) 600 req/hour, 10 req/min
Starter $29/month 2M tokens $0.10 per 1M tokens (33% savings) 6K req/hour, 100 req/min
Pro $99/month 10M tokens $0.08 per 1M tokens (47% savings) 30K req/hour, 500 req/min

How Billing Works

  • Base Subscription: Monthly charge for your tier with included token allowance
  • Metered Usage: Tokens beyond your included amount are billed at your tier's overage rate
  • Hourly Reporting: Token usage is aggregated hourly and reported to Stripe
  • Promo Codes: Apply discount codes during checkout or in your billing portal
  • No Surprises: Monitor your usage in real-time via response headers
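
The overage arithmetic is simple enough to verify locally. A sketch using the numbers from the tier table (the helper name is illustrative):

```python
def monthly_bill(tokens_used: int, base_price: float,
                 included_tokens: int, overage_rate_per_m: float) -> float:
    """Base subscription plus metered overage beyond the included tokens."""
    overage_tokens = max(0, tokens_used - included_tokens)
    return round(base_price + overage_tokens / 1_000_000 * overage_rate_per_m, 4)

# Hobby ($9/mo, 500K included, $0.12 per 1M overage) at 600K tokens used:
# 100K overage tokens cost $0.012, so the month totals $9.012
```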

Usage Tracking Headers

Every API response includes usage tracking headers:

X-Token-Usage: 12450              # Total tokens used this period
X-Token-Limit: 500000             # Included tokens for your tier
X-Token-Remaining: 487550         # Tokens remaining before overage
X-Token-Reset: 1735689600         # Unix timestamp when usage resets
X-Token-Overage: 0                # Overage tokens (if any)
X-Estimated-Overage-Cost: 0.00    # Estimated overage cost in USD
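
Clients can watch these headers to react before hitting overage. A sketch that reads the fields from any response's header mapping (`parse_usage_headers` and `near_limit` are illustrative helpers, not SDK methods):

```python
def parse_usage_headers(headers: dict) -> dict:
    """Pull the X-Token-* tracking fields out of a response's headers."""
    return {
        "used": int(headers["X-Token-Usage"]),
        "limit": int(headers["X-Token-Limit"]),
        "remaining": int(headers["X-Token-Remaining"]),
        "reset_at": int(headers["X-Token-Reset"]),
        "overage": int(headers["X-Token-Overage"]),
        "overage_cost": float(headers["X-Estimated-Overage-Cost"]),
    }

def near_limit(usage: dict, threshold: float = 0.9) -> bool:
    """True once the period's usage crosses `threshold` of the allowance."""
    return usage["used"] >= threshold * usage["limit"]
```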

Billing API Endpoints

GET /api/billing/tiers

Get all available billing tiers with pricing details.

curl https://api.frnds.cloud/api/billing/tiers

Response:

[
  {
    "id": "free",
    "name": "Free",
    "price_monthly": 0,
    "included_tokens": 100000,
    "overage_rate": 0.15,
    "rate_limit": {
      "requests_per_hour": 60,
      "requests_per_minute": 1
    },
    "features": [
      "100K tokens/month",
      "Pay-as-you-go after limit",
      "Community support"
    ]
  }
  // ... more tiers
]

POST /api/billing/promo-code

Apply a promotional code to your subscription.

curl -X POST https://api.frnds.cloud/api/billing/promo-code \
  -H "Cookie: uplink_session=YOUR_SESSION_ID" \
  -H "Content-Type: application/json" \
  -d '{"code": "LAUNCH2025"}'

Response:

{
  "success": true,
  "message": "Promo code applied successfully",
  "discount": {
    "id": "di_123abc",
    "coupon_id": "LAUNCH2025",
    "percent_off": 20,
    "duration": "repeating",
    "duration_in_months": 3
  }
}

GET /api/billing/portal

Get URL for Stripe customer billing portal.

curl https://api.frnds.cloud/api/billing/portal \
  -H "Cookie: uplink_session=YOUR_SESSION_ID"

Response:

{
  "url": "https://billing.stripe.com/session/..."
}

SDK Examples

JavaScript/TypeScript

import { UplinkClient } from '@frnd/uplink-sdk'

const client = new UplinkClient({
  baseUrl: 'https://api.frnds.cloud',
  apiKey: 'your-api-key'
})

// Get billing info
const billing = await client.getBillingInfo()
console.log(`Tier: ${billing.tier}`)
console.log(`Status: ${billing.subscription_status}`)

// Get usage details
const usage = await client.getUsageDetails()
console.log(`Tokens used: ${usage.tokens_used} / ${usage.tokens_included}`)
console.log(`Overage cost: $${usage.estimated_cost}`)

// Get available tiers
const tiers = await client.getBillingTiers()
for (const tier of tiers) {
  console.log(`${tier.name}: $${tier.price_monthly}/mo`)
}

// Apply promo code
const result = await client.applyPromoCode('LAUNCH2025')
if (result.success && result.discount) {
  console.log(`Applied ${result.discount.percent_off}% discount`)
}

// Get billing portal URL
const portal = await client.getBillingPortalUrl()
console.log(`Manage subscription: ${portal.url}`)

Python

from uplink_client import UplinkClient

client = UplinkClient(
    base_url='https://api.frnds.cloud',
    api_key='your-api-key'
)

# Get billing info
billing = await client.get_billing_info()
print(f"Tier: {billing.tier}")
print(f"Status: {billing.subscription_status}")

# Get usage details
usage = await client.get_usage_details()
print(f"Tokens: {usage.tokens_used} / {usage.tokens_included}")
print(f"Overage: ${usage.estimated_cost:.2f}")

# Get available tiers
tiers = await client.get_billing_tiers()
for tier in tiers:
    print(f"{tier.name}: ${tier.price_monthly}/mo")

# Apply promo code
result = await client.apply_promo_code('LAUNCH2025')
if result.success and result.discount:
    print(f"Applied discount: {result.discount.percent_off}%")

# Get billing portal
portal_url = await client.get_billing_portal_url()
print(f"Manage subscription: {portal_url}")

💡 Billing Best Practices

  • Monitor usage via response headers to avoid unexpected overage
  • Use the billing portal to manage payment methods and view invoices
  • Apply promo codes before upgrading to maximize savings
  • Free tier has hard limits - upgrade for overage billing
  • Paid tiers allow unlimited overage at transparent rates

Pricing API (for Arepo & Partners)

The Pricing API provides real-time infrastructure costs and pricing calculations for building your own pricing models. Designed for Arepo and other platform integrators who need accurate cost data to set their own margins and tier pricing.

Enterprise Pricing: First-party tenants (like Arepo) receive 0% markup and enterprise volume discounts (15-20% off). Set tenant_id=arepo in requests to enable this pricing.

Quick Start

All endpoints require Bearer token authentication:

curl https://api.frnds.cloud/v3/pricing/config \
  -H "Authorization: Bearer YOUR_API_KEY"

Endpoints

GET /v3/pricing/config

Get complete infrastructure cost configuration with volume discounts applied.

Query Parameters:

  • tenant_id - Tenant identifier (use "arepo" for enterprise pricing)
  • markup - Optional markup percentage (0.0-0.5) for non-first-party tenants

curl "https://api.frnds.cloud/v3/pricing/config?tenant_id=arepo" \
  -H "Authorization: Bearer YOUR_API_KEY"

Response:

{
  "version": "2025-01-27",
  "lastUpdated": "2025-01-27T00:00:00Z",
  "llm": {
    "inputTokenCost": 0.00000008,
    "outputTokenCost": 0.00000024,
    "avgTokensPerRequest": 750
  },
  "embeddings": {
    "costPerMillionTokens": 0.02,
    "costPerRequest": 0.000015,
    "dimensions": 768
  },
  "vectorize": {
    "storageCostPer100MDimensions": 0.25,
    "queryCostPer1MDimensions": 0.0025,
    "indexMaintenanceCost": 0.01
  },
  "voice": {
    "whisperCostPerHour": 0.006,
    "ttsCostPerMillionChars": 15.0,
    "avgTranscriptionMinutes": 1.5,
    "avgSynthesisChars": 500
  }
}

GET /v3/pricing/operations

Calculate cost for a specific operation.

Query Parameters:

  • operation - Operation type: chat, embedding, vector_search, voice_transcribe, voice_synthesize
  • inputTokens - Number of input tokens
  • outputTokens - Number of output tokens
  • dimensions - Vector dimensions (for embedding/search)
  • minutes - Voice minutes (for transcription)
  • characters - Text characters (for TTS)
  • tenant_id - Tenant for volume discounts

curl "https://api.frnds.cloud/v3/pricing/operations?operation=chat&inputTokens=500&outputTokens=500&tenant_id=arepo" \
  -H "Authorization: Bearer YOUR_API_KEY"

Response:

{
  "cost": 0.00016,
  "currency": "USD",
  "per": "request",
  "operation": "chat",
  "params": {
    "inputTokens": 500,
    "outputTokens": 500
  }
}
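
The returned cost is just the per-token rates from /v3/pricing/config applied to the token counts, which you can reproduce locally. A sketch (the helper name is illustrative):

```python
def chat_operation_cost(config: dict, input_tokens: int, output_tokens: int) -> float:
    """Recompute a chat operation's cost from /v3/pricing/config rates."""
    llm = config["llm"]
    return round(input_tokens * llm["inputTokenCost"]
                 + output_tokens * llm["outputTokenCost"], 8)

# Rates from the /v3/pricing/config example above
config = {"llm": {"inputTokenCost": 0.00000008, "outputTokenCost": 0.00000024}}
# 500 input + 500 output tokens: 0.00004 + 0.00012 = 0.00016
```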

POST /v3/pricing/calculate

Calculate total monthly cost for a user's usage pattern.

curl -X POST https://api.frnds.cloud/v3/pricing/calculate \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "documents": 50,
    "chats": 200,
    "storageGB": 0.5,
    "voiceMinutes": 10,
    "ttsCharacters": 2000,
    "tenantId": "arepo"
  }'

Response:

{
  "totalCost": 0.32,
  "breakdown": {
    "llm": 0.15,
    "evs": 0.12,
    "voice": 0.03,
    "infrastructure": 0.02
  },
  "tierRecommendation": "free",
  "estimatedRequests": 200,
  "estimatedTokens": 150000
}

POST /v3/pricing/tiers

Generate recommended tier pricing based on parameters.

curl -X POST https://api.frnds.cloud/v3/pricing/tiers \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "npMultiplier": 1.0,
    "targetMargin": 0.50,
    "markup": 0.0,
    "tenantId": "arepo"
  }'

Response:

[
  {
    "id": "free",
    "name": "Free",
    "npMultiplier": 1.0,
    "price": 0,
    "includedValue": 100000,
    "overageRate": 0.15,
    "costToServe": 1.5,
    "netProfit": -1.5,
    "profitMargin": -1.0
  },
  {
    "id": "hobby",
    "name": "Hobby",
    "npMultiplier": 1.0,
    "price": 9,
    "includedValue": 500000,
    "overageRate": 0.12,
    "costToServe": 7.5,
    "netProfit": 1.5,
    "profitMargin": 0.17
  }
  // ... more tiers
]

SDK Integration

For TypeScript/JavaScript projects, use the Pricing SDK for type-safe access:

import { createPricingSDK } from '@uplink/pricing-sdk'

const pricing = createPricingSDK({
  apiKey: process.env.UPLINK_API_KEY!,
  tenantId: 'arepo',
  markup: 0.0,           // 0% for first-party
  isFirstParty: true,    // Gets enterprise volume discounts
  cacheEnabled: true,
  cacheTTL: 3600         // Cache for 1 hour
})

// Get infrastructure costs
const costs = await pricing.getPricingConfig()

// Calculate user cost
const userCost = await pricing.calculateUserCost({
  documents: 50,
  chats: 200,
  storageGB: 0.5
})

// Get operation cost
const chatCost = await pricing.getOperationCost('chat', {
  inputTokens: 500,
  outputTokens: 500
})

// Generate recommended tiers
const tiers = await pricing.generateTiers({
  npMultiplier: 1.0,
  targetMargin: 0.50
})

Python SDK

import os

from uplink_pricing import PricingClient

pricing = PricingClient(
    api_key=os.environ["UPLINK_API_KEY"],
    tenant_id="arepo",
    markup=0.0,
    is_first_party=True
)

# Get infrastructure costs
costs = pricing.get_pricing_config()

# Calculate user cost
user_cost = pricing.calculate_user_cost(
    documents=50,
    chats=200,
    storage_gb=0.5
)

# Get operation cost
chat_cost = pricing.get_operation_cost(
    operation="chat",
    input_tokens=500,
    output_tokens=500
)

Caching Recommendation: Infrastructure costs are updated hourly. Cache responses for at least 30 minutes to reduce API calls.

Additional Endpoints

  • GET /v3/pricing/np-impact - Calculate impact of NP multiplier changes
  • POST /v3/pricing/feature-impact - Calculate impact of feature toggles on pricing
  • GET /v3/pricing/optimize-freemium - Calculate optimal freemium limits for a cost target
  • GET /v3/pricing/refresh - Force refresh pricing cache (admin only)

See the Arepo Integration Guide for complete documentation and examples.

Rate Limits & Usage Controls

  • Global IP limit: All requests pass through a global limiter; violations return 429 with too_many_requests.
  • Authentication limiter: Failed logins are throttled separately to block brute force attacks.
  • Per-key limiter: Every API key has a dedicated bucket; rate-limit headers mirror OpenAI's X-RateLimit-Remaining semantics.
  • Quota tracking: UsageTracker stores daily and monthly usage in KV. If a tenant exceeds quota, responses return 429 quota_exceeded.
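
A client-side retry wrapper for 429 responses might look like the following sketch; `RateLimitError` stands in for whatever your HTTP client raises on 429, and real code would also honor any Retry-After header:

```python
import time

class RateLimitError(Exception):
    """Stand-in for an HTTP client's 429 error (too_many_requests / quota_exceeded)."""

def with_backoff(call, max_retries: int = 5, base_delay: float = 1.0, sleep=time.sleep):
    """Retry `call` on rate-limit errors with exponential backoff."""
    for attempt in range(max_retries):
        try:
            return call()
        except RateLimitError:
            if attempt == max_retries - 1:
                raise
            sleep(base_delay * 2 ** attempt)  # 1s, 2s, 4s, ...
```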

Response Model

Successful chat responses include the standard OpenAI payload plus an _arbitrage block describing the provider decision:

{
  "id": "chatcmpl-123",
  "object": "chat.completion",
  "created": 1732400000,
  "model": "llama-3.3-70b-versatile",
  "choices": [ ... ],
  "usage": { "prompt_tokens": 120, "completion_tokens": 45, "total_tokens": 165 },
  "_arbitrage": {
    "provider_used": "groq",
    "model_used": "llama-3.3-70b-versatile",
    "arbitrage_mode": "auto",
    "reasoning": "Selected for 23% cost savings",
    "savings_percentage": 23.1,
    "estimated_cost": 0.0007
  }
}

Error payloads follow the OpenAI structure with an error object that includes message, type, and code. Admin and workflow routes return detailed validation errors when Zod schema validation fails.

πŸŽ™οΈ Voice API (ElevenLabs Compatible)

Production-ready text-to-speech service with telephony support. Drop-in replacement for ElevenLabs API with additional features for phone systems.

Key Features

  • 100% ElevenLabs Compatible: Works with existing ElevenLabs code - no changes required
  • Telephony Support: µ-law (G.711) encoding, 8kHz/16kHz formats for phone systems
  • Multi-Provider: Automatic selection between Groq, OpenAI, Azure, ElevenLabs, PlayHT
  • Voice Caching: Automatic caching of repeated phrases for instant responses
  • 30% Cost Savings: Provider arbitrage reduces TTS costs automatically

Available Voices

Voice ID Name Description ElevenLabs Compatibility ID
uplink_rachel_001 Rachel Professional, clear narration 21m00Tcm4TlvDq8ikWAM
uplink_bella_002 Bella Friendly, warm tone EXAVITQu4vr4xnSDxMaL
uplink_antoni_003 Antoni Professional male voice ErXwobaYiN019PkySvjV
uplink_elli_004 Elli Energetic, cheerful MF3mGyEYCl7XYWbV9V6O
uplink_josh_005 Josh Casual, friendly TxGEqnHWrfWFTfGW9XjX

Text-to-Speech Endpoint

POST /v1/text-to-speech/{voice_id}

Convert text to speech with specified voice. Returns audio data in requested format.

curl -X POST https://api.frnds.cloud/v1/text-to-speech/uplink_rachel_001 \
  -H "Authorization: Bearer YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "text": "Hello from Uplink Voice API",
    "model_id": "eleven_multilingual_v2",
    "voice_settings": {
      "stability": 0.75,
      "similarity_boost": 0.75
    },
    "output_format": "mp3_44100_128"
  }' --output voice.mp3

Telephony Formats

Format Description Use Case
mp3_8000_16 8kHz MP3 @ 16kbps Low bandwidth telephony (recommended)
mp3_16000_32 16kHz MP3 @ 32kbps Balanced quality VoIP
pcm_8000 8kHz PCM Basic telephony (limited support)
pcm_16000 16kHz PCM HD Voice/VoIP (limited support)

Streaming TTS

POST /v1/text-to-speech/{voice_id}/stream

Stream audio chunks as they're generated. Returns Server-Sent Events (SSE).

const response = await fetch('https://api.frnds.cloud/v1/text-to-speech/uplink_rachel_001/stream', {
  method: 'POST',
  headers: {
    'Authorization': 'Bearer YOUR_KEY',
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({
    text: 'Long text to stream...',
    output_format: 'mp3_44100_128',
    generation_config: {
      chunk_length_schedule: [120, 160, 250, 290],
      streaming_latency: 0
    }
  })
})

const reader = response.body.getReader()
// Process streaming chunks...

Voice Agent

POST /v1/voice/agent

Combine LLM intelligence with voice synthesis for conversational AI.

const response = await fetch('https://api.frnds.cloud/v1/voice/agent', {
  method: 'POST',
  headers: {
    'Authorization': 'Bearer YOUR_KEY',
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({
    messages: [
      { role: 'system', content: 'You are a helpful assistant.' },
      { role: 'user', content: 'What is the weather like?' }
    ],
    voice_config: {
      voice_id: 'uplink_rachel_001',
      language: 'en-US',
      speaking_rate: 1.1,
      emotion: 'friendly'
    },
    response_config: {
      include_text: true,
      audio_format: 'mp3_44100_128'
    },
    model: 'llama-3.1-8b-instant',
    enable_voice: true
  })
})

Migration from ElevenLabs

No code changes required! Just update your base URL and API key:

// Before (ElevenLabs)
const BASE_URL = 'https://api.elevenlabs.io/v1'
const API_KEY = 'your_elevenlabs_key'

// After (Uplink)
const BASE_URL = 'https://api.frnds.cloud/v1'
const API_KEY = 'your_uplink_key'

// Your existing code works unchanged!

Both Uplink voice IDs and ElevenLabs IDs are supported, so existing integrations work immediately.

📄 Full Documentation

For complete Voice API documentation including Twilio integration, SSML support, and SDK examples, see the Voice API Integration Guide.

Endpoint Reference

POST /v3/chat/completions

Create a chat completion using the arbitrage engine. Supports streaming, tool invocation, JSON mode, and structured response metadata.

Field Type Description
model string Canonical model slug (e.g. llama-3.3-70b-versatile). Use /v3/models for the full catalog.
messages Message[] OpenAI-style conversation array with role and content.
arbitrage_mode string auto (default), cost, speed, quality, or manual.
max_latency number Upper bound in milliseconds; providers exceeding this are skipped.
min_quality number Minimum quality score between 0 and 1.
max_cost number Cap the per-request USD spend.
stream boolean Emit SSE chunks that mirror OpenAI streaming responses.
tools Tool[] Function-calling definitions. Provider selection respects tool compatibility.

GET /v3/models

Returns all available models with pricing, provider coverage, and capability metadata aggregated from Groq, Together, and OpenRouter.

GET /v3/models/shared

Lists models that have cross-provider coverage—ideal arbitrage targets.

POST /v3/arbitrage/route

Dry-run routing for a proposed request. Returns the selected provider, canonical model, reasoning, and predicted savings without invoking the downstream API.

GET /v3/arbitrage/opportunities

Raw opportunity ledger from the arbitrage database including provider spreads and health signals.

GET /v3/arbitrage/health

System health snapshot, sync cadence, and the timestamp of the latest provider refresh.

GET /v3/arbitrage/report

Plain-text summary of current arbitrage conditions—perfect for CLI dashboards.

🆕 Deprecation Response Fields

When requesting a deprecated model, the API automatically resolves to the best alternative and returns deprecation information via response headers and metadata.

Response Headers (NEW)

Header Description Example Value
X-Model-Deprecated Boolean flag indicating model was deprecated true
X-Model-Original The original model you requested llama-3.2-90b-vision-preview
X-Model-Resolved-To The model that actually handled your request meta-llama/llama-4-scout-17b-16e-instruct
X-Deprecation-Source How the deprecation was resolved MODEL_MAPPING

Response Metadata Object (NEW)

When a deprecated model is used, the response includes a _deprecation object with detailed migration information:

{
  "choices": [...],
  "_deprecation": {
    "original_model": "llama-3.2-90b-vision-preview",
    "resolved_model": "meta-llama/llama-4-scout-17b-16e-instruct",
    "resolution_source": "MODEL_MAPPING",
    "deprecated_since": "2025-04-07",
    "recommended_alternative": "meta-llama/llama-4-scout-17b-16e-instruct",
    "reasoning": "Model explicitly mapped via MODEL_MAPPING"
  }
}

Field Type Description
original_model string The model name from your request
resolved_model string The actual model that processed your request
resolution_source string How the system resolved the deprecation: MODEL_MAPPING, PROVIDER_REDIRECT, or CAPABILITY_MATCH
deprecated_since string (optional) ISO 8601 date when the model was deprecated
recommended_alternative string The recommended model to use going forward
reasoning string (optional) Human-readable explanation of why the model was deprecated

⚠️ Migration Recommendation

When you see deprecation headers or metadata in your responses, plan to update your code to use the recommended_alternative model. The automatic resolution ensures zero downtime, but updating to the recommended model provides better performance and future-proofs your application.

Voice API Endpoints

POST /v1/voice/sessions

Create a new voice conversation session with configurable TTS/STT providers and voice settings. Returns session ID and endpoint URLs for conversation turns.

Field Type Description
model string LLM model for conversation (e.g. llama-3.3-70b-versatile)
ttsProvider string TTS provider: groq, openai, or elevenlabs
voice string Voice ID (e.g. Fritz-PlayAI, nova, Bella)
speed number Speech speed multiplier (0.25-4.0, default: 1.0)
language string Primary language code (default: en)

POST /v1/voice/transcribe

Transcribe audio to text using Groq Whisper models. Supports 100+ languages with optional language hints. Returns text, language detection, duration, and timestamped segments.

Field Type Description
audio File Audio file (WAV, MP3, WebM, Ogg, Opus, AAC, FLAC)
model string whisper-large-v3 or whisper-large-v3-turbo (default)
language string Optional language hint (ISO 639-1 code)

POST /v1/voice/synthesize

Convert text to speech using Groq PlayAI, OpenAI, or ElevenLabs TTS. Returns base64-encoded audio data with duration and cost estimates.

Field Type Description
text string Text to synthesize (required)
provider string TTS provider: groq, openai, elevenlabs
voice string Voice ID (provider-specific, see /v1/voice/voices)
model string Model ID (e.g. playai-tts, tts-1-hd)
speed number Speech speed (0.25-4.0, default: 1.0)
format string Audio format (provider-dependent: wav, mp3, opus, aac, flac)

POST /v1/voice/sessions/:sessionId/turn

Process a complete voice conversation turn: transcribe user audio, generate LLM response, synthesize assistant speech. Returns both text and audio responses with metadata.

Field Type Description
audio File User audio input file

Response: userText, assistantText, assistantAudio (base64), audioFormat, duration, metadata

POST /v1/voice/sessions/:sessionId/text

Process a text-based conversation turn with audio response. Same as /turn but accepts typed text instead of audio input.

Field Type Description
text string User text input

POST /v1/voice/sessions/:sessionId/stream

Stream a voice conversation turn with Server-Sent Events. Emits events for transcription, LLM generation, and TTS synthesis stages in real-time.

Event Types: transcription_start, transcription_complete, llm_start, llm_chunk, llm_complete, tts_start, tts_chunk, tts_complete, conversation_complete, error, done

GET /v1/voice/sessions/:sessionId

Retrieve voice session information including settings, model configuration, metadata (duration, turn count, timestamps), and conversation history length.

DELETE /v1/voice/sessions/:sessionId

Delete a voice session and all associated conversation history. Returns { success: true } on successful deletion.

GET /v1/voice/voices

List available voices for the specified TTS provider. Returns array of voice objects with id, name, and language fields.

Query Parameter Type Description
provider string TTS provider to query: groq, openai, elevenlabs

🚀 Advanced RAG Features

NEW! Production-tested retrieval strategies delivering 98-100% accuracy with 50ms P50 latency.

Hybrid Search

BM25 keyword + semantic vector fusion with Reciprocal Rank Fusion

Progressive Retrieval

Multi-stage search with early exit (50ms Stage 1, 200ms Stage 2)

Adaptive Strategy

Auto-optimizes weights based on query type (lookup, analytical, etc.)

Hybrid Search

POST /v3/evs/hybrid-search

Combines BM25 keyword ranking with semantic vector search. Best for exact-match queries (IPs, version numbers, specific terms).

Field Type Description
query string Search query (required)
chat_id string Filter to specific conversation
semantic_weight number Weight for semantic search (0-1, default: 0.6)
keyword_weight number Weight for keyword search (0-1, default: 0.4)
fusion_method string rrf (Reciprocal Rank Fusion) or linear

curl -X POST https://api.frnds.cloud/v3/evs/hybrid-search \
  -H "Authorization: Bearer ak_your_key" \
  -H "Content-Type: application/json" \
  -d '{
    "query": "database server IP address",
    "semantic_weight": 0.3,
    "keyword_weight": 0.7
  }'
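
Reciprocal Rank Fusion itself is compact enough to sketch. Assuming the standard formulation (each document scores the sum of 1/(k + rank) across result lists, with k = 60 as the common default; whether the service uses that exact k is not documented here):

```python
def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse several ranked result lists with Reciprocal Rank Fusion.

    Each document scores sum(1 / (k + rank)) over the lists containing it,
    so items ranked highly in BOTH keyword and semantic results rise to the top.
    """
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical result lists from the keyword and semantic legs:
keyword_hits = ["doc_ip_table", "doc_changelog", "doc_faq"]
semantic_hits = ["doc_ip_table", "doc_network_guide", "doc_faq"]
```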

Progressive Retrieval

POST /v3/evs/progressive-search

Multi-stage retrieval that starts fast and deepens if confidence is low. Delivers 70% of queries in <50ms via early exit.

Stage Latency Budget Method Confidence Threshold
Stage 1 50ms Cache + keywords 0.9 (exit if met)
Stage 2 200ms Hybrid search 0.8 (exit if met)
Stage 3 800ms Multi-query ensemble 0.7 (final)

curl -X POST https://api.frnds.cloud/v3/evs/progressive-search \
  -H "Authorization: Bearer ak_your_key" \
  -H "Content-Type: application/json" \
  -d '{
    "query": "How many PTO days?",
    "chat_id": "support_123"
  }'

Response includes metadata showing which stage completed and final confidence score.
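
The early-exit control flow can be sketched as follows; the stage callables stand in for the cache/keyword, hybrid, and ensemble retrieval steps, and the thresholds come from the table above:

```python
def progressive_search(stages, thresholds=(0.9, 0.8, 0.7)):
    """Run retrieval stages in order, exiting early once confidence clears
    that stage's threshold; the final stage always returns its result.

    `stages` is a list of callables each returning (results, confidence).
    """
    for stage_index, (stage, threshold) in enumerate(zip(stages, thresholds), start=1):
        results, confidence = stage()
        if confidence >= threshold or stage_index == len(stages):
            return {"results": results, "confidence": confidence, "stage": stage_index}
```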

Adaptive Retrieval

POST /v3/evs/adaptive-search

Auto-detects query type (lookup, analytical, procedural, comparison) and optimizes retrieval parameters accordingly.

Query Type Semantic Weight Keyword Weight Chunk Size
Lookup (facts, numbers) 0.3 0.7 Small (200-400 chars)
Analytical (deep) 0.8 0.2 Large (800-1200 chars)
Procedural (how-to) 0.6 0.4 Medium (400-800 chars)

curl -X POST https://api.frnds.cloud/v3/evs/adaptive-search \
  -H "Authorization: Bearer ak_your_key" \
  -H "Content-Type: application/json" \
  -d '{
    "query": "Compare PTO policies for full-time vs part-time",
    "auto_optimize": true
  }'

Set auto_optimize: true to let the system analyze and optimize automatically.
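
The routing idea can be sketched with a toy classifier; the service's real classifier is more sophisticated, and the comparison row below is an assumption since the table above does not list one:

```python
def classify_query(query: str) -> str:
    """Crude keyword heuristic illustrating query-type detection."""
    q = query.lower()
    if any(w in q for w in ("compare", " vs ", "versus")):
        return "comparison"
    if any(w in q for w in ("how do i", "how to", "steps")):
        return "procedural"
    if any(w in q for w in ("why", "explain", "analyze")):
        return "analytical"
    return "lookup"

# Retrieval parameters per query type, from the table above
# ("comparison" is a hypothetical extra row)
RETRIEVAL_PARAMS = {
    "lookup":     {"semantic_weight": 0.3, "keyword_weight": 0.7, "chunk_chars": (200, 400)},
    "analytical": {"semantic_weight": 0.8, "keyword_weight": 0.2, "chunk_chars": (800, 1200)},
    "procedural": {"semantic_weight": 0.6, "keyword_weight": 0.4, "chunk_chars": (400, 800)},
    "comparison": {"semantic_weight": 0.6, "keyword_weight": 0.4, "chunk_chars": (400, 800)},
}
```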

Query Classification Improvements

Enhanced UUID Detection (Oct 2025)

The EVS query classifier now uses a full UUID v4 pattern matcher, improving accuracy and reducing false positives:

  • Previous: Partial pattern /\b[0-9a-f]{8}-[0-9a-f]{4}\b/ matched short segments like "20231026-1234"
  • Current: Full pattern /\b[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}\b/i

Impact on Search Behavior

Queries with full UUIDs are now correctly routed to keyword/exact-match search, while queries with UUID-like patterns (but not valid UUIDs) use semantic search for better relevance.

Example queries affected:
  • "Find document 550e8400-e29b-41d4-a716-446655440000" β†’ Keyword search βœ…
  • "Documents from 20231026-1234" β†’ Semantic search (no longer false positive) βœ…
  • "Transaction abc123-def4" β†’ Semantic search (not a valid UUID) βœ…

No action required: This improvement is automatic and backward compatible. Your existing queries will automatically benefit from improved classification accuracy.
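
The before/after patterns can be checked directly with Python's re module; `routes_to_keyword_search` is an illustrative helper mirroring the routing rule described above:

```python
import re

# The partial pattern previously used, and the full pattern now in place
PARTIAL_UUID = re.compile(r"\b[0-9a-f]{8}-[0-9a-f]{4}\b", re.IGNORECASE)
FULL_UUID = re.compile(
    r"\b[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}\b",
    re.IGNORECASE,
)

def routes_to_keyword_search(query: str) -> bool:
    """A query containing a complete UUID is routed to exact-match search."""
    return FULL_UUID.search(query) is not None
```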

Performance: Progressive retrieval achieves 50ms P50 latency (70% Stage 1 exits). Hybrid search adds +10-15% accuracy for exact-match queries. Adaptive retrieval optimizes automatically with zero configuration.

Research Tools

POST /v3/search

AI-powered web search with multi-provider support (Brave, NewsAPI, Diary). Returns search results with optional AI-generated answers and content extraction.

Field Type Description
query string Search query (required)
search_provider string Provider to use: brave, newsapi, diary, hybrid (default: brave)
search_depth string basic or advanced (default: basic)
max_results number Maximum results to return (1-20, default: 5)
include_answer boolean Generate AI summary of results (default: false)
time_range string Filter by time: day, week, month, year

Related routes:

  • POST /v3/extract – Extract and parse content from a URL (markdown, text, images)
  • POST /v3/crawl – Deep crawl a website with link following and content extraction
  • POST /v3/map – Map website structure and discover pages

POST /v3/chat/agent

Agent-enabled chat completions with automatic tool injection. Drop-in OpenAI replacement where the model can autonomously search the web, extract content, and crawl websites to answer your questions.

Field Type Description
model string Model that supports function calling (e.g. llama-3.3-70b-versatile)
messages Message[] OpenAI-style conversation array
tools string[] Tools to enable: search, extract, crawl (default: all)
max_iterations number Maximum tool call loops (default: 5)
stream boolean Stream response with tool call events

curl https://api.frnds.cloud/v3/chat/agent \
  -H "Authorization: Bearer ak_your_key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "llama-3.3-70b-versatile",
    "messages": [{"role": "user", "content": "What are the latest AI breakthroughs this week?"}],
    "tools": ["search"],
    "stream": true
  }'

The model automatically decides when to search the web, extract content from URLs, or crawl websites. Tool calls and results are handled server-side, and the final answer is returned after all tool iterations complete.

POST /v3/chat/smart-agent

Smart agent endpoint with semantic memory search and web search enabled by default. Designed for Arepo integration with chatId-scoped document retrieval and flexible tool configuration. Streaming is enabled by default.

Field Type Description
model string Model that supports function calling (e.g. llama-3.3-70b-versatile)
messages Message[] OpenAI-style conversation array
tools string[] Tools to enable (default: ["memory_search", "search"]). Additional tools can be added.
chat_id string Optional chat session ID for scoping EVS document retrieval (isolates user documents)
max_iterations number Maximum tool call loops (default: 5)
stream boolean Stream response with tool call events (default: true)

# Default behavior: semantic memory + web search
curl https://api.frnds.cloud/v3/chat/smart-agent \
  -H "Authorization: Bearer ak_your_key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "llama-3.3-70b-versatile",
    "messages": [{"role": "user", "content": "What did I say about Q3 projections?"}],
    "chat_id": "user-session-abc123"
  }'

# Custom tools: add stock quotes and news
curl https://api.frnds.cloud/v3/chat/smart-agent \
  -H "Authorization: Bearer ak_your_key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "llama-3.3-70b-versatile",
    "messages": [{"role": "user", "content": "Analyze NVDA stock performance"}],
    "tools": ["memory_search", "search", "get_stock_quote", "get_recent_news"],
    "chat_id": "user-session-abc123"
  }'

Default Tools:

  • memory_search – Semantic search across user's EVS documents (chatId-scoped for data isolation)
  • search – Real-time web search via Brave/Tavily APIs

Available Additional Tools: extract, get_stock_quote, get_crypto_quote, get_recent_news, get_top_headlines, get_current_weather

The smart agent automatically routes between semantic memory retrieval (for user-specific context) and web search (for current information). Cost tracking scales with the number of tools used and iterations performed.
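
Because the endpoint speaks plain JSON over HTTP, no SDK is required. Below is a minimal sketch using only the Python standard library; the payload builder mirrors the field names and defaults documented above, and `stream` is set to false here so the response arrives as a single JSON body rather than an event stream:

```python
import json
import urllib.request

SMART_AGENT_URL = "https://api.frnds.cloud/v3/chat/smart-agent"
DEFAULT_TOOLS = ["memory_search", "search"]

def build_smart_agent_payload(model, messages, tools=None, chat_id=None,
                              max_iterations=5, stream=False):
    """Assemble a request body using the documented field names and defaults."""
    payload = {
        "model": model,
        "messages": messages,
        "tools": list(tools) if tools is not None else list(DEFAULT_TOOLS),
        "max_iterations": max_iterations,
        "stream": stream,  # the endpoint defaults to streaming; false yields one JSON body
    }
    if chat_id is not None:
        payload["chat_id"] = chat_id
    return payload

def ask_smart_agent(api_key, payload):
    """POST the payload and decode the JSON response."""
    req = urllib.request.Request(
        SMART_AGENT_URL,
        data=json.dumps(payload).encode(),
        headers={"Authorization": f"Bearer {api_key}",
                 "Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())
```

Passing `tools=None` keeps the memory-plus-web default, so callers only override the list when they need extras like get_stock_quote.
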

GET /v3/workflows

List saved workflows for the authenticated tenant. Workflows are stored in KV and can be executed with persistent state.

Related routes:

  • GET /v3/workflows/templates – discover built-in templates.
  • POST /v3/workflows – create or update a workflow definition.
  • POST /v3/workflows/execute – run a stored or inline workflow.
  • POST /v3/workflows/from-template – instantiate a template with overrides.
  • GET /v3/workflows/:id, PUT /v3/workflows/:id, DELETE /v3/workflows/:id – manage lifecycle.
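
The template and execution routes compose naturally: instantiate a template, then execute the stored workflow it returns. A sketch under stated assumptions; the field names (`template`, `overrides`, `workflow_id`, and the `id` in the create response) are illustrative rather than confirmed, so consult GET /v3/workflows/templates for the real schema:

```python
import json
import urllib.request

BASE = "https://api.frnds.cloud"

def from_template_body(template_name, overrides=None):
    # Field names here are illustrative, not confirmed by the API schema.
    return {"template": template_name, "overrides": overrides or {}}

def _post(path, api_key, body):
    """POST a JSON body to the worker and decode the JSON response."""
    req = urllib.request.Request(
        f"{BASE}{path}",
        data=json.dumps(body).encode(),
        headers={"Authorization": f"Bearer {api_key}",
                 "Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())

def run_from_template(api_key, template_name, overrides=None):
    """Instantiate a template, then execute the resulting stored workflow."""
    created = _post("/v3/workflows/from-template", api_key,
                    from_template_body(template_name, overrides))
    # Assumes the create response exposes the new workflow id under "id".
    return _post("/v3/workflows/execute", api_key,
                 {"workflow_id": created.get("id")})
```
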

POST /v1/chat/completions

Strict OpenAI v1 compatibility for SDKs that have not yet migrated. Accepts the same payload as v3 but does not return the _arbitrage object. Session persistence is available via session_id.

  • GET /v1/models – legacy model catalog.
  • GET /v1/sessions/:id – retrieve stored conversation history.
  • DELETE /v1/sessions/:id – purge a stored session and any R2 context snapshots.
  • POST /v1/embeddings – currently returns 501 Not Implemented.

GET /v1/tools

List registered function-calling tools. Use POST /v1/tools/register with a JSON schema to add custom tools at runtime.

GET /v1/billing/usage

Tenant self-service usage dashboard. Query parameters start_date and end_date (ISO yyyy-mm-dd) slice the reporting window. Returns real-time quota status.

Use GET /v1/billing/history for six months of pre-aggregated monthly spend.
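
Since the reporting window is expressed as plain query parameters, building the request is straightforward. A minimal standard-library sketch:

```python
import json
import urllib.parse
import urllib.request

def usage_url(base, start_date, end_date):
    """Build the billing-usage URL for an ISO yyyy-mm-dd reporting window."""
    query = urllib.parse.urlencode(
        {"start_date": start_date, "end_date": end_date}
    )
    return f"{base}/v1/billing/usage?{query}"

def fetch_usage(base, api_key, start_date, end_date):
    """GET the usage report for the given window and decode the JSON body."""
    req = urllib.request.Request(
        usage_url(base, start_date, end_date),
        headers={"Authorization": f"Bearer {api_key}"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())
```
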

GET /v3/docs

Machine-readable documentation. Append ?format=markdown for Markdown output or switch to /v3/docs/openapi for the OpenAPI 3.0 schema.

GET/POST /ask, /code, /translate, ...

Magic endpoints provide curl-friendly shortcuts with automatic key detection. Supply the prompt as the path segment (e.g. /ask/your-question) or POST JSON for richer control.
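
Because the prompt travels as a path segment, it must be percent-encoded so spaces and punctuation survive as a single segment. A small helper, assuming standard URL encoding is all the endpoint requires:

```python
import urllib.parse

def magic_url(base, endpoint, prompt):
    """Encode the prompt so it survives as one path segment (no '/' passthrough)."""
    return f"{base}/{endpoint}/{urllib.parse.quote(prompt, safe='')}"
```

For example, `magic_url("https://api.frnds.cloud", "ask", "what is uplink?")` encodes the space and question mark before the GET is issued.
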

Document Compression & RAG

POST /evs/ingest

Upload documents to Embedded Vector Storage (baseline, 1x compression). Supports semantic search via embeddings.

✨ NEW: Human-Readable Citations with displayName

Add an optional displayName field during ingestion to get human-readable citations in LLM responses. No post-processing needed!

Request Example:

{
  "tenantId": "tenant_123",
  "chatId": "user_docs",
  "source": {
    "type": "text",
    "content": "Employee PTO policy...",
    "displayName": "Employee-Handbook-2025.pdf"
  }
}

LLM Response (automatic):

Employees receive 15 days PTO per year [Source: Employee-Handbook-2025.pdf (chunk 3)]

API Response Metadata:

{
  "_vector_metadata": {
    "sources": [{
      "id": "abc123...",
      "title": "Employee Handbook",
      "displayName": "Employee-Handbook-2025.pdf",
      "score": 0.92
    }]
  }
}

Benefits: Without displayName, citations fall back to hex IDs like [Source: 1d34d363:8709...]. With it, you get readable filenames automatically in both the LLM context and API responses. The field is fully backward compatible: it is optional and defaults to title when not provided.
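
A small builder makes the optional nature of the field explicit; it mirrors the request example above and simply omits displayName when the caller does not supply one:

```python
def ingest_body(tenant_id, chat_id, content, display_name=None):
    """Build an /evs/ingest body; displayName is optional and falls back to title."""
    source = {"type": "text", "content": content}
    if display_name is not None:
        source["displayName"] = display_name
    return {"tenantId": tenant_id, "chatId": chat_id, "source": source}
```
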

POST /ocr/ingest

Compress documents using DeepSeek-OCR visual encoding (10x compression). Metadata search only. Drop-in replacement for /evs/*.

POST /ocv/ingest

Compress documents using Memvid video encoding (50-100x compression). Full semantic search via QR-encoded frames. Drop-in replacement for /evs/*.

POST /auto/analyze

Analyze a document and get compression recommendations without ingesting. Returns size, complexity analysis, and optimal method selection.

POST /auto/ingest

Smart auto-routing: analyzes document and automatically routes to optimal compression method (EVS/OCR/OCV) based on size, structure, and content.

POST /auto/benchmark

Benchmark all available compression methods on a document. Compares compression ratio, speed, and cost. Warning: Tests all methods (expensive).

GET /auto/compare

Get comparison table of compression methods (EVS 1x, OCR 10x, OCV 75x) with recommendations for different use cases.

POST /[evs|ocr|ocv]/search

Search compressed documents. EVS and OCV support full semantic search. OCR supports metadata search only. Identical API across all three families.
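
Because the request shape is identical across the three families, a thin wrapper can make the compression method a runtime parameter. A sketch, assuming plain JSON POST bodies as in the ingest examples above:

```python
import json
import urllib.request

VALID_FAMILIES = {"evs", "ocr", "ocv"}

def search_documents(base, api_key, family, body):
    """POST a search to /<family>/search; the payload shape is shared across families."""
    if family not in VALID_FAMILIES:
        raise ValueError(f"unknown compression family: {family}")
    req = urllib.request.Request(
        f"{base}/{family}/search",
        data=json.dumps(body).encode(),
        headers={"Authorization": f"Bearer {api_key}",
                 "Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())
```

Note that swapping `family` from "evs" to "ocr" changes the search semantics (metadata-only) even though the API surface is unchanged.
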

POST /[evs|ocr|ocv]/chat/completions

RAG-enhanced chat with document context. Automatically retrieves relevant chunks and injects into chat. Works with all compression methods.

GET /admin/tenants*

Admin APIs for provisioning, invoicing, and analytics. All routes require the ADMIN_API_KEY bearer token and are intended for operator dashboards.

Admin Analytics

The analytics API provides comprehensive business metrics including user counts, API usage, revenue data, and conversion funnels. Requires ADMIN_API_KEY authentication.

Get Analytics Metrics

GET /admin/analytics

Returns comprehensive analytics metrics for business insights and monitoring.

curl -H "Authorization: Bearer $ADMIN_API_KEY" \
  https://api.frnds.cloud/admin/analytics

Response Schema

{
  // User metrics
  "total_users": 1250,
  "users_by_tier": {
    "email_verified": 1000,
    "phone_verified": 500,
    "payment_verified": 200,
    "telegram_signup": 50
  },
  "new_signups_today": 15,
  "new_signups_this_week": 85,
  "new_signups_this_month": 320,

  // API usage metrics
  "total_api_calls_today": 5000,
  "total_api_calls_this_week": 28000,
  "total_tokens_consumed_today": 250000,
  "total_tokens_consumed_this_week": 1400000,

  // Revenue metrics
  "mrr": 5000.00,
  "total_paying_customers": 200,
  "churn_rate": 0.03,

  // Conversion funnel
  "funnel": {
    "signups": 1250,
    "email_verified": 1000,
    "phone_verified": 500,
    "payment_verified": 200
  },
  "conversion_rates": {
    "signup_to_email": 0.80,
    "email_to_phone": 0.50,
    "phone_to_payment": 0.40,
    "overall": 0.16
  },

  // Data quality indicator (NEW in Oct 2025)
  "is_estimate": false
}

Understanding is_estimate

The is_estimate field indicates the data quality of the response:

  • false - Data is from recent cache (accurate, <5 minutes old)
  • true - Data is from daily counters (estimated, summary may be stale)

When is_estimate: true, consider calling the refresh endpoint for accurate real-time data.

Refresh Analytics Cache

POST /admin/analytics/refresh

Triggers a recalculation of analytics metrics from actual user data. Use this when you need the most accurate data.

curl -X POST \
  -H "Authorization: Bearer $ADMIN_API_KEY" \
  https://api.frnds.cloud/admin/analytics/refresh

Response

{
  "success": true,
  "message": "Analytics cache refreshed successfully"
}

SDK Usage

Both JavaScript and Python SDKs support the analytics API with proper typing for the is_estimate field.

JavaScript/TypeScript

import { UplinkClient } from 'uplink-client'

const client = new UplinkClient({
  baseUrl: 'https://api.frnds.cloud',
  apiKey: process.env.ADMIN_API_KEY
})

// Get analytics
const metrics = await client.getAnalytics()

console.log(`Total users: ${metrics.total_users}`)
console.log(`Data is estimated: ${metrics.is_estimate}`)

// If data is estimated, refresh for accuracy
if (metrics.is_estimate) {
  await client.refreshAnalytics()
  const updated = await client.getAnalytics()
  console.log(`Updated MRR: $${updated.mrr}`)
}

Python

from uplink_client import UplinkClient

client = UplinkClient(
    base_url="https://api.frnds.cloud",
    api_key=os.environ["ADMIN_API_KEY"]
)

# Get analytics
metrics = await client.get_analytics()

print(f"Total users: {metrics.total_users}")
print(f"Data is estimated: {metrics.is_estimate}")

# If data is estimated, refresh for accuracy
if metrics.is_estimate:
    await client.refresh_analytics()
    updated = await client.get_analytics()
    print(f"Updated MRR: ${updated.mrr}")

Best Practices

  • Check is_estimate - Always check this field before making business decisions
  • Refresh when needed - Call the refresh endpoint when accurate data is critical
  • Cache client-side - Analytics data is expensive to compute, so cache responses appropriately
  • Monitor performance - Large user counts may take longer to calculate
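
The client-side caching advice can be as simple as a TTL wrapper around the analytics call. A minimal sketch (the 300-second TTL matches the "<5 minutes old" freshness window described above; the injectable clock is just for testability):

```python
import time

class TTLCache:
    """Tiny in-memory cache for expensive analytics responses."""

    def __init__(self, ttl_seconds=300, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self._store = {}

    def get(self, key):
        """Return the cached value, or None if missing or older than the TTL."""
        entry = self._store.get(key)
        if entry is None:
            return None
        value, stored_at = entry
        if self.clock() - stored_at > self.ttl:
            del self._store[key]  # evict stale entries lazily
            return None
        return value

    def put(self, key, value):
        self._store[key] = (value, self.clock())
```

A caller would check `cache.get("analytics")` first and only hit GET /admin/analytics on a miss, storing the response with `cache.put`.
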

Schemas & SDK Integration

  • OpenAPI: GET /v3/docs/openapi returns a complete 3.0 document for SDK code generation.
  • Markdown: GET /v3/docs?format=markdown mirrors this guide for CLI or knowledge-base ingestion.
  • TypeScript: Worker-side request validation relies on Zod schemas in src/schemas. Reuse them when extending the worker to guarantee compatibility.