Gemilive

Plug-and-play Gemini Multimodal Live
for your custom stack.

Abstracts away the brutal boilerplate of bridging native browser WebAudio, gapless PCM streaming, and live video frames to a secure Python backend proxy. Build conversational intelligence in six lines of code.

The "Proxy Problem"

While Google provides excellent core SDKs for the Gemini Multimodal Live API, integrating it securely into a production app usually kills a weekend. You can't put your API keys directly into a browser frontend, so you are forced to build a custom backend proxy. Suddenly, you're hand-wiring WebSockets to bridge raw 16kHz microphone streams from a JS frontend into a Python backend just to forward them to Gemini.

Gemilive bridges that gap for you, permanently.
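For context, the relay loop a hand-rolled proxy ends up implementing looks roughly like this. The two connections are modeled as asyncio queues purely so the loop's shape is visible without network code; a real proxy would call receive/send on two WebSocket objects, and the names `pump` and `demo` are illustrative, not part of gemilive.

```python
# Sketch: the forwarding loop a hand-wired audio proxy needs.
# Browser and Gemini connections are modeled as asyncio queues.
import asyncio

async def pump(src: asyncio.Queue, dst: asyncio.Queue) -> int:
    """Forward frames until a None sentinel; return frames relayed."""
    relayed = 0
    while (frame := await src.get()) is not None:
        await dst.put(frame)
        relayed += 1
    await dst.put(None)  # propagate end-of-stream downstream
    return relayed

async def demo() -> int:
    browser, gemini = asyncio.Queue(), asyncio.Queue()
    for f in (b"pcm-1", b"pcm-2", None):  # stand-ins for 16kHz PCM frames
        browser.put_nowait(f)
    return await pump(browser, gemini)

frames = asyncio.run(demo())  # relays 2 frames, then the sentinel
```

And that is only one direction; a real proxy runs a second pump for the downstream audio, plus reconnection, auth, and backpressure handling — the boilerplate Gemilive absorbs.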

🌐 Browser Frontend (gemilive-js) ↔ WebSockets ↔ 🐍 FastAPI Proxy (gemilive) ↔ API Stream ↔ Google Gemini (AI Cloud)

🐍 FastAPI Backend

main.py
from fastapi import FastAPI
from gemilive import mount_gemilive

app = FastAPI()

# Instantly proxies secure WebSockets to Gemini Live
# Handles upstream 16kHz PCM & downstream 24kHz audio
mount_gemilive(
    app,
    system_prompt="Be incredibly witty."
)
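The 16 kHz upstream format mentioned in the comment is the crux of the audio boilerplate: browsers capture Float32 samples at 44.1 or 48 kHz, while Gemini Live expects 16-bit little-endian PCM at 16 kHz. A minimal sketch of that conversion, assuming naive every-Nth-sample decimation (a real pipeline would low-pass filter first; `float32_to_pcm16` is a hypothetical helper, not gemilive's API):

```python
# Convert browser-style Float32 samples to 16-bit LE PCM at 16 kHz.
# Naive decimation for illustration only — no anti-aliasing filter.
import struct

def float32_to_pcm16(samples, in_rate=48000, out_rate=16000):
    step = in_rate // out_rate                    # 3 for 48k -> 16k
    picked = samples[::step]                      # keep every 3rd sample
    clamped = [max(-1.0, min(1.0, s)) for s in picked]
    ints = [int(s * 32767) for s in clamped]      # scale to int16 range
    return struct.pack(f"<{len(ints)}h", *ints)   # little-endian int16

pcm = float32_to_pcm16([0.0, 0.5, -0.5, 1.0, 0.25, -1.0])
# 6 input samples / step 3 -> 2 output samples -> 4 bytes
```

The downstream direction is the mirror image: 24 kHz PCM from Gemini is unpacked back into Float32 buffers for playback.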

🌐 Browser Frontend

app.js
import { GemiliveClient } from 'gemilive-js';

const client = new GemiliveClient("wss://api.yourdomain.com/ws/live");

// Manages mic capture, downsampling, and canvas video snapshots
await client.start();

client.onMessage = (msg) => console.log(msg);
// Audio plays back gaplessly on the WebAudio timeline
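The gapless playback mentioned above comes down to one scheduling trick: instead of playing each chunk "now" as it arrives, each chunk is scheduled at an absolute cursor on the audio timeline so chunks butt up against each other exactly. A sketch of that arithmetic, with all names illustrative rather than gemilive-js internals:

```python
# The cursor arithmetic behind gapless streamed playback: schedule
# each chunk at max(now, end-of-previous-chunk) on the timeline.
class GaplessCursor:
    def __init__(self):
        self.next_start = 0.0  # timeline position where the next chunk begins

    def schedule(self, now: float, chunk_duration: float) -> float:
        # After an underrun the cursor is in the past; restart at "now".
        # Otherwise queue the chunk flush against the previous one.
        start = max(now, self.next_start)
        self.next_start = start + chunk_duration
        return start

cur = GaplessCursor()
cur.schedule(0.0, 0.5)   # first chunk plays at t=0.0
cur.schedule(0.1, 0.5)   # arrives early, still queued at t=0.5: no gap
```

In the browser the same logic maps onto `AudioContext.currentTime` and `AudioBufferSourceNode.start(when)`, which is why playback stays seamless even when network chunks arrive jittery.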