"Voice-First SRE Copilot powered by Gemini Live + Google ADK"
🎤
VOICE + VISION
Real-Time Streaming
Talk naturally while sharing your camera - OpsVoice sees errors, hears you, and responds instantly with native audio
🚨
INCIDENT MGMT
Create, Track, Resolve
Full incident lifecycle - create incidents, check open issues, update status, and retrieve runbooks hands-free
⚡
PROACTIVE
Alert Monitor
Background service health polling with proactive WebSocket alerts - OpsVoice warns you before things break
Powered by Gemini 2.5 Flash | Live API | Google ADK | Cloud Run
🔴 LIVE DEMO
Watch OpsVoice in Action
OpsVoice
System Architecture
User Layer
🖥
Browser
SRE operator interface with real-time voice, text, and camera inputs
VOICE + TEXT + CAMERA
SEND / RECEIVE
Frontend (In-Browser)
➡️
WebSocket Connection
Full-duplex bidirectional streaming between browser and server
BIDIRECTIONAL
🎙️
Audio Processing
Real-time PCM capture at 16 kHz mono, base64-encoded for streaming
PCM 16KHZ
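A minimal sketch of the audio encoding described on this card, written in Python for consistency with the backend even though capture itself happens in the browser. It assumes 16-bit signed little-endian samples, which the slide does not state; the function names are hypothetical.

```python
import base64
import struct

SAMPLE_RATE_HZ = 16_000  # from the slide: 16 kHz mono


def float_to_pcm16_b64(samples):
    """Clamp floats to [-1.0, 1.0], pack as little-endian int16, then base64-encode.

    The 16-bit sample width is an assumption; the slide only says "PCM 16kHz mono".
    """
    ints = [int(max(-1.0, min(1.0, s)) * 32767) for s in samples]
    raw = struct.pack(f"<{len(ints)}h", *ints)
    return base64.b64encode(raw).decode("ascii")


def pcm16_b64_to_ints(payload):
    """Reverse step, as a server might apply it: base64 text back to int16 samples."""
    raw = base64.b64decode(payload)
    return list(struct.unpack(f"<{len(raw) // 2}h", raw))
```

Base64 inflates the payload by roughly a third, which is the usual cost of sending binary audio inside JSON text frames rather than binary WebSocket frames.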
📷
Camera Frame Capture
Periodic frame capture from device camera, JPEG compressed
JPEG FRAMES
⬇️ WEBSOCKET / WSS ⬆️
Backend (Google Cloud Run)
</>
FastAPI Server
Async Python server handling WebSocket upgrades and HTTP endpoints
PYTHON / ASYNC
⚙️
Google ADK Runner
Agent Development Kit orchestrating tool calls and Gemini sessions
ADK FRAMEWORK
👤
Session Management
Per-user session state, conversation history, and context tracking
STATEFUL
⚠️
Proactive Alert Monitor
Background task polling service health, broadcasts alerts via WebSocket
ASYNC BROADCAST
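The polling-and-broadcast pattern on this card could look roughly like the sketch below. It is not the project's implementation: the `AlertMonitor` class, the `check_health` callable, the status strings, and the alert message shape are all assumptions made for illustration, and each connected client is modelled as an `asyncio.Queue` instead of a real WebSocket.

```python
import asyncio


class AlertMonitor:
    """Polls service health and fans out alerts to every subscriber (hypothetical sketch)."""

    def __init__(self, check_health, interval_s=30.0):
        # check_health: assumed async callable returning {service_name: status}
        self._check_health = check_health
        self._interval_s = interval_s
        self._subscribers = set()  # one asyncio.Queue per connected client
        self._last = {}            # last status seen per service

    def subscribe(self):
        queue = asyncio.Queue()
        self._subscribers.add(queue)
        return queue

    def unsubscribe(self, queue):
        self._subscribers.discard(queue)

    async def poll_once(self):
        """Run one health check and broadcast only newly unhealthy statuses."""
        statuses = await self._check_health()
        alerts = [
            {"type": "alert", "service": name, "status": status}
            for name, status in statuses.items()
            if status != "healthy" and self._last.get(name) != status
        ]
        self._last = dict(statuses)
        for alert in alerts:
            for queue in self._subscribers:
                queue.put_nowait(alert)
        return alerts

    async def run(self):
        """Background loop; start it with asyncio.create_task(monitor.run())."""
        while True:
            await self.poll_once()
            await asyncio.sleep(self._interval_s)
```

In a FastAPI WebSocket handler, each connection would call `subscribe()`, forward queued items with `websocket.send_json(...)`, and call `unsubscribe()` on disconnect. Tracking the last status keeps a persistently degraded service from alerting on every poll.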
⬇️ GEMINI LIVE API ⬆️
AI Layer
✨
Gemini 2.5 Flash
Multimodal AI with native audio understanding and generation capabilities
NATIVE AUDIO
⚡
Live Bidirectional Streaming
Real-time audio-in, audio-out with interruptible conversation flow
REAL-TIME
⬇️ TOOL CALLS
⬇️ INFRASTRUCTURE
Agent Tools
⚔️
check_service_health
Poll status of monitored services
📝
create_incident
Open new incident records in Firestore
📋
get_open_incidents
List active incident records
✅
update_incident_status
Resolve or escalate incidents
📖
get_runbook
Retrieve remediation playbooks
🔍
google_search
Web search for troubleshooting context
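Agent tools of this kind are typically plain Python functions with type hints and docstrings that return dictionaries. The sketch below illustrates three of the incident tools under stated assumptions: the record fields, status values, and ID format are hypothetical, and an in-memory dict stands in for the Firestore collection so the example runs without cloud credentials.

```python
import itertools
from datetime import datetime, timezone

_INCIDENTS = {}            # in-memory stand-in for the Firestore collection (assumption)
_ids = itertools.count(1)  # hypothetical sequential ID source


def create_incident(title: str, severity: str = "medium") -> dict:
    """Open a new incident record and return it."""
    incident_id = f"INC-{next(_ids):04d}"
    record = {
        "id": incident_id,
        "title": title,
        "severity": severity,
        "status": "open",
        "created_at": datetime.now(timezone.utc).isoformat(),
    }
    _INCIDENTS[incident_id] = record
    return {"status": "success", "incident": record}


def get_open_incidents() -> dict:
    """List every incident that has not been resolved."""
    open_items = [r for r in _INCIDENTS.values() if r["status"] != "resolved"]
    return {"status": "success", "incidents": open_items}


def update_incident_status(incident_id: str, new_status: str) -> dict:
    """Resolve or escalate an incident; unknown IDs return an error payload."""
    record = _INCIDENTS.get(incident_id)
    if record is None:
        return {"status": "error", "message": f"Unknown incident {incident_id}"}
    record["status"] = new_status
    return {"status": "success", "incident": record}
```

Returning an error dictionary instead of raising lets the model hear about the failure and explain it to the operator by voice.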
Google Cloud Services
☁️
Cloud Run
Serverless container hosting
💾
Firestore
Incidents database (NoSQL)
🔐
Secret Manager
Secure API key storage
📦
Artifact Registry
Container image repository
</>
Cloud Build
CI/CD pipeline
📊
Cloud Logging
Centralized log aggregation
Data Flow Paths
1. Voice/Text Request: Browser → WebSocket → FastAPI → ADK Runner → Gemini Live API
2. Tool Execution: Gemini Live API → Tool Calls → Service Health / Firestore
3. AI Response: Gemini Live API → Audio/Text → WebSocket → Browser
4. Proactive Alerts: Alert Monitor → WS Broadcast → All Browsers
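The first hop of the request path above (browser frame in, typed payload out to the agent runner) might be handled as in the following sketch. The JSON envelope with `type`, `text`, and `data` keys is an assumption, not the project's documented wire format, and `parse_client_message` is a hypothetical helper. The MIME strings are also illustrative labels chosen for this sketch.

```python
import base64
import json


def parse_client_message(raw: str):
    """Turn one JSON text frame from the browser into (mime_type, payload).

    Assumed envelope: {"type": "text", "text": ...} or
    {"type": "audio" | "image", "data": <base64 string>}.
    """
    message = json.loads(raw)
    kind = message.get("type")
    if kind == "text":
        return "text/plain", message["text"]
    if kind == "audio":
        # 16 kHz PCM, matching the capture format described in the deck
        return "audio/pcm;rate=16000", base64.b64decode(message["data"])
    if kind == "image":
        return "image/jpeg", base64.b64decode(message["data"])
    raise ValueError(f"Unsupported message type: {kind!r}")
```

A server loop would call this on each received frame and hand the result to whatever queue feeds the live session; rejecting unknown types early keeps malformed frames away from the model.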
Legend: User | Frontend | Backend | AI | Tools | Cloud
Technical Highlights
✨
Gemini Native Audio
gemini-2.5-flash-native-audio-preview with Live API
🤖
Google ADK
Runner.run_live() + LiveRequestQueue
🎤
Voice + Camera + Text
PCM 16kHz audio + JPEG frames, all via WebSocket
🚨
6 Agent Tools
Incident CRUD, health checks, runbooks, grounded search
⚡
Proactive Alerts
Background health polling with async WebSocket broadcast
☁️
Cloud Native
Cloud Run + Firestore + Secret Manager + Terraform
"AI interaction does not need a text box. By combining real-time voice with computer vision, we created an experience that feels like having a senior engineer right beside you."
OpsVoice
💻 Open Source on GitHub
☁️ Live on Google Cloud
📝 Blog Post on Dev.to
Thank you for watching