AI · Video Intelligence · RAG · Speech-to-Text

InsightAI — Video Intelligence & Meeting Assistant

An AI-powered video analysis platform that automatically transcribes meeting recordings, generates structured summaries, extracts key decisions and action items — and lets users chat with their video content through RAG-powered conversational queries.

My Role: Full Stack Developer
Auto
Transcription
Speech-to-Text AI
RAG
Chat with Video
Context-aware answers
AI
Smart Summaries
OpenAI powered
0x
Rewatch Needed
Instant insights

What We Built & Why

InsightAI is a video intelligence platform built for teams that run meetings, interviews, webinars, and recorded sessions. Instead of rewatching full recordings to find a specific decision or action item, users simply upload the video — and the platform handles everything automatically.

Using Speech-to-Text for accurate transcription, OpenAI for structured summaries, and RAG for contextual chat — InsightAI transforms hours of video content into searchable, queryable, and instantly accessible knowledge.

Core problem solved: Teams waste hours rewatching recordings just to find one decision or action item. InsightAI turns every video into a searchable knowledge base — ask a question, get the exact answer, with zero rewatching.

Video Upload

Upload any meeting or video recording — any format, any length, processed automatically.

Transcription

Accurate speech-to-text extraction from the uploaded video in seconds.

AI Summary

Structured meeting summary with key points, decisions, and action items.

Chat with Video

Ask any question about the meeting — get context-aware, RAG-powered answers.

See InsightAI in Action

Real screenshots from the deployed InsightAI platform — from video upload to verbatim transcript extraction and AI-powered meeting analysis.

InsightAI homepage — drag and drop video upload with 100% Private, Auto-Transcribe, AI Analysis features
Upload Your Meeting Video

Users simply drag & drop any meeting recording. InsightAI runs 100% locally — powered by OpenAI Whisper & Meta Llama 3.2. Data never leaves the machine.

· Supports MP4 and major video formats (200MB limit)
· 100% Private — offline secure mode, no cloud upload
· Powered by OpenAI Whisper for high-accuracy transcription
· AI Analysis — chat, summarise, extract insights
Verbatim Transcript Generated

After analysis, a complete verbatim transcript is generated — 1,529 words extracted from a 32.6MB recording. Three tabs available: Executive Summary, AI Assistant, and Raw Transcript.

· Full verbatim transcript with every spoken word
· Transcript word count shown in sidebar
· Download Transcript button for export
· Executive Summary & AI Assistant tabs ready
InsightAI transcript view — 1529 words verbatim transcript from meeting recording
InsightAI Executive Briefing — AI-generated executive summary with key takeaways from the meeting
Executive Briefing — AI-Generated Meeting Summary

The Executive Summary tab generates a full briefing — an AI summary of what the meeting was about, followed by Key Takeaways as a numbered list. All from a 32.6MB recording, fully processed locally.

· Executive Briefing — structured meeting overview
· Key Takeaways — numbered list of important points
· AI identifies decisions, proposals & next steps
· All processing runs locally — 100% private & secure

The Video Intelligence Pipeline

From video upload to fully searchable meeting knowledge — every step is handled automatically by the AI pipeline.

01
Video Upload

User uploads a meeting or video recording. Supports all major video formats — processed via FastAPI backend.

02
Speech-to-Text

Audio extracted from the video and passed through the Speech-to-Text engine — generating an accurate full transcript.

03
AI Processing

OpenAI processes the transcript — generating structured summaries, identifying key points, decisions, and action items.

04
Vector Indexing

Transcript chunks embedded as vectors and stored in a vector database — ready for semantic search and RAG retrieval.

05
Summary Generated

Structured meeting summary delivered — key points, decisions, action items, and participants clearly listed.

06
Insight Extraction

AI identifies and tags key insights — what was decided, what was raised, and what needs follow-up action.

07
Chat Interface

User asks questions about the meeting — e.g. "What was decided about the budget?" — RAG retrieves the exact context and LLM answers.

08
Knowledge Ready

Video content fully transformed into searchable knowledge — no rewatching, no manual notes, instant access.

What InsightAI Does

Automatic Transcription
· Accurate speech-to-text from video audio
· Speaker diarisation & timestamp support
· Handles accents, multiple speakers
· Full transcript available for download
AI-Generated Summaries
· Concise structured meeting summary
· Key points & discussion topics listed
· Decisions and outcomes highlighted
· Action items with owners extracted
Chat with Video Content
· Ask questions in plain English
· RAG retrieves exact relevant transcript chunks
· LLM generates context-aware answers
· Cites specific moments from the video
Insight Extraction
· Key decisions automatically identified
· Action items pulled out with context
· Important topics tagged and surfaced
· Follows up on unresolved discussion points
Searchable Knowledge Base
· Every video becomes a searchable asset
· Semantic search across all recordings
· Vector database for fast retrieval
· Query across multiple meetings at once
Productivity Enhancement
· Zero rewatching of recorded meetings
· Find any moment in seconds via chat
· Auto meeting notes — no manual effort
· Onboard team to past meetings instantly

What InsightAI Delivers

Hours of meeting recordings processed in seconds — accurate transcripts, summaries, and action items ready immediately after upload.

Chat with any meeting — ask "What did we decide about the Q3 budget?" and get the exact answer without scrubbing through the recording.

Structured meeting notes auto-generated — key decisions, action items, and discussion points captured without any manual effort.

Every recording becomes a searchable knowledge asset — teams can query across months of meetings to find any insight instantly.

New team members can be onboarded from past recordings — query the knowledge base instead of scheduling catch-up calls.

Technologies Used

Python
Core backend language
FastAPI
Async REST backend
OpenAI API
Summaries & chat
Speech-to-Text
Audio transcription
RAG Pipeline
Vector search & retrieval
Chat Interface
AI-powered Q&A

What Was Built & Applied

Python
FastAPI
OpenAI API
Speech-to-Text Processing
Retrieval Augmented Generation (RAG)
AI-Powered Chat Interfaces

Want AI-powered video intelligence for your team?

From automatic transcription to RAG-powered chat — we build meeting assistants that turn your recordings into searchable knowledge.

Start Your Project →