What is ADAI?

ADAI (Audio Description AI) is an AI-powered platform that generates professional audio descriptions for videos. It narrates visual elements such as actions, settings, characters, and on-screen text so that people who are blind or have low vision can fully experience video content. ADAI helps organizations meet WCAG 2.1 AA and ADA accessibility requirements.

How do I upload a video?

Click "Upload" in the navigation bar, then drag and drop your video file or click "Browse Files." We support MP4, MOV, AVI, and MKV formats up to 2 GB and 45 minutes in length. You can also reuse a previously uploaded video from the Recent Activity section on the upload page.

What video formats and sizes are supported?

We accept MP4, MOV, AVI, and MKV files. The maximum file size is 2 GB and the maximum duration is 45 minutes. For best results, upload at 720p or higher resolution so the AI can accurately interpret visual details.
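The limits above can be checked locally before you upload. The following is a minimal Python sketch, not ADAI's actual validation logic: the function name `check_upload` is hypothetical, and it assumes "2 GB" means 2 × 1024³ bytes, which this page does not specify.

```python
from pathlib import Path

# Limits taken from the answer above.
ALLOWED_EXTENSIONS = {".mp4", ".mov", ".avi", ".mkv"}
MAX_SIZE_BYTES = 2 * 1024**3      # assumption: "2 GB" read as 2 GiB
MAX_DURATION_SECONDS = 45 * 60    # 45 minutes


def check_upload(filename: str, size_bytes: int, duration_seconds: float) -> list[str]:
    """Return a list of problems; an empty list means the file is within the limits."""
    problems = []
    if Path(filename).suffix.lower() not in ALLOWED_EXTENSIONS:
        problems.append("unsupported format")
    if size_bytes > MAX_SIZE_BYTES:
        problems.append("file larger than 2 GB")
    if duration_seconds > MAX_DURATION_SECONDS:
        problems.append("video longer than 45 minutes")
    return problems


print(check_upload("demo.MP4", 500 * 1024**2, 10 * 60))   # []
print(check_upload("demo.webm", 3 * 1024**3, 50 * 60))    # all three problems
```

The check is case-insensitive on the file extension and treats a file exactly at a limit as acceptable; both choices are assumptions.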

What is the "Getting Started" checklist?

The checklist on your dashboard walks you through essential first steps: generating your first audio-described video, teaching the AI a custom pronunciation via the Lexicon, completing your profile, and reviewing your billing and credit status. Each step is checked off automatically as you complete it.

How long does processing take?

Processing time depends on video length, resolution, and the mode you select. A typical 5-minute video in Standard mode takes roughly 3 to 8 minutes. Extended and Hybrid modes may take longer because they perform additional analysis to handle dense visual content. You can track real-time progress from your dashboard or the Videos page.

What are the processing modes?

ADAI offers three processing modes. Standard (SD) fits descriptions into the natural pauses in dialogue — ideal for most content. Extended pauses the video when a description is too long for the available gap, ensuring nothing is missed. Hybrid intelligently combines both approaches, only pausing when strictly necessary. You choose the mode when uploading.

How do I get a free trial?

New users can sign up for a free trial that includes 50 credits — enough to process about 7 minutes of video. Visit the registration page and select "Start Free Trial." No credit card is required to get started.

What does the dashboard show me?

Your dashboard provides an at-a-glance overview: recent and in-progress videos, your credit balance gauge, the Getting Started checklist, quick-action buttons for uploading or managing videos, and — for organization admins — a team overview showing member activity.

How do I organize videos into projects?

On the Videos page, use the project sidebar on the left to create a new project. Give it a name, then drag videos into it or use the bulk-action menu to move selected videos. Projects help you group related content — for example, by client, campaign, or series.

Can I search across all my videos?

Yes. The Videos page has an advanced search bar that searches across titles, tags, entity names, and even transcript text. You can also filter by status, date range, processing mode, duration, file size, and lexicon used.

Can I share a video with someone outside my organization?

Yes. From the video detail page, generate a share link. Anyone with that link can view the video and its audio descriptions without needing an ADAI account. You can also use the embed player (/embed/ ) to embed the accessible player on external websites.

How do I share videos within my organization?

On the video detail page, use the share option and select team members from your organization. Shared videos appear on each member's "Shared" page. You can see all sharing activity — both videos shared with you and videos you've shared — from the Shared section in the navigation.

Can I perform bulk actions on multiple videos?

Yes. On the Videos page, select multiple videos using the checkboxes, then use the bulk-action toolbar to delete, move to a project, or perform other operations on all selected videos at once.

What is the video grid vs. list view?

The Videos page supports two layouts. Grid view shows large thumbnails — great for visual browsing. List view is a compact table format that shows more metadata at a glance (duration, status, date, mode). Toggle between them using the view switcher in the top-right corner.

Can I download the audio description as a separate file?

Yes. After processing, you can export the audio description track in Broadcast Wave Format (BWF), which is the professional standard used in broadcast and post-production workflows. This allows you to mix the description track into your existing audio pipeline.

How do I favorite a video?

Click the heart or star icon on any video card to mark it as a favorite. On the Videos page, you can filter by favorites for quick access to your most important content.

What is Live Voice?

Live Voice lets you interact with ADAI in real time using your camera and microphone. Point your camera at anything, such as a document, a product label, or your surroundings, and the AI describes what it sees and answers follow-up questions instantly, like having a sighted guide on demand.

Does Live Voice record my camera?

No. Live Voice processes video frames in real time to provide context to the AI, but it does not permanently record or store your video feed. All processing is transient and limited to the duration of your session.

Why is the audio slightly delayed?

Live Voice needs to analyze the visual input and generate a spoken response, which introduces a small amount of latency. We continuously optimize for speed, but a brief delay is expected as the AI processes the scene before speaking.

How do I configure my microphone for Live Voice?

Go to Account > Live Voice (the "Gemini Live" tab) to select your preferred microphone and adjust input settings. You can also change the microphone directly from the Live Voice session dialog before joining.

What languages does Live Voice support?

Live Voice currently supports English, with plans to expand to additional languages. The underlying Gemini model can understand many languages, but the real-time voice response quality is optimized for English at this time.

How do I start a Live Voice session?

You can start Live Voice from multiple places: press "G" on any video page or click the Live Voice button in the navigation.

How do I fix a mispronounced word?

Go to the "Lexicon" page and create a new entry. Type the word exactly as it appears in your content, then provide either a phonetic spelling or a "sounds like" replacement (e.g., "Resume" → "Reh-zoo-may"). The AI will use your pronunciation for all future videos.
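Conceptually, a lexicon entry maps a written word to the text the narrator should speak. A minimal Python sketch of that idea follows. How ADAI applies entries internally is not documented here, so the `apply_lexicon` helper and its exact-match, whole-word behavior are assumptions for illustration only.

```python
import re


def apply_lexicon(script: str, lexicon: dict[str, str]) -> str:
    """Replace each lexicon word in the script with its pronunciation (hypothetical helper)."""
    for word, spoken in lexicon.items():
        # \b limits the match to whole words; matching is case-sensitive because
        # the entry is typed exactly as the word appears in the content.
        pattern = r"\b" + re.escape(word) + r"\b"
        # A function replacement avoids backslash interpretation in `spoken`.
        script = re.sub(pattern, lambda _m, s=spoken: s, script)
    return script


lexicon = {"Resume": "Reh-zoo-may"}
print(apply_lexicon("Her Resume is on the desk.", lexicon))
# Her Reh-zoo-may is on the desk.
```

Because the match is whole-word, a longer word such as "Resumed" is left untouched in this sketch.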

Can I change the narrator's voice?

Yes. When uploading a video or in Account > Preferences, you can choose from a range of AI voices with different accents, tones, and genders. Voices are organized into tiers: Premium (highest fidelity), Generative (AI-styled), and Basic.

What is Hybrid Mode?

Hybrid Mode intelligently combines Standard and Extended approaches. It fits descriptions into natural pauses whenever possible and only pauses the video when a description is too long for the available gap. This gives you full context without unnecessary interruptions.
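The placement rule described above can be sketched per description in Python. This is an illustrative reading of the behavior, not ADAI's implementation: the function name `hybrid_action` and the sample durations are made up, and real processing presumably weighs more than two numbers.

```python
def hybrid_action(gap_seconds: float, description_seconds: float) -> str:
    """Decide how a single description is placed in Hybrid mode (illustrative rule)."""
    if description_seconds <= gap_seconds:
        return "fit"     # narration plays inside the natural pause
    return "pause"       # video is paused so the full description can play


# (gap, description) lengths in seconds; the values are made up.
for gap, desc in [(4.0, 2.5), (1.0, 3.0)]:
    print(gap, desc, hybrid_action(gap, desc))
```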

What is the difference between personal and organization lexicons?

A personal lexicon applies only to your own videos. An organization lexicon is shared across all members of your organization, ensuring consistent pronunciation of brand names, product terms, and jargon for everyone on the team. Admins can manage the organization lexicon from the Lexicon page.

What languages are supported for audio descriptions?

ADAI supports 18+ languages for generated audio descriptions, including English, Spanish, French, German, Portuguese, Japanese, Korean, Mandarin, Hindi, Arabic, and more. You select the output language during upload, and the AI generates descriptions natively in that language.

What is a style prompt?

When using a Generative-tier voice, you can provide a style prompt to guide the tone and delivery of the narration — for example, "warm and conversational" or "formal documentary style." The AI adapts its pacing, emphasis, and expressiveness based on your prompt.

What are the voice quality tiers?

Premium voices offer the highest fidelity with natural intonation and are ideal for professional or broadcast-quality output. Generative voices use advanced AI synthesis and support style prompts for creative control. Basic voices are clear and reliable, suitable for quick drafts or internal reviews.

How much does processing cost?

Standard video processing costs 7 credits per minute of video. For example, a 5-minute video costs 35 credits. The minimum charge for any video is 7 credits (i.e., videos under 1 minute are rounded up).
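The arithmetic can be sketched in a few lines of Python. The rate and minimum come from the answer above. Rounding partial minutes up to the next whole minute is an assumption, since the page only states that videos under 1 minute are rounded up, and `processing_cost` is a hypothetical name.

```python
import math

CREDITS_PER_MINUTE = 7


def processing_cost(duration_seconds: float) -> int:
    """Credits for one video, billing whole minutes with a one-minute minimum."""
    # Assumption: partial minutes round up to the next whole minute.
    minutes = max(1, math.ceil(duration_seconds / 60))
    return minutes * CREDITS_PER_MINUTE


print(processing_cost(5 * 60))  # 35
print(processing_cost(40))      # 7
```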

How do I purchase more credits?

Go to Account > Billing and submit the "Request more credits" form. Share your requested credit amount, timeline, and project details, and our billing team will follow up to complete the top-up.

Where can I see my credit balance?

Your current credit balance is displayed in the credits dropdown in the top navigation bar. For detailed usage history, transaction records, and package information, visit Account > Billing.

Do unused credits expire?

No. Purchased credits never expire. You can use them whenever you need, at your own pace.

What is "AI Preview" and what does it cost?

AI Preview generates a high-quality animated intro for your video using Veo 3.1. When enabled during upload, it adds 2 additional credits to your total processing cost.

Do Extended and Hybrid modes cost more?

The per-minute credit rate is the same across all modes (7 credits/minute). However, Extended and Hybrid modes may produce slightly longer output because they add pauses for more detailed descriptions, which could affect the final video duration. The cost is based on the original input video length, not the output.

What is included in the free trial?

The free trial gives you 50 credits at no cost — enough to process about 7 minutes of video. You get full access to all features including every processing mode, voice tier, and language. No credit card is required.

How do I view my transaction history?

Go to Account > Billing to see a complete record of all credit purchases, usage charges, and remaining balance over time.

Can I get a cost estimate before processing?

Yes. On the upload page, after selecting your video and settings, you will see a credit estimate showing the expected cost before you confirm. This estimate is based on the video duration and any add-ons like AI Preview.
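To reproduce an estimate by hand, add the AI Preview surcharge to the per-minute cost and compare the total with your balance. The Python sketch below uses the published figures (7 credits per minute, 2 credits for AI Preview, 50 trial credits). The function name `estimate_credits` and the round-up of partial minutes are assumptions, and the on-page estimate remains the authoritative number.

```python
import math

CREDITS_PER_MINUTE = 7
AI_PREVIEW_CREDITS = 2


def estimate_credits(duration_seconds: float, ai_preview: bool = False) -> int:
    """Estimated credits for one upload, including the optional AI Preview add-on."""
    minutes = max(1, math.ceil(duration_seconds / 60))  # assumption: round up
    return minutes * CREDITS_PER_MINUTE + (AI_PREVIEW_CREDITS if ai_preview else 0)


balance = 50  # free-trial credits
estimate = estimate_credits(7 * 60, ai_preview=True)
print(estimate, estimate <= balance)  # 51 False
```

In this example a 7-minute video fits within the 50 trial credits on its own (49 credits), but enabling AI Preview pushes the total to 51.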

How do I invite team members to my organization?

Go to Account > Team and click "Invite Member." Enter their email address and select a role (Admin or Member). They will receive an invitation email with a link to join your organization.

What roles are available in an organization?

Organizations support Admin and Member roles. Admins can manage members, invitations, organization-level lexicons, and billing. Members can upload videos, use shared lexicons, and view shared content, but cannot manage team settings.
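The split between the two roles can be pictured as a small permission table. The Python sketch below is purely illustrative: the action names and the `can` helper are hypothetical, and it assumes Admins can also do everything a Member can, which the answer above implies but does not state.

```python
MEMBER_ACTIONS = {"upload_videos", "use_shared_lexicons", "view_shared_content"}
ADMIN_ONLY_ACTIONS = {
    "manage_members",
    "manage_invitations",
    "manage_org_lexicons",
    "manage_billing",
}

PERMISSIONS = {
    "Member": MEMBER_ACTIONS,
    # Assumption: Admins inherit every Member capability.
    "Admin": MEMBER_ACTIONS | ADMIN_ONLY_ACTIONS,
}


def can(role: str, action: str) -> bool:
    """Return True if the role is allowed to perform the action."""
    return action in PERMISSIONS.get(role, set())


print(can("Member", "upload_videos"))   # True
print(can("Member", "manage_billing"))  # False
print(can("Admin", "manage_billing"))   # True
```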

How do I switch between organizations?

If you belong to multiple organizations, use the organization switcher in the top navigation bar to change which organization's workspace you are viewing. Your personal videos and the organization's shared videos are kept separate.

How do I update my profile?

Go to Account > Profile to edit your name, username, phone number, and avatar. Changes are saved immediately and reflected across the platform.

How do I set up two-factor authentication (2FA)?

Navigate to Account > Security and follow the prompts to enable 2FA. Once enabled, you will need to enter a code from your authenticator app each time you sign in, adding an extra layer of protection to your account.

How do I manage active sessions?

Account > Security shows all devices and sessions currently signed in to your account. You can revoke any session you don't recognize to immediately sign out that device.

How do I change my email notification preferences?

Go to Account > Notifications to control which email notifications you receive — including processing status updates (started, completed, failed), team activity, and product announcements. Toggle each category on or off.

How do I leave an organization?

Go to Account > Danger Zone and select "Leave Organization." Your personal videos remain with your account, but you will lose access to the organization's shared content and lexicons.

How do I delete my account?

Account deletion is available under Account > Danger Zone. This action is permanent and will delete all your videos, lexicon entries, and personal data. If you are the sole admin of an organization, you must transfer ownership or delete the organization first.

How is my video data stored and protected?

Uploaded videos and generated audio descriptions are stored securely in cloud infrastructure with encryption at rest and in transit. Access is restricted to your account and any organization members you explicitly share with. We do not use your content to train AI models.

Can I export my personal data?

Yes. Go to Account > Privacy and request a data export. You will receive a downloadable archive containing your profile information, video metadata, lexicon entries, and usage history.

How do I manage cookie preferences?

Visit Account > Privacy to view and update your cookie preferences. You can opt out of non-essential cookies (analytics, marketing) while keeping the functional cookies required for the platform to operate.

How do I reset my password?

Click "Forgot Password" on the login page and enter your email address. You will receive a reset link that lets you set a new password. For security, the link expires after a limited time. You can also change your password from Account > Security while signed in.

What is ADAI?

ADAI automatically generates professional audio descriptions for video — narrating actions, characters, and on-screen text so people who are blind or have low vision can fully experience your content. Upload a video, and AI does the rest.

AI-Powered: Multi-agent scene analysis

18+ Languages: Native multilingual output

WCAG 2.1 AA: ADA & Section 508 compliant

Broadcast-Ready: BWF/WAV audio export

Multiple Voice Tiers: Premium, Generative, Basic

Live Voice: Real-time visual AI assistant

Team Collaboration: Organizations & shared credits

ADAI Platform Details

ADAI (Audio Description AI) is an AI-powered platform that automatically generates professional audio descriptions for video content. Audio descriptions narrate the visual elements of a video — actions, settings, characters, facial expressions, on-screen text, and scene transitions — so that people who are blind or have low vision can fully experience the content independently.

ADAI is the first platform to fully automate the audio description workflow in a single agentic pipeline. You upload a video, and ADAI analyzes it frame by frame using multimodal AI (powered by Google Vertex AI and Gemini), detects scenes, extracts dialogue transcripts, reads on-screen text via OCR, identifies characters and actions, generates context-aware narration scripts, and synthesizes natural speech using Google Cloud Text-to-Speech — all without manual scripting, voice talent, or studio time.

The platform supports 18+ languages including English, Spanish, French, German, Portuguese, Italian, Japanese, Korean, Mandarin Chinese, Hindi, Arabic, Dutch, Polish, Swedish, Turkish, Vietnamese, Thai, and Indonesian. Multiple AI voice options are available across three quality tiers: Premium (highest fidelity, natural intonation), Generative (AI-styled with customizable style prompts), and Basic (clear and reliable for drafts).

Three processing modes let you match the output to your content type: Standard fits descriptions into natural pauses in dialogue, Extended pauses the video when needed for complete coverage, and Hybrid (recommended) intelligently combines both — only pausing when strictly necessary. After processing, a built-in scene editor lets you fine-tune individual descriptions, adjust timing, create snapshots, and collaborate with your team through threaded comments.

ADAI helps organizations meet accessibility requirements including WCAG 2.1 AA (Guideline 1.2.5 — Audio Description for Prerecorded Video), ADA (Americans with Disabilities Act), Section 508 (U.S. Rehabilitation Act), and FCC CVAA (21st Century Communications and Video Accessibility Act). All AI-generated content includes latent disclosure metadata as required by the California AI Transparency Act (SB 942).

Output can be exported as a complete video with embedded audio descriptions or as a separate Broadcast Wave Format (BWF) audio track for professional post-production workflows, or it can be streamed directly from ADAI's accessible video player with HLS and DASH adaptive streaming. Videos can be shared via unique links, embedded on external websites using the ADAI player or oEmbed, or published to the community showcase.

Teams can collaborate through organizations with shared credit pools, role-based access (Owner, Admin, Member), shared pronunciation lexicons, and team video sharing. A credit-based billing model charges 7 credits per minute of input video with no subscription required — credits never expire. New users receive 50 free credits (about 7 minutes of video) with no credit card required.

Additional capabilities include Live Voice, a real-time visual AI assistant powered by Gemini that describes what your camera sees and answers questions via voice conversation; a pronunciation lexicon with AI-suggested pronunciations for brand names and technical terms; a scene review system with a 7-dimension quality rubric; and a comprehensive support system including an AI voice assistant, support tickets, and direct contact.

Quick Start Guide

Go from upload to accessible video in four steps

1. Create Your Account

Sign up for free at adai.tv/register. You receive 50 credits instantly, with no credit card required. Set your username and password, and optionally configure your vision profile for a tailored experience.

2. Upload Your Video

Drag and drop your MP4, MOV, AVI, or MKV file (up to 2 GB / 45 min). Choose a processing mode, select a voice and language from 18+ options, and attach any custom pronunciations from your lexicon.

3. AI Processes Your Video

ADAI's multi-agent AI pipeline analyzes your video frame by frame: detecting scenes, extracting dialogue, reading on-screen text, identifying characters, and generating natural audio descriptions. Track progress in real time from your dashboard.

4. Review, Edit & Export

Watch the result in the built-in accessible player, or open the scene editor to fine-tune descriptions. Export the final video, download the separate BWF audio track, generate share links, or embed the player on your website.