Python Convert Audio to Text

DIY Automated Media Ingest Server (AMIS) Keeps All Your Media Organized

See the 3D printed 2U rack automated ingestion server. Powered by an AMD Ryzen 7600X with Intel Arc A310, plus Python, FFmpeg ...

Tech Xplore

AI learns to 'listen': Compact speech tokens help models understand spoken words

Large language models (LLMs) such as ChatGPT and Gemini were originally designed to work with text only. Today, they have ...

IEEE

CoAVT: A Cognition-Inspired Unified Audio-Visual-Text Pre-Training Model for Multimodal Processing

Abstract: There has been a long-standing quest for a unified audio-visual-text model to enable various multimodal understanding tasks, which mimics the listening, seeing, and reading process of human ...

IEEE

VATMAN: Integrating Video-Audio-Text for Multimodal Abstractive SummarizatioN via Crossmodal Multi-Head Attention Fusion

Abstract: The paper introduces VATMAN (Video-Audio-Text Multimodal Abstractive summarizatioN), a novel approach for generating hierarchical multimodal summaries utilizing Trimodal Hierarchical ...

GitHub

Audio to SRT

A native desktop application that converts audio files into perfectly formatted SRT subtitle files using OpenAI's Whisper AI. No cloud processing, no subscriptions, no complexity. Perfect for: Content ...
