Overview
Audio transcription allows you to convert speech from audio files into text using state-of-the-art AI models. CyrionAI provides access to Whisper models for accurate, multilingual transcription.
Basic Usage
Simple Audio Transcription
Transcription with Language Specification
Parameters
Required Parameters
| Parameter | Type | Description |
|---|---|---|
| file | file | The audio file to transcribe (MP3, MP4, M4A, WAV, etc.) |
Optional Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| model | string | "whisper-1" | The model to use for transcription |
| language | string | null | Language code (e.g., "en", "es", "fr") |
| prompt | string | null | Contextual prompt to improve accuracy |
| response_format | string | "json" | Response format ("json", "text", "srt", "verbose_json") |
| temperature | number | 0 | Controls randomness (0-1) |
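The defaults and allowed values in the table can be captured in a small helper that assembles the non-file form fields before a request is sent. This is a minimal sketch, not CyrionAI client code: the function name `build_transcription_params` is hypothetical, and it assumes the service expects exactly the field names listed above.

```python
from typing import Any, Dict, Optional

# Values taken from the response_format row of the table above.
ALLOWED_RESPONSE_FORMATS = {"json", "text", "srt", "verbose_json"}


def build_transcription_params(
    model: str = "whisper-1",
    language: Optional[str] = None,
    prompt: Optional[str] = None,
    response_format: str = "json",
    temperature: float = 0.0,
) -> Dict[str, Any]:
    """Return the optional form fields for a transcription request.

    Hypothetical helper: it only validates and assembles values and
    does not contact any API.
    """
    if response_format not in ALLOWED_RESPONSE_FORMATS:
        raise ValueError(f"unsupported response_format: {response_format!r}")
    if not 0 <= temperature <= 1:
        raise ValueError("temperature must be between 0 and 1")

    params: Dict[str, Any] = {
        "model": model,
        "response_format": response_format,
        "temperature": temperature,
    }
    # Fields whose default is null are omitted when unset.
    if language is not None:
        params["language"] = language
    if prompt is not None:
        params["prompt"] = prompt
    return params
```

Calling `build_transcription_params(language="es")` yields a dictionary that could be passed as form data alongside the required `file` field, whichever HTTP client is used.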
Supported Audio Formats
CyrionAI supports a wide range of audio formats:
| Format | Extension | Description |
|---|---|---|
| MP3 | .mp3 | Most common audio format |
| MP4 | .mp4 | Video files (audio will be extracted) |
| M4A | .m4a | Apple audio format |
| WAV | .wav | Uncompressed audio |
| FLAC | .flac | Lossless audio |
| OGG | .ogg | Open source audio format |
File Size Limits
- Maximum file size: 25 MB
- Recommended duration: Up to 60 minutes
- Supported languages: 99+
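Checking a file against the format table and the 25 MB limit before uploading avoids a wasted request. The sketch below is illustrative only: `validate_upload` is a hypothetical name, and it assumes "25 MB" means 25 × 1024 × 1024 bytes, which the documentation does not state explicitly.

```python
import os

# Extensions from the Supported Audio Formats table.
SUPPORTED_EXTENSIONS = {".mp3", ".mp4", ".m4a", ".wav", ".flac", ".ogg"}

# Assumption: 1 MB is treated as 1024 * 1024 bytes.
MAX_FILE_SIZE_BYTES = 25 * 1024 * 1024


def validate_upload(filename: str, size_bytes: int) -> None:
    """Raise ValueError if the file is unlikely to be accepted.

    Hypothetical pre-flight check; the service remains the final
    authority on what it accepts.
    """
    extension = os.path.splitext(filename)[1].lower()
    if extension not in SUPPORTED_EXTENSIONS:
        raise ValueError(f"unsupported audio format: {extension or '(none)'}")
    if size_bytes > MAX_FILE_SIZE_BYTES:
        raise ValueError(
            f"file is {size_bytes} bytes; the limit is {MAX_FILE_SIZE_BYTES}"
        )
```

For a file on disk, the size can be obtained with `os.path.getsize(path)` and passed as `size_bytes`.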
Language Support
Automatic Language Detection
Specify Language
Common Language Codes
| Language | Code | Language | Code |
|---|---|---|---|
| English | en | Spanish | es |
| French | fr | German | de |
| Italian | it | Portuguese | pt |
| Russian | ru | Chinese | zh |
| Japanese | ja | Korean | ko |
| Arabic | ar | Hindi | hi |
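When language names come from user input, a lookup built from the table above can turn them into the codes expected by the `language` parameter. This is a sketch under stated assumptions: `to_language_code` is a hypothetical helper, and it covers only the twelve languages listed here, not the full set the service supports.

```python
# Codes copied from the Common Language Codes table.
LANGUAGE_CODES = {
    "english": "en", "spanish": "es",
    "french": "fr", "german": "de",
    "italian": "it", "portuguese": "pt",
    "russian": "ru", "chinese": "zh",
    "japanese": "ja", "korean": "ko",
    "arabic": "ar", "hindi": "hi",
}


def to_language_code(name_or_code: str) -> str:
    """Map a language name or code from the table to its code."""
    key = name_or_code.strip().lower()
    if key in LANGUAGE_CODES:
        return LANGUAGE_CODES[key]
    if key in LANGUAGE_CODES.values():
        return key
    # Unknown here does not mean unsupported by the service; it only
    # means the language is absent from this short table.
    raise ValueError(f"language not in the common-codes table: {name_or_code!r}")
```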
Response Formats
JSON Format (Default)
Text Format
SRT Format (Subtitles)
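If `response_format` is set to "srt", the result can be post-processed with a few lines of standard-library code. The sketch below assumes the service returns conventional SubRip text (a cue number, a `start --> end` line, then the caption, with blank lines between cues); `parse_srt` and `Cue` are hypothetical names, and the sample in the usage note is illustrative, not real API output.

```python
import re
from typing import List, NamedTuple


class Cue(NamedTuple):
    index: int
    start: float  # seconds
    end: float    # seconds
    text: str


_TIMESTAMP = re.compile(r"(\d+):(\d{2}):(\d{2})[,.](\d{3})")


def _to_seconds(stamp: str) -> float:
    match = _TIMESTAMP.fullmatch(stamp.strip())
    if match is None:
        raise ValueError(f"malformed SRT timestamp: {stamp!r}")
    hours, minutes, seconds, millis = (int(g) for g in match.groups())
    return hours * 3600 + minutes * 60 + seconds + millis / 1000


def parse_srt(srt_text: str) -> List[Cue]:
    """Parse SubRip text into a list of cues."""
    cues: List[Cue] = []
    normalized = srt_text.replace("\r\n", "\n").strip()
    # Cues are separated by one or more blank lines.
    for block in re.split(r"\n\s*\n", normalized):
        lines = block.split("\n")
        if len(lines) < 2:
            continue  # skip empty or incomplete blocks
        start, end = lines[1].split("-->")
        cues.append(
            Cue(int(lines[0]), _to_seconds(start), _to_seconds(end),
                "\n".join(lines[2:]))
        )
    return cues
```

For example, `parse_srt("1\n00:00:00,000 --> 00:00:02,500\nHello everyone.\n")` returns a single cue spanning the first 2.5 seconds.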
Verbose JSON Format
Using Prompts
Contextual Prompts
Provide context to improve transcription accuracy:
Specialized Vocabulary
Names and Proper Nouns
Best Practices
1. Audio Quality
2. File Preparation
3. Language Specification
4. Use Appropriate Prompts
Common Use Cases
Meeting Transcription
Interview Transcription
Training Content
Podcast Transcription
Error Handling
Advanced Features
Batch Processing
Timestamped Segments
Language Detection
Examples
Nonprofit Board Meeting
Volunteer Training Session
Donor Interview
Performance Considerations
Processing Time
- Transcription typically takes 30-60 seconds per minute of audio
- Processing time varies by file size and complexity
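As a rough planning aid, the 30-60 seconds-per-minute figure above can be turned into a range for a given recording. This is simple arithmetic on the stated figure rather than a measured benchmark, and `estimate_processing_seconds` is a hypothetical helper name.

```python
from typing import Tuple

# Per-minute bounds taken from the figure quoted above.
MIN_SECONDS_PER_AUDIO_MINUTE = 30
MAX_SECONDS_PER_AUDIO_MINUTE = 60


def estimate_processing_seconds(audio_minutes: float) -> Tuple[float, float]:
    """Return (low, high) estimated processing time in seconds."""
    if audio_minutes < 0:
        raise ValueError("audio_minutes must be non-negative")
    return (
        audio_minutes * MIN_SECONDS_PER_AUDIO_MINUTE,
        audio_minutes * MAX_SECONDS_PER_AUDIO_MINUTE,
    )
```

By this estimate, a 10-minute recording would take roughly 5 to 10 minutes to process; actual times vary with file size and complexity.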
Rate Limits
- Audio transcription: 50 requests per minute
- Plan accordingly for batch processing
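One way to plan for the limit in batch jobs is to throttle on the client side. The sketch below uses a sliding 60-second window; it assumes the limit is counted over a rolling minute, which the documentation does not specify, and `MinuteRateLimiter` is a hypothetical class, not part of any CyrionAI SDK. The clock and sleep functions are injectable so the logic can be exercised without real waiting.

```python
import time
from collections import deque
from typing import Callable, Deque


class MinuteRateLimiter:
    """Client-side throttle using a sliding time window."""

    def __init__(
        self,
        max_requests: int = 50,
        window_seconds: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        self._max = max_requests
        self._window = window_seconds
        self._clock = clock
        self._sleep = sleep
        self._sent: Deque[float] = deque()  # timestamps of recent requests

    def _drop_expired(self, now: float) -> None:
        while self._sent and now - self._sent[0] >= self._window:
            self._sent.popleft()

    def wait_for_slot(self) -> float:
        """Block until a request may be sent; return seconds waited."""
        now = self._clock()
        self._drop_expired(now)
        waited = 0.0
        while len(self._sent) >= self._max:
            # Wait until the oldest request leaves the window.
            delay = max(self._sent[0] + self._window - now, 0.0)
            self._sleep(delay)
            waited += delay
            now = self._clock()
            self._drop_expired(now)
        self._sent.append(now)
        return waited
```

In a batch loop, call `wait_for_slot()` immediately before each transcription request so that no more than 50 are issued within any 60-second span.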
File Optimization
- Compress audio files to reduce upload time
- Use appropriate audio formats (MP3 recommended)
- Ensure good audio quality for better accuracy
Next Steps
- Learn about chat completions
- Explore image generation
- Check out more audio examples
- View the API reference