Installing Whisper: Step-by-Step Guide
Whisper is one of the most powerful tools for automatic speech recognition. Developed by OpenAI, it’s free to use, supports multiple languages, and works impressively well even with noisy recordings. But how do you actually install it?
In this guide, you’ll get clear, easy-to-follow instructions on how to install Whisper on Windows, macOS, or Linux — no programming experience required.
Preparation: What You Need Before Installing Whisper
Before installing Whisper, a bit of setup is needed. Don’t worry — no coding skills required, just a bit of patience.
Install Python
Whisper runs on Python. Recommended versions: 3.8 to 3.11.
- Download it from python.org
- On Windows, make sure to check “Add Python to PATH” during installation
- Verify installation:
python --version
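If you would rather check the version from inside Python, a minimal sketch like the one below could be used. The helper name `is_supported_python` is made up for this example, and the 3.8 to 3.11 bounds simply mirror the recommendation above.

```python
import sys

# Version range recommended in this guide (inclusive, major.minor only).
MIN_VERSION = (3, 8)
MAX_VERSION = (3, 11)


def is_supported_python(version=None):
    """Return True if the major.minor version is in the recommended range.

    `version` is any tuple-like whose first two items are major and minor.
    It defaults to the version of the running interpreter.
    """
    if version is None:
        version = sys.version_info
    major_minor = tuple(version[:2])
    return MIN_VERSION <= major_minor <= MAX_VERSION


if __name__ == "__main__":
    current = ".".join(str(part) for part in sys.version_info[:3])
    if is_supported_python():
        print(f"Python {current} is within the recommended range.")
    else:
        print(f"Python {current} is outside the recommended 3.8 to 3.11 range.")
```

Run it with the same `python` command you plan to use for Whisper, so you test the right interpreter.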
Set Up a Virtual Environment (Recommended)
A virtual environment keeps the Whisper setup isolated and clean.
# Windows or macOS/Linux
python -m venv whisper-env
# Activate the environment
# macOS/Linux
source whisper-env/bin/activate
# Windows (Command Prompt)
whisper-env\Scripts\activate.bat
# Windows (PowerShell)
whisper-env\Scripts\Activate.ps1
Install FFmpeg
Whisper relies on FFmpeg to handle audio formats.
- Windows (using Chocolatey):
choco install ffmpeg
- macOS (using Homebrew):
brew install ffmpeg
- Linux (Debian/Ubuntu):
sudo apt update && sudo apt install ffmpeg
Verify installation:
ffmpeg -version
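The same check can be done from Python, which is handy if Whisper will later run from a script. This is a small sketch using only the standard library. The function names `find_ffmpeg` and `ffmpeg_version_line` are illustrative and not part of Whisper.

```python
import shutil
import subprocess


def find_ffmpeg():
    """Return the full path to the ffmpeg executable, or None if not on PATH."""
    return shutil.which("ffmpeg")


def ffmpeg_version_line(ffmpeg_path):
    """Run `ffmpeg -version` and return the first line of its output."""
    completed = subprocess.run(
        [ffmpeg_path, "-version"],
        capture_output=True,
        text=True,
        check=True,
    )
    lines = completed.stdout.splitlines()
    return lines[0] if lines else ""


if __name__ == "__main__":
    path = find_ffmpeg()
    if path is None:
        print("ffmpeg was not found on PATH.")
    else:
        print(f"Found ffmpeg at {path}")
        print(ffmpeg_version_line(path))
```

If the script reports that FFmpeg is missing even though you installed it, opening a new terminal usually picks up the updated PATH.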
Installing Whisper: Step by Step
Step 1: Install Whisper via pip
With Python and FFmpeg set up, install Whisper:
pip install -U openai-whisper
If you run into errors (e.g., related to tiktoken or Rust), try:
pip install --upgrade pip
You may also need the Rust compiler (rustup.rs) if dependencies fail to compile.
Step 2: Run a Test Transcription
Place an audio file (e.g., example.mp3) in your working directory. Then create a Python script named transcribe.py:
import whisper
model = whisper.load_model("small")
result = model.transcribe("example.mp3")
print(result["text"])
Run it:
python transcribe.py
The model will download automatically on first use.
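Once the basic test works, you may want the transcript saved to a file instead of printed. The sketch below is one possible variation on the script above. It uses the same `whisper.load_model` and `model.transcribe` calls, while the helper names `transcript_path_for` and `transcribe_to_file` and the script name in the usage comment are assumptions made for this example.

```python
import sys
from pathlib import Path


def transcript_path_for(audio_path):
    """Return the .txt path next to the audio file (example.mp3 -> example.txt)."""
    return Path(audio_path).with_suffix(".txt")


def transcribe_to_file(audio_path, model_name="small"):
    """Transcribe one audio file and save the text next to it."""
    # Imported here so the helper above works even if Whisper is not installed.
    import whisper

    audio = Path(audio_path)
    if not audio.is_file():
        raise FileNotFoundError(f"Audio file not found: {audio}")

    model = whisper.load_model(model_name)  # downloads the model on first use
    result = model.transcribe(str(audio))

    output = transcript_path_for(audio)
    output.write_text(result["text"].strip() + "\n", encoding="utf-8")
    return output


if __name__ == "__main__":
    # Usage (hypothetical script name): python transcribe_to_file.py example.mp3
    source = sys.argv[1] if len(sys.argv) > 1 else "example.mp3"
    print(f"Transcript written to {transcribe_to_file(source)}")
```

Checking that the file exists before loading the model avoids waiting for a download only to hit a typo in the file name.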
Installing Whisper by Operating System
Windows
1. Install Python and FFmpeg
Ensure they’re in the PATH variable so you can run python and ffmpeg globally.
2. Activate Virtual Environment
python -m venv whisper-env
whisper-env\Scripts\activate.bat
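3. Install PyTorch (Optional) and Whisper
For GPU acceleration on an NVIDIA card, install a CUDA build of PyTorch first, then Whisper. If you skip the first command, Whisper installs PyTorch for you as a dependency.
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
pip install -U openai-whisper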
4. Test Transcription
Use the transcribe.py script from Step 2. If errors occur, check the audio format and file integrity, or try a short, known-good file such as a plain WAV recording.
macOS
1. Install Homebrew, Python, FFmpeg
brew install python@3.11
brew install ffmpeg
2. Create Virtual Environment
python3 -m venv whisper-env
source whisper-env/bin/activate
3. Install Whisper
pip install -U openai-whisper
4. Test Transcription
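Same process as Windows. On Apple Silicon Macs (M1/M2 and later), PyTorch provides a Metal (MPS) backend, but Whisper does not select it automatically: without an explicit device argument it runs on the CPU. Whether MPS works and improves speed depends on your PyTorch and Whisper versions.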
Linux (Debian/Ubuntu)
1. Install FFmpeg and Python
sudo apt update
sudo apt install ffmpeg python3 python3-pip python3-venv
2. Create and Activate Virtual Environment
python3 -m venv whisper-env
source whisper-env/bin/activate
3. Install Whisper
pip install -U openai-whisper
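4. Optional: PyTorch with CUDA for NVIDIA GPUs
Pick the CUDA build that matches your driver (cu118 is shown here). Step 3 already installed PyTorch as a Whisper dependency, and pip skips packages that are already present, so add --force-reinstall if the command below reports that torch is already satisfied.
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118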
5. Run Test Script
Use the same method as above.

Whisper Model Sizes and Memory Requirements
Whisper downloads the model when first used. Choose based on your needs (download sizes are approximate):
- tiny: Fast, low accuracy (~75 MB)
- base: Basic balance (~142 MB)
- small: Solid for general use (~480 MB, 244 million parameters)
- medium: High quality (~1.5 GB, 769 million parameters)
- large: Maximum accuracy (~3 GB, 1.55 billion parameters)
Models are stored in: ~/.cache/whisper
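To see which models are already on disk and how much space they take, a short sketch like this could help. It assumes the default cache folder mentioned above and that model files end in `.pt`. The function name `list_cached_models` is invented for this example.

```python
from pathlib import Path


def list_cached_models(cache_dir=None):
    """Return (file name, size in MB) pairs for model files in the cache folder."""
    if cache_dir is None:
        # Default location mentioned above; it may differ on your system.
        cache_dir = Path.home() / ".cache" / "whisper"
    cache_dir = Path(cache_dir)
    if not cache_dir.is_dir():
        return []  # nothing has been downloaded yet, or the cache lives elsewhere

    models = []
    for model_file in sorted(cache_dir.glob("*.pt")):
        size_mb = model_file.stat().st_size / 1_000_000
        models.append((model_file.name, round(size_mb, 1)))
    return models


if __name__ == "__main__":
    cached = list_cached_models()
    if not cached:
        print("No cached Whisper models found.")
    for name, size_mb in cached:
        print(f"{name}: {size_mb} MB")
```

Deleting a file from that folder frees the space. Whisper downloads the model again the next time it is requested.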
Common Errors and How to Fix Them
FFmpeg Not Found
Check if it’s in your PATH:
ffmpeg -version
“No module named 'whisper'”
Make sure your virtual environment is active before running scripts.
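If you are not sure which interpreter a script is using, the following sketch prints the relevant details. It relies only on the standard library. The helper names `in_virtual_environment` and `module_available` are made up for illustration.

```python
import importlib.util
import sys


def in_virtual_environment():
    """Return True when the interpreter runs inside a venv-style environment."""
    # Inside a venv, sys.prefix points at the environment, while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


def module_available(name):
    """Return True if `import name` would find a top-level module here."""
    return importlib.util.find_spec(name) is not None


if __name__ == "__main__":
    print(f"Interpreter: {sys.executable}")
    print(f"Virtual environment active: {in_virtual_environment()}")
    print(f"whisper importable: {module_available('whisper')}")
```

If the interpreter path does not point inside whisper-env, activate the environment and run the script again.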
CUDA Not Recognized
Install the PyTorch build that matches the CUDA version your driver supports (cu118, cu121, etc.). The PyTorch Get Started page generates the exact install command for your system.
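To find out what PyTorch actually sees on your machine, a quick diagnostic along these lines may be useful. The function name `describe_torch_device` is hypothetical. The `torch.cuda.is_available()`, `torch.cuda.get_device_name()` and `torch.version.cuda` calls are standard PyTorch.

```python
def describe_torch_device():
    """Return a short description of what PyTorch can use on this machine."""
    try:
        import torch
    except ImportError:
        return "PyTorch is not installed in this environment."

    if torch.cuda.is_available():
        return f"CUDA available: {torch.cuda.get_device_name(0)}"
    # torch.version.cuda is None for builds without CUDA support.
    if torch.version.cuda is None:
        return "PyTorch build without CUDA support installed; CUDA cannot be used."
    return (
        f"PyTorch was built for CUDA {torch.version.cuda}, "
        "but no usable GPU or driver was detected."
    )


if __name__ == "__main__":
    print(describe_torch_device())
```

A build without CUDA support points to the wrong PyTorch wheel, while a CUDA build without a detected GPU usually points to a driver problem.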
Transcription Fails
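Verify the audio file is supported and unencrypted. Formats like MP3, WAV, FLAC, M4A, OGG, and AAC usually work. DRM-protected files cannot be decoded, and a file with an unusual encoding may need to be re-encoded (for example to WAV) before it transcribes cleanly.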
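When a file refuses to transcribe, re-encoding it with FFmpeg is a common first step. The sketch below builds and runs such a command from Python. The 16 kHz mono target is an assumption chosen because it is a typical speech-recognition input format, and the names `build_convert_command` and `convert_to_wav` are illustrative.

```python
import subprocess
from pathlib import Path


def build_convert_command(source):
    """Build an ffmpeg command that re-encodes `source` as 16 kHz mono WAV."""
    source = Path(source)
    # Write to a new name so a .wav input is never overwritten by its own output.
    target = source.with_name(source.stem + "_converted.wav")
    return [
        "ffmpeg",
        "-y",            # overwrite the target if it already exists
        "-i", str(source),
        "-ac", "1",      # one audio channel (mono)
        "-ar", "16000",  # 16 kHz sample rate
        str(target),
    ]


def convert_to_wav(source):
    """Run the conversion and return the path of the new WAV file."""
    command = build_convert_command(source)
    subprocess.run(command, check=True)  # raises CalledProcessError on failure
    return Path(command[-1])


if __name__ == "__main__":
    print(f"Converted file: {convert_to_wav('example.mp3')}")
```

If FFmpeg itself cannot read the file, the problem lies with the recording rather than with Whisper.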
Conclusion: Install Whisper or Choose an Alternative?
With just a few setup steps, Whisper can run on any modern computer. It’s a robust tool for transcription, podcasts, research, or content creation — fully offline and free.
If you'd prefer a no-setup experience, tools like Sally offer Whisper's capabilities with added AI summaries, CRM integration, and a plug-and-play UI.
Voice recognition has never been this accessible. Want to save time? Try Sally for free today.