Environment Setup
Get your machine ready for AI engineering. Python, API keys, GPU drivers, and essential packages.
30 Years of Dependency Hell (and How We Got Out)
Environment setup is where most AI beginners quit. Not because the concepts are hard, but because Python's packaging ecosystem spent two decades in chaos. Understanding that history is the fastest way to understand why the tools exist, which ones to use, and which ones to avoid.
The AI/ML stack is uniquely punishing: you need Python packages that wrap C++ libraries that link against CUDA drivers that must match your GPU firmware. A single version mismatch and nothing works. Every tool in this lesson exists because someone hit a wall with the previous tool.
System Python and distutils
Python shipped with your OS. You installed packages globally with python setup.py install, which dumped files into your system's site-packages. Two projects needing different versions of the same library? Impossible. You overwrote one to install the other. Upgrading a package for one project could break your OS utilities. Developers called this "dependency hell."
# The bad old days: everything was global
sudo python setup.py install
# installs to /usr/lib/python2.7/site-packages

# Project A needs numpy==1.8, Project B needs numpy==1.11
# You can only have one installed. Pick one. The other breaks.
This also meant sudo was required to install packages — a security nightmare that trained an entire generation of developers to run untrusted code as root.
setuptools and easy_install
Phillip Eby created setuptools and its companion easy_install, which could download packages from PyPI automatically. This was a revelation — no more manually hunting for .tar.gz files on SourceForge. But easy_install had no uninstall command. It installed packages as compressed eggs that were opaque and hard to debug. Dependencies were still global.
virtualenv — The Breakthrough
Ian Bicking released virtualenv, which created isolated Python directories with their own site-packages. For the first time, Project A and Project B could have different numpy versions on the same machine. This single tool saved Python from being abandoned by the data science community. The idea was so obviously correct that Python 3.3 (2012) absorbed it into the standard library as venv.
"virtualenv is a tool to create isolated Python environments. The basic problem being addressed is one of dependencies and versions, and indirectly permissions."
— virtualenv documentation, 2007. The problem statement remains identical 19 years later.
pip Replaces easy_install
Also created by Ian Bicking, pip could uninstall packages, used a flat installation format instead of eggs, and supported requirements.txt for reproducible installs. By 2014 it shipped with Python itself. The virtualenv + pip combo became the standard workflow for a decade — and still works fine for simple projects. Its main limitations: pip resolves dependencies by backtracking, which can be extremely slow on large dependency trees, and it can't manage Python versions themselves.
Conda — The Data Science Fork
Continuum Analytics (now Anaconda, Inc.) created conda because pip couldn't handle compiled C/C++/Fortran libraries that scientific computing depends on. NumPy, SciPy, and especially CUDA-linked PyTorch — these packages have complex native dependencies that pip simply wasn't designed to manage. Conda was a completely separate package manager with its own repository, its own solver, and the ability to install non-Python dependencies (like CUDA itself, MKL, or OpenSSL).
The trade-off: conda's SAT solver is thorough but painfully slow. A fresh conda create could take 10+ minutes just resolving dependencies. Mamba (2019) rewrote the solver in C++ and made conda usable again. But the ecosystem was already fragmenting — mixing pip and conda in the same environment became the number-one source of broken installs in data science.
Docker — Nuclear Option for Reproducibility
Solomon Hykes released Docker, and teams stopped trying to fix dependency conflicts — they containerized the entire OS instead. For production ML pipelines, Docker became non-negotiable: you ship the exact CUDA version, the exact Python version, the exact library versions, frozen in an image. NVIDIA's nvidia-docker (2016) extended this to GPU passthrough. But Docker adds latency, disk space (ML images are 5–15 GB), and cognitive overhead. For local development, it's overkill.
uv — The Unifier
Charlie Marsh at Astral released uv, written in Rust. It replaces pip, pip-tools, virtualenv, pyenv, and poetry — a single binary that handles Python version management, virtual environment creation, dependency resolution, and package installation. It's 10–100x faster than pip because the resolver and installer are written in compiled Rust instead of interpreted Python.
uv reached 1.0 stability in August 2024. By early 2025 it had become the de facto standard for new Python projects. If you're setting up a fresh AI development environment today, this is the tool to use.
Why the tooling history matters
Every Stack Overflow answer about Python environments is from a different era. Someone telling you to use conda install isn't wrong — they're from 2017. Someone recommending poetry isn't wrong — they're from 2021. Understanding the timeline lets you evaluate advice in context instead of cargo-culting outdated workflows.
The throughline: 1991 → 2026
Three decades. One problem — "it works on my machine" — solved in layers: global site-packages gave way to isolated environments (virtualenv), then reproducible installs (pip + requirements.txt), then native dependencies (conda), then full-OS images (Docker), and finally one unified tool (uv).
Setting Up Python with uv
uv is the recommended tool for 2025+. It handles everything: installing Python itself, creating virtual environments, resolving dependencies, and locking versions for reproducibility. One binary, zero configuration.
Install uv
# macOS / Linux (one command, no dependencies)
curl -LsSf https://astral.sh/uv/install.sh | sh
# Windows (PowerShell)
powershell -ExecutionPolicy ByPass -c "irm https://astral.sh/uv/install.ps1 | iex"
# Verify installation
uv --version
# uv 0.6.x (2026-xx-xx)

uv installs as a single static binary. No Python required to install it — it bootstraps itself.
Create Your First AI Project
# Create a new project (generates pyproject.toml + .venv)
uv init my-ai-project
cd my-ai-project
# uv automatically creates a virtual environment
# No need to manually run "uv venv" or "source .venv/bin/activate"
# Install packages — uv uses the .venv automatically
uv add openai anthropic transformers torch
# Run a script — uv activates the venv for you
uv run python main.py
# Or activate manually if you prefer
source .venv/bin/activate # macOS/Linux
# .venv\Scripts\activate   # Windows

uv add writes to pyproject.toml and generates a uv.lock lockfile automatically. The lockfile pins every transitive dependency to an exact version. Share it with your team and everyone gets identical installs.
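After a few `uv add` commands, the generated pyproject.toml looks roughly like this (package names taken from the commands above; the version bounds are illustrative, not prescriptive):

```toml
[project]
name = "my-ai-project"
version = "0.1.0"
requires-python = ">=3.12"
dependencies = [
    "anthropic>=0.40",
    "openai>=1.50",
    "torch>=2.4",
    "transformers>=4.44",
]
```

uv.lock sits next to this file and pins the exact resolved versions. Edit pyproject.toml (by hand or via uv add / uv remove), never the lockfile.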
Managing Python Versions
# Install a specific Python version (managed by uv)
uv python install 3.12
# Use it for a project
uv init --python 3.12 my-project
# List installed versions
uv python list
# Pin a version for a project (writes a .python-version file)
uv python pin 3.12

No more pyenv, no more brew install python@3.12, no more fighting with system Python. uv downloads and manages Python builds itself.
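`uv python pin 3.12` records the pin in a `.python-version` file at the project root — a plain text file containing nothing but the version string — which uv reads on every subsequent command:

```text
3.12
```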
When to Use What
| Tool | Best For | Speed | Manages Python? |
|---|---|---|---|
| uv | New projects, fast iteration, 2025+ standard | 10–100x pip | Yes |
| venv + pip | Simple projects, no extra install needed | Baseline | No |
| conda / mamba | Complex CUDA deps, existing conda projects | Slow (mamba: OK) | Yes |
| Docker | Production, CI/CD, team reproducibility | N/A (build once) | Full OS |
API Keys Setup
API keys are secrets. Treat them like passwords. One leaked key can cost you thousands of dollars.
You'll need API keys to call cloud models (GPT-4o, Claude, etc.). The setup is straightforward, but the security around it matters enormously. Bots scan every public GitHub commit for sk- prefixed strings within seconds of pushing. If you commit an API key, assume it's compromised immediately.
The $50,000 Mistake People Keep Making
In 2024, a developer committed their OpenAI key to a public GitHub repo. Within 90 seconds, automated scrapers detected it. Within 5 minutes, the key was being used to generate tokens at maximum throughput. The bill hit $47,000 before the developer noticed. OpenAI does not reverse charges for leaked keys. This happens every week.
# BEFORE creating any .env file:
# 1. Create .gitignore FIRST
echo ".env" >> .gitignore
echo ".env.local" >> .gitignore
echo ".env.*" >> .gitignore
# 2. Verify it's ignored
git status # .env should NOT appear
# 3. THEN create your .env file
touch .env

- Never hardcode keys in source code, even "temporarily"
- Never paste keys in Jupyter notebooks (they end up in .ipynb JSON)
- Never share keys in Slack, Discord, or email (all searchable)
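These rules can be backed up mechanically: a small pre-commit-style scan rejects commits containing key-shaped strings. A minimal sketch (the function name `find_leaked_keys` and the regexes are illustrative; the patterns cover the sk-/hf_ prefixes used by the providers below, and you'd extend them for others):

```python
import re

# Heuristic, not exhaustive: catches the prefixes used by
# OpenAI (sk-...), Anthropic (sk-ant-...), and Hugging Face (hf_...).
KEY_PATTERNS = [
    re.compile(r"sk-[A-Za-z0-9_-]{20,}"),  # OpenAI / Anthropic style
    re.compile(r"hf_[A-Za-z0-9]{20,}"),    # Hugging Face tokens
]

def find_leaked_keys(text: str) -> list[str]:
    """Return any key-shaped strings found in text."""
    hits: list[str] = []
    for pattern in KEY_PATTERNS:
        hits.extend(pattern.findall(text))
    return hits

if __name__ == "__main__":
    sample = 'client = OpenAI(api_key="sk-proj-abc123def456ghi789jkl012")'
    print(find_leaked_keys(sample))
    # → ['sk-proj-abc123def456ghi789jkl012']
```

Wire it into a pre-commit hook that runs the scan over `git diff --cached` and exits non-zero on any hit.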
OpenAI
GPT-4o, o3, embeddings, Whisper, DALL-E
1. Go to platform.openai.com/api-keys
2. Click "Create new secret key"
3. Copy the key immediately — you cannot see it again
4. Set a spending limit at Settings → Limits (do this first!)
5. Add billing at Settings → Billing
OPENAI_API_KEY=sk-proj-...
Tip: Create project-specific keys with limited permissions. If one leaks, you revoke only that key.
Anthropic
Claude Opus, Sonnet, Haiku
1. Go to console.anthropic.com/settings/keys
2. Click "Create Key"
3. Copy the key
4. Set workspace spending limits before doing anything else
5. Add credits at console.anthropic.com/settings/billing
ANTHROPIC_API_KEY=sk-ant-api03-...
Hugging Face
Models, datasets, Inference API, Spaces
1. Go to huggingface.co/settings/tokens
2. Click "Create new token"
3. Select "Read" for downloading models, "Write" for pushing models
4. Copy the token
HF_TOKEN=hf_...
Free tier is generous. Gated models (Llama, Gemma) require accepting a license on the model page before your token works.
Loading API Keys in Python
import os
from dotenv import load_dotenv
# Load .env file into environment variables
load_dotenv()
# Access keys — never hardcode these
openai_key = os.getenv("OPENAI_API_KEY")
anthropic_key = os.getenv("ANTHROPIC_API_KEY")
# Fail fast if key is missing (better than a cryptic API error)
if not openai_key:
    raise ValueError(
        "OPENAI_API_KEY not found. "
        "Create a .env file with: OPENAI_API_KEY=sk-proj-..."
    )
# Use with the OpenAI client
from openai import OpenAI
client = OpenAI()  # Reads OPENAI_API_KEY from env automatically

Install the loader first: uv add python-dotenv
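For intuition, `load_dotenv` is doing little more than parsing KEY=VALUE lines into `os.environ`. A stripped-down sketch of the idea (the real python-dotenv also handles quoting, multiline values, and variable interpolation; `load_env_file` is a hypothetical name):

```python
import os

def load_env_file(path: str = ".env") -> None:
    """Minimal .env loader: put KEY=VALUE lines into os.environ."""
    try:
        with open(path) as f:
            lines = f.readlines()
    except FileNotFoundError:
        return  # a missing .env file is not an error
    for line in lines:
        line = line.strip()
        # Skip blanks and comments
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:
            # Don't clobber variables already set in the shell
            os.environ.setdefault(key.strip(), value.strip())
```

This also explains the "shell wins" behavior: a variable exported in your terminal takes precedence over the same name in .env.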
Production note
In production, don't use .env files. Use your platform's secret management: Vercel Environment Variables, AWS Secrets Manager, GCP Secret Manager, or Kubernetes secrets. The .env pattern is for local development only.
GPU Setup for Local Models
API-based models (GPT-4o, Claude) work on any machine. Local models need GPU acceleration.
Running inference on a 7-billion-parameter model takes ~2 seconds on a GPU and ~2 minutes on a CPU. Training is even more extreme: what takes 1 hour on an A100 takes 1 week on a CPU. If you plan to run models locally, GPU setup is not optional — it's a prerequisite.
NVIDIA GPU (CUDA)
Best ML support. NVIDIA dominates ML because CUDA has been the standard for 15+ years. Every framework (PyTorch, TensorFlow, JAX) has first-class CUDA support. You need: (1) an NVIDIA GPU with sufficient VRAM, (2) the NVIDIA driver, (3) the CUDA toolkit, (4) PyTorch compiled for your CUDA version.
# 1. Check your NVIDIA driver version
nvidia-smi
# Look for "CUDA Version: 12.x" in top right — this is driver capability
# 2. Install CUDA toolkit (Ubuntu)
# Visit: https://developer.nvidia.com/cuda-downloads
# Or: sudo apt install nvidia-cuda-toolkit
# 3. Install PyTorch with matching CUDA version
uv add torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu124
# 4. Verify CUDA is available in Python
python -c "import torch; print(f'CUDA available: {torch.cuda.is_available()}'); print(f'GPU: {torch.cuda.get_device_name(0)}')"
# CUDA available: True
# GPU: NVIDIA GeForce RTX 4090

VRAM guide: 8 GB handles 7B models (quantized). 16 GB handles 13B. 24 GB (RTX 4090) handles 30B quantized. 80 GB (A100/H100) handles 70B quantized; a 70B model at full fp16 precision needs roughly 140 GB and therefore multiple GPUs. When in doubt, check the model card on Hugging Face for VRAM requirements.
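The arithmetic behind that guide is simple: weight memory is parameter count times bytes per parameter, plus a cushion for activations and the KV cache (the ~20% factor below is a rule of thumb, not a measurement). A back-of-envelope estimator (`estimate_vram_gb` is illustrative, not a library function):

```python
def estimate_vram_gb(params_billions: float,
                     bits_per_param: int = 16,
                     overhead: float = 1.2) -> float:
    """Rough inference VRAM estimate: weights + ~20% overhead."""
    bytes_for_weights = params_billions * 1e9 * bits_per_param / 8
    return bytes_for_weights * overhead / 1e9

print(f"7B  fp16:  {estimate_vram_gb(7, 16):.1f} GB")   # → 16.8 GB
print(f"7B  4-bit: {estimate_vram_gb(7, 4):.1f} GB")    # → 4.2 GB
print(f"70B fp16:  {estimate_vram_gb(70, 16):.1f} GB")  # → 168.0 GB
```

The numbers line up with the guide: a 4-bit 7B model fits in 8 GB, and full-precision 70B is far beyond a single 80 GB card.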
Apple Silicon (M1/M2/M3/M4)
Apple's Metal Performance Shaders (MPS) backend works with PyTorch out of the box. No CUDA, no driver installation. Unified memory means your GPU shares RAM with the CPU — an M4 Max with 128 GB unified memory can load models that would require an A100.
# Install PyTorch (MPS support included automatically)
uv add torch torchvision torchaudio
# Verify MPS is available
python -c "import torch; print(f'MPS available: {torch.backends.mps.is_available()}')"
# MPS available: True
# Use MPS in your code
import torch
device = torch.device("mps" if torch.backends.mps.is_available() else "cpu")
model = model.to(device)
inputs = inputs.to(device)

Caveat: some operations fall back to CPU silently on MPS. Check PyTorch MPS coverage docs if you hit unexpected slowdowns.
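The cuda → mps → cpu preference order generalizes into a small helper. It is sketched here as pure selection logic (`pick_device` is a hypothetical name) so it reads without torch installed; in real code the two flags come from `torch.cuda.is_available()` and `torch.backends.mps.is_available()`:

```python
def pick_device(cuda_available: bool, mps_available: bool) -> str:
    """Prefer CUDA, then Apple MPS, then CPU."""
    if cuda_available:
        return "cuda"
    if mps_available:
        return "mps"
    return "cpu"

# In real code:
#   import torch
#   device = torch.device(pick_device(torch.cuda.is_available(),
#                                     torch.backends.mps.is_available()))
```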
No GPU? You Have Options
No GPU does not mean you can't do AI engineering. Most work in this course uses API-based models that run on someone else's hardware. For local inference:
Ollama (easiest)
# Install from ollama.com, then:
ollama run llama3.2:3b   # Runs on CPU with decent speed
Cloud GPUs (cheapest)
# Google Colab: free T4 GPU
# Lambda Labs: $0.50/hr for A10
# Vast.ai: $0.30/hr for RTX 4090
# RunPod: $0.40/hr for A40
For this course, a laptop with no GPU and $5/month in API credits is sufficient for every lesson through Level 2. GPU becomes important when you start fine-tuning in Level 3.
Essential Packages
AI engineering sits at the intersection of several ecosystems. Here's what each package does and why it's in your stack.
# === LLM API Clients ===
openai # GPT-4o, embeddings, Whisper, DALL-E
anthropic # Claude Opus, Sonnet, Haiku
# === Hugging Face Ecosystem ===
transformers # 300,000+ pre-trained models, inference + fine-tuning
sentence-transformers # Purpose-built embedding models (SBERT, BGE, E5)
datasets # 100,000+ datasets with streaming support
accelerate # Multi-GPU, mixed-precision training utilities
# === Deep Learning Framework ===
torch # PyTorch — the dominant ML framework
# Install separately with CUDA if needed (see GPU section)
# === Tokenization ===
tiktoken # OpenAI's fast BPE tokenizer (count tokens before API calls)
# === Environment ===
python-dotenv # Load .env files into os.environ
# === Data Processing ===
numpy # N-dimensional arrays, linear algebra
pandas # DataFrames for structured data
# === HTTP Clients ===
httpx # Modern async HTTP client (used by openai/anthropic SDKs)
# === Vector Databases (pick one to start) ===
chromadb # Embedded vector DB, zero config, great for prototyping
# pinecone-client # Managed cloud vector DB, scales to billions
# qdrant-client # Self-hosted or cloud, rich filtering
# === Optional: Local Inference ===
# ollama # Python client for Ollama local models
# vllm                # High-throughput inference server

Quick Install (Copy-Paste)
# With uv (recommended) — creates project + installs everything
uv init ai-starter && cd ai-starter
uv add openai anthropic transformers sentence-transformers \
tiktoken python-dotenv numpy pandas httpx chromadb
# Verify core packages
uv run python -c "import openai, anthropic, transformers; print('All good!')"
# === OR with pip (if you prefer the classic way) ===
python -m venv .venv && source .venv/bin/activate
pip install openai anthropic transformers sentence-transformers \
tiktoken python-dotenv numpy pandas httpx chromadb

Docker for Reproducibility
Docker ensures your environment works identically across machines. For local development it's optional. For production ML pipelines, CI/CD, and team collaboration, it's essential. The key insight: you ship the environment, not just the code.
FROM python:3.12-slim
WORKDIR /app
# Install uv (fast, no pip needed)
COPY --from=ghcr.io/astral-sh/uv:latest /uv /usr/local/bin/uv
# Copy dependency files first (Docker layer caching)
COPY pyproject.toml uv.lock ./
# Install dependencies (cached if pyproject.toml unchanged)
RUN uv sync --frozen --no-dev
# Copy application code
COPY . .
# Run — uv runs inside the venv automatically
CMD ["uv", "run", "python", "main.py"]

GPU with Docker
# Use NVIDIA base image
FROM nvidia/cuda:12.4.1-runtime-ubuntu22.04
# Run with GPU access
docker run --gpus all my-ai-app
# Or use docker compose
services:
  app:
    build: .
    deploy:
      resources:
        reservations:
          devices:
            - capabilities: [gpu]

Requires the NVIDIA Container Toolkit installed on the host.
.dockerignore (important!)
.env
.env.*
.venv/
__pycache__/
*.pyc
.git/
data/
models/
*.bin
*.safetensors

Without this, Docker copies your 15 GB model weights and .env secrets into the image. Model weights should be downloaded at runtime or mounted as volumes.
Development Environments
Jupyter Notebooks
Good for:
- Exploration and prototyping
- Inline visualization (matplotlib, plotly)
- Iterating on prompts and embeddings
- Sharing results with non-engineers

Watch out for:
- Hidden state (cells run out of order)
- API keys in output cells (leaked to .ipynb JSON)
- Hard to test, hard to code review
uv add jupyter
uv run jupyter notebook

Python Scripts + IDE
Good for:
- Production code and services
- Version control (clean diffs)
- Testing and CI/CD pipelines
- Refactoring and type checking

Rules of thumb:
- Prototype in Jupyter, productionize as scripts
- Never deploy notebooks to production
uv run python main.py
uv run pytest tests/

IDE Recommendations
- VS Code: free, excellent Python/Jupyter extensions, integrated terminal, GitHub Copilot support.
- Cursor: VS Code fork with AI-native features. Tab completion, inline chat, codebase-aware generation.
- PyCharm: the most powerful Python IDE. Best debugging, refactoring, and type inference. The Pro version has Jupyter support.
Verification Script
Copy this script into your project and run it. It checks every component of your setup and tells you exactly what's missing. Fix anything marked MISSING before proceeding.
#!/usr/bin/env python3
"""Verify AI development environment setup.

Run: uv run python verify_setup.py
"""
import sys
import shutil


def header(msg: str) -> None:
    print(f"\n{'=' * 50}")
    print(f" {msg}")
    print(f"{'=' * 50}")


def check(label: str, ok: bool, fix: str = "") -> bool:
    status = "OK" if ok else "MISSING"
    color = "\033[92m" if ok else "\033[91m"
    reset = "\033[0m"
    print(f" {color}[{status}]{reset} {label}")
    if not ok and fix:
        print(f"       Fix: {fix}")
    return ok


def main():
    all_ok = True

    header("1. Python Version")
    v = sys.version_info
    print(f" Python {v.major}.{v.minor}.{v.micro}")
    all_ok &= check("Python >= 3.10", v >= (3, 10), "uv python install 3.12")

    header("2. Package Manager")
    has_uv = shutil.which("uv") is not None
    check("uv installed", has_uv, "curl -LsSf https://astral.sh/uv/install.sh | sh")

    header("3. Core Packages")
    packages = [
        ("openai", "openai", "uv add openai"),
        ("anthropic", "anthropic", "uv add anthropic"),
        ("transformers", "transformers", "uv add transformers"),
        ("sentence_transformers", "sentence-transformers", "uv add sentence-transformers"),
        ("tiktoken", "tiktoken", "uv add tiktoken"),
        ("torch", "torch", "uv add torch"),
        ("dotenv", "python-dotenv", "uv add python-dotenv"),
        ("numpy", "numpy", "uv add numpy"),
        ("pandas", "pandas", "uv add pandas"),
    ]
    for import_name, display_name, fix_cmd in packages:
        try:
            __import__(import_name)
            check(display_name, True)
        except ImportError:
            all_ok &= check(display_name, False, fix_cmd)

    header("4. GPU Availability")
    try:
        import torch
        if torch.cuda.is_available():
            name = torch.cuda.get_device_name(0)
            vram = torch.cuda.get_device_properties(0).total_memory / 1e9
            check(f"CUDA GPU: {name} ({vram:.0f} GB)", True)
        elif hasattr(torch.backends, "mps") and torch.backends.mps.is_available():
            check("Apple MPS (Metal)", True)
        else:
            check("GPU", False, "CPU only — fine for API work, see GPU section")
    except ImportError:
        check("GPU check", False, "Install torch first")

    header("5. API Keys (.env)")
    try:
        from dotenv import load_dotenv
        import os
        load_dotenv()
        for key, name in [
            ("OPENAI_API_KEY", "OpenAI"),
            ("ANTHROPIC_API_KEY", "Anthropic"),
            ("HF_TOKEN", "Hugging Face"),
        ]:
            val = os.getenv(key, "")
            if val:
                masked = val[:8] + "..." + val[-4:]
                check(f"{name}: {masked}", True)
            else:
                check(f"{name}", False, f"Add {key}=... to .env file")
    except ImportError:
        check("python-dotenv", False, "uv add python-dotenv")

    header("6. Quick API Test")
    try:
        import os
        from dotenv import load_dotenv
        load_dotenv()
        if os.getenv("OPENAI_API_KEY"):
            from openai import OpenAI
            client = OpenAI()
            r = client.chat.completions.create(
                model="gpt-4o-mini",
                messages=[{"role": "user", "content": "Say 'setup verified' in 2 words"}],
                max_tokens=10,
            )
            check(f"OpenAI API: {r.choices[0].message.content}", True)
        else:
            check("OpenAI API", False, "Set OPENAI_API_KEY to test")
    except Exception as e:
        all_ok &= check(f"OpenAI API: {e}", False)

    header("Result")
    if all_ok:
        print(" All checks passed. You're ready to build.")
    else:
        print(" Some checks failed. Fix the items above before continuing.")
    print()


if __name__ == "__main__":
    main()

Run the Verification
# Save as verify_setup.py in your project, then:
uv run python verify_setup.py
# Or if using pip:
python verify_setup.py

Common Issues and Fixes
These are the errors you will hit. Not might — will. Bookmark this section.
ModuleNotFoundError: No module named 'torch'
The most common error in AI development. Either the package isn't installed, or you're running Python from the wrong environment.
# Check which Python you're actually using
which python # Should point to .venv/bin/python
python -c "import sys; print(sys.executable)"
# If it points to /usr/bin/python — your venv isn't activated
source .venv/bin/activate # activate it
# Or use: uv run python script.py   # uv handles it for you

CUDA out of memory
The model doesn't fit in your GPU's VRAM. This is a hardware limit, not a bug.
# Check how much VRAM you have
nvidia-smi # Look for "Memory-Usage"
# Solutions (pick one):
# 1. Use a smaller model (7B instead of 70B)
# 2. Use quantization (4-bit uses ~4x less VRAM)
from transformers import BitsAndBytesConfig
bnb_config = BitsAndBytesConfig(load_in_4bit=True)
# 3. Reduce batch size
# 4. Use CPU offloading (slower but works)
from transformers import AutoModel
model = AutoModel.from_pretrained("model", device_map="auto")

openai.AuthenticationError: Incorrect API key
Your key is wrong, expired, or not being loaded from .env.
# Debug: is the key actually loaded?
python -c "import os; from dotenv import load_dotenv; load_dotenv(); print(os.getenv('OPENAI_API_KEY', 'NOT SET')[:12])"
# Common causes:
# 1. .env file is in wrong directory (must be in cwd or parent)
# 2. Key has trailing whitespace (copy-paste artifact)
# 3. Key was revoked on platform.openai.com
# 4. load_dotenv() not called before os.getenv()

torch.cuda.is_available() returns False
PyTorch was installed without CUDA support, or CUDA toolkit version doesn't match.
# Check which PyTorch you have
python -c "import torch; print(torch.__version__)"
# If it says "2.2.0+cpu" — you have the CPU-only build
# Fix: reinstall with CUDA
uv remove torch && uv add torch --index-url https://download.pytorch.org/whl/cu124
# Verify driver compatibility
nvidia-smi  # Driver CUDA version must be >= PyTorch CUDA version

openai.RateLimitError: Rate limit reached
Too many API requests too fast, or you've hit your spending limit.
# Add exponential backoff to your API calls
import time
import openai

for attempt in range(5):
    try:
        response = client.chat.completions.create(...)
        break
    except openai.RateLimitError:
        wait = 2 ** attempt  # 1, 2, 4, 8, 16 seconds
        print(f"Rate limited. Waiting {wait}s...")
        time.sleep(wait)
else:
    raise RuntimeError("Still rate limited after 5 attempts")

Key Takeaways
1. Use uv for everything — Python versions, virtual environments, package installation, and lockfiles. One tool replaces five.
2. API keys are secrets — create .gitignore before .env. Set spending limits before writing code. Never commit keys.
3. GPU is optional for starting out — APIs run on provider hardware. Local models need a GPU. Cloud GPUs are cheap if you don't own one.
4. Run the verification script — five minutes of verification now saves hours of debugging later. Fix every MISSING item.
5. Docker is for production, not prototyping — use it when you need reproducibility across machines. Skip it while you're learning.