Appendix J — Using Conda for Environment Management
J.1 Introduction to Conda
Conda is a powerful open-source package and environment management system that runs on Windows, macOS, and Linux. While similar to the virtual environment tools covered in the main text, conda offers distinct advantages for certain Python workflows, particularly in data science, scientific computing, and research domains.
Unlike tools that focus solely on Python packages, conda can package and distribute software for any language, making it especially valuable for projects with complex dependencies that extend beyond the Python ecosystem.
J.2 When to Consider Conda
Conda is particularly well-suited for:
- Data science projects requiring scientific packages (NumPy, pandas, scikit-learn, etc.)
- Research environments with mixed-language requirements (Python, R, C/C++ libraries)
- Projects with complex binary dependencies that are difficult to compile
- Cross-platform development where consistent environments across operating systems are crucial
- GPU-accelerated computing requiring specific CUDA versions
- Bioinformatics, computational physics, and other specialized scientific domains
J.3 Conda vs. Other Environment Tools
Feature | Conda | venv + pip | uv |
---|---|---|---|
Focus | Any language packages | Python packages | Python packages |
Binary package distribution | Yes (pre-compiled) | Wheels (Python packages only) | Wheels (Python packages only) |
Dependency resolution | Environment-level solver | Package-level solver | Fast, improved solver |
Platform support | Windows, macOS, Linux | Windows, macOS, Linux | Windows, macOS, Linux |
Non-Python dependencies | Excellent | Limited | Limited |
Speed | Moderate | Moderate | Very fast |
Scientific package support | Excellent | Good | Good |
J.4 Getting Started with Conda
J.4.1 Installation
Conda is available through several distributions:
- Miniconda: Minimal installer containing just conda and its dependencies
- Anaconda: Full distribution including conda and 250+ popular data science packages
For most development purposes, Miniconda is recommended as it provides a minimal base that you can build upon as needed.
To install Miniconda:
# Linux
wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
bash Miniconda3-latest-Linux-x86_64.sh
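# macOS (curl ships with macOS; wget does not)
# On Apple Silicon, use Miniconda3-latest-MacOSX-arm64.sh instead
curl -O https://repo.anaconda.com/miniconda/Miniconda3-latest-MacOSX-x86_64.sh
bash Miniconda3-latest-MacOSX-x86_64.sh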
# Windows
# Download the installer from https://docs.conda.io/en/latest/miniconda.html
# and run it
J.4.2 Basic Conda Commands
J.4.2.1 Creating Environments
# Create a new environment with Python 3.10
conda create --name myenv python=3.10
# Create environment with specific packages
conda create --name datasci python=3.10 numpy pandas matplotlib
# Create environment from file
conda env create --file environment.yml
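When environment creation is scripted (for example in a project setup script), the same command can be assembled from Python. The following is a minimal sketch: conda_create_args is a hypothetical helper, not part of conda, and it only builds the argument list. Running it is left to subprocess so that the command can be inspected first.

```python
import shlex


def conda_create_args(name, python_version, packages=()):
    """Build the argv list for `conda create` (hypothetical helper)."""
    # --yes skips the interactive confirmation prompt.
    args = ["conda", "create", "--yes", "--name", name, f"python={python_version}"]
    args.extend(packages)
    return args


args = conda_create_args("datasci", "3.10", ["numpy", "pandas", "matplotlib"])
# Print the command as it would be typed in a shell.
print(shlex.join(args))
# To actually run it (requires conda on PATH):
#   import subprocess; subprocess.run(args, check=True)
```

Passing a list to subprocess.run avoids shell quoting problems with version constraints such as pandas>=1.4.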
J.4.2.2 Activating and Deactivating Environments
# Activate an environment
conda activate myenv
# Deactivate current environment
conda deactivate
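From inside Python, you can check which environment is active by reading the CONDA_PREFIX and CONDA_DEFAULT_ENV variables that conda activate sets. This is a small sketch; active_conda_env is a hypothetical helper name.

```python
import os


def active_conda_env(environ=None):
    """Return the active conda environment's name and prefix, or None."""
    environ = os.environ if environ is None else environ
    prefix = environ.get("CONDA_PREFIX")
    if not prefix:
        # No environment is active in this process.
        return None
    return {"name": environ.get("CONDA_DEFAULT_ENV"), "prefix": prefix}


print(active_conda_env())
```

The variables describe the shell that launched Python, so a script started without activation reports None even if its interpreter lives in a conda environment.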
J.4.2.3 Managing Packages
# Install packages
conda install numpy pandas
# Install from specific channel
conda install -c conda-forge scikit-learn
# Update packages
conda update numpy
# Remove packages
conda remove pandas
# List installed packages
conda list
J.4.2.4 Environment Management
# List all environments
conda env list
# Remove an environment
conda env remove --name myenv
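# Export environment to file (pins exact builds for the current platform;
# add --from-history to export only the packages you explicitly requested)
conda env export > environment.yml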
# Clone an environment
conda create --name newenv --clone oldenv
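For tooling that needs the list of environments, conda env list accepts --json, which prints an object with an "envs" array of environment paths. The sketch below parses that output; parse_env_list and list_envs are hypothetical helper names, and the sample paths are made up for illustration.

```python
import json
import subprocess
from pathlib import PurePath


def parse_env_list(json_text):
    """Map each environment's directory name to its full path."""
    data = json.loads(json_text)
    return {PurePath(path).name: path for path in data.get("envs", [])}


def list_envs():
    """Run `conda env list --json` and parse it (requires conda on PATH)."""
    result = subprocess.run(
        ["conda", "env", "list", "--json"],
        check=True, capture_output=True, text=True,
    )
    return parse_env_list(result.stdout)


# Made-up sample output, so the parser can be tried without conda installed.
sample = '{"envs": ["/opt/miniconda3", "/opt/miniconda3/envs/myenv"]}'
for name, path in parse_env_list(sample).items():
    print(f"{name}: {path}")
```

Note that the base environment appears under its installation directory name (here miniconda3), not as "base".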
J.5 Environment Files with Conda
Conda uses YAML files to define environments, making them easily shareable and reproducible:
# environment.yml
name: datasci
channels:
  - conda-forge
  - defaults
dependencies:
  - python=3.10
  - numpy=1.23
  - pandas>=1.4
  - matplotlib
  - scikit-learn
  - pip
  - pip:
      - some-package-only-on-pypi
This file defines:
- The environment name (datasci)
- Channels to search for packages (in priority order)
- Conda packages with optional version constraints
- Additional pip packages to install
Create this environment with:
conda env create -f environment.yml
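Once a YAML parser has loaded the environment file, the dependencies entry is a list of strings plus one nested mapping for the pip section. The following sketch separates the two and splits each conda spec into a name and a version constraint. The helper names are hypothetical, and the regular expression is a simplification that covers only the spec forms used in this appendix, not conda's full match specification syntax.

```python
import re

# Package name, then whatever follows as the version constraint.
_SPEC = re.compile(r"^\s*([A-Za-z0-9_.\-]+)\s*(.*)$")


def split_dependencies(dependencies):
    """Separate conda specs (strings) from the nested pip section (a dict)."""
    conda_specs, pip_specs = [], []
    for item in dependencies:
        if isinstance(item, dict):
            pip_specs.extend(item.get("pip", []))
        else:
            conda_specs.append(item)
    return conda_specs, pip_specs


def parse_spec(spec):
    """Split a spec like 'pandas>=1.4' into ('pandas', '>=1.4')."""
    match = _SPEC.match(spec)
    if match is None:
        raise ValueError(f"cannot parse spec: {spec!r}")
    constraint = match.group(2).strip()
    return match.group(1), constraint or None


# The dependencies list as it would look after loading the file above.
dependencies = [
    "python=3.10", "numpy=1.23", "pandas>=1.4", "matplotlib",
    "scikit-learn", "pip", {"pip": ["some-package-only-on-pypi"]},
]
conda_specs, pip_specs = split_dependencies(dependencies)
for spec in conda_specs:
    print(parse_spec(spec))
print("pip:", pip_specs)
```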
J.6 Best Practices for Conda
J.6.1 Channel Management
Conda packages come from “channels.” The main ones are:
- defaults: Official Anaconda channel
- conda-forge: Community-led channel with more up-to-date packages
For consistent environments, list channels explicitly in your environment files, in order of priority:
channels:
  - conda-forge
  - defaults
This prioritizes conda-forge packages over defaults when both are available.
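To make the ordering rule concrete, here is a toy model of strict channel priority: a package is taken from the first listed channel that carries it, and lower channels are not consulted for that package. This is an illustration only, not conda's solver. The function name and the index contents are invented for the example.

```python
def pick_channel_strict(package, channels, index):
    """Return the first channel (in priority order) that carries the package."""
    for channel in channels:
        if package in index.get(channel, set()):
            return channel
    return None


# Invented index: which package names each channel offers.
index = {
    "conda-forge": {"numpy", "scikit-learn"},
    "defaults": {"numpy", "pandas"},
}
channels = ["conda-forge", "defaults"]

for package in ["numpy", "pandas", "missing-package"]:
    print(package, "->", pick_channel_strict(package, channels, index))
```

With conda's default flexible priority, the solver may still fall back to a lower channel to satisfy version constraints, which this toy model does not capture.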
J.6.2 Minimizing Environment Size
Conda environments can become large. Keep them streamlined by:
- Only installing what you need
- Using the --no-deps flag when appropriate
- Considering a minimal base environment with conda create --name myenv python
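To see how large an environment actually is, you can sum the file sizes under its prefix. This is a rough sketch using only the standard library; dir_size_bytes is a hypothetical helper. Because conda typically hard-links files from its shared package cache, adding up the totals of several environments can overstate the real disk usage.

```python
import os


def dir_size_bytes(root):
    """Sum the sizes of regular files under root, skipping symlinks."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            path = os.path.join(dirpath, filename)
            if not os.path.islink(path):
                total += os.path.getsize(path)
    return total


# Measure the active environment, if there is one.
prefix = os.environ.get("CONDA_PREFIX")
if prefix:
    print(f"{prefix}: {dir_size_bytes(prefix) / 1e6:.1f} MB")
```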
J.6.3 Managing Conflicting Dependencies
When facing difficult dependency conflicts:
# Create environment with strict channel priority
conda create --name myenv python=3.10 --strict-channel-priority
# Or use the libmamba solver for better resolution
# (already the default solver in conda 23.10 and later)
conda install -n base conda-libmamba-solver
conda create --name myenv python=3.10 --solver=libmamba
J.6.4 Combining Conda with pip
While conda can install most packages, some are only available on PyPI. The recommended approach:
- Install all conda-available packages first using conda
- Then install PyPI-only packages using pip
An environment.yml file with a pip section follows this order automatically: conda installs its own packages first, then runs pip for the entries in the pip section.
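To audit which packages in an environment came from pip, one option is to filter the output of conda list --json. The sketch below assumes that pip-installed packages are reported with the channel "pypi", which matches what conda list displays for them; pip_installed is a hypothetical helper, and the sample entries and versions are invented.

```python
import json


def pip_installed(conda_list_json):
    """Return names of packages whose channel is reported as 'pypi'."""
    packages = json.loads(conda_list_json)
    # Assumption: conda marks pip-installed packages with channel "pypi".
    return sorted(p["name"] for p in packages if p.get("channel") == "pypi")


# Invented sample in the shape of `conda list --json` output.
sample = json.dumps([
    {"name": "numpy", "version": "1.23.5", "channel": "conda-forge"},
    {"name": "pip", "version": "23.0", "channel": "conda-forge"},
    {"name": "some-package-only-on-pypi", "version": "0.1.0", "channel": "pypi"},
])
print(pip_installed(sample))
```

A short list here is a good sign: the fewer packages pip manages, the less conda has to work around when it next updates the environment.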
J.6.5 Environment Isolation from System Python
Avoid using your system Python installation with conda. Instead:
# Explicitly create all environments with a specific Python version
conda create --name myenv python=3.10
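To confirm that a script is running on an environment's interpreter and not the system Python, you can check for the conda-meta directory that conda keeps in every environment prefix. A minimal sketch, with is_conda_prefix as a hypothetical helper name:

```python
import os
import sys


def is_conda_prefix(prefix):
    """Return True if prefix looks like a conda environment."""
    # Every conda environment stores its package records in conda-meta/.
    return os.path.isdir(os.path.join(prefix, "conda-meta"))


print("interpreter:", sys.executable)
print("conda environment:", is_conda_prefix(sys.prefix))
```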
J.7 Integration with Development Workflows
J.7.1 Using Conda with VS Code
VS Code can automatically detect and use conda environments:
- Install the Python extension
- Open the Command Palette (Ctrl+Shift+P, or Cmd+Shift+P on macOS)
- Select “Python: Select Interpreter”
- Choose your conda environment from the list
J.7.2 Using Conda with Jupyter
Conda integrates well with Jupyter notebooks:
# With the environment active (conda activate myenv), install Jupyter
conda install -c conda-forge jupyter
# Register your conda environment as a Jupyter kernel
conda install -c conda-forge ipykernel
python -m ipykernel install --user --name=myenv --display-name="Python (myenv)"
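The ipykernel install command writes a kernel.json file whose argv entry starts with the interpreter the kernel will launch. If a notebook seems to use the wrong packages, checking that path is a quick diagnostic. The sketch below parses such a file; kernel_interpreter is a hypothetical helper, and the sample path is invented. Running jupyter kernelspec list shows where the real files live.

```python
import json


def kernel_interpreter(kernel_json_text):
    """Return (display name, interpreter path) from a kernel.json document."""
    spec = json.loads(kernel_json_text)
    # argv[0] is the Python executable the kernel is started with.
    return spec["display_name"], spec["argv"][0]


# Invented sample in the shape ipykernel writes.
sample = json.dumps({
    "argv": [
        "/opt/miniconda3/envs/myenv/bin/python",
        "-m", "ipykernel_launcher", "-f", "{connection_file}",
    ],
    "display_name": "Python (myenv)",
    "language": "python",
})
name, interpreter = kernel_interpreter(sample)
print(f"{name} runs {interpreter}")
```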
J.7.3 CI/CD with Conda
For GitHub Actions, you can use conda environments:
name: Python CI with Conda
on: [push, pull_request]
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - name: Set up conda
        uses: conda-incubator/setup-miniconda@v2
        with:
          # Quoted so YAML does not read the version as the number 3.1
          python-version: "3.10"
          environment-file: environment.yml
          activate-environment: myenv
          auto-activate-base: false
      - name: Run tests
        shell: bash -l {0}
        run: |
          conda activate myenv
          pytest
J.8 Common Pitfalls and Solutions
J.8.1 Slow Environment Creation
Conda environments can take time to create due to dependency resolution:
# Use the faster libmamba solver (the default since conda 23.10)
conda install -n base conda-libmamba-solver
conda create --name myenv python=3.10 numpy pandas --solver=libmamba
J.8.2 Conflicting Channels
Mixing packages from different channels can cause conflicts:
# Use strict channel priority
conda config --set channel_priority strict
J.8.3 Large Environment Sizes
Conda environments can grow large, especially with the Anaconda distribution:
# Start minimal and add only what you need
conda create --name myenv python=3.10
conda install -n myenv numpy pandas
# Or use mamba for more efficient installations (install it into base)
conda install -n base -c conda-forge mamba
mamba create --name myenv python=3.10 numpy pandas
J.9 Mamba: A Faster Alternative
For large or complex environments, consider mamba, a reimplementation of conda’s package manager in C++:
# Install mamba into the base environment
conda install -n base -c conda-forge mamba
# Use mamba with the same syntax as conda
mamba create --name myenv python=3.10 numpy pandas
mamba install -n myenv scikit-learn
Mamba can be considerably faster at environment creation and package installation while staying compatible with conda commands. The gap is smaller on recent conda releases, which use the same libmamba solver by default.
J.10 Conclusion
Conda provides a robust solution for environment management, particularly valuable for scientific computing, data science, and research applications. While more complex than venv, it solves specific problems that other tools cannot easily address, especially when dealing with non-Python dependencies or cross-platform binary distribution.
For projects focusing purely on Python dependencies without complex binary requirements, the venv and uv approaches covered in the main text may provide simpler workflows. However, understanding conda remains valuable for many Python practitioners, especially those working in scientific domains.