Conda and pip - managing environments

Environment: “the world around you”. In the Python context, a way to define all the tools and libraries that will be loaded when code is run, in a manner that makes it possible to reload them later in a predictable and reproducible fashion. This requires being able to both describe the necessary packages and to install/remove them.

Conda: a tool to install packages and to manage environments. Open source, developed by Anaconda (a company). It can manage not only Python packages but also R packages, C tools and more. See the canonical conda docs for details.

Pip: the Python packaging tool. It only installs/removes Python packages, by default those hosted at the Python Package Index (PyPI).

We will mostly use Conda

We will use Conda to manage our environments and dependencies, but we may occasionally need to install packages with pip if they aren’t available via Conda.

One important note: conda create and conda env create are very similar, but not identical (this is for historical reasons; the two are now being merged). To create a new environment based on a few packages specified at the command line, use e.g.:

conda create -n myenv python=3.6 notebook pandas

but to create an env based on an environment.yml file, use:

conda env create -f environment.yml
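
If you want to script the two forms above, for example from a setup script, a minimal sketch could look like the following. The helper names conda_create_args and conda_env_create_args are hypothetical, chosen only for illustration; the sketch builds the argument lists and prints them without actually invoking conda.

```python
import shlex


def conda_create_args(name, packages):
    """Arguments for creating an env from packages named on the command line."""
    return ["conda", "create", "-n", name, *packages]


def conda_env_create_args(env_file="environment.yml"):
    """Arguments for creating an env from an environment file."""
    return ["conda", "env", "create", "-f", env_file]


# Hypothetical helpers applied to the two commands shown above.
from_packages = conda_create_args("myenv", ["python=3.6", "notebook", "pandas"])
from_file = conda_env_create_args("environment.yml")

# shlex.join (Python 3.8+) renders each list as a copy-pasteable shell command.
print(shlex.join(from_packages))
print(shlex.join(from_file))
```

Either list can be passed to subprocess.run(args, check=True) on a machine where conda is on the PATH.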

You can bootstrap the creation of the environment.yml file from an existing environment by using:

conda env export > environment.yml

A key feature of using environment.yml is that you can specify within it not only packages to be installed with Conda, but also packages to be installed with pip.

Here’s an example of a simple environment.yml file:

name: example-environment

dependencies:
  - python==3.4
  - numpy
  - toolz
  - matplotlib
  - dill
  - pandas
  - partd
  - bokeh
  - pip
  - pip:
    - git+https://github.com/blaze/dask.git#egg=dask[complete]
    - jupyterlab
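
To make the structure of such a file concrete, here is a small sketch of how the Conda and pip entries can be told apart once the file has been parsed. It assumes the file has already been loaded into plain Python objects, as a YAML parser would return them; the env_spec dictionary is a trimmed, hand-written stand-in for a file like the one above, and split_dependencies is a hypothetical helper name.

```python
# Hand-written stand-in for a parsed environment.yml (trimmed for brevity).
# Plain strings are Conda packages; the nested {"pip": [...]} mapping holds
# the packages that pip should install.
env_spec = {
    "name": "example-environment",
    "dependencies": [
        "python==3.4",
        "numpy",
        "pandas",
        {
            "pip": [
                "git+https://github.com/blaze/dask.git#egg=dask[complete]",
                "jupyterlab",
            ]
        },
    ],
}


def split_dependencies(spec):
    """Return (conda_packages, pip_packages) from a parsed environment spec."""
    conda_pkgs, pip_pkgs = [], []
    for dep in spec.get("dependencies", []):
        if isinstance(dep, dict):
            # The pip section appears as a one-key mapping inside the list.
            pip_pkgs.extend(dep.get("pip", []))
        else:
            conda_pkgs.append(dep)
    return conda_pkgs, pip_pkgs


conda_pkgs, pip_pkgs = split_dependencies(env_spec)
print("conda:", conda_pkgs)
print("pip:", pip_pkgs)
```

The point of the sketch is only that the pip section is a nested list under the same dependencies key, so a single file describes both kinds of packages.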

To see your environments: conda info -e

To remove an environment: conda remove -n example-environment --all
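
The list of environments can also be read from a script, since conda info accepts a --json flag. The sketch below is one possible approach, not an official API: env_names and list_conda_envs are hypothetical helper names, and the sample dictionary is made-up data that mimics only the two keys the sketch relies on (root_prefix and envs).

```python
import json
import os
import subprocess


def env_names(info):
    """Map each environment prefix in conda's info dict to a display name."""
    root = info.get("root_prefix")
    names = []
    for prefix in info.get("envs", []):
        if prefix == root:
            # The root installation is the environment conda calls "base".
            names.append("base")
        else:
            # Other environments are named after their directory.
            names.append(os.path.basename(prefix.rstrip("/\\")))
    return names


def list_conda_envs():
    """Run `conda info --json` and return env names (requires conda on PATH)."""
    result = subprocess.run(
        ["conda", "info", "--json"], check=True, capture_output=True, text=True
    )
    return env_names(json.loads(result.stdout))


# Made-up sample data with the same shape as the keys used above.
sample = {
    "root_prefix": "/opt/conda",
    "envs": ["/opt/conda", "/opt/conda/envs/example-environment"],
}
print(env_names(sample))
```

On a machine with conda installed, calling list_conda_envs() should return the same names that conda info -e displays.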