Docker Hub Sample images #50046

Closed
wants to merge 9 commits into from
27 changes: 0 additions & 27 deletions .github/workflows/code-checks.yml
@@ -133,33 +133,6 @@ jobs:
          asv machine --yes
          asv run --quick --dry-run --strict --durations=30 --python=same

  build_docker_dev_environment:
    name: Build Docker Dev Environment
    runs-on: ubuntu-22.04
    defaults:
      run:
        shell: bash -el {0}

    concurrency:
      # https://github.community/t/concurrecy-not-work-for-push/183068/7
      group: ${{ github.event_name == 'push' && github.run_number || github.ref }}-build_docker_dev_environment
      cancel-in-progress: true

    steps:
      - name: Clean up dangling images
        run: docker image prune -f

      - name: Checkout
        uses: actions/checkout@v3
        with:
          fetch-depth: 0

      - name: Build image
        run: docker build --pull --no-cache --tag pandas-dev-env .

      - name: Show environment
        run: docker run --rm pandas-dev-env python -c "import pandas as pd; print(pd.show_versions())"

  requirements-dev-text-installable:
    name: Test install requirements-dev.txt
    runs-on: ubuntu-22.04
10 changes: 10 additions & 0 deletions ci/dockerfiles/Dockerfile.alpine
@@ -0,0 +1,10 @@
FROM python:3.10.8-alpine

RUN apk update && apk upgrade
RUN apk add gcc g++ libc-dev

COPY requirements-minimal.txt /tmp
RUN python -m pip install -r /tmp/requirements-minimal.txt

WORKDIR /home/pandas
CMD ["/bin/sh"]
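
As a rough sketch of how an image like this one might be built locally: the `COPY requirements-minimal.txt /tmp` step implies that the build context must contain that file, so the example below assumes `ci/dockerfiles` is passed as the context. The helper name `build_cmd` is hypothetical, and the `pandas/pandas:<tag>` naming is borrowed from the documentation change in this PR. The function only prints the command; it does not run Docker.

```shell
#!/bin/sh
# Hypothetical helper: print the `docker build` command for one image flavor.
# Assumption: the Dockerfiles live in ci/dockerfiles (as in this PR) and the
# same directory is used as the build context, so that
# `COPY requirements-minimal.txt /tmp` can find the file.
build_cmd() {
    flavor="$1"              # e.g. alpine, pip-minimal, mamba-minimal
    context="ci/dockerfiles" # assumed build context
    printf 'docker build --pull -f %s/Dockerfile.%s -t pandas/pandas:%s %s\n' \
        "$context" "$flavor" "$flavor" "$context"
}

# Print the command for the alpine flavor without executing it.
build_cmd alpine
```

Piping the printed line to `sh` from the repository root would run the build. The *pip-all* and *mamba-all* Dockerfiles fetch their dependency lists by URL, so the choice of context matters less for them.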
13 changes: 13 additions & 0 deletions ci/dockerfiles/Dockerfile.mamba-all
@@ -0,0 +1,13 @@
FROM quay.io/condaforge/mambaforge

RUN apt update && apt upgrade -y
RUN DEBIAN_FRONTEND=noninteractive apt install -y tzdata

RUN mamba env create -f \
https://raw.githubusercontent.com/pandas-dev/pandas/main/environment.yml

RUN mamba init
RUN echo "\nmamba activate pandas-dev" >> ~/.bashrc
RUN mamba clean --all -qy

WORKDIR /home/pandas
21 changes: 21 additions & 0 deletions ci/dockerfiles/Dockerfile.mamba-minimal
@@ -0,0 +1,21 @@
FROM quay.io/condaforge/mambaforge

RUN apt update && apt upgrade -y
RUN DEBIAN_FRONTEND=noninteractive apt install -y tzdata

RUN mamba create -n pandas-dev \
Review comment (Member, Author): This package list is currently disconnected from requirements-minimal.txt. We could create a separate env file, break this up into two steps with a pip install, or leave it as is.

cython \
hypothesis \
numpy \
pytest \
pytest-asyncio \
python=3.10.8 \
pytz \
python-dateutil \
versioneer

RUN mamba init
RUN echo "\nmamba activate pandas-dev" >> ~/.bashrc
RUN mamba clean --all -qy

WORKDIR /home/pandas
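
One way to keep an eye on the drift mentioned in the review comment above, sketched under assumptions: the two package lists are copied by hand from this PR (with `python=3.10.8` left out of the mamba list, since the pip images take Python from their base image), and `only_in_one` is a hypothetical helper rather than anything in the pandas repository.

```shell
#!/bin/sh
# Hypothetical check: report packages that appear in only one of two lists.
# Assumption: neither list contains duplicates or version specifiers.
pip_pkgs="cython hypothesis numpy pytest pytest-asyncio pytz python-dateutil versioneer"
mamba_pkgs="cython hypothesis numpy pytest pytest-asyncio pytz python-dateutil versioneer"

only_in_one() {
    # Emit one word per line from both arguments (word splitting is
    # intentional), then keep the lines that occur exactly once.
    printf '%s\n' $1 $2 | sort | uniq -u
}

drift="$(only_in_one "$pip_pkgs" "$mamba_pkgs")"
if [ -n "$drift" ]; then
    echo "package lists differ: $drift"
else
    echo "package lists match"
fi
```

A check along these lines only flags a mismatch; it does not decide which of the options in the review comment is preferable.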
9 changes: 4 additions & 5 deletions Dockerfile → ci/dockerfiles/Dockerfile.pip-all
@@ -1,13 +1,12 @@
FROM python:3.10.8
WORKDIR /home/pandas

RUN apt-get update && apt-get -y upgrade
RUN apt-get install -y build-essential
RUN apt update && apt -y upgrade

# hdf5 needed for pytables installation
RUN apt-get install -y libhdf5-dev

RUN python -m pip install --upgrade pip
RUN python -m pip install \
RUN python -m pip install --use-deprecated=legacy-resolver \
-r https://raw.githubusercontent.com/pandas-dev/pandas/main/requirements-dev.txt

WORKDIR /home/pandas
CMD ["/bin/bash"]
10 changes: 10 additions & 0 deletions ci/dockerfiles/Dockerfile.pip-minimal
@@ -0,0 +1,10 @@
FROM python:3.10.8-slim

RUN apt update && apt upgrade -y
RUN apt install -y gcc g++

COPY requirements-minimal.txt /tmp
RUN python -m pip install -r /tmp/requirements-minimal.txt

WORKDIR /home/pandas
CMD ["/bin/bash"]
8 changes: 8 additions & 0 deletions ci/dockerfiles/requirements-minimal.txt
@@ -0,0 +1,8 @@
cython
Review comment (Member, Author): This file has some overlap with #50339

hypothesis
numpy
pytest
pytest-asyncio
pytz
python-dateutil
versioneer
68 changes: 27 additions & 41 deletions doc/source/development/contributing_environment.rst
@@ -157,56 +157,42 @@ should already exist.
Option 3: using Docker
~~~~~~~~~~~~~~~~~~~~~~

pandas provides a ``DockerFile`` in the root directory to build a Docker image
with a full pandas development environment.
Instead of manually setting up a development environment, you can use `Docker
<https://docs.docker.com/get-docker/>`_. pandas provides pre-built images that serve a
variety of users. These images include:

**Docker Commands**
* alpine - a lightweight image for the absolute minimalist (note: this is experimental)
* pip-minimal - a pip-based installation with the minimum set of packages for building / testing
* mamba-minimal - a mamba-based installation with the minimum set of packages for building / testing
* pip-all - a pip-based installation with all testing dependencies
* mamba-all - a mamba-based installation with all testing dependencies

Build the Docker image::
If you are a new user and image size is not a concern, we suggest using one of the images
that includes all of the dependencies (*pip-all* or *mamba-all*), as this ensures you can run
the full test suite.

# Build the image
docker build -t pandas-dev .
To use any of the images, first run ``docker pull pandas/pandas:<tag>``, where ``<tag>``
is one of *alpine*, *pip-minimal*, *mamba-minimal*, *pip-all* or *mamba-all*. You can then run
the image without any extra configuration.

Run Container::
To illustrate, if you wanted to use the *pip-all* image, from the root of your local pandas project
you would run:

# Run a container and bind your local repo to the container
# This command assumes you are running from your local repo
# but if not alter ${PWD} to match your local repo path
docker run -it --rm -v ${PWD}:/home/pandas pandas-dev

*Even easier, you can integrate Docker with the following IDEs:*

**Visual Studio Code**

You can use the DockerFile to launch a remote session with Visual Studio Code,
a popular free IDE, using the ``.devcontainer.json`` file.
See https://code.visualstudio.com/docs/remote/containers for details.

**PyCharm (Professional)**

Enable Docker support and use the Services tool window to build and manage images as well as
run and interact with containers.
See https://www.jetbrains.com/help/pycharm/docker.html for details.

Step 3: build and install pandas
--------------------------------
.. code-block:: bash

You can now run::
docker pull pandas/pandas:pip-all
docker run --rm -it -v ${PWD}:/home/pandas pandas/pandas:pip-all

# Build and install pandas
python setup.py build_ext -j 4
python -m pip install -e . --no-build-isolation --no-use-pep517
Similarly, for *mamba-all*:

At this point you should be able to import pandas from your locally built version::
.. code-block:: bash

$ python
>>> import pandas
>>> print(pandas.__version__) # note: the exact output may differ
2.0.0.dev0+880.g2b9e661fbb.dirty
docker pull pandas/pandas:mamba-all
docker run --rm -it -v ${PWD}:/home/pandas pandas/pandas:mamba-all

This will create the new environment, and not touch any of your existing environments,
nor any existing Python installation.
The *mamba-* images automatically activate the ``pandas-dev`` environment for you on entry.

.. note::
You will need to repeat this step each time the C extensions change, for example
if you modified any file in ``pandas/_libs`` or if you did a fetch and merge from ``upstream/main``.

You may run the images from a directory other than the root of the pandas project. Just be
sure to replace ``${PWD}`` in the commands above with the path to your local pandas repository.
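
A minimal sketch of that substitution, assuming the checkout lives at a path you supply yourself: `run_cmd` and the example path are hypothetical, while the `docker run` flags and the `/home/pandas` mount point are the ones used in the commands above. The function prints the command instead of running it.

```shell
#!/bin/sh
# Hypothetical helper: print a `docker run` command that mounts a pandas
# checkout located somewhere other than the current directory.
run_cmd() {
    repo="$1"  # absolute path to the local pandas repository (assumed, no spaces)
    tag="$2"   # one of: alpine, pip-minimal, mamba-minimal, pip-all, mamba-all
    printf 'docker run --rm -it -v %s:/home/pandas pandas/pandas:%s\n' "$repo" "$tag"
}

# Example path only; replace it with the location of your own checkout.
run_cmd "$HOME/code/pandas" pip-all
```

The printed command differs from the documented one only in the host side of the `-v` mount, which is the part that ``${PWD}`` supplied before.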