
Deploying a Private PyPI Repository via Docker Compose

In enterprise environments, Python dependency management impacts development efficiency and deployment stability. This guide builds an offline, scalable, and highly available dependency repository using Docker.

DevOps · #DevOps #Docker #Python #Private Repository #Dependency Management #CI/CD #Automation

Preface

In enterprise environments, Python dependency management affects not only development efficiency but also the stability and controllability of deployments.

Restricted public network access, unavailable packages, and version inconsistencies all make a private PyPI repository essential infrastructure.

This article builds an offline, scalable, and highly available dependency repository based on Docker to meet enterprise delivery and operations requirements.


⚠️ Note

Adjust everything in the files below to your own environment: image names, the /data paths, port numbers, networks, hardware resources, and so on.


Environment Preparation

Install Docker

For Docker and Docker Compose installation steps, refer to the earlier articles on this blog.

Create Project Directory

mkdir /data/workspace/install-Pypi && cd /data/workspace/install-Pypi

Create Packages Directory

Used to store dependency packages you manually upload.

mkdir -p packages

Image Preparation

docker pull pypiserver/pypiserver:latest

Prepare Files

Write docker-compose.yml File

cd /data/workspace/install-Pypi && vim docker-compose.yml

version: '3.8'

services:
  pypiserver:
    image: pypiserver/pypiserver:latest
    container_name: pypiserver
    restart: always
    ports:
      - "8080:8080"
    volumes:
      - ./packages:/data/packages
    # -P . -a . disables password protection for downloads and uploads
    command: -P . -a . /data/packages

Start Service

docker-compose up -d
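Before testing, confirm the container came up cleanly (the container name matches the compose file above):

# Confirm the container is running
docker-compose ps

# Check the startup logs; pypiserver prints the address it listens on
docker logs --tail 20 pypiserver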

Test Service

Make sure at least one package exists in the repository, then install it as a quick test (assuming the intranet IP is 192.168.1.100).

pip install <your-dependency-package-name> -i http://192.168.1.100:8080/simple/ --trusted-host 192.168.1.100
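If pip is not yet configured on the test machine, a plain HTTP check also works; for example, listing the package index with curl:

# An empty repository returns an empty index page
curl http://192.168.1.100:8080/simple/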

Fetch and Store Dependencies (Two Scenarios)

1. Online Scenario

Since this machine has external network access, we download the dependencies listed in requirements.txt and store them directly in the private server's packages directory.

pip download -r requirements.txt \
  --only-binary=:all: \
  --find-links ./packages \
  --dest ./packages

Storage Explanation: After the command finishes, all .whl files are stored directly in the /data/workspace/install-Pypi/packages directory. Thanks to Docker's volume mapping, the pypiserver container picks up the newly added packages automatically; no restart is required.
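As an aside, if a developer machine cannot write to ./packages directly, pypiserver also accepts standard HTTP uploads. A minimal sketch using twine; because the compose file above disables authentication (-P . -a .), placeholder credentials should be accepted:

pip install twine

# Upload locally built wheels to the private repository
twine upload --repository-url http://192.168.1.100:8080/ \
  -u placeholder -p placeholder \
  dist/*.whl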


2. Offline Scenario

Execute the following operations on a Linux machine that has internet access.

⚠️ Enterprise Pitfall Avoidance Guide (Extremely Important): Python .whl packages are platform-specific, tied to both OS and CPU architecture. If the online machine used to download dependencies (e.g., a Mac or Windows box) differs in platform from your intranet deployment server (Linux x86_64 or ARM64), the downloaded packages will not install after being transferred to the intranet!

Strong Recommendation: Find an online Linux machine with exactly the same OS and architecture as your intranet server (e.g., both CentOS 7 x86_64 or both Ubuntu 22.04 ARM64) and run the download there. If that is truly impossible, a later tip shows how to force the target platform in the download command.
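If no such machine is available but the online box runs Docker, a container matching the target platform is a workable substitute. A sketch assuming the target is Linux x86_64 with Python 3.10:

# Run pip download inside an official Python image so the wheels
# match linux/amd64 rather than the host OS
docker run --rm \
  --platform linux/amd64 \
  -v "$(pwd)":/work -w /work \
  python:3.10 \
  pip download -r requirements.txt --only-binary=:all: --dest ./wheels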

On the online machine, open the terminal and create a clean directory to avoid confusion with other files.

# Create a directory named dify-offline-download
mkdir -p /root/dify-offline-download

# Enter the directory
cd /root/dify-offline-download

Upload or copy the requirements.txt file you need to install into the newly created /root/dify-offline-download directory.

# Check to confirm the file exists
ls -l requirements.txt

Download only compiled binary .whl packages, not source code, and store them all in the wheels folder in the current directory.

pip download -r requirements.txt --only-binary=:all: --find-links ./wheels --dest ./wheels

(Wait for the command to finish. The output will scroll rapidly with lines like Saved ./wheels/xxxx.whl; do not interrupt it until Successfully downloaded… appears.)

⚠️ Note:

Advanced Tip (if you can only use Mac/Windows to download for an intranet Linux server): you must force the target platform in the command. Taking a Linux x86_64 target server as an example:

pip download -r requirements.txt \
  --only-binary=:all: \
  --find-links ./wheels \
  --dest ./wheels \
  --platform manylinux2014_x86_64 \
  --python-version 3.10 \
  --implementation cp \
  --abi cp310

(If the architectures already match, ignore this tip and use the standard command above.)

After downloading, the wheels folder will contain dozens or even hundreds of files. Do not upload them one by one; stray files are easily lost. Package them into a single .tar.gz archive instead.

# Still execute in the /root/dify-offline-download directory:
tar -czvf dify-dependencies.tar.gz ./wheels
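Before the physical transfer, it is worth recording a checksum so that corruption in transit can be detected:

# Generate a checksum on the online machine
sha256sum dify-dependencies.tar.gz > dify-dependencies.tar.gz.sha256

# Verify on the offline machine after the transfer
sha256sum -c dify-dependencies.tar.gz.sha256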

Operations in Offline Environment (Extraction and Import)

Extract the archive you physically transferred in, then move its contents into the repository's dedicated directory.

# 1. Extract the dependency archive (produces a wheels folder)
tar -xzvf dify-dependencies.tar.gz

# 2. Move all extracted .whl files into the repository's packages directory
mv ./wheels/* ./packages/

# 3. Clean up the now-empty directory and the archive (keep the environment tidy)
rm -rf ./wheels dify-dependencies.tar.gz

Storage Explanation: The moment the mv command finishes and the packages land in the /data/workspace/install-Pypi/packages directory, pypiserver indexes them automatically and the repository is ready to serve.
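A quick end-to-end check from any intranet client confirms the import worked:

# The newly imported packages should now appear in the index
curl http://192.168.1.100:8080/simple/

# Install straight from the private repository
pip install -r requirements.txt \
  -i http://192.168.1.100:8080/simple/ \
  --trusted-host 192.168.1.100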


Documentation Supplement Section

Upstream Source (Fallback) Configuration in Online Environment

Explanation: This section applies only to Scenario 1 (the target deployment server has external network access). With a fallback configured, the private repository automatically redirects clients to a specified public mirror (such as the Tsinghua or Aliyun mirror) when a package is missing locally.

Solution A

Configure a single upstream fallback address on the server side (pypiserver). If you want the private server to centrally manage one stable upstream source (e.g., the Tsinghua mirror), modify the startup parameters in docker-compose.yml directly.

cd /data/workspace/install-Pypi && vim docker-compose.yml

Add the --fallback-url parameter

version: '3.8'

services:
  pypiserver:
    image: pypiserver/pypiserver:latest
    container_name: pypiserver
    restart: always
    ports:
      - "8080:8080"
    volumes:
      - ./packages:/data/packages
    command: -P . -a . --fallback-url https://pypi.tuna.tsinghua.edu.cn/simple/ /data/packages

Restart service to apply configuration

docker-compose down
docker-compose up -d
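A simple way to verify the fallback is to request a package that is not in ./packages; pypiserver should redirect the client to the Tsinghua mirror (requests here is just an example of a package assumed absent locally):

pip install requests \
  -i http://192.168.1.100:8080/simple/ \
  --trusted-host 192.168.1.100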

Solution B

Configure multiple source backups on the client side (Enterprise Recommended Solution)

Since pypiserver's --fallback-url accepts only one upstream address, a Tsinghua mirror outage would cause downloads to fail. In enterprise production, to ensure high availability, we usually do not configure a fallback on the server side; instead we use the pip client's own extra-index-url feature to achieve multi-source degradation.

Keep the server's docker-compose.yml in its cleanest state, without --fallback-url.

version: '3.8'

services:
  pypiserver:
    image: pypiserver/pypiserver:latest
    container_name: pypiserver
    restart: always
    ports:
      - "8080:8080"
    volumes:
      - ./packages:/data/packages
    command: -P . -a . /data/packages

Modify pip.conf of the deployment machine (client)

mkdir -p ~/.pip && vim ~/.pip/pip.conf

Write multi-source high-availability configuration

[global]
# Set a timeout so no single source can hang an install
timeout = 60

# Primary source: your private repository, consulted for every install
index-url = http://192.168.1.100:8080/simple/

# Additional sources: used when the private repository lacks a package
# (note: pip treats extra indexes as peers, not a strict priority order)
extra-index-url =
    https://pypi.tuna.tsinghua.edu.cn/simple/
    https://mirrors.aliyun.com/pypi/simple/
    https://pypi.org/simple/

# Trusted hosts: the private server (plain HTTP) plus every mirror domain
trusted-host =
    192.168.1.100
    pypi.tuna.tsinghua.edu.cn
    mirrors.aliyun.com
    pypi.org
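
Once pip.conf is in place, plain pip commands pick up the configuration automatically; a quick verification (pip config requires a reasonably recent pip):

# Show the configuration pip actually loaded
pip config list

# No -i flag needed any more; resolution follows pip.conf
pip install <your-dependency-package-name>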

Enterprise Advantage Explanation:

With Solution B, control of dependency resolution sits with the pip client, which consults the private repository at http://192.168.1.100:8080 on every install (note that pip treats index-url and extra-index-url as peers rather than a strict priority chain);

If a package is missing there, or the private server is down, installation still succeeds via the Tsinghua mirror;

If the Tsinghua mirror has network trouble, the Aliyun mirror or the official index can still serve the package;

The result is an extremely robust "Local Repository + Multi-Mirror Disaster Recovery" architecture.

