Docker Architecture and Components

Understanding the building blocks of the Docker ecosystem

Introduction to Docker

Docker is a platform that enables developers to build, package, and run applications in containers. It has become synonymous with containerization because it made containers accessible and practical for everyday development and deployment scenarios.

Our previous lecture introduced containerization concepts broadly. Today we'll focus on Docker specifically, exploring its architecture and the key components that make it work.

graph TD
    A[Docker Platform] --> B[Docker Engine]
    A --> C[Docker Hub]
    A --> D[Docker Compose]
    A --> E[Docker Desktop]
    B --> F[Container Runtime]
    B --> G[Image Management]
    B --> H[Networking]
    B --> I[Storage]

Docker Architecture Overview

Docker uses a client-server architecture, with a client component that communicates with a server (daemon) component using a REST API. This separation allows the Docker client to run on a different system than the Docker daemon, enabling remote management of Docker hosts.

graph TD
    A[Docker Client] -->|Commands| B[Docker Daemon]
    B -->|Manages| C[Containers]
    B -->|Manages| D[Images]
    B -->|Manages| E[Networks]
    B -->|Manages| F[Volumes]
    B --> G[containerd]
    G --> H[runc]
    I[Registry] <-->|Push/Pull Images| B

Think of this architecture like a restaurant: The client (you) places an order (command), the server (daemon) receives the order and delegates tasks to the kitchen staff (containerd, runc) who prepare your meal (container). The pantry (registry) provides ingredients (images) when needed.

Core Architectural Components

Docker Client

The Docker client is the primary way users interact with Docker. When you run commands like docker run or docker build, you're using the Docker client, which sends these commands to the Docker daemon for execution.
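As a rough sketch of what "sending a command to the daemon" means, the snippet below maps a few CLI subcommands to the Docker Engine API requests they roughly correspond to. The `CLI_TO_API` table and the `to_api_request` helper are illustrative names, not part of Docker. The real client also adds an API version prefix, headers, and further parameters.

```python
from urllib.parse import urlencode

# Illustrative table: CLI subcommand -> (HTTP method, Engine API path).
CLI_TO_API = {
    "ps": ("GET", "/containers/json"),
    "images": ("GET", "/images/json"),
    "version": ("GET", "/version"),
    "pull": ("POST", "/images/create"),
}


def to_api_request(subcommand, **params):
    """Return the HTTP method and path the client would request for a subcommand."""
    method, path = CLI_TO_API[subcommand]
    query = f"?{urlencode(params)}" if params else ""
    return f"{method} {path}{query}"


print(to_api_request("ps"))
print(to_api_request("pull", fromImage="ubuntu", tag="20.04"))
```

The point of the sketch is only that every CLI command ends up as an ordinary HTTP request, which is why the client and the daemon can live on different machines.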

Common Client Commands

# Run a container
docker run nginx

# List running containers
docker ps

# Build an image
docker build -t myapp .

# Pull an image from a registry
docker pull ubuntu:20.04

# Push an image to a registry
docker push myusername/myapp:1.0

The client is like the remote control for your TV. You press buttons on the remote (issue commands), but the TV itself (the daemon) does the actual work of changing channels or adjusting volume.

Docker Client Configuration

The Docker client can be configured to connect to different Docker daemons, allowing you to manage containers on remote systems. This is done using environment variables or configuration files:

# Connect to a remote Docker daemon
# (2376 is the conventional TLS port; 2375 is the unencrypted one)
export DOCKER_HOST=tcp://192.168.1.100:2376

# Use TLS and verify the daemon's certificate
export DOCKER_TLS_VERIFY=1
export DOCKER_CERT_PATH=/path/to/certs
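The following sketch shows, in simplified form, how a client might turn these environment variables into connection settings. It assumes the Linux default socket `unix:///var/run/docker.sock` and the CLI's `DOCKER_TLS_VERIFY` variable for enabling verified TLS. The `resolve_endpoint` function is a hypothetical helper, and it ignores Docker contexts and command-line flags, which the real client also consults.

```python
DEFAULT_HOST = "unix:///var/run/docker.sock"  # default local socket on Linux


def resolve_endpoint(env):
    """Pick the daemon endpoint and TLS settings from environment variables."""
    host = env.get("DOCKER_HOST", DEFAULT_HOST)
    # Any non-empty DOCKER_TLS_VERIFY value enables TLS with certificate verification.
    tls_verify = bool(env.get("DOCKER_TLS_VERIFY"))
    return {
        "host": host,
        "tls_verify": tls_verify,
        "cert_path": env.get("DOCKER_CERT_PATH"),
    }


local = resolve_endpoint({})
remote = resolve_endpoint({
    "DOCKER_HOST": "tcp://192.168.1.100:2376",
    "DOCKER_TLS_VERIFY": "1",
    "DOCKER_CERT_PATH": "/path/to/certs",
})
print(local["host"])
print(remote["host"], remote["tls_verify"])
```

With no variables set, the client talks to the local daemon over the UNIX socket. Setting `DOCKER_HOST` is all it takes to redirect every subsequent command to another machine.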

Docker Daemon

The Docker daemon (dockerd) is a persistent background process that manages Docker objects such as images, containers, networks, and volumes. It listens for Docker API requests and processes them accordingly.

If the Docker client is the remote control, the daemon is the TV's internal circuitry that actually performs the work. It constantly listens for incoming commands and carries them out.

Daemon Responsibilities

Daemon Configuration

The Docker daemon can be configured using a JSON configuration file, typically located at /etc/docker/daemon.json:

{
  "debug": true,
  "tls": true,
  "tlscert": "/var/docker/server.pem",
  "tlskey": "/var/docker/serverkey.pem",
  "hosts": ["tcp://192.168.1.10:2376"]
}
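Because a misconfigured listener exposes the daemon to the network, it can help to sanity-check a configuration before deploying it. The sketch below is a hypothetical linter, not a Docker tool. It uses only the keys shown above plus `tlsverify`, and its two rules are illustrative rather than exhaustive.

```python
import json


def check_daemon_config(text):
    """Return warnings for a daemon.json-style document (illustrative checks only)."""
    config = json.loads(text)
    warnings = []
    tcp_hosts = [h for h in config.get("hosts", []) if h.startswith("tcp://")]
    # A TCP listener without TLS lets anyone who can reach the port control the daemon.
    if tcp_hosts and not (config.get("tls") or config.get("tlsverify")):
        warnings.append("TCP listener without TLS: " + ", ".join(tcp_hosts))
    # Enabling TLS is only meaningful if a certificate and key are configured.
    if config.get("tls") and not (config.get("tlscert") and config.get("tlskey")):
        warnings.append("tls is enabled but tlscert/tlskey are missing")
    return warnings


secure = json.dumps({
    "tls": True,
    "tlscert": "/var/docker/server.pem",
    "tlskey": "/var/docker/serverkey.pem",
    "hosts": ["tcp://192.168.1.10:2376"],
})
insecure = json.dumps({"hosts": ["tcp://192.168.1.10:2375"]})

print(check_daemon_config(secure))
print(check_daemon_config(insecure))
```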

Security Considerations

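The Docker daemon runs with root privileges, which means anyone with access to the daemon effectively has root access to the host system. This underscores the importance of properly securing Docker installations, for example by restricting who can reach the daemon's socket and by protecting any TCP endpoint with TLS.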

containerd and runc

In 2016, Docker restructured its architecture to extract core container runtime functionality into separate components: containerd and runc. This modularization allowed these components to be used independently of Docker and contributed to the standardization of container runtimes.

Docker CLI → Docker Daemon (dockerd) → containerd → containerd-shim → runc

containerd

containerd is a daemon that manages the complete container lifecycle on a single host, from image transfer and storage to container execution and supervision.

In our restaurant analogy, if the Docker daemon is the head chef coordinating the kitchen, containerd is the station chef responsible for implementing the cooking processes.

runc

runc is a lightweight, portable container runtime that implements the Open Container Initiative (OCI) runtime specification. It's responsible for the low-level work of actually creating containers: setting up the namespaces and cgroups that isolate the process, then starting it.

Continuing our restaurant analogy, runc is the cook who actually prepares the individual dishes according to specific recipes.

containerd-shim

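The containerd-shim is a small process that sits between containerd and runc. Its main purposes are to let runc exit once the container has started, to keep the container's standard I/O streams open if containerd or dockerd restarts, and to report the container's exit status back to containerd.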

This component is like the kitchen expediter who ensures that finished dishes are properly presented and delivered to the customer, even if the chef is busy with other orders.

Docker Images and Layers

Docker images are read-only templates used to create containers. They're composed of filesystem layers that represent the file changes at each step of the image creation process.

graph TD
    A["Base Image Layer (e.g., ubuntu:20.04)"] --> B[Add Node.js Layer]
    B --> C[Add Application Code Layer]
    C --> D[Configure Environment Layer]
    D --> E[Final Image]
    E --> F["Container 1 with R/W Layer"]
    E --> G["Container 2 with R/W Layer"]
    E --> H["Container 3 with R/W Layer"]

Image Layering System

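Instructions in a Dockerfile that change the filesystem (RUN, COPY, ADD) each create a new layer in the image. Instructions such as EXPOSE and CMD only record metadata in the image configuration:

# Layer 1: Base image
FROM ubuntu:20.04

# Layer 2: Refresh the package index and install Node.js in a single step,
# so a cached, stale index is never reused with a changed package list
RUN apt-get update && apt-get install -y nodejs npm

# Set working directory (creates /app if it does not exist)
WORKDIR /app

# Layer 3: Copy application code
COPY . .

# Layer 4: Install dependencies
RUN npm install

# Metadata only: document the port the application listens on
EXPOSE 3000

# Metadata only: set startup command
CMD ["npm", "start"]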

Each layer only stores the changes from the previous layer, which makes image distribution more efficient. When you pull an image, Docker only downloads the layers you don't already have locally.
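A minimal sketch of why this saves bandwidth, assuming layers are identified by a digest of their content. The layer contents here are hypothetical placeholder bytes rather than real filesystem diffs, and `layer_id` and `layers_to_download` are illustrative helpers.

```python
import hashlib


def layer_id(content):
    """Identify a layer by a digest of its content, so identical layers share an ID."""
    return "sha256:" + hashlib.sha256(content).hexdigest()


def layers_to_download(image_layers, local_store):
    """Return the layer IDs of an image that are not already stored locally."""
    return [lid for lid in image_layers if lid not in local_store]


# Hypothetical layer contents standing in for real filesystem diffs.
base, node, app_v1, app_v2 = (
    layer_id(c) for c in (b"ubuntu", b"nodejs", b"app v1", b"app v2")
)

local_store = {base, node, app_v1}   # layers left behind by pulling version 1
version_2 = [base, node, app_v2]     # version 2 only changes the top layer

missing = layers_to_download(version_2, local_store)
print(len(missing), missing == [app_v2])
```

Only the changed application layer is fetched. The base and Node.js layers are shared with the image that is already present.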

Union File System

Docker uses a union file system to combine these layers into a single, coherent filesystem for the container. This is similar to how transparent overlays work in image editing software: each layer is stacked on top of previous layers, with higher layers taking precedence when files exist in multiple layers.
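The precedence rule can be modelled in a few lines. This is a toy model in which each layer is a dictionary from path to content. A real union filesystem merges directories in the kernel at lookup time rather than copying anything, and the file names below are made up for illustration.

```python
def union_view(layers):
    """Merge layers (lowest first) into one view; higher layers win on conflicts."""
    merged = {}
    for layer in layers:
        merged.update(layer)
    return merged


base = {"/etc/os-release": "Ubuntu 20.04", "/app/config.json": "defaults"}
app = {"/app/server.js": "console.log('hi')", "/app/config.json": "production"}

view = union_view([base, app])
print(view["/app/config.json"])
print(sorted(view))
```

The container sees one filesystem containing all three paths, and `/app/config.json` comes from the higher layer.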

Read-Only Layers and Copy-on-Write

All image layers are read-only. When a container runs, Docker adds a writable layer on top of the image layers. Any changes made within the container are stored in this writable layer using a copy-on-write mechanism:

  1. If a container process needs to read a file, it reads from the existing file in the lower image layers.
  2. If a process needs to modify a file, Docker first copies the file from the image layer to the writable container layer, then makes the change.
  3. All future reads will see the modified version of the file from the container layer.
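The three steps above can be sketched as a toy model. The `Container` class and its dictionary-based layers are illustrative only. Real copy-on-write is performed by the storage driver on files and directories, not by application code.

```python
class Container:
    """Toy container filesystem: shared read-only image layers plus one writable layer."""

    def __init__(self, image_layers):
        self.image_layers = image_layers  # shared between containers, never modified
        self.writable = {}                # private to this container

    def read(self, path):
        # Prefer the container's own copy, then search image layers from the top down.
        if path in self.writable:
            return self.writable[path]
        for layer in reversed(self.image_layers):
            if path in layer:
                return layer[path]
        raise FileNotFoundError(path)

    def append(self, path, text):
        # Copy-up: the first modification copies the file into the writable layer.
        if path not in self.writable:
            self.writable[path] = self.read(path)
        self.writable[path] += text


image = [{"/etc/motd": "hello"}]
c1, c2 = Container(image), Container(image)
c1.append("/etc/motd", " world")

print(c1.read("/etc/motd"))
print(c2.read("/etc/motd"))
print(image[0]["/etc/motd"])
```

The first container sees its modified copy, while the second container and the image itself still hold the original content.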

This is like working with a photocopy of an important document instead of the original. You can make all the notes and edits you want on your copy, but the original remains unchanged for others to use.

Docker Storage

Docker provides several options for managing data in containers, each with different use cases and characteristics.

Storage Types

graph TD
    A[Docker Storage] --> B[Volumes]
    A --> C[Bind Mounts]
    A --> D[tmpfs Mounts]
    B --> B1[Created and managed by Docker]
    B --> B2[Stored in /var/lib/docker/volumes/]
    C --> C1[Any directory on the host]
    C --> C2[Host path mounted into container]
    D --> D1[Stored in host memory]
    D --> D2[Never written to host filesystem]

Volumes

Volumes are the preferred way to persist data in Docker:

# Create a volume
docker volume create my-data

# Run a container with a volume
docker run -v my-data:/app/data nginx

# List volumes
docker volume ls

# Inspect a volume
docker volume inspect my-data

Bind Mounts

Bind mounts directly map a host path into a container:

# Run a container with a bind mount
docker run -v /host/path:/container/path nginx

tmpfs Mounts

tmpfs mounts store data in the host's memory only:

# Run a container with a tmpfs mount
docker run --tmpfs /app/temp nginx

Choosing the right storage option is like choosing the right type of notebook: volumes are like a dedicated journal that stays on your bookshelf, bind mounts are like sticky notes you place on various surfaces around your house, and tmpfs mounts are like an erasable whiteboard that clears when powered off.
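Since volumes and bind mounts share the same `-v` flag, it may help to see how the two are told apart. The sketch below assumes the simplified rule that a source starting with `/` is a host path and anything else is a volume name. `classify_mount` is a hypothetical helper that handles only Linux-style `SOURCE:TARGET[:OPTIONS]` values.

```python
def classify_mount(spec):
    """Classify a `-v SOURCE:TARGET[:OPTIONS]` value as a bind mount or a named volume."""
    source, target = spec.split(":")[:2]
    # An absolute host path means a bind mount; any other source is a volume name.
    kind = "bind" if source.startswith("/") else "volume"
    return {"type": kind, "source": source, "target": target}


print(classify_mount("my-data:/app/data"))
print(classify_mount("/host/path:/container/path"))
```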

Docker Networking

Docker provides a networking system that allows containers to communicate with each other and with the outside world. It offers several built-in network drivers to accommodate different scenarios.

graph TD
    subgraph "Host System"
        A[Docker Engine] --- B[Network Drivers]
        B --- C[bridge]
        B --- D[host]
        B --- E[none]
        B --- F[overlay]
        B --- G[macvlan]
    end
    C --- H[Container 1]
    C --- I[Container 2]
    D --- J[Container 3]
    E --- K[Container 4]
    F --- L[Container 5]
    F --- M[Container 6 on Different Host]
    G --- N[Container 7]

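Network Drivers

The built-in drivers are bridge (the default: a private virtual network for containers on the same host), host (the container shares the host's network stack), none (no networking beyond loopback), overlay (a network spanning multiple Docker hosts), and macvlan (the container gets its own MAC address and appears as a device on the physical network).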

Network Commands

# List networks
docker network ls

# Create a new network
docker network create my-network

# Run a container on a specific network
docker run --network=my-network nginx

# Connect a running container to a network
docker network connect my-network container-name

# Inspect a network
docker network inspect my-network

Container Communication

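Containers on the same user-defined network can communicate with each other using container names as hostnames, which Docker resolves via an embedded DNS server. The default bridge network does not provide this name resolution, which is why the example below creates its own network: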

# Run a web server container
docker run -d --name web --network my-network nginx

# Run another container and access the web server
docker run --network my-network alpine wget -O- http://web

Docker networking is like a sophisticated telephone exchange. Different types of connections (network drivers) serve different purposes, but they all enable communication between callers (containers) based on specific rules and directories (DNS).
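The name-resolution behaviour can be modelled as one lookup table per network. This is a toy model only: `EmbeddedDNS` is a hypothetical class, and the IP addresses and the `other-network` and `db` names are made up for the example.

```python
class EmbeddedDNS:
    """Toy model: one name table per network, mapping container names to IPs."""

    def __init__(self):
        self.networks = {}

    def attach(self, network, name, ip):
        """Register a container name on a network."""
        self.networks.setdefault(network, {})[name] = ip

    def resolve(self, network, name):
        """Look up a name; only containers attached to the same network are visible."""
        return self.networks.get(network, {}).get(name)


dns = EmbeddedDNS()
dns.attach("my-network", "web", "172.18.0.2")
dns.attach("other-network", "db", "172.19.0.2")

print(dns.resolve("my-network", "web"))
print(dns.resolve("my-network", "db"))
```

A container on `my-network` can find `web` by name, but a lookup for `db` returns nothing because that container is attached to a different network.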

Docker Registries

Docker registries are services that store and distribute Docker images. They're a crucial part of the Docker ecosystem, enabling collaboration and deployment across different environments.

Types of Registries

Working with Registries

# Pull an image from Docker Hub
docker pull nginx:latest

# Tag an image for a registry
docker tag my-app:1.0 username/my-app:1.0

# Push an image to Docker Hub
docker push username/my-app:1.0

# Pull from a private registry
docker pull registry.example.com/my-app:1.0

Registry Authentication

# Log in to Docker Hub
docker login

# Log in to a private registry
docker login registry.example.com

Docker registries function like package distribution centers. Developers deliver their packaged applications (images) to the center, which then stores them in organized shelves (repositories) and delivers them to customers (users) when requested.
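The image names used in the commands above follow a pattern that decides which registry is contacted. The sketch below is a simplified parser under these assumptions: Docker Hub is addressed as `docker.io`, official images sit under an implicit `library/` namespace, and a missing tag means `latest`. `parse_image_reference` is an illustrative helper and does not handle digests.

```python
def parse_image_reference(ref):
    """Split an image reference into (registry, repository, tag); digests not handled."""
    registry, remainder = "docker.io", ref
    first, _, rest = ref.partition("/")
    # The first path component names a registry only if it looks like a host.
    if rest and ("." in first or ":" in first or first == "localhost"):
        registry, remainder = first, rest
    repository, _, tag = remainder.partition(":")
    # Official images on Docker Hub live under the implicit "library/" namespace.
    if registry == "docker.io" and "/" not in repository:
        repository = "library/" + repository
    return registry, repository, tag or "latest"


for ref in ("nginx:latest", "username/my-app:1.0", "registry.example.com/my-app:1.0"):
    print(parse_image_reference(ref))
```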

Docker Compose

Docker Compose is a tool for defining and running multi-container Docker applications. It uses a YAML file to configure application services, networks, and volumes, allowing you to start all services with a single command.

Core Features

Sample Docker Compose File (docker-compose.yml)

version: '3'

services:
  # Web server service
  web:
    image: nginx:alpine
    ports:
      - "80:80"
    volumes:
      - ./website:/usr/share/nginx/html
    depends_on:
      - app

  # Application service
  app:
    build: ./app
    environment:
      - NODE_ENV=production
      - DB_HOST=db
    depends_on:
      - db

  # Database service
  db:
    image: postgres:13
    volumes:
      - postgres_data:/var/lib/postgresql/data
    environment:
      - POSTGRES_PASSWORD=mysecretpassword
      - POSTGRES_USER=myuser
      - POSTGRES_DB=myapp

volumes:
  postgres_data:

Common Commands

# Compose V2 is invoked as "docker compose"; older standalone installs use "docker-compose"

# Start services
docker compose up

# Start services in detached mode
docker compose up -d

# Stop and remove services
docker compose down

# View logs
docker compose logs

# Scale a service
docker compose up -d --scale app=3

Docker Compose is like a blueprint and construction manager for a complex building. The YAML file is the blueprint that specifies how everything should be arranged, and the compose command is the construction manager that ensures all components are built and connected according to the plan.
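The `depends_on` entries in the sample file imply a start order. As a sketch of that idea, the dependencies can be sorted topologically using Python's standard `graphlib` module. This illustrates the ordering only and is not how Compose is implemented. Note that `depends_on` in this form controls start order, not whether a dependency is ready to accept connections.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# depends_on relationships from the sample file: service -> services it depends on.
depends_on = {
    "web": ["app"],
    "app": ["db"],
    "db": [],
}

# Dependencies come first in the start order; shutdown runs in reverse.
start_order = list(TopologicalSorter(depends_on).static_order())
stop_order = list(reversed(start_order))

print(start_order)
print(stop_order)
```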

Docker in Modern Development Workflows

Docker has transformed development workflows by providing a standardized environment across different stages of development and deployment.

Local Development

Testing and CI/CD

Deployment

[Figure: Development (Docker Compose: App, DB) → CI/CD Pipeline (Build Image, Run Tests, Push to Registry) → Production (Kubernetes: three App replicas, DB)]

Practice Activities

Activity 1: Explore Docker Architecture

  1. Install Docker on your system if you haven't already
  2. Run docker info to view information about your Docker installation
  3. Identify the storage driver, logging driver, and network driver configurations
  4. Find the location of Docker's data directory on your system

Activity 2: Investigate Layer Caching

  1. Create a Dockerfile with multiple RUN instructions
  2. Build the image and observe the build time
  3. Make a small change to one of the middle layers and rebuild
  4. Observe which layers are rebuilt and which are reused from the build cache
  5. Optimize your Dockerfile to improve caching

Activity 3: Set Up a Multi-Container Application

  1. Create a docker-compose.yml file for a simple web application with a frontend and backend
  2. Configure appropriate networks for the containers to communicate
  3. Set up a volume for persistent data
  4. Use environment variables for configuration
  5. Start the application with Docker Compose and verify it works correctly

Summary

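In this lecture, we've explored the architecture and key components of Docker: the client and daemon, containerd and runc, images and layers, storage, networking, registries, and Docker Compose.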

Understanding Docker's architecture is essential for effectively containerizing applications and troubleshooting issues. In our next lecture, we'll dive into Docker installation and configuration.