Introduction to the Weekend Project
Welcome to our final project for Module 28: DevOps & Deployment! Throughout this module, we've explored the fundamentals of CI/CD pipelines, Docker containerization, cloud deployment, Kubernetes orchestration, and comprehensive monitoring with Prometheus and Grafana. Now it's time to bring all these components together in a cohesive, end-to-end deployment pipeline for a full-stack application.
To tackle this complex project effectively, we'll use George Polya's renowned four-step problem-solving method. Polya, a mathematician and educator, developed this approach to break down and solve challenging problems in a structured way. The four steps are:
- Understand the Problem - Clarify what we're trying to accomplish
- Devise a Plan - Develop a strategy for solving the problem
- Execute the Plan - Implement our solution step by step
- Review/Reflect - Examine our solution and learn from the process
This problem-solving framework is particularly valuable in DevOps, where we often face complex, multi-faceted challenges that require systematic thinking and clear planning.
Project Overview
For this weekend project, you'll establish a comprehensive DevOps pipeline that takes a full-stack application from code commit to production deployment with robust monitoring and alerting. Your pipeline will incorporate continuous integration, automated testing, containerization, orchestration, and observability.
You can choose to work with an existing application from previous modules or use the sample full-stack application we'll provide. The application includes:
- A React or Vue.js frontend
- A Node.js, Python (Flask/Django), or PHP (Laravel) backend
- A PostgreSQL, MongoDB, or MySQL database
- Basic authentication and API functionality
Step 1: Understand the Problem
The first step in Polya's method is to thoroughly understand the problem before attempting to solve it. Let's break down what we need to accomplish in this project.
Problem Statement
We need to create a complete deployment pipeline that automates the process of taking our code from development to production while ensuring quality, reliability, and observability.
Key Components and Requirements
Asking the Right Questions
Following Polya's approach, let's ask key questions to ensure we fully understand the problem:
- What are the inputs? Source code, configurations, environment variables, secrets
- What are the outputs? Deployed application, monitoring dashboards, alerts
- What constraints do we face? Time, resource limitations, security requirements
- What are the critical metrics for success? Deployment frequency, lead time for changes, time to restore service, change failure rate
- Who are the stakeholders? Developers, operations, users, business owners
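The success metrics listed above can be computed from a simple deployment log. The sketch below is illustrative only: the record shape (`committedAt`, `deployedAt`, `failed`, `restoredAt`) and the function name `summarizeDeployments` are assumptions made for this example, not part of the sample application.

```javascript
// Hypothetical deployment records; field names are assumptions for illustration.
// Timestamps are milliseconds since the epoch.
function summarizeDeployments(deployments, periodDays) {
  const HOUR_MS = 60 * 60 * 1000;
  const mean = (values) =>
    values.length === 0 ? 0 : values.reduce((a, b) => a + b, 0) / values.length;

  // Lead time: from commit to production deployment.
  const leadTimesHours = deployments.map(
    (d) => (d.deployedAt - d.committedAt) / HOUR_MS
  );

  // Only failed deployments contribute to failure rate and recovery time.
  const failures = deployments.filter((d) => d.failed);
  const recoveryHours = failures
    .filter((d) => d.restoredAt !== undefined)
    .map((d) => (d.restoredAt - d.deployedAt) / HOUR_MS);

  return {
    deploymentsPerDay: deployments.length / periodDays,
    meanLeadTimeHours: mean(leadTimesHours),
    changeFailureRate:
      deployments.length === 0 ? 0 : failures.length / deployments.length,
    meanTimeToRecoverHours: mean(recoveryHours),
  };
}

const HOUR = 60 * 60 * 1000;
const sample = [
  { committedAt: 0, deployedAt: 2 * HOUR, failed: false },
  { committedAt: 10 * HOUR, deployedAt: 14 * HOUR, failed: true, restoredAt: 15 * HOUR },
  { committedAt: 20 * HOUR, deployedAt: 23 * HOUR, failed: false },
  { committedAt: 30 * HOUR, deployedAt: 33 * HOUR, failed: false },
];
const summary = summarizeDeployments(sample, 2);
console.log(summary);
```

For the four sample records over two days, this yields two deployments per day, a mean lead time of three hours, a change failure rate of 0.25, and a mean recovery time of one hour.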
Understanding the Application
We need to understand the structure and requirements of our full-stack application:
- Architecture: Is it a monolith or microservices? How do components communicate?
- Dependencies: What external services and libraries does it rely on?
- State management: How is persistent data handled? What about user sessions?
- Resource requirements: CPU, memory, storage, network demands
- Scaling needs: Expected load, growth projections, scaling strategy
Step 2: Devise a Plan
With a clear understanding of the problem, we can now devise a plan for creating our deployment pipeline.
Pipeline Architecture
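The pipeline runs in one direction: a commit to GitHub triggers GitHub Actions, which lints, tests, and builds the code, pushes Docker images to the registry, scans them with Trivy, and deploys them to Kubernetes. Prometheus, Grafana, and Loki then observe the running workloads and raise alerts when thresholds are crossed.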
Strategy Breakdown
Let's divide our plan into manageable phases:
Phase 1: Application Preparation
- Review and optimize the application code and structure
- Set up unit and integration tests
- Create Docker configurations for all components
- Implement health checks and monitoring instrumentation
Phase 2: CI/CD Pipeline Setup
- Create GitHub Actions workflows for automated building and testing
- Configure multi-environment deployments (staging, production)
- Set up security scanning and quality checks
- Implement automated deployment to Kubernetes
Phase 3: Kubernetes Infrastructure
- Create a Kubernetes cluster (local with Minikube or cloud-based)
- Configure namespaces, resource limits, and network policies
- Set up persistent storage for databases
- Implement ingress, TLS, and load balancing
Phase 4: Monitoring and Observability
- Deploy Prometheus and Grafana for metrics
- Set up Loki or ELK Stack for logs
- Configure alerts for critical conditions
- Create comprehensive dashboards
Resources and Tools
We'll use the following tools and technologies:
| Category | Tools | Purpose |
|---|---|---|
| Version Control | Git, GitHub | Code management and collaboration |
| CI/CD | GitHub Actions | Automated workflows |
| Containerization | Docker, Docker Compose | Application packaging |
| Container Registry | Docker Hub, GitHub Container Registry | Store and distribute images |
| Orchestration | Kubernetes, Minikube | Container orchestration |
| Infrastructure | Terraform, Helm | Infrastructure as code and Kubernetes package management |
| Monitoring | Prometheus, Grafana, Loki | Metrics, visualization, logs |
| Security | Trivy, OWASP ZAP | Security scanning |
Alternative Approaches
In line with Polya's emphasis on considering multiple solutions, let's acknowledge some alternatives:
- Jenkins instead of GitHub Actions: More flexible but requires more setup and maintenance
- Docker Swarm instead of Kubernetes: Simpler but less feature-rich
- AWS/Azure/GCP managed services: Less setup but potentially more costly
- ELK Stack instead of Loki: More powerful but resource-intensive
Step 3: Execute the Plan
Now we'll implement our plan step by step. In Polya's method, execution involves working systematically and verifying each step as we proceed.
Phase 1: Application Preparation
Task 1.1: Dockerize the Application
Create Dockerfiles for each component of your application:
# Frontend Dockerfile (React example)
# Stage 1: build the static assets
FROM node:16-alpine AS build
WORKDIR /app
COPY package*.json ./
RUN npm ci
COPY . .
RUN npm run build

# Stage 2: serve the build output with nginx
FROM nginx:alpine
COPY --from=build /app/build /usr/share/nginx/html
COPY nginx.conf /etc/nginx/conf.d/default.conf
EXPOSE 80
CMD ["nginx", "-g", "daemon off;"]

# Backend Dockerfile (Node.js example)
# Stage 1: install dependencies and compile the source to dist/
FROM node:16-alpine AS build
WORKDIR /app
COPY package*.json ./
RUN npm ci
COPY . .
# Without this step dist/ does not exist and the COPY below fails
RUN npm run build

# Stage 2: runtime image
FROM node:16-alpine
WORKDIR /app
COPY --from=build /app/package*.json ./
COPY --from=build /app/node_modules ./node_modules
COPY --from=build /app/src ./src
COPY --from=build /app/dist ./dist
# prom-client must be listed under "dependencies" in package.json so that
# npm ci installs it; installing it here would bypass the lockfile
EXPOSE 3000
CMD ["node", "dist/server.js"]
Task 1.2: Create Docker Compose for Local Development
version: '3.8'
services:
  frontend:
    build:
      context: ./frontend
    ports:
      - "80:80"
    depends_on:
      - backend
  backend:
    build:
      context: ./backend
    ports:
      - "3000:3000"
    environment:
      - NODE_ENV=development
      - DB_HOST=database
      - DB_PORT=5432
      - DB_USER=postgres
      - DB_PASSWORD=postgres
      - DB_NAME=appdb
    depends_on:
      - database
  database:
    image: postgres:14-alpine
    environment:
      - POSTGRES_USER=postgres
      - POSTGRES_PASSWORD=postgres
      - POSTGRES_DB=appdb
    volumes:
      - db-data:/var/lib/postgresql/data
    ports:
      - "5432:5432"
volumes:
  db-data:
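Note that the short form of `depends_on` controls start order only; it does not wait for PostgreSQL to accept connections, so the backend may start before the database is ready. One common mitigation is to retry the initial connection with exponential backoff. The sketch below assumes a hypothetical `connect` function supplied by your database client; `backoffDelays` and `connectWithRetry` are names invented for this example.

```javascript
// Delay schedule: baseMs, 2*baseMs, 4*baseMs, ... capped at maxMs.
function backoffDelays(attempts, baseMs, maxMs) {
  return Array.from({ length: attempts }, (_, i) =>
    Math.min(baseMs * 2 ** i, maxMs)
  );
}

const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// `connect` is a placeholder for whatever your database client exposes;
// it must return a promise that rejects while the database is unavailable.
async function connectWithRetry(connect, { attempts = 5, baseMs = 500, maxMs = 8000 } = {}) {
  const delays = backoffDelays(attempts, baseMs, maxMs);
  let lastError;
  for (let i = 0; i < attempts; i += 1) {
    try {
      return await connect();
    } catch (err) {
      lastError = err;
      console.warn(`Database not ready (attempt ${i + 1}/${attempts}): ${err.message}`);
      // Wait before the next attempt, but not after the final one.
      if (i < attempts - 1) await sleep(delays[i]);
    }
  }
  throw lastError;
}

// Delays in milliseconds: 500, 1000, 2000, 4000, 8000, 8000
console.log(backoffDelays(6, 500, 8000));
```

Capping the delay keeps the worst-case wait bounded while still giving a slow database time to initialize.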
Task 1.3: Add Monitoring Instrumentation
Instrument your backend service to expose metrics for Prometheus:
// Node.js Backend Instrumentation Example
const express = require('express');
const { Counter, Gauge, Histogram, register } = require('prom-client');

const app = express();

// Create metrics
const httpRequestsTotal = new Counter({
  name: 'http_requests_total',
  help: 'Total number of HTTP requests',
  labelNames: ['method', 'route', 'status']
});
// Durations are recorded in seconds, matching the metric name and buckets
const httpRequestDurationSeconds = new Histogram({
  name: 'http_request_duration_seconds',
  help: 'Duration of HTTP requests in seconds',
  labelNames: ['method', 'route', 'status'],
  buckets: [0.01, 0.05, 0.1, 0.5, 1, 2, 5]
});
const activeConnections = new Gauge({
  name: 'http_active_connections',
  help: 'Number of active connections'
});

// Add middleware to collect metrics
app.use((req, res, next) => {
  activeConnections.inc();
  const end = httpRequestDurationSeconds.startTimer();
  res.on('finish', () => {
    const labels = {
      method: req.method,
      // Prefer the matched route pattern to keep label cardinality low
      route: req.route?.path || req.path,
      status: res.statusCode
    };
    httpRequestsTotal.inc(labels);
    end(labels);
    activeConnections.dec();
  });
  next();
});

// Expose metrics endpoint
app.get('/metrics', async (req, res) => {
  res.set('Content-Type', register.contentType);
  res.end(await register.metrics());
});

// Health endpoint used by the Kubernetes liveness and readiness probes
app.get('/health', (req, res) => {
  res.json({ status: 'ok' });
});

// Your regular routes
app.get('/api/users', (req, res) => {
  // ...
});

app.listen(3000, () => {
  console.log('Server listening on port 3000');
});
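The liveness and readiness probes in the Kubernetes manifests both call `/health` on the backend. A health handler is more useful when it also reports whether dependencies are reachable. The sketch below uses no external packages; `evaluateHealth`, `healthHandler`, and the `checks` object are hypothetical names for this example, and the checks are synchronous for brevity, whereas a real database check would usually be asynchronous.

```javascript
// Run each named check and summarize the outcome.
// `checks` maps a dependency name to a function that returns true when healthy.
function evaluateHealth(checks) {
  const details = {};
  let healthy = true;
  for (const [name, check] of Object.entries(checks)) {
    let ok;
    try {
      ok = Boolean(check());
    } catch (err) {
      ok = false; // a check that throws counts as unhealthy
    }
    details[name] = ok ? 'up' : 'down';
    healthy = healthy && ok;
  }
  return {
    statusCode: healthy ? 200 : 503,
    body: { status: healthy ? 'ok' : 'unavailable', details },
  };
}

// Request handler for Node's built-in http module.
// Usage: require('node:http').createServer(healthHandler(checks)).listen(3000)
function healthHandler(checks) {
  return (req, res) => {
    const { statusCode, body } = evaluateHealth(checks);
    res.writeHead(statusCode, { 'Content-Type': 'application/json' });
    res.end(JSON.stringify(body));
  };
}

// Placeholder check: a real service would test its database connection here.
console.log(evaluateHealth({ database: () => true }));
```

A dependency-aware response like this suits the readiness probe best. If the liveness probe also fails whenever the database is down, Kubernetes will restart backend pods that are themselves healthy.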
Phase 2: CI/CD Pipeline Setup
Task 2.1: Create GitHub Actions Workflow
# .github/workflows/ci-cd.yml
name: CI/CD Pipeline

on:
  push:
    branches: [ main ]
  pull_request:
    branches: [ main ]

jobs:
  build-and-test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3

      - name: Set up Node.js
        uses: actions/setup-node@v3
        with:
          node-version: '16'
          cache: 'npm'
          # The lockfiles live in the subprojects, not the repository root
          cache-dependency-path: |
            frontend/package-lock.json
            backend/package-lock.json

      - name: Install dependencies
        run: |
          cd frontend && npm ci
          cd ../backend && npm ci

      - name: Run linting
        run: |
          cd frontend && npm run lint
          cd ../backend && npm run lint

      - name: Run tests
        run: |
          cd frontend && npm test
          cd ../backend && npm test

      - name: Build frontend
        run: cd frontend && npm run build

      - name: Build backend
        run: cd backend && npm run build

      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v2

      # Secrets are not available to pull requests from forks, and images are
      # only pushed from main, so log in on push events only
      - name: Login to Docker Hub
        if: github.event_name == 'push'
        uses: docker/login-action@v2
        with:
          username: ${{ secrets.DOCKERHUB_USERNAME }}
          password: ${{ secrets.DOCKERHUB_TOKEN }}

      - name: Build and push frontend Docker image
        uses: docker/build-push-action@v4
        with:
          context: ./frontend
          push: ${{ github.ref == 'refs/heads/main' }}
          tags: yourusername/frontend:latest

      - name: Build and push backend Docker image
        uses: docker/build-push-action@v4
        with:
          context: ./backend
          push: ${{ github.ref == 'refs/heads/main' }}
          tags: yourusername/backend:latest

  security-scan:
    runs-on: ubuntu-latest
    needs: build-and-test
    # Scan only after a push, when the images above have actually been published
    if: github.event_name == 'push'
    steps:
      - uses: actions/checkout@v3

      # Consider pinning trivy-action to a release tag instead of master
      - name: Run Trivy vulnerability scanner on frontend
        uses: aquasecurity/trivy-action@master
        with:
          image-ref: 'yourusername/frontend:latest'
          format: 'sarif'
          output: 'trivy-frontend-results.sarif'
          severity: 'CRITICAL,HIGH'

      - name: Run Trivy vulnerability scanner on backend
        uses: aquasecurity/trivy-action@master
        with:
          image-ref: 'yourusername/backend:latest'
          format: 'sarif'
          output: 'trivy-backend-results.sarif'
          severity: 'CRITICAL,HIGH'

  deploy:
    runs-on: ubuntu-latest
    needs: [build-and-test, security-scan]
    if: github.ref == 'refs/heads/main'
    steps:
      - uses: actions/checkout@v3

      - name: Set up kubectl
        uses: azure/setup-kubectl@v3

      - name: Set Kubernetes context
        uses: azure/k8s-set-context@v3
        with:
          kubeconfig: ${{ secrets.KUBE_CONFIG }}

      - name: Deploy to Kubernetes
        run: |
          kubectl apply -f kubernetes/namespace.yaml
          kubectl apply -f kubernetes/configmap.yaml
          kubectl apply -f kubernetes/secret.yaml
          kubectl apply -f kubernetes/database.yaml
          kubectl apply -f kubernetes/backend.yaml
          kubectl apply -f kubernetes/frontend.yaml
          kubectl apply -f kubernetes/ingress.yaml
          # The manifests reference the mutable :latest tag, so applying them
          # unchanged does not start a rollout; restart to pull the new images
          kubectl rollout restart deployment/backend -n app
          kubectl rollout restart deployment/frontend -n app

      - name: Verify deployment
        run: |
          kubectl rollout status deployment/backend -n app
          kubectl rollout status deployment/frontend -n app
Phase 3: Kubernetes Infrastructure
Task 3.1: Create Kubernetes Manifests
# kubernetes/namespace.yaml
apiVersion: v1
kind: Namespace
metadata:
  name: app

# kubernetes/configmap.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: app-config
  namespace: app
data:
  NODE_ENV: "production"
  DB_HOST: "database"
  DB_PORT: "5432"
  DB_NAME: "appdb"

# kubernetes/secret.yaml
apiVersion: v1
kind: Secret
metadata:
  name: app-secrets
  namespace: app
type: Opaque
data:
  DB_USER: cG9zdGdyZXM= # base64 encoded "postgres"
  DB_PASSWORD: c2VjcmV0cGFzc3dvcmQ= # base64 encoded "secretpassword"
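The values under `data` in a Secret are base64-encoded, not encrypted, so anyone who can read the manifest can recover them. The short Node.js snippet below shows the round trip for the two example values; it uses only the built-in `Buffer` API, and the helper names `encode` and `decode` are just for this illustration.

```javascript
// Base64 is a reversible encoding, not encryption.
const encode = (plain) => Buffer.from(plain, 'utf8').toString('base64');
const decode = (encoded) => Buffer.from(encoded, 'base64').toString('utf8');

console.log(encode('postgres'));             // cG9zdGdyZXM=
console.log(encode('secretpassword'));       // c2VjcmV0cGFzc3dvcmQ=
console.log(decode('c2VjcmV0cGFzc3dvcmQ=')); // secretpassword
```

For that reason, avoid committing real credentials in `secret.yaml`. Create the Secret out of band (for example with `kubectl create secret generic`) or use a dedicated secret manager.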
# kubernetes/database.yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: database-pvc
  namespace: app
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: database
  namespace: app
spec:
  serviceName: database
  replicas: 1
  selector:
    matchLabels:
      app: database
  template:
    metadata:
      labels:
        app: database
    spec:
      containers:
        - name: postgres
          image: postgres:14-alpine
          ports:
            - containerPort: 5432
          env:
            - name: POSTGRES_USER
              valueFrom:
                secretKeyRef:
                  name: app-secrets
                  key: DB_USER
            - name: POSTGRES_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: app-secrets
                  key: DB_PASSWORD
            - name: POSTGRES_DB
              valueFrom:
                configMapKeyRef:
                  name: app-config
                  key: DB_NAME
          volumeMounts:
            - name: data
              mountPath: /var/lib/postgresql/data
      volumes:
        - name: data
          persistentVolumeClaim:
            claimName: database-pvc
---
apiVersion: v1
kind: Service
metadata:
  name: database
  namespace: app
spec:
  selector:
    app: database
  ports:
    - port: 5432
      targetPort: 5432
# kubernetes/backend.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: backend
  namespace: app
spec:
  replicas: 2
  selector:
    matchLabels:
      app: backend
  template:
    metadata:
      labels:
        app: backend
    spec:
      containers:
        - name: backend
          image: yourusername/backend:latest
          ports:
            - containerPort: 3000
          env:
            - name: NODE_ENV
              valueFrom:
                configMapKeyRef:
                  name: app-config
                  key: NODE_ENV
            - name: DB_HOST
              valueFrom:
                configMapKeyRef:
                  name: app-config
                  key: DB_HOST
            - name: DB_PORT
              valueFrom:
                configMapKeyRef:
                  name: app-config
                  key: DB_PORT
            - name: DB_NAME
              valueFrom:
                configMapKeyRef:
                  name: app-config
                  key: DB_NAME
            - name: DB_USER
              valueFrom:
                secretKeyRef:
                  name: app-secrets
                  key: DB_USER
            - name: DB_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: app-secrets
                  key: DB_PASSWORD
          livenessProbe:
            httpGet:
              path: /health
              port: 3000
            initialDelaySeconds: 30
            periodSeconds: 10
          readinessProbe:
            httpGet:
              path: /health
              port: 3000
            initialDelaySeconds: 5
            periodSeconds: 5
          resources:
            limits:
              cpu: "500m"
              memory: "512Mi"
            requests:
              cpu: "100m"
              memory: "256Mi"
---
apiVersion: v1
kind: Service
metadata:
  name: backend
  namespace: app
  labels:
    app: backend # matched by the ServiceMonitor's selector
spec:
  selector:
    app: backend
  ports:
    - name: http # referenced by name in the ServiceMonitor endpoint
      port: 80
      targetPort: 3000
# kubernetes/frontend.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: frontend
  namespace: app
spec:
  replicas: 2
  selector:
    matchLabels:
      app: frontend
  template:
    metadata:
      labels:
        app: frontend
    spec:
      containers:
        - name: frontend
          image: yourusername/frontend:latest
          ports:
            - containerPort: 80
          livenessProbe:
            httpGet:
              path: /
              port: 80
            initialDelaySeconds: 30
            periodSeconds: 10
          readinessProbe:
            httpGet:
              path: /
              port: 80
            initialDelaySeconds: 5
            periodSeconds: 5
          resources:
            limits:
              cpu: "200m"
              memory: "256Mi"
            requests:
              cpu: "100m"
              memory: "128Mi"
---
apiVersion: v1
kind: Service
metadata:
  name: frontend
  namespace: app
spec:
  selector:
    app: frontend
  ports:
    - port: 80
      targetPort: 80
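# kubernetes/ingress.yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: app-ingress
  namespace: app
  # No rewrite-target annotation: rewriting every path to "/" would break both
  # static assets and the backend's /api routes, so paths are forwarded unchanged
spec:
  # Assumes the NGINX ingress controller is installed with the class name "nginx"
  ingressClassName: nginx
  rules:
    - host: app.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: frontend
                port:
                  number: 80
          - path: /api
            pathType: Prefix
            backend:
              service:
                name: backend
                port:
                  number: 80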
Phase 4: Monitoring and Observability
Task 4.1: Install Prometheus and Grafana with Helm
# Add Helm repositories
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo add grafana https://grafana.github.io/helm-charts
helm repo update

# Create monitoring namespace
kubectl create namespace monitoring

# Install Prometheus (the kube-prometheus-stack chart also bundles Grafana).
# The two *SelectorNilUsesHelmValues flags let Prometheus pick up ServiceMonitors
# and PrometheusRules that do not carry the chart's release label.
helm install prometheus prometheus-community/kube-prometheus-stack \
  --namespace monitoring \
  --set grafana.enabled=true \
  --set prometheus.prometheusSpec.serviceMonitorSelectorNilUsesHelmValues=false \
  --set prometheus.prometheusSpec.ruleSelectorNilUsesHelmValues=false

# Install Loki for logs
helm install loki grafana/loki-stack \
  --namespace monitoring \
  --set grafana.enabled=false \
  --set prometheus.enabled=false
Task 4.2: Configure Service Monitors for Application
# kubernetes/service-monitor.yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: backend-monitor
  namespace: monitoring
spec:
  # Selects Services (not pods) carrying this label
  selector:
    matchLabels:
      app: backend
  namespaceSelector:
    matchNames:
      - app
  endpoints:
    # Must match the name of a port on the selected Service
    - port: http
      interval: 15s
      path: /metrics
Task 4.3: Create Grafana Dashboards
Access Grafana and create the following dashboards:
- Infrastructure Dashboard: Node CPU, memory, disk, and network usage
- Kubernetes Dashboard: Pod status, resource usage, and deployment metrics
- Application Dashboard: Request rate, error rate, response time, and active connections
- Logs Dashboard: Application logs with filtering and alerting
Example PromQL queries for application dashboard:
# Request Rate
sum(rate(http_requests_total{namespace="app"}[5m])) by (service, route)
# Error Rate
sum(rate(http_requests_total{namespace="app", status=~"5.."}[5m])) by (service, route) /
sum(rate(http_requests_total{namespace="app"}[5m])) by (service, route) * 100
# 95th Percentile Response Time
histogram_quantile(0.95, sum(rate(http_request_duration_seconds_bucket{namespace="app"}[5m])) by (le, service, route))
# Active Connections
sum(http_active_connections{namespace="app"}) by (service)
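`histogram_quantile` does not see individual request durations; it estimates the quantile from cumulative bucket counts by linear interpolation within the bucket that contains the target rank. The sketch below re-implements that idea in plain JavaScript for the bucket boundaries configured in the instrumentation example. It is a simplified illustration, not Prometheus's actual code, and the sample counts are made up.

```javascript
// buckets: cumulative counts sorted by upper bound `le`, ending with +Infinity.
function histogramQuantile(q, buckets) {
  const total = buckets[buckets.length - 1].count;
  if (total === 0) return NaN;
  const rank = q * total;

  // First bucket whose cumulative count reaches the rank.
  const i = buckets.findIndex((b) => b.count >= rank);

  // Rank falls in the +Inf bucket: report the highest finite upper bound.
  if (buckets[i].le === Infinity) return buckets[buckets.length - 2].le;

  const lowerBound = i === 0 ? 0 : buckets[i - 1].le;
  const countBelow = i === 0 ? 0 : buckets[i - 1].count;
  const countInBucket = buckets[i].count - countBelow;

  // Assume observations are spread evenly inside the bucket.
  return lowerBound + (buckets[i].le - lowerBound) * ((rank - countBelow) / countInBucket);
}

// Made-up cumulative counts for 100 requests.
const buckets = [
  { le: 0.01, count: 10 },
  { le: 0.05, count: 40 },
  { le: 0.1, count: 70 },
  { le: 0.5, count: 90 },
  { le: 1, count: 98 },
  { le: 2, count: 100 },
  { le: 5, count: 100 },
  { le: Infinity, count: 100 },
];

const p95 = histogramQuantile(0.95, buckets);
console.log(p95.toFixed(4)); // 0.8125
```

Here the 95th request falls in the 0.5 to 1 second bucket, so the estimate is interpolated between those bounds. The result is only as precise as the bucket layout, which is worth keeping in mind when choosing alert thresholds.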
Task 4.4: Configure Alerting
# kubernetes/prometheus-rules.yaml
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: app-alerts
  namespace: monitoring
spec:
  groups:
    - name: app
      rules:
        - alert: HighErrorRate
          expr: sum(rate(http_requests_total{namespace="app", status=~"5.."}[5m])) by (service) / sum(rate(http_requests_total{namespace="app"}[5m])) by (service) > 0.05
          for: 5m
          labels:
            severity: critical
          annotations:
            summary: "High error rate on {{ $labels.service }}"
            description: "{{ $labels.service }} is returning errors for more than 5% of requests"
        - alert: SlowResponseTime
          expr: histogram_quantile(0.95, sum(rate(http_request_duration_seconds_bucket{namespace="app"}[5m])) by (le, service)) > 1
          for: 5m
          labels:
            severity: warning
          annotations:
            summary: "Slow response time on {{ $labels.service }}"
            description: "{{ $labels.service }} has a 95th percentile response time above 1 second"
        - alert: HighCPUUsage
          # kube-state-metrics v2 exposes CPU limits as kube_pod_container_resource_limits{resource="cpu"}
          expr: sum(rate(container_cpu_usage_seconds_total{namespace="app"}[5m])) by (pod) / sum(kube_pod_container_resource_limits{namespace="app", resource="cpu"}) by (pod) > 0.8
          for: 10m
          labels:
            severity: warning
          annotations:
            summary: "High CPU usage on {{ $labels.pod }}"
            description: "{{ $labels.pod }} is using more than 80% of its CPU limit for more than 10 minutes"
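The `for` field means an alert does not fire the moment its expression becomes true: it is first pending, and only fires once the expression has stayed true for the whole duration. A minimal sketch of that behavior follows, assuming evaluations arrive as `{ time, active }` samples with times in seconds. The function name `alertStates` is invented for this example, and this is a simplification rather than Prometheus's implementation.

```javascript
// samples: rule evaluations in time order; `active` is the result of the alert expression.
// forSeconds: the rule's `for` duration in seconds.
function alertStates(samples, forSeconds) {
  let activeSince = null; // when the expression first became true in the current streak
  return samples.map(({ time, active }) => {
    if (!active) {
      activeSince = null; // any false evaluation resets the streak
      return 'inactive';
    }
    if (activeSince === null) activeSince = time;
    return time - activeSince >= forSeconds ? 'firing' : 'pending';
  });
}

// Evaluations every 60 s against `for: 5m` (300 s).
const samples = [
  { time: 0, active: true },
  { time: 60, active: true },
  { time: 120, active: false }, // a brief recovery resets the timer
  { time: 180, active: true },
  { time: 240, active: true },
  { time: 300, active: true },
  { time: 360, active: true },
  { time: 420, active: true },
  { time: 480, active: true },
];
const states = alertStates(samples, 300);
console.log(states.join(' '));
```

In this run the alert only reaches `firing` at the last evaluation, 300 seconds after the streak that began at 180 seconds. This is why short spikes in error rate or latency do not page anyone.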
Step 4: Review and Reflect
The final step in Polya's method is to review our work and reflect on what we've learned.
Verify the Solution
Let's check if our deployment pipeline meets all requirements:
Testing the Pipeline
To verify our pipeline works correctly, perform these tests:
- Push a code change: Verify that it triggers the CI/CD pipeline
- Introduce a failing test: Confirm that the pipeline stops at the testing stage
- Deploy a new version: Check that the deployment completes successfully
- Generate load: Ensure metrics are collected and displayed in Grafana
- Trigger alerts: Test that alerts fire when thresholds are exceeded
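For the "generate load" test, a dedicated tool is not strictly required. The sketch below sends concurrent requests with the `fetch` API built into Node.js 18 and later and tallies the status codes. The target URL, the request counts, and the function names `tally` and `generateLoad` are placeholders for this example.

```javascript
// Tally status codes, e.g. [200, 200, 500] -> { '200': 2, '500': 1 }.
function tally(statuses) {
  const counts = {};
  for (const s of statuses) counts[s] = (counts[s] || 0) + 1;
  return counts;
}

// Send `total` GET requests with at most `concurrency` in flight at once.
async function generateLoad(url, total, concurrency) {
  const statuses = [];
  let started = 0;
  async function worker() {
    while (started < total) {
      started += 1;
      try {
        const res = await fetch(url);
        await res.arrayBuffer(); // drain the body so the connection can be reused
        statuses.push(res.status);
      } catch (err) {
        statuses.push('network_error');
      }
    }
  }
  await Promise.all(Array.from({ length: concurrency }, () => worker()));
  return tally(statuses);
}

// Example with a placeholder URL:
// generateLoad('http://localhost:3000/api/users', 500, 20).then(console.log);
console.log(tally([200, 200, 500]));
```

While the load runs, the request rate and latency panels in Grafana should move. A sustained share of 5xx responses above the alert threshold should eventually move `HighErrorRate` from pending to firing.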
Lessons Learned
Through this project, we've gained valuable insights:
- Automation saves time: Manual deployments are error-prone and time-consuming
- Infrastructure as code is essential: It ensures consistency and repeatability
- Monitoring is not an afterthought: It should be built into the application and infrastructure
- Security requires vigilance: Automated scanning is just one part of a comprehensive security strategy
- Problem-solving approach matters: Polya's method helped us tackle a complex challenge systematically
Further Improvements
Based on our reflection, here are potential enhancements for the future:
- Implement blue-green deployments: For zero-downtime updates
- Add canary releases: To test new versions with a subset of users
- Enhance security measures: Extend scanning beyond container images (dependencies, infrastructure code), and add secret management and policy enforcement
- Implement distributed tracing: For better visibility into request flow
- Create custom metric exporters: For business-specific metrics
- Add cost monitoring: To track and optimize cloud resource usage
Submission Requirements
For this weekend project, prepare the following deliverables:
- GitHub Repository: Containing the application code, Docker configurations, Kubernetes manifests, and GitHub Actions workflows
- Documentation: README file explaining the project structure, setup instructions, and the deployment pipeline
- Diagram: Visual representation of your deployment pipeline and monitoring stack
- Reflection Report: 1-2 page document discussing your approach, challenges faced, solutions implemented, and lessons learned
- Screenshots: GitHub Actions workflow runs, Kubernetes dashboard, and Grafana dashboards
Assessment Criteria
Your project will be evaluated based on:
- Completeness: Implementation of all required components
- Automation: Level of automation in the pipeline
- Reliability: Robustness of the deployment process
- Observability: Quality and comprehensiveness of monitoring
- Documentation: Clarity and completeness of documentation
- Problem-solving approach: Evidence of systematic thinking and application of Polya's method
Conclusion
This weekend project brings together everything we've learned in Module 28: DevOps & Deployment. By building a complete deployment pipeline with monitoring for a full-stack application, you'll gain hands-on experience with the tools and practices used in modern DevOps environments.
More importantly, by applying George Polya's four-step problem-solving method, you'll develop a structured approach to tackling complex technical challenges. This skill will serve you well throughout your career as a full-stack developer or DevOps engineer.
Remember that DevOps is not just about tools but also about culture and practices. The pipeline you're building is designed to support continuous improvement, rapid feedback, and reliable delivery, all of which are core principles of DevOps.
Good luck with your project, and don't hesitate to ask questions or seek help if needed!