Introduction to API Response Caching
API response caching is the practice of storing API responses temporarily so they can be reused for subsequent identical requests, reducing the need to regenerate the same data repeatedly.
Think of caching like a coffee shop's batch of brewed coffee. Rather than brewing a fresh pot for each customer, which takes time and resources, the shop brews one batch and serves every following order from it immediately. Only when the batch runs out or goes stale does it brew a new one. Similarly, an API cache stores a response the first time it is generated and serves repeated requests from that stored copy until it expires or is invalidated.
Why API Caching Matters
Implementing effective API caching provides numerous benefits:
- Improved Performance: Reduce response times by serving pre-generated content
- Reduced Server Load: Decrease CPU, memory, and database usage
- Higher Throughput: Handle more requests with the same infrastructure
- Better Scalability: Accommodate traffic spikes more easily
- Lower Costs: Decrease infrastructure requirements and operational expenses
- Enhanced Reliability: Continue serving cached data even when backend services are experiencing issues
- Consistent Experience: Provide more predictable response times
The impact of caching on API performance can be dramatic. For example, a properly cached API endpoint might serve responses in milliseconds compared to hundreds of milliseconds or even seconds for uncached responses. This can significantly improve user experience, especially in mobile applications or web applications where API responsiveness directly affects user engagement.
Caching Fundamentals
Key Caching Concepts
Before diving into specific strategies, let's understand some fundamental caching concepts:
- Cache Hit: When a requested item is found in the cache
- Cache Miss: When a requested item is not found in the cache and must be retrieved from the original source
- Cache Key: The unique identifier used to store and retrieve cached items
- Time-to-Live (TTL): How long an item remains valid in the cache before expiring
- Cache Invalidation: The process of removing or replacing cached items when they're no longer valid
- Cache Consistency: Ensuring cached data accurately reflects the current state of the original data
- Cache Efficiency: The ratio of cache hits to total requests
The Cache Hit Ratio
A key metric for measuring cache effectiveness is the cache hit ratio:
Cache Hit Ratio = (Number of Cache Hits) / (Total Number of Requests)
A high cache hit ratio (e.g., 0.95 or 95%) indicates that most requests are being served from the cache, which is ideal for performance. A low hit ratio suggests that the caching strategy might need optimization.
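To make the formula concrete, here is a small sketch that computes the hit ratio and the average latency it implies. The counters and the latency figures (5 ms for a hit, 200 ms for a miss) are illustrative assumptions, not measurements.

```javascript
// Hit ratio = hits / (hits + misses); returns 0 when there is no traffic yet.
function cacheHitRatio(hits, misses) {
  const total = hits + misses;
  return total === 0 ? 0 : hits / total;
}

// Expected average latency, weighting each path by how often it is taken.
function averageLatencyMs(hitRatio, hitLatencyMs, missLatencyMs) {
  return hitRatio * hitLatencyMs + (1 - hitRatio) * missLatencyMs;
}

const ratio = cacheHitRatio(950, 50); // 950 hits out of 1000 requests
const average = averageLatencyMs(ratio, 5, 200);
console.log(`hit ratio: ${ratio}, average latency: ${average.toFixed(2)} ms`);
```

With these assumed numbers the average works out to roughly 14.75 ms, which shows why even a few percentage points of hit ratio matter when misses are expensive.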
Cache Control Headers
HTTP provides several headers that control caching behavior:
- Cache-Control: Primary mechanism for defining caching policies
- ETag: Entity tag for conditional requests
- Last-Modified: Timestamp for when the resource was last changed
- Expires: Absolute date and time after which the response is considered stale (ignored when Cache-Control: max-age is present)
- Vary: Lists the request headers a cache must also match before it can reuse a stored response
The Cache-Control header supports several directives:
| Directive | Description | Example |
|---|---|---|
| max-age | Maximum time in seconds the resource can be cached | Cache-Control: max-age=3600 |
| no-cache | Must revalidate with the server before using the cached version | Cache-Control: no-cache |
| no-store | Don't store the response in any cache | Cache-Control: no-store |
| private | Response is intended for a single user and must not be stored by shared caches | Cache-Control: private |
| public | Response may be cached by any cache | Cache-Control: public |
| s-maxage | Like max-age but for shared caches only | Cache-Control: s-maxage=7200 |
| must-revalidate | Must verify the status of stale resources before using them | Cache-Control: must-revalidate |
These directives can be combined, for example:
Cache-Control: public, max-age=86400, must-revalidate
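The way a cache interprets a combined header can be sketched with a small parser. This is a simplified illustration, not a complete implementation of the HTTP caching rules: the function names are hypothetical, quoted values containing commas are not handled, and a missing lifetime is treated as 0 even though real caches may apply heuristics.

```javascript
// Parse a Cache-Control header value into a { directive: value } map.
function parseCacheControl(headerValue) {
  const directives = {};
  for (const part of headerValue.split(',')) {
    const [rawName, rawValue] = part.trim().split('=');
    if (!rawName) continue;
    directives[rawName.toLowerCase()] = rawValue === undefined ? true : rawValue;
  }
  return directives;
}

// Seconds a response may be reused without revalidation,
// as seen by a private (browser) cache or a shared (CDN/proxy) cache.
function freshnessLifetime(directives, { shared = false } = {}) {
  if (directives['no-store'] || directives['no-cache']) return 0;
  if (shared && directives.private) return 0;
  if (shared && directives['s-maxage'] !== undefined) return Number(directives['s-maxage']);
  if (directives['max-age'] !== undefined) return Number(directives['max-age']);
  return 0;
}

const parsed = parseCacheControl('public, max-age=3600, s-maxage=7200');
console.log(freshnessLifetime(parsed));                   // 3600 (browser cache)
console.log(freshnessLifetime(parsed, { shared: true })); // 7200 (shared cache)
```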
ETags and Conditional Requests
ETags provide a mechanism for conditional requests, allowing clients to check if their cached version is still valid:
// Server generating an ETag (Node.js/Express example)
const crypto = require('crypto');
app.get('/api/products/:id', (req, res) => {
// Get product data
const product = getProductById(req.params.id);
// Generate ETag based on product data (ETag values must be quoted strings)
const productJson = JSON.stringify(product);
const etag = `"${crypto.createHash('md5').update(productJson).digest('hex')}"`;
// Send the validator and caching policy on every response, including 304
res.setHeader('ETag', etag);
res.setHeader('Cache-Control', 'public, max-age=3600');
// Check If-None-Match header for conditional request
const ifNoneMatch = req.headers['if-none-match'];
if (ifNoneMatch === etag) {
// Client has the current version, send 304 Not Modified
return res.status(304).end();
}
// Otherwise send the full response
res.json(product);
});
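A client may send several ETags, a weak ETag (prefixed with W/), or * in If-None-Match, so a plain string comparison can miss valid matches. The following sketch shows one way to handle those cases; the helper names are hypothetical, and splitting on commas assumes the tags themselves contain no commas.

```javascript
const crypto = require('crypto');

// Build a strong ETag: a quoted hash of the serialized body.
function makeEtag(body) {
  const hash = crypto.createHash('sha256').update(body).digest('hex');
  return `"${hash}"`;
}

// True when If-None-Match matches the current ETag,
// meaning the server can answer 304 Not Modified.
function ifNoneMatchMatches(headerValue, currentEtag) {
  if (!headerValue) return false;
  if (headerValue.trim() === '*') return true;
  // If-None-Match uses weak comparison, so ignore any W/ prefix.
  const normalize = tag => tag.trim().replace(/^W\//, '');
  return headerValue.split(',').map(normalize).includes(normalize(currentEtag));
}

const currentEtag = makeEtag(JSON.stringify({ id: 1, name: 'Widget' }));
console.log(ifNoneMatchMatches(`"stale", W/${currentEtag}`, currentEtag)); // true
```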
Client vs. Server vs. Proxy Caching
Caching can happen at multiple levels in the request/response chain:
- Client-Side Caching: Browser or mobile app caches responses locally
- Proxy Caching: Intermediate servers (CDNs, reverse proxies) cache responses
- Server-Side Caching: API servers cache responses or intermediate data
Each level has different characteristics:
| Cache Level | Pros | Cons |
|---|---|---|
| Client Cache | Eliminates network requests; Fastest response times | No control after deployment; Different cache behaviors across clients |
| CDN/Proxy Cache | Reduced latency; Offloads traffic from origin server | Configuration complexity; Potential for stale data |
| API Gateway Cache | Centralized control; Can cache authenticated responses | Cache hit ratio depends on request patterns; Additional infrastructure |
| Application Cache | Fine-grained control; Can cache specific operations | Requires more application logic; Memory usage on app servers |
| Database Cache | Reduces database load; Works with any application | Limited to data retrieval optimization; Doesn't help with compute-heavy APIs |
API Caching Strategies
Different API caching strategies are suitable for different use cases. Let's explore the most common and effective strategies:
Time-Based Caching
The simplest caching strategy, where cached items expire after a fixed time period.
// Time-based caching with Redis (Node.js example)
const redis = require('redis');
const { promisify } = require('util');
const client = redis.createClient();
const getAsync = promisify(client.get).bind(client);
const setexAsync = promisify(client.setex).bind(client);
async function fetchProductWithCache(productId) {
const cacheKey = `product:${productId}`;
try {
// Try to get from cache
const cachedData = await getAsync(cacheKey);
if (cachedData) {
console.log('Cache hit!');
return JSON.parse(cachedData);
}
console.log('Cache miss!');
// Get from database
const product = await fetchProductFromDatabase(productId);
// Cache for 1 hour (3600 seconds)
await setexAsync(cacheKey, 3600, JSON.stringify(product));
return product;
} catch (error) {
console.error('Error:', error);
throw error;
}
}
Best For:
- Data that changes on a predictable schedule
- Resources that are expensive to generate but don't need to be perfectly up-to-date
- Public, non-personalized API responses
Considerations:
- Choosing appropriate TTL values (too short: ineffective caching; too long: stale data)
- Handling cache stampedes when popular items expire
- Determining if all resources should have the same TTL or varied based on content type
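One common mitigation for the stampede concern is to add random jitter to each TTL so that entries written at the same moment do not all expire at the same moment. The sketch below is a minimal illustration; the function name and the default 10% spread are assumptions.

```javascript
// Return a TTL randomly spread around baseSeconds by up to +/- spread (a fraction).
// The random source is injectable so the behaviour can be tested deterministically.
function ttlWithJitter(baseSeconds, spread = 0.1, random = Math.random) {
  const offset = (random() * 2 - 1) * spread * baseSeconds;
  return Math.max(1, Math.round(baseSeconds + offset));
}

// A one-hour TTL becomes a value between roughly 54 and 66 minutes.
console.log(ttlWithJitter(3600));
```

The result can be passed wherever a fixed TTL would otherwise be used, for example as the expiry argument when writing to the cache.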
Content-Based Caching (Validation)
Caching based on content changes, using ETags or Last-Modified dates for validation.
// Content-based caching with Last-Modified (Express example)
app.get('/api/articles/:id', async (req, res) => {
const articleId = req.params.id;
// Get article with its last modification date
const article = await getArticleById(articleId);
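// HTTP dates have one-second resolution, so drop milliseconds before comparing
const lastModified = new Date(Math.floor(new Date(article.updatedAt).getTime() / 1000) * 1000);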
// Format for HTTP header
const lastModifiedStr = lastModified.toUTCString();
// Check If-Modified-Since header
const ifModifiedSince = req.headers['if-modified-since'];
if (ifModifiedSince) {
const ifModifiedDate = new Date(ifModifiedSince);
// If article hasn't been modified since the client's version
if (lastModified <= ifModifiedDate) {
return res.status(304).end(); // Not Modified
}
}
// Set Last-Modified header and send full response
res.setHeader('Last-Modified', lastModifiedStr);
res.setHeader('Cache-Control', 'no-cache'); // Storable, but must be revalidated before each reuse
res.json(article);
});
Best For:
- Resources that change irregularly
- Scenarios where bandwidth savings are important
- Content that must be highly accurate but can leverage conditional requests
Considerations:
- Need reliable ways to detect content changes (hash calculations, timestamps)
- Still requires a server roundtrip for validation
- More complex to implement than time-based caching
Variation-Based Caching
Caching different versions of responses based on request parameters or headers.
// Variation-based caching with Redis (Express example)
app.get('/api/products', async (req, res) => {
// Build cache key based on query parameters
const page = req.query.page || 1;
const limit = req.query.limit || 10;
const sort = req.query.sort || 'createdAt';
const category = req.query.category || 'all';
// Create a unique cache key based on the request parameters
const cacheKey = `products:${category}:${page}:${limit}:${sort}`;
try {
// Try to get from cache
const cachedData = await getAsync(cacheKey);
if (cachedData) {
console.log('Cache hit!');
return res.json(JSON.parse(cachedData));
}
console.log('Cache miss!');
// Get from database with filters
const products = await getProducts({
page: parseInt(page),
limit: parseInt(limit),
sort,
category: category !== 'all' ? category : null
});
// Cache for 10 minutes (600 seconds)
await setexAsync(cacheKey, 600, JSON.stringify(products));
res.json(products);
} catch (error) {
console.error('Error:', error);
res.status(500).json({ error: 'Server error' });
}
});
Best For:
- APIs with many parameter combinations
- Endpoints that return different content based on request properties
- Multi-language or multi-region APIs
Considerations:
- Cache keys need to account for all relevant variations
- Cache can grow very large with many variations
- Need to use the Vary header correctly for HTTP caching
Example of using the Vary header:
// Set Vary header to inform caches about content variations
app.get('/api/content', (req, res) => {
const language = req.headers['accept-language'] || 'en';
const userAgent = req.headers['user-agent'];
// Indicate that response varies based on these headers
// (User-Agent has very high cardinality, so varying on it sharply lowers the hit ratio)
res.setHeader('Vary', 'Accept-Language, User-Agent');
// Get content for the specific language and platform
const content = getContentForLanguageAndPlatform(language, userAgent);
res.json(content);
});
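When cache keys are built from query parameters, it helps to normalize them so that equivalent requests map to the same key and irrelevant parameters (such as tracking tags) do not create extra variations. A minimal sketch, assuming an allow-list of parameter names chosen by the API author:

```javascript
// Build a canonical cache key from an allow-list of query parameters.
// Parameters are emitted in sorted order; unknown or empty ones are ignored.
function buildCacheKey(prefix, query, allowedParams) {
  const parts = allowedParams
    .slice()
    .sort()
    .filter(name => query[name] !== undefined && query[name] !== '')
    .map(name => `${name}=${encodeURIComponent(String(query[name]))}`);
  return `${prefix}?${parts.join('&')}`;
}

const allowed = ['page', 'sort', 'category'];
console.log(buildCacheKey('products', { sort: 'price', page: '2', utm_source: 'mail' }, allowed));
// products?page=2&sort=price
```

Because the output does not depend on the order in which the client sent the parameters, two URLs that differ only in parameter order share one cache entry.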
Query Result Caching
Caching the results of database queries or expensive computations.
// Query result caching with Redis (Node.js/Mongoose example)
const mongoose = require('mongoose');
const redis = require('redis');
const util = require('util');
const client = redis.createClient();
client.get = util.promisify(client.get);
client.set = util.promisify(client.set);
// Create a function to cache mongoose query results
function cacheQuery(query, hashKey, ttl = 3600) {
const Model = query.model;
const originalExec = query.exec;
// Override the exec function
query.exec = async function() {
// Generate a unique key from the model, the filter, and the options (sort, limit, skip, ...)
const key = `${Model.collection.name}:${hashKey}:${JSON.stringify({ filter: query.getQuery(), options: query.getOptions() })}`;
// Try to get from cache
const cachedResult = await client.get(key);
if (cachedResult) {
console.log('Query cache hit!');
const parsedResult = JSON.parse(cachedResult);
// Convert to Mongoose documents
return Array.isArray(parsedResult)
? parsedResult.map(doc => new Model(doc))
: new Model(parsedResult);
}
console.log('Query cache miss!');
// Execute the original query
const result = await originalExec.apply(this, arguments);
// Cache the result
await client.set(
key,
JSON.stringify(result),
'EX',
ttl
);
return result;
};
return query;
}
// Usage example
app.get('/api/users', async (req, res) => {
try {
// Create a query and apply caching
const query = User.find({ active: true }).sort('lastName');
const cachedQuery = cacheQuery(query, 'active-users');
// Execute the cached query
const users = await cachedQuery;
res.json(users);
} catch (error) {
console.error('Error:', error);
res.status(500).json({ error: 'Server error' });
}
});
Best For:
- Expensive database queries
- Computationally intensive operations
- Aggregation and reporting endpoints
Considerations:
- Need to invalidate cache when underlying data changes
- May need to cache query results at multiple levels of granularity
- Cache key generation must capture all query parameters
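Long serialized queries make unwieldy cache keys, and two filters that differ only in property order would otherwise produce different keys. The sketch below hashes a canonical form instead. It assumes that the order of top-level filter fields is not significant (as in a typical MongoDB-style filter object) and deliberately leaves nested values and options such as sort untouched, since their order can matter. The function names are hypothetical.

```javascript
const crypto = require('crypto');

// Sort only the top-level filter fields; nested values keep their original order.
function canonicalFilter(filter) {
  return Object.keys(filter).sort().map(field => [field, filter[field]]);
}

// Derive a fixed-length cache key from the collection, filter, and options.
function queryCacheKey(collection, filter, options = {}) {
  const payload = JSON.stringify({ filter: canonicalFilter(filter), options });
  const digest = crypto.createHash('sha256').update(payload).digest('hex');
  return `${collection}:${digest}`;
}

const keyA = queryCacheKey('users', { active: true, role: 'admin' }, { sort: { lastName: 1 } });
const keyB = queryCacheKey('users', { role: 'admin', active: true }, { sort: { lastName: 1 } });
console.log(keyA === keyB); // true
```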
Fragment Caching
Caching portions of API responses rather than entire responses.
// Fragment caching example (Node.js/Express)
app.get('/api/dashboard', async (req, res) => {
const userId = req.user.id;
try {
// Create an object to hold all fragments
const dashboard = {};
// Try to get user profile from cache (changes infrequently)
const profileCacheKey = `user:${userId}:profile`;
let userProfile = await cache.get(profileCacheKey);
if (!userProfile) {
console.log('Profile cache miss');
userProfile = await getUserProfile(userId);
await cache.set(profileCacheKey, userProfile, 3600); // Cache for 1 hour
}
dashboard.userProfile = userProfile;
// Try to get user statistics from cache (changes more frequently)
const statsCacheKey = `user:${userId}:stats`;
let userStats = await cache.get(statsCacheKey);
if (!userStats) {
console.log('Stats cache miss');
userStats = await getUserStats(userId);
await cache.set(statsCacheKey, userStats, 300); // Cache for 5 minutes
}
dashboard.userStats = userStats;
// Activity feed is never cached (real-time data)
dashboard.activityFeed = await getActivityFeed(userId);
// Recommended content (cached for all users, not per-user)
const recommendationsCacheKey = 'global:recommendations';
let recommendations = await cache.get(recommendationsCacheKey);
if (!recommendations) {
console.log('Recommendations cache miss');
recommendations = await getRecommendations();
await cache.set(recommendationsCacheKey, recommendations, 1800); // Cache for 30 minutes
}
dashboard.recommendations = recommendations;
res.json(dashboard);
} catch (error) {
console.error('Error:', error);
res.status(500).json({ error: 'Server error' });
}
});
Best For:
- Complex API responses with varying freshness requirements
- Responses that combine static and dynamic data
- Personalized content with common elements
Considerations:
- More complex implementation logic
- Requires careful tracking of dependencies between fragments
- May result in multiple cache operations for a single request
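The repeated "check cache, load on miss, store" logic can be factored into a single helper, and independent fragments can then be resolved in parallel to limit the cost of multiple cache operations. This is a minimal sketch using an in-memory Map as a stand-in for Redis or Memcached; getOrSet, the loaders, and the sample data are all hypothetical.

```javascript
// Minimal in-memory TTL cache standing in for a real cache server.
const store = new Map();

// Return the cached value for key, or call loader() and cache its result.
async function getOrSet(key, ttlSeconds, loader, now = Date.now) {
  const entry = store.get(key);
  if (entry && entry.expiresAt > now()) return entry.value;
  const value = await loader();
  store.set(key, { value, expiresAt: now() + ttlSeconds * 1000 });
  return value;
}

// Hypothetical loaders; real ones would query a database or another service.
const loadProfile = async userId => ({ userId, name: 'Ada' });
const loadStats = async userId => ({ userId, logins: 42 });

async function buildDashboard(userId) {
  // Each fragment keeps its own TTL; both lookups run concurrently.
  const [userProfile, userStats] = await Promise.all([
    getOrSet(`user:${userId}:profile`, 3600, () => loadProfile(userId)),
    getOrSet(`user:${userId}:stats`, 300, () => loadStats(userId)),
  ]);
  return { userProfile, userStats };
}

buildDashboard(7).then(dashboard => console.log(dashboard));
```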
Surrogate Key Caching
Using associated metadata keys to manage cache invalidation for related resources.
// Surrogate key caching with a CDN (like Fastly)
// Server side logic
app.get('/api/articles/:id', async (req, res) => {
const articleId = req.params.id;
// Get the article
const article = await getArticleById(articleId);
// Set surrogate keys for effective cache invalidation
// These keys represent relationships this article has
const surrogateKeys = [
`article-${articleId}`,
`author-${article.authorId}`,
`category-${article.categoryId}`
];
// Add related tags if present
if (article.tags && article.tags.length) {
article.tags.forEach(tag => {
surrogateKeys.push(`tag-${tag}`);
});
}
// Set Surrogate-Key header (used by CDNs like Fastly)
res.setHeader('Surrogate-Key', surrogateKeys.join(' '));
// Set cache control for the CDN
res.setHeader('Cache-Control', 'public, max-age=3600');
res.setHeader('Surrogate-Control', 'max-age=86400'); // CDN-specific directive
res.json(article);
});
// When updating an article
app.put('/api/articles/:id', async (req, res) => {
const articleId = req.params.id;
// Update the article
const article = await updateArticle(articleId, req.body);
// Purge the specific article from the cache
await purgeFromCDN(`article-${articleId}`);
res.json(article);
});
// When updating an author's information
app.put('/api/authors/:id', async (req, res) => {
const authorId = req.params.id;
// Update the author
const author = await updateAuthor(authorId, req.body);
// Purge all articles by this author from the cache
await purgeFromCDN(`author-${authorId}`);
res.json(author);
});
Best For:
- Content with complex relationships
- Systems with advanced CDNs that support surrogate keys
- Scenarios where precise cache invalidation is critical
Considerations:
- Requires CDN or cache server that supports surrogate keys
- Need to track all relationships between resources
- Can lead to over-invalidation if keys are too broad
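The bookkeeping a surrogate-key-aware cache performs can be illustrated with a small in-memory model: each entry is indexed under its surrogate keys, and purging a key removes every entry indexed under it. The TaggedCache class below is a hypothetical teaching sketch, not how any particular CDN implements the feature.

```javascript
class TaggedCache {
  constructor() {
    this.entries = new Map();   // cache key -> cached value
    this.keysByTag = new Map(); // surrogate key -> Set of cache keys
  }

  set(key, value, surrogateKeys = []) {
    this.entries.set(key, value);
    for (const tag of surrogateKeys) {
      if (!this.keysByTag.has(tag)) this.keysByTag.set(tag, new Set());
      this.keysByTag.get(tag).add(key);
    }
  }

  get(key) {
    return this.entries.get(key);
  }

  // Remove every entry associated with a surrogate key; returns the number removed.
  purge(tag) {
    const keys = this.keysByTag.get(tag) || new Set();
    let removed = 0;
    for (const key of keys) {
      if (this.entries.delete(key)) removed += 1;
    }
    this.keysByTag.delete(tag);
    return removed;
  }
}

const cache = new TaggedCache();
cache.set('/api/articles/1', { title: 'A' }, ['article-1', 'author-9']);
cache.set('/api/articles/2', { title: 'B' }, ['article-2', 'author-9']);
cache.set('/api/articles/3', { title: 'C' }, ['article-3', 'author-4']);
console.log(cache.purge('author-9')); // 2
```

Purging author-9 removes both of that author's articles and leaves the third untouched, which is also where over-invalidation comes from when a key is attached to too many entries.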
Cache Invalidation Techniques
Managing cache invalidation is one of the most challenging aspects of caching. Let's explore various techniques:
"There are only two hard things in Computer Science: cache invalidation and naming things."
— Phil Karlton
Time-Based Invalidation
The simplest approach - cache entries expire after a predetermined time.
// Setting TTL with Redis
redis.setex('product:1234', 3600, JSON.stringify(product)); // Expires after 1 hour
// Setting Cache-Control max-age in HTTP
res.setHeader('Cache-Control', 'public, max-age=3600'); // 1 hour client cache
Pros: Simple to implement, no additional logic needed
Cons: Data may be stale until expiration, or unnecessarily refreshed if unchanged
Explicit Invalidation
Actively removing or updating cache entries when the source data changes.
// Explicit cache invalidation on data change
app.put('/api/products/:id', async (req, res) => {
const productId = req.params.id;
try {
// Update product in database
const updatedProduct = await updateProduct(productId, req.body);
// Invalidate cache for this product
await redis.del(`product:${productId}`);
// Optionally, update the cache with new data
await redis.setex(
`product:${productId}`,
3600,
JSON.stringify(updatedProduct)
);
res.json(updatedProduct);
} catch (error) {
console.error('Error:', error);
res.status(500).json({ error: 'Server error' });
}
});
Pros: Ensures cache accuracy, efficient use of cache
Cons: Requires tracking all cache keys, complexity increases with distributed systems
Cache Stampede Prevention
Techniques to prevent multiple concurrent regenerations of the same cached item.
// Preventing cache stampede with a distributed lock
const redlock = require('redlock');
const lock = new redlock([redis], {
driftFactor: 0.01,
retryCount: 10,
retryDelay: 200
});
async function getProductWithStampedeProtection(productId) {
const cacheKey = `product:${productId}`;
// Try to get from cache
const cachedData = await redis.get(cacheKey);
if (cachedData) {
return JSON.parse(cachedData);
}
// Cache miss - use a lock to prevent stampede
const lockKey = `lock:${cacheKey}`;
let resource;
try {
// Try to acquire a lock (redlock v4 API; v5 renames these calls to acquire() and release())
resource = await lock.lock(lockKey, 5000); // 5s lock timeout
// Double-check cache after acquiring lock (another process might have updated it)
const cachedDataRetry = await redis.get(cacheKey);
if (cachedDataRetry) {
return JSON.parse(cachedDataRetry);
}
// Generate new data
const product = await fetchProductFromDatabase(productId);
// Cache with TTL
await redis.setex(cacheKey, 3600, JSON.stringify(product));
return product;
} finally {
// Release lock if acquired
if (resource) {
await resource.unlock();
}
}
}
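A distributed lock coordinates regeneration across servers. Within a single Node.js process, a lighter complementary technique is request coalescing (sometimes called single-flight): concurrent callers for the same key share one pending promise instead of each hitting the database. This sketch is a different technique from the lock above, and the function name is hypothetical.

```javascript
const inFlight = new Map(); // cache key -> pending Promise

// Run loader() at most once per key at a time; concurrent callers share the result.
function singleFlight(key, loader) {
  if (inFlight.has(key)) return inFlight.get(key);
  const promise = Promise.resolve()
    .then(loader)
    .finally(() => inFlight.delete(key)); // allow a fresh load next time
  inFlight.set(key, promise);
  return promise;
}

// Example: three concurrent requests trigger a single (simulated) database call.
let databaseCalls = 0;
const slowLoad = () => new Promise(resolve => {
  setTimeout(() => { databaseCalls += 1; resolve({ id: 1 }); }, 20);
});

Promise.all([
  singleFlight('product:1', slowLoad),
  singleFlight('product:1', slowLoad),
  singleFlight('product:1', slowLoad),
]).then(() => console.log(`database calls: ${databaseCalls}`));
```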
Stale-While-Revalidate Pattern
Serve stale content while asynchronously refreshing the cache.
// Stale-While-Revalidate implementation
app.get('/api/products/:id', async (req, res) => {
const productId = req.params.id;
const cacheKey = `product:${productId}`;
try {
// Get from cache with metadata
const cachedItem = await cache.getWithMetadata(cacheKey);
if (cachedItem) {
const { data, metadata } = cachedItem;
const { createdAt, ttl } = metadata;
const now = Date.now();
const age = (now - createdAt) / 1000; // age in seconds
// Check if data is stale (over TTL but within grace period)
if (age > ttl && age < ttl + 600) { // 10-minute grace period
// Return stale data but refresh in background
refreshCacheAsync(cacheKey, fetchProductFromDatabase, productId);
// Set stale response headers
res.setHeader('Cache-Control', 'max-age=0, must-revalidate');
res.setHeader('X-Cache-Status', 'stale');
return res.json(data);
} else if (age <= ttl) {
// Fresh data
res.setHeader('Cache-Control', `max-age=${Math.max(0, Math.floor(ttl - age))}`); // max-age must be a whole number of seconds
res.setHeader('X-Cache-Status', 'hit');
return res.json(data);
}
// If beyond grace period, fall through to refresh
}
// Cache miss or beyond grace period - get fresh data
const product = await fetchProductFromDatabase(productId);
// Cache with metadata
await cache.setWithMetadata(cacheKey, product, {
ttl: 3600, // 1 hour TTL
createdAt: Date.now()
});
res.setHeader('Cache-Control', 'max-age=3600');
res.setHeader('X-Cache-Status', 'miss');
res.json(product);
} catch (error) {
console.error('Error:', error);
res.status(500).json({ error: 'Server error' });
}
});
// Async cache refresh function
function refreshCacheAsync(cacheKey, fetchFn, ...args) {
// Fire and forget - don't await
(async () => {
try {
const freshData = await fetchFn(...args);
await cache.setWithMetadata(cacheKey, freshData, {
ttl: 3600,
createdAt: Date.now()
});
console.log(`Async refresh completed for ${cacheKey}`);
} catch (error) {
console.error(`Async refresh failed for ${cacheKey}:`, error);
}
})();
}
Pros: Improves user experience, reduces system load during traffic spikes
Cons: Increased complexity, may serve stale data for a short period
Cache Versioning
Instead of invalidating cache entries, use versioned keys to serve new versions.
// Cache versioning example
let CACHE_VERSION = 1;
// Function to get cache key with version
function getVersionedCacheKey(key) {
return `v${CACHE_VERSION}:${key}`;
}
// Get data with versioned cache key
async function getDataWithVersionedCache(dataId) {
const cacheKey = getVersionedCacheKey(`data:${dataId}`);
// Try to get from cache
const cachedData = await redis.get(cacheKey);
if (cachedData) {
return JSON.parse(cachedData);
}
// Cache miss - fetch and cache
const data = await fetchData(dataId);
await redis.setex(cacheKey, 3600, JSON.stringify(data));
return data;
}
// Global cache invalidation by incrementing version
function invalidateEntireCache() {
CACHE_VERSION += 1;
console.log(`Cache version bumped to v${CACHE_VERSION}`);
}
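// Call this when deploying new code or making schema changes
// Old cache entries will expire naturally via TTL
// Note: with several server processes, the version must live in shared storage
// (for example a Redis key) so that every instance agrees on the current value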
Pros: Simple global invalidation, avoids race conditions
Cons: Can't selectively invalidate items, temporary increased storage during transition
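One possible way to soften the selectivity limitation is to keep a separate version per namespace (for example per resource type) and store that version in the cache itself, so all instances see the same value. The sketch below uses a Map as a stand-in for a shared cache; the function names and key layout are assumptions.

```javascript
const sharedStore = new Map(); // stands in for a shared cache such as Redis

// Current version of a namespace; defaults to 1 when none has been stored.
function namespaceVersion(namespace) {
  return sharedStore.get(`version:${namespace}`) || 1;
}

// Build a key that embeds the namespace's current version.
function versionedKey(namespace, id) {
  return `${namespace}:v${namespaceVersion(namespace)}:${id}`;
}

// Bump the version: existing keys in this namespace become unreachable
// and are left to expire through their TTL.
function invalidateNamespace(namespace) {
  sharedStore.set(`version:${namespace}`, namespaceVersion(namespace) + 1);
}

sharedStore.set(versionedKey('product', 1), { name: 'Widget' });
sharedStore.set(versionedKey('user', 1), { name: 'Ada' });
invalidateNamespace('product');
console.log(sharedStore.get(versionedKey('product', 1))); // undefined
console.log(sharedStore.get(versionedKey('user', 1)));    // { name: 'Ada' }
```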
Event-Based Invalidation
Using events to coordinate cache invalidation across distributed systems.
// Event-based cache invalidation with Redis Pub/Sub
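const publisher = redis.createClient();
const subscriber = redis.createClient();
// A connection in subscriber mode cannot run other commands, so deletes use a separate client
const redisClient = redis.createClient();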
// Subscribe to cache invalidation events
subscriber.subscribe('cache:invalidate');
subscriber.on('message', (channel, message) => {
if (channel === 'cache:invalidate') {
const { key, pattern } = JSON.parse(message);
if (pattern) {
// Delete all keys matching pattern
// (KEYS walks the entire keyspace and blocks Redis while it runs; prefer SCAN in production)
redisClient.keys(pattern, (err, keys) => {
if (err) return console.error('Error finding keys to invalidate:', err);
if (keys.length > 0) {
redisClient.del(keys, (err, count) => {
console.log(`Invalidated ${count} cache entries matching ${pattern}`);
});
}
});
} else if (key) {
// Delete specific key
redisClient.del(key, (err, count) => {
console.log(`Invalidated cache key: ${key}`);
});
}
}
});
// Function to invalidate cache via events
function invalidateCache(key = null, pattern = null) {
publisher.publish('cache:invalidate', JSON.stringify({ key, pattern }));
}
// Example usage in API
app.put('/api/categories/:id', async (req, res) => {
const categoryId = req.params.id;
try {
// Update category
const category = await updateCategory(categoryId, req.body);
// Invalidate this specific category
invalidateCache(`category:${categoryId}`);
// Invalidate all products in this category
invalidateCache(null, `product:*:category:${categoryId}`);
res.json(category);
} catch (error) {
console.error('Error:', error);
res.status(500).json({ error: 'Server error' });
}
});
Pros: Works well in distributed systems, decoupled components
Cons: Additional infrastructure complexity, potential message delivery issues
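The same publish/subscribe flow is often used to keep per-instance in-memory caches in step. The sketch below models it with Node's built-in EventEmitter as a stand-in for the Redis channel, so it runs without any infrastructure; in a real deployment each instance would be a separate process subscribed to the shared channel. The function names and the prefix-based matching are assumptions.

```javascript
const { EventEmitter } = require('events');

const bus = new EventEmitter(); // stand-in for a Redis Pub/Sub channel

// Create a local cache that removes entries when an invalidation message arrives.
function createInstanceCache() {
  const local = new Map();
  bus.on('cache:invalidate', message => {
    const { key, prefix } = JSON.parse(message);
    if (key) local.delete(key);
    if (prefix) {
      for (const cachedKey of [...local.keys()]) {
        if (cachedKey.startsWith(prefix)) local.delete(cachedKey);
      }
    }
  });
  return local;
}

// Broadcast an invalidation for one key or for every key with a given prefix.
function broadcastInvalidation({ key = null, prefix = null }) {
  bus.emit('cache:invalidate', JSON.stringify({ key, prefix }));
}

const instanceA = createInstanceCache();
const instanceB = createInstanceCache();
instanceA.set('category:5', { name: 'Books' });
instanceB.set('category:5', { name: 'Books' });
broadcastInvalidation({ key: 'category:5' });
console.log(instanceA.has('category:5'), instanceB.has('category:5')); // false false
```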
Implementing Cache in Different Environments
Let's explore practical implementation approaches in different environments and frameworks:
Node.js/Express with Redis
Using Redis as a cache for Express applications:
// Express route caching middleware with Redis
const express = require('express');
const redis = require('redis');
const { promisify } = require('util');
const client = redis.createClient();
const getAsync = promisify(client.get).bind(client);
const setexAsync = promisify(client.setex).bind(client);
function cacheMiddleware(duration) {
return async (req, res, next) => {
// Skip caching for non-GET requests
if (req.method !== 'GET') {
return next();
}
// Create a cache key from the full URL
const cacheKey = `api:${req.originalUrl}`;
try {
// Try to get from cache
const cachedResponse = await getAsync(cacheKey);
if (cachedResponse) {
// Parse the cached JSON response
const data = JSON.parse(cachedResponse);
// Set X-Cache header to indicate a cache hit
res.setHeader('X-Cache', 'HIT');
return res.json(data);
}
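// Cache miss - capture the response
const originalJson = res.json;
res.json = function(data) {
// Only cache successful responses, and never delay or fail the response on a cache error
if (res.statusCode >= 200 && res.statusCode < 300) {
setexAsync(cacheKey, duration, JSON.stringify(data))
.catch(err => console.error('Cache write error:', err));
}
// Set cache header
res.setHeader('X-Cache', 'MISS');
// Call the original json method
return originalJson.call(this, data);
};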
next();
} catch (error) {
console.error('Cache error:', error);
next();
}
};
}
// Usage
const app = express();
// Apply cache to specific routes
app.get('/api/products', cacheMiddleware(300), (req, res) => {
// This endpoint is now cached for 300 seconds
// ...
});
// Apply cache to route groups
app.use('/api/public', cacheMiddleware(600));
PHP/Laravel Caching
Laravel provides a robust caching system:
// Laravel API response caching
use Illuminate\Support\Facades\Cache;
class ProductController extends Controller
{
public function index()
{
// Cache the API response for 60 minutes
// Cache tags require a store that supports them, such as redis or memcached
// Tagging the list lets the tag flush in destroy() invalidate it as well
return Cache::tags(['products'])->remember('products.all', 60 * 60, function () {
return Product::all();
});
}
public function show($id)
{
// Use cache tags for easier invalidation
return Cache::tags(['products', "product-{$id}"])->remember(
"products.{$id}",
30 * 60, // 30 minutes
function () use ($id) {
return Product::findOrFail($id);
}
);
}
public function update(Request $request, $id)
{
$product = Product::findOrFail($id);
$product->update($request->validated());
// Invalidate the cache for this product and the cached product list
Cache::tags(["product-{$id}"])->flush();
Cache::tags(['products'])->forget('products.all');
return $product;
}
public function destroy($id)
{
Product::findOrFail($id)->delete();
// Invalidate both the specific product and the all products list
Cache::tags(['products', "product-{$id}"])->flush();
return response()->json(['message' => 'Product deleted']);
}
}
Python/Django Caching
Django offers multiple caching backends:
# Django API view caching
from django.views.decorators.cache import cache_page
from django.utils.decorators import method_decorator
from django.core.cache import cache
from rest_framework.viewsets import ModelViewSet
from rest_framework.response import Response

# Assumed app-local modules; adjust the import paths to your project
from .models import Product
from .serializers import ProductSerializer


class ProductViewSet(ModelViewSet):
    queryset = Product.objects.all()
    serializer_class = ProductSerializer

    @method_decorator(cache_page(60 * 15))  # Cache for 15 minutes
    def list(self, request):
        # This response will be cached
        queryset = self.filter_queryset(self.get_queryset())
        serializer = self.get_serializer(queryset, many=True)
        return Response(serializer.data)

    @method_decorator(cache_page(60 * 5))  # Cache for 5 minutes
    def retrieve(self, request, pk=None):
        # Cache individual product lookups
        instance = self.get_object()
        serializer = self.get_serializer(instance)
        return Response(serializer.data)

    def update(self, request, *args, **kwargs):
        # Invalidate cache on update
        # Note: this clears the entry written by get_product_with_cache below.
        # Responses stored by cache_page are keyed by URL and expire only via their timeout.
        result = super().update(request, *args, **kwargs)
        product_id = kwargs.get('pk')
        cache_key = f'cached_product_{product_id}'
        cache.delete(cache_key)
        return result


# Custom caching for more granular control
def get_product_with_cache(product_id):
    cache_key = f'cached_product_{product_id}'
    # Try to get from cache
    cached_product = cache.get(cache_key)
    if cached_product is not None:
        return cached_product
    # Cache miss - fetch from database
    try:
        product = Product.objects.get(id=product_id)
        serialized_product = ProductSerializer(product).data
        # Cache for 10 minutes
        cache.set(cache_key, serialized_product, 10 * 60)
        return serialized_product
    except Product.DoesNotExist:
        return None
CDN-Based Caching
Utilizing a CDN for edge caching of API responses:
// Express with Fastly CDN headers
app.get('/api/public/content/:id', (req, res) => {
const contentId = req.params.id;
// Get content
getContent(contentId)
.then(content => {
// Set Surrogate-Control for Fastly
res.setHeader('Surrogate-Control', 'max-age=86400'); // 1 day CDN cache
// Set Cache-Control for browsers
res.setHeader('Cache-Control', 'public, max-age=60'); // 1 minute browser cache
// Set surrogate keys for precise invalidation
res.setHeader('Surrogate-Key', `content-${contentId} category-${content.categoryId}`);
// Set Vary if the response changes based on headers
res.setHeader('Vary', 'Accept-Language');
// Return content
res.json(content);
})
.catch(error => {
console.error('Error:', error);
res.status(500).json({ error: 'Server error' });
});
});
// Function to purge content from Fastly when updated
function purgeFromFastly(surrogateKey) {
return fetch('https://api.fastly.com/service/{service_id}/purge', {
method: 'POST',
headers: {
'Fastly-Key': process.env.FASTLY_API_KEY,
'Content-Type': 'application/json'
},
// The batch purge endpoint expects an array under "surrogate_keys"
body: JSON.stringify({
surrogate_keys: [surrogateKey]
})
});
}
Client-Side Caching
Implementing caching in the frontend:
// JavaScript client with caching (using browser's Cache API)
class ApiClient {
constructor(baseUrl) {
this.baseUrl = baseUrl;
this.cacheName = 'api-cache-v1';
}
async fetchWithCache(endpoint, options = {}) {
const url = `${this.baseUrl}${endpoint}`;
// Create request object
const request = new Request(url, options);
// Check if we're online
if (!navigator.onLine) {
console.log('Offline - trying to fetch from cache');
const cachedResponse = await this.getFromCache(request);
if (cachedResponse) {
return cachedResponse.json();
}
throw new Error('You are offline and the requested data is not cached');
}
// Online flow - try network first, then update cache
try {
const response = await fetch(request);
// Check if response is valid
if (!response.ok) {
throw new Error(`API error: ${response.status}`);
}
// Clone the response since it can only be used once
const responseToCache = response.clone();
// Cache the response for offline use
this.updateCache(request, responseToCache);
return response.json();
} catch (error) {
console.error('Fetch error:', error);
// Try to get from cache as fallback
const cachedResponse = await this.getFromCache(request);
if (cachedResponse) {
return cachedResponse.json();
}
throw error;
}
}
async getFromCache(request) {
try {
const cache = await caches.open(this.cacheName);
const cachedResponse = await cache.match(request);
return cachedResponse;
} catch (error) {
console.error('Cache error:', error);
return null;
}
}
async updateCache(request, response) {
try {
const cache = await caches.open(this.cacheName);
await cache.put(request, response);
} catch (error) {
console.error('Cache update error:', error);
}
}
// Helper methods for common operations
async get(endpoint) {
return this.fetchWithCache(endpoint);
}
async post(endpoint, data) {
// POST requests are not cached
const url = `${this.baseUrl}${endpoint}`;
const response = await fetch(url, {
method: 'POST',
headers: {
'Content-Type': 'application/json'
},
body: JSON.stringify(data)
});
// Surface HTTP errors the same way fetchWithCache does
if (!response.ok) {
throw new Error(`API error: ${response.status}`);
}
return response.json();
}
}
// Usage
const api = new ApiClient('https://api.example.com');
// Get product data with caching
async function loadProduct(id) {
try {
const product = await api.get(`/products/${id}`);
displayProduct(product);
} catch (error) {
showError(error.message);
}
}
Monitoring and Optimizing Cache Performance
To ensure your caching strategy is effective, you need to monitor and optimize it continuously:
Key Cache Metrics
- Hit Ratio: Percentage of requests served from cache
- Miss Ratio: Percentage of requests that couldn't be served from cache
- Latency: Response time for cached vs. uncached requests
- Eviction Rate: How often items are removed from cache due to memory pressure
- Memory Usage: How much memory the cache is consuming
- TTL Distribution: The distribution of time-to-live values across cache entries
// Example of tracking cache metrics in Express
// Metrics live at module scope so that both the middleware and the
// metrics endpoint can read them
const metrics = {
  hits: 0,
  misses: 0,
  requests: 0,
  totalLatencyMs: 0,
  cachedLatencyMs: 0,
  uncachedLatencyMs: 0,
  startTime: Date.now()
};

function cacheMetricsMiddleware() {
  // Return middleware function
  return (req, res, next) => {
    const startTime = process.hrtime();
    // 'finish' fires once per response, after the handler has set X-Cache,
    // so every request is counted exactly once
    res.on('finish', () => {
      // Calculate response time
      const diff = process.hrtime(startTime);
      const responseTimeMs = diff[0] * 1000 + diff[1] / 1000000;
      metrics.requests++;
      metrics.totalLatencyMs += responseTimeMs;
      // Check cache status from headers
      const cacheStatus = res.getHeader('X-Cache');
      if (cacheStatus === 'HIT') {
        metrics.hits++;
        metrics.cachedLatencyMs += responseTimeMs;
      } else {
        metrics.misses++;
        metrics.uncachedLatencyMs += responseTimeMs;
      }
    });
    next();
  };
}

// Register the middleware before the routes it should measure
app.use(cacheMetricsMiddleware());

// Endpoint to expose cache metrics
app.get('/api/metrics/cache', (req, res) => {
  const uptime = Date.now() - metrics.startTime;
  const hitRatio = metrics.requests > 0 ? metrics.hits / metrics.requests : 0;
  const avgLatencyMs = metrics.requests > 0 ? metrics.totalLatencyMs / metrics.requests : 0;
  const avgCachedLatencyMs = metrics.hits > 0 ? metrics.cachedLatencyMs / metrics.hits : 0;
  const avgUncachedLatencyMs = metrics.misses > 0 ? metrics.uncachedLatencyMs / metrics.misses : 0;
  res.json({
    uptime,
    requests: metrics.requests,
    hits: metrics.hits,
    misses: metrics.misses,
    hitRatio,
    avgLatencyMs,
    avgCachedLatencyMs,
    avgUncachedLatencyMs,
    // Percentage by which cached responses are faster than uncached ones
    latencyImprovement: avgUncachedLatencyMs > 0 ?
      ((avgUncachedLatencyMs - avgCachedLatencyMs) / avgUncachedLatencyMs) * 100 : 0
  });
});
Optimizing Cache Efficiency
Techniques to improve your caching efficiency:
- Cache Key Optimization: Design cache keys to maximize reuse
- TTL Tuning: Adjust TTL based on data change frequency
- Cache Warming: Pre-populate cache with commonly accessed data
- Partial Caching: Cache only expensive parts of responses
- Compression: Compress cached data to reduce memory usage
- Optimal Eviction Policies: Choose the right algorithm (LRU, LFU, etc.)
// Example of cache warming
async function warmCache() {
console.log('Starting cache warming...');
try {
// Get top 100 most popular product IDs
const popularProductIds = await getPopularProductIds(100);
// Warm the cache for each product
const promises = popularProductIds.map(async (productId) => {
// Check if already in cache
const cacheKey = `product:${productId}`;
const existingCache = await redisClient.get(cacheKey);
if (!existingCache) {
// Fetch and cache the product
const product = await fetchProductFromDatabase(productId);
await redisClient.setex(cacheKey, 3600, JSON.stringify(product));
return { productId, status: 'cached' };
}
return { productId, status: 'already-cached' };
});
const results = await Promise.all(promises);
console.log('Cache warming completed:', {
total: results.length,
newlyCached: results.filter(r => r.status === 'cached').length,
alreadyCached: results.filter(r => r.status === 'already-cached').length
});
} catch (error) {
console.error('Cache warming failed:', error);
}
}
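// Run cache warming on startup and periodically.
// Note: entries are written with a 1-hour TTL, so a daily run keeps them
// warm only for the first hour after each run. Shorten the interval or
// lengthen the TTL if popular items must stay cached between runs.
warmCache();
setInterval(warmCache, 24 * 60 * 60 * 1000); // Once a day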
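Of the techniques listed above, the eviction policy is often the easiest to reason about in code. The following is a minimal sketch of an LRU (least recently used) cache, assuming a single Node.js process and relying on the insertion order that `Map` preserves. The class name `LruCache` and the `maxEntries` parameter are illustrative; in production a library such as lru-cache or Redis's own eviction settings would normally do this job.

```javascript
// Minimal LRU cache: the Map's insertion order doubles as the recency order.
class LruCache {
  constructor(maxEntries) {
    this.maxEntries = maxEntries;
    this.map = new Map();
  }

  get(key) {
    if (!this.map.has(key)) return undefined;
    // Re-insert the entry so it becomes the most recently used
    const value = this.map.get(key);
    this.map.delete(key);
    this.map.set(key, value);
    return value;
  }

  set(key, value) {
    // Remove any existing entry first so the new one lands at the end
    if (this.map.has(key)) this.map.delete(key);
    this.map.set(key, value);
    if (this.map.size > this.maxEntries) {
      // The first key in iteration order is the least recently used
      const oldestKey = this.map.keys().next().value;
      this.map.delete(oldestKey);
    }
  }
}

// Usage
const productCache = new LruCache(2);
productCache.set('product:1', { name: 'Keyboard' });
productCache.set('product:2', { name: 'Mouse' });
productCache.get('product:1');                      // touch product:1
productCache.set('product:3', { name: 'Monitor' }); // evicts product:2
```

An LFU policy would track access counts instead of recency; which one fits better depends on whether popular items are requested steadily or in bursts.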
Cache Optimization Case Study
In this real-world optimization scenario, a company improved its cache hit ratio from 35% to 92% by:
- Normalizing query parameters (e.g., sorting parameter order, lowercasing values)
- Adjusting TTL values based on content change frequency
- Implementing cache warming for popular resources
- Introducing a shared cache layer for microservices
- Using surrogate keys for precise invalidation
The result was a 70% reduction in database load and a 65% improvement in average response time.
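The first item in the list above, query parameter normalization, can be sketched as follows. This is an illustration rather than the company's actual code: the function name `normalizeCacheKey` is hypothetical, and it assumes that parameter names and values are case-insensitive for the API in question, which is not true of every API.

```javascript
// Build a canonical cache key from a path and a query-parameter object.
// Assumption: parameter names and values are case-insensitive for this API.
function normalizeCacheKey(path, query) {
  const params = Object.entries(query)
    // Drop empty values so "?q=" and a missing "q" share one key
    .filter(([, value]) => value !== undefined && value !== null && value !== '')
    // Lowercase names and values, and trim stray whitespace
    .map(([name, value]) => [name.toLowerCase(), String(value).trim().toLowerCase()])
    // Sort by parameter name so the order in the URL does not matter
    .sort(([a], [b]) => (a < b ? -1 : a > b ? 1 : 0));

  const queryString = params
    .map(([name, value]) => `${encodeURIComponent(name)}=${encodeURIComponent(value)}`)
    .join('&');

  return queryString ? `${path}?${queryString}` : path;
}

// Usage: both requests map to the same cache entry
const keyA = normalizeCacheKey('/products', { Sort: 'Price', category: 'Shoes' });
const keyB = normalizeCacheKey('/products', { category: 'shoes', sort: 'price' });
// keyA and keyB are both "/products?category=shoes&sort=price"
```

Without normalization, each ordering or capitalization of the same query would occupy its own cache entry and count as a miss the first time it is seen.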
Handling Dynamic and Personalized Content
Caching becomes more challenging when dealing with dynamic or personalized content. Here are strategies to address this:
Edge Side Includes (ESI)
A technique for assembling dynamic web pages from individual cached fragments:
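// Example of using ESI with Varnish Cache
// Assumes Varnish is configured to process ESI for this response
// (beresp.do_esi = true in VCL, plus the esi_disable_xml_check feature,
// because the body is JSON rather than XML/HTML)
// Original response in Express
app.get('/api/dashboard', (req, res) => {
  // Add Surrogate-Control header to signal that the body contains ESI tags
  res.setHeader('Surrogate-Control', 'content="ESI/1.0"');
  // Static content cached for a long time
  const layout = { /* layout data */ };
  const profileSrc = `/api/profile/${encodeURIComponent(req.user.id)}`;
  // Build the body as text so the ESI tags stay unquoted. Serializing them
  // as JSON string values would escape their quotes, and the cache would
  // not recognize them. Each tag is replaced by the JSON its endpoint returns.
  const body = '{"dashboard":{' +
    `"layout":${JSON.stringify(layout)},` +
    // Dynamic, personalized content via ESI
    `"userProfile":<esi:include src="${profileSrc}" />,` +
    // Semi-dynamic content via ESI
    '"recommendations":<esi:include src="/api/recommendations" />' +
    '}}';
  res.type('application/json').send(body);
});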
// Individual endpoints for the ESI includes
app.get('/api/profile/:id', (req, res) => {
// User-specific data, not cached or short TTL
res.setHeader('Cache-Control', 'private, max-age=60');
res.json({ /* user profile data */ });
});
app.get('/api/recommendations', (req, res) => {
// Shared among many users, longer TTL
res.setHeader('Cache-Control', 'public, max-age=3600');
res.json({ /* recommendation data */ });
});
Cache Variations by User Segments
Creating cache variations for user groups rather than individuals:
// Caching by user segment rather than individual user
// Assumes a promise-based Redis client, as in the cache warming example
app.get('/api/recommendations', async (req, res) => {
  // Extract user segments from authenticated user
  const userSegments = getUserSegments(req.user);
  // Create a cache key based on segments (not individual user)
  const segmentKey = [...userSegments].sort().join('-');
  const cacheKey = `recommendations:${segmentKey}`;
  try {
    // Try to get from cache; a Redis failure is treated as a cache miss
    let cachedData = null;
    try {
      cachedData = await redisClient.get(cacheKey);
    } catch (err) {
      console.error('Redis error:', err);
    }
    if (cachedData) {
      // Cache hit
      res.setHeader('X-Cache', 'HIT');
      return res.json(JSON.parse(cachedData));
    }
    // Cache miss - generate recommendations
    const recommendations = await generateRecommendations(userSegments);
    // Cache for future users in this segment (best effort)
    redisClient.setex(cacheKey, 3600, JSON.stringify(recommendations))
      .catch(err => console.error('Redis error:', err));
    res.setHeader('X-Cache', 'MISS');
    res.json(recommendations);
  } catch (error) {
    // Without this, a failure while generating would leave the request hanging
    console.error('Error:', error);
    res.status(500).json({ error: 'Server error' });
  }
});
// Helper to extract user segments (e.g., interests, region, account type)
function getUserSegments(user) {
const segments = [];
// Add user tier
segments.push(`tier:${user.accountTier}`);
// Add region
segments.push(`region:${user.region}`);
// Add top interest category
if (user.interests && user.interests.length > 0) {
segments.push(`interest:${user.interests[0]}`);
}
return segments;
}
Client-Side Personalization
Moving personalization logic to the client side:
// Backend returns non-personalized data that can be cached
app.get('/api/products', (req, res) => {
// This endpoint returns all product data and can be heavily cached
res.setHeader('Cache-Control', 'public, max-age=3600');
// Get products
getProducts()
.then(products => {
res.json({ products });
})
.catch(error => {
console.error('Error:', error);
res.status(500).json({ error: 'Server error' });
});
});
// Client-side code does the personalization
const api = new ApiClient('https://api.example.com');
async function loadProductsForUser() {
try {
// Get cached product data
const data = await api.get('/api/products');
// Get user preferences (could be stored in local state)
const userPreferences = getUserPreferences();
// Personalize the data client-side
const personalizedProducts = personalizeProducts(data.products, userPreferences);
// Update UI
displayProducts(personalizedProducts);
} catch (error) {
showError(error.message);
}
}
// Client-side personalization function
function personalizeProducts(products, preferences) {
return products
.filter(product => {
// Filter based on user preferences
if (preferences.excludedCategories.includes(product.category)) {
return false;
}
return true;
})
.sort((a, b) => {
// Sort based on user preferences
if (preferences.favoriteCategories.includes(a.category) &&
!preferences.favoriteCategories.includes(b.category)) {
return -1;
}
if (!preferences.favoriteCategories.includes(a.category) &&
preferences.favoriteCategories.includes(b.category)) {
return 1;
}
return 0;
})
.map(product => {
// Add personalized flags
return {
...product,
isFavorited: preferences.favoritedItems.includes(product.id),
isRecommended: matchesUserInterests(product, preferences.interests)
};
});
}
Real-World Caching Architectures
Let's examine how caching is implemented in real-world production environments:
Multi-Layer Caching Architecture
Large-scale applications often implement caching at multiple levels, and each layer has a specific role:
- Browser Cache: Stores responses locally on the client
- CDN Cache: Caches responses at edge locations around the world
- API Gateway Cache: Caches responses across services
- Application Cache: Caches processed data and assembled responses
- Object Cache: Caches frequently accessed objects
- Database Cache: Caches query results and database objects
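How a request falls through these layers can be sketched in a few lines. The example below is a simplified model, not a production design: `layeredGet` and `createMemoryLayer` are hypothetical names, and every layer is represented by an in-memory stand-in exposing async `get` and `set`, where a real layer might wrap Redis, a CDN API, or a database query cache.

```javascript
// Look up a key in an ordered list of cache layers (fastest first).
// Each layer is an object with a name and async get(key) / set(key, value).
async function layeredGet(layers, key, loadFromSource) {
  for (let i = 0; i < layers.length; i++) {
    const value = await layers[i].get(key);
    if (value !== undefined) {
      // Backfill the faster layers that missed, so the next lookup stops earlier
      await Promise.all(layers.slice(0, i).map(layer => layer.set(key, value)));
      return { value, servedBy: layers[i].name };
    }
  }
  // Every layer missed: go to the original source and populate all layers
  const value = await loadFromSource(key);
  await Promise.all(layers.map(layer => layer.set(key, value)));
  return { value, servedBy: 'source' };
}

// In-memory stand-in for a cache layer
function createMemoryLayer(name) {
  const store = new Map();
  return {
    name,
    get: async key => store.get(key),
    set: async (key, value) => { store.set(key, value); }
  };
}

// Usage
async function demo() {
  const layers = [createMemoryLayer('application'), createMemoryLayer('distributed')];
  const load = async key => ({ id: key, name: 'Example product' });
  const first = await layeredGet(layers, 'product:1', load);  // servedBy: 'source'
  const second = await layeredGet(layers, 'product:1', load); // servedBy: 'application'
  return [first.servedBy, second.servedBy];
}
```

The sketch ignores per-layer TTLs and invalidation, which in practice differ from layer to layer.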
Microservices Caching Patterns
Common caching patterns for microservices architectures include:
- Gateway Caching: Centralized caching at the API gateway
- Client-Side Caching: Each service keeps its own local cache of responses from the services it calls
- Distributed Caching: Shared cache across services
- Command Query Responsibility Segregation (CQRS): Separate read and write models with heavy caching for reads
High-Traffic E-Commerce Example
Let's look at a high-traffic e-commerce site's caching architecture. Its key features are:
- Heavy caching for product data (rarely changes)
- Moderate caching for user profiles (occasionally changes)
- No caching for cart and checkout (real-time data)
- CDN caching for static resources and public API responses
- API gateway caching for authenticated but non-personal responses
- Event-driven cache invalidation
- Cache warming for popular products
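Event-driven cache invalidation, listed above, means the write path announces a change and the cache reacts to it, instead of waiting for a TTL to expire. A minimal single-process sketch, using Node's built-in `EventEmitter`, follows. The event name `product.updated` and the helper `updateProduct` are assumptions made for illustration; in a distributed system the events would typically travel over a message broker or Redis pub/sub instead.

```javascript
const { EventEmitter } = require('node:events');

const cache = new Map();           // stand-in for the real cache
const events = new EventEmitter(); // stand-in for a message bus

// The cache layer subscribes to change events and drops affected entries
events.on('product.updated', ({ productId }) => {
  cache.delete(`product:${productId}`);
});

// The write path persists the change, then announces it
function updateProduct(productId, changes, saveToDatabase) {
  saveToDatabase(productId, changes);
  events.emit('product.updated', { productId });
}

// Usage
cache.set('product:42', { name: 'Old name' });
cache.set('product:7', { name: 'Unrelated' });
updateProduct(42, { name: 'New name' }, () => { /* hypothetical database write */ });
// cache no longer has 'product:42'; 'product:7' is untouched
```

Because the writer only emits an event, it does not need to know which caches hold the product, which keeps the write path decoupled from the caching layers.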
Case Studies
Case Study 1: API Performance Improvement
A financial data API provider improved performance with targeted caching:
Challenge:
- API serving financial market data to thousands of clients
- High load during market hours
- Data freshness critical for some endpoints, less critical for others
- Multiple data sources with varying update frequencies
Solution:
- Tiered Caching Strategy:
- Real-time data (stock prices): 10-second TTL
- Near-real-time data (market summaries): 1-minute TTL
- Historical data: 1-hour TTL
- Reference data: 24-hour TTL
- Cache Warming: Pre-cache popular symbols before market open
- Distributed Redis Cluster: For high availability and throughput
- Stale-While-Revalidate: Serve slightly stale entries while they refresh in the background, so expired entries do not stall requests
Results:
- 95% reduction in database load
- Average response time improved from 120ms to 15ms
- Ability to handle 10x more concurrent users
- Cache hit ratio increased from 40% to 87%
- Significantly reduced infrastructure costs
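The stale-while-revalidate behavior used in this case study can be illustrated with a small in-memory sketch. It is not the provider's implementation: `createSwrCache`, `ttlMs`, and `staleMs` are hypothetical names, the store is a plain `Map` rather than a Redis cluster, and the clock is injectable only to make the behavior easy to test.

```javascript
// Minimal in-memory stale-while-revalidate cache.
// Within ttlMs an entry is fresh. For a further staleMs it is served as-is
// while a background refresh runs. After that the caller waits for a reload.
function createSwrCache({ ttlMs, staleMs, now = Date.now }) {
  const entries = new Map(); // key -> { value, storedAt, refreshing }

  async function refresh(key, loader) {
    const value = await loader(key);
    entries.set(key, { value, storedAt: now(), refreshing: false });
    return value;
  }

  return async function get(key, loader) {
    const entry = entries.get(key);
    if (!entry) return refresh(key, loader);          // miss: caller must wait

    const age = now() - entry.storedAt;
    if (age <= ttlMs) return entry.value;             // fresh
    if (age <= ttlMs + staleMs) {
      if (!entry.refreshing) {
        entry.refreshing = true;                      // at most one refresh per key
        refresh(key, loader).catch(() => { entry.refreshing = false; });
      }
      return entry.value;                             // stale, but served immediately
    }
    return refresh(key, loader);                      // too old to serve
  };
}

// Usage: a 10-second TTL (as for the stock prices above) plus a 5-second stale window
const getPrice = createSwrCache({ ttlMs: 10000, staleMs: 5000 });
// getPrice('AAPL', fetchPriceFromSource) would call the hypothetical loader
// fetchPriceFromSource only on a miss or when the entry needs refreshing
```

The stale window trades a bounded amount of staleness for consistently fast responses, which suits near-real-time data better than data that must always be exact.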
Case Study 2: Social Media Feed Caching
A social media platform optimized feed delivery with smart caching:
Challenge:
- Personalized feeds for millions of users
- Content frequently updated
- Heavy computational cost for feed generation
- Need for real-time updates for active users
Solution:
- Materialized Feed Caching: Pre-compute and cache user feeds
- Hybrid Invalidation Strategy:
- Time-based invalidation: 10-minute TTL for inactive users
- Event-based invalidation: Immediate refresh for active users
- Feed Segmentation: Cache top feed items separately from "load more" content
- Service Worker Cache: Cache feed locally for offline access
Results:
- Feed load time reduced from 850ms to 120ms
- Server CPU utilization decreased by 70%
- Improved user engagement metrics
- Effective offline experience with cached feeds
Case Study 3: E-Commerce Product Catalog
An e-commerce platform optimized its product catalog API:
Challenge:
- 15 million products with complex filtering options
- Frequent inventory updates
- Personalized pricing for different user segments
- High traffic during sales events
Solution:
- Fragment Caching:
- Product details cached for 24 hours
- Inventory status cached for 5 minutes
- Pricing generated dynamically or segment-cached
- Search Result Caching: Cache popular search queries and filter combinations
- CDN Edge Caching: For product images and non-personalized content
- Cache Stampede Protection: Implemented with distributed locks
Results:
- Search response time improved from 1.2s to 200ms
- Successfully handled 5x normal traffic during Black Friday
- Database load reduced by 85%
- Infrastructure cost savings of 40%
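Cache stampede protection prevents many simultaneous cache misses for the same key from all hitting the database at once. The platform in this case study used distributed locks, which coordinate across servers. The sketch below shows a simpler, related technique instead: in-process request coalescing, where concurrent callers within one Node.js process share a single in-flight load. The name `createCoalescingLoader` is hypothetical.

```javascript
// Coalesce concurrent loads of the same key into one in-flight promise.
// This protects a single process only. Coordinating across servers would
// need a shared lock (for example in Redis), as in the case study.
function createCoalescingLoader(load) {
  const inFlight = new Map(); // key -> Promise of the pending load

  return function loadOnce(key) {
    if (inFlight.has(key)) {
      return inFlight.get(key); // join the load that is already running
    }
    const promise = Promise.resolve()
      .then(() => load(key))
      // Forget the promise once it settles so later calls load fresh data
      .finally(() => inFlight.delete(key));
    inFlight.set(key, promise);
    return promise;
  };
}

// Usage
let databaseCalls = 0;
const loadProduct = createCoalescingLoader(async productId => {
  databaseCalls++; // stands in for an expensive database query
  return { id: productId };
});

async function simulateStampede() {
  // Five concurrent requests for the same product trigger one database call
  const results = await Promise.all(
    Array.from({ length: 5 }, () => loadProduct('p1'))
  );
  return { results, databaseCalls };
}
```

Coalescing is usually combined with the cache itself: the loader runs only on a miss, and its result is written back to the cache before the in-flight entry is cleared.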
Practical Activities
Activity 1: Basic API Caching Implementation
Implement a basic caching layer for a REST API:
- Create a simple Express API with a few endpoints
- Implement Redis-based caching for GET requests
- Configure appropriate TTL values for different endpoints
- Add cache invalidation on POST/PUT/DELETE requests
- Implement proper cache headers for HTTP caching
- Test performance with and without caching
Activity 2: Advanced Caching Strategies
Enhance your API with more sophisticated caching strategies:
- Implement the stale-while-revalidate pattern
- Add cache warming for frequently accessed resources
- Create a cache key normalization function
- Implement ETags for conditional requests
- Add cache metrics collection
- Create a cache dashboard to visualize performance
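As a starting point for the ETag step in this activity, the following sketch shows the core decision behind a conditional request, independent of any web framework. The helper names `computeEtag` and `conditionalResponse` are hypothetical, and hashing the serialized body is only one way to derive an ETag; a version number or last-modified timestamp works as well.

```javascript
const crypto = require('node:crypto');

// Derive an ETag from the serialized response body
function computeEtag(body) {
  const hash = crypto.createHash('sha256').update(body).digest('hex');
  return `"${hash}"`;
}

// Decide how to answer a GET, given the client's If-None-Match header value
function conditionalResponse(ifNoneMatch, payload) {
  const body = JSON.stringify(payload);
  const etag = computeEtag(body);
  // If-None-Match may list several tags; the comparison ignores a weak "W/" prefix
  const candidates = (ifNoneMatch || '')
    .split(',')
    .map(tag => tag.trim().replace(/^W\//, ''));
  if (candidates.includes(etag) || candidates.includes('*')) {
    // The client's copy is current: send headers only, no body
    return { status: 304, headers: { ETag: etag }, body: null };
  }
  return { status: 200, headers: { ETag: etag }, body };
}

// Usage
const product = { id: 1, name: 'Keyboard' };
const firstResponse = conditionalResponse(undefined, product);              // status 200
const repeatResponse = conditionalResponse(firstResponse.headers.ETag, product); // status 304
```

In an Express handler, the first argument would come from `req.headers['if-none-match']`, and the returned status, headers, and body would be written to `res`.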
Activity 3: Caching Personalized Content
Develop strategies for caching personalized API responses:
- Identify which parts of responses can be cached collectively
- Implement client-side personalization for applicable components
- Create a user segment-based caching strategy
- Implement a fragment caching approach
- Test performance and accuracy of personalized responses
Activity 4: Distributed Cache Implementation
Design and implement a distributed caching system:
- Set up a Redis cluster with multiple nodes
- Implement cache invalidation across nodes
- Create a resilient cache access pattern with fallbacks
- Add monitoring for cache performance
- Test failure scenarios and recovery
- Implement cache synchronization strategies
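For the resilient access pattern in this activity, one possible starting point is a read-through helper that treats any cache failure or slow response as a miss. This is a sketch under stated assumptions: `getWithFallback` and `withTimeout` are hypothetical helpers, `cache` is any object with promise-returning `get(key)` and `set(key, value)` methods, and the 50 ms timeout is an arbitrary default.

```javascript
// Reject if the given promise does not settle within ms milliseconds
function withTimeout(promise, ms) {
  let timer;
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(() => reject(new Error('cache timeout')), ms);
  });
  return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
}

// Read through a cache, but never let a cache outage fail the request
async function getWithFallback(cache, key, loadFromSource, timeoutMs = 50) {
  try {
    const cached = await withTimeout(cache.get(key), timeoutMs);
    if (cached !== null && cached !== undefined) {
      return { value: JSON.parse(cached), source: 'cache' };
    }
  } catch (error) {
    // Cache is down or slow: fall through to the original source
  }

  const value = await loadFromSource(key);
  try {
    // Best-effort write; a failure here must not affect the response
    await withTimeout(cache.set(key, JSON.stringify(value)), timeoutMs);
  } catch (error) {
    // Ignored on purpose; monitoring would normally record this
  }
  return { value, source: 'origin' };
}
```

When testing failure scenarios, replacing `cache` with an object whose methods throw or never resolve is a quick way to confirm that requests still succeed from the origin.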
Additional Resources
Documentation and Guides
- MDN Web Docs: HTTP Caching
- AWS CloudFront: Cache Control
- Redis Documentation: LRU Cache
- Martin Fowler: The Two Hard Things in Computer Science
- Fastly: Surrogate Keys
Libraries and Tools
- Memcached - Distributed memory object caching system
- Redis - In-memory data structure store
- node-cache - Simple in-memory cache for Node.js
- lru-cache - LRU Cache implementation for Node.js
- Varnish Cache - HTTP accelerator and caching proxy
Articles and Papers
- Web Caching and Content Delivery Networks
- AWS Builders Library: Caching Challenges and Strategies
- InfoQ: Simple Strategies for API Caching
- Nordic APIs: The Benefits of a Good API Cache
Books
- "High Performance Browser Networking" by Ilya Grigorik (Chapter on HTTP Caching)
- "Designing Data-Intensive Applications" by Martin Kleppmann
- "Web Caching" by Duane Wessels
- "HTTP: The Definitive Guide" by David Gourley and Brian Totty