Database Migrations

Managing Schema Evolution in Django Applications

Understanding Database Migrations

Database migrations are a way to propagate changes you make to your models into your database schema. They allow you to evolve your database schema over time, without having to drop and recreate your database whenever you make a change to your models.

Think of migrations as a version control system for your database schema. Just as Git tracks changes to your code, migrations track changes to your database structure, allowing you to move forward or backward through different database states.

graph LR
    A[Python Models] -- "makemigrations" --> B[Migration Files]
    B -- "migrate" --> C[Database Schema]
    style A fill:#bbf,stroke:#333,stroke-width:1px
    style B fill:#fbb,stroke:#333,stroke-width:1px
    style C fill:#bfb,stroke:#333,stroke-width:1px

In the real world, migrations are analogous to construction blueprints that guide the renovation of a building. When you decide to add a new room or change the layout of a building, you don't demolish the entire structure and rebuild from scratch. Instead, you create blueprints (migrations) that detail exactly what changes need to be made, and then contractors (Django's migration system) follow those blueprints to update the building (database schema) accordingly.

The Migration Workflow

Django's migration system involves a two-step process:

  1. Creating migrations: You create migration files based on the changes you've made to your models.
  2. Applying migrations: You apply the migrations to update your database schema.

Let's look at the workflow in more detail:

sequenceDiagram
    participant Developer
    participant Models
    participant Migrations
    participant Database
    Developer->>Models: Update model definitions
    Developer->>Migrations: Generate migrations (makemigrations)
    Migrations-->>Developer: Migration files created
    Developer->>Migrations: Apply migrations (migrate)
    Migrations->>Database: Update schema
    Database-->>Developer: Schema updated

Creating Migrations

When you make changes to your models (like adding a new model, adding fields to an existing model, or changing a field type), you use the makemigrations command to create migration files:

python manage.py makemigrations

You can also limit the command to a single app:

python manage.py makemigrations blog

Django will analyze your models, determine what changes need to be made to your database schema, and create one or more migration files in the migrations directory of the affected app(s).

Applying Migrations

Once you've created migration files, you apply them to update your database schema using the migrate command:

python manage.py migrate

This command applies all pending migrations to your database. You can also migrate a single app:

python manage.py migrate blog

Or you can migrate up to a specific migration:

python manage.py migrate blog 0003_add_author_field

This workflow is similar to how changes are managed in a construction project: first, architects create blueprints detailing the changes to be made (makemigrations), and then construction workers implement those changes (migrate).
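
To make "pending" concrete, here is a minimal sketch of the bookkeeping behind migrate. Django records applied migrations in the django_migrations table; in this sketch a plain Python set stands in for that table, and the function name pending_migrations is made up for illustration. It is a simplified model, not Django's actual implementation.

```python
# Simplified model of how "pending" migrations are worked out.
# A set stands in for the django_migrations table (assumption for illustration).

def pending_migrations(on_disk, applied):
    """Return migrations that exist as files but are not yet applied, in file order."""
    return [name for name in on_disk if name not in applied]


# Migration files found in blog/migrations/, in order.
on_disk = ["0001_initial", "0002_add_category_model", "0003_add_author_field"]

# Migrations already recorded as applied.
applied = {"0001_initial", "0002_add_category_model"}

print(pending_migrations(on_disk, applied))
```

Running migrate a second time applies nothing, because every file is then in the applied set.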

Anatomy of a Migration File

Migration files are Python scripts that define the changes to be made to your database schema. They're stored in the migrations directory of your app and are named with a prefix of NNNN_ (where NNNN is a four-digit number) followed by a descriptive name.
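
The naming convention can be illustrated with a short sketch that splits a file name into its number and its descriptive part. The regular expression and the helper name parse_migration_filename are assumptions made for this example; Django itself orders migrations by their dependencies, not by parsing the number.

```python
import re

# NNNN_ prefix (four digits), then a descriptive name, then the .py extension.
MIGRATION_NAME = re.compile(r"^(\d{4})_(\w+)\.py$")


def parse_migration_filename(filename):
    """Return (number, description) for a migration file, or None for other files."""
    match = MIGRATION_NAME.match(filename)
    if match is None:
        return None
    return int(match.group(1)), match.group(2)


print(parse_migration_filename("0001_initial.py"))
print(parse_migration_filename("0003_add_author_field.py"))
print(parse_migration_filename("__init__.py"))
```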

Let's look at an example migration file:

# blog/migrations/0001_initial.py
from django.db import migrations, models
import django.db.models.deletion
import django.utils.timezone

class Migration(migrations.Migration):
    
    initial = True
    
    dependencies = []
    
    operations = [
        migrations.CreateModel(
            name='Author',
            fields=[
                ('id', models.AutoField(auto_created=True, primary_key=True, serialize=False, verbose_name='ID')),
                ('name', models.CharField(max_length=100)),
                ('email', models.EmailField(max_length=254)),
            ],
        ),
        migrations.CreateModel(
            name='Post',
            fields=[
                ('id', models.AutoField(auto_created=True, primary_key=True, serialize=False, verbose_name='ID')),
                ('title', models.CharField(max_length=200)),
                ('content', models.TextField()),
                ('published_date', models.DateTimeField(default=django.utils.timezone.now)),
                ('author', models.ForeignKey(on_delete=django.db.models.deletion.CASCADE, to='blog.author')),
            ],
        ),
    ]

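This migration file has several key components:

  1. initial = True marks this as the app's first migration.
  2. dependencies lists the migrations that must be applied before this one. It is empty here because nothing precedes it.
  3. operations is the ordered list of schema changes, in this case two CreateModel operations for Author and Post.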

The operations in a migration file correspond to the changes you've made to your models. For example, if you add a new model, the migration will include a CreateModel operation. If you add a field to an existing model, the migration will include an AddField operation.

Migration files are like detailed construction plans that specify exactly what changes need to be made to a building. They include information about what's being built (new models), what's being modified (field changes), and any dependencies (other migrations that must be completed first).
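
The role of dependencies can be sketched with Python's standard graphlib module (Python 3.9+). The dependency map below is hypothetical, including the comments app, and the sketch only shows the idea of ordering by dependencies; Django builds its own migration graph internally.

```python
from graphlib import TopologicalSorter

# Hypothetical dependency map: each migration maps to the migrations
# that must be applied before it, as (app_label, migration_name) pairs.
dependencies = {
    ("blog", "0001_initial"): set(),
    ("blog", "0002_add_category_model"): {("blog", "0001_initial")},
    ("comments", "0001_initial"): {("blog", "0001_initial")},
}


def application_order(deps):
    """Return one valid order in which every dependency comes first."""
    return list(TopologicalSorter(deps).static_order())


order = application_order(dependencies)
for app_label, name in order:
    print(f"{app_label}.{name}")
```

The two migrations that depend on blog.0001_initial are independent of each other, so either may come second; only the dependency edges are guaranteed.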

Common Migration Operations

Django's migration system provides a variety of operations for modifying your database schema. Here are some of the most common operations:

CreateModel

The CreateModel operation creates a new table in your database based on a model definition.

migrations.CreateModel(
    name='Category',
    fields=[
        ('id', models.AutoField(auto_created=True, primary_key=True, serialize=False, verbose_name='ID')),
        ('name', models.CharField(max_length=100)),
        ('slug', models.SlugField(unique=True)),
    ],
    options={
        'verbose_name_plural': 'categories',
    },
)

DeleteModel

The DeleteModel operation removes a table from your database.

migrations.DeleteModel(
    name='Category',
)

AddField

The AddField operation adds a new column to an existing table.

migrations.AddField(
    model_name='post',
    name='category',
    field=models.ForeignKey(null=True, on_delete=django.db.models.deletion.SET_NULL, to='blog.category'),
)

RemoveField

The RemoveField operation removes a column from an existing table.

migrations.RemoveField(
    model_name='post',
    name='category',
)

AlterField

The AlterField operation changes the definition of a column in an existing table.

migrations.AlterField(
    model_name='post',
    name='title',
    field=models.CharField(max_length=300),  # Changed from max_length=200
)

RenameField

The RenameField operation renames a column in an existing table.

migrations.RenameField(
    model_name='post',
    old_name='content',
    new_name='body',
)

RenameModel

The RenameModel operation renames a table in your database.

migrations.RenameModel(
    old_name='Entry',
    new_name='Post',
)

AddIndex

The AddIndex operation adds an index to a table.

migrations.AddIndex(
    model_name='post',
    index=models.Index(fields=['published_date'], name='post_pub_date_idx'),
)

CreateModel with ForeignKey

A model with a ForeignKey to a model created in an earlier migration must list that migration in dependencies, so that the referenced table exists first. makemigrations adds this entry for you.

class Migration(migrations.Migration):
    
    dependencies = [
        ('blog', '0001_initial'),
    ]
    
    operations = [
        migrations.CreateModel(
            name='Comment',
            fields=[
                ('id', models.AutoField(auto_created=True, primary_key=True, serialize=False, verbose_name='ID')),
                ('author_name', models.CharField(max_length=100)),
                ('content', models.TextField()),
                ('created_date', models.DateTimeField(auto_now_add=True)),
                ('post', models.ForeignKey(on_delete=django.db.models.deletion.CASCADE, related_name='comments', to='blog.post')),
            ],
        ),
    ]

These operations are like the specific instructions in a construction blueprint: add a new room (CreateModel), remove a wall (RemoveField), enlarge a doorway (AlterField), etc. Each operation precisely defines a change to be made to the database schema.

Migration Management Commands

Django provides several management commands for working with migrations. Here are the most useful ones:

makemigrations

Creates new migration files based on the changes you've made to your models.

python manage.py makemigrations

You can specify an app name to create migrations only for that app:

python manage.py makemigrations blog

You can also provide a name for the migration:

python manage.py makemigrations blog --name add_author_field

migrate

Applies migrations to update your database schema.

python manage.py migrate

You can specify an app and migration to migrate to a specific version:

python manage.py migrate blog 0003_add_author_field

Or you can specify just an app to apply all pending migrations for that app:

python manage.py migrate blog

showmigrations

Shows the status of all migrations in the project.

python manage.py showmigrations

The output will show which migrations have been applied (marked with an [X]) and which are pending:

admin
 [X] 0001_initial
 [X] 0002_logentry_remove_auto_add
 [X] 0003_logentry_add_action_flag_choices
auth
 [X] 0001_initial
 [X] 0002_alter_permission_name_max_length
 ...
blog
 [X] 0001_initial
 [X] 0002_add_category_model
 [ ] 0003_add_author_field

You can also specify an app to show just the migrations for that app:

python manage.py showmigrations blog

sqlmigrate

Shows the SQL statements that would be executed for a migration.

python manage.py sqlmigrate blog 0001_initial

This command is useful for understanding what changes will be made to your database schema, or for debugging migration issues.

squashmigrations

Squashes multiple migrations into a single migration.

python manage.py squashmigrations blog 0001 0005

This command is useful when you have a lot of migrations and want to reduce their number for performance or clarity reasons.

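showmigrations --list

Older tutorials list migrations with migrate --list. That option was deprecated in Django 1.8 and removed in Django 1.10. Use showmigrations instead; its --list option, which is the default, prints every migration grouped by app.

python manage.py showmigrations --list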

These commands are like the tools and techniques used to manage a construction project: creating new blueprints (makemigrations), implementing changes according to blueprints (migrate), checking construction progress (showmigrations), previewing construction work (sqlmigrate), and consolidating multiple blueprints into a single comprehensive plan (squashmigrations).

Handling Migration Conflicts

When multiple developers are working on the same project, migration conflicts can occur. Here are some common scenarios and how to handle them:

Multiple Migrations Created with the Same Number

If two developers create migrations for the same app independently, they might end up with migrations that have the same number:

# Developer A creates a migration
$ python manage.py makemigrations blog
Migrations for 'blog':
  blog/migrations/0002_add_category_model.py
    - Create model Category

# Developer B (without Developer A's migration) also creates a migration
$ python manage.py makemigrations blog
Migrations for 'blog':
  blog/migrations/0002_add_author_field.py
    - Add field author to post

When this happens, you need to:

  1. Merge the latest code from all developers (including migrations)
  2. Run makemigrations with the --merge flag. A plain makemigrations stops with a "Conflicting migrations detected" error; --merge creates a new migration that depends on both branches
$ python manage.py makemigrations --merge blog
Migrations for 'blog':
  blog/migrations/0003_merge_20230424_1455.py
    - Merge migrations 0002_add_category_model and 0002_add_author_field

$ python manage.py migrate
Operations to perform:
  Apply all migrations: blog
Running migrations:
  Applying blog.0002_add_category_model... OK
  Applying blog.0002_add_author_field... OK
  Applying blog.0003_merge_20230424_1455... OK
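
The conflict comes down to an app having more than one "leaf" migration, meaning a migration that no other migration depends on. The sketch below models that check with a plain dictionary. The helper name leaf_migrations is invented for this example and the code is a simplified illustration, not Django's implementation.

```python
def leaf_migrations(deps):
    """Return migrations that no other migration depends on.

    deps maps each migration name to the set of names it depends on.
    """
    depended_on = set()
    for parents in deps.values():
        depended_on |= parents
    return sorted(name for name in deps if name not in depended_on)


# State after both developers' branches are merged in version control.
before_merge = {
    "0001_initial": set(),
    "0002_add_category_model": {"0001_initial"},
    "0002_add_author_field": {"0001_initial"},
}

# The merge migration depends on both branches, leaving a single leaf.
after_merge = dict(before_merge)
after_merge["0003_merge_20230424_1455"] = {
    "0002_add_category_model",
    "0002_add_author_field",
}

print(leaf_migrations(before_merge))  # two leaves: a conflict
print(leaf_migrations(after_merge))   # one leaf: resolved
```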

Unapplied Migrations in Version Control

If you pull code that includes new migrations, you should apply those migrations to update your database schema:

$ git pull origin main
$ python manage.py migrate

Reverting Migrations

Sometimes you need to revert to an earlier database schema. You can do this by specifying the migration you want to revert to:

$ python manage.py migrate blog 0002_add_category_model

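This unapplies every blog migration after 0002_add_category_model, newest first. To unapply all of an app's migrations, use zero as the migration name (python manage.py migrate blog zero).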
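
As a rough sketch of what reverting involves, the function below works out which migrations would be unapplied for a given target. It assumes a simple linear history with no branches, and the name migrations_to_unapply and the fourth migration in the list are made up for illustration.

```python
def migrations_to_unapply(applied_in_order, target):
    """Return the migrations after target, newest first (linear history assumed)."""
    position = applied_in_order.index(target)
    return list(reversed(applied_in_order[position + 1:]))


applied_in_order = [
    "0001_initial",
    "0002_add_category_model",
    "0003_add_author_field",
    "0004_populate_slug_field",  # hypothetical later migration
]

print(migrations_to_unapply(applied_in_order, "0002_add_category_model"))
```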

Handling migration conflicts is like coordinating between different teams on a construction project. When multiple teams are making changes to the same building, you need to integrate their work carefully to ensure that the final structure is sound and includes all the intended changes.

Data Migrations

Sometimes you need to not only change the database schema but also transform the data in the database. For example, you might want to populate a new field from existing values or create a set of default records.

For these scenarios, you can use data migrations. Here's how to create and use a data migration:

Creating an Empty Migration

First, you create an empty migration file:

python manage.py makemigrations --empty blog --name=populate_slug_field

Adding a Data Migration Function

Then, you add a function to the migration file that will populate the data:

from django.db import migrations
from django.utils.text import slugify

def populate_slug_field(apps, schema_editor):
    # Get the historical version of the Post model
    Post = apps.get_model('blog', 'Post')
    
    # Loop through all Post instances
    for post in Post.objects.all():
        # Generate a slug from the title
        post.slug = slugify(post.title)
        # Save the post
        post.save()

class Migration(migrations.Migration):
    
    dependencies = [
        ('blog', '0003_add_slug_field'),
    ]
    
    operations = [
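        # Passing noop as reverse_code keeps the migration reversible;
        # without a reverse function, Django refuses to unapply it.
        migrations.RunPython(populate_slug_field, migrations.RunPython.noop),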
    ]

The RunPython operation executes the populate_slug_field function when the migration is applied.

Running the Data Migration

You apply the data migration just like any other migration:

python manage.py migrate

Important Considerations for Data Migrations

Data migrations are like the process of moving furniture and belongings during a home renovation. It's not enough to just change the structure of the house (schema migration); you also need to rearrange the contents (data) to fit the new layout.

Migration Strategies for Production

Applying migrations to a production database requires careful planning to minimize downtime and risk. Here are some strategies and best practices:

Backward Compatible Migrations

Design your migrations to be backward compatible with your application code whenever possible. This allows you to deploy migrations and code changes separately.

For example, if you need to rename a field:

  1. Add a new field with the new name
  2. Deploy code that writes to both fields
  3. Copy data from the old field to the new field
  4. Deploy code that uses only the new field
  5. Remove the old field
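
The database side of these steps can be sketched with Python's built-in sqlite3 module. This is an illustration under assumptions: the table blog_post and the columns content and body are example names, and in a Django project steps 1, 3 and 5 would be separate migrations (AddField, RunPython, RemoveField) rather than hand-written SQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE blog_post (id INTEGER PRIMARY KEY, content TEXT)")
conn.executemany(
    "INSERT INTO blog_post (content) VALUES (?)",
    [("first",), ("second",)],
)

# Step 1: add the new column. Old code that only knows "content" keeps working.
conn.execute("ALTER TABLE blog_post ADD COLUMN body TEXT")

# Step 2 (writing to both columns) happens in application code, not shown here.

# Step 3: copy existing data into the new column.
conn.execute("UPDATE blog_post SET body = content WHERE body IS NULL")

rows = conn.execute("SELECT content, body FROM blog_post ORDER BY id").fetchall()
print(rows)

# Step 5: drop the old column once no deployed code reads it.
# DROP COLUMN needs SQLite 3.35 or newer, so it is guarded here.
if sqlite3.sqlite_version_info >= (3, 35, 0):
    conn.execute("ALTER TABLE blog_post DROP COLUMN content")

columns = [row[1] for row in conn.execute("PRAGMA table_info(blog_post)")]
print(columns)
```

At every point between steps, code written for either the old or the new column name can run against the table, which is what makes separate deployments possible.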

Test Migrations on a Copy of Production Data

Before applying migrations to production, test them on a staging environment with a copy of your production data.

# Create a backup of your production database
$ pg_dump -U username -d production_db > production_backup.sql

# Restore the backup to your staging database
$ psql -U username -d staging_db < production_backup.sql

# Apply migrations to staging
$ python manage.py migrate

Database Locks and Downtime

Be aware that some migrations can lock tables, potentially causing downtime. For large tables, consider using techniques that minimize locking:
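
One commonly used technique is to backfill data in small batches, so that each transaction touches only a few rows. The sketch below shows the idea with sqlite3. The table and column names are assumptions for this example, and how much a batch size helps depends on your database.

```python
import sqlite3


def backfill_in_batches(conn, batch_size):
    """Copy content into body a few rows at a time, committing after each batch."""
    batches = 0
    while True:
        cursor = conn.execute(
            "UPDATE blog_post SET body = content "
            "WHERE id IN ("
            "  SELECT id FROM blog_post"
            "  WHERE body IS NULL AND content IS NOT NULL"
            "  LIMIT ?"
            ")",
            (batch_size,),
        )
        conn.commit()  # end the transaction so locks are held only briefly
        if cursor.rowcount == 0:
            return batches
        batches += 1


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE blog_post (id INTEGER PRIMARY KEY, content TEXT, body TEXT)"
)
conn.executemany(
    "INSERT INTO blog_post (content) VALUES (?)",
    [(f"post {n}",) for n in range(5)],
)
conn.commit()

batches = backfill_in_batches(conn, batch_size=2)
remaining = conn.execute(
    "SELECT COUNT(*) FROM blog_post WHERE body IS NULL"
).fetchone()[0]
print(batches, remaining)
```

The content IS NOT NULL condition matters: without it, rows with no content would never leave the "still to do" set and the loop would not terminate.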

Backup Before Migrating

Always back up your production database before applying migrations:

# PostgreSQL backup
$ pg_dump -U username -d production_db > pre_migration_backup.sql

# MySQL backup
$ mysqldump -u username -p production_db > pre_migration_backup.sql

# SQLite backup (just copy the file)
$ cp db.sqlite3 db.sqlite3.backup
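
For SQLite, copying the file is simplest when nothing is writing to the database. As an alternative sketch, Python's sqlite3 module (3.7+) exposes SQLite's online backup API through Connection.backup, which is designed to copy a database that may be in use. The helper name backup_sqlite and the file paths are assumptions for this example.

```python
import os
import sqlite3
import tempfile


def backup_sqlite(source_path, backup_path):
    """Copy a SQLite database using the online backup API."""
    source = sqlite3.connect(source_path)
    target = sqlite3.connect(backup_path)
    try:
        source.backup(target)
    finally:
        target.close()
        source.close()


# Demonstration with a throwaway database in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    db_path = os.path.join(tmp, "db.sqlite3")
    backup_path = os.path.join(tmp, "db.sqlite3.backup")

    conn = sqlite3.connect(db_path)
    conn.execute("CREATE TABLE blog_post (id INTEGER PRIMARY KEY, title TEXT)")
    conn.execute("INSERT INTO blog_post (title) VALUES ('Hello')")
    conn.commit()
    conn.close()

    backup_sqlite(db_path, backup_path)

    # Read the backup to confirm the data came across.
    check = sqlite3.connect(backup_path)
    restored_titles = [row[0] for row in check.execute("SELECT title FROM blog_post")]
    check.close()

print(restored_titles)
```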

Automate Deployment with Continuous Integration/Continuous Deployment (CI/CD)

Automate your migration process as part of your CI/CD pipeline:

# Example GitLab CI/CD configuration
deploy:
  stage: deploy
  script:
    - pip install -r requirements.txt
    - python manage.py migrate --no-input
    - python manage.py collectstatic --no-input
    - restart-app-server.sh
  only:
    - main

These migration strategies are like the careful planning that goes into renovating a building while it's still in use. You need to minimize disruption, ensure safety, and have a fallback plan in case something goes wrong.

Migration Best Practices

Here are some best practices to follow when working with Django migrations:

Following these best practices will help you manage your database schema effectively and avoid common pitfalls.

Real-World Example: Blog Evolution

Let's walk through a real-world example of how migrations would be used to evolve a blog application over time:

Initial Models

We start with a simple Post model:

# blog/models.py
from django.db import models
from django.utils import timezone

class Post(models.Model):
    title = models.CharField(max_length=200)
    content = models.TextField()
    created_date = models.DateTimeField(auto_now_add=True)
    published_date = models.DateTimeField(default=timezone.now)
    
    def __str__(self):
        return self.title

We create and apply the initial migration:

$ python manage.py makemigrations blog
Migrations for 'blog':
  blog/migrations/0001_initial.py
    - Create model Post

$ python manage.py migrate
Operations to perform:
  Apply all migrations: admin, auth, blog, contenttypes, sessions
Running migrations:
  Applying blog.0001_initial... OK

Adding Categories

Later, we decide to add categories to our blog:

# blog/models.py
from django.db import models
from django.utils import timezone

class Category(models.Model):
    name = models.CharField(max_length=100)
    slug = models.SlugField(unique=True)
    
    class Meta:
        verbose_name_plural = 'categories'
    
    def __str__(self):
        return self.name

class Post(models.Model):
    title = models.CharField(max_length=200)
    content = models.TextField()
    created_date = models.DateTimeField(auto_now_add=True)
    published_date = models.DateTimeField(default=timezone.now)
    category = models.ForeignKey(Category, on_delete=models.CASCADE, null=True, blank=True)
    
    def __str__(self):
        return self.title

We create and apply the migration for these changes:

$ python manage.py makemigrations blog --name=add_category
Migrations for 'blog':
  blog/migrations/0002_add_category.py
    - Create model Category
    - Add field category to post

$ python manage.py migrate
Operations to perform:
  Apply all migrations: admin, auth, blog, contenttypes, sessions
Running migrations:
  Applying blog.0002_add_category... OK

Adding Tags with a Data Migration

Next, we decide to add tags to our blog, and we want to automatically create some default tags:

# blog/models.py
from django.db import models
from django.utils import timezone

class Category(models.Model):
    ...  # fields unchanged from the previous version

class Tag(models.Model):
    name = models.CharField(max_length=50)
    slug = models.SlugField(unique=True)
    
    def __str__(self):
        return self.name

class Post(models.Model):
    title = models.CharField(max_length=200)
    content = models.TextField()
    created_date = models.DateTimeField(auto_now_add=True)
    published_date = models.DateTimeField(default=timezone.now)
    category = models.ForeignKey(Category, on_delete=models.CASCADE, null=True, blank=True)
    tags = models.ManyToManyField(Tag, blank=True)
    
    def __str__(self):
        return self.title

We create a schema migration for the new Tag model and the many-to-many relationship:

$ python manage.py makemigrations blog --name=add_tags
Migrations for 'blog':
  blog/migrations/0003_add_tags.py
    - Create model Tag
    - Add field tags to post

Then we create a data migration to add some default tags:

$ python manage.py makemigrations --empty blog --name=create_default_tags

We edit the empty migration file to add our data migration function:

# blog/migrations/0004_create_default_tags.py
from django.db import migrations

def create_default_tags(apps, schema_editor):
    Tag = apps.get_model('blog', 'Tag')
    for name in ['Technology', 'Programming', 'Django', 'Python', 'Web Development']:
        Tag.objects.create(
            name=name,
            slug=name.lower().replace(' ', '-')
        )

def remove_default_tags(apps, schema_editor):
    Tag = apps.get_model('blog', 'Tag')
    Tag.objects.filter(name__in=[
        'Technology', 'Programming', 'Django', 'Python', 'Web Development'
    ]).delete()

class Migration(migrations.Migration):
    
    dependencies = [
        ('blog', '0003_add_tags'),
    ]
    
    operations = [
        migrations.RunPython(
            create_default_tags,
            reverse_code=remove_default_tags
        ),
    ]
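
The forward and reverse functions are meant to undo each other. The plain-Python sketch below mirrors their logic, with a dictionary standing in for the Tag table (an assumption made so the example runs without a database). It checks that applying and then reversing leaves existing data untouched.

```python
DEFAULT_TAGS = ['Technology', 'Programming', 'Django', 'Python', 'Web Development']


def make_slug(name):
    """Same slug rule as the migration: lowercase, spaces become hyphens."""
    return name.lower().replace(' ', '-')


def create_default_tags(tags):
    """Forward step: add each default tag, keyed by its slug."""
    for name in DEFAULT_TAGS:
        tags[make_slug(name)] = name


def remove_default_tags(tags):
    """Reverse step: remove exactly the tags the forward step added."""
    for name in DEFAULT_TAGS:
        tags.pop(make_slug(name), None)


# A tag that existed before the migration must survive both directions.
tags = {"news": "News"}

create_default_tags(tags)
after_forward = dict(tags)

remove_default_tags(tags)
after_reverse = dict(tags)

print(sorted(after_forward))
print(after_reverse)
```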

Finally, we apply all the pending migrations:

$ python manage.py migrate
Operations to perform:
  Apply all migrations: admin, auth, blog, contenttypes, sessions
Running migrations:
  Applying blog.0003_add_tags... OK
  Applying blog.0004_create_default_tags... OK

This real-world example demonstrates how migrations allow you to evolve your database schema over time, adding new models, fields, and relationships, and even transforming data as needed.

Practice Activity: Database Migrations

Let's apply what we've learned about migrations to a real project:

Activity 1: Create and Apply Initial Migrations

Create a new Django project and app, define some initial models, and create and apply migrations:

  1. Create a new Django project called library
  2. Create a new app called books
  3. Define Author and Book models in books/models.py
  4. Create an initial migration for the models
  5. Apply the migration to create the database tables

Activity 2: Evolve the Schema with Migrations

Modify the models and create migrations to evolve the schema:

  1. Add a Publisher model and relate it to the Book model
  2. Add fields to the Book model for isbn and publication_date
  3. Create a migration for these changes with a descriptive name
  4. Apply the migration

Activity 3: Create and Apply a Data Migration

Create a data migration to populate some data:

  1. Create an empty migration for the books app
  2. Edit the migration file to add a function that creates some default publishers
  3. Apply the migration
  4. Verify that the publishers were created in the database

Activity 4: Handle a Migration Conflict

Simulate a migration conflict and resolve it:

  1. Create a separate branch or copy of your project
  2. In the original version, add a genre field to the Book model and create a migration
  3. In the separate version, add a language field to the Book model and create a migration
  4. Merge the changes from the separate version into the original version
  5. Create a migration to resolve the conflict
  6. Apply all migrations and verify that both fields were added

Summary

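In this lecture, we've explored database migrations in Django, including the migration workflow, the anatomy of a migration file, common migration operations, the migration management commands, conflict handling, data migrations, and strategies for production.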

Understanding migrations is crucial for effectively managing your database schema as your Django application evolves. Migrations allow you to make changes to your models and propagate those changes to your database in a controlled, reversible, and collaborative way.

In the next lecture, we'll explore Django's QuerySet API, which allows you to interact with your database using the models you've defined.

Further Resources