
Using south to refactor a Django model with inheritance


Check out the response below by Paul for some notes on compatibility with newer versions of Django/South.


This seemed like an interesting problem, and I'm becoming a big fan of South, so I decided to look into it a bit. I built a test project based on an abstraction of what you've described above, and successfully used South to perform the migration you are asking about. Here are a couple of notes before we get to the code:

  • The South documentation recommends doing schema migrations and data migrations separately. I've followed suit in this.

  • On the backend, Django represents an inherited table by automatically creating a OneToOne field on the inheriting model.

  • Understanding this, our South migration needs to handle the OneToOne field manually. However, in experimenting with this, it seems that South (or perhaps Django itself) cannot create a OneToOne field with the same name on multiple inherited tables. Because of this, I renamed each child table in the movies/tv apps to be specific to its own app (i.e. MovieVideoFile/ShowVideoFile).

  • In playing with the actual data migration code, it seems South prefers to create the OneToOne field first and then assign data to it. Assigning data to the OneToOne field during creation causes South to choke. (A fair compromise for all the coolness that is South.)
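To make the one-to-one relationship from the notes above concrete, here is a minimal sketch that uses Python's built-in sqlite3 module instead of Django. The table and column names follow Django's usual naming for multi-table inheritance (an `<app>_<model>` table and a `<parent>_ptr_id` column), but they are my assumption about the layout, not output copied from South, so treat this as an illustration only.

```python
import sqlite3

def build_inherited_schema(conn):
    """Create a parent table and a child table linked one-to-one."""
    conn.executescript("""
        CREATE TABLE media_videofile (
            id    INTEGER PRIMARY KEY,
            name  TEXT NOT NULL DEFAULT '',
            size  INTEGER,
            ctime TEXT
        );
        -- The child's primary key is also a foreign key to the parent row,
        -- which is what makes the relationship one-to-one.
        CREATE TABLE movies_movievideofile (
            videofile_ptr_id INTEGER PRIMARY KEY
                REFERENCES media_videofile (id),
            movie_id INTEGER
        );
    """)

def fetch_movie_files(conn):
    """Join child and parent rows to get the full 'inherited' record."""
    return conn.execute("""
        SELECT p.id, p.name, p.size, c.movie_id
        FROM movies_movievideofile AS c
        JOIN media_videofile AS p ON p.id = c.videofile_ptr_id
        ORDER BY p.id
    """).fetchall()

conn = sqlite3.connect(":memory:")
build_inherited_schema(conn)

# Insert the parent row first, then the child row that points at it.
cur = conn.execute(
    "INSERT INTO media_videofile (name, size) VALUES (?, ?)",
    ("example.avi", 700),
)
conn.execute(
    "INSERT INTO movies_movievideofile (videofile_ptr_id, movie_id) VALUES (?, ?)",
    (cur.lastrowid, 1),
)
rows = fetch_movie_files(conn)
```

Reading a MovieVideoFile therefore always involves both tables, which is why the data migration has to populate the parent and the child.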

So having said all that, I tried to keep a log of the console commands being issued. I'll interject commentary where necessary. The final code is at the bottom.

Command History

    django-admin.py startproject southtest
    manage.py startapp movies
    manage.py startapp tv
    manage.py syncdb
    manage.py startmigration movies --initial
    manage.py startmigration tv --initial
    manage.py migrate
    manage.py shell          # added some fake data...
    manage.py startapp media
    manage.py startmigration media --initial
    manage.py migrate
    # edited code, wrote new models, but left old ones intact
    manage.py startmigration movies unified-videofile --auto
    # create a new (blank) migration to hand-write data migration
    manage.py startmigration movies videofile-to-movievideofile-data
    manage.py migrate
    # edited code, wrote new models, but left old ones intact
    manage.py startmigration tv unified-videofile --auto
    # create a new (blank) migration to hand-write data migration
    manage.py startmigration tv videofile-to-movievideofile-data
    manage.py migrate
    # removed old VideoFile model from apps
    manage.py startmigration movies removed-videofile --auto
    manage.py startmigration tv removed-videofile --auto
    manage.py migrate

For the sake of space, and since the models invariably look the same in the end, I'm only going to demonstrate with the 'movies' app.

movies/models.py

    from django.db import models
    from media.models import VideoFile as BaseVideoFile

    # This model remains until the last migration, which deletes
    # it from the schema.  Note the name conflict with media.models
    class VideoFile(models.Model):
        movie = models.ForeignKey(Movie, blank=True, null=True)
        name = models.CharField(max_length=1024, blank=True)
        size = models.IntegerField(blank=True, null=True)
        ctime = models.DateTimeField(blank=True, null=True)

    class MovieVideoFile(BaseVideoFile):
        movie = models.ForeignKey(Movie, blank=True, null=True, related_name='shows')

movies/migrations/0002_unified-videofile.py (schema migration)

    from south.db import db
    from django.db import models
    from movies.models import *

    class Migration:

        def forwards(self, orm):
            # Adding model 'MovieVideoFile'
            db.create_table('movies_movievideofile', (
                ('videofile_ptr', orm['movies.movievideofile:videofile_ptr']),
                ('movie', orm['movies.movievideofile:movie']),
            ))
            db.send_create_signal('movies', ['MovieVideoFile'])

        def backwards(self, orm):
            # Deleting model 'MovieVideoFile'
            db.delete_table('movies_movievideofile')

movies/migrations/0003_videofile-to-movievideofile-data.py (data migration)

    from south.db import db
    from django.db import models
    from movies.models import *

    class Migration:

        def forwards(self, orm):
            for movie in orm['movies.videofile'].objects.all():
                new_movie = orm.MovieVideoFile.objects.create(movie = movie.movie,)
                new_movie.videofile_ptr = orm['media.VideoFile'].objects.create()

                # videofile_ptr must be created first before values can be assigned
                new_movie.videofile_ptr.name = movie.name
                new_movie.videofile_ptr.size = movie.size
                new_movie.videofile_ptr.ctime = movie.ctime
                new_movie.videofile_ptr.save()

        def backwards(self, orm):
            print 'No Backwards'
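Stripped of South, the data flow of this migration is a plain row-by-row copy: the shared fields move to the parent table and the app-specific field moves to the child table. The sketch below shows that flow with Python's built-in sqlite3 module. The table layout is my assumption based on Django's usual naming, and `copy_videofiles` is a hypothetical helper, not part of South.

```python
import sqlite3

def copy_videofiles(conn):
    """Copy every old movies_videofile row into the parent/child pair."""
    old_rows = conn.execute(
        "SELECT movie_id, name, size, ctime FROM movies_videofile ORDER BY id"
    ).fetchall()
    for movie_id, name, size, ctime in old_rows:
        # Shared fields go to the parent table...
        cur = conn.execute(
            "INSERT INTO media_videofile (name, size, ctime) VALUES (?, ?, ?)",
            (name, size, ctime),
        )
        # ...and the app-specific field goes to the child, keyed by the parent id.
        conn.execute(
            "INSERT INTO movies_movievideofile (videofile_ptr_id, movie_id) "
            "VALUES (?, ?)",
            (cur.lastrowid, movie_id),
        )
    return len(old_rows)

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE movies_videofile (
        id INTEGER PRIMARY KEY, movie_id INTEGER,
        name TEXT, size INTEGER, ctime TEXT
    );
    CREATE TABLE media_videofile (
        id INTEGER PRIMARY KEY, name TEXT, size INTEGER, ctime TEXT
    );
    CREATE TABLE movies_movievideofile (
        videofile_ptr_id INTEGER PRIMARY KEY REFERENCES media_videofile (id),
        movie_id INTEGER
    );
    INSERT INTO movies_videofile (movie_id, name, size, ctime)
    VALUES (1, 'a.avi', 100, '2009-01-01'), (2, 'b.avi', 200, NULL);
""")
copied = copy_videofiles(conn)
```

Note that the old table is left untouched here, just as in the migration above; it is only dropped by the later removed-videofile schema migration.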

South is awesome!

OK, standard disclaimer: you're dealing with live data. I've given you working code here, but please use the --db-dry-run option to test your schema. Always make a backup before trying anything, and generally be careful.
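As one way to take that backup, here is a small sketch that assumes the project uses a SQLite database file (for PostgreSQL or MySQL you would use their own dump tools instead). It relies on the `Connection.backup` method from the standard sqlite3 module, which needs Python 3.7 or newer. The file names and the `backup_sqlite` helper are hypothetical.

```python
import os
import sqlite3
import tempfile

def backup_sqlite(src_path, dst_path):
    """Copy a SQLite database file safely using sqlite3's backup API."""
    src = sqlite3.connect(src_path)
    dst = sqlite3.connect(dst_path)
    try:
        src.backup(dst)  # available in Python 3.7+
    finally:
        dst.close()
        src.close()

# Small demonstration against a throwaway database.
workdir = tempfile.mkdtemp()
live_db = os.path.join(workdir, "live.sqlite3")
backup_db = os.path.join(workdir, "backup.sqlite3")

conn = sqlite3.connect(live_db)
conn.execute("CREATE TABLE movies_videofile (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO movies_videofile (name) VALUES ('example.avi')")
conn.commit()
conn.close()

backup_sqlite(live_db, backup_db)
```

Run something like this before `manage.py migrate`, and keep the copy until you have checked the migrated data.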

COMPATIBILITY NOTICE

I'm going to keep my original message intact, but South has since changed the command manage.py startmigration into manage.py schemamigration.
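If you want to replay the command history above on a newer South, the renaming can be sketched as a small text rewrite. The helper below is hypothetical and only handles the command shapes used in this answer. It assumes that `--initial` and `--auto` runs become `schemamigration`, and that the blank, hand-written migrations map to the `datamigration` command used in the answer further down.

```python
def modernize_command(command):
    """Rewrite an old 'startmigration' command line for newer South."""
    parts = command.split()
    if "startmigration" not in parts:
        return command  # nothing to translate

    # --initial and --auto produce schema migrations; a bare name was a
    # blank migration, used in this answer for hand-written data migrations.
    is_schema = "--initial" in parts or "--auto" in parts
    new_name = "schemamigration" if is_schema else "datamigration"
    return " ".join(new_name if part == "startmigration" else part for part in parts)

history = [
    "manage.py startmigration movies --initial",
    "manage.py startmigration movies unified-videofile --auto",
    "manage.py startmigration movies videofile-to-movievideofile-data",
    "manage.py migrate",
]
modernized = [modernize_command(line) for line in history]
```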


I tried to walk through the solution outlined by T Stone, and while I think it's a superb starter and explains how things should be done, I ran into a few problems.

Mostly, I think you don't need to create the table entry for the parent class anymore, i.e. you don't need

new_movie.videofile_ptr = orm['media.VideoFile'].objects.create()

anymore. Django will now do this automatically for you. (If you have non-null fields, the line above did not work for me and gave me a database error.)

I think this is probably due to changes in Django and South. Here is a version that worked for me on Ubuntu 10.10 with Django 1.2.3 and South 0.7.1. The models are a little different, but you will get the gist:

Initial setup

post1/models.py:

    from django.db import models

    class Author(models.Model):
        first = models.CharField(max_length=30)
        last = models.CharField(max_length=30)

    class Tag(models.Model):
        name = models.CharField(max_length=30, primary_key=True)

    class Post(models.Model):
        created_on = models.DateTimeField()
        author = models.ForeignKey(Author)
        tags = models.ManyToManyField(Tag)
        title = models.CharField(max_length=128, blank=True)
        content = models.TextField(blank=True)

post2/models.py:

    from django.db import models

    class Author(models.Model):
        first = models.CharField(max_length=30)
        middle = models.CharField(max_length=30)
        last = models.CharField(max_length=30)

    class Tag(models.Model):
        name = models.CharField(max_length=30)

    class Category(models.Model):
        name = models.CharField(max_length=30)

    class Post(models.Model):
        created_on = models.DateTimeField()
        author = models.ForeignKey(Author)
        tags = models.ManyToManyField(Tag)
        title = models.CharField(max_length=128, blank=True)
        content = models.TextField(blank=True)
        extra_content = models.TextField(blank=True)
        category = models.ForeignKey(Category)

There is obviously a lot of overlap, so I wanted to factor the commonalities out into a general post model and only keep the differences in the other model classes.

New setup

genpost/models.py:

    from django.db import models

    class Author(models.Model):
        first = models.CharField(max_length=30)
        middle = models.CharField(max_length=30, blank=True)
        last = models.CharField(max_length=30)

    class Tag(models.Model):
        name = models.CharField(max_length=30, primary_key=True)

    class Post(models.Model):
        created_on = models.DateTimeField()
        author = models.ForeignKey(Author)
        tags = models.ManyToManyField(Tag)
        title = models.CharField(max_length=128, blank=True)
        content = models.TextField(blank=True)

post1/models.py:

    import genpost.models as gp

    class SimplePost(gp.Post):
        class Meta:
            proxy = True

post2/models.py:

    from django.db import models
    import genpost.models as gp

    class Category(models.Model):
        name = models.CharField(max_length=30)

    class ExtPost(gp.Post):
        extra_content = models.TextField(blank=True)
        category = models.ForeignKey(Category)

If you want to follow along, you will first need to get the initial models into South:

    $./manage.py schemamigration post1 --initial
    $./manage.py schemamigration post2 --initial
    $./manage.py migrate

Migrating the data

How to go about it? First write the new app genpost and do the initial migrations with South:

$./manage.py schemamigration genpost --initial

(I am using $ to represent the shell's prompt, so don't type that.)

Next create the new classes SimplePost and ExtPost in post1/models.py and post2/models.py respectively (don't delete the rest of the classes yet). Then create schemamigrations for these two as well:

    $./manage.py schemamigration post1 --auto
    $./manage.py schemamigration post2 --auto

Now we can apply all these migrations:

$./manage.py migrate

Let's get to the heart of the matter, migrating the data from post1 and post2 to genpost:

$./manage.py datamigration genpost post1_and_post2_to_genpost --freeze post1 --freeze post2

Then edit genpost/migrations/0002_post1_and_post2_to_genpost.py:

    from south.v2 import DataMigration
    # Assumed import: p2 is used below but is not defined in the original snippet.
    import post2.models as p2

    class Migration(DataMigration):

        def forwards(self, orm):

            #
            # Migrate common data into the new genpost models
            #
            for auth1 in orm['post1.author'].objects.all():
                new_auth = orm.Author()
                new_auth.first = auth1.first
                new_auth.last = auth1.last
                new_auth.save()

            for auth2 in orm['post2.author'].objects.all():
                new_auth = orm.Author()
                new_auth.first = auth2.first
                new_auth.middle = auth2.middle
                new_auth.last = auth2.last
                new_auth.save()

            for tag in orm['post1.tag'].objects.all():
                new_tag = orm.Tag()
                new_tag.name = tag.name
                new_tag.save()

            for tag in orm['post2.tag'].objects.all():
                new_tag = orm.Tag()
                new_tag.name = tag.name
                new_tag.save()

            for post1 in orm['post1.post'].objects.all():
                new_genpost = orm.Post()

                # Content
                new_genpost.created_on = post1.created_on
                new_genpost.title = post1.title
                new_genpost.content = post1.content

                # Foreign keys
                new_genpost.author = orm['genpost.author'].objects.filter(
                        first=post1.author.first, last=post1.author.last)[0]

                new_genpost.save() # Needed for M2M updates
                for tag in post1.tags.all():
                    new_genpost.tags.add(
                            orm['genpost.tag'].objects.get(name=tag.name))

                new_genpost.save()
                post1.delete()

            for post2 in orm['post2.post'].objects.all():
                new_extpost = p2.ExtPost()
                new_extpost.created_on = post2.created_on
                new_extpost.title = post2.title
                new_extpost.content = post2.content

                # Foreign keys
                new_extpost.author_id = orm['genpost.author'].objects.filter(
                        first=post2.author.first,
                        middle=post2.author.middle,
                        last=post2.author.last)[0].id

                new_extpost.extra_content = post2.extra_content
                new_extpost.category_id = post2.category_id

                # M2M fields
                new_extpost.save()
                for tag in post2.tags.all():
                    new_extpost.tags.add(tag.name) # name is primary key

                new_extpost.save()
                post2.delete()

            # Get rid of author and tags in post1 and post2
            orm['post1.author'].objects.all().delete()
            orm['post1.tag'].objects.all().delete()
            orm['post2.author'].objects.all().delete()
            orm['post2.tag'].objects.all().delete()

        def backwards(self, orm):
            raise RuntimeError("No backwards.")
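One caveat with the author lookup above: it matches on the name fields and takes the first hit, so posts by two different authors who share a name would all end up pointing at the same new author. If that matters for your data, one alternative is to remember which new id replaced each old id while copying. The sketch below shows the idea in plain Python, with dicts standing in for model rows. All names here are hypothetical and none of this is South or Django API.

```python
def copy_authors(old_authors, next_id=1):
    """Copy author rows and remember which new id replaces each old id."""
    new_authors, id_map = [], {}
    for old in old_authors:
        new = {
            "id": next_id,
            "first": old["first"],
            "middle": old.get("middle", ""),  # post1 authors have no middle name
            "last": old["last"],
        }
        new_authors.append(new)
        id_map[old["id"]] = next_id
        next_id += 1
    return new_authors, id_map

def repoint_posts(old_posts, id_map):
    """Return copies of the posts whose author_id uses the new ids."""
    return [dict(post, author_id=id_map[post["author_id"]]) for post in old_posts]

old_authors = [
    {"id": 7, "first": "Ann", "last": "Lee"},
    {"id": 9, "first": "Ann", "last": "Lee"},  # same name, different person
]
old_posts = [
    {"title": "first", "author_id": 7},
    {"title": "second", "author_id": 9},
]
new_authors, id_map = copy_authors(old_authors)
new_posts = repoint_posts(old_posts, id_map)
```

In a real migration you would keep one such map per source app (post1 and post2), since their old ids can overlap.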

Now apply these migrations:

$./manage.py migrate

Next you can delete the now redundant parts from post1/models.py and post2/models.py and then create schemamigrations to update the tables to the new state:

    $./manage.py schemamigration post1 --auto
    $./manage.py schemamigration post2 --auto
    $./manage.py migrate

And that should be it! Hopefully it all works and you have refactored your models.


Abstract Model

    class VideoFile(models.Model):
        name = models.CharField(max_length=1024, blank=True)
        size = models.IntegerField(blank=True, null=True)
        ctime = models.DateTimeField(blank=True, null=True)

        class Meta:
            abstract = True
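With an abstract base there is no shared parent table: each concrete child model gets its own table containing the inherited columns, so no join or pointer field is involved. Here is a rough sqlite3 sketch of the resulting layout. The table names and the `movie_id` / `show_id` columns are my assumptions for illustration.

```python
import sqlite3

# Columns that the abstract VideoFile model would contribute to each child.
SHARED_COLUMNS = "name TEXT NOT NULL DEFAULT '', size INTEGER, ctime TEXT"

def create_concrete_table(conn, table, extra_columns):
    """Create one self-contained table per concrete child model."""
    conn.execute(
        "CREATE TABLE {0} (id INTEGER PRIMARY KEY, {1}, {2})".format(
            table, SHARED_COLUMNS, extra_columns
        )
    )

conn = sqlite3.connect(":memory:")
create_concrete_table(conn, "movies_videofile", "movie_id INTEGER")
create_concrete_table(conn, "tv_videofile", "show_id INTEGER")

# Only the two child tables exist; the abstract base has no table of its own.
tables = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
]
movie_columns = [row[1] for row in conn.execute("PRAGMA table_info(movies_videofile)")]
```

The trade-off is that you cannot query all video files across apps in one go, which is what the multi-table approach in the other answers gives you.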

Maybe a generic relation will be useful for you too.