Django's select_related

How does Django's select_related works and when should you use it?

What problem does it solve?

Django's select_related() solves the problem when you want to access related fields on a queryset. Assume the models:

from django.db import models


class Author(models.Model):
    name = models.CharField(max_length=250)


class Book(models.Model):
    name = models.CharField(max_length=150)
    author = models.ForeignKey(Author, on_delete=models.CASCADE)

Lets say we want to select all books and subsequently want to loop over all the books and print the authors:

>>> db.reset_queries()
>>> books = Book.objects.all()
>>> for book in books:
...     print(book.author)
... 
(0.000) SELECT "books_book"."id", "books_book"."name", "books_book"."author_id" FROM "books_book"; args=()
(0.000) SELECT "books_author"."id", "books_author"."name" FROM "books_author" WHERE "books_author"."id" = 18; args=(18,)
Author object (18)
(0.000) SELECT "books_author"."id", "books_author"."name" FROM "books_author" WHERE "books_author"."id" = 17; args=(17,)
Author object (17)
(0.000) SELECT "books_author"."id", "books_author"."name" FROM "books_author" WHERE "books_author"."id" = 18; args=(18,)
Author object (18)
(0.000) SELECT "books_author"."id", "books_author"."name" FROM "books_author" WHERE "books_author"."id" = 20; args=(20,)
Author object (20)
(0.000) SELECT "books_author"."id", "books_author"."name" FROM "books_author" WHERE "books_author"."id" = 17; args=(17,)
Author object (17)
>>> print(len(db.connection.queries))
6
>>> 

A total of 6 queries are run (we had 5 books and then one query for the initial select *).
Note that the original .all() query had the SQL:

SELECT 
   "books_book"."id", 
   "books_book"."name", 
   "books_book"."author_id" 
FROM 
   "books_book"

select_related to the rescue

If instead we run:

>>> db.reset_queries()
>>> books = Book.objects.select_related('author').all()
>>> for book in books:
...     print(book.author.name)
... 
(0.000) SELECT "books_book"."id", "books_book"."name", "books_book"."author_id", "books_author"."id", "books_author"."name" FROM "books_book" INNER JOIN "books_author" ON ("books_book"."author_id" = "books_author"."id"); args=()
mary poppins
Tom clancy
mary poppins
kass larasdf
Tom clancy
>>> print(len(db.connection.queries))
1
>>> 

Only one query was run. This is because the select_related() creates an INNER JOIN on the related table. The SQL becomes:

SELECT 
   "books_book"."id", 
   "books_book"."name", 
   "books_book"."author_id", 
   "books_author"."id", 
   "books_author"."name" 
FROM 
   "books_book" INNER JOIN "books_author" ON ("books_book"."author_id")

Chaining order doesn't matter

For the select_related() it doesn't matter which order you do the chaining in. The following two queries are identical:

books = Book.objects.select_related('author').all()
books = Book.objects.all().select_related('author')

What should be in the select_related(XYZ)?

This should be the name of the related model from the original model. If no 'related_name' has been specified, it will simply be the lower-case version of the model name.

Related fields

Lets extend the model with a Nationality table:

from django.db import models


class Nationality(models.Model):
    country = models.CharField(max_length=150, default="DK")


class Author(models.Model):
    name = models.CharField(max_length=250)
    country = models.ForeignKey(Nationality, related_name='authors', on_delete=models.CASCADE, blank=True, default=None)

class Book(models.Model):
    name = models.CharField(max_length=150)
    author = models.ForeignKey(Author, on_delete=models.CASCADE)

If we want to include the related 'Nationality', we can do:

>>> for book in books:
...     print(book.author.country.country)
... 
(0.001) SELECT "books_book"."id", "books_book"."name", "books_book"."author_id", "books_author"."id", "books_author"."name", "books_author"."country_id", "books_nationality"."id", "books_nationality"."country" FROM "books_book" INNER JOIN "books_author" ON ("books_book"."author_id" = "books_author"."id") LEFT OUTER JOIN "books_nationality" ON ("books_author"."country_id" = "books_nationality"."id"); args=()
Denmark
Denmark
Denmark
Denmark
Denmark
>>> print(len(db.connection.queries))
1

Again, only one query. The SQL is:

SELECT 
   "books_book"."id", 
   "books_book"."name", 
   "books_book"."author_id", 
   "books_author"."id", 
   "books_author"."name", 
   "books_author"."country_id", 
   "books_nationality"."id", 
   "books_nationality"."country" 
FROM 
   "books_book" INNER JOIN "books_author" ON ("books_book"."author_id" = "books_author"."id") 
   LEFT OUTER JOIN "books_nationality" ON ("books_author"."country_id" = "books_nationality"."id")

Setup shell

>>> from books.models import Book, Author, Nationality
>>> from django.db import connection, transaction
>>> import logging
>>> from django import db
>>> l = logging.getLogger('django.db.backends')
>>> l.setLevel(logging.DEBUG)
>>> l.addHandler(logging.StreamHandler())

In short (oneliner):

from books.models import Book, Author, Nationality;from django.db import connection, transaction;import logging;l = logging.getLogger('django.db.backends');l.setLevel(logging.DEBUG);l.addHandler(logging.StreamHandler());from django import db

Seed the db

from random import choice

from django.shortcuts import render
from django.http import HttpResponse

from books.models import Book, Author, Nationality


# Create your views here.
def index(request):
    return HttpResponse("Hello, world. You're at the polls index.")


def seed_db(request):
    authors = ['Tom clancy', 'mary poppins', 'marcus aur', 'kass larasdf']
    books = ['alpha', 'beta', 'gamma', 'epsilon', 'zeta']
    countries = ["Denmark", "Hong Kong"]

    author_list = []
    country_list = []

    for country in countries:
        nationality = Nationality(country=choice(countries))
        nationality.save()
        country_list.append(nationality)

    for author in authors:
        author = Author(name=author, country=choice(country_list))
        author.save()
        author_list.append(author)

    for book in books:
        Book(name=book, author=choice(author_list)).save()



    return HttpResponse("The db was seeded")