How does Django's select_related works and when should you use it?
What problem does it solve?
Django's select_related() solves the problem when you want to access related fields on a queryset. Assume the models:
from django.db import models
class Author(models.Model):
name = models.CharField(max_length=250)
class Book(models.Model):
name = models.CharField(max_length=150)
author = models.ForeignKey(Author, on_delete=models.CASCADE)
Lets say we want to select all books and subsequently want to loop over all the books and print the authors:
>>> db.reset_queries()
>>> books = Book.objects.all()
>>> for book in books:
... print(book.author)
...
(0.000) SELECT "books_book"."id", "books_book"."name", "books_book"."author_id" FROM "books_book"; args=()
(0.000) SELECT "books_author"."id", "books_author"."name" FROM "books_author" WHERE "books_author"."id" = 18; args=(18,)
Author object (18)
(0.000) SELECT "books_author"."id", "books_author"."name" FROM "books_author" WHERE "books_author"."id" = 17; args=(17,)
Author object (17)
(0.000) SELECT "books_author"."id", "books_author"."name" FROM "books_author" WHERE "books_author"."id" = 18; args=(18,)
Author object (18)
(0.000) SELECT "books_author"."id", "books_author"."name" FROM "books_author" WHERE "books_author"."id" = 20; args=(20,)
Author object (20)
(0.000) SELECT "books_author"."id", "books_author"."name" FROM "books_author" WHERE "books_author"."id" = 17; args=(17,)
Author object (17)
>>> print(len(db.connection.queries))
6
>>>
A total of 6 queries are run (we had 5 books and then one query for the initial select *).
Note that the original .all() query had the SQL:
SELECT
"books_book"."id",
"books_book"."name",
"books_book"."author_id"
FROM
"books_book"
select_related to the rescue
If instead we run:
>>> db.reset_queries()
>>> books = Book.objects.select_related('author').all()
>>> for book in books:
... print(book.author.name)
...
(0.000) SELECT "books_book"."id", "books_book"."name", "books_book"."author_id", "books_author"."id", "books_author"."name" FROM "books_book" INNER JOIN "books_author" ON ("books_book"."author_id" = "books_author"."id"); args=()
mary poppins
Tom clancy
mary poppins
kass larasdf
Tom clancy
>>> print(len(db.connection.queries))
1
>>>
Only one query was run. This is because the select_related()
creates an INNER JOIN on the related table. The SQL becomes:
SELECT
"books_book"."id",
"books_book"."name",
"books_book"."author_id",
"books_author"."id",
"books_author"."name"
FROM
"books_book" INNER JOIN "books_author" ON ("books_book"."author_id")
Chaining order doesn't matter
For the select_related() it doesn't matter which order you do the chaining in. The following two queries are identical:
books = Book.objects.select_related('author').all()
books = Book.objects.all().select_related('author')
What should be in the select_related(XYZ)?
This should be the name of the related model from the original model. If no 'related_name' has been specified, it will simply be the lower-case version of the model name.
Related fields
Lets extend the model with a Nationality table:
from django.db import models
class Nationality(models.Model):
country = models.CharField(max_length=150, default="DK")
class Author(models.Model):
name = models.CharField(max_length=250)
country = models.ForeignKey(Nationality, related_name='authors', on_delete=models.CASCADE, blank=True, default=None)
class Book(models.Model):
name = models.CharField(max_length=150)
author = models.ForeignKey(Author, on_delete=models.CASCADE)
If we want to include the related 'Nationality', we can do:
>>> for book in books:
... print(book.author.country.country)
...
(0.001) SELECT "books_book"."id", "books_book"."name", "books_book"."author_id", "books_author"."id", "books_author"."name", "books_author"."country_id", "books_nationality"."id", "books_nationality"."country" FROM "books_book" INNER JOIN "books_author" ON ("books_book"."author_id" = "books_author"."id") LEFT OUTER JOIN "books_nationality" ON ("books_author"."country_id" = "books_nationality"."id"); args=()
Denmark
Denmark
Denmark
Denmark
Denmark
>>> print(len(db.connection.queries))
1
Again, only one query. The SQL is:
SELECT
"books_book"."id",
"books_book"."name",
"books_book"."author_id",
"books_author"."id",
"books_author"."name",
"books_author"."country_id",
"books_nationality"."id",
"books_nationality"."country"
FROM
"books_book" INNER JOIN "books_author" ON ("books_book"."author_id" = "books_author"."id")
LEFT OUTER JOIN "books_nationality" ON ("books_author"."country_id" = "books_nationality"."id")
Setup shell
>>> from books.models import Book, Author, Nationality
>>> from django.db import connection, transaction
>>> import logging
>>> from django import db
>>> l = logging.getLogger('django.db.backends')
>>> l.setLevel(logging.DEBUG)
>>> l.addHandler(logging.StreamHandler())
In short (oneliner):
from books.models import Book, Author, Nationality;from django.db import connection, transaction;import logging;l = logging.getLogger('django.db.backends');l.setLevel(logging.DEBUG);l.addHandler(logging.StreamHandler());from django import db
Seed the db
from random import choice
from django.shortcuts import render
from django.http import HttpResponse
from books.models import Book, Author, Nationality
# Create your views here.
def index(request):
return HttpResponse("Hello, world. You're at the polls index.")
def seed_db(request):
authors = ['Tom clancy', 'mary poppins', 'marcus aur', 'kass larasdf']
books = ['alpha', 'beta', 'gamma', 'epsilon', 'zeta']
countries = ["Denmark", "Hong Kong"]
author_list = []
country_list = []
for country in countries:
nationality = Nationality(country=choice(countries))
nationality.save()
country_list.append(nationality)
for author in authors:
author = Author(name=author, country=choice(country_list))
author.save()
author_list.append(author)
for book in books:
Book(name=book, author=choice(author_list)).save()
return HttpResponse("The db was seeded")