About This Talk
The main building blocks of Django REST Framework (DRF) projects (views, serializers, managers, and querysets) allow developers to implement complex APIs with very little code repetition while reusing built-ins for essential API features. DRF guides developers to architect the project in a “Don’t Repeat Yourself” (DRY) way by using inheritance, nesting, annotations, and model- or app-based separation of concerns. They can group code in viewsets, inherit from base classes, reuse the same serializer across views, nest serializers inside others, compute fields dynamically with ORM annotations, select or prefetch relations for performance, organize custom behavior with managers and querysets, and much more. All this DRYness is great because it integrates well with common web API concerns like permissions, pagination, and filters.
In our multi-year experience building and maintaining several large Django projects, using those built-in concepts really does yield DRY code, but overusing them also results in a codebase full of complicated bugs and performance issues, especially related to ORM usage. View, serializer, and model methods are often heavily coupled to querysets’ annotations and prefetches, but those methods are spread across the codebase. Django’s default queryset laziness, together with heavy use of inheritance and nesting, is the perfect recipe for a codebase where N+1 issues and heavy, unnecessary queries can appear in any line of code after a slightly careless change.
For example, to prevent N+1 issues, if a serializer method field uses a filtered relationship, you must ensure this relationship is prefetched in all querysets related to that serializer. But this serializer can be nested into others, so you must now be careful to change all queryset references in seemingly unrelated views. Other sorts of “change amplification” situations also happen in large DRF codebases with heavy ORM usage. Requiring developers to be careful while navigating through lots of files to make a change isn’t reasonable. Maybe being DRY is leading to the wrong abstraction?
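The N+1 situation described above can be reproduced without Django at all. The sketch below is only an illustration under stated assumptions: it uses the standard library's `sqlite3` instead of the Django ORM, and the `author`/`book` tables, the `CountingDB` wrapper, and both `serialize_*` functions are hypothetical names invented for this example. The "naive" function plays the role of a serializer method field that filters a relationship per object; the "prefetched" one loads the filtered relationship once for all objects.

```python
import sqlite3


class CountingDB:
    """Tiny sqlite3 wrapper (hypothetical) that counts the queries it runs."""

    def __init__(self):
        self.conn = sqlite3.connect(":memory:")
        self.query_count = 0

    def query(self, sql, params=()):
        self.query_count += 1
        return self.conn.execute(sql, params).fetchall()


def seed(db):
    # Illustrative schema and data; not taken from any real project.
    db.conn.executescript(
        """
        CREATE TABLE author (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE book (
            id INTEGER PRIMARY KEY,
            author_id INTEGER REFERENCES author(id),
            title TEXT,
            published INTEGER
        );
        INSERT INTO author VALUES (1, 'Ana'), (2, 'Bruno'), (3, 'Carla');
        INSERT INTO book VALUES
            (1, 1, 'A1', 1), (2, 1, 'A2', 0),
            (3, 2, 'B1', 1), (4, 3, 'C1', 1), (5, 3, 'C2', 1);
        """
    )


def serialize_naive(db):
    # One query for the authors, then one filtered query per author (N+1).
    authors = db.query("SELECT id, name FROM author ORDER BY id")
    result = []
    for author_id, name in authors:
        rows = db.query(
            "SELECT title FROM book "
            "WHERE author_id = ? AND published = 1 ORDER BY id",
            (author_id,),
        )
        result.append({"name": name, "published_titles": [t for (t,) in rows]})
    return result


def serialize_prefetched(db):
    # Two queries in total: the authors, then every matching book at once.
    authors = db.query("SELECT id, name FROM author ORDER BY id")
    ids = [author_id for author_id, _ in authors]
    placeholders = ",".join("?" for _ in ids)
    rows = db.query(
        "SELECT author_id, title FROM book "
        f"WHERE published = 1 AND author_id IN ({placeholders}) ORDER BY id",
        ids,
    )
    titles_by_author = {}
    for author_id, title in rows:
        titles_by_author.setdefault(author_id, []).append(title)
    return [
        {"name": name, "published_titles": titles_by_author.get(author_id, [])}
        for author_id, name in authors
    ]
```

Both functions return the same data, but the naive one issues one query per author on top of the initial one. In a DRF codebase the per-object query is typically hidden inside a serializer or model method, far from the view that builds the queryset, which is why the missing prefetch is easy to overlook.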
It’s possible to design a better architecture that’s optimized both for enabling changes and for avoiding performance regressions. With a new custom data prefetching layer that stays compatible with serializers and views, we can respect DRY while preserving performance and maintainability. That’s what we’ve been doing in our Django projects, and we will share what we have learned in this talk. Hopefully, it applies to other maintainers of complex DRF projects.
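To make the idea of a prefetching layer more concrete, here is a minimal, framework-free sketch of one way such a layer could be shaped. It is not the implementation presented in the talk: `Field`, `Spec`, and `required_prefetches` are hypothetical names, and the only assumption borrowed from Django is its double-underscore convention for relation lookup paths. Each field declares the relations it needs next to its own definition, and the requirements of nested specs are collected automatically.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Field:
    """A serialized field plus the relations it needs loaded up front."""

    name: str
    prefetches: tuple = ()


@dataclass
class Spec:
    """Hypothetical serializer description: fields plus nested specs by relation."""

    fields: list
    nested: dict = field(default_factory=dict)

    def required_prefetches(self, prefix=""):
        """Collect the lookup paths needed by this spec and all nested specs."""
        paths = []
        for declared in self.fields:
            paths.extend(prefix + path for path in declared.prefetches)
        for relation, child in self.nested.items():
            # The nested relation itself must be loaded, then its own needs.
            paths.append(prefix + relation)
            paths.extend(child.required_prefetches(prefix + relation + "__"))
        # Remove duplicates while keeping first-seen order.
        return list(dict.fromkeys(paths))


book_spec = Spec(
    fields=[Field("title"), Field("review_count", prefetches=("reviews",))]
)
author_spec = Spec(fields=[Field("name")], nested={"books": book_spec})
publisher_spec = Spec(fields=[Field("name")], nested={"authors": author_spec})

print(publisher_spec.required_prefetches())
# ['authors', 'authors__books', 'authors__books__reviews']
```

In a Django project, a list like this could plausibly be passed to `QuerySet.prefetch_related(*paths)` by the view. The point of the design is that adding a field that needs a new relation only touches that field's declaration, and every spec that nests it picks up the requirement without further edits.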
Here’s the planned outline:
- [3 minutes] Who am I
- [5 minutes] DRY: the good side
- [5 minutes] Example: resulting architecture when following DRY in a complex Django REST Framework project
- [10 minutes] Example: when DRY leads to Change Amplification
  - Serializers vs. queryset annotations and prefetches
  - Making things worse with nesting
  - When model logic is coupled to prefetches
  - New code and unexpected new queries, the fragility of prefetches
- [7 minutes] Common solutions that didn’t work for us
  - You can prevent queries, but you can’t prevent nesting
  - You can build the queryset in serializer, but you’ll repeat yourself
  - Fat models and querysets will have cross-cutting concerns
  - Auto-prefetching libraries can’t handle all cases
  - Tests can’t test the future
- [10 minutes] Solving with a data prefetching layer
  - What it looks like
  - The real DRY: gathering together code that changes together
  - How each field contributes to the queryset
  - How to keep it compatible with DRF views and serializers
  - Dealing with cross-model concerns and nesting
- [5 minutes] Questions
Flávio Juvenal (he/him)
I’m a software engineer from Brazil and Chief Scientist at Vinta Software (www.vinta.com.br). I’ve been building web products with Python and Django for the last 12 years. I love drinking medium and light roast coffee and visiting museums around the world.