Optimizing data loading
One way to boost our application's performance is to optimize how it loads data from the database. This is not complex to implement, and ORM solutions make it much simpler to get up and running.
Optimizing data loading comes down to a few rules. So, let's take a look at what these are, and how they can prove to be advantageous:
- Defer the loading of data that can be skipped: When we know that we won't require all the data that we are fetching from the database, we can safely defer the loading of that data by utilizing the lazy loading technique. For example, if we wanted to send an email to all the users of our BugZot application who have more than 10 bugs pending against them and who are not administrators, we could defer the loading of the role relationship, so that a role is fetched only for the users who actually reach the role check. With a big database and a lot of users, this can significantly reduce the response time of the application, as well as its overall memory footprint, at the expense of a few extra queries, which might be a desirable trade-off to make.
- Load data early if it is going to be used: In complete contrast to the first point, if we know that the application will use the data no matter the situation, then it makes sense to load it in one shot rather than emitting extra queries to load it on demand. For example, if we wanted to promote all the administrators to super administrators, we know we will be accessing the role field of every user. In that case, it doesn't make sense to have the application lazy load the role field. We can simply ask the application to eager load the required data so that it doesn't have to wait for the data to be loaded on demand. This type of optimization comes at the cost of increased memory usage and a slower initial response, but provides the advantage of fast execution once all the data has been loaded.
- Do not load data that won't be required: There are times when some of the relationships an object maps to are not required at all during processing. In these situations, we can save a lot of memory and time by not loading those related objects at all. This is fairly easy to achieve in SQLAlchemy by setting lazy='noload' on the relationship. One example of such a use case is updating the last_active time of a user in the database. In this case, we know that we don't need to validate anything related to the role of the user, and hence we can skip the loading of the role altogether.
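To make the three rules concrete, here is a minimal sketch of how each one could be expressed with the lazy argument of relationship(). The Role and User models, their table names, and the in-memory SQLite database are assumptions made for illustration and are not BugZot's actual schema. The role_eager and role_skipped attributes are hypothetical and exist only so that the three strategies can be compared side by side on one model.

```python
from sqlalchemy import Column, ForeignKey, Integer, String, create_engine
from sqlalchemy.orm import Session, declarative_base, relationship

Base = declarative_base()


class Role(Base):
    # Hypothetical model, not BugZot's real schema.
    __tablename__ = "roles"
    id = Column(Integer, primary_key=True)
    name = Column(String(50), nullable=False)


class User(Base):
    # Hypothetical model, not BugZot's real schema.
    __tablename__ = "users"
    id = Column(Integer, primary_key=True)
    username = Column(String(50), nullable=False)
    role_id = Column(Integer, ForeignKey("roles.id"))

    # Rule 1: lazy loading, the role is queried only when first accessed.
    role = relationship("Role", lazy="select")
    # Rule 2: eager loading, the role arrives with the user through a JOIN.
    role_eager = relationship("Role", lazy="joined", viewonly=True)
    # Rule 3: no loading, the attribute is never populated from the database.
    role_skipped = relationship("Role", lazy="noload", viewonly=True)


# An in-memory SQLite database is assumed purely for demonstration.
engine = create_engine("sqlite:///:memory:")
Base.metadata.create_all(engine)

with Session(engine) as session:
    session.add(User(username="alice", role=Role(name="admin")))
    session.commit()

# Read the user back in a fresh session so nothing is already cached.
with Session(engine) as session:
    user = session.query(User).filter_by(username="alice").one()
    lazy_role = user.role.name          # loaded on first access
    eager_role = user.role_eager.name   # loaded together with the user
    skipped_role = user.role_skipped    # never loaded, so it stays None
```

In a real model we would normally declare the relationship once and pick a single default strategy for it. Note that recent SQLAlchemy releases treat the noload strategy as legacy, so it is worth checking the documentation for the version in use.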
These effects are hard to achieve if the loading technique is fixed once and for all in the model definition, because different operations on the same model call for different strategies. So, SQLAlchemy also provides a set of methods that are aptly named after the technique they use to load the data from the database: lazyload() for lazy loading, joinedload() for joined eager loading, subqueryload() for subquery eager loading, and noload() for no loading. We will explain these in later chapters, including how they can be used in a real application.
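As a brief preview, the following is a minimal sketch of choosing the loading technique per query rather than per model. The Role and User models, the sample rows, and the fetch_role_names() helper are assumptions made for illustration, and the statement counter is only there to make the trade-off between the strategies visible.

```python
from sqlalchemy import Column, ForeignKey, Integer, String, create_engine, event
from sqlalchemy.orm import (
    Session,
    declarative_base,
    joinedload,
    lazyload,
    noload,
    relationship,
)

Base = declarative_base()


class Role(Base):
    # Hypothetical model, not BugZot's real schema.
    __tablename__ = "roles"
    id = Column(Integer, primary_key=True)
    name = Column(String(50), nullable=False)


class User(Base):
    # Hypothetical model, not BugZot's real schema.
    __tablename__ = "users"
    id = Column(Integer, primary_key=True)
    username = Column(String(50), nullable=False)
    role_id = Column(Integer, ForeignKey("roles.id"))
    # No strategy is fixed here; each query decides how to load the role.
    role = relationship("Role")


# An in-memory SQLite database is assumed purely for demonstration.
engine = create_engine("sqlite:///:memory:")
Base.metadata.create_all(engine)

with Session(engine) as session:
    session.add_all([
        User(username="alice", role=Role(name="admin")),
        User(username="bob", role=Role(name="user")),
    ])
    session.commit()

statements = []


@event.listens_for(engine, "before_cursor_execute")
def record_statement(conn, cursor, statement, parameters, context, executemany):
    # Remember every SQL statement sent to the database.
    statements.append(statement)


def fetch_role_names(option):
    """Return (role names, number of SQL statements) for one loader option."""
    statements.clear()
    with Session(engine) as session:  # fresh session, so nothing is cached
        users = session.query(User).options(option).order_by(User.id).all()
        names = [user.role.name if user.role else None for user in users]
    return names, len(statements)


# Lazy loading: roles are fetched by extra queries when they are accessed.
lazy_names, lazy_queries = fetch_role_names(lazyload(User.role))
# Joined eager loading: users and roles are fetched together using a JOIN.
joined_names, joined_queries = fetch_role_names(joinedload(User.role))
# No loading: the role attribute is left empty and never queried.
skipped_names, skipped_queries = fetch_role_names(noload(User.role))
```

Comparing lazy_queries with joined_queries shows the extra statements that lazy loading emits when every role ends up being accessed, while skipped_names contains only None values because the roles were never fetched.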
Now that we're familiar with loading techniques and how we can use them to our advantage, let's take a look at one of the final topics of this chapter: how we can use caching to speed up our application's response times and avoid querying the database again and again, which helps when the application is performing a lot of data-intensive operations.