Speeding Up Deep Learning Development with Cache (in PyTorch)

What keeps you waiting?

When developing deep learning code, it is very important to have an efficient way of reading data from the dataset. Sometimes your data is spread across thousands of files, or it simply resides in a database. Either way, loading the data can take a while. One might argue that a few minutes of loading is negligible compared to the training time, which may add up to several hours or even days! But while developing and debugging, you rerun the program many times, and you pay those few minutes on every single run.

Cache is the Answer!

In a nutshell, my solution is to store the data in a single cache file so that your program reads from a single point on the data storage. The alternatives are reading from thousands of points on storage (i.e., reading from multiple files), which takes much longer, or querying the same data from a database every single time, possibly with some additional intermediate processing. Now that we have a general idea of the solution, let's get our hands on some code!

Show me the code!

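The code for implementing this logic is straightforward. The first step is a dataset class that inherits from PyTorch's abstract Dataset class. On its first use it reads the data the slow way and writes it to a cache file. On later runs it loads everything from that cache file instead.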
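
As a rough sketch of the idea (not necessarily the exact code from my project), the class could look like the one below. It assumes each sample is stored as a separate pickle file in one directory. The class name CachedFilesDataset, its parameters, and the pickle-based storage are all illustrative assumptions.

```python
import os
import pickle

try:
    from torch.utils.data import Dataset
except ImportError:  # keeps the sketch runnable where PyTorch is not installed
    Dataset = object


class CachedFilesDataset(Dataset):
    """Hypothetical dataset: one pickled sample per file in `data_dir`.

    Keep `cache_path` outside `data_dir`, otherwise the cache file itself
    would be picked up as a sample on the slow path.
    """

    def __init__(self, data_dir, cache_path, use_cache=True):
        if use_cache and os.path.exists(cache_path):
            # Fast path: a single read from a single file.
            with open(cache_path, "rb") as f:
                self.samples = pickle.load(f)
        else:
            # Slow path: open every file, one by one.
            self.samples = self._load_from_files(data_dir)
            if use_cache:
                # Store everything in one file for the next run.
                with open(cache_path, "wb") as f:
                    pickle.dump(self.samples, f)

    @staticmethod
    def _load_from_files(data_dir):
        samples = []
        for name in sorted(os.listdir(data_dir)):
            with open(os.path.join(data_dir, name), "rb") as f:
                samples.append(pickle.load(f))
        return samples

    def __len__(self):
        return len(self.samples)

    def __getitem__(self, index):
        return self.samples[index]
```

Note that this sketch never invalidates the cache. If the source files change, delete the cache file (or pass use_cache=False) so the data is read from scratch again.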

The Code In Action!

To see the difference, I read the same data with both approaches (i.e., with and without the cache) and compare them. The primary method of loading data here is reading from thousands of separate files, which is the case in my own research. Since my research uses the MIMIC-III dataset, which requires permission to access, I have provided a fake data generator that produces data similar in format to the real data.
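
A self-contained sketch of such a comparison is shown below. The fake data format here (a list of random floats plus a binary label per file) is made up for illustration and is not the MIMIC-III format. The function names and the file count are also assumptions. It times only the loading step, so it does not need PyTorch.

```python
import os
import pickle
import random
import tempfile
import time


def generate_fake_files(data_dir, n_files=2000, n_values=50, seed=0):
    """Write `n_files` small pickle files of random numbers (made-up format)."""
    rng = random.Random(seed)
    os.makedirs(data_dir, exist_ok=True)
    for i in range(n_files):
        sample = {
            "values": [rng.random() for _ in range(n_values)],
            "label": rng.randint(0, 1),
        }
        with open(os.path.join(data_dir, f"sample_{i:05d}.pkl"), "wb") as f:
            pickle.dump(sample, f)


def load_from_files(data_dir):
    """Without cache: open and read every file separately."""
    samples = []
    for name in sorted(os.listdir(data_dir)):
        with open(os.path.join(data_dir, name), "rb") as f:
            samples.append(pickle.load(f))
    return samples


def load_from_cache(cache_path):
    """With cache: one read from one file."""
    with open(cache_path, "rb") as f:
        return pickle.load(f)


def compare(n_files=2000):
    """Return (data_is_identical, seconds_from_files, seconds_from_cache)."""
    with tempfile.TemporaryDirectory() as root:
        data_dir = os.path.join(root, "data")
        cache_path = os.path.join(root, "cache.pkl")
        generate_fake_files(data_dir, n_files=n_files)

        start = time.perf_counter()
        from_files = load_from_files(data_dir)
        t_files = time.perf_counter() - start

        # Build the cache once from the data we just loaded.
        with open(cache_path, "wb") as f:
            pickle.dump(from_files, f)

        start = time.perf_counter()
        from_cache = load_from_cache(cache_path)
        t_cache = time.perf_counter() - start

    return from_files == from_cache, t_files, t_cache


if __name__ == "__main__":
    same, t_files, t_cache = compare()
    print(f"identical data: {same}")
    print(f"separate files: {t_files:.4f} s")
    print(f"single cache:   {t_cache:.4f} s")
```

The exact numbers depend on your disk and file system. Keep in mind that freshly written files are likely still in the operating system's page cache, so the gap measured by this sketch can be smaller than what you would see when reading a large real dataset from a cold disk.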

Summary

In this blog post, we reviewed how using a cache can speed up loading a dataset, thus making life easier while developing and debugging a deep learning model in PyTorch.


Master’s in CS, with concentration on Machine Learning. I am into AI, Robotics and more. I enjoy nature, photography and digital art. Read my tweets @tb_cyrus.

Kourosh T. Baghaei
