
Sequential vs Random Read/Write Performance in Storage

Sequential and random read/write performance is critical to understanding digital storage. These concepts apply to both consumer and enterprise storage devices, and you have probably seen them in your SSD or hard drive specifications. On any given drive, sequential performance is always higher than random performance.

The main difference between random and sequential data is in their access patterns. Sequential data is stored and accessed in contiguous blocks, which puts less overhead on the storage controller. It is easy to store a big file in contiguous blocks without worrying about tracking scattered locations, and it can be read back the same way: we only need to know where the data starts and ends.

Random data doesn't work that way. Because the data arrives as different file types, sizes, and portions, the pieces end up separated from each other. When reading, the data may come from large chunks but in different portions, so "random" really describes how the data is accessed from its storage locations. This adds extra layers of data management: the controller has to work harder to track locations and prepare different storage regions.

Random data is much more complicated to store and access. However, it is the most important criterion for software performance, database management, OS functions, and almost everything we do on our computers daily. Let's talk about both in detail.

What is Sequential Data?


Sequential data is data where order matters. For example, a video file should be stored as one continuous sequence; shuffling that sequence would corrupt the stored stream. Sequential data generally consists of large files, or runs of files of the same type. To play an audio file, the data must be read in series; there is no need to store it in, or fetch it from, scattered locations, so it can sit in adjacent blocks or memory cells.

Sequential data is laid out on the storage device in a linear order. It is stored and accessed in sequence, one piece after another, without jumping between storage locations. Because the access pattern is linear, it allows a continuous flow of data that can be scaled with higher bandwidth. A good example: when you watch a video, the frames are read from the drive sequentially, start to finish, for smooth playback.

If data is stored sequentially, it will usually be read sequentially as well, because the order of the data over time is what matters most. It is much better to store a video file in adjacent locations than to cut it into parts scattered across the drive; this keeps both storing and retrieving the data simple and fast.

Streaming video and audio files, running backups, copying images and videos, and similar tasks are all sequential workloads. In other words, order is everything in sequential data: if there is no order in the incoming or outgoing data, it isn't sequential at all.
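
To make the pattern concrete, here is a minimal sketch in Python of a sequential read. The file name and chunk size are illustrative assumptions, but the shape of the loop is what every sequential workload looks like: each read picks up exactly where the last one ended.

```python
# Minimal sketch of a sequential read: the file is consumed from start to
# finish in fixed-size chunks, so the drive never jumps between locations.
# "video.mp4" and the 1 MiB chunk size are illustrative placeholders.

CHUNK_SIZE = 1024 * 1024  # 1 MiB per read request

with open("video.mp4", "rb") as f:
    while True:
        chunk = f.read(CHUNK_SIZE)  # each call continues where the last ended
        if not chunk:               # empty bytes means end of file
            break
        # ... hand the chunk to the decoder/player here ...
```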

What is Random Data?


Random data is the complete opposite of sequential data: it lacks order. It typically consists of many small files being stored, accessed, or erased in a non-sequential, non-linear, and unpredictable manner. Don't confuse the word random with the data itself lacking structure; the data has structure, but its access pattern is non-continuous.

The best example of random data is your OS, applications, and games. When you open Google Chrome, for example, it never asks for a single file. It needs many files of different sizes and formats: cache, history, bookmarks, cookies, images, webpages, themes, extensions, and so on. These files end up stored at different locations because there is no point in storing them sequentially; no one knows whether they will be needed at the same time.

A good word for random data is fragmented, though it is not limited to that. Random data is stored in non-contiguous blocks, which results in higher latency than sequential data. You will generally find that your drive's random read/write performance is far lower than its sequential performance, because there is a large overhead in both reading and writing random data. Even for reads, locating the data and ensuring proper delivery is time-consuming.

Random data can even be accessed from large sequential data sets, which sounds contradictory to what we just discussed. But yes: from one big file, we may need to read different portions at different times, and that access pattern makes it a random read task.
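
Here is a hedged sketch of that pattern: reads at arbitrary offsets within one large file. The file name, offset choices, and the 4 KiB read size are assumptions for illustration; 4 KiB is the block size random benchmarks usually quote.

```python
# Sketch of random reads within one large file: seek() moves the read
# position to arbitrary offsets, so each read may land on a distant block.
import os
import random

READ_SIZE = 4096               # 4 KiB per request
PATH = "big_dataset.bin"       # placeholder name for a large existing file

file_size = os.path.getsize(PATH)
with open(PATH, "rb") as f:
    for _ in range(1000):
        offset = random.randrange(0, file_size - READ_SIZE)
        f.seek(offset)             # jump to an unpredictable location
        data = f.read(READ_SIZE)   # this is what makes the workload "random"
```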

The Ease of Storing Sequential Data

Because sequential data is stored in contiguous blocks on every kind of storage device, it is easy on the system, the storage medium, and the controller. There is no need to track and manage scattered pieces of data. In hard drives, writing sequential data requires minimal movement of the write head over the spinning disk. In SSDs, the controller can keep preparing nearby blocks and writing to adjacent pages. In other words, sequential writes carry no heavy overhead, and storage devices can streamline the process without extra power consumption.

When reading sequential data, the controller can easily predict the next block, page, or sector (in hard drives) to be read. Big chunks of data can be fetched at once, and the cache can be used far more effectively. All in all, storing, retrieving, and erasing sequential data is much faster thanks to simple data management, continuity, reduced fragmentation, and large file sizes.

With the help of fast interfaces such as PCIe and protocols such as NVMe, SSDs can exploit internal parallelism and high bandwidth to push sequential performance to new limits.
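
As a rough illustration of the gap between the two patterns, the sketch below times sequential and random 4 KiB reads against the same file. The path, block size, and iteration count are assumptions, and because the OS page cache can mask drive behavior, serious benchmarks (fio, CrystalDiskMark) bypass or flush the cache first.

```python
# Rough micro-benchmark sketch: sequential vs random 4 KiB reads on one file.
# Results are only indicative; the OS page cache will flatter both numbers.
import os
import random
import time

PATH, BLOCK, COUNT = "testfile.bin", 4096, 10_000  # placeholder parameters

def bench(random_access: bool) -> float:
    size = os.path.getsize(PATH)
    with open(PATH, "rb") as f:
        start = time.perf_counter()
        for _ in range(COUNT):
            if random_access:
                f.seek(random.randrange(0, size - BLOCK))
            f.read(BLOCK)  # sequential mode just keeps reading forward
        return time.perf_counter() - start

print(f"sequential: {bench(False):.3f} s")
print(f"random:     {bench(True):.3f} s")
```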

The Hardships of Storing Random Data

The biggest problem with storing random data is its fragmented nature. The data has to be stored at different locations, separated from other data, so it ends up in non-contiguous blocks. This leads to fragmentation, and over time the data must be modified and rewritten, which scatters the files further and degrades performance. Managing fragmented data is far more resource-intensive for a storage drive.

Access times are slower too. In hard drives, the read head has to travel between discrete locations in different parts of the platters, which increases retrieval time considerably. In SSDs, random reads are tough because the FTL (Flash Translation Layer) mapping has to be consulted over and over to locate the data, and switching between blocks and pages adds further overhead.
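
To illustrate the FTL's role, here is a toy sketch of a logical-to-physical lookup. Real FTLs are far more sophisticated (wear leveling, garbage collection, cached map segments); every name and value here is invented for illustration.

```python
# Toy sketch of an FTL (Flash Translation Layer) lookup: the controller keeps
# a logical-to-physical map, and every random read pays for a consultation.

# logical block address (LBA) -> physical (NAND block, page); invented values
mapping_table = {
    0: (12, 3),
    1: (7, 60),   # logically adjacent data can sit in scattered pages
    2: (12, 4),
}

def read_lba(lba: int) -> tuple[int, int]:
    """Resolve a logical address to its physical NAND location."""
    return mapping_table[lba]  # the extra hop every random I/O pays for

print(read_lba(1))  # -> (7, 60)
```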

Unlike sequential data, the locations of random data are unpredictable, so optimization is difficult. Hard drives even need periodic defragmentation to maintain performance, which adds an extra layer of data maintenance.

Sadly, it is hard to harness the high bandwidth and parallelism offered by PCIe and NVMe to boost random performance, because random workloads depend mainly on IOPS (Input/Output Operations Per Second) and latency rather than throughput and bandwidth. New-generation SSDs keep improving random read/write speeds, but they remain far below sequential performance.

Why are the access patterns different?

We have established that sequential data has a predictable, linear access pattern while random data has an unpredictable one. But why is that? The answer is simple.

Sequential data is ordered, which means each piece of data is logically connected to the next. Random data, on the other hand, is accessed based on specific, momentary needs.

The access patterns differ because the types of data differ. If the SSD can fetch data easily because it sits in nearby locations, performance is better. If the controller has to consult the mapping table for many small files, locate each one, and then read it, the operation takes much longer.

So it is the type of data that makes the difference in its access pattern. Note that the difference is not in the data itself: whether sequential or random, computers understand only the language of logic and bits. Sequential data is a continuous stream of bits that can be directed at one large area without much organization; random data isn't a stream but fragments of different files scattered across the drive.

Sequential vs Random Read/Write Performance in Storage

The numbers for sequential and random read/write performance vary with the type of storage device, but here are some average figures to consider.

Hard Disk Drives (HDDs):

Sequential Read/Write Performance:

Read Speed: Typically around 100-200 MB/s.

Write Speed: Typically around 80-160 MB/s.

HDDs have rotating platters and moving read/write heads. Data is stored in sectors laid out along circular tracks on the platters. The read/write heads perform best when accessing data as a continuous stream, which minimizes the time spent moving the head.

Random Read/Write Performance:

Read Speed: Typically around 0.5-2 MB/s.

Write Speed: Typically around 0.5-1 MB/s.

Random access involves frequent head movements, because the head must be positioned above data locations that are generally far apart. This significantly increases latency and decreases throughput compared to sequential operations.
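
A back-of-the-envelope calculation shows where those low numbers come from: each random 4K read costs roughly one average seek plus half a platter rotation. The seek time and RPM below are typical assumed figures, not measurements of any particular drive.

```python
# Why HDD random throughput is so low: each random 4K I/O pays a seek plus,
# on average, half a platter rotation. Figures are typical assumptions.

avg_seek_ms = 9.0                             # typical desktop-drive seek
rpm = 7200
rotational_latency_ms = 0.5 * 60_000 / rpm    # half a revolution ~= 4.17 ms

ms_per_io = avg_seek_ms + rotational_latency_ms   # ~= 13.2 ms per random read
iops = 1000 / ms_per_io                           # ~= 76 IOPS
throughput_mb_s = iops * 4096 / 1_000_000         # ~= 0.31 MB/s

print(f"{iops:.0f} IOPS -> {throughput_mb_s:.2f} MB/s for 4K random reads")
```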

Solid State Drives (SSDs):

Sequential Read/Write Performance:

Read Speed: Typically around 500-10,000 MB/s (from SATA SSDs up to Gen 5.0 NVMe SSDs).

Write Speed: Typically around 400-10,000 MB/s (from SATA SSDs up to Gen 5.0 NVMe SSDs).

SSDs have no moving parts and can read/write many cells simultaneously, which makes sequential access very fast. This parallelism, combined with high PCIe bandwidth, lets NVMe SSDs reach very high sequential speeds: some Gen 5.0 NVMe drives offer more than 10 GB/s sequential read/write.

Random Read/Write Performance:

Read Speed: Typically around 50,000-1,200,000 IOPS (Input/Output Operations Per Second) for random 4K reads.

Write Speed: Typically around 20,000-1,000,000 IOPS for random 4K writes.

SSDs handle random access much better than HDDs thanks to their lack of mechanical movement and their ability to access multiple cells at a time. Gen 5.0 NVMe drives, again, reach higher random performance through more effective use of parallelism and other controller optimizations.
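
To relate those IOPS figures back to throughput, multiply by the block size (throughput = IOPS × block size). The IOPS values below are simply the ranges quoted above, not measurements of any particular model.

```python
# Converting 4K random IOPS into MB/s: throughput = IOPS * block size.
# The IOPS values are this article's quoted ranges, not measured results.

BLOCK_SIZE = 4096  # bytes, i.e. a 4K random workload

for label, iops in [
    ("random read, low end", 50_000),
    ("random read, high end", 1_200_000),
]:
    mb_s = iops * BLOCK_SIZE / 1_000_000
    print(f"{label}: {iops:,} IOPS ~= {mb_s:,.0f} MB/s")
```

Even at 1.2 million IOPS, that works out to under 5 GB/s, still below the 10+ GB/s sequential figures, which is why random performance is quoted in IOPS and latency rather than raw MB/s.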

Conclusion

Random and sequential read/write performance is all about the access pattern of the requested data. If the access pattern is sequential, data flows from the storage in order, and the same goes for writing it. Random data is much more complex to read and write, so random performance will always be slower than sequential performance on the same drive.
