Solid-state drives (SSDs) store data in flash memory; in modern SSDs this is generally charge-trap NAND flash. Unlike traditional hard disk drives (HDDs), which rely on spinning magnetic platters and mechanical read/write heads, SSDs have no moving parts. Each bit (0 or 1) is stored as electric charge held in a memory cell, which makes SSDs faster, more energy-efficient, and more physically durable.
The advantage of SSDs over HDDs is most noticeable in tasks such as loading applications, transferring files, and booting the operating system. Although HDDs remain more cost-effective per gigabyte, SSDs have become the preferred choice in modern computing due to their speed, compact form factor, and reliability.
Despite all these advantages, SSDs have their own set of challenges, like any other piece of technology. Unlike an HDD, where data can simply be overwritten in place, an SSD must erase memory cells before new data can be written to them. Combined with the fact that flash cells endure only a finite number of program/erase cycles, this makes data management critical in SSDs.
This is where a built-in feature of the SSD comes into play. Garbage Collection (GC) in SSDs is a background process that reclaims space by cleaning up blocks containing invalid (obsolete) data, allowing them to be rewritten with new data in the future.

SSDs read and write data in pages (typically 4 KB to 16 KB), but they can only erase data a whole block at a time (often 128 to 512 pages per block). When files are deleted or updated, the corresponding pages are marked as invalid, but they aren’t immediately erased. New data can only be written to empty (free) pages, not to invalid pages. Garbage Collection copies the remaining valid pages out of a block and then erases that block, which makes space for new writes.
Seems too complex to understand?
Let’s elaborate on everything in detail.
Understanding NAND Flash Memory
An SSD is built from NAND flash memory, a non-volatile storage technology that retains data even when the power supply is cut.
NAND flash is organized hierarchically into cells, pages, and blocks.
A cell is the smallest unit and can store one or more bits, depending on the type (SLC, MLC, TLC, or QLC). Below is an image of an SSD memory chip, which can contain millions or even billions of these cells, depending on the storage capacity.

It is impossible to see a single cell with the naked eye, and we don’t have to see it to understand how it works. Instead, we examine cell diagrams and their underlying working principles. Below is an SLC NAND flash cell with two states, charged and discharged, each representing one bit value. By convention, the charged (programmed) state reads as 0 and the discharged (erased) state reads as 1.

Multiple cells are combined to form a page, which is typically between 4 KB and 16 KB in size. A page is the smallest unit that can be read or written in NAND flash.
A group of pages makes up a block, often 128 to 512 pages. With the page sizes above, that works out to roughly 512 KB to 8 MB per block. A block is the smallest unit that can be erased.
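To make these sizes concrete, here is a small arithmetic sketch. It uses the page sizes and pages-per-block counts quoted in this article purely as illustrative inputs; real drives vary by NAND generation and vendor, and the function names are hypothetical.

```python
# Illustrative geometry only; not taken from any specific drive's datasheet.
KIB = 1024


def block_size_bytes(page_size_bytes: int, pages_per_block: int) -> int:
    """Size of one erase block: page size multiplied by pages per block."""
    return page_size_bytes * pages_per_block


def states_per_cell(bits_per_cell: int) -> int:
    """Number of distinct charge levels a cell must distinguish."""
    return 2 ** bits_per_cell


if __name__ == "__main__":
    for page_kib, pages in [(4, 128), (16, 512)]:
        size = block_size_bytes(page_kib * KIB, pages)
        print(f"{page_kib} KiB pages x {pages} pages = {size // KIB} KiB per block")
    for name, bits in [("SLC", 1), ("MLC", 2), ("TLC", 3), ("QLC", 4)]:
        print(f"{name}: {bits} bit(s) per cell, {states_per_cell(bits)} charge levels")
```

With these inputs, the smallest combination gives a 512 KiB block and the largest gives an 8 MiB block, which shows how much data a single erase operation can affect compared with a single page write.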


Data in SSDs is written at the page level, but erasure can only occur at the block level. This mismatch is unusual and presents a challenge. For instance, to change just a few pages in place, the entire block would first have to be erased and then rewritten with the new data plus the remaining valid data. This is known as the erase-before-write rule, and it significantly impacts how the SSD performs write operations.
Let’s put it more simply.
Data is written sequentially into free pages. When data is updated or deleted, the old versions are marked invalid but remain physically in place. The SSD controller cannot erase these pages individually, because erasure works only on whole blocks. Instead, when a block needs to be reclaimed, the SSD copies its remaining valid pages into a new block. The old block, now containing only invalid (obsolete) data, is erased. This erasure frees up the block for future writes.
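The out-of-place update described here can be sketched with a toy model. This is a simplified illustration, not how any particular controller is implemented: `TinyFlash`, `PageState`, and the flat page list are all hypothetical names and simplifications introduced for this example.

```python
from enum import Enum


class PageState(Enum):
    FREE = "free"
    VALID = "valid"
    INVALID = "invalid"


class TinyFlash:
    """Toy model: a flat list of pages plus a logical-to-physical map."""

    def __init__(self, num_pages):
        self.state = [PageState.FREE] * num_pages
        self.mapping = {}   # logical page number -> physical page index
        self.next_free = 0  # pages are filled sequentially

    def write(self, logical_page):
        """Write or update a logical page; return the physical page used."""
        if self.next_free >= len(self.state):
            raise RuntimeError("no free pages left; garbage collection needed")
        old = self.mapping.get(logical_page)
        if old is not None:
            # The old copy is not erased, only marked invalid.
            self.state[old] = PageState.INVALID
        physical = self.next_free
        self.state[physical] = PageState.VALID
        self.mapping[logical_page] = physical
        self.next_free += 1
        return physical
```

For example, on a four-page `TinyFlash`, writing logical page 0, then logical page 1, then logical page 0 again places the data in physical pages 0, 1, and 2. Physical page 0 is left marked invalid, and it stays that way until its whole block is erased.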
The Need for Garbage Collection
The erase-before-write rule means that a page must be erased before new information can be written to it. But it is not as straightforward as it sounds. Data can only be erased at the block level, even though it is written at the page level. Updating just one page in place would therefore mean erasing and rewriting the entire block.
To avoid this, the original page is not immediately erased when data is deleted or modified. Instead, it is marked invalid, and the new data is written to a fresh page. Over time, this leads to the accumulation of invalid pages. These invalid pages no longer hold useful data but still occupy space.
If the invalid data is not removed, the SSD will run out of free blocks. It is then forced to reclaim blocks that still hold many valid pages, which means more copying and erasing for every write and faster wear of the memory cells.
To overcome this challenge, SSDs rely on an internal cleaning process called Garbage Collection.
An Analogy to Understand SSD Garbage Collection
Imagine an SSD as an apartment complex (a block) composed of many rooms (pages) where people (data) reside. You can only move people into empty rooms. Once someone has lived in a room, you can’t just tidy it up and let a new person in. The building’s rules say you must clear out the entire block at once before any of its rooms can be reused. Over time, some tenants move out and leave their rooms unusable (invalid pages), while other rooms are still occupied (valid pages).

Garbage collection is like hiring movers to gather the remaining tenants into a different block with plenty of space, so the old block is now completely vacant. Only then can the cleaning crew erase it and make all its rooms ready for new residents. This moving process takes time and resources, which is why SSD performance can dip when garbage collection is happening.
How Garbage Collection Works
Garbage collection in SSDs is an internal housekeeping process that reclaims space occupied by data that is no longer in use. Its primary function is to free up memory blocks by erasing those that contain outdated information.
The controller identifies blocks that hold a mix of valid and invalid pages, copies the valid pages to a new block, and erases the old block to make it available for fresh writes.
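The identify, copy, and erase steps can be sketched as follows. This is a minimal illustration under stated assumptions: blocks are modeled as lists of page states, the victim is chosen greedily (the block with the most invalid pages), and a single spare block receives the valid pages. Real firmware uses more elaborate policies, and all names here are hypothetical.

```python
FREE, VALID, INVALID = "free", "valid", "invalid"


def pick_victim(blocks, exclude):
    """Greedy policy (an assumption): the block with the most invalid pages."""
    candidates = [
        i for i, pages in enumerate(blocks)
        if i != exclude and INVALID in pages
    ]
    if not candidates:
        return None
    return max(candidates, key=lambda i: blocks[i].count(INVALID))


def collect(blocks, spare):
    """Move valid pages from the victim into the spare block, then erase the victim.

    Returns (victim_index, pages_copied), or None if no block needs collecting.
    """
    victim = pick_victim(blocks, exclude=spare)
    if victim is None:
        return None
    valid_count = blocks[victim].count(VALID)
    free_slots = [i for i, s in enumerate(blocks[spare]) if s == FREE]
    if valid_count > len(free_slots):
        raise RuntimeError("spare block cannot hold the valid pages")
    for slot in free_slots[:valid_count]:
        blocks[spare][slot] = VALID                 # copy step
    blocks[victim] = [FREE] * len(blocks[victim])   # erase step (whole block)
    return victim, valid_count
```

In a three-block example where block 0 has two invalid pages, block 1 has one, and block 2 is the empty spare, `collect(blocks, spare=2)` picks block 0, copies its two valid pages into block 2, and leaves block 0 fully free. The greedy choice is used here because it minimizes the number of pages that must be copied per reclaimed block.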
Garbage collection is mainly triggered in two ways:
Proactively as a background process:
When the system is idle or under light load, the SSD controller initiates background garbage collection as a proactive measure.
Reactively on demand:
Garbage collection is also triggered reactively when the SSD runs low on free space during a write operation. Because the cleanup then competes with the pending write, it may cause a noticeable performance drop.
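The two triggers can be expressed as a simple decision function. This is only a sketch: the function name, the return labels, and both threshold values are hypothetical, since real controllers use vendor-specific and usually undocumented heuristics.

```python
def gc_mode(free_blocks, total_blocks, is_idle,
            background_threshold=0.25, urgent_threshold=0.05):
    """Decide whether to run garbage collection and in which mode.

    Returns "foreground" (reactive, during writes), "background"
    (proactive, while idle), or None. Thresholds are illustrative fractions
    of free blocks, not values from any real firmware.
    """
    if total_blocks <= 0:
        raise ValueError("total_blocks must be positive")
    free_ratio = free_blocks / total_blocks
    if free_ratio <= urgent_threshold:
        return "foreground"   # out of room: must clean up before writing
    if is_idle and free_ratio <= background_threshold:
        return "background"   # spare time: clean up before it is urgent
    return None
```

With these assumed thresholds, a drive with 2 of 100 blocks free collects in the foreground even while busy, a drive with 20 of 100 blocks free collects only when idle, and a drive with half its blocks free does nothing.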
Impact on Performance and Lifespan
Garbage collection, though a vital SSD functionality, comes with specific side effects that can impact both the lifespan and performance of the drive.
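One significant consequence of garbage collection is write amplification, a phenomenon where more data is physically written to the flash than the host actually requested. For instance, even a small change in data or a minor file update may require the SSD to copy and rewrite additional valid pages from a block to free up space, resulting in multiple write operations for a single change. Because every extra write consumes some of the cells’ limited program/erase cycles, write amplification affects the drive’s lifespan as well as its performance.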
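Write amplification is commonly expressed as a ratio of pages physically written to flash versus pages the host asked to write. The sketch below computes that ratio; the function and parameter names are hypothetical, and the page counts are made-up inputs for illustration.

```python
def write_amplification(host_pages_written, gc_pages_copied):
    """Ratio of physical flash writes to host-requested writes.

    Physical writes = pages the host wrote + valid pages that garbage
    collection had to copy elsewhere. A value of 1.0 means no amplification.
    """
    if host_pages_written <= 0:
        raise ValueError("host_pages_written must be positive")
    return (host_pages_written + gc_pages_copied) / host_pages_written
```

If the host updates one page and garbage collection has to relocate three valid pages to free a block for it, `write_amplification(1, 3)` gives 4.0: four pages were physically written for one page of new data.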
Garbage Collection vs TRIM Command
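Garbage collection and TRIM are often confused with each other. TRIM is a command issued by the operating system to the SSD, telling it which logical blocks no longer hold data that is in use (for example, after a file is deleted). TRIM itself does not erase anything; it only lets the SSD mark the corresponding pages as invalid so that garbage collection can reclaim them later.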

For more information about TRIM, read “Understanding TRIM in SSDs.”
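TRIM helps garbage collection by providing accurate information about invalid pages. Without it, the SSD has no way of knowing that a file was deleted and keeps treating its pages as valid. TRIM enables the early identification of stale data, thereby reducing the overhead during future writes. When TRIM is enabled, garbage collection becomes more efficient because the controller does not waste time copying data the operating system has already discarded.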
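The effect of TRIM on garbage collection can be illustrated with a tiny sketch. It assumes a block is represented by the logical page numbers the drive believes are valid, and that TRIM supplies a set of logical pages the operating system has discarded. The function name and data layout are hypothetical.

```python
def pages_to_copy(valid_in_block, trimmed=frozenset()):
    """Logical pages garbage collection must relocate before erasing a block.

    valid_in_block: logical pages the drive currently treats as valid.
    trimmed: logical pages the OS has reported as no longer in use.
    """
    return sorted(set(valid_in_block) - set(trimmed))
```

Suppose a block holds logical pages 10 to 13 and the operating system has deleted the file stored in pages 11 and 12. Without TRIM, `pages_to_copy([10, 11, 12, 13])` returns all four pages, so the drive copies two pages of dead data. With TRIM, `pages_to_copy([10, 11, 12, 13], trimmed={11, 12})` returns only pages 10 and 13, halving the copy work for that block.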
Conclusion
Behind any smooth operation lies a complex system of management. In SSDs, garbage collection is one of the mechanisms that keeps storage fast and dependable as the drive fills up.
To summarize:
- SSDs use NAND flash memory cells that must be erased before new data can be written to them.
- Whenever data is updated or deleted, invalid pages accumulate, thereby requiring a cleanup process.
- Garbage collection reclaims space by relocating valid pages and erasing blocks of obsolete data. It causes write amplification and can cause temporary slowdowns during active use.
- The TRIM command increases the efficiency of Garbage Collection by telling the SSD which blocks are no longer in use.
- To mitigate the downsides of GC and extend the lifespan of SSDs, techniques such as wear-leveling, over-provisioning, and intelligent firmware are employed.