A Beginner’s Guide to File Systems in Computers

In the era of big data, there is always a need for storage devices big enough to hold that data. A storage device alone, however, cannot organize what it holds: it relies on rules and structures provided by the operating system. These provide a structured way to manage files and directories so that data can be manipulated efficiently and securely.

A file system is a set of data structures and protocols that an operating system uses to manage data. With the help of a file system, the OS controls how data is stored, organized, and retrieved on any storage device, such as an SSD, hard drive, or USB drive.

Let’s understand the basics of file systems and how they work. We will also talk a little about partitions and the different types of file systems. So, let’s get started.

File System and its implementation

In more technical terms, the file system is a software layer, usually part of the kernel, that provides an interface between programs and the physical storage media, enabling efficient file handling. A file system is a composition of data structures, algorithms, and metadata (data about the data).

File systems are mostly written in C or C++, because these languages give low-level control over hardware and memory and compile to fast native code. There are, however, file systems and file system tools written in Rust or Python as well. These projects tend to focus on memory safety (Rust) or on scripting utilities (Python).

File system metadata is stored in dedicated on-disk structures, such as the superblock in ext file systems or the Master File Table in NTFS. The file system drivers run in kernel space. Whenever a running program needs to read or write a file, it issues a system call, which transfers control from user space to kernel space.

System Calls

The file system is a crucial part of the kernel. It handles file operations via system calls like open(), read(), write(), and close().

Given below is a table of File System Calls and their functions.

| System Call | Function | Description |
| --- | --- | --- |
| open() | Open a file | Opens an existing file or creates a new one. Returns a file descriptor. |
| read() | Read from a file | Reads data from a file into a buffer. Requires a file descriptor. |
| write() | Write to a file | Writes data from a buffer into a file. Requires a file descriptor. |
| close() | Close a file | Closes an open file and releases system resources. |
| lseek() | Move file pointer | Changes the current read/write position within a file. |
| stat() / fstat() | Get file metadata | Retrieves information like size, permissions, and timestamps. |
| mkdir() | Create a directory | Creates a new directory in the file system. |
| rmdir() | Remove a directory | Deletes an empty directory. |
| unlink() | Delete a file | Removes a file from the file system. |
| rename() | Rename a file | Changes the name of a file or directory. |
| chmod() | Change file permissions | Modifies file access permissions. |
| chown() | Change file owner | Changes ownership of a file. |
| mount() | Mount a file system | Attaches a file system to a directory. |
| umount() | Unmount a file system | Detaches a file system from the directory tree. |
| fsync() | Synchronize file | Ensures all pending file writes are committed to disk. |
| dup() / dup2() | Duplicate file descriptor | Creates a copy of a file descriptor. |
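
To make a few of these calls concrete, here is a minimal sketch in Python, whose os module exposes thin wrappers around the same system calls. The helper name roundtrip and the file name demo.txt are made up for illustration, and the file lives in a temporary directory that is removed afterwards.

```python
import os
import tempfile


def roundtrip(message: bytes) -> bytes:
    """Write `message` to a scratch file and read it back through a raw descriptor."""
    with tempfile.TemporaryDirectory() as scratch:
        path = os.path.join(scratch, "demo.txt")  # hypothetical file name

        # open(): create the file and receive a file descriptor (a small integer).
        fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
        try:
            os.write(fd, message)              # write(): buffer -> file
            os.lseek(fd, 0, os.SEEK_SET)       # lseek(): move back to the start
            data = os.read(fd, len(message))   # read(): file -> buffer
        finally:
            os.close(fd)                       # close(): release the descriptor
        return data


print(roundtrip(b"hello file system"))
```

Every call after os.open() takes the descriptor rather than the file name, which mirrors the "Requires a file descriptor" notes in the table.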

Importance of File Systems

1. Data Storage

The file system allocates space efficiently by dividing data into fixed-size blocks and keeping track of which blocks are free and which are occupied. Where possible, it places a file's data in contiguous blocks, which improves read and write speed. When a file ends up scattered across non-adjacent blocks, it is said to be fragmented; defragmentation is the process of rearranging those blocks so they are contiguous again, which we will learn about in another article. Many file systems also guard against unexpected failures, for example by journaling changes, so that recovery to a consistent state is possible after a crash.
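
As a rough illustration of block-based allocation, the sketch below computes how many blocks a file needs and then searches a free-block bitmap for the first contiguous run that is large enough (a first-fit strategy). The 4096-byte block size and the function names are assumptions chosen for the example; real file systems use more elaborate allocators.

```python
import math
from typing import List, Optional

BLOCK_SIZE = 4096  # bytes per block; an assumed value for this sketch


def blocks_needed(file_size: int, block_size: int = BLOCK_SIZE) -> int:
    """Number of whole blocks required to hold `file_size` bytes."""
    return math.ceil(file_size / block_size)


def allocate_contiguous(bitmap: List[bool], count: int) -> Optional[int]:
    """First-fit search for `count` (>= 1) adjacent free blocks.

    `bitmap[i]` is True when block i is occupied. On success the run is
    marked as occupied and its starting index is returned; otherwise None.
    """
    run_start, run_length = 0, 0
    for index, used in enumerate(bitmap):
        if used:
            # An occupied block breaks the current run of free blocks.
            run_start, run_length = index + 1, 0
            continue
        run_length += 1
        if run_length == count:
            for block in range(run_start, run_start + count):
                bitmap[block] = True
            return run_start
    return None


bitmap = [True, False, False, True, False, False, False, False]
print(blocks_needed(10_000))            # 3
print(allocate_contiguous(bitmap, 3))   # 4
```

In this run the two free blocks at indexes 1 and 2 are skipped because the file needs three adjacent blocks. A second request for three blocks would fail even though three blocks remain free in total, which is the fragmentation problem in miniature.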

2. Data Retrieval 

The file system speeds up data retrieval by maintaining directories and indexes that help locate data quickly. Most file systems use caching mechanisms to keep frequently accessed data in memory. This avoids repeated reads from the storage device, since the data is already cached. The file system also stores metadata about each file, such as the last access time, size, and access permissions. This lets users and programs look up that information quickly without reading the file's contents.
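
The metadata mentioned above can be read without opening the file at all. The following sketch uses Python's os.stat(), a wrapper around the stat() system call; the helper names describe and sample_metadata and the file name sample.bin are illustrative only.

```python
import os
import stat
import tempfile
import time


def describe(path: str) -> dict:
    """Collect a few metadata fields for `path` without reading its contents."""
    info = os.stat(path)
    return {
        "size_bytes": info.st_size,
        "permissions": stat.filemode(info.st_mode),  # e.g. a string like "-rw-r--r--"
        "modified": time.ctime(info.st_mtime),
        "accessed": time.ctime(info.st_atime),
    }


def sample_metadata(payload: bytes) -> dict:
    """Create a scratch file holding `payload` and return its metadata."""
    with tempfile.TemporaryDirectory() as scratch:
        path = os.path.join(scratch, "sample.bin")  # hypothetical file name
        with open(path, "wb") as handle:
            handle.write(payload)
        return describe(path)


print(sample_metadata(b"12345"))
```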

3. Data Management

The file system provides the data structures that the data requires. “Required” here means that if data needs to be stored in a linear format, as with the contents of a file, the file system stores it linearly. If the data needs to be stored hierarchically, as with folders and subfolders, the file system takes care of that as well. File systems also apply access controls and other security measures to prevent unauthorized users or programs from modifying files. A file system is not a complete backup and restore solution in itself, but some file systems offer features such as snapshots and versioning that help protect data from accidental loss.

File System: Structuring and Hierarchy

Efficient data organization and access is one of the most important responsibilities of the file system, with minimal latency and good performance as the core goals. The file system achieves this through mechanisms such as caching, a structured layout, and indexing.

A file system always follows a structural approach for better navigation and retrieval. 

Hierarchical File Structure

Most modern file systems follow a hierarchical, tree-like structure. An example path is shown below.

/home/user/documents/work/report.pdf

This hierarchy consists of directories (folders that group files under a name), sub-directories (directories nested inside other directories for multilevel categorization), and file paths (absolute or relative paths used to locate files).
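
The pieces of such a path can be inspected programmatically. This small sketch uses Python's pathlib on the example path above; PurePosixPath only manipulates the path text, so the file does not need to exist.

```python
from pathlib import PurePosixPath

path = PurePosixPath("/home/user/documents/work/report.pdf")

print(path.parts)    # ('/', 'home', 'user', 'documents', 'work', 'report.pdf')
print(path.parent)   # /home/user/documents/work
print(path.name)     # report.pdf

# The same file expressed relative to /home/user instead of the root.
relative = path.relative_to("/home/user")
print(relative)      # documents/work/report.pdf
```

The first form is an absolute path, because it starts at the root directory; the last one is a relative path that only makes sense from a known starting directory.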

Partitioning and Logical Volumes

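A storage device can be divided into partitions, also known as volumes, and each one is formatted with its own file system. On Windows, for example, volumes appear as drive letters such as C:\ and D:\.

Types of File Systems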

Disk-based File Systems

  • FAT32, NTFS (Windows): FAT32 is lightweight, but it lacks modern security features; NTFS supports most modern features, such as encryption and journaling.
  • HFS+, APFS (macOS): HFS+ is the older Mac file system and has been replaced by APFS, which is better in speed, encryption, and space optimization.
  • EXT3, EXT4 (Linux): Both are journaling file systems for Linux. EXT4 offers improved performance over EXT3 and supports larger files and volumes.

Network-based File Systems

  • NFS (Network File System): It is used in Unix/Linux for accessing remote files.
  • SMB (Server Message Block): It is used in Windows for sharing files over the network.
  • DFS (Distributed File System): It is used in network-based storage. It spreads data across multiple servers, which allows for redundancy and scalability.

Flash-based File Systems

  • YAFFS (Yet Another Flash File System): It is used in embedded devices. It is optimized for NAND-based flash memory. 

Database-based File System

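  • SQLite: It stores an entire database in a single file on disk, so one file can act as a structured container for an application's data.
  • Oracle ASM (Automatic Storage Management): It is a volume manager and file system from Oracle, designed to manage the storage of Oracle database files efficiently.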

Special Purpose File System

  • Procfs (Process File System): It exposes real-time information about running processes in Linux as virtual files under /proc.
  • Tmpfs (Temporary File System): It uses RAM for temporary high-speed storage usage.

File System Operations

A file system provides various operations to manage files and directories efficiently. These operations are essential for reading, writing, modifying, and organizing data within a storage system.

Basic File Operations

1. Create Operation

  • It creates a new file in a directory and records its attributes (name, type, permissions).
  • Example: touch filename (Linux), CreateFile() (Windows API).

2. Open Operation

  • It prepares a file for reading or writing and returns a handle (a file descriptor or file pointer) used by later operations.
  • Example: fopen() in C, open() in Linux system calls.

3. Read Operation

  • Retrieves data from a file and transfers it to the buffer.
  • Example: read(fd, buffer, size) in Linux.

4. Write Operation

  • It writes the data to a file in a specific location.
  • Example: write(fd, buffer, size) in Linux.

5. Close Operation

  • It releases the system resources associated with an open file.
  • Example: fclose(file_pointer) in C.
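
The five operations above can be strung together in a few lines. This sketch uses Python's built-in file objects, which are buffered in a similar spirit to fopen() and fclose() in C; the function name create_write_read and the file name notes.txt are invented for the example.

```python
import os
import tempfile


def create_write_read(text: str) -> str:
    """Create a file, write `text` to it, close it, then reopen and read it back."""
    with tempfile.TemporaryDirectory() as scratch:
        path = os.path.join(scratch, "notes.txt")  # hypothetical file name

        # Create + open: mode "x" creates the file and fails if it already exists.
        with open(path, "x", encoding="utf-8") as handle:
            handle.write(text)          # write
        # Leaving the "with" block closes the file and flushes its buffer.

        with open(path, "r", encoding="utf-8") as handle:  # open for reading
            return handle.read()        # read


print(create_write_read("first line\n"))
```

Using a with block is a design choice that guarantees the close step runs even if an error occurs between open and close.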

Basic Directory Operations

1. Create Directory

  • Creates a new folder to store the files.
  • Example: mkdir foldername (Linux/Windows).

2. Delete Directory

  • Removes an empty folder.
  • Example: rmdir dirname (Linux).

3. List Directory Contents

  • It lists the files and sub-directories inside a directory.
  • Example: ls -l (Linux), dir (Windows).

4. Change Directory

  • It changes the current working directory, moving up or down the hierarchy.
  • Example: cd /path/to/directory.
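
The same directory operations are available to programs. Below is a minimal sketch using Python's os module inside a temporary directory; the directory names project, docs, and src and the function name directory_tour are placeholders.

```python
import os
import tempfile


def directory_tour() -> list:
    """Create, enter, list, and remove directories inside a scratch area."""
    start = os.getcwd()
    with tempfile.TemporaryDirectory() as scratch:
        project = os.path.join(scratch, "project")
        os.mkdir(project)                          # create directory
        os.mkdir(os.path.join(project, "docs"))
        os.mkdir(os.path.join(project, "src"))
        try:
            os.chdir(project)                      # change directory
            entries = sorted(os.listdir("."))      # list directory contents
        finally:
            os.chdir(start)                        # step back out before cleanup
        os.rmdir(os.path.join(project, "docs"))    # delete an (empty) directory
        return entries


print(directory_tour())  # ['docs', 'src']
```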

File Manipulation Operations

1. Rename Operation

  • It changes the name of a file.
  • Example: mv oldname newname (Linux).

2. Copy Operation

  • It creates a duplicate of the file in the required destination.
  • Example: cp source destination.

3. Move Operation

  • It moves the files from one directory to another.
  • Example: mv filename /new/location/.

4. Delete Operation

  • It removes the file from the file system; unlike moving a file to a trash folder, this is not easily undone.
  • Example: rm filename (Linux).
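
For comparison, here is how the four manipulation operations might look from a program, using Python's os and shutil modules. All file and directory names (draft.txt, report.txt, report.bak, archive) and the function name manipulate are hypothetical.

```python
import os
import shutil
import tempfile


def manipulate() -> list:
    """Rename, copy, move, and delete files; return the files left at the end."""
    with tempfile.TemporaryDirectory() as scratch:
        original = os.path.join(scratch, "draft.txt")
        with open(original, "w", encoding="utf-8") as handle:
            handle.write("v1")

        renamed = os.path.join(scratch, "report.txt")
        os.rename(original, renamed)        # rename: draft.txt -> report.txt

        backup = os.path.join(scratch, "report.bak")
        shutil.copy(renamed, backup)        # copy: duplicate the file

        archive = os.path.join(scratch, "archive")
        os.mkdir(archive)
        shutil.move(backup, archive)        # move: into the archive directory

        os.remove(renamed)                  # delete: remove report.txt

        # Walk the scratch area and report what survived, relative to its root.
        return sorted(
            os.path.relpath(os.path.join(root, name), scratch)
            for root, _dirs, files in os.walk(scratch)
            for name in files
        )


print(manipulate())
```

Only the moved backup copy remains at the end, inside the archive directory.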

File System Control Operations

1. Change Permissions

  • It modifies a file's permissions, which control who can read, write, or execute it.
  • Example: chmod 755 filename.

2. Change Ownership

  • It assigns a new owner to the file.
  • Example: chown user:group filename.

3. Mount & Unmount

  • Attaches (mounts) a storage device's file system to a directory, or detaches (unmounts) it.
  • Example: mount and umount (Linux).

4. Synchronization & Flushing

  • Forces buffered (cached) writes to be committed to the storage device.
  • Example: fsync() in Linux.
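
Two of these control operations, changing permissions and flushing, are easy to try from a program. The sketch below assumes a POSIX system such as Linux or macOS, since permission bits behave differently on Windows; the function name write_and_protect and the file name settings.conf are illustrative.

```python
import os
import stat
import tempfile


def write_and_protect(data: bytes, mode: int = 0o755) -> int:
    """Write `data`, flush it to the device, set permissions, and return them."""
    with tempfile.TemporaryDirectory() as scratch:
        path = os.path.join(scratch, "settings.conf")  # hypothetical file name
        fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
        try:
            os.write(fd, data)
            os.fsync(fd)      # ask the OS to commit cached writes to storage
        finally:
            os.close(fd)

        os.chmod(path, mode)  # same effect as: chmod 755 settings.conf
        # S_IMODE strips the file-type bits and keeps only the permission bits.
        return stat.S_IMODE(os.stat(path).st_mode)


print(oct(write_and_protect(b"key=value\n")))  # 0o755 on a POSIX system
```

Changing ownership and mounting normally require administrator privileges, so they are left out of this sketch.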

Data Access Mechanisms: Speed and Efficiency

The file system uses indexing structures for fast file lookup instead of scanning the entire disk.

File Allocation Table (FAT)

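FAT is used in the FAT16, FAT32, and exFAT file systems. A directory entry stores each file's name and its first cluster, and the table records, for every cluster, which cluster comes next in the file. FAT is simple and widely supported, which makes it a good fit for small storage devices such as USB drives or SD cards, but it struggles with fragmentation as files grow and change.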
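
The lookup idea behind FAT can be sketched with a toy table. This is a simplified model rather than the real on-disk format: the table is a dictionary, cluster numbers are arbitrary, and END_OF_CHAIN is a made-up marker standing in for the reserved end-of-file values that real FAT variants use.

```python
END_OF_CHAIN = -1  # toy marker for "no next cluster"; an assumption of this sketch


def cluster_chain(fat: dict, first_cluster: int) -> list:
    """Follow the table from `first_cluster` and return every cluster of the file."""
    chain = []
    seen = set()
    cluster = first_cluster
    while cluster != END_OF_CHAIN:
        if cluster in seen:
            raise ValueError("corrupt table: cluster chain loops back on itself")
        seen.add(cluster)
        chain.append(cluster)
        cluster = fat[cluster]  # the table entry names the next cluster
    return chain


# Toy table: each key is a cluster, each value is the next cluster of the same file.
fat = {2: 5, 5: 6, 6: 9, 9: END_OF_CHAIN, 3: 4, 4: END_OF_CHAIN}
# Toy directory: file name -> first cluster.
directory = {"notes.txt": 2, "todo.txt": 3}

print(cluster_chain(fat, directory["notes.txt"]))  # [2, 5, 6, 9]
print(cluster_chain(fat, directory["todo.txt"]))   # [3, 4]
```

The clusters of notes.txt are not adjacent (2, then 5, 6, 9), which is what fragmentation looks like in this model: reading the file means hopping around the disk.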

Master File Table (MFT)

The MFT is a database-like structure that contains a record for every file on the volume. NTFS uses the MFT, which allows fast searching and file access.

Inodes in UNIX/Linux (Ext3, Ext4)

An inode stores a file’s metadata separately from its filename. Each file in an inode-based file system is assigned an inode number, and directories hold the mapping between names and inode numbers. This design handles a large number of files efficiently, and renaming a file only changes a directory entry while the file's data stays untouched.
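
The separation between names and inodes can be modelled in a few lines. The class below is a toy, in-memory sketch, not how Ext3 or Ext4 are implemented; ToyFileSystem, its methods, and the file names are all invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class Inode:
    size: int            # file size in bytes
    mode: int            # permission bits
    link_count: int = 0  # how many directory entries point at this inode


class ToyFileSystem:
    def __init__(self) -> None:
        self.inodes = {}      # inode number -> Inode (metadata, no names)
        self.directory = {}   # file name -> inode number
        self._next_number = 1

    def create(self, name: str, size: int, mode: int = 0o644) -> int:
        number = self._next_number
        self._next_number += 1
        self.inodes[number] = Inode(size=size, mode=mode, link_count=1)
        self.directory[name] = number
        return number

    def link(self, existing: str, new_name: str) -> None:
        """Add a second name for the same inode (a hard link)."""
        number = self.directory[existing]
        self.directory[new_name] = number
        self.inodes[number].link_count += 1

    def rename(self, old: str, new: str) -> None:
        """Only the directory entry changes; the inode is untouched."""
        self.directory[new] = self.directory.pop(old)

    def unlink(self, name: str) -> None:
        """Remove a name; free the inode once no names refer to it."""
        number = self.directory.pop(name)
        inode = self.inodes[number]
        inode.link_count -= 1
        if inode.link_count == 0:
            del self.inodes[number]


fs = ToyFileSystem()
number = fs.create("a.txt", size=100)
fs.link("a.txt", "b.txt")
fs.rename("a.txt", "c.txt")
print(fs.directory)  # {'b.txt': 1, 'c.txt': 1}
```

Both names resolve to inode 1, so they refer to the same file. The inode is only freed after the last name is unlinked.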

Conclusion

A file system is an essential set of data structures and protocols for any operating system. In this article, we went through the various types of file systems, their operations, their architecture, and their importance. As storage technology evolves, file systems will continue to evolve with it, providing better performance and scalability.