Harrykar's Techies Blog: File Space

15 September 2015

File Space

STORAGE ACCESS

A collection of storage devices presents a file space. This file space exists at two levels:

a logical level defined by partitions, directories, and files, and
a physical level defined by file systems, disk blocks, and pointers.

Users and system administrators primarily view the file space at the logical level. The operating system handles the physical level for us, based on our commands (requests).

The devices (hard disk, USB, optical, solid-state drives, etc.) that make up the file space collectively provide us with storage access, where we store executable programs and data files.
We generally perform one of two operations on this storage space:

we read from a file (load the file into main memory, or input from file)
and we write to a file (store/save information from main memory to a file, or output to file).
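As a minimal sketch of these two operations at the logical level, the snippet below writes a string out to a file and then reads it back into memory. The helper name `write_then_read` and the file name `example.txt` are hypothetical, chosen only for illustration.

```python
import os
import tempfile


def write_then_read(text):
    """Write text to a file (output), then load it back into memory (input)."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "example.txt")  # hypothetical file name
        # Write: store information from main memory to a file.
        with open(path, "w", encoding="utf-8") as f:
            f.write(text)
        # Read: load the file's contents back into main memory.
        with open(path, "r", encoding="utf-8") as f:
            return f.read()


print(write_then_read("hello, file space"))
```

Everything below the `open` calls (blocks, pointers, head movement) is handled by the operating system.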

Disk Storage and Blocks

We focus here on hard disk storage, as it is the most common implementation of a file system, although some of the concepts apply to other forms of storage as well.

To store a file on disk, the file is decomposed into fixed-size units called blocks. The figure illustrates a small file (six blocks) and the physical locations of those blocks. Notice that the file's last block may not fill its entire disk block, so a small fragment of unused space is left behind.
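The arithmetic behind the block count and the leftover fragment can be sketched as follows. The 4096-byte block size and the 22,000-byte file size are assumptions for illustration; the original does not state either value.

```python
def blocks_needed(file_size, block_size=4096):
    """Return (number of blocks, unused bytes in the last block).

    block_size defaults to 4096 bytes, an assumed value for illustration.
    """
    if file_size < 0 or block_size <= 0:
        raise ValueError("file_size must be >= 0 and block_size must be > 0")
    blocks = -(-file_size // block_size)  # ceiling division
    unused = blocks * block_size - file_size
    return blocks, unused


# A hypothetical 22,000-byte file occupies six blocks, like the small file
# in the text, and leaves part of the sixth block unused.
print(blocks_needed(22000))  # (6, 2576)
```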

The operating system must be able to manage this distribution of files to blocks in three ways:

given a file and block, the operating system must map that block number into a physical location on some disk surface.
the operating system must be able to direct the disk drive to access that particular block through a movement of both the disk and the drive’s read/write head.
the operating system must be able to maintain free file space (available blocks), including the return of file blocks once a file has been deleted.

All of these operations are hidden from the user and system administrator.
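To make the first of these tasks concrete, here is a minimal sketch of one classic way to map a block number to a physical location, expressed as a (cylinder, surface, sector) triple. The geometry used (4 surfaces, 63 sectors per track, one block per sector) is purely an assumption for illustration; real drives typically hide their geometry behind logical block addresses.

```python
def block_to_chs(block, surfaces, sectors_per_track):
    """Map a block number to (cylinder, surface, sector).

    Assumes one block per sector and a fixed number of sectors per track.
    Sectors are numbered from 1, following the usual convention.
    """
    blocks_per_cylinder = surfaces * sectors_per_track
    cylinder = block // blocks_per_cylinder
    surface = (block // sectors_per_track) % surfaces
    sector = block % sectors_per_track + 1
    return cylinder, surface, sector


# Hypothetical geometry: 4 surfaces (two platters), 63 sectors per track.
print(block_to_chs(3018, surfaces=4, sectors_per_track=63))  # (11, 3, 58)
```

The cylinder determines how far the read/write head must move, the surface selects which head is used, and the sector determines how long the drive waits for the platter to rotate into position.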

Let us consider how a disk file might be broken into blocks.

The files are distributed across all of the disk’s surfaces (a hard disk drive will contain multiple disk platters and each platter has two surfaces, a top and bottom). Given a new file to store, the file is broken into blocks.
The first block is placed at the first available free block on disk.
Where should the next disk block be placed? If the next block after the first is available, we could place the block there, giving us two blocks of contiguous storage. This may or may not be desirable. The disk drive spins the disks very rapidly. If we want to read two blocks, we read the first block and transfer it into a buffer in the disk drive. Then the data are transferred to memory. However, during that transfer, the disk continues to spin.
When we are ready to read the second disk block, it is likely that this block has already spun past the read/write head, and the disk drive must now wait for nearly a full disk revolution before it can read again. Distributing disk blocks so that they are not contiguous gets around this problem.

In the figure, you will see that the first three disk blocks are located near each other but are not contiguous. Instead, the first block lies at location 3018, the second at 3020, and the third at 3022.
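The effect of spacing blocks apart can be illustrated with a toy timing model. It assumes that all blocks sit on one track of 64 sectors, that reading a block takes one "sector-time", and that transferring it to memory takes one further sector-time while the disk keeps spinning. These figures, and the track positions 18, 20, and 22 (chosen to echo 3018, 3020, and 3022), are assumptions rather than values from the original.

```python
def read_time(positions, sectors_per_track, transfer_sectors):
    """Total time, in sector-times, to read blocks at the given track positions."""
    if not positions:
        return 0
    head = positions[0]  # assume the head starts at the first block
    elapsed = 0
    for pos in positions:
        # Rotational latency: sectors that must pass before pos reaches the head.
        wait = (pos - head) % sectors_per_track
        elapsed += wait + 1 + transfer_sectors  # wait, read, then transfer
        # The disk kept spinning during the read and the transfer.
        head = (pos + 1 + transfer_sectors) % sectors_per_track
    return elapsed


contiguous = read_time([18, 19, 20], sectors_per_track=64, transfer_sectors=1)
spaced = read_time([18, 20, 22], sectors_per_track=64, transfer_sectors=1)
print(contiguous, spaced)  # 132 6
```

Under these assumptions, each contiguous block has just passed the head when the transfer finishes, costing almost a full revolution, whereas the spaced blocks arrive under the head exactly when the drive is ready for them.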

Whether initial blocks are contiguous or distributed, we will find that further disk blocks may have to be placed elsewhere because we have reached the end of the available disk blocks in this locality. With the deletion of files and saving of other files, we will eventually find disk blocks of one file scattered around the disk surfaces. This may lead to some inefficiency in access in that we have to move from one location of the disk to another to read consecutive blocks and so the seek time and rotational latency are lengthened.
Returning to the figure, we might assume that the next available block after 3022 is at 5813, so as the file continues to grow, its next blocks lie at 5813 and then 5815. As the block after that, according to the figure, lies at 683, we might surmise that 683 was picked up as free space because a file was deleted.
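The scenario just described can be reproduced with a small free-space manager. This is only a sketch: the `FreeSpace` class is hypothetical, and "always hand out the lowest-numbered free block" is an assumed policy, one simple reading of "first available free block".

```python
import heapq


class FreeSpace:
    """Toy free-block manager that always hands out the lowest free block."""

    def __init__(self, free_blocks):
        self._heap = list(free_blocks)
        heapq.heapify(self._heap)

    def allocate(self):
        """Remove and return the lowest-numbered free block."""
        if not self._heap:
            raise RuntimeError("no free blocks left")
        return heapq.heappop(self._heap)

    def release(self, blocks):
        """Return the blocks of a deleted file to the free pool."""
        for block in blocks:
            heapq.heappush(self._heap, block)


# Free blocks chosen to match the example; 9000 is a hypothetical extra block.
disk = FreeSpace([3018, 3020, 3022, 5813, 5815, 9000])
file_blocks = [disk.allocate() for _ in range(5)]
disk.release([683])                  # another file is deleted, freeing block 683
file_blocks.append(disk.allocate())  # the growing file picks up the freed block
print(file_blocks)  # [3018, 3020, 3022, 5813, 5815, 683]
```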

Block Indexing Using a File Allocation Table

How do we locate a particular block of a disk? File systems use an indexing scheme. MS-DOS and earlier Windows systems used a file allocation table (FAT).
For every disk block in the file system, the next block's location is stored in the table under the current block number. That is, block i's successor location is stored at location i.
The FAT is loaded from disk into main memory at the time the file system is mounted (e.g., at system initialization time).

In the figure, a partial listing of a FAT is provided. Here, assume a file starts at block 151. Its next block is 153, followed by 156, which is the end of the file (denoted by “EOF”).
To find the file’s third block, the operating system will examine the FAT starting at location 151 to find 153 (the file’s second block) and then look at location 153 to find 156, the file’s third block.
Another file might start at block 154. Its second block is at location 732. The entry “Bad” indicates a bad sector that should not be used.
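The lookup described above amounts to following a chain of successors. Below is a minimal sketch that models the FAT as a Python dictionary. The entries for blocks 151, 153, 156, and 154 come from the example; the location of the "Bad" entry (155) and the end-of-file entry for block 732 are assumptions, since the text does not give them.

```python
EOF = "EOF"
BAD = "Bad"

# Partial FAT: the entry stored under block i is the location of i's successor.
fat = {
    151: 153,
    153: 156,
    156: EOF,
    154: 732,
    155: BAD,  # assumed location of the bad sector
    732: EOF,  # assumed: the text does not say what follows block 732
}


def file_blocks(fat, start):
    """Follow the FAT chain from start and return the file's blocks in order."""
    chain = []
    block = start
    while block != EOF:
        if block == BAD or block in chain:
            raise ValueError("corrupt chain at %r" % (block,))
        chain.append(block)
        block = fat[block]
    return chain


def nth_block(fat, start, n):
    """Return the location of the file's n-th block (counting from 1)."""
    return file_blocks(fat, start)[n - 1]


print(file_blocks(fat, 151))   # [151, 153, 156]
print(nth_block(fat, 151, 3))  # 156
```

Note that reaching the n-th block requires walking through the n - 1 entries before it, which is why keeping the table in main memory matters.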

Other Disk File Details

Aside from determining the indexing strategy and the use/reuse of blocks, the file system must also specify a number of other details. These include naming schemes for file entries (files, directories, links). Nowadays it is common for names to permit just about any character, including blank spaces; however, older file systems had limitations such as eight-character names and names consisting only of letters and digits (and perhaps a few punctuation marks such as the hyphen, underscore, and period). Some file systems do not differentiate between uppercase and lowercase characters while others do. Most file systems permit but do not require file name extensions.

File systems also maintain information about the entries, often called metadata. This includes the creation, last modification, and last access date/time, the owner (and group in many cases), and the permissions or access control list. An access control list enumerates, for each user of the system, the permissions granted to that user, which allows finer-grained levels of permission than the Linux user/group/other approach.
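On a Linux system, much of this metadata can be inspected with Python's standard `os.stat` call. The sketch below is illustrative only: the `describe` helper is hypothetical, and it omits creation time because many Linux file systems do not expose it through this interface (`st_ctime` there is the time of the last metadata change).

```python
import os
import stat
import tempfile
import time


def describe(path):
    """Collect a few metadata fields for path using os.stat."""
    info = os.stat(path)
    return {
        "modified": time.ctime(info.st_mtime),  # last modification time
        "accessed": time.ctime(info.st_atime),  # last access time
        "owner_uid": info.st_uid,               # numeric owner ID
        "group_gid": info.st_gid,               # numeric group ID
        "permissions": stat.filemode(info.st_mode),  # e.g. '-rw-------'
    }


# Demonstrate on a temporary file so the example is self-contained.
with tempfile.NamedTemporaryFile() as tmp:
    for field, value in describe(tmp.name).items():
        print(field, value)
```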

Many different file system types have been developed over the years. Many early mainframes had their own unique file systems. Today, operating systems tend to share file systems or provide compatibility so that a given file system can still be accessed by many different types of operating systems. Aside from the previously mentioned FAT, and the NTFS file system used by Windows, some of the more common file systems are the extended file system family (ext, ext2, ext3, ext4, derived originally from the Minix OS file system) used in Linux. NFS (the Network File System) is also available in Linux. Files-11, a descendant of the file system developed for DEC PDP computers and the VAX VMS operating system (and itself a precursor of NTFS), also exists. While this multitude of file systems is available, most Linux systems use the ext family as the default file system type.

Resources

Linux with Operating System Concepts by Richard Fox
isbn:9781482235906, goodreads:20792170
