15 September 2015

inode

When the file system is first established, it comes with a set number of inodes. The inode is a data structure used to store file information. The information that every inode will store consists of
  • The file type
  • The file’s permissions
  • The file’s owner and group
  • The file’s size
  • The inode number
  • Timestamps indicating when the inode was last changed, when the file’s contents were last modified, and when the file was last accessed (some file systems, such as ext4, also record a creation time)
  • A link count (the number of hard links that point at this file)
  • The location of the file (i.e., the device storing the file) and pointers to the individual file blocks (if this is a regular file)
These pointers break down into
  • A set number of pointers that point directly to blocks
  • A set number of pointers that point to indirect blocks; each indirect block contains pointers that point directly to blocks
  • A set number of pointers that point to doubly indirect blocks, which are blocks that have pointers that point to additional indirect blocks
  • A set number of pointers that point to triply indirect blocks, which are blocks that have pointers that point to additional doubly indirect blocks
Typically, an inode will contain 15 pointers broken down as follows:
  • 12 direct pointers
  • 1 indirect pointer
  • 1 doubly indirect pointer
  • 1 triply indirect pointer
An inode is illustrated in Figure (which contains 1 indirect pointer and 1 doubly indirect pointer but no triply indirect pointer because of a lack of space).

Let us look at how a Linux file is accessed through its inode, under a few simplifying assumptions.
First, our Linux inode will store 12 direct pointers, 1 indirect pointer, 1 doubly indirect pointer, and 1 triply indirect pointer. Every block of pointers, whether indirect, doubly indirect, or triply indirect, will store 12 pointers. We will assume that our file consists of 500 blocks, numbered 0 to 499, each block storing 8 KB (a typical disk block stores between 1 KB and 8 KB, depending on the file system).
Our example file then stores 500 * 8 KB = 4000 KB, or roughly 4 MB. Here is the breakdown of how we access the various blocks.
  • Blocks 0–11: direct pointers from the inode.
  • Blocks 12–23: pointers from an indirect block pointed to by the inode’s indirect pointer.
  • For the rest of the file, access is more complicated.
  • Blocks 24–167: we follow the inode’s doubly indirect pointer to a doubly indirect block. This block contains 12 pointers to indirect blocks. Each indirect block contains 12 pointers to disk blocks.
     • The doubly indirect block’s first pointer points to an indirect block of 12 pointers, which point to blocks 24–35.
     • The doubly indirect block’s second pointer points to another indirect block of 12 pointers, which point to blocks 36–47.
     • …
     • The doubly indirect block’s last pointer points to an indirect block of 12 pointers, which point to blocks 156–167.
  • Blocks 168–499: we follow the inode’s triply indirect pointer to a triply indirect block. This block contains 12 pointers to doubly indirect blocks, each of which contains 12 pointers to indirect blocks, each of which contains 12 pointers to disk blocks. From the triply indirect block, we can reach blocks 168 through 499 (with room to grow the file up to block 1895).
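
The breakdown above can be sketched as a small lookup function. This is only an illustration of the toy layout used in this example (12 direct pointers and 12 pointers per pointer block); the names locate, DIRECT, and PER_BLOCK are hypothetical and are not taken from any real file system code.

```python
DIRECT = 12      # direct pointers held in the inode (toy value from the example)
PER_BLOCK = 12   # pointers stored in each indirect block (toy value)


def locate(block):
    """Return the pointer path that leads to file block number `block`.

    The first element names the inode pointer that is followed; the
    remaining elements are the indices used at each level of indirection.
    """
    if block < 0:
        raise ValueError("block numbers start at 0")
    if block < DIRECT:
        return ("direct", block)

    block -= DIRECT
    if block < PER_BLOCK:
        return ("indirect", block)

    block -= PER_BLOCK
    if block < PER_BLOCK ** 2:
        return ("double", block // PER_BLOCK, block % PER_BLOCK)

    block -= PER_BLOCK ** 2
    if block < PER_BLOCK ** 3:
        return (
            "triple",
            block // PER_BLOCK ** 2,           # which doubly indirect block
            (block // PER_BLOCK) % PER_BLOCK,  # which indirect block inside it
            block % PER_BLOCK,                 # which pointer inside that block
        )
    raise ValueError("block is beyond the reach of this inode")


for number in (0, 23, 24, 167, 168, 499, 1895):
    print(number, locate(number))
```

For instance, locate(24) gives ("double", 0, 0), the first pointer of the first indirect block under the doubly indirect block, and locate(1895) gives ("triple", 11, 11, 11), the very last block this inode can reach.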

Earlier, we noted that the disk drive supports random access: to track down any block i, we need a mechanism that locates it directly. The inode pointers just described are that mechanism.
The above example is far from realistic. A disk block used to store an indirect, doubly indirect, or triply indirect block of pointers would itself be 8 KB in size, and a block of that size would store far more than 12 pointers. A pointer is usually 32 or 64 bits long (4 or 8 bytes). If we assume an 8-byte pointer and an 8 KB disk block, then an indirect, doubly indirect, or triply indirect block would store 8 KB / 8 B = 1 K, or 1024, pointers rather than 12.
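
To see what the more realistic figures imply, the arithmetic can be sketched as follows. The function names are hypothetical, and the layout (12 direct pointers plus one pointer at each level of indirection) is the one assumed throughout this example, not a statement about any particular file system.

```python
def pointers_per_block(block_size, pointer_size):
    """How many pointers fit into one block of pointers."""
    return block_size // pointer_size


def max_file_blocks(block_size, pointer_size, direct=12):
    """Blocks reachable through direct, indirect, doubly and triply indirect pointers."""
    n = pointers_per_block(block_size, pointer_size)
    return direct + n + n ** 2 + n ** 3


block_size = 8 * 1024   # 8 KB disk block
pointer_size = 8        # 8-byte pointer

blocks = max_file_blocks(block_size, pointer_size)
print(pointers_per_block(block_size, pointer_size), "pointers per block")
print(blocks, "blocks reachable")
print(round(blocks * block_size / 2 ** 40, 2), "TiB maximum file size")
```

Under these assumptions a single inode can reach a little over a billion blocks, or about 8 TiB, compared with the 1896 blocks of the toy layout.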
When a file system is created, it comes with a set number of inodes. The actual number depends on the size of the file system and the size of a disk block. Typically, there is 1 inode for every 2–8 KB of file system space.
If we have a 1 TB file system (a common size for a hard disk today), one inode per 8 KB works out to 1 TB / 8 KB = 128 M (roughly 134 million) inodes, and a smaller ratio gives even more. The remainder of the file system is made up of disk blocks dedicated to file storage and pointers. Unless nearly all of the files in the file system are very small, the number of inodes should be more than sufficient for any file system usage.
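
As a quick check of the ratio, here is a small sketch that computes how many inodes a 1 TB file system would receive at one inode per 2, 4, or 8 KB. The function name is hypothetical, and 1 TB is taken to mean 2^40 bytes.

```python
KB = 2 ** 10
TB = 2 ** 40


def inode_count(fs_bytes, bytes_per_inode):
    """Number of inodes created when one inode is reserved per `bytes_per_inode`."""
    return fs_bytes // bytes_per_inode


for ratio in (2 * KB, 4 * KB, 8 * KB):
    print(f"1 inode per {ratio // KB} KB -> {inode_count(TB, ratio):,} inodes")
```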


Linux Commands to Inspect inodes and Files

Several tools are available to inspect inodes:
  • stat: provides details on a specific file; the option -c %i displays the file’s inode number
  • ls -i: displays the inode numbers of all entries in the directory
  • df -i: reports the utilization of the file system, partition by partition; with the -i option it reports inode usage rather than block usage
The stat command itself responds with the name of the file, the size of the file, the blocks used to store the file, the device storing the file (specified as a device number), the inode of the file, the number of hard links to the file, the file’s permissions, its UID and GID as both name and number, and the last access, modification, and change date and time for the file. The stat command has many options. The most significant are:
  • -L : follow symbolic links, reporting on the file that a link points to (without this, stat reports on the symbolic link itself)
  • -f : obtain statistics on an entire file system rather than a file
  • -c FORMAT : output only the requested information, where FORMAT uses the characters listed in Table ; when used with -f (file system stats), other formatting characters are available (see second half of Table )
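
The same information is available to programs through the stat system call. The following is a minimal sketch using Python's os.stat and os.statvfs; the helper names describe and inode_usage are hypothetical, and the follow_symlinks argument plays roughly the role of the -L option.

```python
import os
import tempfile


def describe(path, follow_symlinks=True):
    """Collect a few inode fields, similar in spirit to stat -c with a format."""
    st = os.stat(path, follow_symlinks=follow_symlinks)
    return {
        "name": path,
        "inode": st.st_ino,
        "size_bytes": st.st_size,
        "blocks": st.st_blocks,      # allocated blocks as reported by the kernel
        "hard_links": st.st_nlink,
        "uid": st.st_uid,
        "gid": st.st_gid,
        "device": st.st_dev,
    }


def inode_usage(mount_point="/"):
    """Return (total inodes, free inodes) for a file system, like df -i."""
    vfs = os.statvfs(mount_point)
    return vfs.f_files, vfs.f_ffree


with tempfile.NamedTemporaryFile() as tmp:
    tmp.write(b"hello")
    tmp.flush()
    print(describe(tmp.name))

print(inode_usage("/"))
```

Some file systems report 0 for the inode totals because they allocate inodes dynamically, so the second result should be read with that in mind.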

Formatting Characters for -c, Bottom Half for -c -f
Here we use stat to report, for each file, its size in blocks and in bytes, the file name, the inode number, and the time of last access:

$ stat -c "%b %s %n %i %x" *
736 373337 firestarter-events.txt~ 1486 2010-04-28 14:37:10.843398916 +0300
472544 241938432 FreeBSD-10.2-RELEASE-amd64-bootonly.iso 285743 2015-09-10 20:01:12.310727882 +0300
119344 61098928 FreeBSD-10.2-RELEASE-amd64-bootonly.iso.xz 4349 2015-09-10 19:53:37.250723649 +0300

Next, we inspect the entries in /dev, printing the device number, the UID of the owner, the number of hard links, the inode number, the file name, and the file type (%F):

$ stat -c "%d %u %h %i %n %F" /dev/*
5 0 1 4695 /dev/adsp character special file
5 0 1 4699 /dev/audio character special file
5 0 2 1901 /dev/block directory
5 0 2 2342 /dev/bsg directory
5 0 3 1801 /dev/bus directory
5 0 1 4001 /dev/cdrom symbolic link
5 0 2 2001 /dev/char directory
5 0 1 1592 /dev/console character special file
5 0 1 3249 /dev/core symbolic link
5 0 1 1834 /dev/cpu_dma_latency character special file
5 0 5 2377 /dev/disk directory
5 0 1 4698 /dev/dsp character special file
5 0 1 4004 /dev/dvd symbolic link
5 0 1 1585 /dev/ecryptfs character special file
5 0 1 2382 /dev/fb0 character special file
5 0 1 3250 /dev/fd symbolic link
5 0 1 1233 /dev/full character special file
5 0 1 1586 /dev/fuse character special file
5 0 1 2392 /dev/hidraw0 character special file
5 0 1 1662 /dev/hpet character special file
5 0 4 1811 /dev/input directory
5 0 1 1236 /dev/kmsg character special file
5 0 1 4784 /dev/log socket
5 0 1 1716 /dev/loop0 block special file
5 0 1 1719 /dev/loop1 block special file
5 0 1 1722 /dev/loop2 block special file
5 0 1 1725 /dev/loop3 block special file
5 0 1 1728 /dev/loop4 block special file
5 0 1 1731 /dev/loop5 block special file
5 0 1 1734 /dev/loop6 block special file
5 0 1 1737 /dev/loop7 block special file
5 0 2 1819 /dev/mapper directory
5 0 1 1575 /dev/mcelog character special file
5 0 1 1229 /dev/mem character special file
5 0 1 4702 /dev/mixer character special file
5 0 2 1744 /dev/net directory
5 0 1 1835 /dev/network_latency character special file
5 0 1 1836 /dev/network_throughput character special file
5 0 1 1230 /dev/null character special file
5 0 1 5256 /dev/nvidia0 character special file
5 0 1 5255 /dev/nvidiactl character special file
5 0 1 1237 /dev/oldmem character special file
5 0 2 1741 /dev/pktcdvd directory
5 0 1 1231 /dev/port character special file
5 0 1 1743 /dev/ppp character special file
5 0 1 1814 /dev/psaux character special file
5 0 1 1661 /dev/ptmx character special file
12 0 2 1 /dev/pts directory
5 0 1 1668 /dev/ram0 block special file
5 0 1 1671 /dev/ram1 block special file
5 0 1 1698 /dev/ram10 block special file
5 0 1 1701 /dev/ram11 block special file
5 0 1 1704 /dev/ram12 block special file
5 0 1 1707 /dev/ram13 block special file
5 0 1 1710 /dev/ram14 block special file
5 0 1 1713 /dev/ram15 block special file
5 0 1 1674 /dev/ram2 block special file
5 0 1 1677 /dev/ram3 block special file
5 0 1 1680 /dev/ram4 block special file
5 0 1 1683 /dev/ram5 block special file
5 0 1 1686 /dev/ram6 block special file
5 0 1 1689 /dev/ram7 block special file
5 0 1 1692 /dev/ram8 block special file
5 0 1 1695 /dev/ram9 block special file
5 0 1 1234 /dev/random character special file
5 0 1 397 /dev/rfkill character special file
5 0 1 4497 /dev/root symbolic link
5 0 1 2177 /dev/rtc symbolic link
5 0 1 1818 /dev/rtc0 character special file
5 0 1 2374 /dev/scd0 symbolic link
5 0 1 2344 /dev/sda block special file
5 0 1 2345 /dev/sda1 block special file
5 0 1 2346 /dev/sda2 block special file
5 0 1 2347 /dev/sda3 block special file
5 0 1 2348 /dev/sda4 block special file
5 0 1 2349 /dev/sda5 block special file
5 0 1 2350 /dev/sda6 block special file
5 0 1 2351 /dev/sda7 block special file
5 0 1 2352 /dev/sda8 block special file
5 0 1 2353 /dev/sda9 block special file
5 0 1 4371 /dev/sequencer character special file
5 0 1 4375 /dev/sequencer2 character special file
5 0 1 2341 /dev/sg0 character special file
5 0 1 2364 /dev/sg1 character special file
16 0 2 3261 /dev/shm directory
5 0 1 1576 /dev/snapshot character special file
5 0 3 4314 /dev/snd directory
5 0 1 3252 /dev/sndstat symbolic link
5 0 1 2361 /dev/sr0 block special file
5 0 1 3253 /dev/stderr symbolic link
5 0 1 3254 /dev/stdin symbolic link
5 0 1 3255 /dev/stdout symbolic link
5 0 1 1591 /dev/tty character special file
5 0 1 1593 /dev/tty0 character special file
5 0 1 1598 /dev/tty1 character special file
5 0 1 1607 /dev/tty10 character special file
5 0 1 1608 /dev/tty11 character special file
5 0 1 1609 /dev/tty12 character special file
5 0 1 1610 /dev/tty13 character special file
5 0 1 1611 /dev/tty14 character special file
5 0 1 1612 /dev/tty15 character special file
5 0 1 1613 /dev/tty16 character special file
5 0 1 1614 /dev/tty17 character special file
5 0 1 1615 /dev/tty18 character special file
5 0 1 1616 /dev/tty19 character special file
5 0 1 1599 /dev/tty2 character special file
5 0 1 1617 /dev/tty20 character special file
5 0 1 1618 /dev/tty21 character special file
5 0 1 1619 /dev/tty22 character special file
5 0 1 1620 /dev/tty23 character special file
5 0 1 1621 /dev/tty24 character special file
5 0 1 1622 /dev/tty25 character special file
5 0 1 1623 /dev/tty26 character special file
5 0 1 1624 /dev/tty27 character special file
5 0 1 1625 /dev/tty28 character special file
5 0 1 1626 /dev/tty29 character special file
5 0 1 1600 /dev/tty3 character special file
5 0 1 1627 /dev/tty30 character special file
5 0 1 1628 /dev/tty31 character special file
5 0 1 1629 /dev/tty32 character special file
5 0 1 1630 /dev/tty33 character special file
5 0 1 1631 /dev/tty34 character special file
5 0 1 1632 /dev/tty35 character special file
5 0 1 1633 /dev/tty36 character special file
5 0 1 1634 /dev/tty37 character special file
5 0 1 1635 /dev/tty38 character special file
5 0 1 1636 /dev/tty39 character special file
5 0 1 1601 /dev/tty4 character special file
5 0 1 1637 /dev/tty40 character special file
5 0 1 1638 /dev/tty41 character special file
5 0 1 1639 /dev/tty42 character special file
5 0 1 1640 /dev/tty43 character special file
5 0 1 1641 /dev/tty44 character special file
5 0 1 1642 /dev/tty45 character special file
5 0 1 1643 /dev/tty46 character special file
5 0 1 1644 /dev/tty47 character special file
5 0 1 1645 /dev/tty48 character special file
5 0 1 1646 /dev/tty49 character special file
5 0 1 1602 /dev/tty5 character special file
5 0 1 1647 /dev/tty50 character special file
5 0 1 1648 /dev/tty51 character special file
5 0 1 1649 /dev/tty52 character special file
5 0 1 1650 /dev/tty53 character special file
5 0 1 1651 /dev/tty54 character special file
5 0 1 1652 /dev/tty55 character special file
5 0 1 1653 /dev/tty56 character special file
5 0 1 1654 /dev/tty57 character special file
5 0 1 1655 /dev/tty58 character special file
5 0 1 1656 /dev/tty59 character special file
5 0 1 1603 /dev/tty6 character special file
5 0 1 1657 /dev/tty60 character special file
5 0 1 1658 /dev/tty61 character special file
5 0 1 1659 /dev/tty62 character special file
5 0 1 1660 /dev/tty63 character special file
5 0 1 1604 /dev/tty7 character special file
5 0 1 1605 /dev/tty8 character special file
5 0 1 1606 /dev/tty9 character special file
5 0 1 1667 /dev/ttyS0 character special file
5 0 1 1664 /dev/ttyS1 character special file
5 0 1 1665 /dev/ttyS2 character special file
5 0 1 1666 /dev/ttyS3 character special file
5 0 1 1235 /dev/urandom character special file
5 0 1 1749 /dev/usbmon0 character special file
5 0 1 1753 /dev/usbmon1 character special file
5 0 1 1808 /dev/usbmon2 character special file
5 0 1 7508 /dev/vboxdrv character special file
5 0 1 7512 /dev/vboxdrvu character special file
5 0 1 7561 /dev/vboxnetctl character special file
5 0 2 7754 /dev/vboxusb directory
5 0 1 1594 /dev/vcs character special file
5 0 1 1596 /dev/vcs1 character special file
5 0 1 2609 /dev/vcs2 character special file
5 0 1 2618 /dev/vcs3 character special file
5 0 1 2627 /dev/vcs4 character special file
5 0 1 2636 /dev/vcs5 character special file
5 0 1 2645 /dev/vcs6 character special file
5 0 1 2662 /dev/vcs7 character special file
5 0 1 1595 /dev/vcsa character special file
5 0 1 1597 /dev/vcsa1 character special file
5 0 1 2610 /dev/vcsa2 character special file
5 0 1 2619 /dev/vcsa3 character special file
5 0 1 2628 /dev/vcsa4 character special file
5 0 1 2637 /dev/vcsa5 character special file
5 0 1 2646 /dev/vcsa6 character special file
5 0 1 2663 /dev/vcsa7 character special file
5 0 1 387 /dev/vga_arbiter character special file
5 0 1 1232 /dev/zero character special file

Some observations:
All entries except pts (device 12) and shm (device 16) are located on device number 5.
All are owned by user 0 (root).
Most of the items have only one hard link, found in /dev. Directories such as input and pts have more than one hard link. The inode numbers vary from 1 to 7754.
The file type demonstrates that “files” can make up a wide variety of entities, from block or character files (devices) to symbolic links to directories to domain sockets. This last field varies in length from one word (directory) to three words (character special file, block special file).
Each new file is given the next available inode. As a file system is used, newer files tend to have higher inode numbers, although deleting a file returns its inode for reuse.

Whenever any file is used in Linux, it must first be opened. The opening of a file requires a special designator known as the file descriptor. The file descriptor is an integer assigned to the file while it is open. In Linux, three file descriptors are always made available:
  • 0  stdin
  • 1  stdout
  • 2  stderr
Any other file used during the execution of a Linux command or program must be opened and given a file descriptor of its own.

When a file is to be opened, the operating system kernel gets involved.
  1. First, it determines if the user has adequate access rights to the file.
  2.  If so, it then generates a file descriptor. 
  3. It then creates an entry in the system’s file table, a data structure that stores file pointers for every open file. The location of this pointer in the file table is equal to the file descriptor generated. For instance, if the file is given the descriptor 185, then the file’s pointer will be the 185th entry in the file table. 
  4. The pointer itself will point to an inode for the given file.
As devices are treated as files, file descriptors also exist for open devices such as the keyboard, terminal windows, the monitor, the network interface(s), and the disk drives, as well as for open files.
You can view the file descriptors of a given process by looking at the fd subdirectory of the process’ entry in the /proc directory (e.g., /proc/16531/fd). There will always be entries labeled 0, 1, and 2 for STDIN, STDOUT, and STDERR, respectively. Other devices and files in use will require additional entries. Alternatively, the lsof command will list any open files.
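
The relationship between a file descriptor and an inode can be observed from a program. The sketch below, with a hypothetical helper named open_and_inspect, opens a file, confirms that the descriptor leads to the same inode as the pathname, and lists the descriptors of the current process from /proc/self/fd (assuming a Linux system where /proc is mounted; otherwise the list is left empty).

```python
import os
import tempfile


def open_and_inspect(path):
    """Open `path` and return (descriptor, inode matches, open descriptors)."""
    fd = os.open(path, os.O_RDONLY)
    try:
        # fstat works on the descriptor, stat on the pathname; both reach one inode.
        same_inode = os.fstat(fd).st_ino == os.stat(path).st_ino
        proc_fd_dir = "/proc/self/fd"
        if os.path.isdir(proc_fd_dir):
            open_fds = sorted(int(name) for name in os.listdir(proc_fd_dir))
        else:
            open_fds = []
        return fd, same_inode, open_fds
    finally:
        os.close(fd)


with tempfile.NamedTemporaryFile() as tmp:
    print(open_and_inspect(tmp.name))
```

When the process has stdin, stdout, and stderr open as usual, the new descriptor is 3 or higher, because the kernel hands out the lowest unused number.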

FILES

In the Linux operating system, everything is treated as a file except for the process.
What does this mean? Among other things, Linux file commands can be issued on entities that are not traditional files. The entities treated like files include
  • directories, 
  • physical devices,
  • named pipes, 
  • file system links.
Aside from physical devices, there are also some special-purpose programs that are treated like files (for instance, a random number generator).


Files versus Directories

A directory is a named entity that contains files and sub-directories (or devices, links, etc.). The directory offers users the ability to organize their files in some reasonable manner, giving the file space a hierarchical structure.
Directories can be created just about anywhere in the file system and can contain just about anything from empty directories to directories that themselves contain directories.

The directory differs from the file in a few significant ways. 
  1. We expect directories to be executable. Without that permission, no one (including the owner) can cd into the directory.
  2. The directory does not store content like a file; instead it merely stores other items. That is, whereas the file ultimately is a collection of blocks of data, the directory contains a list of pointers to files.
  3. There are some commands that operate on directories and not files (e.g., cd, pwd, mkdir) and some commands that operate on files but not directories (e.g., wc, diff, less, more). We do find that most Linux file commands will operate on directories themselves, including for instance cp, mv, and rm (using the recursive version), and wildcards apply to both files and directories.
  

Nonfile File Types

Many devices are treated as files in Linux. These devices are listed under the /dev directory. We categorize these devices into two subcategories: 
  • character devices: Character devices input or output streams of characters one character at a time. Examples are the keyboard, the mouse, a terminal (as in terminal window), and serial devices such as older modems and printers.
  • block devices: Block devices communicate via blocks of data. The term “block” is traditionally applied to disk drives, where the files are broken into fixed-sized blocks. However, here, block is applied to any device that communicates by transmitting chunks of data at a time (as opposed to the previously mentioned character type).

Aside from the quantity of data movement, another differentiating characteristic between character and block devices is how input and output are handled. 
  • For a character device, a program executing a file command must wait until the character is transferred before resuming. 
  • For a block device, blocks are buffered in memory so that the program can continue once the instruction has been issued. Further, as blocks are only portions of entire files, it is typically the case that a file command can request one portion of a file. This is often known as random access. The idea is that we do not have to request block 1 before obtaining block 2. Having to read blocks in order is known as sequential access. But in random access, we can obtain any block desired and it should take no longer to access block j than block i.
Another type of file construct is the domain socket (or local socket). This is not to be confused with a network socket.
The domain socket is used to open communication between two local processes. This permits interprocess communication (IPC) so that the two processes can share data.
We might, for instance, want to use IPC when one process is producing data that another process is to consume. This would be the case when some application software is going to print a file. The application software produces the data to be printed, and the printer’s device driver consumes the data.
The IPC is also used to create a rendezvous between two processes where process B must wait for some event from process A.

There are several distinctions between a network and domain socket. 
  • The network socket is not treated as a file (although the network itself is a device that can interact via file system commands) while the domain socket is. 
  • The network socket is created by the operating system to maintain communication with a remote computer while domain sockets are created by users or running software. Network sockets provide communication lines between computers rather than between processes.
Yet another type of file entity is the named pipe. The named pipe  differs from the pipe in that it exists beyond the single usage that occurs when we place a pipe between two Linux commands.
To create a named pipe, you define it through the mkfifo operation. The expression FIFO is short for “first-in-first-out.” FIFO is often used to describe a queue (waiting line) as queues are generally serviced in a first-in, first-out manner. In this case, mkfifo creates a FIFO, or a named pipe. Once the pipe exists, you can assign it to be used between any two processes.
Unlike an ordinary pipe that must be used between two Linux processes in a single command, the named pipe can be used in separate instructions.
Let us examine the usage of a named pipe. First, we define our pipe:

$ mkfifo a_pipe

This creates a file entity called a_pipe. As with any file or directory, a_pipe has permissions, user and group owner, creation/modification date, and a size (of 0). Now that the pipe exists, we might use the pipe in some operation:

$ ps aux > a_pipe

Unlike performing ps aux, or even ps aux | more, this instruction does not seem to do anything when executed. In fact, our terminal window seems to hang: there is no output, but neither is the prompt returned to us. What we have done is open one end of the pipe (for writing). Until the other end of the pipe is also open, there is nowhere for the ps aux instruction’s output to “flow.”
To open the other end of the pipe, we might apply an operation (in a different terminal window, since we do not have a prompt in the original window) like:

$ cat a_pipe

Now, the contents “flow” from the ps aux command through the pipe to the cat command. The output appears in the second terminal window and when done, the command line prompt returns in the original window.
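
The same two-ended behavior can be reproduced inside a single program. The following sketch, built around a hypothetical function named fifo_roundtrip, creates a named pipe in a temporary directory, writes to it from one thread, and reads from it in another, so the blocking open of each end is released by the other. It assumes a Unix-like system where os.mkfifo is available.

```python
import os
import stat
import tempfile
import threading


def fifo_roundtrip(message):
    """Send `message` (bytes) through a freshly created named pipe and return what arrives."""
    with tempfile.TemporaryDirectory() as tmp:
        fifo = os.path.join(tmp, "a_pipe")
        os.mkfifo(fifo, mode=0o600)
        assert stat.S_ISFIFO(os.stat(fifo).st_mode)

        def writer():
            # Opening for writing blocks until some process opens the read end.
            with open(fifo, "wb") as w:
                w.write(message)

        thread = threading.Thread(target=writer)
        thread.start()
        # Opening for reading releases the writer; read() returns at end of data.
        with open(fifo, "rb") as r:
            data = r.read()
        thread.join()
        return data


print(fifo_roundtrip(b"output of some command\n"))
```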

You might ask: why use a named pipe? Here the pipe is being used much like an ordinary pipe. Additionally, the named pipe does roughly the same thing as a domain socket: it is a go-between for IPC. There are, however, differences between the named pipe and the ordinary pipe.
The named pipe remains in existence, so we can call upon it numerous times. Notice that the source program is immaterial: we can use a_pipe no matter what the source program is.
Additionally, the mkfifo instruction lets us set the pipe’s permissions when it is created. This is done using the option -m mode, where mode is a set of permissions such as -m 600 or -m u=rwx,g=r,o=r.
The difference between the named pipe and the domain socket is a little more obscure.
The named pipe carries a one-way stream of bytes from writer to reader. The domain socket is bidirectional, and it can also be created in a datagram mode that delivers data as discrete messages rather than as a byte stream.


Links as File Types

The link is a file type. There are two forms of links:
  • hard links: A hard link is stored in a directory to represent a file. It stores the file’s name and the inode number. Creating a new hard link adds another directory entry, in the same or a different directory, that carries the same inode number as the original.
  • soft (or symbolic) links (or symlinks): The symbolic link instead merely stores a pointer (a pathname) to the original hard link.
The difference between the two types of links is subtle but important. If you were to create a symbolic link and then attempt to access a file through the symbolic link rather than the original link, you are causing an extra level of indirect access.
The operating system must first access the symbolic link, which is a pointer. The pointer then provides access to the original file link. This file link then provides access to the file’s inode, which then provides access to the file’s disk blocks.
The drawbacks of hard links are:
  1. Hard links cannot link together files that exist on separate partitions.
  2. Hard links can only link together files, whereas symbolic links can link directories and other file system entities together.
On the positive side, hard links are always up to date. Every hard link refers directly to the file’s inode, so moving or renaming the original leaves all of the other hard links valid. If you delete or move a file that is the target of a symbolic link, the file’s (hard) link is modified but the symbolic link is not; thus you may be left with an out-of-date symbolic link. This can lead to errors at a later time.

In either case, a link is used so that you can refer to a file that is stored in some other location than the current directory. This can be useful when you do not want to add the file’s location to your PATH variable.
For instance, imagine that user zappaf has created a program called my_program, which is stored in ~zappaf. You are in your home directory and want to run the program (and zappaf was nice enough to set its permissions to 755). Rather than adding /home/zappaf to your PATH or using an absolute pathname, you create a symbolic link from your home directory to ~zappaf/my_program. Now you can run my_program from your home directory (as ./my_program, unless the current directory is on your PATH).
You can determine the number of hard links that exist for a single file when you perform an ls -l. The integer value after the permissions is the number of hard links.

$ ls -l
drwxr-xr-x  5 harrykar harrykar    4096 2011-05-05 23:31 perl5
This number will never be less than 1 because with no hard links, the file will not exist. However, the number could be far larger than 1. Deleting any of the hard links will reduce this number. If the number becomes 0, then the file’s inode is returned to the file system for reuse, and thus access to the file is lost with its disk space available for reuse.
If you have a symbolic link in a directory, you will be able to note this by its type and name when viewing the results of an ls -l command. First, the file type is indicated by an ‘l’ (which stands for link), and the file name field contains the symbolic link’s name, an arrow (->), and the location of the file being linked.

$ ls -l
lrwxrwxrwx  1 harrykar harrykar      22 2012-04-16 16:17 squeak -> /home/harrykar/.squeak
Unfortunately, unlike the hard link usage, if you were to use ls -l on the original file, you will not see any indication that it is linked to by symbolic links.
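
The behavior of the two kinds of link can be checked with a short experiment. This is a sketch only, run in a temporary directory with hypothetical file names; it creates one hard link and one symbolic link to the same file, then deletes the original name and looks at what is left.

```python
import os
import tempfile


def link_demo():
    """Compare a hard link and a symbolic link before and after the original is deleted."""
    with tempfile.TemporaryDirectory() as tmp:
        original = os.path.join(tmp, "original.txt")
        hard = os.path.join(tmp, "hard.txt")
        soft = os.path.join(tmp, "soft.txt")

        with open(original, "w") as f:
            f.write("data")
        os.link(original, hard)      # second directory entry for the same inode
        os.symlink(original, soft)   # separate small file holding a pathname

        result = {
            "same_inode": os.stat(original).st_ino == os.stat(hard).st_ino,
            "link_count": os.stat(original).st_nlink,
            "symlink_has_own_inode": os.lstat(soft).st_ino != os.stat(original).st_ino,
        }

        os.remove(original)          # removes one name; the inode survives via `hard`
        with open(hard) as f:
            result["hard_still_readable"] = f.read() == "data"
        result["link_count_after"] = os.stat(hard).st_nlink
        # The symbolic link still exists but now points at a missing name.
        result["symlink_dangling"] = os.path.islink(soft) and not os.path.exists(soft)
        return result


print(link_demo())
```

The link count drops from 2 to 1 when the original name is removed, the data remains reachable through the hard link, and the symbolic link is left out of date.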


Collectively, all of the special types of entities are treated like files in the following ways:
  • Each item is listed when you do an ls.
  • Each item can be operated upon by file commands such as mv, cp, rm and we can apply redirection operators on them.
  • Each item is represented in the directory by means of an inode.
You can determine a file’s type by using ls -l (long listing). The first character of the 10-character permissions is the file’s type. In Linux, the seven types are denoted by the characters in Table .

file type identifiers in ls -l
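
The type character can also be obtained programmatically. As a small sketch, Python's stat.filemode builds the same 10-character string that ls -l prints, so its first character is the type: - for a regular file, d for a directory, l for a symbolic link, c and b for character and block devices, p for a named pipe, and s for a socket. The helper name type_char and the file names are hypothetical.

```python
import os
import stat
import tempfile


def type_char(path):
    """Return the file-type character that ls -l would show for `path`."""
    # lstat is used so that a symbolic link is reported as a link, not as its target.
    return stat.filemode(os.lstat(path).st_mode)[0]


with tempfile.TemporaryDirectory() as tmp:
    regular = os.path.join(tmp, "regular")
    link = os.path.join(tmp, "link")
    pipe = os.path.join(tmp, "pipe")
    open(regular, "w").close()
    os.symlink(regular, link)
    os.mkfifo(pipe)
    for path in (tmp, regular, link, pipe):
        print(type_char(path), path)
```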


Every file (no matter the type, e.g., regular file, character type, block type, named pipe) is stored in a directory. The directory maintains the entities stored in it through a list. The listing is a collection of hard and soft links. A hard link of a file stores the file’s name and the inode number dedicated to that file. The symbolic link is a pointer to a hard link stored elsewhere.

As the user modifies the contents of the directory, this list is modified. New files require new hard links pointing to newly allocated inodes. The deletion of a file causes the hard link to be removed and the file’s link count to be decremented. The inode itself remains allocated to the given file unless the hard link count becomes 0.




Resources


Linux with Operating System Concepts by Richard Fox
isbn:9781482235906, goodreads:20792170

File Space



STORAGE ACCESS


A collection of storage devices presents a file space. This file space exists at two levels:
  • a logical level defined by partitions, directories, and files, and 
  • a physical level defined by file systems, disk blocks, and pointers.
The users and system administrators primarily view the file space at the logical level. The physical level is one that the operating system handles for us based on our commands (requests).

The devices (disk, USB, optical, and solid-state drives, etc.) that make up the file space collectively provide us with storage access, where we store executable programs and data files.
We generally perform one of two operations on this storage space:
  • we read from a file (load the file into main memory, or input from file)
  • and we write to a file (store/save information from main memory to a file, or output to file).

Disk Storage and Blocks

We focus on hard disk storage, as it is the most common medium for a file system, although some of the concepts apply to other forms of storage as well.
To store a file on disk, the file is decomposed into fixed-sized units called blocks. Figure illustrates a small file (six blocks) and the physical locations of those blocks. Notice that the last block may not fill up the entire disk block space, so it leaves behind a small fragment.
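
The number of blocks a file needs, and the size of the unused fragment in its last block, follow from a ceiling division. The sketch below uses hypothetical function names and assumes a 4 KB block size. As a cross-check, the 373337-byte file in the stat listing earlier reported 736 in its %b column; if those are 512-byte units (an assumption, though the usual convention on Linux), that is 376832 bytes, or exactly 92 blocks of 4 KB.

```python
def blocks_needed(size_bytes, block_size):
    """Number of fixed-size blocks required to hold a file (ceiling division)."""
    return -(-size_bytes // block_size)


def unused_tail(size_bytes, block_size):
    """Bytes left unused in the final block of the file."""
    return blocks_needed(size_bytes, block_size) * block_size - size_bytes


size = 373337   # file size in bytes, taken from the stat listing
block = 4096    # assumed block size

print(blocks_needed(size, block), "blocks")
print(unused_tail(size, block), "bytes unused in the last block")
```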

The operating system must be able to manage this distribution of files to blocks in three ways:
  1. Given a file and block, the operating system must map that block number into a physical location on some disk surface.
  2. The operating system must be able to direct the disk drive to access that particular block through a movement of both the disk and the drive’s read/write head.
  3. The operating system must be able to maintain free file space (available blocks), including the return of file blocks once a file has been deleted.
All of these operations are hidden from the user and system administrator.

Let us consider how a disk file might be broken into blocks. 
The files are distributed across all of the disk’s surfaces (a hard disk drive will contain multiple disk platters and each platter has two surfaces, a top and bottom). Given a new file to store, the file is broken into blocks.
The first block is placed at the first available free block on disk.
Where should the next disk block be placed? If the next block after the first is available, we could place the block there, giving us two blocks of contiguous storage. This may or may not be desirable. The disk drive spins the disks very rapidly. If we want to read two blocks, we read the first block and transfer it into a buffer in the disk drive. Then, that data are transferred to memory. However, during that transfer, the disk  continues to spin.
When we are ready to read the second disk block, it is likely that this block has spun past the read/write head and now the disk drive must wait to finish a full disk revolution before reading again. Distributing disk blocks so that they are not contiguous will get around this problem.
In the figure, you will see that the first three disk blocks are located near each other but are not contiguous. Instead, the first block lies at location 3018, the second at 3020, and the third at 3022.
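The effect of spacing blocks out can be estimated with a toy timing model. In the sketch below, time is counted in "slot times" (the time for one block to pass under the head), all blocks are assumed to sit on the same track of 16 slots, and transferring a block out of the drive buffer is assumed to take one slot time. These numbers and the function name `total_read_time` are illustrative assumptions, not measurements.

```python
def total_read_time(slots, track_size: int, transfer_time: int) -> int:
    """Slot times needed to read the given blocks in order on one track.

    At time t the head is over slot (t mod track_size). Reading a block
    takes 1 slot time, after which the drive is busy for `transfer_time`
    slot times while the disk keeps spinning.
    """
    clock = 0
    for slot in slots:
        # Wait until the wanted slot rotates under the head.
        wait = (slot - clock) % track_size
        clock += wait + 1 + transfer_time
    return clock


if __name__ == "__main__":
    contiguous = [3018, 3019, 3020]
    spaced_out = [3018, 3020, 3022]  # the layout described in the text
    print("contiguous:", total_read_time(contiguous, 16, 1))
    print("spaced out:", total_read_time(spaced_out, 16, 1))
```

Under these assumptions the contiguous layout pays almost a full revolution for every block after the first, while the spaced-out layout finds each next block arriving just as the transfer finishes.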

Whether initial blocks are contiguous or distributed, we will find that further disk blocks may have to be placed elsewhere because we have reached the end of the available disk blocks in this locality. With the deletion of files and saving of other files, we will eventually find disk blocks of one file scattered around the disk surfaces. This may lead to some inefficiency in access in that we have to move from one location of the disk to another to read consecutive blocks and so the seek time and rotational latency are lengthened.
Returning to the figure, we might assume that the next available block after 3022 is at 5813, so as the file continues to grow, its next block lies at 5813, followed by 5815. As the next block, according to the figure, lies at 683, we might surmise that 683 was picked up as free space because a file was deleted.
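One way such a scattered layout could arise is sketched below with a tiny free-space manager. The class `FreeSpace`, its "next free block after the last one, else wrap around" policy, and the initial set of free blocks are all assumptions made for illustration; real file systems use their own allocation policies.

```python
import bisect


class FreeSpace:
    """Toy free-block list kept in sorted order."""

    def __init__(self, blocks):
        self._free = sorted(set(blocks))

    def allocate(self, after: int = -1) -> int:
        """Take the first free block numbered above `after`, wrapping if none."""
        if not self._free:
            raise RuntimeError("no free blocks left")
        i = bisect.bisect_right(self._free, after)
        if i == len(self._free):
            i = 0  # nothing further along the disk: reuse the lowest free block
        return self._free.pop(i)

    def release(self, blocks) -> None:
        """Return the blocks of a deleted file to the free list."""
        for block in blocks:
            i = bisect.bisect_left(self._free, block)
            if i == len(self._free) or self._free[i] != block:
                self._free.insert(i, block)


def demo() -> list:
    # Assumed starting state: only these five blocks are free.
    space = FreeSpace([3018, 3020, 3022, 5813, 5815])
    file_blocks, last = [], -1
    for _ in range(5):
        last = space.allocate(after=last)
        file_blocks.append(last)
    space.release([683])  # another file is deleted, freeing block 683
    file_blocks.append(space.allocate(after=last))
    return file_blocks


if __name__ == "__main__":
    print(demo())
```

The file ends up with its blocks in the order described above, with the last block far from the others because it came from recycled space.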


Block Indexing Using a File Allocation Table

How do we locate a particular block of a disk? File systems use an indexing scheme. MS-DOS and earlier Windows systems used a file allocation table (FAT).
For every disk block in the file system, the next block’s location is stored in the table under the current block number. That is, block i’s successor location is stored at location i.
The FAT is loaded from disk into main memory at the time the file system is mounted (e.g., at system initialization time). 
The figure provides a partial listing of a FAT. Here, assume a file starts at block 151. Its next block is 153, followed by 156, which is the end of the file (denoted by “EOF”).
To find the file’s third block, the operating system will examine the FAT starting at location 151 to find 153 (the file’s second block) and then look at location 153 to find 156, the file’s third block.
Another file might start at block 154. Its second block is at location 732. The entry “Bad” indicates a bad sector that should not be used.
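The lookup just described amounts to following a chain of table entries. Below is a minimal sketch using a Python dictionary as the table. The entries for blocks 151, 153, 156 and 154 come from the example above; the entry for 732 and the position of the "Bad" entry (152) are assumptions added so the sketch is complete, and the function names are hypothetical.

```python
EOF = "EOF"
BAD = "Bad"

# Partial FAT: entry i holds the location of the block that follows block i.
FAT = {
    151: 153,
    152: BAD,   # assumed position of a bad sector
    153: 156,
    154: 732,
    156: EOF,
    732: EOF,   # assumed: the second file ends here
}


def file_blocks(fat: dict, start: int) -> list:
    """Follow the chain from `start` and return the file's blocks in order."""
    blocks, seen, current = [], set(), start
    while current != EOF:
        entry = fat.get(current)
        if entry is None or entry == BAD:
            raise ValueError(f"chain reaches unusable block {current}")
        if current in seen:
            raise ValueError(f"chain loops back to block {current}")
        seen.add(current)
        blocks.append(current)
        current = entry
    return blocks


def nth_block(fat: dict, start: int, n: int) -> int:
    """Return the location of the file's n-th block (n = 1 is the first)."""
    return file_blocks(fat, start)[n - 1]


if __name__ == "__main__":
    print(file_blocks(FAT, 151))
    print(nth_block(FAT, 151, 3))
```

Note that reaching the n-th block requires walking through all earlier entries, which is one reason the table is kept in main memory.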

Other Disk File Details

Aside from determining the indexing strategy and the use/reuse of blocks, the file system must also specify a number of other details. These include naming schemes for file entries (files, directories, links). Nowadays it is common for names to permit just about any character, including blank spaces; however, older file systems had limitations such as eight-character names and names consisting only of letters and digits (and perhaps a few punctuation marks such as the hyphen, underscore, and period). Some file systems do not differentiate between uppercase and lowercase characters while others do. Most file systems permit but do not require file name extensions.

File systems will also maintain information about the entries, often called metadata. This will include the creation, last modification and last access date/time, owner (and group in many cases), and permissions or access control list. The access control list enumerates, for each user of the system, the permissions granted to that user, allowing finer-grained levels of permission than the Linux user/group/other approach.
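On a Linux system, much of this metadata can be inspected from a program. The sketch below uses Python's standard `os.stat` call on a temporary file; the helper names `describe` and `demo` are made up for this example, and the exact values printed will differ from system to system.

```python
import os
import stat
import tempfile
import time


def describe(path: str) -> dict:
    """Collect a few pieces of metadata the file system keeps for `path`."""
    info = os.stat(path)
    if stat.S_ISDIR(info.st_mode):
        kind = "directory"
    elif stat.S_ISREG(info.st_mode):
        kind = "regular file"
    else:
        kind = "other"
    return {
        "type": kind,
        "permissions": stat.filemode(info.st_mode),  # e.g. "-rw-------"
        "owner_uid": info.st_uid,
        "group_gid": info.st_gid,
        "size_bytes": info.st_size,
        "modified": time.ctime(info.st_mtime),
        "accessed": time.ctime(info.st_atime),
    }


def demo() -> dict:
    # Create a small throwaway file so the example does not depend on
    # any particular file existing on the reader's system.
    with tempfile.NamedTemporaryFile(delete=False) as handle:
        handle.write(b"hello")
        path = handle.name
    try:
        return describe(path)
    finally:
        os.remove(path)


if __name__ == "__main__":
    for key, value in demo().items():
        print(f"{key}: {value}")
```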

Many different file system types have been developed over the years. Many early mainframes had their own, unique file systems. Today, operating systems tend to share file systems or provide compatibility so that a given file system can still be accessed by many different types of operating systems. Aside from the previously mentioned FAT and NTFS file systems, some of the more common file systems are the extended file system family (ext, ext2, ext3, ext4, derived originally from the Minix OS file system) used in Linux. NFS (the network file system) is also available in Linux. Files-11, a descendant of the file system developed for DEC PDP computers and used by the VAX VMS operating system (and itself a precursor of NTFS), is also available. While this multitude of file systems is available, most Linux systems primarily use the ext family as the default file system type.



Resources


Linux with Operating System Concepts by Richard Fox
isbn:9781482235906, goodreads:20792170