File systems are a collection of data structures that use block stores for data persistence. File systems add features to the data, including ways of labeling the data for retrieval, protection and ownership mechanisms, and mechanisms for recovery of data from unfortunate loss or misuse.
One of the most popular file system is the FAT file system originally designed by Bill Gates and Marc McDonald in 1977 for use in MS-DOS. FAT has evolved over the years, with versions named by the bit width of the indexing tables: 8 bit FAT, FAT12, FAT16 and FAT32. The file system is used by many devices now, and is such a common file system that it is useful for detachable media such as thumb drives, which will be accessed by a wide variety of operating systems.
All file systems have three categories of information: the superblock, metadata, and content. The superblock is a fixed sized data structure that helps the operating system identify the file system, know its parameters, and find the initial data structures on the block device. The metadata are all the data structures that organize the files and directories on the block device. The content is the pool of data blocks used to store data, other than metadata.
Many file systems have a variety of metadata. FAT as a single metadata structure: the File Allocation Table, FAT. It is remarkable that all organizing needs are satisfied by a single structure.
A block device is an indexed sequence of logical blocks. FAT works instead in clusters. A cluster could be one block, or multiple consecutive blocks subject to an alignment constraint. The FAT table is indexed by cluster index and contains as content a cluster index. The table serves several functions:
- The sequence of clusters used to store the data of a given file are arranged in a linked list. The FAT entries provide an array-implementation of those linked lists. For this, a particular value of FAT entry is stolen from the range of valid cluster addresses to mark the end of a linked list.
- The FAT table tracks which clusters are free and which are in use storing data. A FAT entry containing the value 0 indicates that the cluster with address the same as that index is free; any other value indicates that the cluster of same address as that index is claimed.
- Other content values mark that the cluster with address same as the index is bad, and should not be use.
The FAT file system is laid on on the block device such that the superblock begins a cluster 0 and continues on in consecutive clusters as needed; followed by the FAT table in a consecutive sequence of clusters as needed to accommodate FAT table size; followed by a copy of the FAT table; followed by the data content region. The clusters in the data content region are numbered starting with index 2. This allows for indices 0 (the free cluster mark) and 1 (no meaning) to be reserved.
The data content region contains but files and directories. The original FAT versions only supported a single root directory containing a fixed number of files. The location of this directory was fixed as the start of the data content region and size, in directory entries, was recorded in the superblock. Later FAT versions supported subdirectories and allowed for a variably sized root directory. The superblock records the starting cluster for the root directory in this case.
The data for files and directories are stored in a sequence of clusters. The sequencing of the clusters is recorded in the FAT table by entering the address of the next cluster in the sequence at the index of the previous cluster in the sequence.
For instance, suppose a file is of size not larger than three times the cluster size. To store the data, three clusters assigned to the file, and are arranged in an order, say cluster 17, then cluster 28, then cluster 21. The bytes in the file are then in correspondence with the bytes in the clusters, the first bytes in cluster 17 and the last bytes in cluster 21. The cluster numbers and their ordering is recorded in the FAT table by setting FAT entry 17 to value 28, FAT entry 28 to value 21, and FAT entry 21 to a special end-of-chain value, usually -1.
17 21 28 ----+----+----+----+----+----+----+----+----+----+----+----+----+----+----+----+---- | | | 28 | | | | -1 | | | | | | | 21 | | ----+----+----+----+----+----+----+----+----+----+----+----+----+----+----+----+----
The file or directory data can be represented by the index of the head of the cluster chain, in this case the number 17.
Files can be arbitrarily sized, so the length of the cluster chain need not be the size of the file. The directory entry for the file will give the index of the head of the cluster chain and the size of the file. If the file length does not fill the last cluster in the chain, the unused space is called slack space.
Directories in FAT are a sequence of fixed sized (32 byte) records that may or may not be in use, but the sequence always fills completely the clusters in the chain. Therefore the size of a directory can be calculated from its cluster chain as the total byte length of the cluster chain.
The 32 byte directory record contains:
- The file name and extension in the famous 8 + 3 format,
- An attribute byte, providing, among other attributes, if this entry is of a file or a directory,
- The creation and modification time and dates,
- Last access date (the time is not recorded),
- The cluster number of the start of the cluster chain,
- The file length.
The format of the directory entry depends on the version of FAT. For FAT32, 4 bytes are needed for the address of the cluster chain, and those bytes are used of other things in FAT12 and FAT16. Note that even FAT32 has only a 28 bit address space for cluster indices — the high 4 bits of a FAT32 cluster address must always be zero.
Directory records, called dirents for directory entries, are fixed length and are allocated to fill the cluster chain comprising the directory. The first character of the file name is used to signal whether the entry is to be considered valid and referring to a file or directory, or invalid. If the first character is zero, the entry is invalid, and so are all entries following this entry. If the first character is value 0xE5 then the entry is invalid but other valid entries might follow this entry. If the first character is an allowable character in a FAT filename, then the entry is valid. The 0xE5 and 0×00 markers allow for directory search to terminate without touching all the directory entries. How this is used depends on the handling of the directory. There is no requirement in the specification to handle directories in the minimum possible space, so even after deletion of many files, a directory that expanded its cluster chain, might remain much larger than necessary.
Deleting a file or directory consists of overwriting the first byte of the file name in the directory entry for the file with the invalid mark (value 0×00 or 0xE5). Then the clusters associated with the file or directory are reclaimed but walking the cluster chain and writing zeros into the FAT entry for each cluster on the chain. Generally no erasing is done, and having raw access to the block device of a FAT file system can recover deleted data, as it is never deliberately erased, only eventually overwritten. Data from a previous file that ends up in slack space an persist on a life time of the file that has that slack space. Computer virus sometimes hide themselves or their data in the slack space.
The 8+3 file name format was a great inconvenience, and FAT continues to be a popular file system even now that long file names with a variety of characters is the norm. FAT introduced a system of long file names that is backwardly compatible with the original 8+3 file naming restriction. Every file in the FAT file system has two names: a long name and an 8+3 name. Typically the user chooses the long name and the 8+3 name is created automatically. Users are rarely aware of the 8+3 name.
The long file name is entered in to the directory as a series of "illegal" directory entries, just in front of the correct directory entry for the file. That correct entry has the 8+3 name. The entries are illegal because their attributes are as conflicting types, and so older operating systems skip over these entries and see only the legal entry with the short name. However, new operating systems gather up the data that is scattered into the illegal entries and glue it all together into the long file name.