Have you ever wondered what all those little boxes are when you defrag your hard drive? These boxes are clusters, the storage units on the hard drive. This article explains the concept of clusters. It applies mostly to the FAT12, FAT16 and FAT32 file systems. FAT12 is only seen on floppy disks and very small storage media, FAT16 is the older version of FAT from the Win95 days, and FAT32 is the newer one, from the Win98 days. FAT stands for File Allocation Table. And no, there is no such thing as a SKINNY32 file system. :P
How FAT works
FAT works by keeping a record at the start of the drive of all the files; to point to a file, it points to the clusters that contain it. The main rule of clusters is that there cannot be more than 1 file per cluster. Otherwise, pointing to that cluster would cause problems, as it would mean pointing at and doing read/write operations on two files at once, which would crash more than you have ever seen Windows crash before. A file can still span several clusters. Depending on the size and FAT version of the partition, the cluster size can vary. Also keep in mind that a section of the partition is reserved for the FAT itself, which is the record of each file and where it can be found. Its size changes with the size of the partition and of the clusters.
Each FAT type has its limits. Here is a table displaying them:
| FAT type | Max clusters | Cluster sizes | Max volume size |
|----------|--------------|---------------|-----------------|
| FAT12 | 4 086 | 0.5KB to 4KB | 16 736 256 bytes (~16MB) |
| FAT16 | 65 526 | 2KB to 32KB | 2 147 483 648 bytes (2GB) |
| FAT32 | 268 435 456 | 4KB to 32KB | 8 796 093 022 208 bytes (8TB) |
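The maximum number of clusters is calculated as 2^X, where X is the number in the FAT name (for example, 12 for FAT12). Some cluster values are reserved (10 for FAT12 and FAT16 in the figures above), which is why a few are "missing" if you calculate it yourself. Microsoft's FAT specification gives slightly lower limits of 4 084 and 65 524 clusters, so treat the figures above as approximate. FAT32 is calculated as 2^28 and not 2^32 because the top 4 bits of each entry are reserved.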
FAT32 actually supports 64KB clusters, but most applications don't, so they are hardly ever used. 32KB is usually the limit.
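If you want to check the table yourself, here is a small sketch of the arithmetic. It assumes the figures given in this article (10 reserved clusters for FAT12 and FAT16, 28 usable bits for FAT32); the function names are made up for the example. Note that for FAT16 the product comes out slightly under the rounded 2GB figure in the table.

```python
# Sketch: reproduce the cluster-count and volume-size columns of the table.
# The reserved-cluster count (10) and the 28 usable bits for FAT32 are the
# article's figures; the helper names are hypothetical.

KB = 1024

def max_clusters(address_bits, reserved=0):
    """Number of usable clusters for a given FAT entry width."""
    return 2 ** address_bits - reserved

def max_volume_bytes(clusters, cluster_size_bytes):
    """Largest data area: every cluster in use at the biggest cluster size."""
    return clusters * cluster_size_bytes

fat12 = max_clusters(12, reserved=10)
fat16 = max_clusters(16, reserved=10)
fat32 = max_clusters(28)

for name, clusters, biggest in [("FAT12", fat12, 4 * KB),
                                ("FAT16", fat16, 32 * KB),
                                ("FAT32", fat32, 32 * KB)]:
    print(name, clusters, max_volume_bytes(clusters, biggest))
```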
FAT32 volume size limit
Note that FAT32's limit is usually quoted as 2TB rather than 8TB. One reason is that the boot sector stores the sector count in 32 bits, which with the usual 512-byte sectors caps a volume at 2TB. Another is that an 8TB partition would be very inefficient: the FAT alone would be about 1GB, and it would be next to impossible to cache that in memory for fast retrieval. 2TB is probably the highest that would actually work OK, and there are many better file systems for partitions this big. So an 8TB FAT32 partition is possible on paper, but it is not recommended at all.
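To see where the "about 1GB" figure comes from, here is a rough sketch. It assumes 4 bytes per FAT32 entry and 32KB clusters, counts only one copy of the FAT, and ignores boot sectors and other overhead; the function name is made up for the example.

```python
# Sketch: approximate size of a single FAT32 table for a given volume size.
# Assumes 4-byte entries and one entry per cluster; fat_table_bytes is a
# hypothetical helper, not part of any real tool.

KB = 1024
MB = 1024 ** 2
TB = 1024 ** 4

def fat_table_bytes(volume_bytes, cluster_bytes, entry_bytes=4):
    clusters = volume_bytes // cluster_bytes
    return clusters * entry_bytes

fat_8tb = fat_table_bytes(8 * TB, 32 * KB)
fat_2tb = fat_table_bytes(2 * TB, 32 * KB)

print("8TB volume:", fat_8tb // MB, "MB of FAT")   # 1024 MB, i.e. about 1GB
print("2TB volume:", fat_2tb // MB, "MB of FAT")   # 256 MB
```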
Experimenting With Clusters
Here's an experiment you can try on a large FAT partition:
(For this example, let's assume it is using 32KB clusters.)
Create a 1 byte file (just a text file with a space or a character in it) and make 1000 copies. It is best to create 10 copies, then take those 10 and copy them 10 times, and then copy the resulting 100 another 10 times; this goes faster than pasting 1000 times. Use CTRL+C and CTRL+V to do the copy and paste. This should take less than 30 seconds. Once you have 1000 files in the folder, stop. You may get access violation errors as Windows becomes confused with this process, so make sure you recopy any missing files to end up with 1000. (We're only using 1000 here since it's a nice big round number.)
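If you would rather not copy and paste, a short script can create the files for you. This is only a sketch: the folder and file names are made up, and it writes into a temporary folder that is deleted again at the end. To try it on your FAT partition, pass `make_tiny_files` a folder on that drive instead.

```python
# Sketch: create 1000 one-byte files, as in the experiment above.
# make_tiny_files and the tiny_NNNN.txt names are hypothetical.
import tempfile
from pathlib import Path

def make_tiny_files(folder, count=1000):
    """Create `count` files of exactly 1 byte each inside `folder`."""
    folder = Path(folder)
    folder.mkdir(parents=True, exist_ok=True)
    for i in range(count):
        (folder / f"tiny_{i:04d}.txt").write_bytes(b"x")
    return folder

with tempfile.TemporaryDirectory() as tmp:
    target = make_tiny_files(Path(tmp) / "cluster_test")
    files = list(target.iterdir())
    total = sum(f.stat().st_size for f in files)
    print(len(files), "files,", total, "bytes of data")
```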
Wasted Space Explained
Hopefully you did this in a separate folder and not on your desktop!
Now check the properties of that folder and you will see that these 1000 bytes of data actually take up 1000 files times 32KB, which is ~31MB!!! Talk about wasting space, eh? This is because only 1 file can be stored per cluster, so even a 1 byte file uses a whole 32KB cluster (in this example), no matter what. These are the biggest clusters you will normally see, but also the most common, since we all like big partitions: past 32GB the clusters need to be 32KB, because smaller ones would mean too many clusters. As the table above shows, there is a limit to how many clusters a partition can have, so if you want 4KB clusters, you need a pretty small partition.
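The ~31MB figure is easy to check. The sketch below assumes 32KB clusters as in the example; `size_on_disk` is a made-up helper that rounds a file size up to whole clusters.

```python
# Sketch: space taken by 1000 one-byte files with 32KB clusters.
# size_on_disk is a hypothetical helper, not a system call.

CLUSTER = 32 * 1024

def size_on_disk(file_bytes, cluster_bytes=CLUSTER):
    """Round a file size up to a whole number of clusters."""
    clusters = -(-file_bytes // cluster_bytes)   # ceiling division
    return clusters * cluster_bytes

data = 1000 * 1                       # 1000 files of 1 byte each
allocated = 1000 * size_on_disk(1)    # each file still gets a full cluster

print(data, "bytes of data")
print(allocated, "bytes on disk")                  # 32768000
print(f"{allocated / 1024 / 1024:.2f} MB")         # 31.25 MB
```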
The Properties dialog shows how much space is wasted even with 1 byte files
Wasted space is counted on a "per file" basis. Each cluster can only store 1 file, so if a file takes up 10 clusters, the only wasted space is whatever is left over in the last cluster. If you have a file that is exactly 32KB * X, then there is 0 wasted space. Also, if you were to cut your big video files into many small parts, it would waste more than if you combined all your videos into one large file. I'm not telling you to combine all your files into one, but one thing is for sure: if you have a lot of file archives you won't use in years, zip them up as one file. Not only will you save space from compression, but you will also save space because only 1 cluster will have wasted space left over, up to 32KB of it.
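Here is a quick sketch of that comparison. The 700 000 000 byte video and the split into 100 parts are invented numbers, and `slack` is a made-up helper; the only assumption carried over from the article is the 32KB cluster size.

```python
# Sketch: wasted space ("slack") for one big file versus the same data
# split into many parts. All sizes are hypothetical example values.

CLUSTER = 32 * 1024

def slack(file_bytes, cluster_bytes=CLUSTER):
    """Unused bytes in the last cluster of a file."""
    return (-file_bytes) % cluster_bytes

video = 700_000_000          # one large file
parts = [7_000_000] * 100    # the same data cut into 100 equal parts

combined_waste = slack(video)
split_waste = sum(slack(p) for p in parts)

print("one file :", combined_waste, "bytes wasted")
print("100 parts:", split_waste, "bytes wasted")
print("exact multiple of 32KB:", slack(10 * CLUSTER), "bytes wasted")
```

The single file can never waste more than one cluster, while every part wastes its own leftover.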
Finding out the size of your clusters
If you are curious to know the size of the clusters on your drive, you can use a utility like Partition Magic, or you can check it yourself by creating a file (at least 1 byte) and checking its properties. Where it says "size on disk" you should see something like 32KB. This is also useful to see how much space large files take up. For example, an 892 312 048 byte file will actually be taking up 892 338 176 bytes of disk space. If you calculate 892 338 176 - 892 312 048 you will notice that 26 128 bytes are wasted, which is about 26KB. That is not bad considering that it's an 850MB file to start with, but when you have many small files (ex: cookies), the wasted amount is huge compared to the actual file size, because every individual file can waste up to 32KB. So empty your cookies once in a while!
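To put those numbers side by side, here is a small sketch that works out the size on disk and the share of it that is wasted. The 200 byte cookie is an invented example size, 32KB clusters are assumed as before, and `report` is a made-up helper.

```python
# Sketch: size on disk and wasted share for a large file and a tiny one.
# The cookie size is a hypothetical value chosen for illustration.

CLUSTER = 32 * 1024

def report(file_bytes, cluster_bytes=CLUSTER):
    """Return (size on disk, wasted bytes) for a file."""
    clusters = -(-file_bytes // cluster_bytes)   # ceiling division
    on_disk = clusters * cluster_bytes
    return on_disk, on_disk - file_bytes

for label, size in [("big file", 892_312_048), ("cookie", 200)]:
    on_disk, wasted = report(size)
    share = 100 * wasted / on_disk
    print(f"{label}: {size} bytes, {on_disk} on disk, "
          f"{wasted} wasted ({share:.4f}% of the space used)")
```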
A better file system: NTFS
There is another, newer file system called NTFS (New Technology File System), which is much more efficient than FAT. It is not compatible with Windows 3.x/95/98/ME or DOS, but it is compatible with Windows XP, 2000 and NT. Use it whenever possible. It has many features, such as:
Logging and mapping in case of failure, which makes recovery much easier and more likely to succeed.
Mapping out bad sectors to make sure they are never used.
Security features such as the ability to encrypt/decrypt data as it is written to/read from the disk. This makes raw reading next to impossible, so files are more secure when you give your HD to someone else.
The ability to set user permissions (kind of like chmod in Linux/Unix).
Using only 4KB clusters, it can support a partition of up to 16TB, and a partition of up to 256TB using 64KB clusters!
Built-in disk quotas for shared drives (controlled by the file system and not by the actual sharing application).
The ability to compress without another program such as MS disk compression. Other Windows applications can read/write without having to know the partition is compressed.
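The 16TB and 256TB figures in the list above fit a limit of 2^32 clusters per partition. That cluster count is an assumption inferred from the numbers in this article, not something stated here, so take this sketch as a consistency check only.

```python
# Sketch: partition size limits implied by an assumed 2**32 cluster limit.
# ASSUMED_MAX_CLUSTERS is inferred from the article's 16TB/256TB figures.

KB = 1024
TB = 1024 ** 4
ASSUMED_MAX_CLUSTERS = 2 ** 32

def max_partition_tb(cluster_bytes, max_clusters=ASSUMED_MAX_CLUSTERS):
    """Largest partition, in TB, for a given cluster size."""
    return max_clusters * cluster_bytes // TB

print("4KB clusters :", max_partition_tb(4 * KB), "TB")    # 16 TB
print("64KB clusters:", max_partition_tb(64 * KB), "TB")   # 256 TB
```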
Not only does NTFS have more built-in features, but it has practically no limits (who owns a 256TB drive and wants to keep it as one partition?!) and it also uses very small clusters. A partition that would need 32KB clusters under FAT would only use something like 2KB or 4KB clusters in NTFS! However, NTFS is only compatible with Win2k/XP/NT, so you would not be able to change files on an NTFS partition unless you are running one of those operating systems. You could always use FAT32 on those too, but why do that?
I hope that this article helped you understand the concept of clusters better.
This article was originally written for IceTeks.com