Freeing Up Capacity On An SSD With NTFS Compression

NTFS Is 19 Years Old

Windows NT 3.1, released by Microsoft in 1993, ushered in a new era. Instead of employing the File Allocation Table (FAT) file system used previously, Microsoft introduced the NT file system (NTFS), which had a couple of notable advantages. For example, it lifted the 8.3-character file name limit carried over from the days of DOS. Unlike FAT, which allows only Latin characters in file names, NTFS permits names up to 255 characters long, and it uses the Unicode character set. Long file names were also supported by FAT32, which succeeded FAT and was introduced with Windows 95b in 1997. But that update had a hard time competing with NTFS, too.

After all, NTFS gives users other benefits, like journaling: pending file system changes are first recorded in a so-called journal on reserved space before being committed to disk. This allows for quick recovery of NTFS partitions if write operations are interrupted by a system crash or power outage. NTFS also facilitates file and folder permissions, encryptable disk areas, user quotas, and the data compression capability we'll be testing today. Before you activate it, though, we want you to be aware of how it works and what effect it will have on your system.

NTFS Compression

NTFS uses the LZNT1 algorithm (a variant of LZ77) for lossless data compression, and 4096-byte clusters for data storage. The file system compresses data in blocks of 16 clusters, that is, in 64 KB increments. If it can't compress a 16-cluster block to fewer than 16 clusters, NTFS leaves it unchanged. If the LZNT1 algorithm can compress the 64 KB block to 60 KB or less, however, saving at least one cluster, the freed clusters are treated like a sparse file. With sparse files, NTFS does not store those parts of a file that contain no information or consist of zero-byte sequences. A compressed file can therefore consist of uncompressed and compressed clusters, as well as clusters declared sparse.
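The per-block decision can be sketched in a few lines of Python. Since LZNT1 isn't available in the standard library, this sketch substitutes zlib as a stand-in compressor; the cluster and block sizes match the ones described above, but the helper names are our own:

```python
import os
import zlib

CLUSTER = 4096           # NTFS default cluster size in bytes
BLOCK = 16 * CLUSTER     # compression unit: 16 clusters = 64 KB

def clusters_needed(data: bytes) -> int:
    """Number of 4 KB clusters required to store `data` (ceiling division)."""
    return -(-len(data) // CLUSTER)

def store_block(block: bytes) -> tuple[bytes, bool]:
    """Mimic NTFS's per-block decision: keep the compressed form only
    if it saves at least one full cluster; otherwise store the raw block."""
    compressed = zlib.compress(block)  # zlib stands in for LZNT1 here
    if clusters_needed(compressed) < clusters_needed(block):
        return compressed, True
    return block, False

# A highly repetitive 64 KB block compresses well and is stored compressed...
data, was_compressed = store_block(b"A" * BLOCK)
assert was_compressed and clusters_needed(data) < 16

# ...while incompressible (random) data is left unchanged, just as NTFS does.
data, was_compressed = store_block(os.urandom(BLOCK))
assert not was_compressed and len(data) == BLOCK
```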

No file types are excluded from the compression scheme, but just like any other kind of data compression, the LZNT1 algorithm is inefficient for files that are already compressed, such as JPG, AVI, and ZIP files. The compression takes place at the file system level, making it invisible at the application level. As far as Windows and its applications go, there is no difference between a compressed and an uncompressed file.

Advantages: The greatest advantage of NTFS compression is, obviously, an increase in capacity. Owners of small SSDs especially should be happy about every additional megabyte of drive space reclaimed. Compressing data and reducing file sizes could also translate into faster read and write speeds (at least theoretically, since less data is read from and written to the drive).

Disadvantages: According to Microsoft, NTFS compression is very CPU-intensive, and not recommended for use in servers that handle large volumes of reads and writes. Even for home use, there are restrictions. You should only enable compression in folders with relatively few read and write accesses. More plainly, don't compress the Windows system folder. Also, copy operations are theoretically going to be slower, since the file system decompresses the corresponding files first, copies or moves them, and then compresses them again. If you send those compressed files over a network, they're decompressed first, consequently saving no bandwidth.

Another factor to consider: NTFS compression in 64 KB segments leaves the data highly fragmented, especially easily compressible files, since they'll be peppered with sparse clusters. An example shows this clearly: according to Microsoft, NTFS compression of a 64 KB data block generates, on average, one sparse cluster. Dividing a 20 GB file system into 64 KB segments therefore yields 327,680 sparse clusters. This is particularly relevant to hard drives; SSDs aren't as affected, because their access times are so low that fragmentation is less of an issue.
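The arithmetic behind that figure is simple enough to reproduce:

```python
# One sparse cluster per 64 KB block (Microsoft's stated average),
# applied across a 20 GB file system.
KB = 1024
GB = 1024 ** 3

block_size = 64 * KB
fs_size = 20 * GB

blocks = fs_size // block_size
print(blocks)  # 327680 sparse clusters, matching the figure above
```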

  • compton
    I've been wondering about this very topic for a while now.


    However, in the conclusion, it is stated that compression ends up writing more vs. uncompressed NTFS, thus consuming more PE cycles. Shouldn't the opposite be true? When writing to the file system, if a file is compressible it should take up less space and therefore conserve more PEs (though actually compressing the files for the first time should result in more writes).

    Why does on-the-fly compression result in more writes even though the amount to be written is smaller?
    Reply
  • husker
    An interesting article, but seems a bit contradictory. Kind of like buying a Ferrari and then worrying about the gas mileage.
    Reply
  • because when you modify even just one byte of a file that is compressed, you can end up changing a significant portion of the file, not just that byte. it's good if you can fit the change in one block erase; what if you can't? you'll end up writing more info on the "disk" then.
    Reply
  • clonazepam
    I'd been wondering how I would fit the 20+gb necessary for sw:tor on the ssd (i said this previously, in the recent games on ssd vs hdd article). It'd be interesting to see the titles tested in that article with full drive ntfs compression on and off.
    Reply
  • Marcus52
    clonazepamI'd been wondering how I would fit the 20+gb necessary for sw:tor on the ssd (i said this previously, in the recent games on ssd vs hdd article). It'd be interesting to see the titles tested in that article with full drive ntfs compression on and off.
    Keep in mind, whatever storage option you use, you need room to install updates on top of installing the game, most especially for MMOGs. This means room to download the update AND install it.

    ;)
    Reply
  • Why is Windows using NTFS instead of ZFS or ext4, which are far superior to NTFS?
    Reply
  • BrightCandle
    Presumably this negatively impacts Sandforce based drives more than the Samsung by making the data stored compressed? Any chance you can do this with a Sandforce drive to see the impact?
    Reply
  • acku
    cruizerbecause when you modify even just one byte of a file that is compressed, you can end up changing a significant portion of the file, not just that byte. it's good if you can fit the change in one block erase; what if you can't? you'll end up writing more info on the "disk" then.
    Correctomundo. Compression involves replacing repeated occurrences of data with references to a single copy of that data existing earlier in the input (uncompressed) data stream. That's why it's not right to think of a compressed archive as a container that stores any given file in a discrete space. If anything, the files kind of overlap in a big mixing pot.

    When you compress on the fly, you have to completely decompress all the files in an archive and recompress it when you're done. Hence it's all random transfers for the most part.

    BrightCandlePresumably this negatively impacts Sandforce based drives more than the Samsung by making the data stored compressed? Any chance you can do this with a Sandforce drive to see the impact?
    It's not a sequential transfer. Plus it's already precompressed data. Nothing SandForce can do about it. SandForce, Samsung, it's not going to make a difference.

    Cheers,
    Andrew Ku
    TomsHardware.com
    Reply
  • jemm
    SSD manufacturers do something to make it to the marketplace inexpensively! Economies of scale?

    Reply
  • chesteracorgi
    Do you have any data on how much using compression shortens the life cycle of the SSD? It would be most helpful as those with smaller SSDs are the likely candidates for using compression.
    Reply