This makes me wonder how efficient filesystems are with millions of files in a single directory? Do they create some sort of index? Are there limits to the number of files a directory can hold? Is that what inodes are for? I remember seeing "inode" limits on some VPS I was using a while back.
Indeed, putting too many files in a single directory is inefficient, both for read and write. When writing a file (or changing metadata like permissions), the entire directory inode may have to be rewritten. When searching for a file or opening it, the entire file list for the directory needs to be read (in the worst case where the file is at the end of the list).
When you need to store lots of files on disk, it's a common pattern to spread them out in subdirectories. For instance, instead of `files/2d8af74bcb29ad84`, you would have `files/2d/8a/f74bcb29ad84`.
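The sharding pattern above is easy to sketch. This is a hypothetical helper (the function name and parameters are made up for illustration), assuming the filename is a hex content hash:

```python
import os

def sharded_path(root: str, name: str, levels: int = 2, width: int = 2) -> str:
    """Map 'files' + '2d8af74bcb29ad84' to 'files/2d/8a/f74bcb29ad84'.

    Peels off `levels` prefixes of `width` chars each, so each
    directory holds at most 16**width subdirectories.
    """
    parts = [name[i * width:(i + 1) * width] for i in range(levels)]
    return os.path.join(root, *parts, name[levels * width:])

print(sharded_path("files", "2d8af74bcb29ad84"))
```

With two levels of two hex characters, each directory tops out at 256 entries, which every filesystem handles comfortably.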
> When writing a file (or changing metadata like permissions), the entire directory inode may have to be rewritten.
That isn't how inode filesystems work -- if you change a file's permissions, it's just an inode update on the file -- not the containing directory.
Even in DOS-type filesystems (FAT/exFAT), it's just a record update in the corresponding dirent for that file.
If you add a new file to a directory, that causes an mtime update on the directory's inode.
The rest is accurate -- many older filesystems have lookup performance that scales poorly with directory size (for DOS filesystems and BSD UFS, you have to do a full directory scan). Also ls defaults to sorting output, which is O(N log N) and can be slow in large directories.
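To illustrate the sorting point (a rough sketch, not a benchmark): an unsorted scan, like `ls -f`, can stream entries one at a time in on-disk order, while sorted output, like plain `ls`, has to read all N entries into memory first:

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as d:
    # Create a directory with 1000 files.
    for i in range(1000):
        open(os.path.join(d, f"f{i:04d}"), "w").close()

    # Unsorted, streaming -- comparable to `ls -f`:
    # yields entries in whatever order the filesystem returns them.
    first_unsorted = next(iter(os.scandir(d))).name

    # Sorted -- comparable to plain `ls`: must collect all N names
    # before printing anything, an O(N log N) step on top of the scan.
    names = sorted(e.name for e in os.scandir(d))

print(names[0], names[-1])
```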
Large directory sizes suck on NFS and parallel file systems like Panasas, Lustre, GPFS, and the like. Python and Rust's Cargo also suck on networked file systems and would be greatly improved by pushing things into an SQLite file.
most things suck on networked file systems. I still have trauma from having to use NFS more than a decade ago, on systems that would freeze on boot because the network couldn't be reached or the NFS server was down.
I'd also never put SQLite on NFS. Locking is often broken on NFS, unless you can guarantee a homogeneous environment. I can imagine some Excel guy in marketing launching his SQLite UI on Windows and completely hosing it all.
NFS locking broken... Excel? SQLite UI on Windows? Are you sure you're not confusing SMB with NFS? I've had SMB cache files in ways that muck everything up. Not NFS.
That's sometimes true, but it depends a lot on the filesystem used, and sometimes on the options used when building it. ReiserFS, ext4 and FAT32 won't have the same performance profile.
I think breaking large collections into subdirectories is mainly done to ensure things work even on filesystems that don't deal well with very large directories. It also makes inspecting the files manually a little more convenient: many file explorers (especially the GUI ones) can have trouble with large directories.
What about on the other end? Why not have each hex digit be a directory along a path? Then you have very, very few files per directory at the cost of deeper hierarchy. What's the practical downside?
Directories are just another type of file. So, if you do this, and your filenames are n characters long, you'll end up needing n directory lookups (one per path component) just to find the file you're looking for. Unless the underlying file system does something to make that particular access pattern fast, well... it's going to be stupidly slow after a certain point.
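The tradeoff is easy to see just by counting path components. A hedged sketch (the layout shapes are assumptions for illustration), comparing the two-level sharding scheme from earlier in the thread with a one-hex-digit-per-level layout:

```python
import os

name = "2d8af74bcb29ad84"  # a 16-char hex name

# Shallow sharding: files/2d/8a/f74bcb29ad84 -- a few lookups.
shallow = os.path.join("files", name[:2], name[2:4], name[4:])

# One directory per hex digit: files/2/d/8/a/.../4 -- one lookup
# per character, so the open() cost grows with the name length.
deep = os.path.join("files", *name)

print(len(shallow.split(os.sep)), "vs", len(deep.split(os.sep)))
```

Each component is a separate lookup in the kernel's path resolution (dcache misses mean a disk read per level), so the deep layout trades tiny directories for a much longer resolution chain.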
> Unless the underlying file system does something to make that particular access pattern fast
I don't know exactly how Linux does it. Windows hands off whole paths to the filesystem, so this idea is possible there.
In FreeBSD, there is a generic routine (lookup(9)) that goes component by component, so at each step the filesystem is only asked to resolve a single component to a vnode. I think a clever filesystem implementation (in FreeBSD) could look at the remaining path and kick off asynchronous prefetch... but I am not aware of anything doing this.
There are two modes you can implement for your filesystem: in the so-called 'high level' mode you get the whole path. In the 'low level' mode the filesystem asks you for one piece of the path at a time.
The low level mode seemed faster in my tests, and I think it's also closer to how the Linux kernel works internally?
> Are there limits to the number of files a directory can hold?
Yes, depending on the file system.
For example, ext4 with default settings uses 32-bit hashes. Upon the first collision, you can no longer add more files to the directory (ENOSPC error).
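A back-of-the-envelope check of why a 32-bit hash matters here: by the birthday bound, a collision becomes likely around sqrt(2^32) = 2^16 (~65k) entries, far below "millions of files". A quick sketch using the standard approximation:

```python
import math

def collision_probability(n_files: int, hash_bits: int = 32) -> float:
    """Approximate P(at least one collision) among n_files random hashes.

    Uses the birthday-problem approximation 1 - exp(-n(n-1) / (2 * 2^bits)).
    """
    space = 2.0 ** hash_bits
    return 1.0 - math.exp(-n_files * (n_files - 1) / (2.0 * space))

print(f"{collision_probability(65_536):.2f}")     # around 0.39
print(f"{collision_probability(1_000_000):.2f}")  # essentially 1.00
```

So at a million entries a 32-bit hash collision is all but certain; whether that actually stops you from adding files depends on how the filesystem handles collisions.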