2025-04-24 19:43:00
simonsafar.com
2025/04/22
(… but… it’s a variable… how do you even)
Let us present the problem.
This is Emacs starting up and loading some Lisp files. For which we first need to figure out where to find them.
As it happens, they could be found at many possible locations. There is a list of these locations in the load-path variable; our method is to check each of them for the file we want. (Also, maybe some of them come gzipped; let’s check for those, too.)
On my not especially overcomplicated Emacs install, the list has 59 elements.
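To make the cost concrete, here is a minimal sketch of that lookup strategy in Python. It is not Emacs's actual implementation: the function name `naive_locate` and the suffix list are assumptions made up for illustration. It only shows the shape of the work, which is one filesystem probe per directory-and-suffix combination.

```python
import os
from itertools import product

# Hypothetical suffix list; the real set Emacs tries depends on its configuration.
SUFFIXES = [".elc", ".elc.gz", ".el", ".el.gz"]

def naive_locate(name, dirs, suffixes=SUFFIXES):
    """Probe every (directory, suffix) combination in order.

    Returns (path, probes): the first existing candidate (or None)
    and the number of filesystem checks it took to get there.
    """
    probes = 0
    # product() varies the suffix fastest, so each directory is fully
    # checked before moving on to the next one.
    for directory, suffix in product(dirs, suffixes):
        probes += 1
        candidate = os.path.join(directory, name + suffix)
        if os.path.isfile(candidate):
            return candidate, probes
    return None, probes
```

With 59 directories and these four assumed suffixes, looking for a file that is not there takes 59 × 4 = 236 checks.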
At first sight this sounds like such a niche problem. Not only is it about Emacs but it’s also Windows; the latter is somewhat known for its less than excellent performance when it comes to small files.
As it happens though, bash on Linux does the exact same thing. We have a list of directories on PATH, and, whenever we want to launch a program, we’ll go and check each and every one of them for the files we are looking for. We’re fairly lucky though: the list is pretty short.
~ $ strace bash -c asdklfjasldfjaskldfasdljf
(...)
newfstatat(AT_FDCWD, ".", {st_mode=S_IFDIR|0755, st_size=4096, ...}, 0) = 0
newfstatat(AT_FDCWD, "/home/simon/bin/asdklfjasldfjaskldfasdljf", 0x7ffe5ff8d3c0, 0) = -1 ENOENT (No such file or directory)
newfstatat(AT_FDCWD, "/usr/local/bin/asdklfjasldfjaskldfasdljf", 0x7ffe5ff8d3c0, 0) = -1 ENOENT (No such file or directory)
newfstatat(AT_FDCWD, "/usr/bin/asdklfjasldfjaskldfasdljf", 0x7ffe5ff8d3c0, 0) = -1 ENOENT (No such file or directory)
newfstatat(AT_FDCWD, "/bin/asdklfjasldfjaskldfasdljf", 0x7ffe5ff8d3c0, 0) = -1 ENOENT (No such file or directory)
newfstatat(AT_FDCWD, "/usr/games/asdklfjasldfjaskldfasdljf", 0x7ffe5ff8d3c0, 0) = -1 ENOENT (No such file or directory)
newfstatat(AT_FDCWD, "/usr/local/games/asdklfjasldfjaskldfasdljf", 0x7ffe5ff8d3c0, 0) = -1 ENOENT (No such file or directory)
… except wait, now we’re looking for ourselves?
newfstatat(AT_FDCWD, ".", {st_mode=S_IFDIR|0755, st_size=4096, ...}, 0) = 0
newfstatat(AT_FDCWD, "/home/simon/bin/bash", 0x7ffe5ff8d490, 0) = -1 ENOENT (No such file or directory)
newfstatat(AT_FDCWD, "/usr/local/bin/bash", 0x7ffe5ff8d490, 0) = -1 ENOENT (No such file or directory)
newfstatat(AT_FDCWD, "/usr/bin/bash", 0x7ffe5ff8d490, 0) = -1 ENOENT (No such file or directory)
newfstatat(AT_FDCWD, "/bin/bash", {st_mode=S_IFREG|0755, st_size=1265648, ...}, 0) = 0
newfstatat(AT_FDCWD, "/bin/bash", {st_mode=S_IFREG|0755, st_size=1265648, ...}, 0) = 0
… and also… let’s not forget about our localized messages.
openat(AT_FDCWD, "/usr/share/locale/en_US.UTF-8/LC_MESSAGES/bash.mo", O_RDONLY) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/usr/share/locale/en_US.utf8/LC_MESSAGES/bash.mo", O_RDONLY) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/usr/share/locale/en_US/LC_MESSAGES/bash.mo", O_RDONLY) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/usr/share/locale/en.UTF-8/LC_MESSAGES/bash.mo", O_RDONLY) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/usr/share/locale/en.utf8/LC_MESSAGES/bash.mo", O_RDONLY) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/usr/share/locale/en/LC_MESSAGES/bash.mo", O_RDONLY) = -1 ENOENT (No such file or directory)
newfstatat(2, "", {st_mode=S_IFCHR|0620, st_rdev=makedev(0x88, 0x1), ...}, AT_EMPTY_PATH) = 0
As it happens, Python is slightly smarter than either of the two above. Instead of trying various file names, it will just go and list directories right away; it is probably this & some caching mechanisms that allow it to find some modules pretty quickly. (We’re still looking for __init__.py and similar ones one by one though.)
simon@anarillis ~/tmp> strace -f python3 -m our_test_dir.our_test_moduleb 2>&1 |grep our_test
execve("/usr/bin/python3", ["python3", "-m", "our_test_dir.our_test_moduleb"], 0x7ffc087c2c38 /* 17 vars */) = 0
newfstatat(AT_FDCWD, "/home/simon/tmp/our_test_dir/__init__.cpython-311-x86_64-linux-gnu.so", 0x7ffc3025b8e0, 0) = -1 ENOENT (No such file or directory)
newfstatat(AT_FDCWD, "/home/simon/tmp/our_test_dir/__init__.abi3.so", 0x7ffc3025b8e0, 0) = -1 ENOENT (No such file or directory)
newfstatat(AT_FDCWD, "/home/simon/tmp/our_test_dir/__init__.so", 0x7ffc3025b8e0, 0) = -1 ENOENT (No such file or directory)
newfstatat(AT_FDCWD, "/home/simon/tmp/our_test_dir/__init__.py", 0x7ffc3025b8e0, 0) = -1 ENOENT (No such file or directory)
newfstatat(AT_FDCWD, "/home/simon/tmp/our_test_dir/__init__.pyc", 0x7ffc3025b8e0, 0) = -1 ENOENT (No such file or directory)
newfstatat(AT_FDCWD, "/home/simon/tmp/our_test_dir", {st_mode=S_IFDIR|0755, st_size=4096, ...}, 0) = 0
newfstatat(AT_FDCWD, "/home/simon/tmp/our_test_dir", {st_mode=S_IFDIR|0755, st_size=4096, ...}, 0) = 0
newfstatat(AT_FDCWD, "/home/simon/tmp/our_test_dir", {st_mode=S_IFDIR|0755, st_size=4096, ...}, 0) = 0
newfstatat(AT_FDCWD, "/home/simon/tmp/our_test_dir", {st_mode=S_IFDIR|0755, st_size=4096, ...}, 0) = 0
# here is the dir listing!
openat(AT_FDCWD, "/home/simon/tmp/our_test_dir", O_RDONLY|O_NONBLOCK|O_CLOEXEC|O_DIRECTORY) = 3
write(2, "/usr/bin/python3: No module name"..., 64/usr/bin/python3: No module named our_test_dir.our_test_moduleb
Nevertheless, it seems that “trying to find files with a set of possible names in a set of possible directories” is a fairly common operation that not everyone has optimized yet.
(Also, is “optimizing” this really a good goal? Or does it just stand for “OK workarounds for missing file system APIs”?)
How about… instead of asking the operating system for a combination of n files at m different places, we could just give it the list of possible files and the list of possible places?
This would already cut down on the number of system calls, and, if this is going over a network, the required roundtrips.
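No such system call exists as far as this post is concerned, so the following is only a userspace approximation of what that interface could look like. The name `locate_batched` and its signature are hypothetical. It lists each directory once and checks all candidate names against the listing, so the work is one listing per directory rather than one probe per combination.

```python
import os

def locate_batched(names, dirs):
    """Return the first match, listing each directory at most once.

    names: candidate file names, most preferred first.
    dirs:  directories to search, most preferred first.
    Directory order takes precedence over name order.
    """
    wanted = list(names)
    for directory in dirs:
        try:
            with os.scandir(directory) as entries:
                present = {entry.name for entry in entries if entry.is_file()}
        except OSError:
            # Missing or unreadable directory: skip it, as a PATH search would.
            continue
        for name in wanted:
            if name in present:
                return os.path.join(directory, name)
    return None
```

Whether this is actually faster depends on the directories: listing a huge directory to find one file can cost more than a handful of failed probes, which is one reason a real API would want the file system to do the matching itself.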
AS/400 libraries are, by the way, solving a very similar problem. While I’m not sure what implementation they’re using underneath, they have at least a good chance of not having to try every combo all the time, given their database “filesystem”.
But then, in the end, we are just trying to perform a query, to select all the source files ever WHERE they have one of the given names & then we pick the ones that are in source directories we prefer the most (e.g. come first on the PATH list). That’s it.
As it happens, Postgres can solve this problem extremely well and quickly. (… there might be a blog post on how, at some point.)
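As a rough illustration of the query itself, here is a sketch that uses SQLite from Python's standard library as a stand-in for a real database server (it is not Postgres, and it says nothing about how Postgres would plan this). The table layout and the names `build_index` and `query_first` are assumptions made up for the example.

```python
import sqlite3

def build_index(files):
    """Build an in-memory index from (directory, name) pairs."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE files (dir TEXT, name TEXT)")
    conn.execute("CREATE INDEX files_by_name ON files (name)")
    conn.executemany("INSERT INTO files VALUES (?, ?)", files)
    return conn

def query_first(conn, names, dirs):
    """Return (dir, name) of the best match, or None.

    "Best" means: earliest directory in dirs, then earliest name in names.
    """
    cur = conn.cursor()
    # Temporary tables carry the caller's preference order as a rank column.
    cur.execute("CREATE TEMP TABLE wanted_names (name TEXT, rank INTEGER)")
    cur.execute("CREATE TEMP TABLE wanted_dirs (dir TEXT, rank INTEGER)")
    cur.executemany("INSERT INTO wanted_names VALUES (?, ?)",
                    [(n, i) for i, n in enumerate(names)])
    cur.executemany("INSERT INTO wanted_dirs VALUES (?, ?)",
                    [(d, i) for i, d in enumerate(dirs)])
    rows = cur.execute(
        """
        SELECT f.dir, f.name
        FROM files AS f
        JOIN wanted_names AS n ON n.name = f.name
        JOIN wanted_dirs AS d ON d.dir = f.dir
        ORDER BY d.rank, n.rank
        LIMIT 1
        """
    ).fetchall()
    cur.execute("DROP TABLE wanted_names")
    cur.execute("DROP TABLE wanted_dirs")
    return rows[0] if rows else None
```

The whole lookup becomes a single request: the caller hands over both lists, and the index on `name` lets the engine avoid touching files that cannot match.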
Could it be something that the operating system or the file system just… does for you, quickly and efficiently?