Dirname is Evil
February 11th, 2016
tech
    size_t final_slash = filename.find_last_of('/');
    return filename.substr(0, final_slash);
Why do this when there's dirname(3)? Because dirname is evil:
- It has big traps.
- Different implementations have different traps.
On some systems, dirname modifies its input. For example, here's an implementation that's nearly [2] POSIX conforming:
    #include <stddef.h>  /* NULL */

    char* dirname(char* path) {
      static char dot[] = ".";
      if (!path) return dot;
      /* Find the last '/' in path, if there is one. */
      char* last_slash = NULL;
      for (char* p = path; *p; p++) {
        if (*p == '/') last_slash = p;
      }
      if (!last_slash) return dot;
      /* Keep the root: "/foo" becomes "/", not "". */
      if (last_slash == path) last_slash++;
      /* Truncate the caller's string in place. */
      *last_slash = '\0';
      return path;
    }

There are nice things about this: it doesn't need to allocate any memory and it's thread safe. This is what glibc does and is probably the most common behavior. Still, modifying your input string may not be what you want!
Systems can choose, however, to define it in other ways. For example, here's an implementation that leaves its input alone, but instead isn't thread-safe.
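    #include <limits.h>  /* PATH_MAX */
    #include <string.h>  /* strncpy */

    char* dirname(char* path) {
      static char buffer[PATH_MAX];
      static char dot[] = ".";
      if (!path) return dot;
      /* (size_t)-1 marks "no slash seen yet". */
      size_t last_slash_pos = (size_t)-1;
      for (size_t i = 0; path[i]; i++) {
        if (i >= PATH_MAX) return dot;  /* too long for the buffer */
        if (path[i] == '/') last_slash_pos = i;
      }
      if (last_slash_pos == (size_t)-1) return dot;
      /* Copy everything before the last slash into static storage. */
      strncpy(buffer, path, last_slash_pos);
      buffer[last_slash_pos] = '\0';
      return buffer;
    }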
Instead of modifying its argument, this version of dirname uses internal storage. This means that it's not thread safe, and you can't trust its return value to stick around if you call anything that might possibly also call dirname.
One more thing: dirname returns a char*, not a const char*, but it's not always safe to modify its return value. For example, glibc does:
    char *dirname (char *path) {
      static const char dot[] = ".";
      ...
      /* This assignment is ill-designed but the XPG specs require to
         return a string containing "." in any case no directory part is
         found and so a static and constant string is required. */
      path = (char *) dot;
      return path;
    }
This means if you give dirname a slashless string and pass the output to something that modifies its input, you'll pass compile-time const checking but you're in for problems at runtime. [3]
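If you're stuck calling the system dirname anyway, one defensive pattern is to assume all three behaviors at once. The following is only a sketch: `safe_dirname` is a hypothetical wrapper, and its lock only helps if every dirname call in the program goes through it, which you can't guarantee for libraries you don't control.

```cpp
#include <libgen.h>  // dirname (POSIX)

#include <mutex>
#include <string>
#include <vector>

// Hypothetical wrapper around the system dirname. Assumes the worst of
// every implementation: the input may be modified, the result may live in
// shared static storage, and the result may be read-only.
std::string safe_dirname(const std::string& path) {
  static std::mutex dirname_mutex;

  // Hand dirname a scratch copy so the caller's string is never modified.
  std::vector<char> scratch(path.begin(), path.end());
  scratch.push_back('\0');

  // Serialize callers of this wrapper in case the result is in static storage.
  std::lock_guard<std::mutex> lock(dirname_mutex);

  // Copy the result out while the lock is held and scratch is still alive,
  // and never write through the returned pointer.
  return std::string(dirname(scratch.data()));
}
```

On a POSIX system, `safe_dirname("/usr/lib/libfoo.so")` should give `/usr/lib` and `safe_dirname("foo.txt")` should give `.`, with the argument unchanged in both cases.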
So if you're going to use dirname, you have to treat it as being both thread unsafe and input modifying. At which point it's much easier to use something else that's better specified.
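As one possible shape for "something else", here's a sketch that wraps the find_last_of approach in a function with its edge cases spelled out. `parent_dir` is a hypothetical name, and the conventions it picks (return "." when there's no slash, return "/" for a file directly under the root, don't strip trailing slashes) are assumptions for this example, not anything dirname guarantees.

```cpp
#include <string>

// Hypothetical helper: the directory part of a '/'-separated path.
// Takes its input by const reference and returns a fresh std::string, so
// it never modifies its argument and has no shared state between calls.
std::string parent_dir(const std::string& filename) {
  const std::string::size_type final_slash = filename.find_last_of('/');
  // No slash at all: treat the file as living in the current directory.
  if (final_slash == std::string::npos) return ".";
  // The only slash is the leading one: the parent is the root.
  if (final_slash == 0) return "/";
  // Otherwise keep everything before the last slash.
  return filename.substr(0, final_slash);
}
```

With these choices, `parent_dir("/var/run/app.pid")` is `/var/run`, `parent_dir("app.pid")` is `.`, and `parent_dir("/app.pid")` is `/`. Trailing slashes are left alone, so `parent_dir("a/b/")` is `a/b`.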
(Warning: I haven't actually tried running or even compiling these code samples.)
[1] Update 2016-02-12: that code no longer needs anything like dirname at all because I rewrote it to handle everything with pipes instead of PID files.
[2] I've left out the bit where it's supposed to ignore trailing '/' characters.
[3] Either changing the return value of dirname for future calls, or undefined behavior, I'm not sure which.