Introducing hashmap.h
I asked on Twitter the other day whether anyone had a hashmap that could work with string slices - parts of a string that are not null-terminated and thus need an explicit length to accompany the pointer.
I didn’t get any responses on this so I commented with a follow-up that I had grabbed some code written a few years back by the awesome Pete Warden of Google fame, and morphed it into what I required:
Author’s note: this part used to contain a tweet, but Hellish Tusk / Space Karen / Elon Musk butchered the platform so it is now gone.
Much to my surprise, Pete was not only happy for me to make these modifications but, since he was no longer maintaining the hashmap code, he’d also happily redirect users to any effort I put together:
Author’s note: this part used to contain a tweet, but Hellish Tusk / Space Karen / Elon Musk butchered the platform so it is now gone.
So I’ve done the work and I’m now introducing my latest library, nearly entirely not written by me, hashmap.h!
Null-Terminated to Slices
So Pete’s code was pretty solid as is. The main difference was that it relied on
null-terminated strings as the key, whereas I wanted to use string slices. My
first modification was to change the entry points that used a key to instead
take a key and a length. So hashmap_put went from:
extern int hashmap_put(map_t in, char* key, any_t value);
To:
HASHMAP_WEAK int hashmap_put(struct hashmap_s *const hashmap,
                             const char *const key,
                             const unsigned len,
                             void *const value);
You’ll notice that I also went const mad (and again wished that const was
the default and mutable or mut was required on variables - sigh!), and
removed the typedef for any_t. The last point is just a general stylistic
thing I have for my libraries - I really dislike that APIs like Windows.h
abstract you so far away from the underlying types with all the SHOUTY CASE
LPDWORDs and such, so I generally try to have no typedefs if I can get
away with it.
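To make the slice idea concrete, here is a minimal sketch (not code from hashmap.h) of cutting a key out of a larger buffer without copying or null-terminating it. The struct slice type and the key_of helper are hypothetical names invented for this illustration; only the key-plus-length shape of the hashmap_put signature above comes from the library.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical type for illustration only: a view into a buffer that
   somebody else owns. There is no terminating '\0' at ptr[len]. */
struct slice {
  const char *ptr;
  unsigned len;
};

/* Hypothetical helper: returns the bytes of `text` (which is `text_len`
   bytes long) that come before the first '=', or the whole of `text`
   when it contains no '='. Nothing is copied and nothing is modified. */
static struct slice key_of(const char *const text, const unsigned text_len) {
  struct slice s;
  const char *eq;

  assert(text != NULL);

  /* memchr takes an explicit length, so it never reads past the slice. */
  eq = (const char *)memchr(text, '=', text_len);

  s.ptr = text;
  s.len = eq ? (unsigned)(eq - text) : text_len;
  return s;
}
```

Given the nine bytes of name=pete, key_of returns a pointer to the start of the buffer and a length of 4. That pointer and length pair is the sort of thing that would be passed as the key and len arguments shown above.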
Supports UTF-8 Keys
Pete’s code also used the string.h ASCII-string functions of C to compare
whether the key ever matched. Since I wanted to use this hashmap in conjunction
with UTF-8 strings (using my utf8.h library) I instead used memcmp. Now that
I have an explicit length for the string slice this became possible.
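As a rough sketch of what a length-aware key comparison can look like (the slice_equals name is hypothetical and this is not the exact code in the header), two slices match when their lengths are equal and memcmp finds their bytes identical:

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper for illustration: byte-wise equality of two
   slices. The length check runs first, so memcmp only ever compares
   two ranges of the same size. */
static int slice_equals(const char *const a, const unsigned a_len,
                        const char *const b, const unsigned b_len) {
  assert(a != NULL && b != NULL);
  return a_len == b_len && memcmp(a, b, a_len) == 0;
}
```

Because the comparison is on raw bytes, a multi-byte UTF-8 key such as the five bytes of "caf\xc3\xa9" compares equal only to the same five bytes. No Unicode normalization happens, so two different encodings of the same visible text would be treated as different keys.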
Single Header
The last major change I made was to smush the hashmap.c and hashmap.h files together into a single header. I am pretty obsessed with single headers as a way to get round the botched nature of C and C++’s package story (or lack thereof). This meant leveraging some existing hacks to stop the linker complaining about multiple function definitions when the header is included from more than one source file (by marking the functions as weak instead).
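For anyone who has not seen the weak-function trick before, here is a hedged sketch of the general pattern. DEMO_WEAK and demo_add are made-up names for this example, and the per-compiler choices are my assumption of a typical setup; the real header uses HASHMAP_WEAK and its exact definition may differ.

```c
#include <assert.h>

/* Assumed per-compiler spelling, for illustration only. */
#if defined(_MSC_VER)
#define DEMO_WEAK __inline
#elif defined(__GNUC__) || defined(__clang__)
#define DEMO_WEAK __attribute__((weak))
#else
#error This sketch only covers MSVC, GCC and Clang.
#endif

/* Declaration and definition both live in the header. Every source file
   that includes it emits its own copy of demo_add, and the weak marking
   lets the linker keep one copy rather than report a duplicate symbol. */
DEMO_WEAK int demo_add(const int a, const int b);

DEMO_WEAK int demo_add(const int a, const int b) { return a + b; }
```

The upside is that users just drop one file into their project with no separate .c file to build; the cost is relying on compiler-specific extensions rather than anything in standard C.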
The License
Pete’s code was already marked explicitly public domain - do what you want with it. I’ve found over the years that while public domain is all well and good, having an explicit license like the Unlicense or CC0 can make lawyers happy because there is at least some legal text to reference. It also makes GitHub’s license scraping happier because these licenses are already ones that it knows about.
So I’ve licensed this header under the Unlicense - it matches what my existing single-header projects use and is something that my users already favour.
Hashmap All The Things
So I’m pretty happy with the code I mostly did not write - and I hope that my packaging and testing of the header allows it to be more widely useful to some of you fine folks out there.
A big thanks again to Pete Warden for writing this code and being so gracious about me making these changes. I hope this proves useful to some of you out there too.