I [[musttail]] You About a Tokenizer

One thing I’ve never really messed with before is the [[clang::musttail]] compiler attribute. A tail call is when a function's final action is to call another function and return its result. For example:

    int foo(float x);

    int bar(float x) {
      x += 42.0f;
      return foo(x);
    }

In the above example, the call to foo is a tail call within the function bar. Compilers can take advantage of this by doing something called tail-call optimization, which lets the compiler skip pushing a new stack frame and turn the call into a jump.
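As a rough illustration of what the attribute buys you, here is a minimal sketch of a recursive function whose recursive call is marked as a mandatory tail call. The names `MUSTTAIL` and `sum_to` are made up for this example, and the macro falls back to nothing on compilers that don't report support for `[[clang::musttail]]`, in which case the tail call is only an optimization the compiler may or may not perform.

```cpp
#include <cstdint>

// Hypothetical helper macro: expands to the attribute only when the
// compiler reports support for it, and to nothing otherwise.
#if defined(__has_cpp_attribute)
#if __has_cpp_attribute(clang::musttail)
#define MUSTTAIL [[clang::musttail]]
#endif
#endif
#ifndef MUSTTAIL
#define MUSTTAIL
#endif

// Sums n + (n - 1) + ... + 1 into acc. The recursive call is the last
// thing the function does, so it is a tail call. Caller and callee have
// the same signature, which the attribute requires.
std::uint64_t sum_to(std::uint64_t n, std::uint64_t acc) {
  if (n == 0) {
    return acc;
  }
  MUSTTAIL return sum_to(n - 1, acc + n);
}
```

When the attribute is in effect, the compiler must either emit the recursive call as a jump or reject the code, so the stack should not grow with n even in unoptimized builds. Without it, a deep enough recursion in a debug build can overflow the stack.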

Custom asserts in LLVM

I tried (and failed) to convince the LLVM folks to allow for runtime-togglable asserts. No biggie - the people much more involved with maintaining upstream didn’t want to have yet another codepath to maintain. They said that it’d cost 20% runtime performance to have asserts enabled, and that this cost would likely still be paid even if I had asserts toggled off at runtime. Why, you might ask? Because LLVM guards a bunch of assert-checking code behind NDEBUG preprocessor checks, and stores extra sideband information when asserts are enabled so that it can do some deeper checks.
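To make the NDEBUG point concrete, here is a small sketch of the general pattern. `TrackedList` and its members are hypothetical and not taken from LLVM; the point is only that the extra bookkeeping is chosen at compile time, so a build that includes it keeps paying for it regardless of any runtime switch around the checks themselves.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical container for illustration only, not an LLVM class.
class TrackedList {
public:
  void push(int value) {
    items_.push_back(value);
#ifndef NDEBUG
    // Sideband bookkeeping that only exists in assert-enabled builds.
    ++generation_;
#endif
  }

  int at(std::size_t index) const {
    // Compiled out entirely when NDEBUG is defined.
    assert(index < items_.size() && "index out of range");
    return items_[index];
  }

  std::size_t size() const { return items_.size(); }

private:
  std::vector<int> items_;
#ifndef NDEBUG
  // Extra member: changes the object's size and the work done in push().
  unsigned generation_ = 0;
#endif
};
```

In an assert-enabled build every push() updates the counter and every object carries the extra field. A runtime flag could skip the assert in at(), but the larger object and the extra store would remain.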

Your Own Constant Folder in C/C++

I was talking with someone today who really, really wanted sqrtps to be used in some code they were writing. Because of a quirk in clang (still there as of clang 18.1.0), if you happened to use -ffast-math, clang would butcher the use of the intrinsic. So for the code:

    __m128 test(const __m128 vec) {
      return _mm_sqrt_ps(vec);
    }

Clang would compile it correctly without fast-math:

    test: # @test
      sqrtps xmm0, xmm0
      ret

And create this monstrosity with -ffast-math:

Verse Transactional Memory

When I interviewed for Epic Games it was for a graphics post - I wanted to get back to working on shader compilers. But even though most of my interviews were with the fantastic graphics side of the company, I had a few interviews about something I knew very little about - the Verse language. And in one of those interviews I was asked about something I hadn’t thought about for 15 years - Transactional Memory.

Updating GitHub repos to Apple Silicon

I’ve updated my C/C++ open source libraries utest.h, utf8.h, ubench.h, hashmap.h, subprocess.h, and json.h to use the new Apple Silicon GitHub CI runners. So how hard is it? Simple! You just add macos-14 to the build -> strategy -> matrix. I took the opportunity to drop macos-latest (which is still set to macos-13, the last x86 runner) and explicitly use the oldest supported macos-11 instead. The new Apple Silicon runner is roughly 2x faster than the x86 one too - nice!
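For reference, a workflow fragment along these lines might look like the sketch below. The job name build matches the path mentioned above, but the matrix key os and the checkout step are assumptions for illustration, not copied from any of the repos.

```yaml
jobs:
  build:
    strategy:
      matrix:
        # Assumed key name; macos-14 is the Apple Silicon runner,
        # macos-11 is the explicitly pinned x86 one.
        os: [macos-14, macos-11]
    runs-on: ${{ matrix.os }}
    steps:
      - uses: actions/checkout@v4
```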