Announcement time - today is my last day at AMD. I’ll be taking a few weeks off, before starting a new gig at Unity mid-July.
It is also the end of my five-year involvement in Vulkan & SPIR-V - having contributed to both specifications since 2014.
I’ve very much enjoyed my time in this slice of the industry, but it was time for a new challenge.
Farewell AMD
I’ve been at AMD a little over a year now - and it has been an interesting challenge.
In this post, we’ll use floating-point scalar evolution (fpscev) to simplify instructions based on any range data we’ve managed to deduce.
This post is the latest in a series about my experiments with floating-point scalar evolution. You really want to read them in order:
An Experimental Floating-Point Scalar Evolution
Using Floating-Point Scalar Evolution to Propagate Fast-Math Flags
The LLVM optimization pass discussed in this post is available on github here.
As a follow-up to my post An Experimental Floating-Point Scalar Evolution, I’ve started looking into using my previous analysis to actually change the LLVM IR for the better.
The LLVM optimization pass discussed in this post is available on github here.
Fast-Math Flags
LLVM has a way of encoding fast-math flags on various floating-point operations. These flags let the compiler assume something about the operation (for example, that its operands and result are never NaN), which generally lets it generate faster code.
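To make the link between range data and fast-math flags concrete, here is a minimal sketch of how two of LLVM's flags, nnan (no NaNs) and ninf (no infinities), could be justified once ranges are known for an operation's operands and result. The `KnownRange` and `DeducedFlags` types and the `deduceFlags` function are hypothetical names invented for this illustration. They are not taken from the pass itself, which works on LLVM IR rather than on plain structs.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical summary of what an analysis knows about one value:
// it lies in [lo, hi] unless it is NaN, and mayBeNaN says whether NaN is possible.
struct KnownRange {
  double lo;
  double hi;
  bool mayBeNaN;
};

// The two fast-math flags this sketch tries to justify.
struct DeducedFlags {
  bool nnan; // operands and result are never NaN
  bool ninf; // operands and result are never +/-infinity
};

// True if neither bound of the range is infinite.
inline bool excludesInfinity(const KnownRange &r) {
  assert(r.lo <= r.hi); // also rejects NaN bounds
  return std::isfinite(r.lo) && std::isfinite(r.hi);
}

// Both flags constrain the operands as well as the result, so each flag is
// only set when every one of the three ranges rules the special value out.
inline DeducedFlags deduceFlags(const KnownRange &lhs, const KnownRange &rhs,
                                const KnownRange &result) {
  DeducedFlags flags;
  flags.nnan = !lhs.mayBeNaN && !rhs.mayBeNaN && !result.mayBeNaN;
  flags.ninf = excludesInfinity(lhs) && excludesInfinity(rhs) &&
               excludesInfinity(result);
  return flags;
}
```

For a multiply whose operands are known to lie in [0, 255] and [0.5, 0.5], with a result in [0, 127.5], this sketch would allow both flags. An operand with an unknown range allows neither.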
The TL;DR - after a conversation at EuroLLVM with Steve Canon about how LLVM is missing scalar evolution analysis for floating-point, I’ve spent some spare time hacking on a new LLVM analysis pass - fpscev (Floating-Point SCalar EVolution) - available here at github. The pass will analyze floating-point operations in a function and work out if there are any constraints on the range of these values, information which can be used to better optimize code.
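As a rough illustration of what "constraints on the range of these values" can look like, here is a small interval-arithmetic sketch. It is not the fpscev implementation: `FloatRange`, `rangeOfUInt8ToFloat`, `multiply` and `add` are hypothetical names made up for this example, and it assumes all bounds are finite and ignores NaN entirely.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical closed interval [lo, hi] that a float value is known to lie in.
struct FloatRange {
  float lo;
  float hi;
};

// An 8-bit unsigned integer converted to float can only land in [0, 255].
inline FloatRange rangeOfUInt8ToFloat() { return {0.0f, 255.0f}; }

// Range of a * b. With finite bounds, the extremes of the product are always
// found among the four corner products of the two intervals.
inline FloatRange multiply(FloatRange a, FloatRange b) {
  assert(a.lo <= a.hi && b.lo <= b.hi);
  const float p0 = a.lo * b.lo;
  const float p1 = a.lo * b.hi;
  const float p2 = a.hi * b.lo;
  const float p3 = a.hi * b.hi;
  return {std::min({p0, p1, p2, p3}), std::max({p0, p1, p2, p3})};
}

// Range of a + b: the lower bounds add and the upper bounds add.
inline FloatRange add(FloatRange a, FloatRange b) {
  assert(a.lo <= a.hi && b.lo <= b.hi);
  return {a.lo + b.lo, a.hi + b.hi};
}
```

Starting from a converted byte and scaling it by 0.5, `multiply(rangeOfUInt8ToFloat(), FloatRange{0.5f, 0.5f})` yields the range [0, 127.5]. That is the kind of fact an optimizer can use to rule out NaN, infinity or negative values further down the function.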
For those that don’t know - my favourite of the single-header C/C++ libraries that I’ve created is my unit test helper - utest.h. The main features are:
Single header (duh).
Works with C and C++.
Allows a single executable to be created with both C and C++ tests linked in.
Blindingly fast to compile and run.
After reading this post on another attempt at a googletest replacement, jctest, and all the awesome analysis that Matthias included comparing the compile and runtime performance of jctest to googletest, it suddenly hit me that I haven’t actually provided my performance analysis in any blog to show how good utest.