Exploring Hourglass APIs in Rust
There are two talks on APIs that I think every programmer should watch and study:
- Designing and Evaluating Reusable Components from Casey Muratori. This absolutely wonderful talk is the fundamental source for how to design APIs.
- Hourglass Interfaces for C++ APIs from Stefanus Du Toit. This talk discusses having a rich API for users that is backed by a C API into proprietary code.
These two talks give a really good and grounded lesson on fundamental API design - the choices you make and the ramifications they can have. In particular, the hourglass design is a really nice way to ship a rich, high-functioning API that is backed by proprietary code whose source you don’t want to ship. For example:
struct MyDataOpaque;
extern "C" MyDataOpaque* mydata_create(int);
extern "C" void mydata_destroy(MyDataOpaque*);

struct MyData final {
    explicit MyData(int someState) : opaque(mydata_create(someState)) {}
    ~MyData() { mydata_destroy(opaque); }
    // Copying would destroy the same opaque pointer twice, so forbid it.
    MyData(const MyData&) = delete;
private:
    MyDataOpaque* const opaque;
};
This is a classic hourglass API in C++ - we’ve got the C API that returns some opaque data, and the hourglass MyData struct lets us expose this to C++ users in a way they are familiar with. In Stefanus’ talk he went into detail about how they also implement the C API in C++ - meaning that both sides of the thin C API are written in the high-level language they enjoy - C++.
This all got me thinking - would this approach be possible in some fashion with Rust? First the caveats:
- I suspect, though I am not well versed enough in Rust to be sure, that building the two sides of the hourglass API with different versions of Rust could cause explosions. So I’m going to assume that both sides are built with the same version.
- I really hope that it is safe to use memory allocating functions (like Box or Vec) on both Rust sides of a C API.
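One way to reduce the allocation worry is to make sure memory is always freed by the side that allocated it, which is what the mydata_create / mydata_destroy pair above does. Here is a minimal sketch of the same rule applied to a string. The names hourglass_greeting and hourglass_greeting_free are hypothetical and not part of the example library, and both sides live in one file here purely so the sketch is self-contained:

```rust
use std::ffi::{CStr, CString};
use std::os::raw::c_char;

// Bottom side (hypothetical export; a real one would also be #[no_mangle]).
// Allocates the string and hands ownership to the caller as a raw pointer.
pub extern "C" fn hourglass_greeting() -> *mut c_char {
    CString::new("hello from the bottom")
        .expect("literal has no interior NUL")
        .into_raw()
}

// Bottom side (hypothetical export): the matching way to free that string.
pub extern "C" fn hourglass_greeting_free(s: *mut c_char) {
    if !s.is_null() {
        // Safety: assumes `s` came from `hourglass_greeting` and is freed once.
        drop(unsafe { CString::from_raw(s) });
    }
}

// Top side: copy the bytes into memory it owns, then give the pointer back.
pub fn greeting() -> String {
    let raw = hourglass_greeting();
    // Safety: `raw` is a valid NUL-terminated string until it is freed below.
    let owned = unsafe { CStr::from_ptr(raw) }.to_string_lossy().into_owned();
    hourglass_greeting_free(raw);
    owned
}
```

With this shape the top never frees a pointer it did not allocate, so it should not matter whether the two sides end up sharing an allocator.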
These pretty major caveats aside - is it possible?
The Bottom of the Hourglass
First we’ll implement the bottom of the hourglass - this would be the proprietary code that you don’t want to ship (all your secret sauce might be in it!).
#[repr(C)]
pub struct MyDataOpaque {
    some_state: i32,
}

impl MyDataOpaque {
    pub fn new(some_state: i32) -> MyDataOpaque {
        MyDataOpaque { some_state }
    }
}

// Exported with an unmangled name and the C calling convention.
#[no_mangle]
pub extern "C" fn mydata_create(some_state: i32) -> *mut MyDataOpaque {
    // Move the value to the heap and hand ownership to the caller as a raw pointer.
    Box::into_raw(Box::new(MyDataOpaque::new(some_state)))
}

#[no_mangle]
pub extern "C" fn mydata_destroy(d: *mut MyDataOpaque) {
    if d.is_null() {
        return;
    }
    // Rebuilding the Box takes ownership back; dropping it frees the allocation.
    drop(unsafe { Box::from_raw(d) });
}
It’s quite a simple bit of code (for our simple example), but the crux of it is just some opaque struct, and some unmangled public extern symbols for the exported C API.
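A real library would need more than create and destroy. As a rough sketch of how the C API could grow, here is a hypothetical accessor, mydata_get_state, which is not part of the original example. It reads through the opaque pointer so the top of the hourglass never needs to know the struct layout. The -1 returned for a null pointer is an arbitrary sentinel chosen for this sketch, and #[no_mangle] is left off only to keep the snippet standalone:

```rust
pub struct MyDataOpaque {
    some_state: i32,
}

pub extern "C" fn mydata_create(some_state: i32) -> *mut MyDataOpaque {
    Box::into_raw(Box::new(MyDataOpaque { some_state }))
}

// Hypothetical accessor: borrows the data behind the pointer without
// taking ownership. Returns -1 (an arbitrary sentinel) for a null pointer.
pub extern "C" fn mydata_get_state(d: *const MyDataOpaque) -> i32 {
    // Safety: assumes `d` is null or a live pointer from `mydata_create`.
    match unsafe { d.as_ref() } {
        Some(data) => data.some_state,
        None => -1,
    }
}

pub extern "C" fn mydata_destroy(d: *mut MyDataOpaque) {
    if !d.is_null() {
        // Safety: assumes `d` came from `mydata_create` and is destroyed once.
        drop(unsafe { Box::from_raw(d) });
    }
}
```

Taking a `*const` pointer for read-only functions and a `*mut` pointer for anything that mutates or destroys keeps the intent visible even at the C level.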
Building the Two Crate Solution
I wanted a way to build a Rust crate for the above, and then build a second crate that uses it with cargo. I did not want a single invocation of cargo, because I wanted to be 100% sure that I was exercising the export and import paths of Rust. To do this I used a build.rs like:
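use std::env;
use std::path::Path;
use std::process::Command;

fn main() {
    let out_dir = env::var_os("OUT_DIR").unwrap();
    let working_dir = env::var_os("CARGO_MANIFEST_DIR").unwrap();

    // Build the bottom-of-the-hourglass crate with a separate cargo invocation.
    let status = Command::new("cargo")
        .arg("build")
        .arg("--manifest-path")
        .arg(Path::new(&working_dir).join("MyDataOpaque/Cargo.toml"))
        .arg("--target-dir")
        .arg(&out_dir)
        .arg("--release")
        .status()
        .unwrap();
    // Fail this build loudly if the nested build did not succeed.
    assert!(status.success(), "building MyDataOpaque failed");

    // Tell rustc where to look for the library that was just built.
    println!(
        "cargo:rustc-link-search=native={}/release",
        out_dir.to_str().unwrap()
    );
}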
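This build script calls cargo in a sub-folder called MyDataOpaque, builds the crate, and then adds the folder containing the built library to our crate's native library search path. The library itself is requested by the #[link] attribute in the code below.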
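The build script only emits a search path. An alternative, sketched here under the assumption that the bottom crate is built as a static library, is to have the build script emit the link request as well, in which case the #[link] attribute on the extern block should no longer be needed. The helper name link_directives is hypothetical; the two cargo directives it produces are the standard `rustc-link-search` and `rustc-link-lib` ones:

```rust
use std::path::Path;

// Builds the lines a build script would print for cargo to pick up.
// The `release` sub-folder matches the `--release` build in this post;
// `lib_name` and the `static` kind are assumptions about the bottom crate.
pub fn link_directives(out_dir: &Path, lib_name: &str) -> Vec<String> {
    let lib_dir = out_dir.join("release");
    vec![
        // Where to search for native libraries.
        format!("cargo:rustc-link-search=native={}", lib_dir.display()),
        // Which library to link, and how.
        format!("cargo:rustc-link-lib=static={}", lib_name),
    ]
}
```

Inside build.rs, each returned line would simply be passed to `println!`. Keeping the formatting in a small function like this also makes the script's output easy to check on its own.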
Building the Top of the Hourglass
So now we’ve got a Rust library, exposed via a C API, that we want to write Rust bindings for.
Just to prove it worked I first tried using cbindgen to generate a C header for the bottom of the hourglass, and then bindgen to take this header and generate FFI bindings for it in the top of the hourglass, but these have a ton of dependencies (like libclang) that I wasn’t happy about. So I instead decided to just write the FFI bindings for the top of the hourglass by hand.
Aside: assuming this approach was viable you could foresee an hourglass_bindgen crate that did something similar to what cbindgen does, but just writes out the Rust FFI to the C API instead.
// Zero-sized stand-in: the top of the hourglass never sees the real layout.
#[repr(C)]
pub struct MyDataOpaque {
    _private: [u8; 0],
}

#[link(name = "MyDataOpaque")]
extern "C" {
    fn mydata_create(some_state: i32) -> *mut MyDataOpaque;
    fn mydata_destroy(opaque: *mut MyDataOpaque);
}

pub struct MyData {
    opaque: *mut MyDataOpaque,
}

impl MyData {
    pub fn new(some_state: i32) -> MyData {
        MyData {
            opaque: unsafe { mydata_create(some_state) },
        }
    }
}

impl Drop for MyData {
    fn drop(&mut self) {
        // Hand the pointer back to the side that allocated it.
        unsafe {
            mydata_destroy(self.opaque);
        }
    }
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn create_destroy() {
        // new() takes the initial state; the value is dropped immediately.
        let _ = MyData::new(42);
    }
}
As we can see this code is very similar to the C++ example I showed at the beginning of the post - it wraps a C API in a Rust struct and then can use the existing safe mechanisms of Rust (like the Drop trait) to ensure safety of the data used.
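To make the Drop behaviour visible, here is a small self-contained sketch. It swaps the real linked library for local stand-in functions with the same names (an assumption made only so the snippet runs without the second crate) and counts live objects. It also stores the pointer as a `NonNull`, a variation on the wrapper above, so a null result from the C API is rejected up front. The names try_new and live_objects are hypothetical:

```rust
use std::ptr::NonNull;
use std::sync::atomic::{AtomicUsize, Ordering};

// Counts objects that have been created but not yet destroyed.
static LIVE: AtomicUsize = AtomicUsize::new(0);

pub struct MyDataOpaque {
    _some_state: i32,
}

// Stand-in for the function that would normally come from the linked library.
extern "C" fn mydata_create(some_state: i32) -> *mut MyDataOpaque {
    LIVE.fetch_add(1, Ordering::SeqCst);
    Box::into_raw(Box::new(MyDataOpaque { _some_state: some_state }))
}

// Stand-in for the matching destroy function.
extern "C" fn mydata_destroy(d: *mut MyDataOpaque) {
    if !d.is_null() {
        LIVE.fetch_sub(1, Ordering::SeqCst);
        // Safety: `d` came from `mydata_create` and is destroyed once.
        drop(unsafe { Box::from_raw(d) });
    }
}

pub struct MyData {
    opaque: NonNull<MyDataOpaque>,
}

impl MyData {
    // Returns None instead of wrapping a null pointer.
    pub fn try_new(some_state: i32) -> Option<MyData> {
        NonNull::new(mydata_create(some_state)).map(|opaque| MyData { opaque })
    }
}

impl Drop for MyData {
    fn drop(&mut self) {
        mydata_destroy(self.opaque.as_ptr());
    }
}

pub fn live_objects() -> usize {
    LIVE.load(Ordering::SeqCst)
}
```

Because the wrapper holds a raw pointer type, it is neither Send nor Sync by default, which is a conservative starting point when the thread-safety of the bottom half is unknown.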
And if I run it?
running 1 test
test tests::create_destroy ... ok
It works, nice!
Conclusion
Hourglass APIs appear to work in Rust (minus the caveats above that I’ll have to explore further). I do think that maybe using no_std in the bottom of the hourglass could even mitigate issues with multiple versions of Rust being used, but again I’ll have to verify it.
You could even foresee an approach where the build.rs, instead of building some other crate (whose source you don’t want to ship), uses the TARGET to fetch the correct pre-built version from a web service, or maybe has a bunch of the library versions resident in the crate (obviously the total crate size limit might become an issue though) and picks between them.
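As a rough sketch of the "resident in the crate" variant, a build script could map the TARGET triple that cargo provides to a folder of pre-built libraries. Everything about the layout here is an assumption: the `prebuilt/<target-triple>/` folder structure, the list of supported triples, and the helper name prebuilt_dir are all hypothetical.

```rust
use std::path::{Path, PathBuf};

// Hypothetical layout: <crate root>/prebuilt/<target-triple>/ holds the
// pre-built library for that target. Returns None for unsupported targets.
pub fn prebuilt_dir(manifest_dir: &Path, target: &str) -> Option<PathBuf> {
    // Illustrative list only; a real crate would list what it actually ships.
    const SUPPORTED: &[&str] = &[
        "x86_64-unknown-linux-gnu",
        "x86_64-pc-windows-msvc",
        "aarch64-apple-darwin",
    ];
    if SUPPORTED.contains(&target) {
        Some(manifest_dir.join("prebuilt").join(target))
    } else {
        None
    }
}
```

In build.rs this would be called with the values of the CARGO_MANIFEST_DIR and TARGET environment variables, and the resulting path printed as a `cargo:rustc-link-search=native=...` line. Returning None lets the script fail with a clear message for targets that have no pre-built library.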
It’s pretty awesome to me that this approach looks feasible though!