Effortlessly porting a major C++ library to Node.js with SWIG Node-API

Momtchil Momtchev
15 min read · May 9, 2023

An introduction for C++ engineers with previous binary Node.js addon experience to SWIG Node-API

(This tutorial will continuously evolve to match SWIG Node-API. The current version is from September 9th 2023 and includes asynchronous execution and TypeScript support, which still haven’t been merged into the main trunk.)


Not so long ago, while working on a Node.js project which generated map tiles, I was surprised to find that ImageMagick, the quintessential image processing library, initially released in 1990, was absent from the Node.js ecosystem. Sure, there were some wrappers around the CLI tools, but they didn’t even come close to the full range of features available in the library.

Being someone with lots of spare time on my hands (you can check my GitHub profile for my story), I decided that it was time to right this wrong, and to do so by working smart instead of working hard.

Now, if you are part of the younger generation of Noders, you probably have never heard of SWIG before. And it is a pity, because this amazing piece of software can easily crash the job market for high-level language bindings writers — by fully automating their jobs.

So I found myself looking at SWIG for Node.js for the second time during the last few years — the first time being when I started maintaining gdal-async — the asynchronous GDAL bindings for Node.js. And once again, I came to the conclusion that the Node.js support was barely usable — it hadn’t been updated for the last 10 years — which in the Node.js world was comparable to the evolution from feudalism to industrial society.

Now, ImageMagick is an absolutely huge C++ project: the SWIG-generated Node.js bindings are about 400,000 lines of code. Hand-writing these would probably have cost upwards of 1 man-year.

SWIG offers the best of both worlds — at the cost of a relatively steep learning curve for the bindings author

With SWIG, it took me two weeks to implement a modern NAPI-based backend — and then one more week to create the ImageMagick bindings. And only because this was the SWIG NAPI testing project and there were some rough edges to polish.

This was the perfect opportunity to kill three birds with one stone:

  • Publish full ImageMagick bindings for Node.js
  • Write a modern, NAPI-based, Node.js backend for SWIG
  • Write a simple tutorial to unleash a torrent of C++ libraries for Node.js

Enter SWIG

SWIG is a C++ header compiler that can produce bindings for interpreted and dynamically typed languages that cannot link directly with C/C++ shared libraries. It supports a large number of languages, including JavaScript.

Now, SWIG is a complicated piece of engineering and no matter what, you won’t be able to avoid delving deep into its huge documentation.

Still, this tutorial’s goal is to allow you to quickly understand some basic concepts and bootstrap your next Node.js addon project.

Learning and using SWIG is definitely not easy. It is an entire computer language with its own C++ transpiler. It eliminates the work that is necessary to write a huge native C++ addon. However, it does not eliminate the requirement of having the necessary skills to do so.

When using SWIG, you will be expected to be able to read and understand the generated code, especially if you will be writing advanced typemaps.

So, if you have never worked on a Node.js binary addon — expect a very steep learning curve.

As of September 9th 2023, the bulk of the Node-API support in SWIG has been merged into the main trunk. It will be available in SWIG 4.2.0. The asynchronous execution and the TypeScript support are still being reviewed and polished.

The full SWIG Node-API including asynchronous execution and TypeScript support can be used by checking out my development branch at

https://github.com/mmomtchev/swig#mmom

node-magickwand

The resulting project, node-magickwand, is a real-world Node.js binary addon based on a very complex C++ library, with multi-OS support.

The scaffolding

Create a node-addon-api skeleton:

sudo npm install -g generator-napi-module
mkdir node-magickwand && cd node-magickwand && yo napi-module

SWIG can compile C++ headers, but it usually needs some help with deducing how things work. Its most basic block is the interface .i file. Create its skeleton:

%module magickwand

%{
#include <Magick++.h>
#include <MagickWand/MagickWand.h>
#include <iostream>

using namespace Magick;
%}

%include "cpointer.i"
%include "std_string.i"
%include "typemaps.i"

The very first thing we start with is to tell SWIG how it is supposed to access the C++ library we will be working with. These are the #include statements necessary to bring the ImageMagick definitions into a C++ program. The %{ %} block — called a verbatim block — is a block that SWIG will simply include verbatim in the generated wrappers.

Then, we have the standard JavaScript typemaps — we won’t be writing code that converts v8::String (wrapped in a Napi::String) arguments to std::string arguments — SWIG NAPI comes with the basic types already supported.

%rename(call) operator();
%rename(clone) operator=;
%rename(equal) operator==;
%rename(notEqual) operator!=;
%rename(gt) operator>;
%rename(lt) operator<;
%rename(gte) operator>=;
%rename(lte) operator<=;

This part is almost mandatory in any JavaScript project — JS does not have operator overloading, so we have to rename those methods. The first line means that when dealing with the callable object transparentImage we won’t be calling transparentImage(im) — we will be calling transparentImage.call(im).
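To make the effect of these renames concrete, here is a minimal JavaScript sketch. The Point and ScaleBy classes are hypothetical stand-ins written in plain JavaScript, not part of ImageMagick or of the generated bindings. They only mimic the method names that a wrapped C++ class with operator==, operator!= and operator() would expose after the renames above.

```javascript
// Hypothetical stand-in for a wrapped C++ class that overloads
// operator== and operator!=. Only the method names mirror the %rename rules.
class Point {
  constructor(x, y) {
    this.x = x;
    this.y = y;
  }
  // C++: a == b
  equal(other) {
    return this.x === other.x && this.y === other.y;
  }
  // C++: a != b
  notEqual(other) {
    return !this.equal(other);
  }
}

// Hypothetical stand-in for a C++ function object (a class with operator()).
class ScaleBy {
  constructor(factor) {
    this.factor = factor;
  }
  // C++: scale(p)
  call(p) {
    return new Point(p.x * this.factor, p.y * this.factor);
  }
}

const a = new Point(1, 2);
const b = new Point(2, 4);
const twice = new ScaleBy(2);

console.log(a.equal(b));             // C++: a == b
console.log(a.notEqual(b));          // C++: a != b
console.log(twice.call(a).equal(b)); // C++: twice(a) == b
```

The call sites are a little more verbose than in C++, but every overloaded operator remains reachable under a predictable name.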

The main dish

%nspace;
namespace MagickCore {
%include "../swig/magickcore.i"
%include "../swig/magickwand.i"
}
%include "../swig/magick++.i"

Finally, we bring in all the definitions that we will be compiling. This is different from the verbatim block: the verbatim block is a piece of code that will be directly included in the generated code, while here we are actually compiling with SWIG. SWIG has two ways of absorbing C++ headers: %include, which means include and generate wrappers for everything you see, and %import, which means acquaint yourself with the types mentioned there, but do not generate wrappers for them. The second form is typically used for low-level stuff that SWIG needs to be aware of.

For example, ImageMagick has both an old plain C API — called MagickCore and a newer higher-level C++ API called Magick++. As many methods share the same name, in order to avoid collisions, we have to enable the namespaces support — %nspace — and to include the old C API in a separate namespace.

Typically, MagickCore is precisely the type of header files that you might consider using an %import for. They contain some type definitions that are used by the other API, but one will probably never need these from JavaScript.

Normally, here you would have to %include or %import every single header file of your target project. SWIG does not follow #include statements unless specifically told to, and in most normal projects you won’t need this very unusual and hard to use feature. You won’t need the standard class libraries either: SWIG already includes its own definitions for classes such as std::vector or std::string, which you have to use. You just have to list all the header files in the exact order of their dependencies. For ImageMagick, with its 120 header files, this is a daunting task and I had to write a dependency analyser and generator. You can check it in src/deps.js, as well as its output in swig/magickcore.i. You won’t need this in most normal projects and I won’t cover it here.

Sweeping the unusable methods under the carpet

Every C++ project has methods that won’t be needed or won’t even be usable from Node.js — and ImageMagick is not an exception.

We start by ignoring the methods that depend on a va_list:

%ignore LogMagickEventList;
%ignore ThrowMagickExceptionList;

We also ignore everything that has custom or special memory allocation — these are spread around the various classes — so we take advantage of the regular expressions support:

%rename("$ignore", regextarget=1) "NoCopy$";
%rename("$ignore", regextarget=1) "Allocator";

The hard methods — returning a container from std::vector

Most complex C++ projects will inevitably include some methods that SWIG will be unable to handle without some help. Take a look at this one:

template <class Container > 
void coderInfoList( Container *container_,
CoderInfo::MatchType isReadable_ = CoderInfo::AnyMatch,
CoderInfo::MatchType isWritable_ = CoderInfo::AnyMatch,
CoderInfo::MatchType isMultiFrame_ = CoderInfo::AnyMatch
);

This method allows the user to retrieve the descriptors of all image file formats that have been compiled in. You have to pass a pointer to an std compatible container in which you will receive the resulting list. Here is how you are supposed to use it:

list<CoderInfo> coderList; 
coderInfoList( &coderList, // Reference to output list
CoderInfo::TrueMatch, // Match readable formats
CoderInfo::AnyMatch, // Don't care about writable formats
CoderInfo::AnyMatch); // Don't care about multi-frame support

First of all, C++ templates must be instantiated when compiling. JavaScript won’t be able to call this method unless it has been pre-instantiated.

So this is the very first thing we will need to tell SWIG:

%template(coderInfoArray)
std::vector<Magick::CoderInfo>;
%template(coderInfoList)
Magick::coderInfoList<std::vector<Magick::CoderInfo>>;

Then, we will have to include some hand-written code to transform these input arguments. This is called a SWIG typemap:

%include "std_vector.i"

%typemap(in, numinputs=0)
std::vector<Magick::CoderInfo> *container_
{
$1 = new std::vector<Magick::CoderInfo>;
}

%typemap(argout)
std::vector<Magick::CoderInfo> *container_
{
$result = SWIG_NAPI_NewPointerObj(env,
$1, $1_descriptor, SWIG_POINTER_OWN);
}

%typemap(tsout)
std::vector<Magick::CoderInfo> *container_
"std.coderInfoArray";

We create three typemaps here.

First, an in typemap that handles arguments of type std::vector<Magick::CoderInfo> *. We can apply it to all such arguments, or only to arguments named container_, which is probably safer. This typemap works without expecting any arguments coming from JavaScript: numinputs=0 effectively eliminates this argument for the JavaScript caller, who will call this method without its first argument. Then, we tell SWIG to initialize this first argument, which we refer to as $1, with a new statement. We create an empty std::vector for it.

The second typemap is an argout typemap. SWIG applies it when exiting the method. We use the same type specification as the first one — to make sure that it is applied whenever the first one is applied. Here we assign the method result $result a new wrapped JS object by transferring ownership — because we used new — with the SWIG_POINTER_OWN flag. $1 is the method argument that we assigned in the input typemap. $1_descriptor is its SWIG type descriptor.

The third typemap, tsout, is required only if you enable the TypeScript support. It specifies the TypeScript return type of the method.

If you have never worked on a binary Node.js addon this might not be immediately clear. Every C++ object that will be visible to JavaScript must have a JS wrapper that implements JS-callable methods. Luckily, SWIG already includes a JS-compatible std::vector that we can access by including std_vector.i. SWIG_NAPI_NewPointerObj creates this wrapper.

Voilà, we have effectively transformed this C++ invocation:

list<CoderInfo> coderList; 
coderInfoList( &coderList, // Reference to output list
CoderInfo::TrueMatch, // Match readable formats
CoderInfo::AnyMatch, // Don't care about writable formats
CoderInfo::AnyMatch); // Don't care about multi-frame support

to this JS invocation:

const list = coderInfoList(Magick.TrueMatch, Magick.AnyMatch,
Magick.AnyMatch);

Instead of expecting a pointer to a container, the JS version of this method will simply return a wrapped std::vector that we can access from JavaScript. This SWIG-provided object implements the size() and get() methods.
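Since the wrapped vector is not a JavaScript array, it can be handy to copy it into one. The helper below is a sketch that relies only on the size() and get() methods mentioned above. FakeVector is a hypothetical stand-in used to make the example runnable without the addon, and vectorToArray is an illustrative name, not something SWIG generates.

```javascript
// Copy any object exposing size() and get(i) into a plain JavaScript array.
function vectorToArray(vec) {
  const out = [];
  for (let i = 0; i < vec.size(); i++) {
    out.push(vec.get(i));
  }
  return out;
}

// Hypothetical stand-in for the SWIG-wrapped std::vector (illustration only).
class FakeVector {
  constructor(items) {
    this.items = items;
  }
  size() {
    return this.items.length;
  }
  get(i) {
    return this.items[i];
  }
}

const formats = vectorToArray(new FakeVector(['PNG', 'JPEG', 'GIF']));
console.log(formats.length, formats.join(','));
```

With the real bindings, the argument would presumably be the object returned by coderInfoList, after which the usual array methods such as map and filter become available.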

The hard methods — a void pointer from a Buffer

There are a number of methods in Magick::Blob, such as Magick::Blob::update or one of the constructors Magick::Blob::Blob, that expect arguments of the form const void *data_, size_t length_. We want the JavaScript caller to be able to invoke these with a single Buffer.

SWIG NAPI already provides these typemaps out of the box for arguments named const void *buffer_data, const size_t buffer_len if we include nodejs_buffer.i. We can use an assignment to add a new case:

%include "nodejs_buffer.i"

%typemap(in) (const void *data_,const size_t length_) =
(const void *buffer_data, const size_t buffer_len);
%typemap(typecheck) (const void *data_,const size_t length_) =
(const void *buffer_data, const size_t buffer_len);
%typemap(ts) (const void *data_,const size_t length_) =
(const void *buffer_data, const size_t buffer_len);

This kind of typemap is called a multi-argument typemap. It will be applied when a method expects two consecutive arguments of the given type. The name is optional — we can apply it to all pairs of arguments of this type — or only to those named data_ and length_ — which is the safer choice. We haven’t provided a numinputs=1 — this is the default value. This means that those two arguments will be represented by a single JavaScript argument.

The hard methods — a void pointer from a TypedArray

Let’s take a look at another twisted example, one of the Image class constructors:

Image::Image(const size_t width_, 
const size_t height_,
const std::string &map_,
const StorageType type_,
const void *pixels_);

It expects a raw pointer to the image data, dimensions and a data type.

We want to make it work with a TypedArray.

First of all, we use a verbatim block to embed a function that converts the NAPI type to Magick::StorageType directly into the final code:

%{
inline Magick::StorageType GetMagickStorageType(
Napi::Env env, const Napi::TypedArray &array) {

switch (array.TypedArrayType()) {
case napi_int8_array:
case napi_uint8_array:
case napi_uint8_clamped_array:
return MagickCore::CharPixel;
case napi_int16_array:
case napi_uint16_array:
return MagickCore::ShortPixel;
case napi_int32_array:
case napi_uint32_array:
return MagickCore::LongPixel;
case napi_float32_array:
return MagickCore::FloatPixel;
case napi_float64_array:
return MagickCore::DoublePixel;
#if (NAPI_VERSION > 5)
case napi_bigint64_array:
case napi_biguint64_array:
#endif // (NAPI_VERSION > 5)
return MagickCore::LongLongPixel;
}
SWIG_Error(SWIG_ERROR, "Invalid type");
// Avoid a warning
return MagickCore::CharPixel;
}
%}

Nothing too complicated, nothing unusual.

Then we create a multi-argument input typemap:

%typemap(in)
(const Magick::StorageType type_, void *pixels_)
(Napi::TypedArray _global_typed_array)
{
if ($input.IsTypedArray()) {
_global_typed_array = $input.As<Napi::TypedArray>();
$1 = GetMagickStorageType(env, _global_typed_array);
$2 = reinterpret_cast<void*>(
reinterpret_cast<uint8_t *>(
_global_typed_array.ArrayBuffer().Data()) +
_global_typed_array.ByteOffset());
} else {
SWIG_exception_fail(SWIG_TypeError,
"in method '$symname', argument $argnum is not a TypedArray");
}
}

The single input argument coming from JavaScript is referred to as $input. It is of type Napi::Value.

Note that this typemap also creates a local variable in the wrapper method: this is the second expression in parentheses.

First things first, we check if this argument is a JavaScript TypedArray. If it is not, we throw a TypeError. It is important to throw a TypeError, since this is what the overloading dispatcher uses to determine if it should try another overloaded method signature.

Then we use the previously embedded function to get an ImageMagick-compatible StorageType. Then we proceed to extract the raw pointer to the underlying data of the TypedArray. This is the only correct way to do so. If you have previously used a TypedArray and you got its pointer by simply calling array.ArrayBuffer().Data() — then immediately go back and fix your code before anyone has noticed what a lousy engineer you are. Unless you apply the ByteOffset(), your code will cause terrible destruction when the user passes an argument produced by the subarray() method in JavaScript. Such errors are known to have caused spectacular space launch failures in the past.
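The effect of subarray() is easy to reproduce in plain JavaScript. The sketch below uses only standard TypedArray properties, and the variable names are illustrative. It shows that a subarray shares the ArrayBuffer of its parent and starts at a non-zero byteOffset, so reading from the start of the buffer returns the wrong elements.

```javascript
const full = new Uint16Array([10, 20, 30, 40, 50, 60]);
const view = full.subarray(2, 5); // elements 30, 40, 50

// The view shares the parent's ArrayBuffer and starts 2 elements (4 bytes) in.
const sharesBuffer = view.buffer === full.buffer;
const offset = view.byteOffset;

// What native code would see if it ignored the offset, and if it applied it.
const ignoringOffset = Array.from(new Uint16Array(view.buffer, 0, view.length));
const applyingOffset = Array.from(
  new Uint16Array(view.buffer, view.byteOffset, view.length));

console.log(sharesBuffer, offset); // true 4
console.log(ignoringOffset);       // [ 10, 20, 30 ]
console.log(applyingOffset);       // [ 30, 40, 50 ]
```

This is the same arithmetic that the typemap performs in C++ when it adds ByteOffset() to the pointer returned by ArrayBuffer().Data().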

And in order to avoid such regrettable disasters, a JavaScript engineer cannot be allowed to access data beyond the end of its allocated array under any circumstances, no matter how hard he tries to do so. Thus, we will also use a check typemap:

%typemap(check)
(const size_t width_, const size_t height_,
const std::string &map_, const Magick::StorageType type_,
const void *pixels_),
(const size_t columns_, const size_t rows_,
const std::string &map_,
const Magick::StorageType type_, void *pixels_)
{
if ($1 * $2 * $3->size() != _global_typed_array.ElementLength()) {
SWIG_exception_fail(SWIG_IndexError,
"The number of elements in the TypedArray does not match "
"the number of pixels in the image");
}
}

This check typemap will be applied to any method having those five arguments. It uses the previously created variable in the wrapper method and its function is to prevent size mismatches.

Note that this typemap is applied to two different cases, one where the image width is called width_ and another one where it is called columns_, as these are the two argument names found throughout ImageMagick. This is different from the previous declaration, where the second expression defined a local variable: note the comma between the two expressions.
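The condition tested by the check typemap can be mirrored in a few lines of JavaScript. This is only a sketch of the arithmetic, with a hypothetical function name (pixelsMatch). The actual check runs in the generated C++ wrapper, where ElementLength() plays the role of the TypedArray length property used here.

```javascript
// Mirror of the check typemap condition: width * height * (number of
// channels in the map string) must equal the number of elements in the array.
function pixelsMatch(width, height, map, pixels) {
  return width * height * map.length === pixels.length;
}

// A 2x2 image with 3 channels needs 2 * 2 * 3 = 12 elements.
const exact = pixelsMatch(2, 2, 'RGB', new Uint8Array(12));

// With 4 channels the same array is too short: 2 * 2 * 4 = 16 elements needed.
const tooShort = pixelsMatch(2, 2, 'RGBA', new Uint8Array(12));

console.log(exact, tooShort); // true false
```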

Finally, if generating TypeScript bindings, we also have to specify the new TypeScript type that the wrapper will require:

%typemap(ts)
(const Magick::StorageType type_, void *pixels_)
"Uint8Array | Uint8ClampedArray | Uint16Array | Uint32Array | "
"Float32Array | Float64Array";

The very hard methods — a method that returns a void pointer

Take a look at Magick::Blob::data():

const void *data(void) const;

The ImageMagick creators definitely didn’t think in advance about JavaScript. What can we do about this nightmare?

No amount of argument massaging can produce a safe JavaScript wrapper out of this method.

The only solution is to ignore this method and to replace it with a JavaScript-friendly one:

%ignore Magick::Blob::data() const;
%extend Magick::Blob {
void data(void **buffer_data, size_t *buffer_len) const {
*buffer_data = const_cast<void *>(self->data());
*buffer_len = self->length();
}
}

We have already seen %ignore. In the case of overloaded methods, you can use it to ignore just one of the signatures by specifying its arguments. %extend allows you to add methods to an existing C++ class. In this case we simply add a new method that uses the argout version of the nodejs_buffer.i Buffer-compatible signature that we saw earlier. This means that the JavaScript wrapper won’t take any arguments and will instead return a Buffer. You can check nodejs_buffer.i to see how this works under the hood.

This new method calls the old one through overloading. The old one is still there — it simply won’t be exposed to JavaScript.

Asynchronous execution

One of the most peculiar aspects of the JavaScript language is its way of handling I/O and concurrency through asynchronous execution. Although this is not by design (it is a remnant of its origin as a language dedicated to UI scripting), it is considered to be a very successful paradigm that allows for very easy and high-performance access to parallel programming. This model of concurrency is often compared to green threads or coroutines. It moves all the complexity of handling the parallel execution to the C++ engine.

SWIG Node-API allows the automatic generation of asynchronous wrappers that return a Promise and do the heavy lifting in invisible background threads managed by libuv — the Node.js low-level system library.

Particular care must be taken when enabling asynchronous execution, as it gives the JavaScript end-user the ability to launch multiple operations in parallel, even on the same object and through the same method. It is up to the binary addon author to implement the exclusion rules governing parallel execution.

SWIG Node-API can implement default locking that works for most use cases. It uses mutexes to ensure that every underlying C++ object is used by only one C++ method at a time. It is not without caveats, about which you should read in the documentation, but it works well enough for ImageMagick.

Enabling the generation of asynchronous wrappers for all classes is as simple as using:

%feature("async:locking", "1");
%feature("async", "Async");
%apply SWIGTYPE LOCK {SWIGTYPE};
%apply SWIGTYPE *LOCK {SWIGTYPE *};
%apply SWIGTYPE &LOCK {SWIGTYPE &};

The first two lines enable locking and asynchronous execution globally. All asynchronous wrappers will have the same name as their synchronous counterparts with an Async suffix. Then we bring in the default locking typemaps for all types. This means that every method will try to lock all of its SWIG-exported C++ objects before proceeding with the operation.
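From the JavaScript side, the naming convention looks roughly like the sketch below. FakeImage is a hypothetical stand-in written in plain JavaScript so that the example runs without the addon. It imitates only the pairing of a synchronous method with a Promise-returning counterpart carrying the Async suffix. The real wrappers do their work in background threads and take locks in C++, which this sketch does not attempt to reproduce.

```javascript
// Hypothetical stand-in for a wrapped class (illustration only).
class FakeImage {
  constructor() {
    this.log = [];
  }

  // Synchronous wrapper: does its work before returning.
  blur(radius) {
    this.log.push(`blur(${radius})`);
  }

  // Asynchronous counterpart: same name plus the Async suffix, returns a Promise.
  blurAsync(radius) {
    return new Promise((resolve) => {
      // setImmediate stands in for the background thread of the real wrapper.
      setImmediate(() => {
        this.blur(radius);
        resolve();
      });
    });
  }
}

const im = new FakeImage();
im.blur(1);                      // blocks until done
const pending = im.blurAsync(2); // returns immediately
pending.then(() => console.log(im.log.join(' ')));
```

In application code, the asynchronous form would typically be awaited inside an async function, leaving the event loop free while the operation runs.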

Alas, in the case of ImageMagick this brings the generated C++ code from about 350k lines to about 700k. Having such an absolutely huge wrapper is not only impractical: its compilation also exceeds the free RAM allotment of GitHub Actions. Besides, do we really need an async version of this:

const gm = new Geometry(100, 80);
assert.equal(gm.width(), 100);
assert.equal(gm.height(), 80);

Creating a Geometry and retrieving its dimensions are instantaneous operations. It is a helper class. Adding asynchronous handling and locking will only get in our way. A much more economical approach will be to limit the asynchronous execution and the locking to the heavy-weight classes — essentially Image and some global methods that apply processing and do I/O. These can take anywhere from a few milliseconds to several seconds and must be done without blocking the event loop. I won’t go through the exact selection of asynchronous classes in ImageMagick, but I have grouped them in AsyncClasses.i and then I have defined a macro that allows me to enable async support selectively:

%feature("async:locking", "1");
%define LOCKED_ASYNC(TYPE)
%apply SWIGTYPE LOCK {TYPE};
%apply SWIGTYPE *LOCK {TYPE *};
%apply SWIGTYPE &LOCK {TYPE &};
%feature("async", "Async") TYPE;
%enddef
%include "AsyncClasses.i"

This allows me to bring them in one by one by using a single statement:

LOCKED_ASYNC(Magick::adaptiveBlurImage);
LOCKED_ASYNC(Magick::adaptiveThresholdImage);
LOCKED_ASYNC(Magick::addNoiseImage);
LOCKED_ASYNC(Magick::adjoinImage);
...

This way the size of the final generated code is kept to reasonable levels.

Putting it all together

Invoke SWIG on your .i file to generate the C++ wrappers:

swig -javascript -napi -typescript -c++ \
-Ideps/ImageMagick/Magick++/lib -Ideps/ImageMagick \
-DMAGICKCORE_HDRI_ENABLE=1 -DMAGICKCORE_QUANTUM_DEPTH=16 \
-o swig/Magick++.cxx -outdir swig src/Magick++.i

As you can see, it has a compiler-like command-line interface. You can even pass include paths and add macros. This will produce the actual code for your addon.

Then simply add the resulting file to your binding.gyp in the sources section:

'sources': [ 'swig/Magick++.cxx' ]

Now you are ready to launch the build through node-gyp:

node-gyp configure
node-gyp build

This tutorial is provided in the hope that it will pave the road for porting many existing excellent C++ libraries to Node.js.

Previously, this was a very labour-intensive process that greatly limited the number of available options for Node.js.

I am an unemployed engineer that is currently being extorted with the French IT recruitment companies and the French judiciary about a huge sexually-motivated judicial affair involving high-level corruption in the French administration.

I use my free time to create and maintain open-source software. My main areas of interest are Node.js/V8 internals, linking C++ and JavaScript, geospatial software. I am also a paragliding pilot with a keen interest in numerical weather prediction.
