Building large projects with Emscripten is very easy. Emscripten provides two simple scripts that configure your makefiles to use emcc as a drop-in replacement for gcc — in most cases the rest of your project’s current build system remains unchanged.
To build using Emscripten you need to replace gcc with emcc in your makefiles. This is done using emconfigure, which sets the appropriate environment variables like CXX (the C++ compiler) and CC (the C compiler).
Consider the case where you normally build with the following commands:
./configure
make
To build with Emscripten, you would instead use the following commands:
emconfigure ./configure
emmake make
emconfigure is called with the normal configure as an argument (in configure-based build systems), and emmake with make as an argument. If your build system uses CMake, replace ./configure with cmake . etc. in the above example. If your build system doesn’t use configure or CMake, then you can omit the first step and just run make (although then you may need to edit the Makefile manually).
We recommend you call both emconfigure and emmake scripts in configure- and CMake-based build systems. Whether you actually need to call both tools depends on the build system (some systems will store the environment variables in the configure step, and others will not).
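The choice between these variants can be sketched as a small shell function. It only prints the commands suggested by the text above and builds nothing; the function name emscripten_build_commands and the file checks it performs are illustrative assumptions, not part of Emscripten.

```shell
# Print the Emscripten build commands suited to the project in directory $1
# (default: current directory). Hypothetical helper; it only prints.
emscripten_build_commands() {
    dir=${1:-.}
    if [ -f "$dir/configure" ]; then
        # configure-based project: wrap the normal configure call
        echo "emconfigure ./configure"
    elif [ -f "$dir/CMakeLists.txt" ]; then
        # CMake-based project: wrap the normal cmake call
        echo "emconfigure cmake ."
    fi
    # With neither configure nor CMake, only the make step remains
    echo "emmake make"
}
```

Calling emscripten_build_commands . from the project root lists the steps to run. For a plain Makefile project the Makefile may still need manual edits, as noted above.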
Some build systems may not properly emit bitcode using the above procedure, and you may see "is not a valid input file" warnings. You can run file to check what a file contains (you can also manually check whether the contents start with \0asm, which indicates a wasm object file, or with BC, which indicates LLVM bitcode). It is also worth running emmake make VERBOSE=1, which will print out the commands it runs: you should see emcc being used, and not the native system compiler. If emcc is not used, you may need to modify the configure or cmake scripts.
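The manual check of a file's leading bytes can be sketched in portable shell. The function name object_kind is hypothetical; the sketch assumes only the two signatures mentioned above (\0asm and BC) and treats everything else as unknown.

```shell
# Classify a compiled object by its first bytes.
# Prints "wasm" for \0asm, "bitcode" for BC, otherwise "other".
object_kind() {
    # Dump the first four bytes as hex, e.g. "0061736d" for \0asm
    magic=$(od -An -tx1 -N4 "$1" | tr -d ' \t\n')
    case "$magic" in
        0061736d) echo "wasm" ;;      # \0 a s m
        4243*)    echo "bitcode" ;;   # B C ...
        *)        echo "other" ;;
    esac
}
```

For example, object_kind a.o reports which kind of object the build produced. A native object file or a .a archive would be reported as "other".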
Consider the examples below:
However, sometimes you may want slightly different optimizations on certain files:
# Optimize the first file for size, and the rest using `-O2`.
./emcc -Oz a.cpp -c -o a.o
./emcc -O2 b.cpp -c -o b.o
./emcc -O2 a.o b.o -o project.js
Unfortunately, each build system defines its own mechanisms for setting compiler and optimization flags. You will need to work out the correct approach to set the LLVM optimization flags for your system.
Some build systems have a flag like
You can control whether LLVM optimizations are run using
--llvm-opts N where N is an integer in the range 0-3. Sending
-gN debug level options.
Each build-system defines its own mechanisms for setting debug flags. To get Clang to emit LLVM debug information, you will need to work out the correct approach for your system.
Some build systems have a flag like
For more general information, see the topic Debugging.
Built-in support is available for a number of standard libraries: libc, libc++ and SDL. These will automatically be linked when you compile code that uses them (you do not even need to add -lSDL, but see below for more SDL-specific details).
For example, consider the case where a project “project” uses a library “libstuff”:
# Compile libstuff to bitcode
./emconfigure ./configure
./emmake make
# Compile project to bitcode
./emconfigure ./configure
./emmake make
# Compile the library and code together to HTML
emcc project.o libstuff.a -o final.html
Emscripten Ports is a collection of useful libraries, ported to Emscripten. They reside on GitHub, and have integration support in emcc. When you request that a port be used, emcc will fetch it from the remote server, set it up and build it locally, then link it with your project, add the necessary include paths to your build commands, etc. For example, SDL2 is in ports, and you can request that it be used with -s USE_SDL=2:
./emcc tests/sdl2glshader.c -s USE_SDL=2 -s LEGACY_GL_EMULATION=1 -o sdl2.html
You should see some notifications about SDL2 being used, and built if it wasn’t previously. You can then view sdl2.html in your browser.
SDL_image has also been added to ports; use it with -s USE_SDL_IMAGE=2. To see a list of all available ports, run emcc --show-ports. For SDL2_image to be useful, you generally need to specify the image formats you are planning on using, e.g. -s SDL2_IMAGE_FORMATS='["bmp","png","xpm"]' (note: jpg support is not available yet as of Jun 22 2018; libjpg needs to be added to emscripten-ports). This will also ensure that IMG_Init works properly when you specify those formats. Alternatively, you can use emcc --use-preload-plugins and --preload-file your images, so the browser codecs decode them (see Preloading files). A code path in the SDL2_image port will then load the images through emscripten_get_preloaded_image_data(), but your calls to IMG_Init with those image formats will fail: IMG_Init only reports formats whose support is compiled in, so it does not report formats that work only through preloading.
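Quoting the JSON-style format list by hand is easy to get wrong in a shell, so one option is to generate it. The helper below is a hypothetical sketch (the name sdl2_image_formats is not part of Emscripten); it only builds the setting string shown above from a list of format names.

```shell
# Build the SDL2_IMAGE_FORMATS setting from a list of format names.
# Hypothetical helper: it only assembles the string, it does not call emcc.
sdl2_image_formats() {
    list=""
    for fmt in "$@"; do
        # Separate entries with commas and wrap each name in double quotes
        list="${list:+$list,}\"$fmt\""
    done
    printf '%s\n' "SDL2_IMAGE_FORMATS=[$list]"
}
```

Assuming a source file main.c, it could be used as emcc main.c -s USE_SDL=2 -s USE_SDL_IMAGE=2 -s "$(sdl2_image_formats bmp png)" -o main.html, where the double quotes around the command substitution keep the setting as a single argument.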
SDL_net has also been added to ports; use it with -s USE_SDL_NET=2. To see a list of all available ports, run emcc --show-ports.
Emscripten also has support for older SDL1, which is built-in. If you do not specify SDL2 as in the command above, then SDL1 is linked in and the SDL1 include paths are used. SDL1 has support for sdl-config, which is present in system/bin. Using the native sdl-config may result in compilation or missing-symbol errors. You will need to modify the build system to look for files in emscripten/system or emscripten/system/bin in order to use the Emscripten sdl-config.
You can also build a library from ports manually if you prefer, but then you will also need to apply the Python logic that ports does. That code (under tools/ports/) may do things like ensure necessary JS functions are included in the build, add exports, and so forth. In general, it’s better to use the ports version, as it is what is tested and known to work.
Adding more ports is fairly easy. Basically, the steps are
Make sure the port is open source and has a suitable license.
Add it to emscripten-ports on github. The ports maintainers can create the repo and add the relevant developers to a team for that repo, so they have write access.
Add a script to handle it under tools/ports/ (see existing code for examples) and use it in
Add testing in the test suite.
Some large projects generate executables and run them in order to generate input for later parts of the build process (for example, a parser may be built and then run on a grammar, which then generates C/C++ code that implements that grammar). This sort of build process causes problems when using Emscripten because you cannot directly run the code you are generating.
In some cases it makes sense to modify the build scripts so that they build the generated executable natively. For example, this can be done by specifying two compilers in the build scripts, emcc and gcc, and using gcc just for generated executables. However, this can be more complicated than the previous solution because you need to modify the project build scripts, and you may have to work around cases where code is compiled and used both for the final result and for a generated executable.
Projects that use configure, cmake, or some other portable configuration method may run checks during the configure phase to verify that the toolchain and paths are set up properly. Emcc tries to get checks to pass where possible, but you may need to disable tests that fail due to a “false negative” (for example, tests that would pass in the final execution environment, but not in the shell during configure).
Ensure that if a check is disabled, the tested functionality does work. This might involve manually adding commands to the make files using a build system-specific method.
In general configure is not a good match for a cross-compiler like Emscripten. configure is designed to build natively for the local setup, and works hard to find the native build system and the local system headers. With a cross-compiler, you are targeting a different system, and ignoring these headers etc.
Emscripten supports .a archive files, which are bundles of object files. This is a simple format for libraries, that has special semantics - for example, the order of linking matters with .a files, but not with plain object files. For the most part those special semantics should work the same in Emscripten as elsewhere.
In addition to the capabilities it shares with gcc, emcc supports options to optimize code, control what debug information is emitted, generate HTML and other output formats, etc. These options are documented in the emcc tool reference (./emcc --help on the command line).
Emscripten provides the following preprocessor macros that can be used to identify the compiler version and platform:
The preprocessor define __EMSCRIPTEN__ is always defined when compiling programs with Emscripten.
Preprocessor variables such as __EMSCRIPTEN_tiny__ specify, as integers, the currently used Emscripten compiler version.
Emscripten behaves like a variant of Unix, so preprocessor defines such as __unix__ are always present when compiling code with Emscripten.
Emscripten uses Clang/LLVM as its underlying codegen compiler, so preprocessor defines such as __clang__ are defined, and preprocessor defines such as __clang_patchlevel__ indicate the version of Clang that is used.
Clang/LLVM is GCC-compatible, so preprocessor defines such as __GNUC_PATCHLEVEL__ are also defined to represent the level of GCC compatibility that Clang/LLVM provides.
The preprocessor string __VERSION__ indicates the GCC compatible version, which is expanded to also show Emscripten version information.
__clang_version__ is present and indicates both Emscripten and LLVM version information.
Emscripten is a 32-bit platform, so size_t is a 32-bit unsigned integer.
When targeting SSEx SIMD APIs using one of the command line compiler flags such as -msse4.1, one or more of the preprocessor flags such as __SSE4_1__ will be present to indicate available support for these instruction sets.
If targeting the pthreads multithreading support with the compiler & linker flag -s USE_PTHREADS=1, the preprocessor define __EMSCRIPTEN_PTHREADS__ will be present.
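To see which of these macros a given toolchain actually predefines, one approach is to ask the compiler to dump them. The sketch below assumes the compiler accepts the Clang/GCC options -dM -E (an assumption about emcc forwarding them to Clang, not something stated above); the function name has_macro is hypothetical.

```shell
# Succeed if compiler $1 predefines macro $2.
# Assumes the compiler accepts the Clang/GCC options -dM -E, which print
# predefined macros as "#define NAME VALUE" lines for an empty input.
has_macro() {
    "$1" -dM -E -x c /dev/null | grep -Eq "^#define $2( |\$)"
}
```

For example, has_macro emcc __EMSCRIPTEN__ && echo "building with Emscripten" prints the message only when the macro is predefined. The same call with a native compiler is expected to print nothing.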
Sometimes it can be useful to use a compiler wrapper such as ccache. For ccache, the normal method of simply wrapping the entire compiler should work, e.g. ccache emcc. For distributed builds it can be beneficial to run the emscripten driver locally and distribute only the underlying clang commands. If this is desirable, the COMPILER_WRAPPER setting in the config file can be used to add a wrapper around the internal calls to clang. Like other config settings, this can also be set via an environment variable, e.g.:
EM_COMPILER_WRAPPER=gomacc emcc -c hello.c
The Emscripten test suite (tests/runner.py) contains a number of good examples — large C/C++ projects that are built using their normal build systems as described above: freetype, openjpeg, zlib, bullet and poppler.
It is also worth looking at the build scripts in the ammo.js project.
Make sure to use emar (which calls llvm-ar), as the system ar may not support our object files. emconfigure sets the AR environment variable correctly, but a build system might incorrectly hardcode ar.
Similarly, using the system ranlib instead of emranlib (which calls llvm-ranlib) may lead to problems, like not supporting our object files and removing the index, leading to "archive has no index; run ranlib to add one" from wasm-ld. Again, using emconfigure should avoid this by setting the env var RANLIB, but a build system might have it hardcoded, or require you to pass an option.
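A quick way to spot a hardcoded archiver is to search the Makefile for direct assignments to AR or RANLIB, since an assignment in a Makefile takes precedence over a value from the environment. The function below is a hypothetical sketch (hardcoded_archive_tools is not an Emscripten tool) and only covers simple = and := assignments.

```shell
# List Makefile lines that assign AR or RANLIB directly; such lines take
# precedence over the AR and RANLIB values exported into the environment.
# Conditional assignments (?=) are skipped because they respect the
# environment.
hardcoded_archive_tools() {
    grep -nE '^[[:space:]]*(AR|RANLIB)[[:space:]]*:?=' "$1"
}
```

If hardcoded_archive_tools Makefile prints any lines, one option is to pass the tools on the make command line, for example emmake make AR=emar RANLIB=emranlib, because variables given on the command line override assignments inside the Makefile.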
The compilation error "multiply defined symbol" indicates that the project has linked a particular static library multiple times. The project will need to be changed so that the problem library is linked only once.
You can use llvm-nm to see which symbols are defined in each bitcode file.
One solution is to use the dynamic-linking approach described above. This ensures that libraries are linked only once, in the final build stage.
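When hunting for the duplicated definitions, the symbol listings can be filtered mechanically. The sketch below assumes input lines in the form "file: address type name" (the layout expected from llvm-nm when file names are printed with -A, which is an assumption here); the function name duplicate_symbols is hypothetical.

```shell
# Read symbol listings in "file: address type name" form on stdin and
# print every name that is defined (type T, D or B) more than once.
duplicate_symbols() {
    awk 'NF >= 3 && $(NF-1) ~ /^[TDB]$/ { count[$NF]++ }
         END { for (name in count) if (count[name] > 1) print name }' | sort
}
```

With two hypothetical archives, llvm-nm -A libfoo.a libbar.a | duplicate_symbols would list the names defined in more than one place, which points at the library that is being linked twice.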