WebAssembly, an executable format for the web

Posted on 11 December 2017 by Arnaud Bétrémieux
Tags: Software Engineering

With the latest Web browser updates, there is now a standard for high-performance code execution via the Web. WebAssembly is indeed now available on Edge, Safari, Chrome and Firefox.

It promises a standard execution environment on all machines, regardless of hardware or OS. Some see in it the future platform for deploying universal applications, on mobile devices and beyond, a platform that would enable performance very close to that of native applications. Others worry about the coming fragmentation of effort and methods in front-end development, or lament the further erosion of the original values of the Web.

WebAssembly is the result of a major coordination effort between the main web browser vendors. It is a binary format designed to allow what could until now be done only via JavaScript, at least in the most common browsers: distribution of code via the web for execution in the browser.

WebAssembly is designed to have better performance than JavaScript:

  • the format is more compact than even compressed JavaScript, and therefore takes less time to download and parse
  • data is strongly typed, which facilitates automated code optimization
  • compilation and optimization on the target machine are faster because the code was pre-compiled and pre-optimized before transfer, and is therefore closer to machine language

GWT, CoffeeScript, TypeScript, Parenscript and the like, "languages" compiled to JavaScript, had already turned JavaScript into a kind of machine language for the Web. Mozilla had developed asm.js, a faster subset of JavaScript, intended as a target for compilers such as LLVM. Minification tools, in turn, mangled JavaScript code to make it smaller, with the aim of saving download and parsing time. Google, with its NaCl project, allowed native code to be executed in a controlled environment within Chrome.

WebAssembly is in a way the concerted fusion, the logical evolution of all this. Its execution performance should enable new applications in any area that requires high computing power on the client side, such as audio or video processing, 3D rendering or encryption.

For now, what the major browsers have implemented is what had been defined as the "MVP" for WebAssembly. This MVP provides:

For the moment, the tools can only compile C, C++ or Rust. In theory, any language can now be compiled for the browser, all the more easily if it can be compiled with LLVM, which can generate WebAssembly. The exception is languages that rely on garbage collection, since this feature, while planned, is not available in the MVP. Other features in development include exceptions, threads, and direct DOM manipulation (which is currently only possible through JavaScript).

The easiest way to play a bit with WebAssembly is to use WasmFiddle. It lets you write C code on one side, JS code on the other, and compile the C code to WebAssembly for running the whole thing, without leaving your browser.

Writing a "Hello World" program with WebAssembly can feel a bit strange: WebAssembly does not currently support interacting directly with the DOM or with the JavaScript console. To write something, we have to go through the interface with JavaScript. Another hurdle is that WebAssembly has only four value types: integers and floating-point numbers, each in 32-bit and 64-bit variants. There is no string or array type, which means that the "Hello World" text has to travel from WebAssembly to JavaScript as a pointer: an offset into WebAssembly memory where JavaScript will read integers corresponding to character codes. A char * in our C program becomes an i32 in WebAssembly, holding the address of a buffer in memory. We send JavaScript this pointer, and it reads the buffer to build a string. Simple!
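To make the "numbers only" interface concrete, here is a minimal sketch that is not part of the original example. It uses a hand-assembled module (the byte array addModuleBytes and the exported name add are assumptions made for illustration) that exports one function taking two i32 parameters and returning their i32 sum. It shows how JavaScript numbers are coerced to 32-bit integers when they cross the boundary.

```javascript
// Hypothetical hand-assembled module: one exported function "add"
// taking two i32 parameters and returning their i32 sum.
const addModuleBytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00, // "\0asm" magic, version 1
  0x01, 0x07, 0x01, 0x60, 0x02, 0x7f, 0x7f, 0x01, 0x7f, // type: (i32, i32) -> i32
  0x03, 0x02, 0x01, 0x00, // function section: one function of type 0
  0x07, 0x07, 0x01, 0x03, 0x61, 0x64, 0x64, 0x00, 0x00, // export "add" = function 0
  0x0a, 0x09, 0x01, 0x07, 0x00, // code section: one body, no locals
  0x20, 0x00, 0x20, 0x01, 0x6a, 0x0b, // push param 0, push param 1, i32.add, end
]);

const addInstance = new WebAssembly.Instance(new WebAssembly.Module(addModuleBytes));
const add = addInstance.exports.add;

const small = add(2, 3);            // plain integer addition
const truncated = add(1.9, 2.9);    // arguments are converted to i32 first: 1 + 2
const wrapped = add(2147483647, 1); // i32 arithmetic wraps around on overflow

console.log(small, truncated, wrapped);
```

This should log 5, 3 and -2147483648: the fractional parts are dropped on the way in, and the last sum overflows the signed 32-bit range. Anything richer than a number, such as a string, has to be exchanged through memory.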

Let's start by writing our "Hello World" in C:

char* hello() {
  return "Hello World !";
}

We get the following WASM code, in which we can see the definition of our function, its "export" and the export of a memory property that WasmFiddle automatically includes:

(module
  (table 0 anyfunc)
  (memory $0 1)
  (data (i32.const 16) "Hello World !\00")
  (export "memory" (memory $0))
  (export "hello" (func $hello))
  (func $hello (result i32)
    (i32.const 16)
  )
)

The JavaScript code, which relies on WasmFiddle putting our compiled WebAssembly code in a "wasmCode" JavaScript array:

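// Builds a string from the character codes stored in `memory`
// (a Uint8Array over WebAssembly memory), starting at offset `pointer`.
function makeStringFromASCIICodes(memory, pointer) {
  let s = "";
  // Stop at the terminating 0 byte that C appends to string literals.
  for (let i = pointer; memory[i] !== 0; i++) {
    s += String.fromCharCode(memory[i]);
  }
  return s;
}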

let wasmModule = new WebAssembly.Instance(new WebAssembly.Module(wasmCode));
let memory = new Uint8Array(wasmModule.exports.memory.buffer);
let pointer = wasmModule.exports.hello();
alert(makeStringFromASCIICodes(memory, pointer));

I have made the whole thing available as a WasmFiddle here: https://wasdk.github.io/WasmFiddle/?1aax07
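As a variation on makeStringFromASCIICodes, here is a hedged sketch of a reader that refuses to run past the end of memory and decodes UTF-8 rather than single-byte codes. The name readCString and the fakeMemory stand-in are assumptions made for illustration; in the fiddle, the byte view would come from wasmModule.exports.memory.buffer instead.

```javascript
// Reads a NUL-terminated string starting at `pointer` in a byte view of
// WebAssembly memory. Throws instead of reading past the end of the view.
function readCString(bytes, pointer) {
  const end = bytes.indexOf(0, pointer); // position of the terminating NUL
  if (pointer < 0 || pointer >= bytes.length || end === -1) {
    throw new RangeError("no NUL-terminated string at offset " + pointer);
  }
  // TextDecoder defaults to UTF-8, which also covers plain ASCII.
  return new TextDecoder().decode(bytes.subarray(pointer, end));
}

// Stand-in for the module's memory: 64 zeroed bytes with the string
// written at offset 16, like the data segment in the listing above.
const fakeMemory = new Uint8Array(64);
fakeMemory.set(new TextEncoder().encode("Hello World !"), 16);

const greeting = readCString(fakeMemory, 16);
console.log(greeting);
```

The bounds check matters because the original loop only stops when it meets a 0 byte: with a bad pointer, indexing beyond the array yields undefined, which never equals 0.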

What if we want to compile the same thing on our own machine? The WebAssembly toolkit by dcodeIO is a good place to start. I found it much easier than using the Emscripten compiler directly, as shown in the official tutorials.

I write a hello.c file, containing the same code as earlier, but with an additional "include", and an "export" keyword in front of the function definition. Both are conventions of dcodeIO's WebAssembly tool:

#include <webassembly.h>

export char* hello() {
  return "Hello World !";
}

To compile the file:

wa-compile -o hello.wasm hello.c

I can then look at the corresponding assembly with the command below, shown together with its output:

wa-disassemble hello.wasm
(module
 (type $0 (func (result i32)))
 (import "env" "memory" (memory $0 1))
 (table 0 anyfunc)
 (data (i32.const 4) "0\'")
 (data (i32.const 16) "Hello World !")
 (export "hello" (func $0))
 (func $0 (type $0) (result i32)
  (i32.const 16)
 )
)

To make things work in a browser, I can use the same JavaScript as before, but I need to reproduce what WasmFiddle did for me: include the compiled WebAssembly code in a JavaScript array, and create the memory that the module now imports (the "env" "memory" import visible in the disassembly). Using the JS library bundled in the dcodeIO toolkit would have been easier, as it initializes memory and provides a few utilities, but doing it by hand is a good way to understand what is involved.
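What the module expects from its host can be checked from JavaScript itself. The sketch below uses a hand-assembled stand-in for hello.wasm: the array helloBytes is an assumption made for illustration, a minimal module with the same import and export as the disassembly above, not the toolkit's actual output. It lists the module's imports and exports, then shows that instantiation copies the data segment into the memory we pass in.

```javascript
// Hypothetical minimal module: imports "env"."memory", exports "hello",
// and stores "Hello World !\0" at offset 16 of the imported memory.
const helloBytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00, // "\0asm" magic, version 1
  0x01, 0x05, 0x01, 0x60, 0x00, 0x01, 0x7f, // type: () -> i32
  0x02, 0x0f, 0x01, 0x03, 0x65, 0x6e, 0x76, // import section: module "env",
  0x06, 0x6d, 0x65, 0x6d, 0x6f, 0x72, 0x79, 0x02, 0x00, 0x01, // field "memory", min 1 page
  0x03, 0x02, 0x01, 0x00, // function section: one function of type 0
  0x07, 0x09, 0x01, 0x05, 0x68, 0x65, 0x6c, 0x6c, 0x6f, 0x00, 0x00, // export "hello"
  0x0a, 0x06, 0x01, 0x04, 0x00, 0x41, 0x10, 0x0b, // body: i32.const 16, end
  0x0b, 0x14, 0x01, 0x00, 0x41, 0x10, 0x0b, 0x0e, // data segment at offset 16, 14 bytes
  0x48, 0x65, 0x6c, 0x6c, 0x6f, 0x20, 0x57, 0x6f, 0x72, 0x6c, 0x64, 0x20, 0x21, 0x00,
]);

const helloModule = new WebAssembly.Module(helloBytes);

// Static descriptions of what the module needs and what it offers.
const importList = WebAssembly.Module.imports(helloModule);
const exportList = WebAssembly.Module.exports(helloModule);
console.log(importList, exportList);

// The host creates the memory (1 page = 64 KiB) and hands it to the module.
const helloMemory = new WebAssembly.Memory({ initial: 1 });
const view = new Uint8Array(helloMemory.buffer);
const byteBefore = view[16]; // fresh memory is zero-filled

const helloInstance = new WebAssembly.Instance(helloModule, {
  env: { memory: helloMemory },
});
const byteAfter = view[16]; // instantiation has copied the data segment
const helloPointer = helloInstance.exports.hello();

console.log(byteBefore, byteAfter, helloPointer);
```

The import list should contain a single entry with module "env", name "memory" and kind "memory", which is exactly the shape of the object that has to be passed as the second argument when instantiating.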

To get the WebAssembly code in the form of a JavaScript array, I combined xxd and sed, and pasted the result into the wasmCode array of the HTML page further down:

xxd -c 10000 -p hello.wasm  | sed -e 's/\w\w/0x\0, /g'
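Where xxd or GNU sed is not at hand, the same conversion can be sketched in JavaScript. The helper name toHexArrayLiteral is an assumption made for illustration; it only formats bytes and does not read any file.

```javascript
// Formats bytes as a comma-separated list of hex literals, ready to paste
// between the brackets of `new Uint8Array([...])`.
function toHexArrayLiteral(bytes) {
  return Array.from(bytes, (b) => "0x" + b.toString(16).padStart(2, "0")).join(", ");
}

// The first eight bytes of any WebAssembly binary: "\0asm" and version 1.
const header = new Uint8Array([0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00]);
const literal = toHexArrayLiteral(header);
console.log(literal);
```

For these eight bytes the result is 0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00, without the trailing comma that the sed expression leaves behind. Under Node, the bytes of hello.wasm could be obtained with fs.readFileSync and passed to the same function.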

<html>
  <script type="text/javascript">
    var wasmCode = new Uint8Array([0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00, 0x01, 0x85, 0x80, 0x80, 0x80, 0x00, 0x01, 0x60, 0x00, 0x01, 0x7f, 0x02, 0x8f, 0x80, 0x80, 0x80, 0x00, 0x01, 0x03, 0x65, 0x6e, 0x76, 0x06, 0x6d, 0x65, 0x6d, 0x6f, 0x72, 0x79, 0x02, 0x00, 0x01, 0x03, 0x82, 0x80, 0x80, 0x80, 0x00, 0x01, 0x00, 0x04, 0x84, 0x80, 0x80, 0x80, 0x00, 0x01, 0x70, 0x00, 0x00, 0x07, 0x89, 0x80, 0x80, 0x80, 0x00, 0x01, 0x05, 0x68, 0x65, 0x6c, 0x6c, 0x6f, 0x00, 0x00, 0x09, 0x81, 0x80, 0x80, 0x80, 0x00, 0x00, 0x0a, 0x8a, 0x80, 0x80, 0x80, 0x00, 0x01, 0x84, 0x80, 0x80, 0x80, 0x00, 0x00, 0x41, 0x10, 0x0b, 0x0b, 0xa6, 0x80, 0x80, 0x80, 0x00, 0x03, 0x00, 0x41, 0x04, 0x0b, 0x04, 0x30, 0x27, 0x00, 0x00, 0x00, 0x41, 0x0c, 0x0b, 0x04, 0x00, 0x00, 0x00, 0x00, 0x00, 0x41, 0x10, 0x0b, 0x0e, 0x48, 0x65, 0x6c, 0x6c, 0x6f, 0x20, 0x57, 0x6f, 0x72, 0x6c, 0x64, 0x20, 0x21, 0x00]);

    function makeStringFromASCIICodes(memory, pointer) {
      let s = "";
      for (let i = pointer; memory[i] !== 0; i++) {
        s += String.fromCharCode(memory[i]);
      }
      return s;
    }

    let wasmMemory = new WebAssembly.Memory({ initial: 1 });
    let wasmModule = new WebAssembly.Instance(new WebAssembly.Module(wasmCode), {env: {memory: wasmMemory}});
    let memory = new Uint8Array(wasmMemory.buffer);
    let pointer = wasmModule.exports.hello();
    alert(makeStringFromASCIICodes(memory, pointer));
  </script>
</html>

Job done! It was a bit more complicated than I imagined, but we got there.

WebAssembly certainly has a lot of potential. But the promise of a universal execution environment for Web clients was not far from the main "raison d'être" of Java in its early days. I hope that WebAssembly does not herald the return of applets, finally made "cool", in one of those repetitions of history so often seen in IT.

Being the logical evolution of minification and of the use of JavaScript or asm.js as compilation targets, WebAssembly unfortunately contributes to the loss of openness on the Web: we are further and further from the Web of free information sharing, the Web where you could find out how any site was made by looking at the source code directly in your browser, the Web where the eternal CRUD application was not a "fat client" in my browser.

WebAssembly risks being used for things where one can wonder what gain it brings compared to native code, apart from control over code distribution. At a time when DRM has also reached universal support in browsers, I fear WebAssembly could be one of the blocks that allows trapping users more and more in an environment they have no control over. One of the blocks that enables further acceleration of the "cloudification" of software and data, that is to say of service as a software substitute, robbing users of the possibility of knowing what is done with their data, of verifying what code runs, of modifying that code, and of knowing that features available today will still be available tomorrow.

On the other hand, if WebAssembly, in the same spirit as what HTML5 has started, can help replace some applications with "augmented" websites, I'll be very happy.

I hope we can collectively find the right balance, but I expect a hard battle.