Js_of_ocaml in the VSCode OCaml Platform

2020/10/21

For about two weeks now, the published version of the VSCode OCaml Platform extension has had something special about it.

It is using Js_of_ocaml! This is the result of a month-long effort to switch the extension’s OCaml-to-JS compiler from BuckleScript to Js_of_ocaml.

In this post, I will describe the extension, explain the reasoning for switching to Js_of_ocaml, and go over some of the things I learned through the porting experience.

The OCaml Platform

The VSCode OCaml Platform extension is part of the larger OCaml Platform; it interacts directly with OCaml-LSP, an implementation of the Language Server Protocol for OCaml editor support. The OCaml-LSP language server provides editor features like code completion, go to definition, formatting with ocamlformat, and error highlighting.

OCaml-LSP can be used from any editor that supports the protocol, but the VSCode extension provides additional features: managing different package manager sandboxes; syntax highlighting of many OCaml-related filetypes; and integration with VSCode tasks, snippets, and indentation rules.

Both the language server and VSCode extension are continuously tested to ensure compatibility with opam and esy on macOS, Linux, and even Windows.

Making OCaml more accessible is a goal for these projects. Providing support for a popular cross-platform code editor and the widely supported language server protocol helps achieve that goal.

BuckleScript vs. Js_of_ocaml

BuckleScript and Js_of_ocaml are technologies that accomplish a similar goal: compiling OCaml to JavaScript code. However, there are a few differences that made it worthwhile to switch to Js_of_ocaml.

BuckleScript and Js_of_ocaml approach compiling to JS in notably different ways. BuckleScript compiles from an early intermediate representation of the OCaml compiler to generate small JS files for each OCaml module; this is effective but fixes the OCaml language to a certain version (4.06.1 at the time of writing). Js_of_ocaml takes a different approach and generates JS from OCaml bytecode, which is more stable across different versions of OCaml but may leave the compiler with less information about the original program to work with. Using Js_of_ocaml allows the VSCode extension to be built with a recent version of OCaml (4.11.1 right now).

BuckleScript has undergone a rebranding and it is now called ReScript with the addition of a new syntax. At one point, OCaml documentation was removed from the website. As I revisit the site today, OCaml documentation has returned in the old v8.0.0 documentation as “Older Syntax”. It does seem that OCaml and ReasonML will be technically supported for now (forever?), but the project feels more distant from OCaml than it did as BuckleScript.

Js_of_ocaml, on the other hand, is deeply integrated with the OCaml language and ecosystem. This integration is great for existing OCaml developers because it means complex JS projects can be built with the excellent dune build system with access to most of the same opam packages as native projects.

For a more in-depth comparison of the two technologies, I recommend reading @jchavarri’s post about the topic.

gen_js_api

For a VSCode extension, there are many bindings that have to be created for interaction with the VSCode extension API. For the BuckleScript version of the extension, we used the built-in syntax for bindings.

For example, to bind to vscode.window.createOutputChannel in BuckleScript:

external createOutputChannel : name:string -> OutputChannel.t
  = "createOutputChannel"
  [@@bs.module "vscode"] [@@bs.scope "window"]

This expresses that createOutputChannel is a function from the window namespace of the vscode node module. The bs.module annotation automatically inserts the require for the vscode module in the generated JavaScript.

There are a few ways to express the same things in Js_of_ocaml. The first is to use the provided functions in the Js_of_ocaml.Js module to manually convert between OCaml and JS types:

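open Js_of_ocaml (* the ## and ##. operators come from js_of_ocaml-ppx *)
let createOutputChannel ~name =
  (* Js.string converts the OCaml string to a JS string before the call *)
  Js.Unsafe.global##.vscode##.window##createOutputChannel (Js.string name)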

Notice that certain OCaml types (like strings) have to be converted into their JS representation. Doing these conversions manually may work for small libraries, but it would be impractical to do that for every binding and type in the expansive VSCode API.

For that reason, gen_js_api is a great alternative. The same binding with gen_js_api looks like this:

val createOutputChannel : name:string -> OutputChannel.t
  [@@js.global "vscode.window.createOutputChannel"]

What gen_js_api will do is generate code that automatically calls Ojs.string_to_js for the parameter and OutputChannel.t_of_js for the return value. An OCaml value can be converted to a JS value if it is a “JS-able” type, or if the appropriate of_js/to_js functions exist.

It is important to note that unlike BuckleScript, gen_js_api is actually doing a conversion between values. If a function binding is written that returns an OCaml record, modifying a field of that record only modifies the record itself; the original JS value is untouched. This is different from BuckleScript, where an OCaml type directly corresponds to its JS data representation.

To avoid this, it is possible to keep values as the abstract Ojs.t type, which represents unconverted JS values. Fields can then be read and written through functions annotated with [@@js.get] or [@@js.set]. This method was used for the entirety of the extension’s VSCode bindings, which can be found here.
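The difference between the two approaches can be illustrated in plain JavaScript. The sketch below is only an analogy for what conversion code does, not gen_js_api's actual output; toRecord, getName, and setName are hypothetical names chosen for illustration.

```javascript
// A JS value as it might be handed over by some API (illustrative data).
const channel = { name: "OCaml", visible: false };

// Analogy for converting to an OCaml record: the fields are copied out,
// so the result is a separate value.
function toRecord(jsValue) {
  return { name: jsValue.name, visible: jsValue.visible };
}

// Analogy for keeping the value abstract and going through accessors,
// in the spirit of [@@js.get] / [@@js.set]: reads and writes hit the original.
function getName(jsValue) {
  return jsValue.name;
}
function setName(jsValue, name) {
  jsValue.name = name;
}

const record = toRecord(channel);
record.name = "changed copy";
console.log(getName(channel)); // the original is untouched by the copy

setName(channel, "changed original");
console.log(getName(channel)); // the accessor wrote through to the original
```

With the copy, the OCaml side and the JS side can silently drift apart; with accessors there is only one value, at the cost of a function call per field access.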

Node Modules

As mentioned previously, BuckleScript provides a way to reference Node modules with [@@bs.module]. As far as I know, there is no simple equivalent provided by Js_of_ocaml or gen_js_api at the moment. Fortunately, there is a simple workaround with a single JS stub file:

joo_global_object.vscode = require("vscode");

In this JavaScript file, joo_global_object refers to the same value as Js.Unsafe.global. Setting the vscode field this way allows it to be referenced by the gen_js_api functions globally.

For this JavaScript file to be used, the library’s dune configuration must be updated with:

(js_of_ocaml (javascript_files vscode_stub.js))

Afterward, bindings can be created that reference vscode and its namespaces or values.
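To see why the stub is enough, here is a rough sketch in plain JavaScript of what happens once the module sits on the global object. It makes two assumptions: globalThis plays the role of joo_global_object, and fakeVscode stands in for the real vscode module, which only exists inside the extension host. lookupGlobal is a hypothetical helper, not part of Js_of_ocaml or gen_js_api.

```javascript
// Stand-in for the object returned by require("vscode") (assumption).
const fakeVscode = {
  window: {
    createOutputChannel(name) {
      return { name };
    },
  },
};

// What the stub file does, with globalThis in place of joo_global_object.
globalThis.vscode = fakeVscode;

// Hypothetical helper: resolve a dotted path such as
// "vscode.window.createOutputChannel" starting from the global object.
function lookupGlobal(path) {
  return path.split(".").reduce((obj, key) => obj[key], globalThis);
}

const create = lookupGlobal("vscode.window.createOutputChannel");
console.log(create("OCaml").name);
```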

There is some ongoing work in gen_js_api that may improve the interaction between node modules and scopes.

JSON

JSON is a staple of many browser and node.js projects; the VSCode extension is no different. In the extension, JSON is used to (de)serialize user settings and interact with the language server using JSON-RPC.

When the extension was built with BuckleScript, it used @glennsl’s bs-json library for composable encoding and decoding functions. bs-json wasn’t available for Js_of_ocaml, so I decided to reimplement it for Js_of_ocaml as jsonoo (opam). The documentation for jsonoo is available here.

The main idea is that there are decoder and encoder types which are Jsonoo.t -> 'a and 'a -> Jsonoo.t function types, respectively. The functions provided by jsonoo can easily compose decoder or encoder functions to handle complex JSON values. The Jsonoo.t type is represented as a JS value and uses the Js_of_ocaml library to convert between OCaml values and JavaScript JSON values.

For example, to try to decode a list of integers, returning a default value otherwise:

let decode json =
  let open Jsonoo.Decode in
  try_default [] (list int) json
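The composition idea can be modeled in a few lines of plain JavaScript. This is a rough sketch of decoders as composable functions, not jsonoo's implementation; DecodeError, int, list, and tryDefault are hypothetical names that only mirror the OCaml example above.

```javascript
// Hypothetical error type raised when a value does not have the expected shape.
class DecodeError extends Error {}

// A decoder is a function from a parsed JSON value to a result.
const int = (json) => {
  if (!Number.isInteger(json)) throw new DecodeError("expected an integer");
  return json;
};

// list takes a decoder for the elements and returns a decoder for arrays.
const list = (decodeItem) => (json) => {
  if (!Array.isArray(json)) throw new DecodeError("expected an array");
  return json.map((item) => decodeItem(item));
};

// tryDefault falls back to a default value when the inner decoder fails.
const tryDefault = (fallback, decode) => (json) => {
  try {
    return decode(json);
  } catch (error) {
    if (error instanceof DecodeError) return fallback;
    throw error;
  }
};

// Counterpart of: try_default [] (list int)
const decode = tryDefault([], list(int));
console.log(decode(JSON.parse("[1, 2, 3]")));
console.log(decode(JSON.parse('{"not": "a list"}')));
```

Because every combinator takes decoders and returns a decoder, larger decoders are built by nesting smaller ones, with no special cases for each JSON shape.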

Since jsonoo provides t_of_js and t_to_js functions, it is also possible to use the JSON type with gen_js_api bindings:

val send_json : t -> Jsonoo.t -> unit [@@js.call]

jsonoo seems to work well for its purpose of a Js_of_ocaml JSON library, but @rgrinberg brought up the point that this fragments the JSON libraries based on their underlying JSON implementation. For that reason, it may be worthwhile to look into json-data-encoding, an alternative that allows using the same API across different JSON representations.

Promises

As an extension that primarily operates on the user-interface level, asynchronous operations through the JS Promise API are very important for a smooth user experience. Creating bindings to the promise functions seems straightforward at first, but you will eventually find that JS promises have a soundness problem.

For example, with a direct binding to the resolve function, one would expect that for every value passed to the function it would return that value wrapped in a promise.

val resolve : 'a -> 'a promise
let x : int promise = resolve 1
let y : int promise promise = resolve (resolve 2) (* flattened! *)

Everything seems fine from the OCaml side, but it turns out that JavaScript automatically flattens nested promises: when a promise is resolved with a value that has a then function, JavaScript calls that function and adopts its result instead of storing the value itself. Even though y appears to have the int promise promise type, the JS representation will be flattened to int promise. The type system is then misrepresenting the data, which can lead to nasty runtime errors.
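The flattening is easy to observe directly in JavaScript. The following is a small sketch meant to run under Node.js; the variable names are only illustrative.

```javascript
// Promise.resolve hands back a native promise unchanged instead of nesting it.
const inner = Promise.resolve(2);
const outer = Promise.resolve(inner);
console.log(outer === inner);

// The value seen by a then callback is the number, never a promise.
const seenType = outer.then((value) => typeof value);
seenType.then((type) => console.log(type));

// The same adoption happens for any object with a then function ("thenable").
const thenable = {
  then(onFulfilled) {
    onFulfilled("from then");
  },
};
const adopted = Promise.resolve(thenable);
adopted.then((value) => console.log(value));
```

There is no way to obtain a promise whose fulfilled value is itself a promise, which is exactly what the int promise promise type claims to describe.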

So how do we prevent the promise functions from following a value’s then functions? The solution is simple: ensure that the JS functions never receive values that have a then function in the first place.

Using a technique I first saw in @aantron’s promise library for BuckleScript, it is possible to check for values that have a then function and wrap them in an object to prevent the JS functions from calling it:

function IndirectPromise(promise) {
	this.underlying = promise;
}

function wrap(value) {
	if (
		value !== undefined &&
		value !== null &&
		typeof value.then === "function"
	) {
		return new IndirectPromise(value);
	} else {
		return value;
	}
}

function unwrap(value) {
	if (value instanceof IndirectPromise) {
		return value.underlying;
	} else {
		return value;
	}
}

Calling wrap on every value that is passed to Promise.resolve and calling unwrap on every resolved (completed) value will make the behavior consistent. Doing the wrapping and unwrapping for each unsound promise function binding will result in an API that is suitable for type-safe usage.
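As a minimal sketch of how the helpers might be applied, the block below repeats wrap and unwrap and builds two functions on top of them. resolve and map are hypothetical names for illustration, and this is not the actual code of any promise library.

```javascript
function IndirectPromise(promise) {
  this.underlying = promise;
}

function wrap(value) {
  if (value !== undefined && value !== null && typeof value.then === "function") {
    return new IndirectPromise(value);
  }
  return value;
}

function unwrap(value) {
  return value instanceof IndirectPromise ? value.underlying : value;
}

// Hypothetical sound counterpart of Promise.resolve: thenables are hidden
// inside an IndirectPromise, so JavaScript has nothing to flatten.
function resolve(value) {
  return Promise.resolve(wrap(value));
}

// Hypothetical map: unwrap what comes out, wrap what the callback returns.
function map(promise, f) {
  return promise.then((value) => wrap(f(unwrap(value))));
}

const inner = resolve(2);
const outer = resolve(inner); // stays nested: fulfills with the promise itself
const check = map(outer, (value) => value === inner);
check.then((same) => console.log(same));
```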

The final product of these bindings is the promise_jsoo (opam) library for Js_of_ocaml. It includes bindings for the majority of the JS Promise API, as well as supplemental functions that make it easier to interoperate with OCaml. promise_jsoo also provides the conversion functions needed to use its promise type in gen_js_api bindings. The documentation for promise_jsoo is available here.

As an added bonus, running an OCaml version of at least 4.08 (which wasn’t possible with BuckleScript) allows using binding operators:

val ( let* ) : 'a promise -> ('a -> 'b promise) -> 'b promise
let async_function () : int promise =
  let* first_num = get_num () in
  let* second_num = get_num () in
  async_calculation first_num second_num

This syntax is reminiscent of await syntax in other languages and it is a good alternative to the previous style of monadic operators:

let async_function () : int promise =
  get_num () >>= fun first_num ->
  get_num () >>= fun second_num ->
  async_calculation first_num second_num
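For readers more familiar with JavaScript, the two OCaml styles above correspond roughly to async/await and explicit then chaining. In this sketch, getNum and asyncCalculation are hypothetical stand-ins for get_num and async_calculation.

```javascript
// Hypothetical stand-ins for get_num and async_calculation.
const getNum = () => Promise.resolve(21);
const asyncCalculation = (a, b) => Promise.resolve(a + b);

// Rough counterpart of the let* version.
async function withAwait() {
  const firstNum = await getNum();
  const secondNum = await getNum();
  return asyncCalculation(firstNum, secondNum);
}

// Rough counterpart of the >>= version.
function withThen() {
  return getNum().then((firstNum) =>
    getNum().then((secondNum) => asyncCalculation(firstNum, secondNum))
  );
}

withAwait().then((result) => console.log(result));
withThen().then((result) => console.log(result));
```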

promise_jsoo has a comprehensive test suite that should give weight to its claims of type safety. It was difficult to find a testing library that would work for an asynchronous library in Js_of_ocaml, but I eventually found webtest by @johnelse.

In the future, I’d like to investigate giving types to promise rejections and providing a simple way to convert to Async or Lwt types.

Sys.unix

Apparently, the value of Sys.unix in Js_of_ocaml is always true. The system seems to be hardcoded in Js_of_ocaml’s JS runtime, which caused problems for path handling with the Filename OCaml module on a certain operating system (sorry Windows users!).

I assume the system is hardcoded because there is no good way to detect the operating system that works across different runtimes (browser, node.js). The browser has user agents and node.js has process.platform, but not vice versa.
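To make the difficulty concrete, here is a sketch of what runtime-dependent detection could look like in JavaScript. detectPlatform is a hypothetical helper and not something Js_of_ocaml's runtime provides.

```javascript
// Hypothetical helper: report which runtime we are in and the best
// platform hint that runtime offers.
function detectPlatform() {
  if (typeof process !== "undefined" && process.versions && process.versions.node) {
    // node.js: a short identifier such as "win32", "linux", or "darwin".
    return { runtime: "node", hint: process.platform };
  }
  if (typeof navigator !== "undefined") {
    // Browser: only a free-form user agent string that has to be parsed.
    return { runtime: "browser", hint: navigator.userAgent };
  }
  return { runtime: "unknown", hint: null };
}

console.log(detectPlatform());
```

The two branches return differently shaped information, so a single Sys.unix value would need per-runtime logic plus user agent parsing to be reliable.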

As a workaround, the VSCode extension just uses bindings to the node path module for proper cross-platform path handling since the extension already depends on node.js.
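The behavior that the bindings rely on can be seen in node's path module directly. This is a short sketch with illustrative file names; path.win32 and path.posix force one set of rules regardless of the host platform, which makes the difference visible anywhere.

```javascript
const path = require("path");

// path.join uses the separator of the platform the process runs on.
console.log(path.join("src", "main.ml"));

// The win32 and posix variants behave the same on every platform.
console.log(path.win32.join("C:\\Users", "me", "main.ml")); // C:\Users\me\main.ml
console.log(path.posix.join("/home", "me", "main.ml")); // /home/me/main.ml

// A backslash is only a separator under Windows rules.
console.log(path.win32.basename("C:\\Users\\me\\main.ml")); // main.ml
console.log(path.posix.basename("C:\\Users\\me\\main.ml")); // C:\Users\me\main.ml
```

The last line shows the kind of result that Unix-only path handling produces for a Windows path: the whole string is treated as a single file name.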

Closing

Overall, I am very happy with the transition to Js_of_ocaml. The ability to use the same build system and packages for native and JS projects leads to a smooth and enjoyable development experience. I am still learning the quirks of the JS target, but for the most part, Js_of_ocaml just works.

The VSCode OCaml Platform is an actively developed project with numerous contributors, so please feel free to submit an issue or contribute a pull request.

If you have any questions or comments about this post, I can answer them on the OCaml Forum topic.