Tianyi Song

Profiling OCaml programs the quick and dirty way

Recently I followed the Ray Tracing in One Weekend tutorial to implement a path tracer in OCaml. I’ve always had a soft spot for ML languages, so I was pretty excited to finally try it out. However, rendering the final scene in the tutorial took a very long time, so I went looking for profiling tools for OCaml. As it turned out, profiling OCaml wasn’t as easy as I thought.

What Worked: Landmarks

I’m using ocaml-base-compiler 4.12.0 with dune as my build system on macOS. The simplest way to profile that I’ve found is to use landmarks.

Simply install landmarks with

opam install landmarks

Then, include the landmarks preprocessor in the dune file:

(executable
 (name test)
 (libraries landmarks)
 (preprocess (pps landmarks.ppx --auto))
)

If your project contains multiple dune files (e.g. one for the library and one for the executable), you need to apply the change to all of them. Otherwise, the profiling result won’t include symbols from the parts where the preprocessor wasn’t added.
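The --auto flag instruments top-level functions automatically. As I understand the landmarks README, the preprocessor can also be used without --auto, in which case only bindings tagged with the [@landmark] attribute are instrumented. Here is a minimal sketch of that style; the function names and bodies are hypothetical stand-ins, not the ray tracer’s actual code:

```ocaml
(* Hypothetical stand-ins for a ray tracer's hot functions; the names and
   bodies are illustrative only. *)

(* Assumption: with (pps landmarks.ppx) and no --auto flag, only bindings
   tagged with [@landmark] are instrumented. Without the preprocessor the
   compiler ignores the attribute, so this is also plain OCaml. *)
let[@landmark] hit_count radius xs =
  List.fold_left
    (fun n x -> if Float.abs x <= radius then n + 1 else n)
    0 xs

(* Left untagged, so it would not be instrumented in non-auto mode. *)
let samples n = List.init n (fun i -> float_of_int i /. 10.0)

let[@landmark] render () = hit_count 1.0 (samples 100)

let () = Printf.printf "hits: %d\n" (render ())
```

Tagging only a few suspect functions keeps the profile small, at the cost of having to guess where the time goes first.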

To generate the profiling result, set the OCAML_LANDMARKS environment variable, then compile and run the program:

OCAML_LANDMARKS=format=json,output=profile.json dune exec bin/main.exe

This writes the output to profile.json. You can then view the results with the online viewer (http://lexifi.github.io/landmarks/viewer.html). Note that although the README suggests installing a landmarks-viewer, the landmarks-viewer package doesn’t exist on OPAM (as of 4 Aug 2021). There’s an open GitHub issue tracking this: LexiFi/landmarks#17.

Here’s the profiling output:

Name                       Location               Calls    Cycles
ROOT                       src/landmark.ml        0        158 060 390 979
load(bin/ray_tracing)      bin/ray_tracing.ml:1   1        158 055 051 892
Bin/ray_tracing.ray_color  bin/ray_tracing.ml:16  82 500   157 485 489 898
Lib/hittable.hit           lib/hittable.ml:6      219 328  155 697 662 560

I call this “quick and dirty” because the profiling result only shows how many times each user-defined function is called and how many cycles it took. I couldn’t find allocation information, nor the time spent in system/runtime functions. I tried running with sudo (which worked for Rust’s cargo flamegraph), but it didn’t help.
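For allocation numbers specifically, one rough stopgap (not something tried in this post, and not a substitute for a real allocation profiler) is to read the standard library’s Gc counters before and after a call. The sketch below assumes a hypothetical helper named allocated_by; Gc.allocated_bytes is the only library function it relies on:

```ocaml
(* allocated_by is a hypothetical helper, not part of landmarks.
   It returns the result of [f x] together with the approximate number of
   bytes allocated while computing it. The figure also includes the small
   cost of boxing the counter values themselves. *)
let allocated_by f x =
  let before = Gc.allocated_bytes () in
  let result = f x in
  let after = Gc.allocated_bytes () in
  (result, after -. before)

let () =
  (* Building a 1000-element list allocates one block per element. *)
  let xs, bytes = allocated_by (fun n -> List.init n (fun i -> i)) 1000 in
  Printf.printf "list of %d elements, about %.0f bytes allocated\n"
    (List.length xs) bytes
```

This only gives a total for one call site, so it works best once the landmarks output has already narrowed down which function to look at.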

ocamloptp and ocamlprof didn’t work

Many online resources (e.g. Chapter 17 Profiling (ocamlprof)) suggest using ocamloptp in place of ocamlopt to compile a binary that outputs profiling information, then using ocamlprof to read it.

This didn’t work for me because I build with Dune rather than calling ocamlopt directly, and it’s not possible to configure Dune to use ocamloptp as the compiler (see ocaml/dune#398).

I could compile the program without Dune, but I didn’t want to leave the comfort of a build system.

Spacetime didn’t work either

There’s an article written in 2017 by Jane Street introducing the Spacetime profiler. It looked promising at first, because all we need to do to profile the program is switch to the modified Spacetime compiler.

But I quickly realized that the modified compiler is too old: it’s version 4.04.0. It doesn’t even have the Float module (added to the standard library in 4.07), so I decided it would be too much work to make the program compile with the older compiler.