State of Eclair in 2023

compilers

Since we're at the start of a new year, I decided to write a post on the current status of the Eclair Datalog compiler and where the language is heading.

Looking back on 2022

In short, it's been a crazy year for Eclair. At the beginning of 2022, the compiler didn't produce a working program yet. In April, I figured out how to compile the first Eclair programs to LLVM IR. Now at the start of 2023, we have a lot more features:

  1. Strings were added to the language.
  2. Arithmetic, equality and comparison operators were added.
  3. Errors are rendered in a pretty way, with a lot of detail.
  4. WebAssembly is now a fully supported compilation target.

It wasn't only about features, though; a large chunk of time also went into creating a solid foundation for the future:

  1. The test suite has been rewritten to make use of LLVM LIT. LIT makes it trivial to set up complex test scenarios to test the compiler thoroughly.
  2. A generic framework for transformations was introduced. It is compatible with all the IRs present in the compiler, making it possible to easily add new transformations to the compiler in a nano-pass style. Here's an example of constant folding and dead code elimination (WIP).
  3. Libraries such as eclair-haskell and eclair-wasm-bindings were created to communicate easily with Eclair from other languages. You can now even use something like bakery to compile Eclair to WebAssembly on the fly!
  4. We have a working LSP (OK, not all features are supported yet... but it already works very smoothly. 😄)

Most of these things were side projects in their own right, but they have paid off already and will continue to do so in 2023.
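To make the nano-pass idea from the list above a bit more concrete, here is a minimal sketch in Python. It is not Eclair's actual framework (the compiler is not written in Python, and its IRs are much richer); the toy IR and the names `bottom_up`, `fold_constants`, `eliminate_dead_branches` and `run_passes` are all made up for illustration. The point is only that each pass is a tiny rewrite on a single node, and one generic traversal applies it everywhere.

```python
from dataclasses import dataclass, fields, replace
from typing import Callable, Union

# Hypothetical toy IR, invented for this sketch (not one of Eclair's IRs).
@dataclass(frozen=True)
class Lit:
    value: int

@dataclass(frozen=True)
class Var:
    name: str

@dataclass(frozen=True)
class Add:
    left: "Expr"
    right: "Expr"

@dataclass(frozen=True)
class If:
    cond: "Expr"
    then: "Expr"
    otherwise: "Expr"

Expr = Union[Lit, Var, Add, If]
Pass = Callable[[Expr], Expr]
NODE_TYPES = (Lit, Var, Add, If)

def bottom_up(rewrite: Pass, node: Expr) -> Expr:
    """Generic traversal: rewrite all child nodes first, then the node itself."""
    changes = {
        f.name: bottom_up(rewrite, getattr(node, f.name))
        for f in fields(node)
        if isinstance(getattr(node, f.name), NODE_TYPES)
    }
    return rewrite(replace(node, **changes))

def fold_constants(node: Expr) -> Expr:
    """Constant folding: replace an addition of two literals by its result."""
    if isinstance(node, Add) and isinstance(node.left, Lit) and isinstance(node.right, Lit):
        return Lit(node.left.value + node.right.value)
    return node

def eliminate_dead_branches(node: Expr) -> Expr:
    """Dead code elimination: drop the branch a constant condition never takes."""
    if isinstance(node, If) and isinstance(node.cond, Lit):
        return node.then if node.cond.value != 0 else node.otherwise
    return node

def run_passes(passes: list, node: Expr) -> Expr:
    """Run each small pass over the whole tree, one after the other."""
    for single_pass in passes:
        node = bottom_up(single_pass, node)
    return node

program = If(Add(Lit(1), Lit(-1)), Var("a"), Add(Lit(2), Lit(3)))
optimized = run_passes([fold_constants, eliminate_dead_branches], program)
```

Here the first pass folds the condition to a literal `0` and the second branch to a literal `5`; the second pass then sees a constant condition and keeps only the branch that can run. Adding a new transformation means writing one more small function and appending it to the list.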

Looking forward to 2023

That was a quick recap of 2022, but what's planned for the coming year? I tried to group together the main ideas and found the following big topics:

  1. Extending and maturing the language,
  2. Bootstrapping / self-hosting of the compiler,
  3. Documentation,
  4. Performance improvements.

Extending and maturing the language

The language has grown quickly over the past couple of months, and a couple more big features will be added in the near future: logical negation in queries and @extern functions. Negation will make it possible to express a much larger set of queries/goals in Datalog. @extern functions will make it possible to call functions defined outside Eclair, keeping the runtime very compact.
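As a rough illustration of what negation adds, consider asking which nodes of a graph can *not* be reached from some starting node. Without negation, that query can't be expressed. The Python sketch below mimics how such a query could be evaluated: first compute the positive relation to a fixpoint, and only then apply the negation to the finished result (this ordering is commonly called stratification). The rules in the comments are written in generic Datalog notation, not actual Eclair syntax, and whether Eclair will evaluate negation exactly like this is an assumption on my part.

```python
def reachable_from(start, edges):
    """Naive fixpoint for the rules:
         reachable(start).
         reachable(y) :- reachable(x), edge(x, y).
    """
    reachable = {start}
    changed = True
    while changed:
        changed = False
        for (x, y) in edges:
            if x in reachable and y not in reachable:
                reachable.add(y)
                changed = True
    return reachable

def unreachable(nodes, reachable):
    """Negated rule, evaluated only after `reachable` is complete:
         unreachable(x) :- node(x), !reachable(x).
    """
    return {x for x in nodes if x not in reachable}

# Hypothetical example data.
nodes = {"a", "b", "c", "d"}
edges = {("a", "b"), ("b", "c"), ("d", "c")}

reached = reachable_from("a", edges)
not_reached = unreachable(nodes, reached)
```

With this data, `a`, `b` and `c` are reachable from `a`, so only `d` ends up in `not_reached`. The negated rule must wait until `reachable` is fully computed; evaluating it too early would wrongly report nodes that simply haven't been derived yet.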

After this, the language will probably stay the same feature-wise for a while, and the focus will shift to maturing it. This can include things such as improving the compiler, working on the surrounding ecosystem of libraries, ...

"Bootstrapping" the compiler

Another thing I want to focus on this year is bootstrapping the Eclair compiler. Compared to most other languages, this will be a different process for Eclair, since right now (Souffle) Datalog is used only for the semantic analysis part of the compiler. The plan here is to create an Eclair-to-Souffle converter, so that the semantic analysis can be fully written in Eclair itself and is only converted to Souffle for the initial "stage0" of the compiler. Visually this looks as follows:

[Figure: bootstrapping schematic]

I want to move forward with the language, so I'm not going to spend a large amount of time on this, but bootstrapping does provide some benefits:

  1. Easier to maintain. Changes can be made in the same codebase, and the feedback is pretty much instant.
  2. Better developer experience. Eclair has good tree-sitter and LSP support and the errors are so much clearer than the ones provided by Souffle.
  3. It is a good test for checking the correctness of the compiler.
  4. Quicker compilation. The LLVM IR generated by Eclair compiles much faster than the C++ code generated by Souffle. This also allows us to ship a single binary by always using "compiled" mode, without having to switch back and forth between "interpreted" and "compiled" mode.

Documentation

Eclair is growing quickly in terms of features in the language, so it will become increasingly important that the language is well-documented. For this reason, I bought a domain where I will be hosting the docs. Right now there's nothing there yet, but this will change soon!

I've already been experimenting with a tech stack for the website and I think that Astro will be a great choice. It produces a mostly static site, but still offers the possibility to include JavaScript in some places (for interactive code examples).

I'm also thinking about streaming some of these "documentation sessions" and turning them into an AMA-style format. Let me know what you think about that!

Performance

The final big topic for Eclair this year will be performance. I consider this a feature on its own, because if the language is slow, not many people will use it. (You want your queries to be fast, right?!) Right now the speed is "respectable", but a lot of things can still be improved!

To get performance that is comparable to (or better than) Souffle's, extra optimizations need to be added to produce efficient queries. On top of that, there is still a lot of low-hanging fruit in the language runtime that can also be improved. I'm looking forward to applying techniques I know from my C++ days to get good performance (e.g. SIMD, custom allocators, optimizing for cache locality, ... all that good stuff 😄).

Of course, to measure the performance of Eclair, several benchmarks will need to be written. Luckily, Souffle already has a good set of benchmarks. With a Souffle-to-Eclair converter, it should be reasonably easy to plug Eclair programs into this same set of benchmarks. I will probably write another post later this year on my progress regarding this, so stay tuned for that.

Closing thoughts

If you reached this point of the post, I hope I convinced you that 2023 is going to be an exciting year for Eclair! If you are interested in helping out, reach out to me on my Twitter. If you have any feedback on this post (or if you just liked it), you can also let me know there. If a lot of people end up liking this, I might make these kinds of progress posts more common.