Adding a Tree-Sitter to Helix

This is barely worth a blog post, but I’d like to avoid the voyage of discovery the next time I do this, and I’ll almost certainly have forgotten by then. This is a walkthrough of adding someone else’s tree-sitter to Helix.

Tree-sitters are a wonderful development in the editor space. They’re a sort of standardized language grammar that a text editor can use to mark up text – a step above trying to declare a BNF via a bunch of regular expressions. I’m not sure about the history of tree-sitters, but in any case, the very best editors use them (nvim and Helix, at the least), which means that creators of things that have a grammar can provide a single tree-sitter that can be used by many editors – without each editor having to re-invent the grammar wheel.

d2 is a new-ish, excellent diagramming tool descending (inheriting?) from GraphViz Dot. Someone has provided a tree-sitter for it, ostensibly for nvim, and it turned out to be simple to get into Helix.

The Helix instructions for installing a tree-sitter are pretty good, but some steps are not immediately obvious from the documentation.

Modify languages.toml

I chose to modify my user file rather than the global toml; I didn’t want to modify the global file in case the package manager overwrote my changes when I upgraded Helix.

I added this to my $XDG_CONFIG_HOME/helix/languages.toml:

[[language]]
name="d2"
auto-format=true
scope="text.d2"
file-types=["d2"]
roots=[]

[[grammar]]
name="d2"
source={git="https://github.com/pleshevskiy/tree-sitter-d2", rev="47cb1df7c8c1fb1b72f2e8fa43215908cf419517"}

Update

@pleshevskiy has archived their GitHub project, and it now lives at https://git.pleshevski.ru/pleshevskiy/tree-sitter-d2. Also, I figured out that HEAD works as a rev :-/. So the source line in languages.toml should read:

source={git="https://git.pleshevski.ru/pleshevskiy/tree-sitter-d2", rev="HEAD"}
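If you'd rather script the edit than paste it by hand, something like the following might do. It's a sketch, not anything Helix ships: the function names are made up, and it assumes that Helix falls back to ~/.config when XDG_CONFIG_HOME is unset. The entries are the ones from above, using the updated source line.

```shell
#!/bin/sh
# Sketch only: the function names below are made up for this example.

# Print the path of the user-level languages.toml, falling back to
# ~/.config when XDG_CONFIG_HOME is unset or empty (an assumption).
helix_languages_toml() {
    printf '%s\n' "${XDG_CONFIG_HOME:-$HOME/.config}/helix/languages.toml"
}

# Append the d2 entries to the given file, unless it already contains
# an entry named "d2".
add_d2_entries() {
    file=$1
    mkdir -p "$(dirname "$file")"
    if [ -f "$file" ] && grep -q '^name *= *"d2"' "$file"; then
        return 0
    fi
    cat >> "$file" <<'EOF'

[[language]]
name = "d2"
auto-format = true
scope = "text.d2"
file-types = ["d2"]
roots = []

[[grammar]]
name = "d2"
source = { git = "https://git.pleshevski.ru/pleshevskiy/tree-sitter-d2", rev = "HEAD" }
EOF
}

# Usage: add_d2_entries "$(helix_languages_toml)"
```

Running it twice leaves the file unchanged the second time, so it should be safe to keep in a dotfiles setup script.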

Build the Tree-Sitter

Update

This section is now outdated; the last time I added a tree-sitter on a new machine, none of this was necessary. Helix built the sitter entirely in ~/.config/helix, so now the steps are simply:

  1. Adding the lines to languages.toml per the previous section
  2. helix -g fetch
  3. helix -g build
  4. Symlinking the queries; something like this should work:
    ln -s ~/.config/helix/runtime/grammars/sources/d2/queries \
          ~/.config/helix/runtime/queries/d2
  5. Checking the output of helix --health languages

Couldn’t be easier!
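For next time, here are the same steps strung together as a rough script. This is a sketch under assumptions: HX, link_queries, and install_grammar are names I made up, the paths are the ones from the list above, and the binary is called helix on some installs and hx on others, so set HX to whichever you have.

```shell
#!/bin/sh
# Sketch only: HX and the function names are assumptions for this example.
# The steps above call the binary "helix"; other installs name it "hx".
HX=${HX:-helix}

# Symlink a fetched grammar's queries into the runtime queries directory.
# $1 = Helix config directory, $2 = grammar name.
link_queries() {
    src="$1/runtime/grammars/sources/$2/queries"
    dst="$1/runtime/queries/$2"
    [ -d "$src" ] || { echo "no queries directory at $src" >&2; return 1; }
    mkdir -p "$1/runtime/queries"
    # Leave an existing link or directory alone.
    [ -e "$dst" ] || [ -L "$dst" ] || ln -s "$src" "$dst"
}

# Fetch, build, link, then print the health table's header and the
# grammar's own row.
install_grammar() {
    "$HX" -g fetch &&
    "$HX" -g build &&
    link_queries "${XDG_CONFIG_HOME:-$HOME/.config}/helix" "$1" &&
    "$HX" --health languages | grep -E "^(Language|$1)"
}

# Usage: install_grammar d2
```

The link step is split out so it can be rerun on its own; it refuses to link if the fetch didn't produce a queries directory.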

Old instructions

You’re supposed to run hx -g fetch and then hx -g build, but this will try to install files in /usr/lib/helix/runtime/grammars. If you try to sudo the commands, it won’t install anything… because the changes are in your $USER toml. I solved this, for better or worse, by giving myself write permissions to the directories. I’m not sure if this was “correct,” but any way I can see using the documented procedure requires root access (or what I did). In any case:

$ sudo setfacl -m u:ser:rwx /usr/lib/helix/runtime/grammars
$ sudo setfacl -m u:ser:rwx /usr/lib/helix/runtime/grammars/sources

The latter directory is needed by the fetch command, which will download the repo into that sources directory; e.g., /usr/lib/helix/runtime/grammars/sources/d2. The former is required by the build command, which puts the resulting .so there.

I also had to run this:

$ hx -g fetch | grep 'git config' | while read -r line; do
    eval "$line"
done

to silence git complaints about every single tree-sitter. There’s probably a better way to do that, too, but I loathe git and wasn’t going to waste my time trying to figure out more of its Byzantine maze of commands and config options[1].
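A possibly safer variant of that loop avoids eval by pulling out only the directory and passing it to git directly. This is a sketch built on a guess: it assumes the complaints are git's "dubious ownership" hints, which suggest running git config --global --add safe.directory with a path. I didn't record the exact output, so treat the pattern as an assumption; safe_dirs_from_hints is a made-up name.

```shell
#!/bin/sh
# Sketch only. Assumes the hint lines look like:
#   git config --global --add safe.directory <path>
# Prints <path> for each such line read on stdin and ignores the rest.
safe_dirs_from_hints() {
    sed -n 's/^[[:space:]]*git config --global --add safe\.directory[[:space:]]\{1,\}\(.*\)$/\1/p'
}

# Usage (assumes the hints arrive on stdout, as in the loop above):
#   hx -g fetch | safe_dirs_from_hints | while IFS= read -r dir; do
#       git config --global --add safe.directory "$dir"
#   done
```

The difference from eval is that nothing in the command output gets executed; the only thing taken from it is a path.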

After this, run the fetch and build commands:

$ hx -g fetch
$ hx -g build

With luck, both of these commands will succeed, and here’s where you are now:

$ hx --health languages | rg '^(Language|d2)'
Language         LSP              DAP              Highlight        Textobject       Indent
d2               None             None             ✘                ✘                ✘

Which is nothing: we’re still missing the queries.

Get the Queries and Indents

For some reason, you have to manually copy the queries into the right place; my guess is that there’s just no standardized directory structure for tree-sitter projects, so Helix can’t know where these files are. In any case, locate the queries in the tree-sitter repo – they’ll be .scm files – and copy (or link) them into the right directory for Helix. In my case, that was:

$ sudo ln -s /usr/lib/helix/runtime/grammars/sources/d2/queries/ /usr/lib/helix/runtime/queries/d2

I did a symlink to catch future updates. The first path was created by hx -g fetch, and the second is where Helix expects the queries to be. This gets us to:

$ hx --health languages | rg '^(Language|d2)'
Language         LSP              DAP              Highlight        Textobject       Indent
d2               None             None             ✓                ✘                ✘

Highlighting! Woot!
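The remaining ✘ marks are presumably because the repository ships no queries for those columns. A quick way to check, assuming the three columns correspond to highlights.scm, textobjects.scm, and indents.scm in the queries directory (query_status is a made-up helper name):

```shell
#!/bin/sh
# Sketch only: report which of the three query files are present in a
# queries directory. The file names are an assumption about what the
# Highlight, Textobject, and Indent columns look for.
query_status() {
    dir=$1
    for q in highlights textobjects indents; do
        if [ -f "$dir/$q.scm" ]; then
            printf '%s: present\n' "$q"
        else
            printf '%s: missing\n' "$q"
        fi
    done
}

# Usage: query_status /usr/lib/helix/runtime/queries/d2
```

It follows the symlink, so pointing it at the linked directory or at the fetched sources should give the same answer.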

Improvements

I followed the instructions in the Helix documentation, and I got some extra pointers from the #helix-editor:matrix.org Matrix room.

The biggest gap in the documentation is that it assumes the reader is writing queries, or a tree-sitter; it’s not oriented to someone who (like me, in this case) found a tree-sitter for nvim and wanted to use it in Helix. Since tree-sitters are supposed to be editor-agnostic, it’d be nice if the instructions took the approach of someone who’s using a tree-sitter for a different editor.

I admit that I may be missing something, but I think that hx needs a --user option to install tree-sitters in the user’s directory. I see no way for a user who doesn’t have root access to install tree-sitters, unless they containerize their Helix or some such nonsense.

Conclusion

And that’s basically it, for d2. It’s almost as good as the GraphViz dot support, which also has no Textobject or Indent queries. If you have an LSP, you’d want to install and configure that as well; there’s none for d2, so there was nothing for me to do there.

Standardizing grammars and syntax highlighting for editors so they can be shared is nearly as brilliant a development as LSP was; IME it works pretty well.


  1. Seriously. Look at jj, git-branchless, or any of the other dozen projects that try to make the git UI not suck.