Tree-sitter adventures: Language injections

Posted on Mon 18 December 2023 in Neovim

I'm currently working on a personal fun project that tries to make this code valid Rust:

#[sql]
fn get_subjects() -> Vec<i32> {
    "select id from subjects"
}

In case you don't know Rust: #[sql] is an attribute macro that rewrites the function to make it legal code. My goal is to be able to write inline SQL queries as conveniently and with as little boilerplate as possible. But the point of this article is that the above code snippet looks like this in my editor:

Screenshot showing highlighted Rust & SQL code

So the SQL code inside the Rust string is highlighted properly. That's cool, but it surprised me even more that it still worked while writing this article in Markdown:

Screenshot showing highlighted Markdown, Rust & SQL code

SQL inside Rust inside Markdown and still everything is properly highlighted.

How does this work?

This is possible thanks to the Tree-sitter integration in Neovim. More or less everything you need to know has already been explained in one of TJ's videos. Well, not everything: the Tree-sitter integration is rather new, some features required plugins in the past but will be built into 0.10, APIs have changed, ... Long story short: I'll walk you through the steps I needed to make this work in Neovim v0.10.0-dev-1836+g36552adb3.

Tree-sitter queries

Tree-sitter allows you to write queries that select parts of the code in your editor. Just open a source file in Neovim and execute :InspectTree (remember, you need 0.10 for all of this to work). You get Tree-sitter's parse tree. Move the cursor around in either window and observe what is highlighted in the other one. If you have basic knowledge of how language parsing and ASTs work, you should get the idea.

The query I came up with for the above example is:

;extends

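(
  ; the #[sql] attribute
  (attribute_item
    (attribute
      (identifier) @ident (#any-of? @ident "sql")))
  ; anchor: the function must directly follow the attribute
  .
  (function_item
    body: (block (string_literal) @injection.content))
  ; trim the surrounding double quotes from the captured range
  (#offset! @injection.content 0 1 0 -1)
  (#set! injection.language "sql")
)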

The part before the dot looks for the attribute (#[sql]). #any-of? lets you check against a list of strings, which I did in the beginning, but then I ended up with just "sql". The @ident (starting with @) is a "capture". The name ident is arbitrary and could just as well be xyz. The second capture, @injection.content, is different: it is a built-in name expected by Neovim and marks the content to be highlighted. Likewise, injection.language (set with #set! here) needs to hold the name of the language to be used for highlighting.

The dot means that the next pattern must match directly after the first one. Without it, the pattern would also match any later sibling. Then I match the string literal inside the body of the function, which is self-explanatory once you get the idea.
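To make the effect of the dot more tangible, here is a toy model in Rust. It is only a sketch and not how Tree-sitter is actually implemented: the helper match_pairs and the flat list of node kinds are made up for illustration. It pairs every attribute_item with a function_item, once with the anchor and once without.

```rust
/// Toy model of sibling matching; NOT Tree-sitter's real algorithm.
/// `match_pairs` is a hypothetical helper that returns the index pairs
/// (attribute, function) found in a flat list of sibling node kinds.
fn match_pairs(siblings: &[&str], anchored: bool) -> Vec<(usize, usize)> {
    let mut pairs = Vec::new();
    for (i, kind) in siblings.iter().enumerate() {
        if *kind != "attribute_item" {
            continue;
        }
        for j in (i + 1)..siblings.len() {
            // With the anchor, only the immediately following sibling counts.
            if anchored && j != i + 1 {
                break;
            }
            if siblings[j] == "function_item" {
                pairs.push((i, j));
            }
        }
    }
    pairs
}

fn main() {
    // Models: #[sql] fn a() { ... }  fn b() { ... }
    let siblings = ["attribute_item", "function_item", "function_item"];
    println!("anchored:   {:?}", match_pairs(&siblings, true));
    println!("unanchored: {:?}", match_pairs(&siblings, false));
}
```

In this model the anchored variant only pairs the attribute with the first function, while the unanchored one also pairs it with the second function, which carries no #[sql] at all.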

Now for a tricky detail: the pattern matches the full string, including the double quotes. If you highlighted that range as SQL, it would still be a string. The highlighting needs to be applied to the content of the string without the double quotes. The #offset! does just that: its four numbers shift the start row, start column, end row and end column of the capture, so 0 1 0 -1 drops the first and last characters. Finally, I use #set! to tell Tree-sitter which language to highlight.
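The arithmetic behind the offset can be sketched in a few lines of Rust. This is an illustration under assumptions, not Neovim's code: Point, Range and apply_offset are hypothetical names, and the columns are counted by hand for the string line of the initial snippet (ASCII only, so bytes and characters coincide).

```rust
/// Zero-based (row, column) position, the way Tree-sitter addresses text.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Point {
    row: usize,
    col: usize,
}

/// A node's range; `end` is exclusive.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Range {
    start: Point,
    end: Point,
}

/// Hypothetical helper mimicking `#offset! @capture a b c d`: the four
/// deltas are added to start row, start column, end row and end column.
fn apply_offset(r: Range, d: [isize; 4]) -> Range {
    let shift = |v: usize, by: isize| (v as isize + by) as usize;
    Range {
        start: Point { row: shift(r.start.row, d[0]), col: shift(r.start.col, d[1]) },
        end: Point { row: shift(r.end.row, d[2]), col: shift(r.end.col, d[3]) },
    }
}

fn main() {
    // Third line of the snippet (row 2); the literal spans columns 4..29.
    let line = "    \"select id from subjects\"";
    let literal = Range {
        start: Point { row: 2, col: 4 },
        end: Point { row: 2, col: 29 },
    };
    let inner = apply_offset(literal, [0, 1, 0, -1]);
    // The trimmed range no longer contains the double quotes.
    println!("{}", &line[inner.start.col..inner.end.col]);
}
```

The rows stay untouched because both row deltas are 0; only the columns move inward by one on each side.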

injections.scm

To give it a try, put the query into a file called injections.scm in after/queries/rust in one of your config paths (for example ~/.config/nvim/after/queries/rust/injections.scm on a default Linux setup). Copy the initial Rust snippet into a *.rs file, open it and see what happens.

Highlighting

Motivated by my success, I wondered whether it would be possible to highlight the SQL a bit more. And of course it is. You just need a slightly modified query:

; extends

(
  (attribute_item
    (attribute
      (identifier) @x (#any-of? @x "sql")))
  .
  (function_item
    body: (block (string_literal) @inlinesql))
)

It is more or less the same, just with different capture names. The @x does not really matter apart from being used to check the name of the identifier. But if you copy this query into queries/rust/highlights.scm (without after, don't ask me why), the @inlinesql capture automatically defines a highlight group in Neovim. You can now execute :hi @inlinesql guibg=#502040 and the SQL code will get a different background color.

Conceal

Neovim has the option to conceal certain characters, so I wondered whether I could hide the double quotes around the SQL code. Long story short: It should be possible, but it does not work.

You can write a query to match only the double quotes and would then be able to conceal them. I had a long conversation in the Neovim-treesitter Matrix channel and very nice and helpful people figured out that something is not fully working yet. So it will be possible in the future, but does not yet work with my 0.10 version.

Feedback

I don't like writing. I hate it. But I like good discussions about technology and code. I consider blog posts like this one an invitation to exchange ideas. If you like it or have improvements, questions, ... please let me know. You can find me on Mastodon as @achim@social.saarland. My email is firstname@lastname.de.

Credits

This post would not exist without input from various people: Clason in the Matrix channel was insanely patient and helpful in figuring out problems with my setup. TJ and ThePrimeagen provide great and encouraging videos, which I would not even know about without Sascha pointing me to them.