Mar 5th 2024

Langium 3.0 is Released!

Benjamin F. Wilson

We are happy to announce the release of Langium 3.0! We’ve put quite a lot of thought & effort into what would make Langium better for everyone, and we’re excited to share what we have in 3.0.

Originally we had planned to roll out this release sometime in December 2023, but we decided to wait a bit longer to get some extra features in. Since the last minor release (2.1.0) was back in early November, there’s quite a bit to share here.

Overall, this release is more focused on improving the way Langium is used in projects, in support of larger frameworks, and as a parsing tool in conjunction with an IDE. We feel that as Langium continues to mature, it’s becoming increasingly important to deliver a quality language engineering framework based on the feedback we receive, while also maintaining the simplicity & flexibility that are core to Langium’s principles.

This brings its own challenges as we continue to see Langium be used in a wider range of applications, but we’re more than happy to continue to rise to that challenge.

Adoption

Since Langium was brought into the Eclipse Foundation as Eclipse Langium in 2023, we’ve been progressing steadily on establishing Langium as a standard. Already we’ve seen Langium being used & planned in a large number of contexts. This includes projects such as MermaidJS, ZenStack, and Jayvee. We’ve also seen applications of Langium in diagramming, machine learning, modeling, prompt engineering, finance, and more.

We recently asked the community how Langium is being used in the wild, and the feedback suggests that the vast majority of usage is in IDE support + parsing.

Poll

Granted, there are also cases of parsing alone, but it’s clear that Langium is finding a natural home in the realm of IDEs (particularly VS Code) along with parsing.

New Features

This release brings in a ton of new improvements & advanced features that we’ll talk about in more detail below:

  • Async parsing support
  • Reduced bundle size + Granular package exports
  • Introduced a new DocumentUpdateHandler service
  • Support for default values on interface properties
  • Support for lookbehind regexes in terminals

Additionally, we’ve also made some smaller improvements, such as:

  • Improved the JSONSerializer to support references to other documents
  • Better error reporting
  • Added a Langium file icon for all .langium files

This list is non-exhaustive, as quite a bit more was added. If you’re curious, feel free to check out the CHANGELOG for Langium 3.0. Since we have several other packages contained in this repo, we have several sub-CHANGELOGs for each package, so you can look into the ones that are most relevant to you. However, most of these changes are concentrated in the core Langium CHANGELOG linked above.

Main Impacts

For those of you that are on version 2.0, you’ll need to factor in the new package exports (which are explained in more detail below). The functionality remains the same as it was, but your application will need to reference the correct ones to function.

In most cases, these changes will include importing from langium/lsp instead of langium directly, especially for your language-server & any LSP related functionality. In cases where you can’t find what you need in /lsp, it’s likely in another one of the exports listed below. We were careful to keep the overall API surface quite similar to what it was in 2.0, just divided up this time around.

There is also a breaking change with regards to the introduction of the DocumentUpdateHandler service. For more details, you can read the section on that below.

Async Parsing Support

In cases where large projects need to stop parsing for any reason, we’ve introduced async parsing support to help. This allows you to parse off the main thread for those particularly large projects, and gives you the freedom to stop parsing as needed. However, the default is still synchronous parsing as it’s always been, so existing projects don’t need to worry.

To support this feature, adopters will need to opt into creating their own WorkerThreadAsyncParser, which will be used to instantiate their worker parser file (ex. ‘worker-parser.js’, or whatever you choose to name it). All this needs to do is parse the incoming text, and then return the dehydrated AST back. You can see this in practice in the following test files:

For more details, you can check out PR #1352 where this was first introduced.
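The "parse, dehydrate, send back" round trip described above can be pictured with a small standalone TypeScript sketch. It does not use Langium or real worker threads. The `ToyNode` type, the line-based `toyParse`, and the `handleParseRequest` / `handleParseReply` helpers are all hypothetical, and JSON is used as the transport purely as a simplifying assumption. Langium’s real implementation may differ in every detail. The idea shown is only that a tree with back-references is reduced to plain data on the worker side and rebuilt on the receiving side.

```typescript
// Toy AST node with a parent pointer (an assumption made for this sketch).
interface ToyNode {
    name: string;
    children: ToyNode[];
    parent: ToyNode | undefined;
}

// Plain, cycle-free form that can be serialized and sent as a message.
interface DehydratedNode {
    name: string;
    children: DehydratedNode[];
}

// Toy parser: every non-empty line becomes a child of a single root node.
function toyParse(text: string): ToyNode {
    const root: ToyNode = { name: 'root', children: [], parent: undefined };
    for (const line of text.split('\n')) {
        const name = line.trim();
        if (name.length > 0) {
            root.children.push({ name, children: [], parent: root });
        }
    }
    return root;
}

// Drop the parent pointers so the tree no longer contains cycles.
function dehydrate(node: ToyNode): DehydratedNode {
    return { name: node.name, children: node.children.map(child => dehydrate(child)) };
}

// Rebuild the full tree, restoring the parent pointers.
function hydrate(data: DehydratedNode, parent: ToyNode | undefined): ToyNode {
    const node: ToyNode = { name: data.name, children: [], parent: parent };
    node.children = data.children.map(child => hydrate(child, node));
    return node;
}

// Worker side: what a message handler would do with the incoming text.
function handleParseRequest(text: string): string {
    return JSON.stringify(dehydrate(toyParse(text)));
}

// Receiving side: turn the worker's reply back into a navigable tree.
function handleParseReply(reply: string): ToyNode {
    return hydrate(JSON.parse(reply) as DehydratedNode, undefined);
}
```

In a real setup, the two handlers would sit on opposite sides of a worker message channel, and stopping a parse would come down to terminating the worker.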

Reduced Bundle Size + Granular Package Exports

Since we moved to ESM back in Langium version 2.0.0, we’ve also been taking advantage of the more granular exports we can produce. Not only has this given us a nice opportunity to think about the project structure, but it also means developers can get a reduced bundle size due to tree shaking. Beyond developers, end-users of Langium applications can also see a reduced application size, which is particularly relevant for web-based deployments.

Going back to the exports, here’s a list of some of the ones we’ve added, and what they contain:

  • /generate
    • Code generator logic, including template string helpers. Useful for those that want to use our generator functionality in their own language.
  • /grammar
    • Anything related to Langium’s own grammar language. Generated types, validation, type checking, etc… Useful when you want to work with Langium’s own grammar language directly.
  • /lsp
    • Services & functionality that are either directly (or indirectly) tied to the Language Server Protocol. Most of the time you’ll be using imports from this package for applications, such as those with language-servers.
  • /node
    • Has the Node file system provider, for those that intend to use Langium in a Node environment.
  • /test
    • Test helpers that are useful for Langium-based languages. Helpful in verifying how your language is working & behaving, without having to write all the testing logic yourself.
  • .
    • Core functionality of the framework, without LSP functionality. This is useful if you’re just using Langium in a lightweight context, such as only parsing.

Depending on your use case you can mix & match these imports to get the features you need, whilst avoiding those you don’t. In particular, being able to not include LSP related functionality can save you a nice bit of space in your bundle (about 100KB or so).

Document Update Handler

We added the DocumentUpdateHandler to provide a more customizable way to handle document changes. Previously this was all wired together into a single function, which made it more difficult to make changes.

In particular, if the client is watching files in your workspace, changes to those files are now reported back to the language server through this service. We’ve had this support already, but pulling it into a dedicated service makes it easier to extend, modify, and use as you see fit.

It’s important to note that this is a breaking change from Langium 2.X and earlier. Previously, developers had to register a file system watcher in their VSCode extension logic. Now, the language server takes care of this through the LSP API.

Support for Default Values on Interface Properties

We’ve added default value support for properties of declared interfaces. As an example, this means you can now do something like this in Langium:

interface Test {
  value: number = 123;
}

You can then use this declaration in a parser rule that creates an object of type Test:

entry Model:
  {Test} 'val' (value=INT)?;

terminal INT returns number: /\d+/;

This rule will produce an object of type Test, where the value property is either set to a number that can be recognized by the INT terminal or defaults to 123. This is particularly helpful in cases like the one outlined above, where your grammar may make certain properties optional, but you still want to ensure you get some concrete value other than undefined.
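To make the effect concrete, here is a minimal TypeScript sketch of the observable result. It does not use Langium. The `Test` type below, the `parseVal` function, and the regex-based matching are hypothetical stand-ins for the generated AST type and the real parser, and only mirror the behavior described above.

```typescript
// Hypothetical stand-in for the AST type that would correspond to `interface Test`.
interface Test {
    $type: 'Test';
    value: number;
}

// Default taken from the grammar declaration `value: number = 123`.
const DEFAULT_VALUE = 123;

// Toy replacement for the `Model` rule: accepts 'val', optionally followed by digits.
// Returns undefined when the input does not match the rule at all.
function parseVal(input: string): Test | undefined {
    const match = /^\s*val(?:\s+(\d+))?\s*$/.exec(input);
    if (match === null) {
        return undefined;
    }
    // The capture group is undefined when the optional number was omitted.
    const digits = match[1];
    return {
        $type: 'Test',
        value: digits === undefined ? DEFAULT_VALUE : Number(digits)
    };
}
```

With this sketch, `parseVal('val 42')` yields a `Test` whose `value` is 42, while `parseVal('val')` yields one whose `value` is 123, so consumers never have to handle `undefined` for that property.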

Lookbehind Regexes in Terminals

There’s some RegEx syntax that is not currently recognized by Chevrotain (the parser builder we use in Langium), but is increasingly relevant in projects. One key example is lookbehind. Lookbehind patterns still work as expected when used, but they also cause Chevrotain to spit out quite a few errors along the way. This makes output cluttered and hard to read. To fix this we’ve changed the way we pass terminal definitions to Chevrotain. This new approach retains behavior as it was before, and it’ll make your langium:generate output much cleaner in these cases.
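As a reminder of what lookbehind does, here is a small standalone TypeScript sketch, independent of Langium and Chevrotain. The pattern and the `matchAmount` helper are made up for illustration. The group `(?<=\$)` asserts that a `$` directly precedes the match without making it part of the matched text.

```typescript
// Illustrative pattern only: digits that are directly preceded by '$'.
// Written with the RegExp constructor; the literal form is /(?<=\$)\d+/.
const AMOUNT = new RegExp('(?<=\\$)\\d+');

// Hypothetical helper: returns the matched digits, or undefined if none qualify.
function matchAmount(text: string): string | undefined {
    const match = AMOUNT.exec(text);
    return match === null ? undefined : match[0];
}
```

For example, `matchAmount('price: $42')` returns `'42'` without the dollar sign, and `matchAmount('price: 42')` returns `undefined` because no `$` precedes the digits.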

Conclusion

All in all, we’re happy to be sharing this new release of Langium with everyone, and we’re looking forward to what you’ll all be building! We’re always open to feedback as well, and you can engage with us on GitHub Discussions.

We really value what the community sees in Langium, and we’ll continue to take that into account as we develop & improve Langium going forward.

About the Author

Benjamin F. Wilson

Ben is an eclectic language engineering enthusiast, proficient in full-stack development and well-versed in type systems. He is passionate about software architecture, analysis, and employing effective solution patterns. When he’s not at work, you can find him building tools & electronics with recycled parts.