The Langium Playground

Tue Jan 17 2023 by Markus Rudolph

Langium now has its own playground, where you can test how text is converted into a well-defined model. It runs without any backend support.

Pardon? Langium?!

Langium is a tool for creating languages (and feature-rich editors for them). In this context, a language is a text format that is both human-readable and machine-readable. Think of XML, C source code or CSS.

But transforming a text into an explicit structure takes some work. In Langium, this transformation is described by a grammar.

Let's take the following grammar as an example (playground link):

grammar HelloWorld

entry Model:
    (persons+=Person | greetings+=Greeting)*;

Person:
    'person' name=ID;

Greeting:
    'Hello' person=[Person:ID] '!';

hidden terminal WS: /\s+/;
terminal ID: /[_a-zA-Z][\w_]*/;

A grammar, in the context of compiler construction, is a set of rules that describes all possible input texts that you want to recognize.

The shape of a rule can be compared with a function definition in your favorite imperative programming language: the left part preceding the colon is the signature of the function, the right part is its body.

The rule body consists of calls to other rules (by rule name) and of quoted keywords, which stand for the literal text that gets recognized. Let's have a look at the rule Person.

Person: 'person' name=ID;

Here, Person is the name of the rule, 'person' is a keyword that must appear literally in the text, and name=ID calls the terminal rule ID and assigns the parsed text to a property called name. Let's say we have the input person Markus. The result of applying Langium is a model that contains a Person object with the name Markus.
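
To make that result tangible, here is a minimal TypeScript sketch of what applying the Person rule to person Markus amounts to. It is a hand-written illustration and not the parser Langium generates: the function name parsePerson is hypothetical, and the $type property is assumed to carry the name of the rule that created the node.

```typescript
// Hand-written sketch of the Person rule; not Langium's generated code.
interface Person {
    $type: 'Person';
    name: string;
}

// Same pattern as the ID terminal in the grammar above, anchored to a whole token.
const ID = /^[_a-zA-Z][\w_]*$/;

// Mirrors `Person: 'person' name=ID;`: a keyword followed by an ID.
function parsePerson(text: string): Person {
    // Whitespace is hidden (terminal WS), so it only separates tokens.
    const tokens = text.trim().split(/\s+/);
    const [keyword, id] = tokens;
    if (tokens.length !== 2 || keyword !== 'person' || id === undefined || !ID.test(id)) {
        throw new Error(`Expected "person <ID>", got "${text}"`);
    }
    // `name=ID` assigns the matched text to the property `name`.
    return { $type: 'Person', name: id };
}

const markus = parsePerson('person Markus');
console.log(markus);
```

The rule reads like a function here too: the rule name becomes the return type, and the body decides which text is accepted and which properties are filled.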

What does the playground do?

The Langium playground helps you sketch a text-to-model transformation (parsing). You can prototype and test a Langium grammar immediately, without installing anything.

All you need to know is the purpose of each panel inside of the playground.

  1. The first panel is the grammar editor. Here we develop the shape of the language. This shape also determines the JavaScript object model that you can use once parsing is done.
  2. The second panel, the content panel, contains one instance of your language. For the grammar above, an instance could be a list of person declarations, like person Markus.
  3. The third panel is initially hidden and can be made visible using the tree button above the content panel. It shows the syntax tree of your content in the shape of your grammar.

Every change in the grammar triggers a rebuild of the parsing toolchain. And every content change triggers a new parsing attempt of the content, which populates the syntax tree in the third panel.

So, just drop some lines into the content panel, for example one instance of your new language. Then start writing your grammar and improve it until all errors are gone. Fantastic, isn't it?!

You are now able to test and share your ideas. The playground will help you understand the abstract syntax tree of each input.

Knowing the syntax tree is the key to mastering validators, generators and even interpreters.

Let's create a JSON parser!

Here is the plan for creating a JSON parser:

  1. We add the primitive boolean values, true and false.
  2. We add the list as a complex value that holds booleans.
  3. We add the object value.
  4. We add the remaining values (number, null).

Let's do it!

1. Add boolean values (solution)

Simple character data like false or true can be described as terminals. So, just write:

terminal BOOLEAN: /true|false/;

Every grammar needs an entry parser rule that signals where parsing starts. Please add:

entry Value: BooleanValue;
BooleanValue: value=BOOLEAN;

The assignment value=BOOLEAN takes the text matched by the BOOLEAN terminal and assigns it to a property named value on the surrounding node, which is typed as BooleanValue.

If you want to ignore whitespace, add this line as well. The hidden keyword filters out all whitespace from the stream of terminals during the parsing process, so you do not have to care about it.

hidden terminal WS: /\s+/;

If you enter true in the content panel, you will get a nice syntax tree with your value true in it.
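
The effect of the hidden keyword can be sketched in plain TypeScript. The tokenizer below is a hand-written stand-in, not Langium's lexer: it tries each terminal in turn and simply drops tokens of hidden terminals before the parser sees them. The names tokenize and parseValue are hypothetical, and the sketch assumes that value holds the matched text as a string, because the BOOLEAN terminal declares no return type.

```typescript
// Hand-written sketch; Langium generates the real lexer and parser for you.
interface BooleanValue {
    $type: 'BooleanValue';
    value: string; // assumption: the matched text, since BOOLEAN has no `returns` clause
}

interface Token { type: 'BOOLEAN' | 'WS'; text: string; }

// The two terminals of the grammar, anchored so they match at the current offset.
const terminals: Array<{ type: Token['type']; pattern: RegExp; hidden: boolean }> = [
    { type: 'BOOLEAN', pattern: /^(?:true|false)/, hidden: false },
    { type: 'WS', pattern: /^\s+/, hidden: true },
];

function tokenize(input: string): Token[] {
    const tokens: Token[] = [];
    let offset = 0;
    while (offset < input.length) {
        const rest = input.slice(offset);
        let matched = false;
        for (const terminal of terminals) {
            const match = terminal.pattern.exec(rest);
            if (match) {
                // Hidden terminals are consumed but never reach the parser.
                if (!terminal.hidden) tokens.push({ type: terminal.type, text: match[0] });
                offset += match[0].length;
                matched = true;
                break;
            }
        }
        if (!matched) throw new Error(`Unexpected character at offset ${offset}`);
    }
    return tokens;
}

// Mirrors `entry Value: BooleanValue;` and `BooleanValue: value=BOOLEAN;`.
function parseValue(input: string): BooleanValue {
    const tokens = tokenize(input);
    const first = tokens[0];
    if (tokens.length !== 1 || first === undefined) {
        throw new Error('Expected exactly one BOOLEAN');
    }
    return { $type: 'BooleanValue', value: first.text };
}

const tree = parseValue('  true\n');
console.log(tree);
```

Because WS is filtered out during tokenization, the parser rule never has to mention whitespace at all.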

2. Add list values (solution)

Our goal now is to add list values. Lists contain arbitrary values (for now, lists and booleans). So, we first add a definition for the list:

List:
    '[' (elements+=Value (',' elements+=Value)*)? ']';

Mind the elements+=Value assignments. The += creates a list of values for you. Afterwards, you only need to reference it in the Value rule by adding an alternative with the or operator |:

entry Value: BooleanValue | List;

Now, test it with a list of booleans like [true, false].

Amazing!
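
To see how the += assignments collect values, here is a small recursive-descent sketch in TypeScript of the two rules so far. It is an illustration written by hand under simplifying assumptions (a very lenient tokenizer, hypothetical function names) and not the parser Langium builds from the grammar.

```typescript
// Hand-written sketch of `Value: BooleanValue | List`; not Langium's generated parser.
type Value = BooleanValue | List;
interface BooleanValue { $type: 'BooleanValue'; value: string; }
interface List { $type: 'List'; elements: Value[]; }

// Split the input into tokens; whitespace is hidden, so it is skipped.
function tokenize(input: string): string[] {
    return input.match(/true|false|\[|\]|,|\S/g) ?? [];
}

function parse(input: string): Value {
    const tokens = tokenize(input);
    let pos = 0;

    // Value: BooleanValue | List
    function parseValue(): Value {
        const token = tokens[pos++];
        if (token === 'true' || token === 'false') {
            return { $type: 'BooleanValue', value: token };
        }
        if (token === '[') {
            return parseListBody();
        }
        throw new Error(`Unexpected token: ${token ?? 'end of input'}`);
    }

    // '[' (elements+=Value (',' elements+=Value)*)? ']' with '[' already consumed.
    function parseListBody(): List {
        const list: List = { $type: 'List', elements: [] };
        if (tokens[pos] !== ']') {
            list.elements.push(parseValue());     // first elements+=Value
            while (tokens[pos] === ',') {
                pos++;
                list.elements.push(parseValue()); // each further elements+=Value
            }
        }
        if (tokens[pos++] !== ']') throw new Error("Expected ']'");
        return list;
    }

    const result = parseValue();
    if (pos !== tokens.length) throw new Error('Unexpected trailing input');
    return result;
}

const tree = parse('[true, [false]]');
console.log(JSON.stringify(tree, null, 2));
```

Every elements+=Value in the rule corresponds to one push onto the same array, which is why a single elements property ends up holding the whole list. The recursion in parseValue is what allows lists inside lists.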

3. Add object values (solution)

For an object we need a string value to model the name of each entry. At the same time we can introduce a new Value rule alternative.

StringValue: value=STRING;
terminal STRING: /"[^"]*"/;

Now you can define a single name-value pair:


NameValuePair: name=STRING ':' value=Value;

And finally, add the Object rule:

entry Value: BooleanValue | List | StringValue | Object;
Object:
    '{' (nameValuePairs+=NameValuePair (',' nameValuePairs+=NameValuePair)*)? '}';

Test it!

{
  "boolean": false,
  "string": "abc",
  "LIST": ["", false],
  "object": {"hello": "world"}
}

Great! Beautiful :).
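
Once the syntax tree looks right, it can be processed further. As a sketch of a tiny interpreter, the TypeScript below walks a tree shaped like the grammar and turns it into a plain JavaScript value. The interfaces are written by hand to mirror the rules, the tree is typed in manually instead of coming from a parser, and two things are assumptions: the interface is called ObjectNode to avoid clashing with JavaScript's built-in Object, and string values are assumed to arrive without their surrounding quotes.

```typescript
// Hand-written node types that mirror the grammar rules; names are illustrative.
type Value = BooleanValue | StringValue | List | ObjectNode;
interface BooleanValue { $type: 'BooleanValue'; value: string; }
interface StringValue { $type: 'StringValue'; value: string; }
interface List { $type: 'List'; elements: Value[]; }
interface NameValuePair { $type: 'NameValuePair'; name: string; value: Value; }
interface ObjectNode { $type: 'Object'; nameValuePairs: NameValuePair[]; }

// Walks the tree and builds the corresponding plain JavaScript value.
function evaluate(node: Value): unknown {
    switch (node.$type) {
        case 'BooleanValue':
            return node.value === 'true';
        case 'StringValue':
            return node.value;
        case 'List':
            return node.elements.map(element => evaluate(element));
        case 'Object': {
            const result: Record<string, unknown> = {};
            for (const pair of node.nameValuePairs) {
                result[pair.name] = evaluate(pair.value);
            }
            return result;
        }
    }
}

// Manually written tree for: {"boolean": false, "object": {"hello": "world"}}
const tree: Value = {
    $type: 'Object',
    nameValuePairs: [
        { $type: 'NameValuePair', name: 'boolean', value: { $type: 'BooleanValue', value: 'false' } },
        {
            $type: 'NameValuePair',
            name: 'object',
            value: {
                $type: 'Object',
                nameValuePairs: [
                    { $type: 'NameValuePair', name: 'hello', value: { $type: 'StringValue', value: 'world' } },
                ],
            },
        },
    ],
};

const plain = evaluate(tree);
console.log(JSON.stringify(plain));
```

The switch has one case per alternative of the Value rule, so extending the grammar with number and null would mean adding one case each.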

4. Practice: Add the remaining values

  • for number
  • for null
  • hint: you can set the type of terminals by adding a returns statement:
    terminal NUMBER returns number: ...
    

Here is the solution.

Happy coding!
