Ioannis Panagopoulos blog

Tutorials on HTML5, Javascript, WinRT and .NET

Domain Specific Languages with M - Part 2/3 Attributes and Productions

by Ioannis Panagopoulos

In the previous post we covered the basics of the M language as they apply to developing a Domain Specific Language (DSL). In this post we will extend the syntax of that DSL by adding attributes and production rules.

One interesting feature is the ability to use attributes to define syntax highlighting in the left panel of Intellipad (where your input text appears). These attributes decorate the tokens of the syntax and are written as @{Classification["Keyword"]}, @{Classification["String"]}, etc.

 

module Renting
{
    language RentLanguage
    {
        interleave whitespace = (" " | "\r" | "\n")+ | Comment;
       
        syntax Main= Charges+;
        syntax Charges=ChargeStartToken "(" ChargeName ")" ChargeBody ChargeEndToken;
        syntax ChargeBody=ChargeRule+;
        syntax ChargeRule=ForToken PeriodAmount Period ChargeToken CurrencyAmount CurrencyToken "per" Period "."
                         |EventuallyToken ChargeToken CurrencyAmount CurrencyToken PerToken Period ".";
        token PeriodAmount=('0'..'9')+;
       
        @{Classification["Keyword"]} 
        token Period="week"|"weeks"|"month"|"months"|"days"|"day"|"year"|"years";
        token CurrencyAmount=('0'..'9')+ DecimalSeparator ('0'..'9')* | ('0'..'9')+;
        token ChargeName=QuotedIdentifier;
        token DecimalSeparator=",";
        @{Classification["Keyword"]} 
        token ChargeStartToken="Scenario";
        @{Classification["Keyword"]} 
        token ChargeEndToken="End";
        @{Classification["Keyword"]} 
        token ForToken="For";
        @{Classification["Keyword"]} 
        token EventuallyToken="Eventually";
        @{Classification["Keyword"]} 
        token CurrencyToken="$";
        @{Classification["Keyword"]} 
        token ChargeToken="charge";
        @{Classification["Keyword"]} 
        token PerToken="per";
        @{Classification["Comment"]} 
        token Comment = "//" !("\r" | "\n")+;
        @{Classification["String"]} 
        token QuotedIdentifier = '"' !('\r' | '\n' | '"')+ '"';
    }
}

 

Up to this point, we have written the syntax of the DSL and used Intellipad to verify that a demo input text is recognized as we would expect (see this post). We have then used attributes to provide syntax highlighting. Take for example the following snapshot of a successful parse of the input renting scenario:

 

(Note the syntax highlighting on the left panel)

 

So your requirements analyst can start writing renting rules in Notepad, while you figure out how to persist them in the database or how to calculate on the fly what somebody should pay for an item under a renting scenario that will be written in the future. By specifying the DSL syntax, you have taken the first step. The next step is to produce, when parsing the input text, something more meaningful than the recognized literals you see on the right-hand side of Intellipad.

To achieve this you need to decorate your syntax with “production rules”. Those rules are usually interpreted as: “when you match this syntax, instead of the usual output, produce this”. For example:

 

syntax Charges=ChargeStartToken "(" n:ChargeName ")" b:ChargeBody ChargeEndToken
                      => RentCharge{n,valuesof(b)};

 

The prefixes n: and b: are aliases for the parts of the match that the production needs. You start a production rule with the “=>” symbol. The production rule above is interpreted as follows: “When the syntax rule Charges is matched, output RentCharge and, within “{}”, output the production of ChargeName, followed by a comma and then the values contained in ChargeBody”. Note that valuesof(b) places the values of b directly inside RentCharge rather than nesting b as a single element.
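To make the effect of valuesof more concrete, here is a rough analogy in Python. It is not M and says nothing about how the M runtime works internally; node, rent_charge_nested and rent_charge_flat are hypothetical names made up for this illustration. The sketch only shows the difference between keeping the body as one nested child and splicing its children into the parent.

```python
# Rough Python analogy of "=> RentCharge{n, valuesof(b)}".
# All names here are hypothetical; this is not how M is implemented.

def node(label, *children):
    """Build a (label, children) pair standing in for an output node."""
    return (label, list(children))

def rent_charge_nested(n, b):
    # Analogous to RentCharge{n, b}: the body stays wrapped as one child.
    return node("RentCharge", n, b)

def rent_charge_flat(n, b):
    # Analogous to RentCharge{n, valuesof(b)}: the children of b are
    # spliced directly into the RentCharge node.
    _, body_children = b
    return node("RentCharge", n, *body_children)

# An unlabeled body node holding two rules, similar to "=> {valuesof(r)}".
body = node(
    None,
    node("ChargeRule", "1", "m", "10,00", "d"),
    node("ChargeRule", "-1", "-1", "20", "m"),
)

nested = rent_charge_nested('"Monthly"', body)
flat = rent_charge_flat('"Monthly"', body)

print(len(nested[1]))  # 2: the name plus one wrapped body
print(len(flat[1]))    # 3: the name plus the two rules themselves
```

In the flat version the rules become direct children of RentCharge, which is the shape of the final output shown later in this post.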


The whole syntax, with the production rules that produce the desired output, is given below:
 

module Renting
{
    language RentLanguage
    {
        interleave whitespace = (" " | "\r" | "\n")+ | Comment;
 
        syntax Main = c:Charges+ =>RentCharges{valuesof(c)};
        syntax Charges = ChargeStartToken "(" n:ChargeName ")" b:ChargeBody ChargeEndToken =>RentCharge{n,valuesof(b)};
        syntax ChargeBody = r:ChargeRule+ =>{valuesof(r)};
        syntax ChargeRule = ForToken pa:PeriodAmount pr:Period ChargeToken ea:CurrencyAmount CurrencyToken "per" fp:Period "."
                        =>ChargeRule{pa,pr,ea,fp}
                        |EventuallyToken ChargeToken ea:CurrencyAmount CurrencyToken PerToken fp:Period "."
                        =>ChargeRule{"-1","-1",ea,fp};
       
        token PeriodAmount=('0'..'9')+;
        @{Classification["Keyword"]} 
        token Period="week"=>"w"|"weeks"=>"w"|"month"=>"m"|"months"=>"m"|"days"=>"d"|"day"=>"d"|"year"=>"y"|"years"=>"y";
        token CurrencyAmount=('0'..'9')+ DecimalSeparator ('0'..'9')* | ('0'..'9')+;
        token ChargeName=q:QuotedIdentifier=>q;
        token DecimalSeparator=",";
        @{Classification["Keyword"]} 
        token ChargeStartToken="Scenario";
        @{Classification["Keyword"]} 
        token ChargeEndToken="End";
        @{Classification["Keyword"]} 
        token ForToken="For";
        @{Classification["Keyword"]} 
        token EventuallyToken="Eventually";
        @{Classification["Keyword"]} 
        token CurrencyToken="$";
        @{Classification["Keyword"]} 
        token ChargeToken="charge";
        @{Classification["Keyword"]} 
        token PerToken="per";
        @{Classification["Comment"]} 
        token Comment = "//" !("\r" | "\n")+;
        @{Classification["String"]} 
        token QuotedIdentifier = '"' n:(!('\r' | '\n' | '"')+) '"'=>n;
    }
}

 

The best way to understand the grammar above is to omit some of the production rules and see how the output changes. With all the production rules in place, the grammar produces, for the following input:
 

 

//Renting scenario 1
Scenario ("Monthly")
    For 1 month charge 10,00$ per day.
    For 2 months charge 5,50$ per week.
    Eventually charge 20$ per month.
End

 

The following output:

 

RentCharges{
 RentCharge{"Monthly",
    ChargeRule{"1","m","10,00","d"},
    ChargeRule{"2","m","5,50","w" },
    ChargeRule{"-1","-1","20","m"} }
}

The output is now in a form that can be processed more easily, and we will do exactly that with a program written in C#.
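As a small illustration of why this form is easier to work with, the sketch below turns the four string fields of each ChargeRule into typed values. It is written in Python purely for illustration and is not the C# program this series builds. The ChargeRule record, the to_rule helper and the UNIT_NAMES table are assumptions of this sketch. The only facts taken from the grammar are the "," decimal separator, the one-letter period codes, and the use of "-1" for the Eventually rule.

```python
from decimal import Decimal
from typing import NamedTuple, Optional

# One-letter codes emitted by the Period token productions.
UNIT_NAMES = {"d": "day", "w": "week", "m": "month", "y": "year"}

class ChargeRule(NamedTuple):
    """Hypothetical typed view of one ChargeRule{pa,pr,ea,fp} node."""
    period_amount: Optional[int]  # None for the "Eventually" rule
    period_unit: Optional[str]    # None for the "Eventually" rule
    amount: Decimal               # charge amount
    per_unit: str                 # unit the amount is charged per

def to_rule(pa, pr, ea, fp):
    # "-1" marks the open-ended "Eventually" alternative of the grammar.
    open_ended = pa == "-1"
    return ChargeRule(
        None if open_ended else int(pa),
        None if open_ended else UNIT_NAMES[pr],
        Decimal(ea.replace(",", ".")),  # grammar uses "," as decimal separator
        UNIT_NAMES[fp],
    )

# The three rules from the output above, copied as plain string tuples.
raw_rules = [
    ("1", "m", "10,00", "d"),
    ("2", "m", "5,50", "w"),
    ("-1", "-1", "20", "m"),
]

rules = [to_rule(*fields) for fields in raw_rules]
for rule in rules:
    print(rule)
```

Representing the open-ended rule with None instead of "-1" is a design choice of this sketch; it keeps the sentinel value out of any later arithmetic.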

The grammar for English can be downloaded here and for Greek here.
