Ioannis Panagopoulos blog

Tutorials on HTML5, Javascript, WinRT and .NET

Domain Specific Languages with M - Part 1/3 Defining the syntax

by Ioannis Panagopoulos

Say you are implementing software for a company that rents items, and you are presented with some bizarre and complex logic for how renting fees are calculated. To be more specific, requirements come to you in the following form:

“I want a renting fee scenario where an item for the first month is charged 10$ per day. Then, the same item costs 5$ per week for the next two months. Finally, after this period of three months the fee will be 20$ per month. But this is just one scenario…”

Clearly your inclination will be to go talk to the guy, help him write all his scenarios and then come up with a clever algorithm that calculates the fees and (hopefully) a UI where the guy can add more scenarios and persist them in the database.

Or you can implement a DSL: one that allows somebody to express such complex renting-fee scenarios in an easy-to-use language. Hopefully this language will offer you more expressiveness and flexibility than a nicely constructed UI. From the moment you realize that renting fees will need requirement analysis, you come up with the following DSL:


Scenario ("Monthly")
    For 1 month charge 10$ per day.
    For 2 months charge 5$ per week.
    Eventually charge 20$ per month.
End


Clearly this is something you can give to the guy, asking him to express his renting fee scenarios in this language. While he writes his specs, you have work to do:


  • Create a database that stores those scenarios in some form.
  • Implement the algorithm that calculates the fees for any arbitrary scenario you may receive.
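As a rough sketch of the second task, here is how a fee could be computed in Python once a scenario has been turned into data. The Rule class, the rental_fee function, the 30-day month and the decision to charge every started billing unit are all assumptions made for illustration; the requirements above do not specify them.

```python
from dataclasses import dataclass
from math import ceil
from typing import Optional

# Assumed period lengths in days; the requirements do not define them.
DAYS = {"day": 1, "week": 7, "month": 30, "year": 365}


@dataclass
class Rule:
    duration_days: Optional[int]  # None marks the open-ended "Eventually" rule
    amount: float                 # fee per billing unit
    unit_days: int                # length of one billing unit in days


def rental_fee(rules, days_rented):
    """Walk the rules in order, charging every started billing unit."""
    total = 0.0
    remaining = days_rented
    for rule in rules:
        if remaining <= 0:
            break
        if rule.duration_days is None:
            span = remaining
        else:
            span = min(remaining, rule.duration_days)
        total += ceil(span / rule.unit_days) * rule.amount
        remaining -= span
    return total


# The "Monthly" scenario from the text, expressed as data.
monthly = [
    Rule(1 * DAYS["month"], 10.0, DAYS["day"]),
    Rule(2 * DAYS["month"], 5.0, DAYS["week"]),
    Rule(None, 20.0, DAYS["month"]),
]
```

Under these assumptions, rental_fee(monthly, 10) charges ten days at 10$ each, and a longer rental simply flows into the weekly and then the monthly rule.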


It all boils down to finding a way to parse the language and extract meaningful information from it. How does “M” facilitate this process? As you will see, you get a lot of out-of-the-box goodies by implementing this DSL in “M”.


 Defining the syntax in M

Open Intellipad and select “DSL Grammar Mode”:



Create a text file that contains a demo scenario like the one we have written above. Now go ahead and select “Open File as Input and Show Output”:



This will split your window into three panes. On the left you have your demo program in your new DSL. In the middle you write your DSL’s syntax. On the right you get what “M” understands from your syntax and the demo program. Go ahead and write the following in the middle pane (where the syntax goes):


module RentingApp {
    language RentLanguage {
        syntax Main=any*;
    }
}


That is, you define a module named “RentingApp” which contains the syntax of the language “RentLanguage”. The syntax consists of a single rule that matches everything: “Main” contains zero or more occurrences of any character. The moment you write these lines, the framework understands the input language and displays it on the right (note that the result is a set of single characters):
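To see why the output is just a set of single characters, here is a loose Python analogy (not what “M” runs internally): a rule that accepts any character, zero or more times, gives you back the input one character at a time. The function name match_any_star is made up for this illustration.

```python
def match_any_star(text):
    """Mimic `syntax Main = any*;`: accept everything, one character per match."""
    return list(text)


# Every character, spaces included, becomes its own element.
characters = match_any_star("For 1 month")
```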



Of course this is of little help. Let’s make the syntax more specific by providing the following rules:



module RentingApp {
    language RentLang {
        interleave whitespace = (" " | "\r" | "\n" | "\t")+;
        syntax Main=RentCharge*;
        syntax RentCharge="Scenario" "(" ChargeScenarioName ")" ChargeRules "End";
        syntax ChargeRules=(ChargeRule ".")*;
        token ChargeRule= "For" !('.')+ | "Eventually" !('.')+;
        syntax ChargeScenarioName=QuotedIdentifier;
        token QuotedIdentifier = '"' !('\r' | '\n' | '"')+ '"';
    }
}


The first thing to note is the “interleave…” directive, which tells the parser to ignore any spaces, line breaks or tabs between the parts of our language.

The <Main> part consists of zero or more <RentCharge> parts. Each <RentCharge> part begins with the word “Scenario” and an opening “(”, continues with a <ChargeScenarioName> part followed by a closing “)”, then has a <ChargeRules> part and finally ends with the word “End”.

The <ChargeScenarioName> part is a <QuotedIdentifier>, which matches one or more characters enclosed in double quotes, excluding “\r”, “\n” and the double quote itself.

The <ChargeRules> part consists of zero or more <ChargeRule> parts, each followed by a “.”.

A <ChargeRule> part starts either with the word “For” or with the word “Eventually”, followed in both cases by one or more characters other than “.”.
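For readers more used to regular expressions, the rules just described can be approximated in Python as below. This is only a sketch for comparison, not how “M” works internally; the names SCENARIO, RULE and parse_scenarios are invented here, and \s* stands in for the interleaved whitespace.

```python
import re

# "Scenario" "(" QuotedIdentifier ")" (ChargeRule ".")* "End"
SCENARIO = re.compile(
    r'Scenario\s*\(\s*"([^"\r\n]+)"\s*\)'
    r'((?:\s*(?:For|Eventually)[^.]+\.)*)\s*End'
)
# A single ChargeRule: "For" or "Eventually", then anything except "."
RULE = re.compile(r'((?:For|Eventually)[^.]+)\.')


def parse_scenarios(text):
    """Return a list of (scenario name, [raw rule text, ...]) pairs."""
    result = []
    for match in SCENARIO.finditer(text):
        name, body = match.group(1), match.group(2)
        result.append((name, RULE.findall(body)))
    return result
```

Just like the grammar at this stage, the sketch only splits a scenario into its name and its raw rule sentences; it does not yet understand what each rule says.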

This is practically how we construct the syntax of the DSL. The full syntax supporting the language we are building for the renting scenarios is as follows:


module Renting {
    language RentLanguage {
        interleave whitespace = (" " | "\r" | "\n")+ | Comment;
        syntax Main= Charges+;
        syntax Charges=ChargeStartToken "(" ChargeName ")" ChargeBody ChargeEndToken;
        syntax ChargeBody=ChargeRule+;
        syntax ChargeRule=ForToken PeriodAmount Period ChargeToken CurrencyAmount CurrencyToken PerToken Period "."
                         |EventuallyToken ChargeToken CurrencyAmount CurrencyToken PerToken Period ".";
        token PeriodAmount=('0'..'9')+;
        token Period="week"|"weeks"|"month"|"months"|"days"|"day"|"year"|"years";
        token CurrencyAmount=('0'..'9')+ DecimalSeparator ('0'..'9')* | ('0'..'9')+;
        token ChargeName=QuotedIdentifier;
        token DecimalSeparator=",";
        token ChargeStartToken="Scenario";
        token ChargeEndToken="End";
        token ForToken="For";
        token EventuallyToken="Eventually";
        token CurrencyToken="$";
        token ChargeToken="charge";
        token PerToken="per";
        token Comment = "//" !("\r" | "\n")+;
        token QuotedIdentifier = '"' !('\r' | '\n' | '"')+ '"';
    }
}


As you can see, we do not hard code any of the keywords of the language (such as “Scenario”) within the syntax rules; each one gets its own token. We do that to be able to easily change the spoken language of our DSL (for example from English to Greek). The syntax above recognizes all the possible sentences that will be written for the renting scenarios and, as you may have noticed from the first rule, it accepts more than one renting scenario at once.
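To make the shape of this grammar more concrete, here is a hand-written Python sketch that accepts roughly the same sentences and returns structured rules. It is an illustration under assumptions, not the output “M” produces: the ENGLISH keyword table, the Parser class and the dictionary layout of each rule are all invented for this example. The keyword table plays the part of the separate tokens, so a localized vocabulary would be passed in as a different table.

```python
import re

# Keyword table mirroring the grammar's literal tokens (hypothetical layout).
ENGLISH = {
    "start": "Scenario", "end": "End", "for": "For",
    "eventually": "Eventually", "charge": "charge", "per": "per",
    "currency": "$",
    "periods": {"day", "days", "week", "weeks",
                "month", "months", "year", "years"},
}

# Comments and whitespace play the role of the grammar's interleave rule.
TOKEN = re.compile(r'//[^\r\n]*|\s+|"[^"\r\n]+"|\d+(?:,\d*)?|[^\W\d_]+|[().$]')


def tokenize(text):
    tokens, pos = [], 0
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {text[pos]!r} at offset {pos}")
        token = match.group()
        if not (token.startswith("//") or token.isspace()):
            tokens.append(token)
        pos = match.end()
    return tokens


class Parser:
    def __init__(self, tokens, kw):
        self.tokens, self.pos, self.kw = tokens, 0, kw

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def take(self, expected=None, pattern=None):
        token = self.peek()
        ok = token is not None
        if ok and expected is not None:
            ok = token == expected
        if ok and pattern is not None:
            ok = re.fullmatch(pattern, token) is not None
        if not ok:
            raise SyntaxError(f"unexpected token {token!r} at index {self.pos}")
        self.pos += 1
        return token

    def period(self):
        token = self.take()
        if token not in self.kw["periods"]:
            raise SyntaxError(f"unknown period {token!r}")
        return token

    def price(self):
        # ChargeToken CurrencyAmount CurrencyToken PerToken Period "."
        self.take(self.kw["charge"])
        amount = float(self.take(pattern=r"\d+(?:,\d*)?").replace(",", "."))
        self.take(self.kw["currency"])
        self.take(self.kw["per"])
        unit = self.period()
        self.take(".")
        return amount, unit

    def rule(self):
        count = span = None                      # stays None for "Eventually"
        if self.peek() == self.kw["for"]:
            self.take()
            count = int(self.take(pattern=r"\d+"))
            span = self.period()
        else:
            self.take(self.kw["eventually"])
        amount, unit = self.price()
        return {"count": count, "span": span, "amount": amount, "per": unit}

    def scenario(self):
        self.take(self.kw["start"])
        self.take("(")
        name = self.take(pattern=r'"[^"\r\n]+"')[1:-1]
        self.take(")")
        rules = [self.rule()]                    # ChargeBody = ChargeRule+
        while self.peek() != self.kw["end"]:
            rules.append(self.rule())
        self.take(self.kw["end"])
        return name, rules


def parse(text, kw=ENGLISH):
    parser = Parser(tokenize(text), kw)
    scenarios = [parser.scenario()]              # Main = Charges+
    while parser.peek() is not None:
        scenarios.append(parser.scenario())
    return scenarios
```

Calling parse on a scenario text returns a list of (name, rules) pairs, and a scenario that is missing its closing “End” raises a SyntaxError. One small difference from the grammar: this sketch also tolerates an empty “//” comment.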

You can download the aforementioned grammar for the English language here and a slightly different version for the Greek language here.

In the next post we will decorate our DSL with attributes and create productions to alter the output of the DSL parser.


