Scala and Adding New Syntax

One interesting thing about some languages is their support for adding new syntax. While all languages have the ability to add new functions or types, some have specific properties that make it easy to add what looks like new built-in syntax.

Scala is an Object Oriented language. You can declare classes and objects, do inheritance and composition, and all the other things you might expect from an OO language. Scala is also a Functional language because functions are first-class values: they can be stored in variables, passed as arguments, and returned from other functions. And when I say Scala is an OO language I really mean it: everything is an Object. Even functions are Objects. (Chew on that one for a bit.)

Scala also lets you drop the parentheses on method calls that take a single argument. (Note: this applies only to method calls on an object, not to standalone functions.) There is a very practical reason for this. Take the following example:

1 + 2

This is a very nice way to write an addition operation. In reality what’s happening is:

1.+(2)

1 is an object and + is a method on that object that takes a single parameter. Applying the previous rule, we get to remove the dot and the parentheses, which allows us to write our previous example as 1 + 2.

The good news is that this rule is applied consistently across the language: any method call can optionally omit the dot, and any call to a method that takes a single parameter can omit the parentheses around its argument. These features make it pretty easy to emulate the built-in syntax of a language.
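
The idea is not unique to Scala. As a rough comparison, here is a small sketch in Ruby (the language used for the Coffee DSL later on), where operators are also ordinary methods. The Money class is a made-up example for illustration and is not part of the original post.

```ruby
# Operators are ordinary methods, so the infix form and the
# explicit dot-and-parentheses form are the same call.
sum_infix  = 1 + 2
sum_method = 1.+(2)
puts sum_infix == sum_method   # prints "true"

# Hypothetical class: defining a method named + is all it takes
# to make the infix operator work on our own type.
class Money
  attr_reader :cents

  def initialize(cents)
    @cents = cents
  end

  def +(other)
    Money.new(cents + other.cents)
  end
end

total = Money.new(150) + Money.new(275)
puts total.cents   # prints 425
```

Nothing special was registered with the language here: the "operator" is just a method whose name happens to be a symbol.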

Your Own While Loop

Let’s say I want to write my own while loop:

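// Both parameters are passed by name (=>), so they are re-evaluated on every use.
def mywhile(condition: => Boolean)(command: => Unit): Unit = {
  if (condition) {
    command
    mywhile(condition)(command) // tail call: loop again
  }
}

var x = 1
mywhile(x < 100000) { println(x); x += 1 }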

As you can see, I end up calling mywhile the same way I would call a built-in while. It is implemented as a tail-recursive function: if the condition is met, the command is executed, and the function then calls itself to continue. Both x < 100000 and the block are passed by name (the => in the parameter types), so Scala wraps each in an anonymous function and re-evaluates it every time it is used instead of evaluating it once at the call site.
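
To make the "anonymous function" part visible, here is a minimal sketch of the same loop in Ruby, where the wrapping has to be written out by hand. The name my_while is hypothetical; the condition is an explicit lambda and the body is a block. Ruby does not guarantee tail-call elimination, so the sketch keeps the loop count small.

```ruby
# condition is a lambda that is called again on every pass, which is
# what Scala's by-name parameter does implicitly.
# The block plays the role of `command`.
def my_while(condition, &command)
  return unless condition.call
  command.call
  my_while(condition, &command) # recurse to continue the loop
end

x = 1
seen = []
my_while(-> { x < 5 }) do
  seen << x
  x += 1
end
p seen   # prints [1, 2, 3, 4]
```

The call site is noisier than the Scala version because the lambda is spelled out, which is exactly the boilerplate that by-name parameters hide.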

Your Own Do...While Loop

A while loop can be built using just a single function. What if you want to create a do...while loop instead? In this case you can make use of the OO/functional hybrid.


// Holds the loop body; aslongas runs it and then checks the condition.
class Repeater(command: => Unit) {
  final def aslongas(condition: => Boolean): Unit = {
    command
    if (condition) aslongas(condition)
  }
}

def mydo(command: => Unit): Repeater = {
  new Repeater(command)
}

var x = 0
mydo {
  x += 1
  println(x)
} aslongas (x < 100000)

In this case I use recursion again to do the looping, but I bind the command to an object and give that object an aslongas method that runs the command and then checks the looping condition. The mydo function bootstraps an instance of the Repeater class. Scala gives us the ability to use functions and objects wherever each makes sense.
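
The same object-plus-function shape can be sketched in Ruby. This is only an illustration under a few assumptions: the names mirror the Scala version, the command and condition are blocks, and a plain loop replaces the recursion because Ruby does not guarantee tail-call elimination.

```ruby
# Holds the loop body until aslongas supplies the condition.
class Repeater
  def initialize(&command)
    @command = command
  end

  # Run the command first, then test the condition (do...while order).
  def aslongas(&condition)
    loop do
      @command.call
      break unless condition.call
    end
  end
end

# Bootstraps a Repeater so the call site reads like a loop.
def mydo(&command)
  Repeater.new(&command)
end

y = 0
mydo { y += 1 }.aslongas { y < 3 }
puts y      # prints 3

runs = 0
mydo { runs += 1 }.aslongas { false }
puts runs   # prints 1
```

The second call shows the do...while property: the body runs once even though the condition is false from the start.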

Why Should You Care?

Ok, so you're not going to write your own while loops. The language has them built-in already. But what this allows you to see is how you can add new "syntax". That ability makes it quite convenient and easy to write higher-order syntax to solve application specific problems or to create DSLs.

Update: Changed the until name to 'aslongas' since it really wasn't until the condition was met.

Coffee DSL Redone With Meta-Programming

In a previous post I wrote about DSLs as Jargon. I implemented a simple Coffee DSL that would allow code to parse an order written by a human and turn it into a domain model. I used a fairly basic method_missing structure to capture the values.

There’s a much better way to do it in Ruby with meta-programming. Meta-programming allows you to write code to write code. You program your programming. In this case we can create the syntax of Coffee using a meta-programming technique.

dsl_attr :size, %w(venti grande tall)

This is us programming the class to say: “If someone calls a method venti, grande, or tall on our object they mean that they are telling us the size of the coffee, so store that value as the size”. So now we can write our Coffee class like this:

# CoffeeDSL.rb
# This is the input from the user, likely read from a file
# or input through a user interface of some sort
CoffeeInput = "venti nonfat whip latte"

class Coffee
  dsl_attr :size, %w(venti grande tall)
  dsl_attr :whipped, %w(whip nowhip)
  dsl_attr :caffinated, %w(caf decaf halfcaf)
  dsl_attr :type, %w(regular latte cappachino)
  dsl_attr :milks, %w(milk nonfat soy)

  def order
    params = ''
    params += milks + ' ' if milks?
    params += caffinated + ' ' if caffinated?
    params += whipped + ' ' if whipped?
    print "Ordering coffee: #{size} #{params}#{type}\n"
  end

  def load
    # turn one line into multi-line "method calls"
    cleaned = CoffeeInput.gsub(/\s+/, "\n")
    self.instance_eval(cleaned)
  end
end

We are essentially configuring the class in code. We could add extra options as well, such as a default value, required validation, any number of things. We then just need to implement dsl_attr using meta-programming. That can be done by defining it on Ruby’s Module class, which makes it available inside every class in the system. Because the Coffee class body calls dsl_attr while it is being defined, this definition has to be loaded before the Coffee class.


class Module
  def dsl_attr(param_name, values)
    attr param_name
    class_eval "def #{param_name}?; @#{param_name}; end"
    values.each do |val|
      define_method("#{val}") do
        instance_eval %{
          @#{param_name} = '#{val}'
        }
      end
    end
  end
end

Now when you run the code it captures all of the values that are parsed from the input and puts them into your object as meaningful values.

c = Coffee.new
c.load
c.order
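
The pieces above can be assembled into one self-contained script. What follows is a sketch rather than the original implementation: the default: option is a hypothetical extension along the lines of the "default value" idea mentioned earlier, load takes the input as an argument and dispatches each word with public_send instead of instance_eval, and summary is a made-up helper that returns the order as a string.

```ruby
class Module
  # default: is a hypothetical extension, not part of the original dsl_attr.
  def dsl_attr(param_name, values, default: nil)
    ivar = "@#{param_name}"
    # Reader: the stored value, or the default when nothing was set.
    define_method(param_name) do
      instance_variable_defined?(ivar) ? instance_variable_get(ivar) : default
    end
    # Predicate: true when a value (or default) is available.
    define_method("#{param_name}?") { !public_send(param_name).nil? }
    # One method per vocabulary word; calling it records the word.
    values.each do |val|
      define_method(val) { instance_variable_set(ivar, val) }
    end
  end
end

class Coffee
  dsl_attr :size, %w(venti grande tall), default: 'tall'
  dsl_attr :type, %w(regular latte cappachino), default: 'regular'
  dsl_attr :milks, %w(milk nonfat soy)

  # Call the method named by each word of the input.
  def load(input)
    input.split.each { |word| public_send(word) }
    self
  end

  def summary
    [size, (milks if milks?), type].compact.join(' ')
  end
end

puts Coffee.new.load('venti nonfat latte').summary   # prints "venti nonfat latte"
puts Coffee.new.load('soy').summary                  # prints "tall soy regular"
```

Note that Module is reopened before Coffee is defined, since the class body calls dsl_attr immediately.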

I did the same DSL in Groovy and thought I could attempt to do it more justice using meta-programming as well. In Groovy, meta-programming is done with the ExpandoMetaClass – no, I didn’t make that up. Each Class has a metaClass property that gives you access to that type’s ExpandoMetaClass instance. You can then add properties and methods and whatnot to it. This has the effect of making the properties or methods callable on an instance of that type.


ExpandoMetaClass.enableGlobally() // have to do this to get inheritance of dslAttr

Object.metaClass.dslAttr << {String param_name, values ->
    def clazz = delegate
    clazz.metaClass."${param_name}" = null
    values.each() { val ->
        clazz.metaClass."${val}" << {-> clazz."${param_name}" = "${val}" }
    }
}

class Coffee {
    def Coffee() {
        dslAttr("size", ['venti', 'tall', 'grande'])
        dslAttr("whipped", ['whip', 'nowhip'])
        dslAttr("caffinated", ['caf', 'decaf', 'halfcaf'])
        dslAttr("type", ['regular', 'latte', 'cappachino'])
        dslAttr("milks", ['milk', 'nonfat', 'soy'])
    }

    def order() {
        def params = ''
        if (null != getMilks()) params += "${getMilks()} "
        if (null != getCaffinated()) params += "${getCaffinated()} "
        if (null != getWhipped()) params += "${getWhipped()} "
        println "Ordering coffee: ${getSize()} ${params}${getType()}\n"
    }

    def load(String input) {
        // turn one line into multi-line "method calls"
        def cleaned = input.split(/\s+/)
        cleaned.each() { meth -> this.&"${meth}"() }
    }
}

def c = new Coffee()
c.load("venti nonfat whip latte")
c.order()

I’m not sure if there is a better way to do this or not. Ideally I would like to have the dslAttr add something to the Coffee metaClass instead of just adding stuff to the instances, but this seems to do the trick for now.

The Ruby and Groovy implementations become fairly similar at this point. It’s a great way to reduce the amount of boilerplate code you would normally need to write to implement this kind of thing in less dynamic languages.

When Do DSLs Make Sense?

Domain Specific Languages (DSLs) are discussed all the time. There is a lot of writing about implementing DSLs, and many dynamic languages like Ruby and Groovy make it fairly easy to do. But I rarely see any discussion of how you figure out when it makes sense to implement a DSL.

Some people would say that DSLs are just another form of abstraction. Like a framework to help solve a specific technical problem, a DSL is a “framework” for solving a specific Domain problem. There is often a general discussion about the trade-off of cost vs. benefits of implementing a DSL. This cost/benefit discussion would be appropriate to a general-purpose framework as well. For the vast majority of applications, taken by themselves, the answer is that it is not cost effective or appropriate to write a framework for a single application.

The cost of creating a generalized solution could be appropriate when you have multiple (let’s say at least three) applications that can utilize it. Beyond that rule, a high volume of rules or highly volatile changes can also justify it. Do you have different rules, behaviors, or data that apply on a per-state and per-product basis in an insurance company? Do you need to be able to quickly change rules or variables based on financial market data? When one of these exceptions comes into play, the benefit of the extra work of creating a DSL is high, so the cost becomes more acceptable. Additionally, some dynamic languages (Ruby, Groovy, and Lisp variants, for example) make implementing DSLs much easier. When you use those languages the cost is often reduced, so the benefit doesn’t need to be as great for the payoff to work.

Rules for when to consider a DSL

  • You have more than two applications that could use the same DSL
  • Highly volatile changes are expected – multiple times a day perhaps
  • You have a very large number of cases that could grow exponentially as business or product lines grow
  • You are using a dynamic language that makes it easy to implement

Coffee DSL in Groovy

I thought I’d follow up on my previous post by writing the Coffee Domain Specific Language in Groovy.

This is really one of my first forays into Groovy, so it’s pretty rough. It’s really just a direct translation of the Ruby code and not what I would expect to be ‘idiomatic Groovy’. I’ll try and update this once I learn some more Groovy.


// CoffeeDSL.groovy
// This is the input from the user, likely read from a file
// or input through a user interface of some sort

CoffeeInput = "venti nonfat decaf whip latte"

class Coffee
{
    def size
    def whip
    def caf
    def type
    def milk

    public invokeMethod(name)
    {
        if (['venti', 'grande'].contains(name))
            size = name
        else if (['whip', 'nowhip'].contains(name))
            whip = 'whip'.equals(name)
        else if (['caf', 'decaf', 'halfcaf'].contains(name))
            caf = name
        else if (['regular', 'latte', 'cappachino'].contains(name))
            type = name
        else if (['milk', 'nonfat'].contains(name))
            milk = name
        else
            throw new Exception("Unknown coffee information: ${name}.")
    }

    public order() {
        def params = ''
        if (milk)
            params += milk + ' '
        if (caf)
            params += caf + ' '
        if (whip)
            params += 'whip '
        println("Ordering coffee: ${size} ${params}${type}\n")
    }

    public load(input) {
        // turn one line into multi-line "method calls"
        def cleaned = input.split(/\s+/)
        instance_eval(cleaned)
    }

    public instance_eval(methods) {
        for (method in methods) {
            this.invokeMethod(method)
        }
    }
}

// this is your code which loads the DSL input and executes it
coffee = new Coffee()
coffee.load(CoffeeInput) // load the user input
coffee.order() // submit the order

This isn’t even metaprogramming. You could do this in any language, Java, C#, whatever. Everyone talks about metaprogramming in Groovy, but I have not yet found a lot of information on it. Does anyone have any pointers?

Understanding Domain Specific Languages as Jargon

Domain Specific Languages (DSLs) are the idea of creating syntaxes that model a very specific problem domain. Domain Specific Languages are not a new concept. Some people call them ‘little languages’. The Unix world has a bunch of little languages. Grep, awk, sed, lex, and yacc all exhibit features of these domain specific languages. They are little tools that do one thing well. In these cases they are often highly encoded and not in natural language of any sort. Modern domain specific languages should aim to be humane and literate in the language of the user.

Domain Specific Languages should be expressed in the language of the problem being solved. They are a higher level of abstraction than for loops and object instantiation. They are at the level of abstraction of the problem space. Neal Ford uses the example of “venti nonfat decaf whip latte”. What am I talking about if I use those terms? If you guessed coffee, then you know the Jargon of one coffee chain out there. The person listening to the order understands that you are ordering a decaf coffee drink of a certain size, with non-fat milk and whipped cream. There is a lot of shared context in the Jargon of the coffee drinker and the coffee order taker. This shared context sets the stage for a rich conversation without a lot of unnecessary noise. This is true of all Jargon.


# CoffeeDSL.rb
# This is the input from the user, likely read from a file
# or input through a user interface of some sort
CoffeeInput = "venti nonfat decaf whip latte"

class Coffee

  def method_missing(symbol)
    name = symbol.to_s
    if %w(venti grande).include?(name)
      @size = name
    elsif %w(whip nowhip).include?(name)
      @whip = 'whip'.eql?(name)
    elsif %w(caf decaf halfcaf).include?(name)
      @caf = name
    elsif %w(regular latte cappachino).include?(name)
      @type = name
    elsif %w(milk nonfat).include?(name)
      @milk = name
    else
      raise ArgumentError, "Unknown coffee information: #{name}."
    end
  end

  def order
    params = ''
    params += @milk + ' ' if @milk
    params += @caf + ' ' if @caf
    params += 'whip ' if @whip
    print "Ordering coffee: #{@size} #{params}#{@type}\n"
  end

  def load
    # turn one line into multi-line "method calls"
    cleaned = CoffeeInput.gsub(/\s+/, "\n")
    self.instance_eval(cleaned)
  end
end

# this is your code which loads the DSL input and executes it
coffee = Coffee.new
coffee.load # load the user input
coffee.order # submit the order
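
One thing to keep in mind with the version above is that instance_eval runs the input as Ruby code, which may be more power than you want to hand to user input. As a hedged alternative, the sketch below looks each word up in a table instead. The CoffeeOrder class, the VOCABULARY constant, and the values hash are names invented for this illustration; the word lists are copied from the example above.

```ruby
class CoffeeOrder
  # Vocabulary of the jargon, grouped by the attribute each word sets.
  VOCABULARY = {
    size: %w(venti grande),
    whip: %w(whip nowhip),
    caf:  %w(caf decaf halfcaf),
    type: %w(regular latte cappachino),
    milk: %w(milk nonfat)
  }.freeze

  attr_reader :values

  def initialize
    @values = {}
  end

  # Look each word up directly; nothing from the input is evaluated as code.
  def load(input)
    input.split.each do |word|
      attribute, _words = VOCABULARY.find { |_, words| words.include?(word) }
      raise ArgumentError, "Unknown coffee information: #{word}." unless attribute
      @values[attribute] = word
    end
    self
  end
end

order = CoffeeOrder.new.load("venti nonfat decaf whip latte")
puts order.values.values_at(:size, :milk, :caf, :whip, :type).join(' ')
# prints "venti nonfat decaf whip latte"
```

The trade-off is that this is plain parsing rather than meta-programming, but the jargon the user writes stays exactly the same.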

Jargon is the terminology of a specific profession or group. Does your user community or problem space have a vocabulary? Can they express the things they want out of a system using that vocabulary or Jargon? If so, there is a very real possibility that you could utilize a DSL to solve some set of problems for those users.

What about field validation?

# ValidationDSL.rb
# Input from the user that would be read in
Input = < @max_length
      return false
    end
    if @min_length and field.length < @min_length
      return false
    end
    return true
  end

  def load
    self.instance_eval(Input)
  end
end

val = ValidateDSL.new
val.load
print val.validate('foo')
print "\n"
print val.validate('')
print "\n"
print val.validate('abbbbbbbbbbbbbbbbbbbbbb')
print "\n"
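
For a complete, runnable picture of a validation DSL along these lines, here is a minimal sketch. It is built on assumptions: the rule words min_length and max_length are inferred from the instance variables in the fragment above, and the sample rules, the RULES constant, and the FieldValidator class name are all hypothetical.

```ruby
# Hypothetical rules, written in the vocabulary a user might use.
RULES = <<~DSL
  min_length 1
  max_length 10
DSL

class FieldValidator
  # DSL words: each one simply records a limit.
  def min_length(n)
    @min_length = n
  end

  def max_length(n)
    @max_length = n
  end

  # Evaluate the rules text so each line becomes a method call on this object.
  def load(rules)
    instance_eval(rules)
    self
  end

  # A field passes when it is within every limit that was configured.
  def validate(field)
    return false if @max_length && field.length > @max_length
    return false if @min_length && field.length < @min_length
    true
  end
end

val = FieldValidator.new.load(RULES)
puts val.validate('foo')                       # prints "true"
puts val.validate('')                          # prints "false"
puts val.validate('abbbbbbbbbbbbbbbbbbbbbb')   # prints "false"
```

Each line of the rules text reads like a statement in the user's own terms, and adding a new rule only means adding one more small method.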

How does your user community talk about a problem? Can they easily express what they intend with simple Jargon that they already know? Is that more natural for a power user than some complicated UI with buttons and checkboxes? Then you might have a good place to use a DSL.