Today’s focus is on scalameta. In this introductory post we’re going to see how to create a macro annotation that generates protobuf formats for case classes.
The idea is to be able to serialise any case class to protobuf just by adding a `@PBSerializable` annotation to the case class declaration. Behind the scenes, the macro generates implicit formats in the companion object; these implicit formats can then be used to serialise the case class to/from the protobuf binary format. This is quite similar to the JSON formats of play-json.
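To make this concrete, here is a sketch of the intended usage (the `Person` class and the expansion shown in comments are illustrative; the actual generated instances are covered later in this post):

```scala
import pbmeta._

@PBSerializable
case class Person(name: String, age: Int)

// The macro expands the annotation into (roughly):
// object Person {
//   implicit val pbWrites: PBWrites[Person] = ...
//   implicit val pbReads: PBReads[Person] = ...
// }
```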
In this post we’re going to cover the main principles of scalameta and how to apply them to create our own macros.
Scala.meta
Setup
Getting started with scalameta is quite straightforward. You only need to add a dependency in your `build.sbt`:

```scala
libraryDependencies += "org.scalameta" %% "scalameta" % "1.7.0"
```
Then in your code all you have to do is:

```scala
import scala.meta._
```
Macro setup
The setup to write a macro is slightly more involved. First, you need two separate projects, as it’s not possible to use macro annotations in the same project where they are defined: the macro annotations must be compiled before they can be used.
Once compiled, you don’t even need a dependency on scalameta to use your macro annotations; you only need a dependency on the project that declares them.
The setup for the macro definition project is slightly more complex as you need to enable the macro paradise plugin, but it’s just a single line to add to your `build.sbt`:

```scala
addCompilerPlugin("org.scalameta" % "paradise" % "3.0.0-M8" cross CrossVersion.full)
```
Of course you can use sbt subprojects to create one subproject for the macro definitions and one for the application that uses the macro annotations:
```scala
lazy val metaMacroSettings: Seq[Def.Setting[_]] = Seq(
  addCompilerPlugin("org.scalameta" % "paradise" % "3.0.0-M8" cross CrossVersion.full),
  scalacOptions += "-Xplugin-require:macroparadise",
  scalacOptions in (Compile, console) := Seq(), // macroparadise plugin doesn't work in repl yet.
  sources in (Compile, doc) := Nil // macroparadise doesn't work with scaladoc yet.
)

lazy val macros = project.settings(
  metaMacroSettings,
  name := "pbmeta",
  libraryDependencies += "org.scalameta" %% "scalameta" % "1.7.0"
)

lazy val app = project.settings(
  metaMacroSettings,
  libraryDependencies += "org.scalatest" %% "scalatest" % "3.0.1" % Test
).dependsOn(macros)
```
Parsing
At the heart of scalameta is a high-fidelity parser: it parses Scala code while capturing all of the context (comments, token positions, …), hence the high fidelity.
It’s easy to try out:
```scala
scala> import scala.meta._

scala> "val number = 3".parse[Stat]
res1: scala.meta.parsers.Parsed[scala.meta.Stat] = val number = 3

scala> "Map[String, Int]".parse[Type]
res2: scala.meta.parsers.Parsed[scala.meta.Type] = Map[String, Int]

scala> "number + 2".parse[Term]
res3: scala.meta.parsers.Parsed[scala.meta.Term] = number + 2

scala> "case class MyInt(i: Int /* it's an Int */)".parse[Stat]
res4: scala.meta.parsers.Parsed[scala.meta.Stat] = case class MyInt(i: Int /* it's an Int */)
```
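If you don’t want `.get` to throw on invalid input, `Parsed` can also be pattern matched. A minimal sketch:

```scala
import scala.meta._

// Parsed[T] is either a Success or an Error, so we can handle failures
// explicitly instead of calling .get (which throws on a parse error).
"val number = 3".parse[Stat] match {
  case Parsed.Success(tree)      => println(tree.syntax) // val number = 3
  case Parsed.Error(pos, msg, _) => println(s"Parse error at $pos: $msg")
}
```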
Tokens
As you can see the parser captures all the details (including the comments). It’s easy to get the captured tokens:
```scala
scala> res4.get.tokens
res5: scala.meta.tokens.Tokens = Tokens(, case, , class, , MyInt, (, i, :, , Int, , /* it's an Int */, ), )
```
Scalameta also captures the position of each token.
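For instance, reusing `res4` from above, we can print where each token sits in the source (a small sketch):

```scala
// Each token carries a Position with start/end offsets into the original input.
res4.get.tokens.foreach { token =>
  println(s"'${token.syntax}' at [${token.pos.start.offset}, ${token.pos.end.offset})")
}
```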
Trees
The structure is captured as a tree.
```scala
scala> res4.get.children
res6: scala.collection.immutable.Seq[scala.meta.Tree] = List(case, MyInt, def this(i: Int /* it's an Int */), )

scala> res6(2).children
res7: scala.collection.immutable.Seq[scala.meta.Tree] = List(, i: Int)
```
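When exploring trees it also helps to look at the raw AST shape rather than the pretty-printed source. A quick sketch:

```scala
import scala.meta._

// .syntax renders a tree back to source code, while .structure reveals the
// underlying AST nodes (handy for figuring out what to pattern match on).
val tree = "number + 2".parse[Term].get
println(tree.syntax)    // number + 2
println(tree.structure) // roughly: Term.ApplyInfix(Term.Name("number"), Term.Name("+"), Nil, Seq(Lit(2)))
```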
Transform
This is nice but it’s not getting us anywhere. It’s great to capture all these details, but we need to transform the tree in order to generate some code. This is where the `transform` method comes in.
scala> "val number = 3".parse[Stat].get.transform { | case q"val $name = $expr" => | val newName = Term.Name(name.syntax + "Renamed") | q"val ${Pat.Var.Term(newName)} = $expr" | } res8: scala.meta.Tree = val numberRenamed = 3
Quasiquotes
Here we have transformed a Tree
into another Tree
but instead of manipulating the Tree
directly (which is possible as well) we have use quasiquotes to both deconstruct the existing Tree
in the pattern match and construct a new Tree
as a result.
Quasiquote makes it much more convenient to manipulate Tree
s. The difficulty (especially at the beginning) is too get familiar with all the scalameta ASTs. Fortunately there is a very useful cheat sheet that summarises them all.
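To illustrate both directions, here is a small sketch (the `add` definition is just an example):

```scala
import scala.meta._

// Construction: build a def from a quasiquote.
val tree = q"def add(a: Int, b: Int): Int = a + b"

// Deconstruction: the same quasiquote shape used as a pattern.
val q"def $name(..$params): $tpe = $body" = tree
println(name)   // add
println(params) // List(a: Int, b: Int)
println(body)   // a + b
```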
Macros
With all this knowledge we’re now ready to enter the world of metaprogramming and write our first macro. Writing a macro is quite similar to the transformation we did above.
In fact only the declaration changes; the principle remains the same: we pattern match on the parsed tree using quasiquotes, apply some transformation and return a modified tree.
```scala
import scala.collection.immutable.Seq
import scala.meta._

class Hello extends scala.annotation.StaticAnnotation {
  inline def apply(defn: Any): Any = meta {
    defn match {
      case cls @ Defn.Class(_, _, _, ctor, template) =>
        val hello = q"""def hello: Unit = println("Hello")"""
        val stats = hello +: template.stats.getOrElse(Nil)
        cls.copy(templ = template.copy(stats = Some(stats)))
    }
  }
}
```
Here we just create an `@Hello` annotation that adds a method `hello` (which prints `"Hello"` to the standard output) to a case class.
We can use it like this:
```scala
@Hello
case class Greetings()

val greet = Greetings()
greet.hello // prints "Hello"
```
Congratulations! If you understand this, you understand scalameta macros. You can head over to the scalameta tutorial for additional examples.
PBMeta
Now that you understand scalameta macros, we are ready to discuss the PBMeta implementation, as it is built on these concepts.
It defines a `@PBSerializable` annotation that adds implicit `PBReads` and `PBWrites` instances into the companion object of the case class.
The pattern match is used to detect whether the companion object already exists or whether we have to create it. The third case handles Scala enumerations.
```scala
defn match {
  case Term.Block(Seq(cls @ Defn.Class(_, name, _, ctor, _), companion: Defn.Object)) =>
    // companion object exists
    ...
  case cls @ Defn.Class(_, name, _, ctor, _) =>
    // companion object doesn't exist
    ...
  case obj @ Defn.Object(_, name, template) if template.parents.map(_.syntax).contains("Enumeration()") =>
    // Scala enumeration
    ...
}
```
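For instance, the second case has to create the missing companion and return both definitions wrapped in a block. A simplified sketch of that branch (the generated statements are elided; it assumes `scala.collection.immutable.Seq` is in scope, as in the `Hello` macro earlier):

```scala
case cls @ Defn.Class(_, name, _, ctor, _) =>
  // the generated PBReads/PBWrites definitions (elided here)
  val implicits: Seq[Stat] = ???
  val companion = q"object ${Term.Name(name.value)} { ..$implicits }"
  // returning a Term.Block makes macro paradise emit both definitions
  Term.Block(Seq(cls, companion))
```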
Note how we check that the object extends `Enumeration`. We don’t have all the type information available at compile time (no `typer` phase runs as part of the macro expansion, which is why scalameta is quite fast). As we don’t have the whole type hierarchy available, the only check we can do is whether the object extends `Enumeration` directly. (If it extends it indirectly we’re not going to catch it! That’s probably something we could do with the semantic API.)
All the remaining code is there to generate the `PBReads` and `PBWrites` instances.
PBWrites
The `PBWrites` trait defines 2 methods (a sketch of the trait follows the list):

- `write(a: A, to: CodedOutputStream, at: Option[Int]): Unit` writes the given object `a` to the specified output stream `to` at index `at`. The index is optional and is used to compute the tag (if any).
- `sizeOf(a: A, at: Option[Int]): Int` computes the size (number of bytes) needed to encode the object `a`. If an index `at` is specified, the associated tag size is also added to the result.
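Putting that description together, the trait presumably looks as follows (a reconstruction from the description above, not the exact PBMeta source):

```scala
import com.google.protobuf.CodedOutputStream

// A sketch of the PBWrites typeclass reconstructed from the two methods described.
trait PBWrites[A] {
  def write(a: A, to: CodedOutputStream, at: Option[Int]): Unit
  def sizeOf(a: A, at: Option[Int]): Int
}
```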
Quasiquotes are used to generate these methods:
q""" implicit val pbWrites: pbmeta.PBWrites[$name] = new pbmeta.PBWrites[$name] { override def write(a: $name, to: com.google.protobuf.CodedOutputStream, at: Option[Int]): Unit = { at.foreach { i => to.writeTag(i, com.google.protobuf.WireFormat.WIRETYPE_LENGTH_DELIMITED) to.writeUInt32NoTag(sizeOf(a)) } ..${params.zipWithIndex.map(writeField)} } override def sizeOf(a: $name, at: Option[Int]): Int = { val sizes: Seq[Int] = Seq(..${params.zipWithIndex.map(sizeField)}) sizes.reduceOption(_+_).getOrElse(0) + at.map(com.google.protobuf.CodedOutputStream.computeTagSize).getOrElse(0) } } """
In case you’re wondering what the `..$` syntax is, it’s just how quasiquotes deal with sequences. Here we create a collection of `Term.Apply` to write each field into the `CodedOutputStream`; the `..$` syntax allows us to insert the whole sequence directly into the quasiquote. (Similarly there is a `...$` syntax to deal with sequences of sequences.)
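Here is a small standalone sketch of both splicing forms:

```scala
import scala.collection.immutable.Seq
import scala.meta._

// ..$ splices a sequence of trees into a single position.
val args: Seq[Term.Arg] = Seq(q"1", q"2")
println(q"f(..$args)") // f(1, 2)

// ...$ splices a sequence of sequences, e.g. multiple parameter lists.
val paramss: Seq[Seq[Term.Param]] = Seq(Seq(param"a: Int"), Seq(param"b: Int"))
println(q"def g(...$paramss): Int = a + b") // def g(a: Int)(b: Int): Int = a + b
```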
PBReads
`PBReads` instances are generated in a similar way. The idea is to generate code that extracts the field values from the `CodedInputStream` and creates a new instance of the object with the extracted fields at the end.
```scala
val fields: Seq[Defn.Var] = ctor.paramss.head.map(declareField)
val cases: Seq[Case] = ctor.paramss.head.zipWithIndex.map(readField)
val args = ctor.paramss.head.map(extractField)
val constructor = Ctor.Ref.Name(name.value)

q"""
  implicit val pbReads: pbmeta.PBReads[$name] = new pbmeta.PBReads[$name] {
    override def read(from: com.google.protobuf.CodedInputStream): $name = {
      var done = false
      ..$fields
      while (!done) {
        from.readTag match {
          case 0 => done = true
          ..case $cases
          case tag => from.skipField(tag)
        }
      }
      new $constructor(..$args)
    }
  }
"""
```
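From this generated code we can infer the shape of the `PBReads` typeclass (again a sketch, not the exact PBMeta source):

```scala
import com.google.protobuf.CodedInputStream

// The reading side only needs to turn a CodedInputStream back into an A.
trait PBReads[A] {
  def read(from: CodedInputStream): A
}
```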
IDE Support and debugging
In theory macro expansions are supported in IntelliJ IDEA. From what I experienced while developing PBMeta, it works well in simple cases (e.g. adding a method to an existing case class), and it’s great because it allows you to expand the annotated class and see the generated code, which makes it easy to debug and see what code is actually executed.
However it fails in more complex situations (e.g. creating a companion object).
In this case you’re left with inserting debug statements (i.e. `println`) in the generated code. It’s simple and effective, but don’t forget to clean them up when debugging is over.
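For example, inside the `Hello` macro above you could print the expansion before returning it (a sketch):

```scala
// Print the generated code from inside the macro; the output shows up in
// the compilation log of the project that uses the annotation.
val expanded = cls.copy(templ = template.copy(stats = Some(stats)))
println(expanded.syntax)
expanded
```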
Conclusion
Scalameta is an amazing tool: it makes metaprogramming easy and enjoyable. However there are some shortcomings you need to be aware of:
- You need to get familiar with all the quasiquote paraphernalia. There are many different terms, but once you start to know them things get much easier. Plus you can try things out in the console.
- IDE support is great … when it works. When it doesn’t, debugging isn’t easy and you’re left with adding `println` statements in your code. Not ideal!
- Scalameta doesn’t provide all the type analysis performed by the compiler. Yet we can do amazing things with the available information. Plus it’s fast (no heavy type inference needed)!
I used PBMeta as an introduction to scalameta and, without any prior knowledge, I managed to build all the functionality I wanted. I even managed to add custom field positions with the `@Pos` annotation. The only thing I missed is support for mapping a `sealed trait` to the protobuf `oneOf` structure.
For more details you can head over to PBMeta, try it out and let me know what you think in the comments below.