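The Expressive Power of Internal DSLs - Digital Romanticism

Drawing on Fowler's DSL book, this post uses closure-based samples to examine the expressive power of internal DSLs.

Introduction - About Closures

In the previous entries I used Object Scoping and Method Chaining as implementation patterns for internal DSLs. Fowler describes these patterns as follows.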

One of the main purposes for Method Chaining and Object Scoping is to narrow the scope of a function call to a single object. This narrowing both provides a limited namespace for the functions that are part of the DSL and targets all the DSL calls to a single object that can hold state relevant to the expression.

http://martinfowler.com/dslwip/InternalOverview.html#Closures

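The third technique the DSL book introduces for this purpose is the closure, which is the subject of this post. The sample code is written in Groovy.

Sample 1: Configuring a Computer

We start with a simple sample to look at the basic structure of an internal DSL built on closures. It defines a computer's processor and hard disks, and the model layer consists of only three classes.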

public class Computer {
    Processor processor
    def disks = [] as List<Disk>
}

public class Processor {
    int cores
    String type
}

public class Disk {
    int diskSize
    int diskSpeed
    String diskInterface
}

The internal DSL that assembles these objects looks like this.

    void configureComputer() { 

        computer {
            processor { p ->
                p.cores = 2
                p.type = "i386"
            }
            disk { d ->
                d.diskSize = 150
            }
            disk { d ->
                d.diskSize = 75
                d.diskSpeed = 7200
                d.diskInterface = "SATA"
            }
        }
    }

Object Scoping is used here as well: the computer, processor, and disk methods are implemented in a base class. Each method creates a model object and then calls the closure it was given. Here is the implementation of processor.

    void processor(Closure configureProcessor) {
        Processor processor = new Processor()
        configureProcessor.call(processor)
        computer.processor = processor
    }
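To see the pieces working together, here is a minimal runnable sketch of the whole arrangement. The processor method is the one shown above; the class names ComputerBuilder and MyComputerConfig and the bodies of computer and disk are my assumptions, modelled on processor, and not necessarily what the published sample does.

```groovy
// Model classes, repeated so that the sketch runs on its own.
class Computer {
    Processor processor
    List<Disk> disks = []
}

class Processor {
    int cores
    String type
}

class Disk {
    int diskSize
    int diskSpeed
    String diskInterface
}

// Hypothetical base class (Object Scoping): the DSL methods live here.
class ComputerBuilder {
    Computer computer

    void computer(Closure configureComputer) {
        computer = new Computer()
        // Bare calls to processor { } and disk { } inside the closure
        // resolve against the closure's owner, which is this builder.
        configureComputer.call()
    }

    void processor(Closure configureProcessor) {
        Processor processor = new Processor()
        configureProcessor.call(processor)
        computer.processor = processor
    }

    void disk(Closure configureDisk) {
        Disk disk = new Disk()
        configureDisk.call(disk)
        computer.disks << disk
    }
}

// The DSL text sits in a subclass, so the builder methods are in scope.
class MyComputerConfig extends ComputerBuilder {
    void configureComputer() {
        computer {
            processor { p ->
                p.cores = 2
                p.type = "i386"
            }
            disk { d ->
                d.diskSize = 150
            }
            disk { d ->
                d.diskSize = 75
                d.diskSpeed = 7200
                d.diskInterface = "SATA"
            }
        }
    }
}

def config = new MyComputerConfig()
config.configureComputer()
assert config.computer.processor.cores == 2
assert config.computer.disks.size() == 2
```

Each closure receives the freshly created model object as its parameter, so the names p and d are only visible inside the block that configures them.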

Besides limiting the scope of variables, Fowler names one more benefit of closures: they make deferred evaluation possible.
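Deferred evaluation here simply means that the body of a closure does not run where it is written, only where it is called. A small sketch, with variable names chosen purely for illustration:

```groovy
def log = []

// Defining the closure runs nothing; the body is only stored.
Closure configure = { log << "configured" }
assert log.isEmpty()

log << "before"
configure.call()   // the body runs only at this point
assert log == ["before", "configured"]
```

This is what lets a DSL method such as processor create its model object first and run the configuration block afterwards.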

Sample 2: IMAP Query

In terms of expressive power, closures combined with manipulation of the abstract syntax tree (or parse tree)*1 can do things that Method Chaining and Object Scoping never could. Fowler explains this as follows.

When you write code in a closure, that code available to be executed at some future time. Parse Tree Manipulation allows you not just to execute the code but to also examine and modify its parse tree.

http://martinfowler.com/dslwip/ParseTreeManipulation.html

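Let us think this through with the second sample.

Generating the IMAP Message

IMAP is a protocol for communicating with e-mail servers. The sample searches for e-mail. A search for messages "whose subject contains "entity framework", that were sent on or after June 23, 2008, and that do not come from the @gmail.com domain" looks like this.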

SEARCH subject "entity framework" sentsince 23-jun-2008
    not from "@gmail.com"

Here is a service, written with the internal DSL, that runs this search.

public class ImapQueryService {

    ImapChannel channel

    AstBuilder b = new AstBuilder()
    
    void doExecute() {

        ImapQuery query = new ImapQuery(channel:channel)
        condition(query, b.buildFromCode { q -> 
            q.subject == "entity framework"
            q.date >= "23-jun-2008" 
            q.from != "@gmail.com" 
        })
        query.execute()
        
    }
}

The line containing b.buildFromCode is cluttered, but the part to focus on is what comes after it.

// Excerpt from doExecute(), repeated
q.subject == "entity framework"
q.date >= "23-jun-2008" 
q.from != "@gmail.com"

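These lines follow the grammar of the host language, Groovy, to the letter, yet they have to be evaluated in a completely different way. The expression q.subject == "entity framework" must not test whether two strings are equal; it must produce the string subject "entity framework". To make that happen, we manipulate the tree that results from parsing the closure.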
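For contrast, this is what happens when the same kind of closure is simply called. The map standing in for q and its values are made up for the illustration.

```groovy
// A plain map plays the role of q; the values are hypothetical.
def message = [subject: "weekly report", from: "someone@example.com"]

def condition = { q ->
    q.subject == "entity framework"   // evaluated, result thrown away
    q.from != "@gmail.com"            // last expression: becomes the return value
}

// Ordinary evaluation only compares strings and yields a boolean.
assert condition(message) == true
```

No query text comes out of this, and the first comparison has no effect at all, which is why the sample has to work on the syntax tree instead of executing the closure.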

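Manipulating the Syntax Tree

In Groovy, the AstBuilder class converts a closure into an abstract syntax tree. This conversion can only happen at compile time, so the sample converts the closure at compile time and manipulates the resulting tree at run time (this is why the b.buildFromCode noise creeps into the code). Here is the implementation of the statically imported condition method.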

public class ImapQuery implements IConditionHandler {
    ...
    static void condition(ImapQuery query, List<ASTNode> condition) {
        ConditionVisitor visitor = new ConditionVisitor(query:query)
        condition[0].visit visitor
    }

As this shows, the parse tree comes as a list of ASTNode objects. For the closure in the example, the structure is roughly as follows.

ExpressionStatement
  - ClosureExpression
    - BlockStatement
      - ExpressionStatement
        - BinaryExpression (q.subject == "entity framework")

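The tree is analyzed with the Visitor pattern. We create a Visitor class by extending ClassCodeVisitorSupport, a support class for the GroovyClassVisitor interface, and pass it as the argument when calling the visit method of an ASTNode. The Visitor class implements the methods that correspond to each Expression and Statement.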

public class ConditionVisitor extends ClassCodeVisitorSupport {

    IConditionHandler query

    void visitExpressionStatement(ExpressionStatement statement) {
        statement.expression.visit this
    }

    void visitClosureExpression(ClosureExpression expression) {
        expression.code.visit this
    }

    void visitBlockStatement(BlockStatement statement) {
        statement.statements.each { it.visit this }
    }

    void visitBinaryExpression(BinaryExpression expression) {
        if (expression.leftExpression instanceof PropertyExpression) {
          def prop = expression.leftExpression.property.text
          def operator = expression.operation.type
          def value = expression.rightExpression.text
          query.addCondition(prop, operator, value)
        } 
    }
    
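    // Required by ClassCodeVisitorSupport; no SourceUnit is available here, so this returns null.
    SourceUnit getSourceUnit() { null }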
}
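The traversal can be tried in isolation with the sketch below. It departs from the sample in three ways, all of them my own simplifications: it extends CodeVisitorSupport, whose default methods already walk blocks and statements; it parses a source string at run time with AstBuilder.buildFromString instead of using buildFromCode; and it collects the conditions in a list (CollectingVisitor is a hypothetical name) instead of calling a query object.

```groovy
import org.codehaus.groovy.ast.CodeVisitorSupport
import org.codehaus.groovy.ast.builder.AstBuilder
import org.codehaus.groovy.ast.expr.BinaryExpression
import org.codehaus.groovy.ast.expr.PropertyExpression
import org.codehaus.groovy.control.CompilePhase

// Collects [property, operator, value] triples from binary expressions.
class CollectingVisitor extends CodeVisitorSupport {
    List conditions = []

    @Override
    void visitBinaryExpression(BinaryExpression expression) {
        if (expression.leftExpression instanceof PropertyExpression) {
            conditions << [expression.leftExpression.propertyAsString,
                           expression.operation.text,
                           expression.rightExpression.text]
        }
    }
}

// The conditions as plain source text; nothing here is executed.
def source = '''
q.subject == "entity framework"
q.date >= "23-jun-2008"
q.from != "@gmail.com"
'''

// Stop at an early compile phase; we only need the parsed tree.
def nodes = new AstBuilder().buildFromString(CompilePhase.CONVERSION, true, source)

def visitor = new CollectingVisitor()
nodes.each { it.visit(visitor) }

assert visitor.conditions == [
    ["subject", "==", "entity framework"],
    ["date", ">=", "23-jun-2008"],
    ["from", "!=", "@gmail.com"]
]
```

The comparisons are never evaluated; the visitor only reads the property name, the operator token, and the literal on the right-hand side.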

Note that the Visitor only analyzes the syntax; the actual string generation is handed back to ImapQuery.

public class ImapQuery implements IConditionHandler {
    ...    
    void addCondition(String prop, int operator, String value) {
        switch(prop) {
        case "subject":
            handleSubject(operator, value)
            break
        case "date":
            handleDate(operator, value)
            break
        case "from":
            handleFrom(operator, value)
            break
        default:
            throw new IllegalArgumentException("illegal description")
        }
    }
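The handler methods are not shown in this post. As a rough idea of what they might do, the sketch below maps each condition to a fragment of the SEARCH command. The class name ImapSearchSketch, the method toCommand, and the handler bodies are all assumptions; only the operators needed for the example query are supported. The operator codes are the constants from org.codehaus.groovy.syntax.Types.

```groovy
import org.codehaus.groovy.syntax.Types

// Hypothetical stand-in for the string-building part of ImapQuery.
class ImapSearchSketch {
    private final List<String> terms = []

    void addCondition(String prop, int operator, String value) {
        switch (prop) {
            case "subject": handleSubject(operator, value); break
            case "date":    handleDate(operator, value);    break
            case "from":    handleFrom(operator, value);    break
            default: throw new IllegalArgumentException("unknown property: " + prop)
        }
    }

    private void handleSubject(int operator, String value) {
        terms.add(prefix(operator) + 'subject "' + value + '"')
    }

    private void handleFrom(int operator, String value) {
        terms.add(prefix(operator) + 'from "' + value + '"')
    }

    private void handleDate(int operator, String value) {
        // Only "on or after" is handled in this sketch.
        if (operator != Types.COMPARE_GREATER_THAN_EQUAL) {
            throw new IllegalArgumentException("only >= is supported for date")
        }
        terms.add("sentsince " + value)
    }

    // == maps to the bare search key, != to its negation.
    private static String prefix(int operator) {
        switch (operator) {
            case Types.COMPARE_EQUAL:     return ""
            case Types.COMPARE_NOT_EQUAL: return "not "
            default: throw new IllegalArgumentException("unsupported operator")
        }
    }

    String toCommand() { "SEARCH " + terms.join(" ") }
}

def search = new ImapSearchSketch()
search.addCondition("subject", Types.COMPARE_EQUAL, "entity framework")
search.addCondition("date", Types.COMPARE_GREATER_THAN_EQUAL, "23-jun-2008")
search.addCondition("from", Types.COMPARE_NOT_EQUAL, "@gmail.com")

assert search.toCommand() ==
    'SEARCH subject "entity framework" sentsince 23-jun-2008 not from "@gmail.com"'
```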

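As you may have noticed, to keep the implementation simple there is no Semantic Model in between; the IMAP string is generated directly.

Summary

Fowler explains the benefit of Parse Tree Manipulation as follows.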

Parse Tree Manipulation allows you to express logic in your host programming language and then manipulate that expression with more flexibility than you otherwise would be able to. With that given, a driving reason to use Parse Tree Manipulation is when you want to use the fuller range of the host language to express something, rather than the pidgin of the usual internal DSL constructs.

http://martinfowler.com/dslwip/ParseTreeManipulation.html

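Whether or not closures are involved, the benefits of an internal DSL include IDE support and the ability to catch invalid code at compile time. In the IMAP query example, for instance, the search conditions written inside the closure follow Groovy's grammar completely, which rules out invalid code to some degree. At the same time, however, implicit rules have been introduced (for example, "the left-hand side must be of the form q.xxxx").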
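The compiler knows nothing about such a rule. The sample's visitor silently skips any comparison whose left-hand side is not a property, so a condition with the operands swapped would simply vanish from the query. One possible way to surface the rule, shown here as a sketch and not as part of the sample, is a visitor that rejects such expressions. StrictVisitor is a hypothetical name, and the source is parsed at run time with buildFromString to keep the example self-contained.

```groovy
import org.codehaus.groovy.ast.CodeVisitorSupport
import org.codehaus.groovy.ast.builder.AstBuilder
import org.codehaus.groovy.ast.expr.BinaryExpression
import org.codehaus.groovy.ast.expr.PropertyExpression
import org.codehaus.groovy.control.CompilePhase

// Refuses any comparison that does not start with q.<property>.
class StrictVisitor extends CodeVisitorSupport {
    List accepted = []

    @Override
    void visitBinaryExpression(BinaryExpression expression) {
        if (!(expression.leftExpression instanceof PropertyExpression)) {
            throw new IllegalArgumentException(
                "left-hand side must be q.<property>: " + expression.text)
        }
        accepted << expression.leftExpression.propertyAsString
    }
}

def parse = { String source ->
    new AstBuilder().buildFromString(CompilePhase.CONVERSION, true, source)
}

def ok = new StrictVisitor()
parse('q.subject == "entity framework"').each { it.visit(ok) }
assert ok.accepted == ["subject"]

// Perfectly valid Groovy, but it breaks the DSL's unwritten rule.
def rejected = false
try {
    parse('"entity framework" == q.subject').each { it.visit(new StrictVisitor()) }
} catch (IllegalArgumentException e) {
    rejected = true
}
assert rejected
```

The check still happens at run time, so this only turns a silent omission into an error; it does not restore compile-time safety.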

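Designing an internal DSL means designing a language. Always keep in mind that a flashy but complicated specification ends up lowering expressive power. In designing an internal DSL, striking a balance between simplicity and expressive power is crucial.

The samples used in this post are published here.

*1: I will not go into the difference between an abstract syntax tree and a parse tree here.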