Parser2D user manual

This document gives a user-level description of the Parser2D package, diva.sketch.parser2d, which provides basic two-dimensional parsing based on graph grammars. I start with an overview of the package for context, and follow with a comprehensive description of the grammar format. Finally I describe how the parser is integrated into the flow of a sketch-recognition program and give a source code example to illustrate things more concretely. For information about the design of the package structure and parsing algorithm, please see the Parser2D design document.

Overview

The Parser2D package provides basic two-dimensional parsing for structural recognition. Traditional string parsing is a special case of two-dimensional parsing, where tokens are related to each other only by adjacency. In this two-dimensional parser, tokens are related to each other by more complex relations, such as overlap amount (containment, adjacency, intersection), distance, angle, relative size, etc.

The two-dimensional parser was written to recognize structure in sketched drawings. It can just as easily be used for non-sketched diagrammatic interfaces (such as visual languages), but in this document I explain the package in terms of sketch recognition, since I expect most users will be more interested in this application.

The following image depicts a user's sketch that can be recognized as a scroll-pane (a pane that is controlled by a vertical scrollbar) using the parser in conjunction with a sketch recognizer. It is used as an example throughout the rest of this document.

[Figure: Sketched pane and scrollbar]
[Figure: The recognition and parsing flow]

Grammar

The grammar parameterizes the parser with a set of rules that are matched on the input tokens. It is written in XML (see http://www.xml.org for more info).

The following is a sample grammar that specifies a rule for the vertical scrollbar example in terms of spatial constraints between a vertical rectangle boundary, an upward triangle up-arrow, a downward triangle down-arrow, and a square handle. It then specifies a pane as a horizontal rectangle, vertical rectangle, or square. Finally it composes the vertical scrollbar and pane into a scroll pane as pictured in the previous section. The complete sample grammar for a set of GUI widgets is here, and the document type definition (DTD) is here.

<?xml version="1.0" standalone="no"?>
<!DOCTYPE grammar2d SYSTEM "grammar2d.dtd">
<grammar title="widgets" version="1.0">
  <rule type="vscroll">
    <root name="boundary" type="vrect"/>
    <relative name="upArrow" type="upTri">
      <distance rootSite="NORTH" relativeSite="NORTH"
         min="NO_MIN" max="20"/>
      <!-- Use default thresholds, i.e. 90% overlap -->
      <overlap constraint="CONTAINS"/>
      <widthRatio min=".5" max="1.1"/>
      <heightRatio min="NO_MIN" max=".3"/>
    </relative>

    <relative name="downArrow" type="downTri">
      <distance rootSite="SOUTH" relativeSite="SOUTH"
         min="NO_MIN" max="20"/>
      <overlap constraint="CONTAINS"/>
      <widthRatio min=".5" max="1.1"/>
      <heightRatio min="NO_MIN" max=".3"/>
    </relative>

    <relative name="handle" type="square">
      <distance min="NO_MIN" max="20"/>
      <overlap constraint="CONTAINS"/>
      <widthRatio min=".5" max="1.1"/>
      <heightRatio min="NO_MIN" max=".3"/>
    </relative>
  </rule>

  <rule type="pane"> <root type="hrect"/> </rule>
  <rule type="pane"> <root type="vrect"/> </rule>
  <rule type="pane"> <root type="square"/> </rule>

  <rule type="scrollpane"> <root name="pane" type="pane"/>
    <relative name="vscroll" type="vscroll">
      <distance rootSite="EAST" relativeSite="WEST" min="5" max="100"/>
      <overlap constraint="ADJACENT"/>
      <heightRatio min=".8" max="1.2"/>
    </relative>
  </rule>
</grammar>
Scrollbar, pane and scroll-pane excerpts of widget grammar

Like most context-free grammars for one-dimensional string languages, the 2D grammar specified here consists of a set of productions, each of which has a left side and a right side. However, because the input is two-dimensional, the way these productions are specified differs somewhat from a standard grammar format such as Yacc or Bison.

The rule element contains a type field that is equivalent to the left side of the production. The contents of the rule are the right side: a root object and a set of relative objects which are spatially related to the root object under a system of constraints.
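The mapping from rule elements to productions can be sketched with the JDK's standard DOM parser. This is only an illustration of the file structure, not how Parser2D itself loads a grammar; the class name GrammarOutline is hypothetical, and the sketch assumes every rule has exactly one root element, as in the sample grammar.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.SAXException;

/** Hypothetical helper (not part of Parser2D): lists each production in a grammar. */
class GrammarOutline {

    /** Returns one "left <- root relative..." line per rule element. */
    static List<String> outline(String grammarXml) {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(
                            grammarXml.getBytes(StandardCharsets.UTF_8)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("Could not read grammar XML", e);
        }
        List<String> lines = new ArrayList<>();
        NodeList rules = doc.getElementsByTagName("rule");
        for (int i = 0; i < rules.getLength(); i++) {
            Element rule = (Element) rules.item(i);
            // Left side of the production: the rule's type attribute.
            StringBuilder line = new StringBuilder(rule.getAttribute("type"));
            // Right side: the root object first, then each relative object.
            // Assumption: every rule contains one root element.
            Element root = (Element) rule.getElementsByTagName("root").item(0);
            line.append(" <- ").append(root.getAttribute("type"));
            NodeList relatives = rule.getElementsByTagName("relative");
            for (int j = 0; j < relatives.getLength(); j++) {
                Element relative = (Element) relatives.item(j);
                line.append(' ').append(relative.getAttribute("type"));
            }
            lines.add(line.toString());
        }
        return lines;
    }

    public static void main(String[] args) {
        // A cut-down grammar without the DOCTYPE line, so no DTD file is needed.
        String xml = "<grammar title=\"widgets\" version=\"1.0\">"
                + "<rule type=\"pane\"><root type=\"hrect\"/></rule>"
                + "<rule type=\"scrollpane\"><root name=\"pane\" type=\"pane\"/>"
                + "<relative name=\"vscroll\" type=\"vscroll\"/></rule>"
                + "</grammar>";
        outline(xml).forEach(System.out::println);
    }
}
```

Each printed line reads like a conventional production, with the rule type on the left and the root and relative types on the right; the spatial constraints nested inside each relative are ignored here.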

For the scrollbar object specified above, the root object is the boundary of the scrollbar and the relatives are the up-arrow, down-arrow, and handle. The choice of root is somewhat arbitrary, except that the spatial constraints relate relatives to the root only, and not to each other. This restriction makes the relations tree-structured, which greatly simplifies the parsing algorithm. For the simple examples I've considered (sketching of GUIs and mathematical equations) this restriction does not overly constrain the user.
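To see why root-only constraints simplify matching, consider the following minimal sketch. It is not the package's parsing algorithm (see the design document for that); it is a naive greedy matcher, and the names StarMatch, Token, and RelativeSpec are hypothetical. Because every constraint involves only the root and one relative, each relative can be looked up independently once a root candidate is fixed.

```java
import java.awt.geom.Rectangle2D;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.function.BiPredicate;

/** Hypothetical greedy matcher for one rule; not the Parser2D algorithm. */
class StarMatch {

    static final class Token {
        final String type;
        final Rectangle2D box;
        Token(String type, Rectangle2D box) { this.type = type; this.box = box; }
    }

    static final class RelativeSpec {
        final String name;
        final String type;
        /** Tested as constraint.test(rootBox, relativeBox). */
        final BiPredicate<Rectangle2D, Rectangle2D> constraint;
        RelativeSpec(String name, String type,
                     BiPredicate<Rectangle2D, Rectangle2D> constraint) {
            this.name = name; this.type = type; this.constraint = constraint;
        }
    }

    /**
     * Binds each relative to the first unused token that has the right type
     * and satisfies the constraint against the root. Returns empty if any
     * relative cannot be bound. No relative-to-relative checks are needed.
     */
    static Optional<Map<String, Token>> match(Token root, List<RelativeSpec> specs,
                                              List<Token> tokens) {
        Map<String, Token> bound = new LinkedHashMap<>();
        for (RelativeSpec spec : specs) {
            Token found = null;
            for (Token t : tokens) {
                if (t != root && !bound.containsValue(t) && t.type.equals(spec.type)
                        && spec.constraint.test(root.box, t.box)) {
                    found = t;
                    break;
                }
            }
            if (found == null) {
                return Optional.empty();
            }
            bound.put(spec.name, found);
        }
        return Optional.of(bound);
    }

    public static void main(String[] args) {
        Token boundary = new Token("vrect", new Rectangle2D.Double(0, 0, 20, 100));
        List<Token> tokens = Arrays.asList(boundary,
                new Token("upTri", new Rectangle2D.Double(2, 3, 16, 16)),
                new Token("square", new Rectangle2D.Double(2, 40, 16, 20)));
        // Containment stands in for the full set of constraints in the grammar.
        BiPredicate<Rectangle2D, Rectangle2D> contains = (r, c) -> r.contains(c);
        List<RelativeSpec> specs = Arrays.asList(
                new RelativeSpec("upArrow", "upTri", contains),
                new RelativeSpec("handle", "square", contains));
        System.out.println(match(boundary, specs, tokens).isPresent());
    }
}
```

A real parser would have to enumerate all consistent bindings rather than stop at the first one, but the independence of the per-relative checks is what the tree-structured restriction buys.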

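Interpreting the scrollbar example:

  1. The boundary, a vertical rectangle (vrect), is the root.
  2. The up-arrow is an upward triangle (upTri) whose NORTH site lies at most 20 units from the boundary's NORTH site. It must be contained in the boundary (under the default overlap thresholds), its width must be between 0.5 and 1.1 times the boundary's width, and its height must be at most 0.3 times the boundary's height.
  3. The down-arrow is a downward triangle (downTri) with the same constraints, except that the distance is measured between the SOUTH sites.
  4. The handle is a square with the same overlap and size constraints. Its distance constraint names no sites, so the distance is measured between the CENTER sites and must be at most 20 units.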

Based on this interpretation, the parser accepts many different scrollbars, some of which may look only vaguely scrollbar-like (such as the case where the up-arrow is 1.1 times the width of the boundary). That is the cost of using conservative constraints.

Constraints

The constraints specified in the grammar fall into four orthogonal flavors: distance, angle, overlap, and size. The table below summarizes the constraints and their parameters. All constraints have min and max fields that specify the minimum and maximum values of the constraint. These fields are required for the distance, widthRatio, heightRatio, and areaRatio constraints. They are optional for angle and overlap, which offer more abstract ways to specify the relationship for convenience.

Angle and distance constraints rely on the concept of sites. A site is a point on the bounding box of the object, one of: NORTH-WEST, NORTH, NORTH-EAST, EAST, SOUTH-EAST, SOUTH, SOUTH-WEST, WEST, or CENTER. For relations that use sites, a site that is left unspecified defaults to CENTER.
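As a rough illustration of how a site maps to a point, the sketch below computes site coordinates from a bounding box and measures a site-to-site distance. The Sites class and its enum are hypothetical, not the package's API, and the sketch assumes screen coordinates in which y grows downward, so that NORTH is the top edge of the box.

```java
import java.awt.geom.Point2D;
import java.awt.geom.Rectangle2D;

/** Hypothetical helper (not part of Parser2D) for locating sites on a box. */
class Sites {

    enum Site {
        NORTH_WEST, NORTH, NORTH_EAST, EAST, SOUTH_EAST, SOUTH, SOUTH_WEST, WEST, CENTER
    }

    /** Assumption: y grows downward, so NORTH is the minimum y of the box. */
    static Point2D sitePoint(Rectangle2D box, Site site) {
        String name = site.name();
        // The horizontal part of the name selects the x coordinate...
        double x = name.endsWith("WEST") ? box.getMinX()
                 : name.endsWith("EAST") ? box.getMaxX()
                 : box.getCenterX();
        // ...and the vertical part selects the y coordinate.
        double y = name.startsWith("NORTH") ? box.getMinY()
                 : name.startsWith("SOUTH") ? box.getMaxY()
                 : box.getCenterY();
        return new Point2D.Double(x, y);
    }

    /** Euclidean distance between a site on the root and a site on the relative. */
    static double distance(Rectangle2D root, Site rootSite,
                           Rectangle2D relative, Site relativeSite) {
        return sitePoint(root, rootSite).distance(sitePoint(relative, relativeSite));
    }

    public static void main(String[] args) {
        Rectangle2D boundary = new Rectangle2D.Double(0, 0, 20, 100);
        Rectangle2D upArrow = new Rectangle2D.Double(2, 3, 16, 16);
        double d = distance(boundary, Site.NORTH, upArrow, Site.NORTH);
        // The sample vscroll rule allows at most 20 between the two NORTH sites.
        System.out.println(d + " within limit: " + (d <= 20));
    }
}
```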

Constraint: distance
  Description: Specifies a minimum and maximum distance between sites on two objects.
  Parameters:
    Name          Description                      Default value
    rootSite      The site on the root object      CENTER
    relativeSite  The site on the relative object  CENTER

Constraint: angle
  Description: Specifies a minimum and maximum angle from the root object to the relative object, in degrees.
  Parameters:
    Name          Description                                                  Default value
    rootSite      The site on the root object                                  CENTER
    relativeSite  The site on the relative object                              CENTER
    direction     The general direction of the angle (NORTH, NORTHWEST, etc.)  --

Constraint: overlap
  Description: Specifies a minimum and maximum overlap of the relative object on the root object, as a ratio of the intersection's size to the size of the relative.
  Parameters:
    Name          Description                                                                  Default value
    constraint    The general overlap amount that is desired (CONTAINED, ADJACENT, OVERLAP)    --

Constraint: widthRatio, heightRatio, areaRatio
  Description: Specifies a minimum and maximum ratio of the {width, height, area} of the relative object to the root object.
  Parameters: --
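The overlap and size constraints can be sketched as simple range checks on bounding boxes. The code below is an illustration under stated assumptions, not the package's implementation: the class ConstraintChecks is hypothetical, "size" is taken to mean area, NO_MIN is modeled as negative infinity, and the 90% lower bound for containment comes from the comment in the sample grammar.

```java
import java.awt.geom.Rectangle2D;

/** Hypothetical constraint checks on bounding boxes; not the Parser2D code. */
class ConstraintChecks {

    /** Stand-in for the grammar's NO_MIN value: no lower bound. */
    static final double NO_MIN = Double.NEGATIVE_INFINITY;

    static boolean inRange(double value, double min, double max) {
        return value >= min && value <= max;
    }

    /** Intersection area divided by the relative's area (assumes size = area). */
    static double overlapRatio(Rectangle2D root, Rectangle2D relative) {
        Rectangle2D common = root.createIntersection(relative);
        double relativeArea = relative.getWidth() * relative.getHeight();
        if (common.isEmpty() || relativeArea <= 0) {
            return 0.0; // disjoint boxes, or a degenerate relative
        }
        return (common.getWidth() * common.getHeight()) / relativeArea;
    }

    static double widthRatio(Rectangle2D root, Rectangle2D relative) {
        return relative.getWidth() / root.getWidth();
    }

    static double heightRatio(Rectangle2D root, Rectangle2D relative) {
        return relative.getHeight() / root.getHeight();
    }

    /** Overlap and size constraints of the upArrow relative in the sample vscroll rule. */
    static boolean satisfiesUpArrowSizes(Rectangle2D boundary, Rectangle2D arrow) {
        return inRange(overlapRatio(boundary, arrow), 0.9, 1.0) // assumed containment range
                && inRange(widthRatio(boundary, arrow), 0.5, 1.1)
                && inRange(heightRatio(boundary, arrow), NO_MIN, 0.3);
    }

    public static void main(String[] args) {
        Rectangle2D boundary = new Rectangle2D.Double(0, 0, 20, 100);
        Rectangle2D smallArrow = new Rectangle2D.Double(2, 3, 16, 16);
        Rectangle2D tallArrow = new Rectangle2D.Double(2, 3, 16, 50);
        System.out.println(satisfiesUpArrowSizes(boundary, smallArrow));
        System.out.println(satisfiesUpArrowSizes(boundary, tallArrow));
    }
}
```

The first arrow passes all three checks; the second fails only the heightRatio check, since it is half as tall as the boundary.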

Parser usage

This section describes the actual code usage of Parser2D. The classes of interest at the user-level are Parser2D, ConstituentSet, and the Constituent interface and its implementors.

Parser2D is the class that performs the parsing. It is constructed with a grammar file as its argument. The parse() method is the user's primary interface to the parsing functionality of the package. It accepts a constituent set as its argument and returns a list of constituent sets (all possible interpretations) as its result.

A constituent is either "terminal" or "composite". A terminal constituent is the two-dimensional equivalent of a "token" in traditional parsing. It consists of a rectangular bounding box and a string type field. A composite constituent is a set of constituents. Its bounding box is the union of its children's bounding boxes, and its type is derived from the grammar.
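The terminal/composite distinction is a standard composite structure. The following minimal sketch shows the idea with hypothetical stand-in types (Node, Leaf, Group); these are not the package's Constituent classes, whose actual signatures are not reproduced here.

```java
import java.awt.geom.Rectangle2D;
import java.util.Arrays;
import java.util.List;

/** Hypothetical stand-ins for terminal and composite constituents. */
class ConstituentSketch {

    interface Node {
        String type();
        Rectangle2D bounds();
    }

    /** A terminal: a type string plus a bounding box. */
    static final class Leaf implements Node {
        private final String type;
        private final Rectangle2D bounds;
        Leaf(String type, Rectangle2D bounds) { this.type = type; this.bounds = bounds; }
        public String type() { return type; }
        public Rectangle2D bounds() { return bounds; }
    }

    /** A composite: a grammar-derived type plus child constituents. */
    static final class Group implements Node {
        private final String type;
        private final List<Node> children;
        Group(String type, List<Node> children) {
            if (children.isEmpty()) {
                throw new IllegalArgumentException("a composite needs at least one child");
            }
            this.type = type;
            this.children = children;
        }
        public String type() { return type; }
        /** The union of the children's bounding boxes. */
        public Rectangle2D bounds() {
            Rectangle2D union = children.get(0).bounds();
            for (Node child : children.subList(1, children.size())) {
                union = union.createUnion(child.bounds());
            }
            return union;
        }
    }

    public static void main(String[] args) {
        Node pane = new Group("pane", Arrays.<Node>asList(
                new Leaf("square", new Rectangle2D.Double(0, 0, 100, 100))));
        Node vscroll = new Group("vscroll", Arrays.<Node>asList(
                new Leaf("vrect", new Rectangle2D.Double(110, 0, 20, 100)),
                new Leaf("upTri", new Rectangle2D.Double(112, 2, 16, 16))));
        Node scrollpane = new Group("scrollpane", Arrays.asList(pane, vscroll));
        System.out.println(scrollpane.type() + " " + scrollpane.bounds());
    }
}
```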

For the scrollpane example, the parser's input is a set of terminal constituents (whose bounding boxes are abbreviated by "..."):

ConstituentSet[
   Terminal[square ...]
   Terminal[vrect ...]
   Terminal[upTri ...]
   Terminal[downTri ...]
   Terminal[square ...]
]
and its output is:
List[
   ConstituentSet[
      Composite[ scrollpane :
         Composite[ pane : Terminal[square ...] ...]
         Composite[ vscroll :
            Terminal[vrect ...] Terminal[upTri ...]
            Terminal[downTri ...] Terminal[square ...]
            ...
         ]
         ...
      ]
   ]
]

Code example

Here we illustrate the usage with a sketch example, but a different application (such as a non-sketch-based visual language) could simply supply a different constituent set based on its own diagram representations. Parser2D is integrated into the sketch package in the following way:

  1. Users sketch individual strokes in the form of diagram elements (squares, circles, letters), mathematical elements (integral sign, division bar), or handwritten characters.
  2. These strokes are recognized as symbols with types by a low-level recognizer.
  3. The 2D parser (parameterized by a user-specified grammar) recognizes structures in the drawing, such as mathematical equations or diagrams.
  4. An application receives the parsed results and can perform semantic analysis or processing based on the structure.

The following code segment implements a version of this flow by reading in a training file (for recognizing the individual strokes), a grammar file (for 2D parsing), and a user's input sketch. It is included here for a concise snapshot of the interaction between the low-level recognizer and the parser.


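import java.awt.geom.Rectangle2D;
import java.util.Iterator;
import java.util.List;
import diva.sketch.parser2d.*;  // Parser2D, ConstituentSet, and the constituent classes
// The remaining classes used below (recognizer, sketch model, strokes)
// come from the other Diva sketch packages.

/**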
 * A non-interactive tutorial version of the low-level recognizer and
 * 2D parser.  This program reads in a saved sketch, recognizes each
 * of the strokes in the sketch, and passes this whole mess as an
 * input to the 2D parser.
 */

public void batchParse (String trainingFile, String grammarFile,
                        String inputFile) throws Exception {

    //Set up the low-level recognizer by training it with the
    //training file.

    BasicRecognizer recognizer = new BasicRecognizer(trainingFile);


    //Set up the 2D parser with the given grammar file.

    Parser2D parser = new Parser2D(grammarFile);
        

    //Read in the input file.

    SketchParser sp = new SketchParser();
    SketchModel model = sp.parse(inputFile);


    //Transform the input by recognizing all of the strokes and then
    //converting this to a ConstituentSet for the parser.

    Constituent[] constituents = new Constituent[model.getSymbolCount()];
    int i = 0;
    for(Iterator symbols = model.symbols(); symbols.hasNext(); ) {
        Symbol symbol = (Symbol)symbols.next();
        TimedStroke stroke = symbol.getStroke();
        Recognition r = recognizer.strokeCompleted(stroke);
        String type = r.getHighestConfidenceType().getStringID();
        Rectangle2D bounds = stroke.getBounds();
        constituents[i++] = new TerminalConstituent(type, bounds);
    }
    ConstituentSet cset = new ConstituentSet(constituents);


    //Feed the constituent set to the parser and
    //print the results.

    System.out.println("INPUT:  " + cset);
    List out = parser.parse(cset);
    System.out.println("OUTPUT: " + out);
}

Batch test of parser

This non-interactive code is implemented fully in the tutorial class diva.sketch.demo.BatchParserDemo. An interactive version is available in the tutorial class diva.sketch.demo.ParserDemo.