## Parsing Context-Free Grammars

Compilers identify whether the given program is legal in the programming language, and reward you with syntax errors if not. This requires a precise description of the language syntax typically given by a context-free grammar. Each rule or production of the grammar defines an interpretation for the named symbol on the left side of the rule as a sequence of symbols on the right side of the rule. The right side can be a combination of nonterminals or terminal symbols defined simply as strings, such as “the”, “a”, “cat”, “milk”, and “drank.”

Parsing a given text string S according to a given context-free grammar G is the algorithmic problem of constructing a parse tree of rule substitutions defining S as a single nonterminal symbol of G.

We assume that the text string S has length n while the grammar G itself is of constant size. This is fair, since the grammar defining a particular programming language if of fixed length regardless of the size of the program we are trying to compile.

Further, we assume that the definitions of each rule are in Chomsky normal form. This means that the right sides of every nontrivial rule consists of (a) exactly two nonterminals, or exactly one terminal symbol. Any context-free grammar can be easily and mechanically transformed into Chomsky normal form by repeatedly shortening long right-hand sides at the cost of adding extra nonterminals and productions. Thus, there is no loss of generality with this assumption.

#### Stop and Think: Parsimonious Parserization

Problem: Programs often contain trivial syntax errors that prevent them from compiling. Given a context-free grammar G and input string S, find the smallest number of character substitutions you must make to S so that the resulting string is accepted by G.

### Minimum Weight Triangulation

A triangulation of a polygon is a set of nonintersecting diagonals that partitions the polygon into triangles. We say that the weight of a triangulation is the sum of the lengths of its diagonals.

What if there are points in the interior of the polygon? Then dynamic programming does not apply in the same way, because triangulation edges do not necessarily cut the boundary into two distinct pieces as before. Instead of only [katex]n \choose 2[/katex] possible subregions, the number of subregions now grows exponentially. In fact, the more general version of this problem is known to be NP-complete.