6 - Context-Free Grammar
ucla | CS 181 | 2024-04-29 14:11
Table of Contents
Basic Notions
- The alphabet (also called terminals) can be defined as a set $\Sigma$
- Variables are nonterminal symbols defined by the set $V$
- Transitions (known as substitution rules) are defined as mappings that are not necessarily unique or bijective.
- by convention, the first variable of the list of substitution rules is the start “state”
- e.g. CFG
- $\Sigma={a,b}$
- $V={S,A,B}$
- Substitution Rules:
- $S\to A$
- $S\to B$
- $A\to aAb$
- $A\to\varepsilon$
- $B\to bBa$
- $B\to \varepsilon$
- a derivation is a list of substitutions made from the rule set to result in a string on terminals, substitutions can only be made for nonterminals
- e.g. a derivation
- $S\to A\to aAb\to aaAbb\to aaaAbbb\to aaabbb$
- where we used rules in order:
- $#1\to#3\to#3\to#3\to#4$
- the language of a grammar is the set of all strings that can be derived
- OR
$S\to A B$
- e.g., shorthand of the previous CFG
$S\to A B$ $A\to aAb \varepsilon$ $B\to bBa \varepsilon$
- e.g., the lang for the previous CFG
- $L={a^nb^n:n\ge0}\cup{b^na^n:n\ge0}$
- CFG - a tuple $(V,\Sigma, R,S)$
Language -> CFG Examples

Prog. Lang. PL24
- identifiers: $(a\cup…\cup z)(a\cup…\cup z\cup0\cup…\cup9)^*$
- numbers: positive integers
- statement types: assignment or conditional
- prog: single statement \(\Sigma=\{a,b,...,z,0,...,9,\gt,\lt,=,\sqcup,\mathbb P\}\)
pos digit: $D’\to 1 … 9$ any digit: $D\to D’ 0$ any num: $N\to ND D’$ letter: $L\to a b … z$ identifier: $I\to IL\space \space ID\space \space L$ quantity: $Q\to I N$ whitespace: $W\to\sqcup\space \space\mathbb P\space \space WW$ optional whitespace: $W’\to \varepsilon W$ test: $T\to QW’=W’Q\space \space QW’\gt W’Q\space \space QW’\lt W’Q$ - assignment: $A\to IW’=W’Q$
conditional: $C\to if\space W\space T\space W\space then\space W\space S\space \space if\space W\space T\space W\space then\space W\space S\space W\space else\space W\space S$ start: $S\to A C$
Parse Trees
- a tree that shows the actions taken to get a string from a set of derivations, the string is found as the in order traversal of the leaves of the parse tree

- this can lead to ambiguity due to many possible parse trees that can lead to the same string e.g., above for empty string
- a grammar is ambiguous if some string in the grammar has 2 or more parse trees, otherwise its unambiguous
Disambiguation
- divide into case consideration of strings to disambiguate grammars

- parse trees are unique if all nodes and branches in the trees are the same to the dot
Determining Ambiguity
- consider the following problem

- we can represent the equivalence in number of symbols as a function, so considering all possible graphs for the grammar shows us whether we converge or diverge by the end of the string
- so we know a substring follows the grammar if the end of the substring does not dip below the horizontal line (y-coord) of the start symbol of the substr
- thus for all possible variations of the mappings of the rules, if we can show the strings are properly formatted by the end of the string, it must be unambiguous:

- e.g., identifying the sections of the graph can give us an unambiguous grammar, consider:

- e.g., carrying over unambiguous grammars to new problems:

