@* Rules of the grammar. @:title@>
\def\:#1{`\.{@@#1}'}% for in case this file is processed in isolation
We first arrange the proper setting of |rule_mask|, which will control the
selection of rules actually used. Recall that any bits set in the mask of a
rule prescribe its {\it suppression\/} when the same bit is set in
|rule_mask|; therefore for instance the bit characterising \Cpp\ is called
|no_plus_plus|, so that rules specifying it will not be loaded for \Cpp. In
some cases two masks will be combined using the bitwise-or operator `|@v|';
this means (somewhat counterintuitively) that the rule will only be selected
if the conditions represented by the two masks are {\it both\/} satisfied.
The use of the bitwise-and operator `|&|' is even more exceptional: it is
only meaningful if its two operands both select one setting of the same
three-way switch; the rule will then be selected if that switch is in either
of the two indicated positions. The |merged_decls| flag is special in that
setting `\.{+m}' only enables an extra rule, but does not disable any rules;
therefore only one bit is used for this option, and raising this bit in
|rule_mask| suppresses the rule marked with |merged_decls|.
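
As a small illustration of these conventions, here is a sketch in \Cee\ of
the selection test, under the assumption that a rule is loaded exactly when
its mask shares no bit with |rule_mask|. The helpers |make_rule_mask| and
|rule_selected| are hypothetical names that do not occur in the program; the
bit values are those defined in this section, and the statement-forcing
switch is left out for brevity.

```c
#include <stdbool.h>

/* Bit values as defined in this section (statement-forcing bits omitted). */
enum
{ cwebx = 0x0001, compatibility = 0x0002
, only_plus_plus = 0x0004, no_plus_plus = 0x0008
, unaligned_braces = 0x0050, aligned_braces = 0x0020
, wide_braces = 0x0030, standard_braces = 0x0060
, merged_decls = 0x0080
};

/* Hypothetical helper mirroring the initialisation of rule_mask;
   brace_flag is 'w', 'u', or 0 for the standard brace style. */
unsigned make_rule_mask(bool compat, bool cpp, char brace_flag, bool merge)
{ return (compat ? 0x0001u : 0x0002u)
       | (cpp ? 0x0008u : 0x0004u)
       | (brace_flag == 'w' ? 0x0040u : brace_flag == 'u' ? 0x0020u : 0x0010u)
       | (merge ? 0x0000u : 0x0080u);
}

/* Assumed selection test: a rule survives only if none of its
   mask bits is raised in the global mask. */
bool rule_selected(unsigned rule_bits, unsigned mask)
{ return (rule_bits & mask) == 0; }
```

For ordinary \Cee\ with default options the helper yields |0x0096|, so a rule
marked with both |standard_braces| and |no_plus_plus| is kept, while a rule
marked |merged_decls| is dropped. The bitwise-and of |unaligned_braces| and
|wide_braces| leaves only the bit for the standard position, so such a rule
is kept for `\.{+u}' as well as for `\.{+w}'.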

@d cwebx		0x0001 /* use normally */
@d compatibility	0x0002 /* use in compatibility mode */
@d only_plus_plus	0x0004 /* use in \Cpp\ */
@d no_plus_plus		0x0008 /* use in ordinary \Cee\ only */
@d unaligned_braces	0x0050 /* use if `\.{+u}' flag was set */
@d aligned_braces	0x0020 /* use unless `\.{+u}' flag was set */
@d wide_braces		0x0030 /* use if `\.{+w}' set */
@d standard_braces	0x0060 /* use unless `\.{+u}' or `\.{+w}' set */
@d merged_decls		0x0080 /* use if `\.{+m}' set */
@d forced_statements	0x0100 /* use if `\.{+a}' or `\.{+f}' set */
@d no_forced_statements 0x0600 /* use unless `\.{+a}' or `\.{+f}' set */
@d all_stats_forced	0x0300 /* use if `\.{+a}' set */
@d not_all_stats_forced	0x0400 /* use unless `\.{+a}' set */

@< Set initial values @>=
rule_mask= (compatibility_mode ? 0x0001 : 0x0002)
	 | (C_plus_plus ? 0x0008 : 0x0004)
	 | (flags['w'] ? 0x0040 : flags['u'] ? 0x0020 : 0x0010)
	 | (flags['m'] ? 0x0000 : 0x0080)
	 | (flags['a'] ? 0x0400 : flags['f'] ? 0x0200 : 0x0100)
	 ;
{ static reduction rule[] = { @< Rules @>@;@; }; 
@/int i=array_size(rule); @+ do install_rule(&rule[--i]); while (i>0);
#ifdef DEBUG
  if (install_failed) fatal("inconsistent grammar",0);
#endif
}

@ {\it Expressions}. @:rules@>
These rules should be obvious. Rule~5 allows typedef identifiers to be used
as field selectors in structures; rules 7~and~8 attach a parameter list in a
function call. In rule~14 we prefix a potentially binary operator such as
`|*|' that is used in a unary way by a `\.{\\mathord}' command to make sure
that \TeX\ will not mistake it for a binary operator. In simple cases such
as |*p| this is redundant, but if such operators are repeated more than one
level deep, as in |**p|, \TeX\ would otherwise treat the first operator as
the left operand of the second, and insert the wrong spacing. Moreover,
typical \Cee~constructions such as a cast |(void*) &x| or a declaration
|char *p@;| would confuse \TeX\ even more. In rule~13 we need not insert
`\.{\\mathord}', since operators of category |unop| are already treated as
ordinary symbols by~\TeX.

@< Rules @>=
{ 1, {{expression, unop}},			{expression, NULL}},	@/
{ 2, {{expression, binop, expression}},		{expression, NULL}},	@/
{ 3, {{expression, unorbinop, expression}},	{expression, NULL}},	@/
{ 4, {{expression, select, expression}},	{expression, NULL}},	@/
{ 5, {{expression, select, int_like}},		{expression, "__$_"}},	@/
{ 6, {{expression, comma, expression}},		{expression, "__p1_"}},	@/
{ 7, {{expression, expression}},		{expression, NULL}},	@/
{ 8, {{expression, lpar, rpar}},		{expression, "__,_"}},	@/
{ 9, {{expression, subscript}},			{expression, NULL}},	@/
{10, {{lpar, expression, rpar}},		{expression, NULL}},	@/
{11, {{lbrack, expression, rbrack}},		{subscript,  NULL}},	@/
{12, {{lbrack, rbrack}},			{subscript,  "_,_"}},	@/
{13, {{unop, expression}},			{expression, NULL}},	@/
{14, {{unorbinop, expression}},			{expression, "o__"}},	@[@]

@~Here are some less common kinds of formulae. Processing the colon belonging
to the question mark operator in math mode will give it the proper spacing,
which is different from that of a colon following a label. Rule~21 processes
casts, since the category |parameters|, which represents parenthesised lists
specifying function argument types, encompasses the case of a single
parenthesised type specification. The argument of |sizeof| may be a type
specification rather than an expression; in \Cee\ (unlike \Cpp) it then must
be parenthesised. %, but not in \Cpp\ (and |sizeof_like| might be `\&{new}').

@< Rules @>=
{20, {{question, expression, colon}},		{binop, "__m_"}},	@/
{21, {{parameters, expression}},		{expression, "_,_"}},	@/
{22, {{sizeof_like, parameters}},		{expression, NULL}},	@/
{23, {{sizeof_like, expression}},		{expression, NULL}},	@/
{24, {{sizeof_like, int_like}},	{expression,"_~_"},only_plus_plus},	@[@]

@ {\it Declarations}.
In a declaration in \Cee, the identifier being declared is wrapped up in a
declarator, which looks like an expression of a restricted kind: only prefix
asterisk, postfix subscript and formal parameters, and parentheses are used.
In a bottom-up parser of the kind we are using, it is natural, and hardly
avoidable, that declarators are parsed as expressions. Therefore we start
recognising a declaration when we see a type specifier followed by the first
declarator; at that point we have a succession `|int_like| |expression|
|semi|' or `|int_like| |expression| |comma|' (rules 31~and~33). It is also
possible that there are no declarators at all, namely when a |struct|,
|union|, or~|enum| specifier is introduced without declaring any variables;
in that case we have `|int_like| |semi|' (rule~32). Because the type
specifier might be composite, like |unsigned long int|, and there might
moreover be storage class specifiers and type modifiers (like `|const|'), we
first contract any sequence of |int_like| items to a single one (rule~30).
In case the declarator was followed by a comma we reduce to |int_like|, so
that the next declarator can be matched, otherwise we reduce to
|declaration|.
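
The interplay of rules 30--33 can be followed in a toy reducer, sketched
below in \Cee. It is not the matcher used by this program: it simply applies
the leftmost matching rule and starts over, which is an assumption made to
keep the example short, and the names |toy_rule| and |toy_reduce| are made up
for the occasion. Only the categories and the four left hand sides are taken
from the rules of this section.

```c
#include <stddef.h>
#include <string.h>

enum category { int_like, expression, semi, comma, declaration };

struct toy_rule { int id; size_t len; enum category lhs[3]; enum category rhs; };

static const struct toy_rule toy_rules[] =
{ {30, 2, {int_like, int_like},          int_like}
, {31, 3, {int_like, expression, semi},  declaration}
, {32, 2, {int_like, semi},              declaration}
, {33, 3, {int_like, expression, comma}, int_like}
};

/* Does rule r match the first scraps of the n scraps starting at s? */
static int matches(const struct toy_rule *r, const enum category *s, size_t n)
{ size_t k;
  if (r->len > n) return 0;
  for (k = 0; k < r->len; ++k)
    if (s[k] != r->lhs[k]) return 0;
  return 1;
}

/* Reduce the category sequence in place; returns its new length. */
size_t toy_reduce(enum category *s, size_t n)
{ const size_t n_rules = sizeof toy_rules / sizeof toy_rules[0];
  size_t pos = 0;
  while (pos < n)
  { size_t r;
    for (r = 0; r < n_rules; ++r)
      if (matches(&toy_rules[r], s + pos, n - pos)) break;
    if (r == n_rules) { ++pos; continue; } /* nothing matches here */
    s[pos] = toy_rules[r].rhs; /* replace the left hand side by the result */
    memmove(s + pos + 1, s + pos + toy_rules[r].len,
            (n - pos - toy_rules[r].len) * sizeof *s);
    n -= toy_rules[r].len - 1;
    pos = 0; /* start over from the left after every reduction */
  }
  return n;
}
```

Fed with the categories of `|unsigned long int x, y;|', the sketch applies
rule~30 twice, then rule~33 for the declarator followed by a comma, and
finally rule~31, leaving a single |declaration|.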

It is not quite true that declarators always look like expressions, since
the type modifiers `|const|' and `|volatile|' may penetrate into
declarators. When they do they will almost always be preceded by an
asterisk, and rule~34 will treat such cases. The choice for |int_like| as
the result category is not completely obvious, since it makes the modifier
and the preceding asterisk part of the type specifier rather than of the
declarator, which strictly speaking is not correct; the choice for |unop| or
|unorbinop| might therefore seem a more logical one. One reason for not
doing that is that a space would have to be inserted in the translation
after the modifier scrap, which would not look right in abstract declarators
for contrived cases like \hbox{|int f(char *const)@;|}; more importantly, if
the modifier were to become part of the declarator, it would be a (reserved)
identifier that precedes the identifier actually being declared, and when
the declarator then receives a call from |make_underlined| by rule 31~or~33,
it would mislead |first_ident|. The current solution has a small flaw as
well, since it cannot handle the situation where the modifier is separated
from the type specifier by a parenthesis, as in
$\&{void}~(\m*\&{const}~\m*f)~(\&{int})$; such cases are quite uncommon, are
hard to handle by rules that will not spuriously match in other situations,
and even then they would still cause problems with |make_underlined|, so we
do not attempt to handle them.

@< Rules @>=
{30, {{int_like, int_like}},		   {int_like, "_~_"}},	@/
{31, {{int_like, expression, semi}},	   {declaration, "_~!__"}},	@/
{32, {{int_like, semi}},		   {declaration, NULL}},	@/
{33, {{int_like, expression, comma}},	   {int_like, "_~!__p1"}},	@/
{34, {{unorbinop, int_like}},		   {int_like, "o__"}},		@[@]

@ If a typedef identifier is simultaneously used as a field selector in a
|struct| or |union| declaration, it must be made to parse as expression and
be printed in italic type; this can be achieved by placing the magic wand
\:; before the identifier, by rule~35. The reason that we place \:; at the
beginning rather than at the end of the construction here, is to prevent the
|int_like| identifier from combining with something before it first.
Rule~35 only applies if the \:; does not match by any rule with what comes
before it.

Rule~36 handles the case that a function is declared with specified argument
types, which is not handled by the expression syntax given until now.  It
also parses new-style (\caps{ANSI/ISO}) headings of function definitions; in
that case, the resulting |function_head| will not be incorporated into a
|declaration| (unless a comma or semicolon follows) but rather into a
|function|. If the parameter specifications include identifiers (as in the
case of function headings), the arguments look like declarations without the
final semicolon; rule~37 (with aid of rule~33) constructs such parameter
lists. Parameter specifications using abstract declarators (without
identifiers) will be treated below. In |struct| declarations we may
encounter bit-field specifications with or without an identifier; these are
handled by rules 38~and~39 (the constant expression following the colon will
later receive a spurious call from |make_underlined|, but in case of numeric
constants this does no harm).

@< Rules @>=
{35, {{magic, int_like}},		   {expression, "_$_"}},	@/
{36, {{expression, parameters}},	   {function_head, "_B_"}},	@/
{37, {{lpar, int_like, expression, rpar}}, {parameters, "_+++_~!_---_"}},@/
{38, {{int_like, expression, colon}},      {int_like, "_~!_m_"}},	@/
{39, {{int_like, colon}},		   {int_like, "_m_"}},		@[@]

@ Abstract declarators are used after type specifiers in casts and for
specifying the argument types in declarations of functions or function
pointers. They are like ordinary declarators, except that the defined
identifier has been ``abstracted''; an example is `|**(* )(int)|' in `|void
g(char**(* )(int))@;|', which says that |g| takes as argument a pointer to
a function with |int| parameter yielding a pointer to pointer to |char|. A
difficulty with abstract declarators is that they are built up around the
vacuum left by abstracting the identifier, and since for more than one
reason we cannot allow rules with empty left hand side, we have to find an
alternative way to get them started.
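
To make the notion concrete, the following fragment of plain \Cee\ uses the
abstract declarator of the example above in a prototype, in a cast, and as
the argument of |sizeof|. It is not part of this program, and all
identifiers in it are made up for illustration.

```c
#include <stddef.h>

static char text[] = "char";
static char *slot = text;

/* A function of the type named by the abstract declarator char **(*)(int). */
char **give(int unused) { (void)unused; return &slot; }

/* Prototype with an abstract declarator: the parameter has no name. */
char *first_of(char **(*)(int));

/* Definition of the same function, with the identifier f restored. */
char *first_of(char **(*f)(int)) { return *f(0); }

/* Abstract declarators also occur in casts and as argument of sizeof. */
size_t pointer_size(void) { return sizeof(char **(*)(int)); }
char *via_cast(void (*p)(void)) { return first_of((char **(*)(int)) p); }
```

In the prototype of |first_of| the scraps `\.{*)}' stand next to each other
exactly because the identifier |f| of the definition has been left out.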

The natural solution to this problem is to look for sequences that can only
occur if an identifier has been abstracted from between them, for instance
`\.{*)}' (in categories: |unorbinop| |rpar|). The most compelling reason why
in |C_read| we had to laboriously change the category of a |type_defined|
identifier to |expression| instead of |int_like| inside its defining typedef
declaration, is that it allows us to ensure that any remaining |int_like|
scrap that is followed by a |subscript| is a sure sign of an abstract
declarator.

Here are the cases that start off abstract declarators (these are the first
examples of rules that need context categories in their left hand side).  As
a visual hint to the reader we leave a little bit of white space on the spot
where the identifier has vanished. Rules 40~and~41 handle declarators for
pointer arguments, where the vanished identifier is preceded by an asterisk,
which either stands at the end of the declarator, or is parenthesised (for
function pointer arguments). In these rules there is no need to prefix the
asterisk with `\.{\\mathord}', since the right context makes an
interpretation as binary operator impossible. Rules 42~and~43 treat
declarators for arrays, possibly of pointers; there are no corresponding
rules with |parameters| instead of |subscript| since abstract declarators
never specify functions themselves, only function pointers. In fact the
``function analogue'' of rule~43 would incorrectly match a cast following an
operator like `|*|' or `|-|'. Rule~44 treats an abstract declarator
consisting of subscripts only, which are redundantly parenthesised; here too
the corresponding pattern with |parameters| is not only never needed, it
would also spuriously trigger on parenthesised expressions that start with a
cast.

@< Rules @>=
{40, {{unorbinop, rpar}, -1},		{declarator, "_,"}},	@/
{41, {{unorbinop, comma},-1},		{declarator, "_,"}},	@/
{42, {{int_like, subscript},1},		{declarator, ",_"}},	@/
{43, {{unorbinop, subscript},1},	{declarator, ",_"}},	@/
{44, {{lpar, subscript},1},		{declarator, ",_"}},	@[@]

@~ Abstract declarators may grow just like ordinary declarators, to include
prefixed asterisks, as well as postfixed subscripts and parameters, and
grouping parentheses.

@< Rules @>=
{45, {{unorbinop, declarator}},  {declarator, "o__"}},	@/
{46, {{declarator, subscript}},  {declarator, NULL}},	@/
{47, {{declarator, parameters}}, {declarator, NULL}},	@/
{48, {{lpar, declarator, rpar}}, {declarator, NULL}},	@[@]

@~ Here is how abstract declarators are assembled into |parameters|, keeping
in mind that the ``abstract declarator'' might be completely empty (i.e.,
absent) as in `|void f(int);|' (rules 51~and~53). We put no space after the
type specifier here, since it is followed either by an abstract declarator,
a right parenthesis or comma, so certainly not by an identifier; therefore a
space is neither necessary, nor would it improve readability. The
\caps{ANSI/ISO} syntax allows empty parentheses as a parameter specification
in abstract declarators, although this is an old-style form; rule~54 has
been included to handle this case. Fortunately a parenthesised list of
identifiers (which would parse as |expression|) is not allowed as parameter
specification.

@< Rules @>=
{50, {{lpar, int_like, declarator, comma}}, {lpar, "____p5"}},		@/
{51, {{lpar, int_like, comma}},		    {lpar, "___p5"}},		@/
{52, {{lpar, int_like, declarator, rpar}},  {parameters, NULL}},	@/
{53, {{lpar, int_like, rpar}},		    {parameters, NULL}},	@/
{54, {{lpar, rpar}},			    {parameters, "_,_"}},	@[@]

@ {\it Structure, union, and enumeration specifiers}.  It is permissible to
use typedef identifiers as structure, union, or enumeration tags as well, so
we include cases where an |int_like| follows a |struct_like| token. In \Cpp,
we may also find things like `\&{private}:' in a class specifier; these are
parsed just like `|default:|', i.e., as a |label| (rule~66).

@< Rules @>=
{60, {{struct_like, lbrace}}, {struct_head, "_ft_"},standard_braces},	@/
{60, {{struct_like, lbrace}}, {struct_head, "_~_"},unaligned_braces},	@/
{60, {{struct_like, lbrace}}, {struct_head, "_f_"},wide_braces},	@/
{61, {{struct_like, expression, lbrace}},
		{struct_head, "_~!_ft_"},standard_braces},		@/
{61, {{struct_like, expression, lbrace}},
		{struct_head, "_~!_~_"},unaligned_braces},		@/
{61, {{struct_like, expression, lbrace}},
		{struct_head, "_~!_f_"},wide_braces},			@/
{62, {{struct_like, int_like, lbrace}},
	{struct_head, "_~!$_ft_"},standard_braces|no_plus_plus},	@/
{62, {{struct_like, int_like, lbrace}},
	{struct_head, "_~!_ft_"},standard_braces|only_plus_plus},	@/
{62, {{struct_like, int_like, lbrace}},
	{struct_head, "_~!$_~_"},unaligned_braces|no_plus_plus},	@/
{62, {{struct_like, int_like, lbrace}},
	{struct_head, "_~!_~_"},unaligned_braces|only_plus_plus},	@/
{62, {{struct_like, int_like, lbrace}},
	{struct_head, "_~!$_f_"},wide_braces|no_plus_plus},		@/
{62, {{struct_like, int_like, lbrace}},
	{struct_head, "_~!_f_"},wide_braces|only_plus_plus},		@/
{63, {{struct_like, expression}}, {int_like, "_~_"}},			@/
{64, {{struct_like, int_like}},   {int_like, "_~$_"},no_plus_plus},	@/
{64, {{struct_like, int_like}},   {int_like, "_~_"},only_plus_plus},	@/
{65, {{struct_head, declaration, rbrace}},
		{int_like, "_+_-f_"},standard_braces},			@/
{65, {{struct_head, declaration, rbrace}},
		{int_like, "_+f_-f_"},unaligned_braces & wide_braces},	@/
{66, {{label, declaration}}, {declaration, "b_f_"},only_plus_plus},     @[@]

@ Rules 67--70 are for enumerations; they avoid forced line breaks and call
|make_underlined| for all the enumeration constants.

@< Rules @>=
{67, {{struct_like, lbrace, expression},-1}, {struct_head, "_B_"}},	@/
{68, {{struct_like, expression, lbrace, expression},-1},
		{struct_head, "_~_B_"}},				@/
{69, {{struct_head, expression, comma, expression},1},
		{expression, "__B!_"}},					@/
{70, {{struct_head, expression, rbrace}}, {int_like, "_~+!_-B_"}},	@[@]

@ The following rules are added to allow short structure and union
specifiers to be kept on one line without having to repeatedly specify \:+.
The idea is to place \:; after the left brace; this will cause the rules
below to be invoked instead of those above, which avoids introducing forced
line breaks.

@< Rules @>=
{71, {{struct_like, lbrace, magic}}, {short_struct_head, "_B__+"}},	@/
{72, {{struct_like, expression, lbrace, magic}},
		{short_struct_head, "_~!_B__+"}},			@/
{73, {{struct_like, int_like, lbrace, magic}},
		{short_struct_head, "_~!$_B__+"}, no_plus_plus},	@/
{73, {{struct_like, int_like, lbrace, magic}},
		{short_struct_head, "_~!_B__+"}, only_plus_plus},	@/
{74, {{short_struct_head, declaration}}, {short_struct_head, "_B_"}},	@/
{75, {{short_struct_head, rbrace}}, {int_like, "_-B_"}},		@[@]


@ {\it Statements}.
Rule~80 gives the usual way statements are formed, while rule~81 handles the
anomalous case of an empty statement. Its use can always be avoided by using
an empty pair of braces instead, which much more visibly indicates the
absence of a statement (e.g., an empty loop body); when the empty statement
is used however, it will either be preceded by a space or start a new line
(like any other statement), so there is always some distinction between a
|while| loop with empty body and the |while| that ends a |do|~statement. A
rule like this with left hand side of length~1 makes the corresponding
category (viz.~|semi|) ``unstable'', and can only be useful for categories
that usually are scooped up (mostly from the left) by a longer rule.  Rules
82--84 make labels (ordinary, case and default), and rules 85~and~86 attach
the labels to statements. Rule~87 makes \:; behave like an invisible
semicolon when it does not match any of the rules designed for it, for
instance if it follows an expression.

@< Rules @>=
{80, {{expression, semi}},		{statement, NULL}},	@/
{81, {{semi}},				{statement, NULL}},	@/
{82, {{expression, colon}},		{label, "!_h_"}},	@/
{83, {{case_like, expression, colon}},	{label, "_ _h_"}},	@/
{84, {{case_like, colon}},		{label, "_h_"}},	@/
{85, {{label, label}},			{label, "_B_"}},	@/
{86, {{label, statement}}, {statement, "b_B_"},not_all_stats_forced},	@/
{86, {{label, statement}}, {statement, "b_f_"},all_stats_forced},	@/
{87, {{magic}},				{semi, NULL}},		@[@]

@ The following rules format compound statements and aggregate initialisers.
Rules 90--94 combine declarations and statements within compound statements.
A newline is forced between declarations by rule~90, unless the declarations
are local (preceded by a left brace) and `\.{+m}' was specified (rule~91);
this rule does not apply to structure specifiers, because the left brace
will already have been captured in a |struct_head| before the rule can match.
If `\.{+f}'~or~`\.{+a}' was specified, then a newline is forced between
statements as well (rule~93). Between the declarations and statements some
extra white space appears in ordinary \Cee\ (rule~92), but not in \Cpp, where
declarations and statements may be arbitrarily mixed (rule~94).  Rules
95--97 then build compound statements, where the last case is the unusual
one where a compound statement ends with a declaration; empty compound
statements are made into simple statements so that they will look better
when used in a loop statement or after a label. If compound statements are
not engulfed by a conditional or loop statement (see below) then they decay
to ordinary statements by rule~98. Rules 99~and~100 reduce aggregate
initialiser expressions, where the reduction of comma-separated lists of
expressions is already handled by the expression syntax.

@< Rules @>=
{90, {{declaration, declaration}}, {declaration, "_f_"}},		@/
{91, {{lbrace, declaration, declaration},1},
				  {declaration, "_B_"},merged_decls},	@/
{92, {{declaration, statement}},  {statement, "_F_"},no_plus_plus},	@/
{92, {{declaration, statement}},  {statement, "_f_"},only_plus_plus},	@/
{93, {{statement, statement}}, {statement, "_f_"},forced_statements},	@/
{93, {{statement, statement}}, {statement, "_B_"},no_forced_statements},@/
{94, {{statement, declaration}},  {declaration, "_f_"},only_plus_plus},	@/
{95, {{lbrace, rbrace}},	  {statement, "_,_"}},			@/
{96, {{lbrace, statement, rbrace}},
	{compound_statement, "ft_+_-f_"},standard_braces},		@/
{96, {{lbrace, statement, rbrace}},
	{compound_statement, "_+f_-f_"},unaligned_braces},		@/
{96, {{lbrace, statement, rbrace}},
	{compound_statement, "f_+f_-f_"},wide_braces},			@/
{97, {{lbrace, declaration, rbrace}},
	{compound_statement, "ft_+_-f_"},standard_braces},		@/
{97, {{lbrace, declaration, rbrace}},
	{compound_statement, "_+f_-f_"},unaligned_braces},		@/
{97, {{lbrace, declaration, rbrace}},
	{compound_statement, "f_+f_-f_"},wide_braces},			@/
{98, {{compound_statement}},			{statement, "f_f"}},	@/
{99, {{lbrace, expression, comma, rbrace}},	{expression, "_,__,_"}},@/
{100, {{lbrace, expression, rbrace}},		{expression, "_,_,_"}},	@[@]

@ As with structure and union specifiers, we allow compound statements to
be kept on one line by inserting \:; after the left brace. Such statements
will reduce to |statement| rather than to |compound_statement|, so that they
will be treated as if they were simple statements.

@< Rules @>=
{101, {{lbrace, magic}},		{short_lbrace, "__+"}},	@/
{102, {{short_lbrace, declaration}},	{short_lbrace, "_B_"}},	@/
{103, {{short_lbrace, statement}},	{short_lbrace, "_B_"}},	@/
{104, {{short_lbrace, rbrace}},		{statement, "_-B_"}},	@[@]

@ {\it Selection, iteration and jump statements}.
There are three intermediate categories involved in the recognition of
conditional statements. The category |if_like| stands for `|if|' or an
initial segment of a repeated if-clause, up to and including `|else|~|if|'.
An |if_head| is an |if_like| followed by its (parenthesised) condition
(rules 110~and~111). If the statement following the condition is followed by
`|else|~|if|', the whole construct reduces to |if_like| (so that the
indentation will not increase after the second condition, rules 112~and~113),
otherwise, if only `|else|' follows, reduction is to an |if_else_head|
(rules 114~and~115), and finally, if no |else| follows at all, we reduce with
only the if-branch to |statement| (rules 116~and~117). The reduction rules for
|if_else_head| differ from those for |if_head| in that it will not combine
with an |else|, even if it is present; the formatting is identical to that
of an |else|-less |if_head| (rules 118~and~119).  (It might be tempting to
replace rules 116~and~117 by a reduction from |if_head| to |if_else_head| to
be applied if no matching `|else|' is found, but that would require some
subtle measures to prevent this decay at times when the right context is
insufficiently reduced to decide whether an `|else|' is present or not.) The
formatting of the if and else branches depends on whether they are compound
statements or some other kind of statement (possibly another conditional
statement), and on the flags for statement forcing and brace alignment.
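
The order of events for a chain `|if|\dots|else|~|if|\dots|else|' can be
traced with a toy reducer, sketched below in \Cee. It is a simplification and
not the matcher of this program: it assumes that the leftmost match is taken
and that longer left hand sides are tried first, it ignores compound
statements and all formatting, and the names |toy_rule|, |first_match| and
|toy_reduce| are hypothetical. Only the categories and the left hand sides of
rules 110, 113, 115, 117 and~119 are taken from this section.

```c
#include <stddef.h>
#include <string.h>

enum cat { if_like, expression, statement, else_like, if_head, if_else_head };

struct toy_rule { int id; size_t len; enum cat lhs[4]; enum cat rhs; };

/* Longer left hand sides come first, so that an else is seen if present. */
static const struct toy_rule toy_rules[] =
{ {113, 4, {if_head, statement, else_like, if_like}, if_like}
, {115, 3, {if_head, statement, else_like},          if_else_head}
, {110, 2, {if_like, expression},                    if_head}
, {117, 2, {if_head, statement},                     statement}
, {119, 2, {if_else_head, statement},                statement}
};

/* Find the first rule in the table that matches at s, or NULL. */
static const struct toy_rule *first_match(const enum cat *s, size_t n)
{ size_t r, k;
  for (r = 0; r < sizeof toy_rules / sizeof toy_rules[0]; ++r)
  { const struct toy_rule *t = &toy_rules[r];
    if (t->len > n) continue;
    for (k = 0; k < t->len && s[k] == t->lhs[k]; ++k) {}
    if (k == t->len) return t;
  }
  return NULL;
}

/* Reduce in place; the rule numbers used are stored in applied,
   terminated by 0, as far as room permits. Returns the new length. */
size_t toy_reduce(enum cat *s, size_t n, int *applied, size_t room)
{ size_t pos = 0, used = 0;
  while (pos < n)
  { const struct toy_rule *r = first_match(s + pos, n - pos);
    if (r == NULL) { ++pos; continue; }
    s[pos] = r->rhs;
    memmove(s + pos + 1, s + pos + r->len, (n - pos - r->len) * sizeof *s);
    n -= r->len - 1;
    if (used + 1 < room) applied[used++] = r->id;
    pos = 0; /* start over from the left */
  }
  if (room > 0) applied[used] = 0;
  return n;
}
```

For the categories of `|if (a) s; else if (b) t; else u;|' the sketch uses
rules 110, 113, 110, 115 and~119 in that order: the first condition and
branch are absorbed into an |if_like|, so that the second condition is
treated at the same level as the first.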

@< Rules @>=
{110, {{if_like, expression}},	      {if_head, "f_~_"}},		@/
{111, {{lbrace,if_like,expression},1}, {if_head, "_~_"},standard_braces},@/
{112, {{if_head, compound_statement, else_like, if_like}},
			{if_like, "__f_~_"},aligned_braces},		@/
{112, {{if_head, compound_statement, else_like, if_like}},
			{if_like, "_~_~_~_"},unaligned_braces},		@/
{113, {{if_head, statement, else_like, if_like}},
			{if_like, "_+B_-f_~_"},not_all_stats_forced},	@/
{113, {{if_head, statement, else_like, if_like}},
			{if_like, "_+f_-f_~_"},all_stats_forced},	@/
{114, {{if_head, compound_statement, else_like}},
			{if_else_head, "__f_"},aligned_braces},		@/
{114, {{if_head, compound_statement, else_like}},
			{if_else_head, "_~_~_"},unaligned_braces},	@/
{115, {{if_head, statement, else_like}},
			{if_else_head, "_+B_-f_"},not_all_stats_forced},@/
{115, {{if_head, statement, else_like}},
			{if_else_head, "_+f_-f_"},all_stats_forced},	@/
{116, {{if_head, compound_statement}},
			{statement, "__f"},aligned_braces},		@/
{116, {{if_head, compound_statement}},
			{statement, "_~_f"},unaligned_braces},		@/
{117, {{if_head, statement}},
			{statement, "_+B_-f"},not_all_stats_forced},	@/
{117, {{if_head, statement}},
			{statement, "_+f_-f"},all_stats_forced},	@/
{118, {{if_else_head, compound_statement}},
			{statement, "__f"},aligned_braces},		@/
{118, {{if_else_head, compound_statement}},
			{statement, "_~_f"},unaligned_braces},		@/
{119, {{if_else_head, statement}},
			{statement, "_+B_-f"},not_all_stats_forced},	@/
{119, {{if_else_head, statement}},
			{statement, "_+f_-f"},all_stats_forced},	@[@]

@ The following rules prevent forced line breaks from conditional statements
that occur within a one-line compound statement.

@< Rules @>=
{120, {{short_lbrace, if_like, expression},1},
		{if_head, "_~_"}},				@/
{121, {{short_lbrace, if_head, statement, else_like}},
		{short_lbrace, "_B_B_B_"}},			@/
{122, {{short_lbrace, if_head, statement}},
		{short_lbrace, "_B_B_"}},			@[@]

@ Switch and loop statements make use of the syntax for conditionals by
reducing to |if_else_head| which will take one further statement and indent
it (rules 130~and~131). Recall that `|for|' and `|switch|' are both
|while_like|; the parenthesised object following `|for|' looks like nothing
we have seen before, however, so we need extra rules to come to terms with
it (rules 132--134). Rule~132 is needed to avoid a line break when these are
normally inserted between statements, and rule~134 is needed in case the
third expression is empty. The |do|-|while| loops have to be treated
separately. Because we want to distinguish the case of a
|compound_statement| as loop body from other kinds of statements, we cannot
wait until the |while| combines with the loop control condition to an
|if_else_head|, since by then a |compound_statement| will have decayed to
|statement|. Hence we pick up the unreduced `|while|' token and form a new
category |do_head| (rules 135~and~136); in case of a compound statement the
`|while|' will be on the same line as the closing brace. Rules 137~and~138
then combine this with the condition and the ridiculous mandatory semicolon
at the end to form a |statement|.

@< Rules @>=
{130, {{while_like, expression}}, {if_else_head, "f_~_"}},		@/
{131, {{lbrace, while_like, expression},1},
			{if_else_head, "_~_"},standard_braces},		@/
{132, {{lpar, statement, statement}, 1},
			{statement, "_B_"}, forced_statements},		@/
{133, {{lpar, statement, expression, rpar}},	{expression, "__B__"}},	@/
{134, {{lpar, statement, rpar}},	{expression, NULL}},			@/
{135, {{do_like, compound_statement, while_like}},
			{do_head, "__~_"},standard_braces},		@/
{135, {{do_like, compound_statement, while_like}},
			{do_head, "_~_~_"},unaligned_braces},		@/
{135, {{do_like, compound_statement, while_like}},
			{do_head, "__f_"},wide_braces},			@/
{136, {{do_like, statement, while_like}},
			{do_head, "_+B_-B_"},not_all_stats_forced},	@/
{136, {{do_like, statement, while_like}},
			{do_head, "_+f_-f_"},all_stats_forced},		@/
{137, {{do_head, expression, semi}}, {statement, "f_~__f"}},		@/
{138, {{lbrace, do_head, expression, semi},1}, {statement, "_~__f"}},	@[@]

@ The following rules prevent forced line breaks from loop statements
that occur within a one-line compound statement. Since no special layout is
required between the heading of a |while| loop and its body, rule~139
incorporates the heading as if it were a separate statement. For a
|do|-|while| loop we must take a bit more effort to get the spacing
following the |while| correct.

@< Rules @>=
{139, {{short_lbrace, while_like, expression}},
					{short_lbrace, "_B_~_"}},	@/
{140, {{short_lbrace, do_like, statement, while_like},1},
					{do_head, "_B_B_"}},		@/
{141, {{short_lbrace, do_head, expression, semi}},
					{short_lbrace, "_B_~__"}},	@[@]

@ The tokens `|goto|', `|continue|', `|break|', and `|return|' are all
|return_like|; although what may follow them is not the same in all cases,
the following two rules cover all legal uses. Note that rule~146 does not
wait for a semicolon to come along; this may lead to a premature match as in
`|return a+b;|', but this does not affect formatting, while the rule allows
saying things like `|return home|' in a module name (or elsewhere) without
risking irreducible scraps.

@< Rules @>=
{145, {{return_like, semi}},	  {statement, NULL}},	@/
{146, {{return_like, expression}}, {expression, "_~_"}},	@[@]

@ {\it Function definitions and external declarations}.
Apart from the initial specification of the result type (which is optional,
defaulting to |int|), a new-style function heading will parse as a
|function_head| (see the declaration syntax above), while an old-style function
heading is an |expression| possibly followed by a |declaration| (specifying
the function parameters). Rules 150--152 parse these two kinds of
function headings together with the function body, yielding category
|function|; rule~153 attaches the optional result type specifier. Although
the \Cee~syntax requires that the function body is a compound statement, we
allow it to be a |statement| (to which |compound_statement| will decay), to
cover the case that a very short function body is specified using `\.{\{@@;}'.

At the outer level declarations and functions can be mixed; when they are, a bit
of white space surrounds the functions (rules 154--156). The combination of
several declarations is already taken care of by the syntax for compound
statements; no extra white space is involved there. Rules 157--159 take care
of function declarations that are not definitions (i.e., there is no function
body); if followed by a semicolon, a comma or a right parenthesis, the
|function_head| decays to an |expression|, and the rest of the syntax will
take care of recognising a |declaration| or |parameters|. Rules 153~and~157
will be replaced in~\Cpp, for reasons explained below (incidentally, this is
the reason the category |function_head| was introduced; it used to be simply
|expression|).

@< Rules @>=
{150, {{function_head, statement}}, {function, "!_f_"}},		@/
{151, {{expression, statement}}, {function, "!_f_"}},			@/
{152, {{expression, declaration, statement}},
			  	 {function, "!_++f_--f_"}},		@/
{153, {{int_like, function}},	 {function, "_ _"},no_plus_plus},	@/
{154, {{declaration, function}}, {function, "_F_"}},			@/
{155, {{function, declaration}}, {declaration, "_F_"}},			@/
{156, {{function, function}},	 {function, "_F_"}},			@/
{157, {{function_head, semi},-1},  {expression, NULL},no_plus_plus},	@/
{158, {{function_head, comma},-1}, {expression, NULL}},			@/
{159, {{function_head, rpar},-1},  {expression, NULL}},			@[@]

@ {\it Module names}.
Although module names nearly always stand for statements, they can be made
to stand for a declaration by appending \:;, or for an expression by
appending `\.{@@;@@;}'. The latter possibility is most likely to be useful
if the module stands for (part of) an initialiser list. A module name can
also be made into an expression by enclosing it in \:[ and~\:], but in that
case rule~160 will apply first, placing a forced break after the module
name. Rules 161, 164,~and~165 prevent a module name from generating forced
breaks if it occurs on a one-line compound statement or structure or union
specifier, while rules 167~and~168 serve to prevent rules 163~and~164 from
matching with priority over rule~166. The rules given here will be replaced by
other ones in compatibility mode.
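
The way the |cwebx| and |compatibility| masks make one set of module name
rules replace the other can be illustrated by the following minimal sketch
in~\Cee. It is not part of the program: the functions |initial_rule_mask| and
|rule_is_loaded| are hypothetical names introduced only for this illustration,
and only the first term of the initialisation of |rule_mask| is modelled.

```c
#include <stdbool.h>

/* Mask bits as defined at the start of this file. */
#define CWEBX         0x0001u /* rule is used normally */
#define COMPATIBILITY 0x0002u /* rule is used in compatibility mode */

/* Hypothetical helper (an assumption of this sketch): the first term of
   the initial value of rule_mask. */
static unsigned initial_rule_mask(bool compatibility_mode)
{ return compatibility_mode ? 0x0001u : 0x0002u; }

/* Hypothetical helper (an assumption of this sketch): a rule is kept
   only if none of the bits of its mask is set in rule_mask. */
static bool rule_is_loaded(unsigned mask, unsigned rule_mask)
{ return (mask & rule_mask) == 0; }
```

Under these assumptions a rule marked |cwebx|, such as rule~160 above, is kept
in normal mode and suppressed in compatibility mode, while a rule without any
mask is kept in both modes.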

@< Rules @>=
{160, {{mod_scrap}},			{statement, "_f"},cwebx},	@/
{161, {{short_lbrace, mod_scrap},1},	{statement, NULL},cwebx},	@/
{162, {{mod_scrap, magic}},		{declaration, "f__f"},cwebx},	@/
{163, {{lbrace, mod_scrap, magic},1},
			{declaration, "__f"},cwebx|standard_braces},	@/
{164, {{short_lbrace, mod_scrap, magic},1}, {declaration, NULL},cwebx},	@/
{165, {{short_struct_head, mod_scrap, magic},1},
					    {declaration,NULL},cwebx},	@/
{166, {{mod_scrap, magic, magic}},	{expression, NULL},cwebx},	@/
{167, {{lbrace, mod_scrap, magic, magic},1},
			{expression, NULL},cwebx|standard_braces},	@/
{168, {{short_lbrace, mod_scrap, magic, magic},1},
			{expression, NULL},cwebx},			@[@]

@ {\it Additional rules for compatibility mode}.
@^Levy/Knuth \.{CWEB}@>
Although our grammar differs completely from the one used in \LKC., we use
most of it also in compatibility mode (the exception is formed by the rules
concerning module names). We do add a few rules in compatibility mode, mostly
to deal with circumstances that are different for some reason or other.

We start with module names, which behave in a completely different way. In
compatibility mode, as in \LKC., a module name normally stands for an
expression (rule~164) and in practice is almost always followed by a visible
or invisible (|magic|) semicolon. Rules 160~and~161 treat these cases
explicitly, in order to insert a forced break after the semicolon; rule~161
for the case of an invisible semicolon is needed because if we were to wait for
the |magic| semicolon to decay to an ordinary one, it might instead combine
with an |int_like| token following it. Rules 162~and~163 are provided to allow
the short form of compound statements even in compatibility mode (even though
it is not present in \LKC.): they preempt rules 160~and~161, avoiding the
forced break. Since in compatibility mode one has no means of indicating that
a module name stands for a set of declarations, we add rule~165 to allow them
nevertheless to be used before a function definition.

Rules 170~and~171 compensate for the fact that compound assignment operators
like `|+=|' are scanned as two tokens in compatibility mode (see
section@#truly stupid@> for an explanation why this is done).
Rule~172 allows types to be used in the argument lists of macros, without
enclosing them between \:[~and~\:], in compatibility mode; this is done
frequently in the Stanford GraphBase. @^Stanford GraphBase@> It is sufficient
to remove expressions from the beginning of the argument list, since types,
and more generally types followed by declarators, are already removed by the
standard rules for |parameters|. As a result the argument list will either
reduce to an |expression| or to |parameters|, depending on whether the final
item was an expression. In both cases it will combine with the macro name to
form an |expression|, although the spacing will be a bit too wide in the
|parameters| case. But then, one ought to use \:[~and~\:] anyway, which avoids
this problem.
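
The effect of rule~172 on the category sequence of an argument list can be
sketched as follows. This is only an illustration in~\Cee, not the mechanism
used by the parser: the enumeration and the function
|strip_leading_expressions| are hypothetical, and the array is assumed to
start at the left parenthesis of the macro call.

```c
#include <stddef.h>

/* Hypothetical category codes, used for this illustration only. */
enum category { lpar, rpar, comma, expression, int_like };

/* Apply the pattern of rule 172 repeatedly at the start of the list:
   as long as the list begins with lpar expression comma, the expression
   and the comma are absorbed into the lpar. Returns the new length. */
static size_t strip_leading_expressions(enum category *s, size_t n)
{
  while (n >= 3 && s[0] == lpar && s[1] == expression && s[2] == comma)
  { for (size_t i = 3; i < n; ++i)
      s[i - 2] = s[i]; /* close the gap left by the two absorbed scraps */
    n -= 2;
  }
  return n;
}
```

A list of two expressions followed by a type is thus shortened to
|lpar|, |int_like|, |rpar|, which is left to the standard rules for
|parameters|; a list consisting of expressions only ends as |lpar|,
|expression|, |rpar|.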

@< Rules @>=
{160, {{mod_scrap, semi}}, 	{statement, "__f"},compatibility},	@/
{161, {{mod_scrap, magic}}, 	{statement, "__f"},compatibility},	@/
{162, {{short_lbrace, mod_scrap, semi},1}, 
				{statement, NULL},compatibility},	@/
{163, {{short_lbrace, mod_scrap, magic},1}, 
				{statement, NULL},compatibility},	@/
{164, {{mod_scrap}},        	{expression, NULL},compatibility},	@/
{165, {{statement, function}},	 {function, "_F_"},compatibility},	@/
@)
{170, {{binop, binop}},			{binop,"r__"},compatibility},	@/
{171, {{unorbinop, binop}},		{binop,"r__"},compatibility},	@/
{172, {{lpar, expression, comma}}, {lpar, "___p1"}, compatibility},	@[@]

@ {\it Additional rules for \Cpp}.
Up to this point we have included some specific rules for \Cpp, in places
where a slight deviation from the \Cee~syntax was required. There are however
a large number of syntactic possibilities of \Cpp\ that are not even remotely
similar to those of~\Cee, so it is most convenient to collect them in a
separate section. The author of \.{CWEBx} wishes to make it clear that he is
quite aware of the incompleteness of the set of rules specified below, and
that he assumes no responsibility for correcting this. One reason for this is
that he has no readable formal grammar of \Cpp, which possibly could be used
for validation (nor does he use \Cpp\ himself); another is that the pieces
of grammar that he has seen show so little coherence that he seriously doubts
whether it is possible at all to parse \Cpp\ reliably with a grammar of the
type implemented here. In fact, the rules here were merely added in an
attempt to cope with problems reported by users.

We start with rules for `\&{operator}', which are simple: it should combine
with a following operator symbol of any type to form an expression (rules
180--182). Then rules 183--186 take care of the `::'~operator: either a class
name or nothing is expected at the left, and either an ordinary or class
identifier at the right; the resulting category is that of the right hand
side. Type identifiers may appear as the left hand side of an assignment
within a list of formal parameters, indicating a default argument; in this
case the whole assignment should behave as a type identifier (rule~187).

Next we give rules catering for constructor declarations in class
definitions. First of all we must recognise the fact that the class name is
being used as a function name here; the simplest solution is to recognise the
combination of an |int_like| followed by a (possibly empty) parameter list
(rules 190~and~191). We cannot let a |function_head| (possibly created by the
rules just mentioned) decay to an |expression| when followed by a semicolon,
as we do for~\Cee, since declarations of constructor members of a class lack
an initial type specification, so the |expression| would fail to become part
of a |declaration|. Therefore, special measures are necessary: the simplest
solution is to simply absorb (rule~192) any preceding type specifier into the
|function_head| (thereby removing the distinction between its presence or
absence), and construct the |declaration| explicitly from the |function_head|
and the following semicolon (rule~193).
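
The interplay of rules 190, 192, and~193 can be followed in the toy reducer
below. It is a sketch in~\Cee\ under simplifying assumptions, not the parser
itself: the names |toy_rule|, |toy_rules| and |toy_reduce| are hypothetical,
masks and formats are ignored, and the leftmost matching pair is always
reduced first, which need not be the order used by the real parser.

```c
#include <stddef.h>

/* Hypothetical category codes, used for this illustration only. */
enum category { int_like, parameters, function_head, semi, declaration };

struct toy_rule { int id; enum category lhs[2]; enum category rhs; };

/* Left and right hand sides copied from rules 190, 192 and 193. */
static const struct toy_rule toy_rules[] = {
  { 190, { int_like, parameters },    function_head },
  { 192, { int_like, function_head }, function_head },
  { 193, { function_head, semi },     declaration }
};

enum { toy_rule_count = sizeof toy_rules / sizeof toy_rules[0] };

/* Reduce the leftmost matching pair until no rule applies any more;
   returns the length of the sequence that remains. */
static size_t toy_reduce(enum category *s, size_t n)
{
  size_t i = 0;
  while (i + 1 < n)
  { size_t r;
    for (r = 0; r < toy_rule_count; ++r)
      if (s[i] == toy_rules[r].lhs[0] && s[i + 1] == toy_rules[r].lhs[1])
        break;
    if (r == toy_rule_count) { ++i; continue; } /* no match at this position */
    s[i] = toy_rules[r].rhs; /* replace the pair by the right hand side */
    for (size_t j = i + 2; j < n; ++j)
      s[j - 1] = s[j];
    --n;
    i = 0; /* start again from the left */
  }
  return n;
}
```

In this sketch both |int_like|, |parameters|, |semi| (a constructor
declaration) and |int_like|, |int_like|, |parameters|, |semi| (a member
function with a type specifier) end as a single |declaration|.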

@< Rules @>=
{180, {{case_like, binop}},	{expression, "_o_"},only_plus_plus},	@/
{181, {{case_like, unorbinop}},	{expression, "_o_"},only_plus_plus},	@/
{182, {{case_like, unop}},	{expression, NULL},only_plus_plus},	@/
{183, {{int_like, colcol, expression}},
				{expression, NULL},only_plus_plus},	@/
{184, {{colcol, expression}},	{expression, "o__"},only_plus_plus},	@/
{185, {{int_like, colcol, int_like}},
				{int_like, NULL},only_plus_plus},	@/
{186, {{colcol, int_like}},	{int_like, "o__"},only_plus_plus},	@/
{187, {{int_like, binop, expression}},
				{int_like, NULL},only_plus_plus},	@/
{190, {{int_like, parameters}},	{function_head, "_B_"},only_plus_plus},	@/
{191, {{int_like, lpar,rpar}}, {function_head, "_B_,_"},only_plus_plus},@/
{192, {{int_like, function_head}},
				{function_head, "_ _"},only_plus_plus},	@/
{193, {{function_head, semi}},	{declaration, "!__"},only_plus_plus},	@[@]
