10.1 Syntactic Analysis

The first thing CC Mode does when indenting a line of code is to analyze the line by calling c-guess-basic-syntax, which determines the syntactic context of the (first) construct on that line. Although this function is mainly used internally, it can sometimes be useful in line-up functions (see Custom Line-Up Functions) or in functions on c-special-indent-hook (see Other Special Indentations).

Function: c-guess-basic-syntax

Determine the syntactic context of the current line.

The syntactic context is a list of syntactic elements, where each syntactic element in turn is a list (33). Here is a brief and typical example:

((defun-block-intro 1959))

The first entry in each syntactic element is always a syntactic symbol. It describes the kind of construct that was recognized, e.g. statement, substatement, class-open, class-close, etc. See Syntactic Symbols, for a complete list of currently recognized syntactic symbols and their semantics. The remaining entries are various data associated with the recognized construct; there may be zero or more of them.

Conceptually, a line of code is always indented relative to some position higher up in the buffer (typically the indentation of the previous line). That position is the anchor position in the syntactic element. If there is an entry after the syntactic symbol in the syntactic element list then it’s either nil or that anchor position.
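Because a syntactic element is an ordinary list, it can be taken apart with the standard list functions. The following is a minimal sketch using the example context shown earlier; the helper names my-elem-symbol and my-elem-anchor are hypothetical and are not part of CC Mode.

```elisp
;; Hypothetical helpers for illustration only; not part of CC Mode.
(defun my-elem-symbol (elem)
  "Return the syntactic symbol of the syntactic element ELEM."
  (car elem))

(defun my-elem-anchor (elem)
  "Return the entry after the syntactic symbol in ELEM.
This is the anchor position, or nil if ELEM has no such entry."
  (cadr elem))

(let* ((context '((defun-block-intro 1959)))  ; the example context
       (elem (car context)))                  ; its only syntactic element
  (list (my-elem-symbol elem)                 ; the symbol defun-block-intro
        (my-elem-anchor elem)))               ; the position 1959
```

Using cadr rather than nth or a destructuring form keeps the helper safe for elements that consist of the syntactic symbol alone, since cadr of a one-element list is simply nil.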

Here is an example. Suppose we had the following code as the only thing in a C++ buffer (34):

 1: void swap( int& a, int& b )
 2: {
 3:     int tmp = a;
 4:     a = b;
 5:     b = tmp;
 6: }

We can use C-c C-s (c-show-syntactic-information) to report what the syntactic analysis is for the current line:

C-c C-s (c-show-syntactic-information)

This command calculates the syntactic analysis of the current line and displays it in the echo area. The command also highlights the anchor position(s).

Running this command on line 4 of this example, we’d see in the echo area (35):

((statement 35))

and the ‘i’ of int on line 3 would be highlighted. This tells us that the line is a statement and it is indented relative to buffer position 35, the highlighted position. If you were to move point to line 3 and hit C-c C-s, you would see:

((defun-block-intro 29))

This indicates that the ‘int’ line is the first statement in a top level function block, and is indented relative to buffer position 29, which is the brace just after the function header.
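A raw buffer position such as 29 is easier to interpret when it is translated into a line number. The sketch below calls c-guess-basic-syntax directly and reports each syntactic element together with the line its anchor falls on. The command name my-c-describe-anchors is hypothetical, and wrapping the call in c-save-buffer-state is an assumption about how the function is best invoked from user code, made so that the analysis does not leave the buffer marked as modified.

```elisp
(require 'cc-mode)

;; Hypothetical command for illustration only; not part of CC Mode.
(defun my-c-describe-anchors ()
  "Report the syntactic elements of the current line in the echo area.
Each anchor position is shown together with the line it falls on."
  (interactive)
  ;; Assumption: c-save-buffer-state guards against buffer modification.
  (let ((context (c-save-buffer-state nil
                   (c-guess-basic-syntax))))
    (message "%s"
             (mapconcat
              (lambda (elem)
                (let ((anchor (cadr elem)))   ; nil when ELEM has no anchor
                  (if (integerp anchor)
                      (format "%s: anchor %d (line %d)"
                              (car elem) anchor
                              (line-number-at-pos anchor))
                    (format "%s: no anchor" (car elem)))))
              context "; "))))
```

With point on line 3 of the swap example in an otherwise empty, unnarrowed C++ buffer, this would be expected to report the defun-block-intro element with anchor 29 on line 2, the line holding the opening brace.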

Here’s another example:

 1: int add( int val, int incr, int doit )
 2: {
 3:     if( doit )
 4:         {
 5:             return( val + incr );
 6:         }
 7:     return( val );
 8: }

Hitting C-c C-s on line 4 gives us:

((substatement-open 46))

which tells us that this is a brace that opens a substatement block (36), and that it is indented relative to buffer position 46, the ‘i’ of if on line 3.

Syntactic contexts can contain more than one element, and syntactic elements need not have anchor positions. The most common example of this is a comment-only line:

 1: void draw_list( List<Drawables>& drawables )
 2: {
 3:         // call the virtual draw() method on each element in list
 4:     for( int i=0; i < drawables.count(); ++i )
 5:     {
 6:         drawables[i].draw();
 7:     }
 8: }

Hitting C-c C-s on line 3 of this example gives:

((comment-intro) (defun-block-intro 46))

and you can see that the syntactic context contains two syntactic elements. Notice that the first element, ‘(comment-intro)’, has no anchor position.
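Code that walks a syntactic context therefore has to cope with elements that carry no anchor. The following minimal sketch separates the syntactic symbols from the anchor positions for the context shown above; the function name my-context-anchors is hypothetical and is not part of CC Mode.

```elisp
;; Hypothetical helper for illustration only; not part of CC Mode.
(defun my-context-anchors (context)
  "Return the anchor positions in CONTEXT, skipping elements without one."
  ;; mapcar builds a fresh list, so the destructive delq is safe here.
  (delq nil (mapcar #'cadr context)))

(let ((context '((comment-intro) (defun-block-intro 46))))
  (list (mapcar #'car context)          ; => (comment-intro defun-block-intro)
        (my-context-anchors context)))  ; => (46)
```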


Footnotes

(33)

In CC Mode 5.28 and earlier, a syntactic element was a dotted pair; the car was the syntactic symbol and the cdr was the anchor position. For compatibility’s sake, the parameter passed to a line-up function still has this dotted pair form (see Custom Line-Up Functions).

(34)

The line numbers in this and future examples don’t actually appear in the buffer, of course!

(35)

With a universal argument (i.e. C-u C-c C-s) the analysis is inserted into the buffer as a comment on the current line.

(36)

A substatement is the line after a conditional statement, such as if, else, while, do, switch, etc. A substatement block is a brace block following one of these conditional statements.