Coding style and Documentation

Coding Style

A good and consistent coding style is essential for multi-platform collaborative coding efforts. Changes to files that affect only style, perhaps because of the use of different editors, add noise to the system and hinder us from looking at the essentials of our development. Most changes are incremental, and incremental changes come, as nature and svn give them to us, as unified diffs. In these, we only want to see the real code change and not that you added a blank to 25000 lines.

So if everybody sticks to some basic rules from the beginning, such marginal changes are much less likely and we will more easily see what is essential.

The style proposed in the following is much inspired by the Linux CodingStyle (LCS), but adapted to the needs of C++ and of multi-platform development.


Indentation

Always indent code properly and systematically. Since getting tabs to look the same on different systems is almost impossible, we don't use them for parXXL (except, obviously, in makefiles). Please use an appropriate number of blanks instead. If your editor is not able to do that for you, it is time for a change.

Opening and closing braces

Put the opening brace of any type of construct at the end of the line of the statement or declaration that defines it. Let any closing brace start a new line, and only continue on the same line after the brace if what follows is a continuation of the statement to which the block belongs. Example:

 void foo(int& a) {
   if (a) {
     printf("a = %d", a);
   } else a = 23;
 }

This makes it easier to capture where a block ends.
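
As a further sketch of this rule, the following two functions show which constructs continue on the line of a closing brace, namely else and the while of a do loop. The functions sum_up_to and halving_steps are made up for illustration and are not part of parXXL.

```cpp
// Hypothetical helper: sums 1..n. The braces of the function and of the
// for loop sit at the end of the line that opens the construct.
int sum_up_to(int n) {
  int sum = 0;
  for (int i = 1; i <= n; ++i) {
    sum += i;
  }
  return sum;
}

// Hypothetical helper: counts how often n can be halved before it is 0.
int halving_steps(int n) {
  int steps = 0;
  if (n <= 0) {
    return 0;
  } else {
    do {
      n /= 2;
      ++steps;
    } while (n > 0);
  }
  return steps;
}
```

Every closing brace starts its line, so the end of each block can be found by scanning the left margin only.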

Breaking lines and line breaks

Please watch that you don't have long lines. 80 characters maximum is a good rule of thumb. There are still people out there who develop remotely over ssh in a text terminal.

Line breaks by themselves should not be an issue. Yes, there are still three different conventions around that encode the end of a line. Use the one that is usual on your system. Svn should be able to notice that your file is a text file. It will then store your file in the repository independently of your particular EOL convention and present the file on every system in an appropriate way. Trust it so it will trust you.


Functions

The corresponding chapter of LCS directly applies:
    Functions should be short and sweet, and do just one thing.  They should
    fit on one or two screenfuls of text (the ISO/ANSI screen size is 80x24,
    as we all know), and do one thing and do that well.

    The maximum length of a function is inversely proportional to the
    complexity and indentation level of that function.  So, if you have a
    conceptually simple function that is just one long (but simple)
    case-statement, where you have to do lots of small things for a lot of
    different cases, it's OK to have a longer function.

    However, if you have a complex function, and you suspect that a
    less-than-gifted first-year high-school student might not even
    understand what the function is all about, you should adhere to the
    maximum limits all the more closely.  Use helper functions with
    descriptive names (you can ask the compiler to in-line them if you think
    it's performance-critical, and it will probably do a better job of it
    than you would have done).

    Another measure of the function is the number of local variables.  They
    shouldn't exceed 5-10, or you're doing something wrong.  Re-think the
    function, and split it into smaller pieces.  A human brain can
    generally easily keep track of about 7 different things, anything more
    and it gets confused.  You know you're brilliant, but maybe you'd like
    to understand what you did 2 weeks from now.

Declarations with derived types

Think of (and use) all general type modifiers (like `*', `&' or const) as being on the right of the type they modify, and glue them as far to the left as possible, i.e. directly onto that type. Separate all such type information with a blank from the declared variables.


 type* var;
 type& var;
 type*const var;
 type const& var;
 type const*const var;

This way the variable names are easier to read (by human eyes); it is the variables that we are going to use, not their type names.

const is special in that the standard allows it to precede the type it modifies, but this leads to unfortunate misunderstandings:

 typedef int* int_p;
 const int_p a;
 const int* b;

Here a and b are of different types; namely, a could equivalently be declared as

 int*const a;

So b is a pointer that may be modified but that always points to a location that may not be modified through it, whereas a is a pointer that cannot be modified but whose target location may be modified.

 // legal:
 *a = 34;
 b = a;
 // illegal:
 *b = 41;
 a = b;
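
The difference between a and b can be checked by the compiler itself. The following is a minimal sketch that assumes a C++11 compiler (for static_assert and the header type_traits, neither of which the text above relies on). The function const_demo and the variables x and y are made up for illustration.

```cpp
#include <type_traits>

typedef int* int_p;

// `const int_p' makes the pointer itself const, not its target.
static_assert(std::is_same< const int_p, int*const >::value,
              "const int_p is int*const");
// `const int*' is just another spelling of `int const*'.
static_assert(std::is_same< const int*, int const* >::value,
              "const int* is int const*");
static_assert(!std::is_same< const int_p, const int* >::value,
              "a and b have different types");

// Hypothetical demo: returns the value that was written through a.
int const_demo() {
  int x = 0;
  int y = 7;
  int*const a = &x;   // same type as `const int_p a'
  int const* b = &y;  // same type as `const int* b'
  *a = 34;            // legal: the target of a may be modified
  b = a;              // legal: b itself may be modified
  // *b = 41;         // illegal: the target of b is const
  // a = b;           // illegal: a itself is const
  return *b;          // b now points to x
}
```

With the modifiers written on the right, the declarations read consistently from right to left: a is a const pointer to int, b is a pointer to const int.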


Templates

Template classes and template functions are a powerful extension of C++ over C.

We code templates as in the following example:

 template< class T >
 class toto;

 template< int len >
 class tutu;

 toto< tutu< 3 > > A;

Observe that we put a blank after each opening `<' and before each closing `>'.

Unfortunately the syntax for templates is a crying shame, so be very careful with it. The < and > characters have quite different syntactical meanings for the different parts of a C++ compiler system. For the preprocessor they are just operators (like + or - or =). For the compiler itself they may be operators, too, but in the context of templates they act as opening and closing parentheses. There are two particular dangers:

`>>' clash

If we are not careful with nested templates we may produce invalid code such as the following:

 toto< tutu< 3 >> A;

Under the C++98/03 rules the >> here is interpreted as the `right shift' operator, and thus this code causes a compile-time error. But this would not be fun if we were not also able to produce valid but quite ambiguous code:

 template< int len >
 int fun(int x);
 typedef int (*fun_t)(int);
 template< fun_t f >
 int fon(int x);
 void total(void) {
   int A = fon< fun< 9 > >(1) >>(2);
   int B = fon< fun< 9 >>(1) > >(2);
 }

Here A and B do not have much in common. For A we take the function fon that depends on function pointer fun<9> and call that with argument 1. The result is then shifted by 2 to the right.

In contrast, for B we call the function fon that depends on function pointer fun<4>, since the template argument is now the expression 9 >>(1), and pass it the argument 2. So just because some blanks are spread differently the result can be completely different. Have you ever seen the diff message ``only whitespace differences in ...''?
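
To make the two readings concrete, here is a sketch with function bodies filled in. The bodies of fun and fon are assumptions (the text only declares them), and total_A and total_B are made-up names. Since C++11 a >> inside a nested template argument list is treated as two closing brackets, so the line for B is no longer read this way by newer compilers; the sketch therefore spells out its C++98/03 meaning with explicit parentheses.

```cpp
// Assumed bodies; the text above only declares fun and fon.
template< int len >
int fun(int x) { return len * x; }

typedef int (*fun_t)(int);

template< fun_t f >
int fon(int x) { return 100 * f(x); }

// The reading of A: call fon< fun< 9 > > with 1, then shift right by 2.
int total_A() {
  return (fon< fun< 9 > >(1)) >> 2;
}

// The C++98/03 reading of B: the template argument is 9 >> 1, i.e. 4,
// and the function argument is 2.
int total_B() {
  return fon< fun< (9 >> 1) > >(2);
}
```

With these assumed bodies total_A() evaluates (100 * 9 * 1) >> 2 and total_B() evaluates 100 * 4 * 2, so the two results differ although the original lines differ only in blanks.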

Argument mix up for macros

Templates that receive two arguments are dangerous when we have to pass expressions formed with them to a macro.

 #define hei(x) ((x) * 5)

 template< class T, class S >
 struct blub {
   static int const prod = sizeof(T) * sizeof(S);
 };

 int A[ hei(blub< float, double >::prod) ];

Here the call to hei looks completely harmless, but it generates a compile-time error: the macro preprocessor sees two arguments, blub< float and double >::prod, that are separated by a comma, but it is only willing to accept one.

The best way to avoid this is to use inline functions. They have the additional advantage of being type safe. In most cases a template function definition would suffice and be equally efficient:

 template< typename T >
 T hei(T const x) { return (x * 5); }

But clearly in the above case that would not do, since we need something that evaluates at compile time. Another, more obscure, way to do this in C++ is to define a wrapper class

 template< int x >
 struct hei {
    enum { val = (x * 5) };
 };

which then could be used as hei< x >::val.

So in the above example it would probably be as good (or bad) to define a `variadic' macro, i.e. one that may receive any number of arguments:

 #define hei(...) ((__VA_ARGS__) * 5)

Use parentheses around all subexpressions to avoid ambiguity and operator clashes. Or, even better, write type safe macros such as

 #define hei(...) (static_cast< uint_t >(static_cast< uint_t >(__VA_ARGS__) * 5u))
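
The following sketch puts the two compile-time alternatives side by side as array dimensions. The names hei_c, hei_va, len_A and len_B are made up here to keep the variants apart, and the variadic macro assumes a preprocessor that supports __VA_ARGS__ (C99 or C++11).

```cpp
#include <cstddef>

template< class T, class S >
struct blub {
  static int const prod = sizeof(T) * sizeof(S);
};

// Wrapper class: the multiplication is evaluated at compile time.
template< int x >
struct hei_c {
  enum { val = (x * 5) };
};

// Variadic macro: commas inside the argument no longer matter.
#define hei_va(...) ((__VA_ARGS__) * 5)

int A[ hei_c< blub< float, double >::prod >::val ];
int B[ hei_va(blub< float, double >::prod) ];

// Number of elements of each array, for inspection.
std::size_t len_A() { return sizeof(A) / sizeof(A[0]); }
std::size_t len_B() { return sizeof(B) / sizeof(B[0]); }
```

Both arrays end up with the same number of elements; the wrapper class never involves the preprocessor in the comma at all, whereas the variadic macro merely tolerates it.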

Commenting code

For documenting programming interfaces, see Documentation.

Well structured and well named code does not need much of a comment, but don't think that the converse of that statement is true. As the LCS already states:

    Comments are good, but there is also a danger of over-commenting.
    Never try to explain how your code works in a comment: it's
    much better to write the code so that the working is obvious,
    and it's a waste of time to explain badly written code.

    Generally, you want your comments to tell what your code does, not how.
    Also, try to avoid putting comments inside a function body: if the
    function is so complex that you need to separately comment parts of it,
    you should probably go back to Functions for a while.  You can make
    small comments to note or warn about something particularly clever (or
    ugly), but try to avoid excess.  Instead, put the comments at the head
    of the function, telling people what it does, and possibly WHY it does
    it.


Documentation

We document all interfaces with doxygen. All classes should be described in their functionality, as well as all their public members.

You should find a lot of examples in the code.
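
As a minimal sketch of what such documentation can look like, consider the following class. The class counter and its members are hypothetical and not part of parXXL; only the standard doxygen commands \brief, \param, \return, \a and \c are used.

```cpp
/// \brief Counts events up to a fixed limit.
///
/// A hypothetical class that only serves to illustrate doxygen comments.
class counter {
public:
  /// \brief Creates a counter that saturates at \a limit.
  /// \param limit the largest value the counter may reach
  explicit counter(unsigned limit) : count_(0), limit_(limit) {}

  /// \brief Registers one event.
  /// \return \c true if the event was counted, \c false if the limit
  ///         had already been reached
  bool tick() {
    if (count_ >= limit_) return false;
    ++count_;
    return true;
  }

  /// \brief Returns the number of events counted so far.
  unsigned value() const { return count_; }

private:
  unsigned count_;
  unsigned limit_;
};
```

The class itself and every public member carry a description, while the private members are left to speak through their names.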

Generated on Tue Oct 13 22:03:46 2009 for parXXL by  doxygen 1.5.8