Code coverage analysis

This chapter describes the code analysis done by Coco in detail.

In order to analyze the code coverage of e.g. a test suite, it is necessary to compile a version of the application in which statements are inserted that record the execution of each part of the source code. The generation of such a modified version of a program is called instrumentation.

Coverage metrics supported by Coco

Several types of code coverage are possible with Coco. The following table summarizes the most common coverage metrics.

Metric | Description
Statement block coverage | Verify that all statements are executed, by grouping the statements of the program into blocks. Statements belong to the same block if they are always executed together. Statement blocks roughly correspond to the blocks of most programming languages. The coverage of a program is the number of executed statement blocks in it, divided by the total number of blocks.
Decision (or branch) coverage | Verify that all statements are executed and all decisions have all possible results. The coverage of a program is the number of executed statement blocks and decisions divided by the total number of statement blocks and decisions. Here, each decision counts twice, once for the true case and once for the false case.
Condition coverage | Like decision coverage, except that the decisions are split into elementary subexpressions (or conditions) that are connected by "and" or "or" operators. The coverage of a program is the number of executed statement blocks and conditions divided by the total number of them. Here each condition counts twice, which may result in a large number of possible outcomes in a complex decision.
Modified condition/decision coverage (MC/DC) | Like condition coverage, but every condition in a decision must be tested independently to reach full coverage. For each condition there must be two executions that differ only in the results of the condition, while the other conditions have the same result. In addition, it needs to be shown that each condition independently affects the decision. The coverage of a program is the number of executed statement blocks and of conditions that were tested independently divided by the number of statement blocks and conditions in the program.
Multiple condition coverage | All statements must be executed and all combinations of truth values in each decision must occur at least once to reach full coverage. The coverage of a program is the number of executed statement blocks and condition combinations divided by their total number in the program.

Other, simpler coverage metrics are also supported.

Metric | Description
Function coverage | Count which functions were called and how often. As always with Coco, functions also include the member functions of objects. The coverage of a program is then the number of functions that were called at least once, divided by the total number of functions. Relevance for safety standards: IEC 61508 highly recommends 100% structural test coverage of entry points (i.e. functions) for SIL 1, 2, 3 and 4.
Line coverage | Count the number of executed code lines and how often each one of them was executed. Only lines that contain executable statements are instrumented, not those with pure declarations. The coverage of a program is then the number of executed lines divided by the number of instrumented lines. Line coverage is an unstable coverage measurement since it depends strongly on the way the code is formatted (see Problems with line coverage). We do not recommend line coverage.

Coverage metrics and safety standards

In many safety standards, specific coverage levels are required:

  • IEC 61508 is a standard for the functional safety of electrical and electronic programmable systems.
  • ISO 26262 is a standard for the functional safety of electrical and electronic systems in cars.
  • EN 50128 is a standard for safety-relevant software in railway systems.
  • DO 178 is a safety standard for commercial software-based aerospace systems.

The following table shows which coverage levels are required by these safety standards.

Metric | IEC 61508 | ISO 26262 | EN 50128 | DO 178
Function coverage (entry points)
Statement block coverage
Decision coverage (branch coverage)
Modified condition/decision coverage
Multiple condition coverage

Depending on the safety level, a coverage metric is either recommended, highly recommended or required. More detailed information can be found at the end of the descriptions of the coverage metrics in the following sections.

Description of the coverage metrics

In this section we describe the coverage metrics supported by Coco in more detail. The code that is inserted during the instrumentation process is described in more detail in Code insertion.

In the next sections, we will use the following function to illustrate the coverage metrics and the instrumentation process. It is written in C++; coverage with other languages is similar.

void foo()
{
  bool found=false;
  for (int i=0; (i<100) && (!found); ++i) {
    if (i==50)
      break;
    if (i==20)
      found=true;
    if (i==30)
      found=true;
  }
  printf("foo\n");
}

Statement block coverage

The most elementary kind of instrumentation records the statements in a program that are executed when it runs. It is however not necessary to record the execution of every statement to get this information. If several statements form a sequence, it is enough to record how often the last statement is executed, since they all form a block that is either executed as a whole or not at all. Coco therefore inserts instrumentation statements only at the end of each block, and the resulting coverage metric is called statement block coverage.

In the following listing, the instrumented statements are marked with comments:


void foo()
{
  bool found=false; // not instrumented
  for (int i=0; (i<100) && (!found); ++i) {
    if (i==50)
      break;        // instrumented
    if (i==20)
      found=true;   // instrumented
    if (i==30)
      found=true;   // instrumented
  }                 // instrumented
  printf("foo\n");  // not instrumented
}                   // instrumented
    

One can see that not all lines are instrumented. The conditions of the if and for statements are not covered since they are not statements themselves. Some other statements need not be instrumented because they are part of a block. In the example, these are the statements at lines 3 and 12: they belong to the block that begins at line 2 and ends with line 13. It is enough to put an instrumentation point at line 13 to verify that all of them were executed.

Relevance for safety standards

ISO 26262 highly recommends statement coverage for ASIL A and B. For ASIL C and D it is recommended as well, but the stricter branch coverage and modified condition/decision coverage are highly recommended instead.

Decision coverage

A more detailed coverage metric also records the values of the Boolean conditions in branch and loop statements, like if, while, for, etc. To reach full coverage, the decision in such a construct must have evaluated to true and to false at least once. This kind of code coverage is called decision coverage or sometimes branch coverage.

Decision coverage also handles switch statements. In them, full coverage is reached if every case label in the code was reached at least once during the execution of the program.

Decision coverage also includes the coverage of statements, as in statement block coverage. In the following listing, the decisions and statements instrumented for decision coverage are marked with comments. The instrumented statements are the same as in the previous listing.


void foo()
{
  bool found=false;
  for (int i=0; (i<100) && (!found); ++i) { // decision instrumented
    if (i==50)      // decision instrumented
      break;        // statement instrumented
    if (i==20)      // decision instrumented
      found=true;   // statement instrumented
    if (i==30)      // decision instrumented
      found=true;   // statement instrumented
  }                 // statement instrumented
  printf("foo\n");
}                   // statement instrumented
    

Relevance for safety standards

  • ISO 26262 recommends branch coverage for ASIL A and highly recommends the method for ASIL B, C and D.
  • EN 50128 recommends branch coverage for SIL 1 and 2. For SIL 3 and 4 this level is even highly recommended.
  • DO 178 mandates that decision coverage should be satisfied with independence for Software Level A and B.
  • IEC 61508 recommends branch coverage for SIL 1 and 2 and highly recommends this level for SIL 3 and 4.

Condition coverage

To analyze the Boolean decisions in the if, while, for, and similar statements in greater detail, use condition coverage.

In this coverage metric, each decision is decomposed into simpler expressions (or conditions) that are connected by Boolean operators like !, || and &&. For full coverage of the decision, each of the conditions must evaluate to true and to false when the program is executed.

Note that || and && are short-circuit operators, and therefore in a complex decision, not all conditions are always evaluated. An example is the code fragment if (a || b) return 0; where the conditions are the variables a and b. To get them both to evaluate to true and false, one has to run the code fragment once with a == true and b arbitrary (since it is not evaluated), and also twice with a == false: in one run b is set to true, in the other run it is set to false.

This is in contrast to the situation with decision coverage, where it is enough for full coverage that the whole decision, a || b, evaluates to true and to false.

With condition coverage, our example function is instrumented in the following way. The instrumented Boolean expressions are named in comments:


void foo()
{
  bool found=false;
  for (int i=0; (i<100) && (!found); ++i) { // instrumented: (i<100), (!found)
    if (i==50)      // instrumented: (i==50)
      break;
    if (i==20)      // instrumented: (i==20)
      found=true;
    if (i==30)      // instrumented: (i==30)
      found=true;
  }
  printf("foo\n");
}
    

We can see here that the decision of the for statement has been split into two separately instrumented conditions.

Instrumentation of assignments

CoverageScanner also instruments the assignment of Boolean expressions to a variable. Constant or static expressions are not instrumented because their values are computed at compile time or only once during program initialization, so full coverage cannot be reached anyway.

In the following listing, the lines with instrumented Boolean expressions are marked with comments:


void foo()
{
  static bool a = x && y;
  bool b = x && y;        // instrumented
  bool c = b;
  int d = b ? 0 : 1;      // instrumented
}
    

We see that the statements at lines 3 and 5 are not instrumented: the first is a static initialization that is evaluated at most once, and the second does not involve a Boolean operation.

Potential problems with Boolean overloading

Since Coco works on a syntactic level, it cannot distinguish between expressions that involve only Boolean operators and those with other data types that support Boolean operators – all of them are instrumented.

The instrumentation tries to have no effect on the program, but sometimes this is not possible. One example is older versions of C#, which need to be instrumented with the option --cs-no-csharp-dynamic. The instrumented code then first converts the operands of a Boolean operator like || and && to Boolean before the operator is applied. Short-circuit evaluation still works, of course. This is different from the way in which an uninstrumented program would do it.

In C# programs that are compiled with Microsoft® Visual Studio® versions before 2010, it is therefore necessary that both cast operators true and false are defined for the objects that are arguments of Boolean operators. For newer C# versions, the default settings of the CoverageScanner can be used and the instrumented code does not have this problem.

Multiple condition coverage

The properties of multiple condition coverage and MC/DC become visible only when a program contains a complex decision with many conditions. Our previous example therefore cannot be used anymore. Instead, we will use the following program, which replaces some letters in a string with underscores: those between a and e (inclusive) in the alphabet, and those that follow the full stop at the end of the sentence. There is no character after the full stop in the example text, but that is on purpose, as we need some conditions that fail.

#include <stdio.h>

int main()
{
    char text[] = "The quick brown dog jumps over the lazy fox.";

    for (char *p = text; *p; p++) {
        if ((*p >= 'a' && *p <= 'e') || (p != text && *(p - 1) == '.'))
            *p = '_';
    }

    puts(text);
    return 0;
}

We will concentrate on the complex condition in line 8. When the coverage of this program is shown in the CoverageBrowser, you can move the mouse over that line and see the following table, which describes the coverage data. Or you can click the line and see a similar table in the Explanation window.

*p >= 'a' | *p <= 'e' | p != text | *(p - 1) == '.' | Description
TRUE      | FALSE     | TRUE      | TRUE            | never executed
FALSE     |           | TRUE      | TRUE            | never executed
TRUE      | FALSE     | TRUE      | FALSE           | executed 27 times by 1 test
FALSE     |           | TRUE      | FALSE           | executed 9 times by 1 test
TRUE      | FALSE     | FALSE     |                 | never executed
FALSE     |           | FALSE     |                 | executed 1 time by 1 test
TRUE      | TRUE      |           |                 | executed 7 times by 1 test

In this table, each line contains a combination of condition results. Each of the first four columns contains the result of a single condition. Since C uses short-circuit operators, not all conditions are evaluated in every row; the field of a condition that was not evaluated is empty. Rows that were executed have a green background, rows that were not have a red background.

From the current table we see that three combinations of conditions were not executed and are missing from full condition coverage. To execute e.g. the condition combination in the second red line, it would be necessary to make the condition *p >= 'a' false and the conditions p != text and *(p - 1) == '.' true. To accomplish this, one could put a capital letter (for the first condition) behind the full stop (for the two others) in text.

Relevance for safety standards

EN 50128 recommends MCC (or modified condition/decision coverage) for SIL 1 and 2. For SIL 3 and 4 this level is even highly recommended.

MC/DC

Modified condition/decision coverage (MC/DC) is a variant of multiple condition coverage that requires fewer tests. Its goal is to make sure that for each condition in a complex decision there are two executions that differ only in the result of that condition.

The condition table of MC/DC is a more complex version of the table for multiple condition coverage:

*p >= 'a' | *p <= 'e' | p != text | *(p - 1) == '.' | Decision | Description
TRUE      | FALSE     | TRUE      | TRUE            | TRUE     | never executed
FALSE     |           | TRUE      | TRUE            | TRUE     | never executed
TRUE      | FALSE     | TRUE      | FALSE           | FALSE    | executed 27 times by 1 test
FALSE     |           | TRUE      | FALSE           | FALSE    | executed 9 times by 1 test
FALSE     |           | FALSE     |                 | FALSE    | executed 1 time by 1 test
TRUE      | TRUE      |           |                 | TRUE     | executed 7 times by 1 test
TRUE      | FALSE     | FALSE     |                 | FALSE    | never executed

Its large-scale structure is the same as with multiple condition coverage, but there are some new features:

  • The column headers for the conditions and the decision have a colored background. If a condition has a green background, the MC/DC conditions for this condition are already fulfilled, otherwise it is red.
  • The red rows contain combinations of truth values that did not occur when the program was executed. Executing one of them would increase the code coverage.
  • The light green rows contain combinations of truth values that did occur in the execution of the program. If they contribute to the coverage of a condition, they contain fields with a dark green background.

    To see which rows contributed to the coverage of a covered condition, search in the column that belongs to this condition for dark green fields. Any two rows with a dark green field contribute together to the coverage of the condition if one of the dark green fields has a TRUE entry and the other one a FALSE entry.

    In the example, we see that the first condition, *p >= 'a', is covered in two different ways: by the second green row together with the fourth green row, and also by the third and fourth green row together. The second condition, *p <= 'e', is covered by the first and fourth green row together. One can also see that these two rows only differ in the second column and in the decision; the other columns contain the same results or at least one empty field.

  • The light red rows contain combinations of truth values that were not executed either, but executing just one of them will not increase the code coverage.

You may have noted that the table is sorted differently from the one for multiple condition coverage. First, there are the red rows, then the green rows and then the light red ones. This is always the case.

Relevance for safety standards

  • ISO 26262 recommends MC/DC for ASIL A, B, C and highly recommends this method for ASIL D.
  • EN 50128 recommends MC/DC (or Multiple Condition Coverage) for SIL 1 and 2. For SIL 3 and 4 this level is even highly recommended.
  • DO 178 mandates that MC/DC should be satisfied with independence for Software Level A.
  • IEC 61508 recommends MC/DC for SIL 1, 2 and 3 and highly recommends this level for SIL 4.

Display of the results

The CoverageBrowser is a graphical user interface program to display the analyzed results of the instrumentation. It uses a color coding to indicate the status of the statements. In this manual we use the same colors as in the program.

There are two kinds of statements in an instrumented program. Some contain an instrumentation point, i.e. a piece of code inserted by Coco which increments a counter when it is executed. If a line contains an instrumentation point, it is shown on a dark-colored background by the CoverageBrowser and in the HTML reports.

For reasons of efficiency, not all statements contain instrumentation points. If a line does not contain an instrumentation point but its coverage status can be inferred from other statements, it is shown on a light-colored background.

The resulting color scheme is then:

  • Lines in dark green and in light green have been executed.
  • Lines in dark red and in light red have not been executed.
  • Lines in orange contain Boolean expressions that have been partially executed. These expressions occur in general either as part of control structures like while and if or as the right side of an assignment to a Boolean variable.

    Boolean expressions are always instrumented, therefore a light orange background is not necessary.

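The output for our example function illustrates this. The color of each line is given in a comment:


void foo()
{
  bool found=false;                           // light green
  for (int i=0; (i < 100) && (!found); ++i)   // orange
  {                                           // light green
    if (i==50)                                // orange
      break;                                  // dark red
    if (i==20)                                // dark green
      found=true;                             // dark green
    if (i==30)                                // orange
      found=true;                             // dark red
  }                                           // dark green
  printf("foo\n");                            // light green
}                                             // dark green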
    

The for statement in line 4 is shown in orange since it contains the expression i<100, which is only partially executed: it always takes the value true when the function is executed. The expressions i==50 and i==30 were also partially executed, and we can see from the way the statements break; and found=true; are colored that in both cases the condition evaluates to false. In CoverageBrowser, the value of the condition is also shown in a tooltip.

Lines 3, 5, and 13 are not directly included in the coverage measurements and are therefore shown on a light background. Their coverage states are inferred by Coco from other statements that were (or were not) executed later. In our example, lines 3 and 13 must have been executed because the closing brace in line 14 has been executed, and line 5 has been executed because of line 12. Therefore all the lines with light background are shown in green.

Performance

The insertion of code during the instrumentation increases the code size and also affects performance of the instrumented application. It will use more memory and run slower.

For non-conditional expressions, the instrumentation code is only a write instruction to a counter at a fixed memory location. However, for conditional expressions, more detailed analysis is needed and this is more computationally expensive.

Overall, an instrumented application will be 60% to 90% larger and will run 10% to 30% slower.

Note: Detailed measurements are available in Code coverage benchmarks.

Statistics

Some developers write more lines of code than others simply by using a particular coding style, for example by putting opening braces on a line of their own rather than on the same line as an if statement. By default, Coco uses a coverage metric that is not susceptible to such differences in coding style. Its calculations are based on the number of executed instrumented instructions compared with the total number of instrumented instructions.

Every instrumented simple statement (like return, break, the last instruction of a function, etc.) is recorded by a single instrumentation counter. A fully instrumented condition in an IF...THEN...ENDIF block has two instrumentation counters: one for the true case and one for the false case. If the code is only partially instrumented, only one outcome is recorded (either the false case or the true case).

The statistics themselves depend on the type of instrumentation. It is in general not possible to compare code coverage at statement block level with that at decision level: having 80% coverage at decision level does not tell us anything about the coverage at statement block level, which could be higher or lower. We can only be sure that reaching 100% coverage is more difficult with condition coverage than with decision or statement block coverage.

In our example code, the coverage is between 60% and 77%. The following table shows the details for each instrumentation type.

Code coverage type | Instrumented | Executed | Coverage
Statement block (see statement block coverage example) | 5 | 3 | 60%
Decision full (see full decision coverage example) | 13 | 9 | 69%
Decision partial (see partial decision coverage example) | 9 | 7 | 77%
Condition full (see full condition coverage example) | 15 | 10 | 66%
Condition partial (see partial condition coverage example) | 12 | 9 | 75%

In the examples below, the details of the calculations are shown as annotations after the instrumented code. The first number in an annotation shows how many instrumented statements were executed; the second is the number of instrumented statements in total.

Statement block coverage example:


void foo()
{
  bool found=false;
  for (int i=0; (i < 100) && (!found); ++i)
  {
    if (i==50)
      break;        // 0/1 [not executed]
    if (i==20)
      found=true;   // 1/1 [executed]
    if (i==30)
      found=true;   // 0/1 [not executed]
  }                 // 1/1 [executed]
  printf("foo\n");
}                   // 1/1 [executed]
    

Full decision coverage example:


void foo()
{
  bool found=false;
  for (int i=0; (i < 100) && (!found) /* 2/2 [was false and true] */; ++i)
  {
    if (i==50)      // 1/2 [was false but not true]
      break;        // 0/1 [not executed]
    if (i==20)      // 2/2 [was false and true]
      found=true;   // 1/1 [executed]
    if (i==30)      // 1/2 [was false but not true]
      found=true;   // 0/1 [not executed]
  }                 // 1/1 [executed]
  printf("foo\n");
}                   // 1/1 [executed]
    

Partial decision coverage example:


void foo()
{
  bool found=false;
  for (int i=0; (i < 100) && (!found) /* 1/1 [was false] */; ++i)
  {
    if (i==50)      // 1/1 [was false]
      break;        // 0/1 [not executed]
    if (i==20)      // 1/1 [was false]
      found=true;   // 1/1 [executed]
    if (i==30)      // 1/1 [was false]
      found=true;   // 0/1 [not executed]
  }                 // 1/1 [executed]
  printf("foo\n");
}                   // 1/1 [executed]
    

Full condition coverage example:


void foo()
{
  bool found=false;
  for (int i=0; (i < 100) /* 1/2 [was true but not false] */ && (!found) /* 2/2 [was false and true] */; ++i)
  {
    if (i==50)      // 1/2 [was false but not true]
      break;        // 0/1 [not executed]
    if (i==20)      // 2/2 [was false and true]
      found=true;   // 1/1 [executed]
    if (i==30)      // 1/2 [was false but not true]
      found=true;   // 0/1 [not executed]
  }                 // 1/1 [executed]
  printf("foo\n");
}                   // 1/1 [executed]
    

Partial condition coverage example:


void foo()
{
  bool found=false;
  for (int i=0; (i < 100) /* 1/2 [was true but not false] */ && (!found) /* 2/2 [was false and true] */; ++i)
  {
    if (i==50)      // 1/1 [was false]
      break;        // 0/1 [not executed]
    if (i==20)      // 1/1 [was false]
      found=true;   // 1/1 [executed]
    if (i==30)      // 1/1 [was false]
      found=true;   // 0/1 [not executed]
  }                 // 1/1 [executed]
  printf("foo\n");
}                   // 1/1 [executed]
    

Problems with line coverage

Line coverage is a natural metric which allows you to see which lines of code are executed, but it is less accurate than instrumentation at statement block level and its results rely on the developer's coding style.

The following example illustrates this problem:

int main()
{
  if (true) return 1;
  foo();
  return 0;
}

Executing it would produce the following result:


int main()
{
  if (true) return 1; // Executed
  foo();              // Not executed
  return 0;           // Not executed
}
    

This execution corresponds to a coverage of 33%.

Since the first line of main contains two executed statements, splitting it in two increases the number of executed lines, and so the test coverage. So, if we reformat the main function as follows:

int main()
{
  if (true)
    return 1;
  foo();
  return 0;
}

the execution of this code results in a code coverage of 50%, since two of the four instrumented lines are executed:


int main()
{
  if (true)            // Executed
    return 1;          // Executed
  foo();               // Not executed
  return 0;            // Not executed
}
    

Another way to increase the coverage by reformatting is to hide an uncovered statement behind an executed one. To do this, you only need to write the whole code of this main function in one line:

int main()
{
  if (true) return 1; foo(); return 0;
}

This code has a line coverage of 100%:


int main()
{
  if (true) return 1; foo(); return 0; // Executed
}
    

This small example illustrates how the result depends on source code formatting. Therefore, Coco provides line coverage only as an additional measurement alongside decision and condition coverage, and does not allow instrumenting source code only at the line level.