Cognitive Complexity

Thomas J. McCabe introduced Cyclomatic Complexity in 1976 as a way to guide programmers in writing methods that "are both testable and maintainable". At SonarSource, we believe Cyclomatic Complexity works very well for measuring testability, but not for maintainability. That's why we're introducing Cognitive Complexity, which you'll begin seeing in upcoming versions of our language analyzers. We've designed it to give you a good relative measure of how difficult the control flow of a method is to understand.

Cyclomatic Complexity doesn't measure maintainability

To get started, let's look at a couple of methods:
int sumOfPrimes(int max) {              // +1
  int total = 0;
  OUT: for (int i = 1; i <= max; ++i) { // +1
    for (int j = 2; j < i; ++j) {       // +1
      if (i % j == 0) {                 // +1
        continue OUT;
      }
    }
    total += i;
  }
  return total;
}                  // Cyclomatic Complexity 4

String getWords(int number) {   // +1
  switch (number) {
    case 1:                     // +1
      return "one";
    case 2:                     // +1
      return "a couple";
    default:                    // +1
      return "lots";
  }
}                  // Cyclomatic Complexity 4

These two methods share the same Cyclomatic Complexity, but clearly not the same maintainability. Of course, this comparison might not be entirely fair; even McCabe acknowledged in his original paper that the treatment of case statements in a switch didn't seem quite right:
"The only situation in which this limit [of 10 per method] has seemed unreasonable is when a large number of independent cases followed a selection function (a large case statement)..."
On the other hand, that's exactly the problem with Cyclomatic Complexity. The scores certainly tell you how many test cases are needed to cover a given method, but they aren't always fair from a maintainability standpoint. Further, because even the simplest method gets a Cyclomatic Complexity score of 1, a large domain class can have the same Cyclomatic Complexity as a small class full of intense logic. And at the application level, studies have shown that Cyclomatic Complexity correlates to lines of code, so it really doesn't tell you anything new.

Cognitive Complexity to the rescue!

That's why we've formulated Cognitive Complexity, which attempts to put a number on how difficult the control flow of a method is to understand, and therefore to maintain.

I'll get to some details in a minute, but first I'd like to talk a little more about the motivations. Obviously, the primary goal is to calculate a score that's an intuitively "fair" representation of maintainability. In doing so, however, we were very aware that if we measure it, you will try to improve it. Because of that, we want Cognitive Complexity to incent good, clean coding practices by incrementing for code constructs that take extra effort to understand, and by ignoring structures that make code easier to read.

Basic criteria

We boiled that guiding principle down into three simple rules:
  • Increment when there is a break in the linear (top-to-bottom, left-to-right) flow of the code
  • Increment when structures that break the flow are nested
  • Ignore "shorthand" structures that readably condense multiple lines of code into one
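To make the second rule concrete, here's a minimal sketch of two methods that make the same three decisions, once nested and once flat. The class and method names are hypothetical, and the scores in the comments are tallied by hand from the rules as stated in this post (assuming a plain return adds nothing), not produced by an analyzer.

```java
final class NestingDemo {

  // Nested form: every extra level of nesting raises the increment.
  static String nested(String user, boolean active, int balance) {
    if (user != null) {        // +1
      if (active) {            // +2 (nesting=1)
        if (balance > 0) {     // +3 (nesting=2)
          return "ok";
        }
      }
    }
    return "rejected";
  }                            // hand-tallied Cognitive Complexity: 6

  // Flat form: the same three decisions as guard clauses, no nesting.
  static String flat(String user, boolean active, int balance) {
    if (user == null) {        // +1
      return "rejected";
    }
    if (!active) {             // +1
      return "rejected";
    }
    if (balance <= 0) {        // +1
      return "rejected";
    }
    return "ok";
  }                            // hand-tallied Cognitive Complexity: 3
}
```

Both methods return the same results and both have a Cyclomatic Complexity of 4 (one for the method plus one per if), so only the nesting accounts for the difference in the hand-tallied scores.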

Examples revisited

With those rules in mind, let's take another look at those first two methods:
                                // Cyclomatic Complexity    Cognitive Complexity
  String getWords(int number) { //          +1
    switch (number) {           //                                  +1
      case 1:                   //          +1
        return "one";
      case 2:                   //          +1
        return "a couple";
      default:                  //          +1
        return "lots";
    }
  }                             //          =4                      =1
As I mentioned, one of the biggest beefs with Cyclomatic Complexity has been its treatment of switch statements. Cognitive Complexity, on the other hand, only increments once for the entire switch structure, cases and all. Why? In short, because switches are easy, and Cognitive Complexity is about estimating how hard or easy control flow is to understand.

On the other hand, Cognitive Complexity increments in a familiar way for the other control flow structures: for, while, do while, ternary operators, if/#if/#ifdef/..., else if/elsif/elif/..., and else, as well as for catch statements. Additionally, it increments for jumps to labels (goto, break, and continue) and for each level of control flow nesting:
                                // Cyclomatic Complexity    Cognitive Complexity
int sumOfPrimes(int max) {              // +1
  int total = 0;
  OUT: for (int i = 1; i <= max; ++i) { // +1                       +1
    for (int j = 2; j < i; ++j) {       // +1                       +2 (nesting=1)
      if (i % j == 0) {                 // +1                       +3 (nesting=2)
        continue OUT;                   //                          +1
      }
    }
    total += i;
  }
  return total;
}                               //         =4                       =7
As you can see, Cognitive Complexity takes into account the things that make this method harder to understand than getWords: the nesting and the continue to a label. So while the two methods have equal Cyclomatic Complexity scores, their Cognitive Complexity scores clearly reflect the dramatic difference between them in understandability.
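The two tallies can be reproduced with a toy scorer. This is only a sketch of the counting described in this post, not how any analyzer is implemented: Node, structure, labeledJump, and score are hypothetical names, the control flow is modeled by hand rather than parsed, and constructs such as else and catch are left out.

```java
import java.util.Arrays;
import java.util.List;

final class CognitiveSketch {

  /** One flow-breaking construct plus the constructs nested inside it. */
  static final class Node {
    final String label;
    final boolean nestingBonus; // true for if/for/switch, false for jumps to labels
    final List<Node> children;

    Node(String label, boolean nestingBonus, Node... children) {
      this.label = label;
      this.nestingBonus = nestingBonus;
      this.children = Arrays.asList(children);
    }
  }

  static Node structure(String label, Node... children) {
    return new Node(label, true, children);
  }

  static Node labeledJump(String label) {
    return new Node(label, false);
  }

  /** +1 per construct, plus the current nesting level for nesting structures. */
  static int score(Node node, int nesting) {
    int total = 1 + (node.nestingBonus ? nesting : 0);
    for (Node child : node.children) {
      total += score(child, nesting + 1);
    }
    return total;
  }

  /** Hand-built model of sumOfPrimes: for > for > if > continue OUT. */
  static Node sumOfPrimes() {
    return structure("for",
        structure("for",
            structure("if",
                labeledJump("continue OUT"))));
  }

  /** Hand-built model of getWords: a single switch; its cases add nothing. */
  static Node getWords() {
    return structure("switch");
  }
}
```

Calling score(sumOfPrimes(), 0) gives 7 and score(getWords(), 0) gives 1, matching the totals in the listings above.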

In looking at these examples, you may have noticed that Cognitive Complexity doesn't increment for the method itself. That means that simple domain classes have a Cognitive Complexity of zero:
                              // Cyclomatic Complexity       Cognitive Complexity
public class Fruit {

  private String name;

  public Fruit(String name) { //        +1                          +0
    this.name = name;
  }

  public void setName(String name) { // +1                          +0
    this.name = name;
  }

  public String getName() {   //        +1                          +0
    return this.name;
  }
}                             //        =3                          =0
So now class-level metrics become meaningful. You can look at a list of classes and their Cognitive Complexity scores and know that when you see a high number, it really means there's a lot of logic in the class, not just a lot of methods.

Getting started with Cognitive Complexity

At this point, you know most of what you need to get started with Cognitive Complexity. There are some differences in how boolean operators are counted, but I'll let you read the white paper for those details. Hopefully, you're eager to start using Cognitive Complexity, and wondering when tools to measure it will become available. 

We'll start by adding method-level Cognitive Complexity rules in each language, similar to the existing ones for Cyclomatic Complexity. You'll see this first in the mainline languages: Java, JavaScript, C#, and C/C++/Objective-C. At the same time, we'll correct the implementations of the existing method-level "Cyclomatic Complexity" rules to truly measure Cyclomatic Complexity (right now, they're a combination of Cyclomatic and Essential Complexity).

Eventually, we'll probably add class/file-level Cognitive Complexity rules and metrics. But we're starting with Baby Steps.
