Skip to main content

Cognitive Complexity

Thomas J. McCabe introduced Cyclomatic Complexity in 1976 as a way to guide programmers in writing methods that "are both testable and maintainable". At SonarSource, we believe Cyclomatic Complexity works very well for measuring testability, but not for maintainability. That's why we're introducing Cognitive Complexity, which you'll begin seeing in upcoming versions of our language analyzers. We've designed it to give you a good relative measure of how difficult the control flow of a method is to understand.

Cyclomatic Complexity doesn't measure maintainability

To get started let's look at a couple of methods:
int sumOfPrimes(int max) {              // +1
  int total = 0;
  OUT: for (int i = 1; i <= max; ++i) { // +1
    for (int j = 2; j < i; ++j) {       // +1
      if (i % j == 0) {                 // +1
        continue OUT;
      }
    }
    total += i;
  }
  return total;
}                  // Cyclomatic Complexity 4
String getWords(int number) {   // +1
    switch (number) {
      case 1:                   // +1
        return "one";
      case 2:                   // +1
        return "a couple";
      default:                  // +1
        return "lots";
    }
  }        // Cyclomatic Complexity 4

These two methods share the same Cyclomatic Complexity, but clearly not the same maintainability. Of course, this comparison might not be entirely fair; even McCabe acknowledged in his original paper that the treatment of case statements in a switchdidn't seem quite right:
The only situation in which this limit [of 10 per method] has seemed unreasonable is when a large number of independent cases followed a selection function (a large case statement)...
On the other hand, that's exactly the problem with Cyclomatic Complexity. The scores certainly tell you how many test cases are needed to cover a given method, but they aren't always fair from a maintainability standpoint. Further, because even the simplest method gets a Cyclomatic Complexity score of 1, a large domain class can have the same Cyclomatic Complexity as a small class full of intense logic. And at the application level, studies have shown that Cyclomatic Complexity correlates to lines of code, so it really doesn't tell you anything new.

Cognitive Complexity to the rescue!

That's why we've formulated Cognitive Complexity, which attempts to put a number on how difficult the control flow of a method is to understand, and therefore to maintain.

I'll get to some details in a minute, but first I'd like to talk a little more about the motivations. Obviously, the primary goal is to calculate a score that's an intuitively "fair" representation of maintainability. In doing so, however, we were very aware that if wemeasure it, you will try to improve it. And because of that, we want Cognitive Complexity to incent good, clean coding practices by incrementing for code constructs that take extra effort to understand, and by ignoring structures that make code easier to read.

Basic criteria

We boiled that guiding principle down into three simple rules:
  • Increment when there is a break in the linear (top-to-bottom, left-to-right) flow of the code
  • Increment when structures that break the flow are nested
  • Ignore "shorthand" structures that readably condense multiple lines of code into one

Examples revisited

With those rules in mind, let's take another look at those first two methods:
                                // Cyclomatic Complexity    Cognitive Complexity
  String getWords(int number) { //          +1
    switch (number) {           //                                  +1
      case 1:                   //          +1
        return "one";
      case 2:                   //          +1
        return "a couple";
      default:                  //          +1
        return "lots";
    }
  }                             //          =4                      =1
As I mentioned, one of the biggest beefs with Cyclomatic Complexity has been its treatment of switch statements. Cognitive Complexity, on the other hand, only increments once for the entire switch structure, cases and all. Why? In short, because switches are easy, and Cognitive Complexity is about estimating how hard or easy control flow is to understand.

On the other hand, Cognitive Complexity increments in a familiar way for the other control flow structures: forwhiledo while, ternary operators, if/#if/#ifdef/...else if/elsif/elif/..., and else, as well as for catch statements. Additionally, it increments for jumps to labels (gotobreak, and continue) and for each level of control flow nesting:
                                // Cyclomatic Complexity    Cognitive Complexity
int sumOfPrimes(int max) {              // +1
  int total = 0;
  OUT: for (int i = 1; i <= max; ++i) { // +1                       +1
    for (int j = 2; j < i; ++j) {       // +1                       +2 (nesting=1)
      if (i % j == 0) {                 // +1                       +3 (nesting=2)
        continue OUT;                   //                          +1
      }
    }
    total += i;
  }
  return total;
}                               //         =4                       =7
As you can see, Cognitive Complexity takes into account the things that make this method harder to understand than getWords - the nesting and the continue to a label. So that while the two methods have equal Cyclomatic Complexity scores, their Cognitive Complexity scores clearly reflect the dramatic difference between them in understandability.

In looking at these examples, you may have noticed that Cognitive Complexity doesn't increment for the method itself. That means that simple domain classes have a Cognitive Complexity of zero:
                              // Cyclomatic Complexity       Cognitive Complexity
public class Fruit {

  private String name;

  public Fruit(String name) { //        +1                          +0
    this.name = name;
  }

  public void setName(String name) { // +1                          +0
    this.name = name;
  }

  public String getName() {   //        +1                          +0
    return this.name;
  }
}                             //        =3                          =0
So now class-level metrics become meaningful. You can look at a list of classes and their Cognitive Complexity scores and know that when you see a high number, it really means there's a lot of logic in the class, not just a lot of methods.

Getting started with Cognitive Complexity

At this point, you know most of what you need to get started with Cognitive Complexity. There are some differences in how boolean operators are counted, but I'll let you read the white paper for those details. Hopefully, you're eager to start using Cognitive Complexity, and wondering when tools to measure it will become available. 

We'll start by adding method-level Cognitive Complexity rules in each language, similar to the existing ones for Cyclomatic Complexity. You'll see this first in the mainline languages: Java, JavaScript, C#, and C/C++/Objective-C. At the same time, we'll correct the implementations of the existing method level "Cyclomatic Complexity" rules to truly measure Cyclomatic Complexity (right now, they're a combination of Cyclomatic and Essential Complexity.) 

Eventually, we'll probably add class/file-level Cognitive Complexity rules and metrics. But we're starting with Baby Steps.

Comments

Popular posts from this blog

@MappedSuperclass vs. @Inheritance

MappedSuperClass must be used to inherit properties, associations, and methods. Entity inheritance must be used when you have an entity, and several sub-entities. You can tell if you need one or the other by answering this questions: is there some other entity in the model which could have an association with the base class? If yes, then the base class is in fact an entity, and you should use entity inheritance. If no, then the base class is in fact a class that contains attributes and methods that are common to several unrelated entities, and you should use a mapped superclass. For example: You can have several kinds of messages: SMS messages, email messages, or phone messages. And a person has a list of messages. You can also have a reminder linked to a message, regardless of the kind of message. In this case, Message is clearly an entity, and entity inheritance must be used. All your domain objects could have a creation date, modification date and ID, and you could thus ...

Some good links

https://www.html5rocks.com/en/tutorials/internals/howbrowserswork/ http://taligarsiel.com/ClientSidePerformance.html -- Client side performance tips https://ariya.io/ https://vertx.io/docs/ -- New exciting Framework, Must read. https://javaee.github.io/ -- Very good resource to see various javaee projects and explore enterprise architecture and design concepts. https://projects.eclipse.org/projects/ee4j -- Lots of interesting open source projects by eclipse http://openjdk.java.net/projects/mlvm/ -- the main project for supporting more dynamic languages to jvm. http://esprima.org/ -- EcmaScript parser http://c2.com/ppr/ and http://hillside.net/ -- Good place to learn patterns http://cr.openjdk.java.net/~briangoetz/lambda/Defender%20Methods%20v4.pdf https://validator.w3.org/nu/ -- This will validate your website css and js https://www.cellstream.com/intranet/reference-reading/faq/216-what-is-2-128.html http://shattered.io/ -- An example of SHA1 collision attack.

Patterns Knowledge

Anti Pattern: Its a pattern which we repeatedly do and which brings negative results. Architecture by implication: Systems lacking a clear and document architecture. Cover Your Assets: Continuing to document and present alternatives, without ever making an architectural decision. Witches Brew: Architectures made by groups resulting in a mix of ideas and lack a clear vision. Gold Plating: Continuing to define an architecture well pass the time which results in no benefits to the architecture. Vendor King: A product dependent architectures leading to a loss of control of architecture and development costs Big Bang Architecture: Designing the entire architecture at the beginning of the project when you know the least about the system.