Data Science Digest: Source Code Metrics That Matter

Data scientists have spent decades researching ways to better understand what is "good" code and what isn't. Software quality is an interesting area to explore, but is it worth the effort? After all, developers commonly disagree on how to write high-quality code, even when they are working toward the same outcome.

Dr. Maciej Gryka, Sr. Data Science Manager and self-proclaimed "Data Nerd", recently sat down for a webinar to discuss what his research in Data Science and Machine Learning has taught him about software quality. From source code metrics that matter to where Machine Learning research is heading, these are the key takeaways from the discussion:

Software Quality Is So Much More Than Functionality

"You wouldn't necessarily call something that works well high quality. You also wouldn't call something that looks nice, but doesn't work well, high quality. You need both."

-Maciej Gryka

While Maciej primarily covered source code metrics, he emphasized they should not be the only factor taken into consideration when measuring software quality.

The software development community often focuses on functionality, or how well the primary purpose is served, when measuring software quality. But like any non-software product customers use and love, software needs to be both functional and delightful for it to be high quality.

Consider these questions:

  • Is your software solving the right problem?
  • Is it solving it in the best way?
  • Are your users happy while using the software?
  • Are your developers happy while working on the software?

As these questions show, there are many ways to measure software quality beyond functionality. High-quality software needs both functionality and delight.

Source Code Metrics That Matter and How To Use Them

The software development and research community typically looks to a few key source code metrics to measure software quality. While we can extract numbers from these source code metrics, like complexity and hierarchy, how do we know which numbers are desirable?
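To make "extracting numbers" concrete, here is a minimal sketch in Python that computes two such metrics using only the standard library `ast` module. The helper names (`approx_complexity`, `inheritance_depth`, `extract_metrics`) and the sample source are hypothetical, and the complexity count is a simplified approximation of cyclomatic complexity, not the measure used in the webinar or by any particular tool.

```python
import ast

# Node types treated as decision points in this simplified measure.
# Assumption: a chained `a and b and c` counts once (a single BoolOp node).
BRANCH_NODES = (ast.If, ast.For, ast.While, ast.ExceptHandler,
                ast.BoolOp, ast.IfExp)


def approx_complexity(func):
    """Approximate cyclomatic complexity: 1 + number of decision points."""
    return 1 + sum(isinstance(node, BRANCH_NODES) for node in ast.walk(func))


def inheritance_depth(cls, classes):
    """Depth in the inheritance tree, counting only classes in this source."""
    depths = [
        inheritance_depth(classes[base.id], classes) + 1
        for base in cls.bases
        if isinstance(base, ast.Name) and base.id in classes
    ]
    return max(depths, default=0)


def extract_metrics(source):
    """Return per-function complexity and per-class hierarchy depth."""
    tree = ast.parse(source)
    classes = {n.name: n for n in ast.walk(tree)
               if isinstance(n, ast.ClassDef)}
    functions = [n for n in ast.walk(tree)
                 if isinstance(n, (ast.FunctionDef, ast.AsyncFunctionDef))]
    return {
        "complexity": {f.name: approx_complexity(f) for f in functions},
        "inheritance_depth": {name: inheritance_depth(c, classes)
                              for name, c in classes.items()},
    }


# Hypothetical sample code, used only to exercise the functions above.
SAMPLE = '''
class Base:
    pass

class Child(Base):
    def check(self, x):
        if x > 0 and x < 10:
            return "small"
        for i in range(x):
            if i % 2:
                return "odd"
        return "other"
'''

metrics = extract_metrics(SAMPLE)
print(metrics)
```

Numbers like these are cheap to compute. Deciding which values are desirable is the harder question.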

This is where historical data and Machine Learning can help: we can treat it as a supervised learning problem, using past defects as labels, and train a model to learn which code metrics are indicative of potential bugs. The most well-known source code metrics are a natural starting point.


While it's unlikely there will ever be a single number to represent all aspects of quality for a given code base, source code metrics are interesting to explore because they can show us what kinds of code constructs have historically led to defects.
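The supervised learning framing can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the model discussed in the webinar: the training rows are synthetic toy values invented for the example, the two features (complexity and hierarchy depth) and the fixed scaling by 10 are arbitrary choices, and a hand-written logistic regression stands in for whatever model a real team would use.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def scale(row):
    """Assumption: divide raw metric values by 10 to keep updates stable."""
    return [value / 10.0 for value in row]


def train_logistic(rows, labels, lr=0.1, epochs=2000):
    """Fit logistic regression with plain stochastic gradient descent."""
    weights = [0.0] * len(rows[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(rows, labels):
            pred = sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))
            error = pred - y  # gradient of the log loss w.r.t. the logit
            weights = [w - lr * error * xi for w, xi in zip(weights, x)]
            bias -= lr * error
    return weights, bias


def defect_probability(row, weights, bias):
    """Predicted probability that code with these metrics had a defect."""
    x = scale(row)
    return sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))


# SYNTHETIC toy data, not real measurements:
# (complexity, hierarchy depth) per module, and whether a bug was later filed.
history = [(1, 0), (2, 1), (3, 0), (2, 0), (9, 3), (12, 2), (8, 4), (15, 3)]
had_defect = [0, 0, 0, 0, 1, 1, 1, 1]

weights, bias = train_logistic([scale(r) for r in history], had_defect)

risky = defect_probability((14, 3), weights, bias)
simple = defect_probability((1, 0), weights, bias)
print(f"complex module: {risky:.2f}, simple module: {simple:.2f}")
```

On real projects the labels would come from historical bug reports or fix commits, and the model's output is best read as a hint about where to focus testing, not as a verdict on the code.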

Avoid Using Code Quality Metrics to Measure Developer Productivity

It can be tempting to use code quality metrics to measure developer productivity. Do not do this: it's dangerous!

Evaluating developers based on code quality metrics is a sure-fire way to get them to focus on maximizing metrics rather than on shipping high-quality code. All known metrics are imperfect and gameable, so make sure you do not create misaligned incentives in your team's software development culture.

Learn How Machine Learning is Changing Software Quality Measurement

"Testing every piece of code before shipping can be a lot to ask. Sometimes it's not possible, or is very expensive, so we want to focus on the most important parts. Ideally, we could predict the number of bugs in a piece of code before it hits production."

-Maciej Gryka

While commonly used source code metrics are reasonably good at measuring software quality after code hits production, they do not help prevent bugs from getting into customers' hands. That's where Machine Learning comes into play.

To learn more about how Machine Learning is changing the field of software quality measurement, listen on-demand to "Data Science Digest: Source Code Metrics that Matter".
