Writing a bug report

A story about what developers see, and how to maximise their initial understanding of an issue.

As a software engineer, my stock in trade is either bugs or features. My team and I need to change software so that it does something new, or modify what it already does, on a continual basis, cycling through releases approximately once per week.

We don’t always get it right.

We take reasonable care in the way we develop software; all changes are subject to code review, nearly all system changes are expressed in version control and all changes need to go through an independent QA process — all of which means we do okay. But sometimes, something slips through the cracks.

Invariably, the things that slip through aren’t the things that we thought of. They’re usually the result of complex system interactions, a limitation in the way we’ve constructed the business logic or some bespoke circumstance that modifies the state such that it breaks a fundamental assumption. Because they’re things that we both don’t think of and don’t test for, it’s important that we get as much information as possible so as to help us investigate the issue further.

Skipping ahead, when writing a bug report it’s best to follow the formula supplied below, written in the “Markdown” format:

# The paypal checkout seems to fail with the error "Error: The upstream provider returned an invalid response."

The above report is super detailed — I would be extremely happy to receive such a report. But it’s worth unpacking the report further, to understand why each section is important from the point of view of the developer.
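As a rough illustration of how the sections unpacked below fit together, here is a minimal sketch that assembles them into a Markdown document. The function name `render_bug_report`, its parameters and the `##` heading levels are assumptions made for this example, not part of any tool; the elided story text is left elided.

```python
def render_bug_report(title, story, screenshots, browser_version="", console_log=""):
    """Assemble a Markdown bug report from the sections discussed in this article.

    Hypothetical helper: the section names follow the article, while the
    heading levels are an assumption.
    """
    lines = [f"# {title}", "", "## Story", "", story, "", "## Screenshots", ""]
    # Number the screenshots so they can be referred to in later discussion.
    lines += [f"{i}. {caption}" for i, caption in enumerate(screenshots, start=1)]
    # The technical sections are a bonus, so only emit them when supplied.
    if browser_version:
        lines += ["", "## Browser Version", "", browser_version]
    if console_log:
        lines += ["", "## Browser Console", "", console_log]
    return "\n".join(lines) + "\n"


report = render_bug_report(
    title='The paypal checkout seems to fail with the error '
          '"Error: The upstream provider returned an invalid response."',
    story="On Thursday, 25th of February 2018 at approximately 6:18pm I attempted ...",
    screenshots=[
        "The error following the checkout",
        "The contents of my cart prior to checkout",
        "The order summary page prior to being redirected to Paypal",
    ],
)
print(report)
```

Leaving the technical sections optional mirrors the way they are treated here: helpful when the reporter can supply them, but not required.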

Title

The title is surprisingly important. There are always many outstanding bugs that must be triaged, and it’s often not clear which task has the highest priority. However, we can determine from the title:

The paypal checkout seems to fail with the error "Error: The upstream provider returned an invalid response."

That the error is in the checkout, the last step of the user journey. It’s thus critically important, and is likely to trigger incident response internally. Additionally, we can see that it’s associated with a specific provider, and the error tells us that the provider is behaving abnormally. It may be that the provider is experiencing issues, which can usually be verified quickly via its status page.

Story (or steps to reproduce)

Perhaps the most important part of a bug report is detailed steps to reproduce. It’s best to write these in as much detail as possible, and from the point of view of the person who experienced the issue. Let’s break it down piece by piece.

Time and Date

On Thursday, 25th of February 2018 at approximately 6:18pm I attempted

As developers, we know ahead of time that we are not going to catch all issues. As far as I’m aware, there has never been bug-free software, no matter the expense; the systems that we build upon are simply not that reliable. So, we build the application with this in mind. We collect certain diagnostic information at all times throughout the application, and we collect even more when we have determined that an unusual situation has occurred.

We are able to look up the information associated with a request, but only if we know specifically which request it is. Sometimes it’s not so obvious which requests are problematic and which are not; the proverbial needle in the haystack. So, narrowing the problem down to a time range allows us to look very specifically at the information from a certain time, which can dramatically reduce the amount of effort required to investigate.
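To make the idea concrete, here is a minimal sketch of narrowing a request log down to a window around the reported time. The log entries, their field names and the ten minute margin are all hypothetical; a real system would query its own logging backend instead of a list in memory.

```python
from datetime import datetime, timedelta


def requests_near(entries, reported_at, margin=timedelta(minutes=10)):
    """Return the log entries whose timestamp falls within `margin` of `reported_at`."""
    start, end = reported_at - margin, reported_at + margin
    return [entry for entry in entries if start <= entry["timestamp"] <= end]


# Hypothetical request log covering one day.
entries = [
    {"timestamp": datetime(2018, 2, 25, 9, 2), "request_id": "a1", "status": 200},
    {"timestamp": datetime(2018, 2, 25, 18, 15), "request_id": "b2", "status": 502},
    {"timestamp": datetime(2018, 2, 25, 18, 21), "request_id": "c3", "status": 200},
    {"timestamp": datetime(2018, 2, 25, 23, 40), "request_id": "d4", "status": 200},
]

# "Approximately 6:18pm", as given in the report.
reported_at = datetime(2018, 2, 25, 18, 18)
candidates = requests_near(entries, reported_at)
print([entry["request_id"] for entry in candidates])  # ['b2', 'c3']
```

The margin is deliberately generous because the reporter said "approximately"; even so, four candidate requests become two.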

Detailed User Story

I attempted to checkout via the standard checkout with the following items in my...

As mentioned, bugs often arise out of complex behaviour. Unfortunately we cannot test all possible combinations of what will occur in a production environment during testing. There are a couple of reasons for this:

  1. We don’t know them, and
  2. It’s cost prohibitive to test every combination of every feature prior to shipping to production

We make tradeoffs on the cost of an issue in a given area, and shape our testing policy accordingly.
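A small calculation shows how quickly the combinations grow. The dimensions and values below are invented for illustration and are far smaller than anything a real application would have.

```python
from itertools import product

# Hypothetical test dimensions; a real application has many more of each.
dimensions = {
    "browser": ["Chrome", "Firefox", "Safari", "Edge"],
    "payment_provider": ["PayPal", "credit card"],
    "cart_size": ["one item", "many items"],
    "account_type": ["guest", "registered"],
}

# Every combination of one value per dimension is a distinct scenario to test.
combinations = list(product(*dimensions.values()))
print(len(combinations))  # 4 * 2 * 2 * 2 = 32

# Each additional on/off feature doubles the count; ten more of them gives:
print(len(combinations) * 2 ** 10)  # 32768
```

Even this toy example reaches tens of thousands of scenarios, which is why testing effort is directed at the areas where an issue would cost the most.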

Detailed user journeys that clearly describe the application state as well as the actual error allow us to reproduce exactly the set of conditions that gave rise to the error. In the case above, it may be that 1x foobar contains a special character in its description that causes the application to break when it is returned from PayPal, or that the credit card has been marked as fraudulent and the application hasn’t been designed to deal with that.

If you take nothing away, please painstakingly detail the steps to reproduce the issue.

Screenshots

1. The error following the checkout
2. The contents of my cart prior to checkout
3. The order summary page prior to being redirected to Paypal

The phrase “a picture is worth a thousand words” is rarely more true than for screenshots attached to an issue.

As much as we try to create a common understanding of what is happening in an application and surface information that is useful for debugging, certain information is inherently complex to communicate. A screenshot is a low-effort tool for communicating an extremely large amount of information:

  • Time of day
  • Operating system
  • Browser
  • Application state through status icons
  • User data that can be correlated with the issue

While it would be possible to ask for all of these details up front, it’s often difficult to explain the impact of different browsers, or why an operating system matters. Sending screenshots allows a vast amount of technical communication with an ostensibly simple action.

Further, multiple screenshots compound this effect. It is possible to determine state change over time from screenshots, and there may be things the developer can spot that are difficult to communicate.

Lastly, screenshots even make sense in terms of “terminal” or text only applications!

Technical Detail (Bonus)

There are some bugs that, no matter how much we instrument, are impossible to foresee and thus instrument for. If a bug is being reported by a more technical user (for example, a project manager) it’s possible for them to include additional, very technical information about the nature of the bug, which particularly helps with bugs that are otherwise extremely hard to trace.

Browser Version

The browser I used was Chrome. The details from the `chrome://version` tab are as follows:

It is possible for a developer to write code that works perfectly well one day, but breaks the next. One example is the use of in-development web specifications whose format changes after the initial implementation, or which are deprecated in favour of other specifications. There is also a class of bugs in which the browser fails to implement the specification as documented, in which case the browser itself is the problem rather than the application.

By being specific about the browser version we can either rule out a class of issues, or otherwise reproduce an issue exactly as it’s described. Additionally, copying directly from the about or version sections of a browser allows us to capture information that may seem redundant, such as the (stable) part of the Chrome version string.

Browser Console

content.js:4 [Deprecation] chrome.loadTimes() is deprecated, instead use standardized API: nextHopProtocol in Navigation Timing 2. https://www.chromestatus.com/features/5637885046816768.
(anonymous) @ content.js:4

The browser console is a tool that browsers use to surface additional, non-user-facing information to interested parties, specifically information that is useful to developers. There are various ways to open it, but in Chrome it’s Ctrl+Shift+J (Cmd+Option+J on macOS).

There is a particular class of problems, those in JavaScript code, which we often have very little visibility into. These issues are expressed through the JavaScript console, so adding its output to the report can render an otherwise completely opaque problem transparent.

HAR File

For the truly technically savvy among us, it’s even possible to capture the entire request and send it through in a format called “HAR”. Instructions on how to do this in Chrome are at the following address:

https://developers.google.com/web/tools/chrome-devtools/network-performance/reference#save-as-har
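A HAR file is JSON, so once one arrives it is easy to inspect with a short script. The sketch below lists the requests that came back with an HTTP error status. The sample document is a heavily trimmed, hypothetical stand-in (the `shop.example` URLs are invented); a real export contains many more fields per entry.

```python
import json

# A trimmed, hypothetical HAR document. Real exports carry headers,
# timings, cookies and bodies alongside these fields.
har_text = """
{
  "log": {
    "entries": [
      {"request": {"method": "GET", "url": "https://shop.example/cart"},
       "response": {"status": 200}},
      {"request": {"method": "POST", "url": "https://shop.example/checkout/paypal"},
       "response": {"status": 502}}
    ]
  }
}
"""


def failed_requests(har):
    """Return (status, method, url) for every entry with an HTTP status of 400 or above."""
    return [
        (entry["response"]["status"], entry["request"]["method"], entry["request"]["url"])
        for entry in har["log"]["entries"]
        if entry["response"]["status"] >= 400
    ]


failed = failed_requests(json.loads(har_text))
print(failed)  # [(502, 'POST', 'https://shop.example/checkout/paypal')]
```

With a real file, `json.load` on the opened `.har` file would replace the inline string.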

Conclusion

Bugs are an inevitable part of software development. They are also among the most painful issues to investigate, and thus among the most expensive that we work on. By submitting a detailed bug report we can save a large amount of time and discussion back and forth, and get fixes into production faster.

Thanks

  • Aario Shahbany and Svetlin Kalendzhiev who assisted in creating the “ideal bug report”.
  • Tomasz Kaplonski for reviewing and suggesting improvements.