This blog post is derived from “Unlocking Bugsnag’s Power Features,” a webinar in the App Stability Series, which showcases Bugsnag’s most powerful features for delivering an effective app stability experience.
As digital transformation accelerates across verticals and consumers continue to adopt more applications into daily life, the engineering organizations behind these applications need to scale their teams, infrastructure and processes accordingly. Demand for new and improved functionality pushes for shorter development cycles and drives the adoption of progressive delivery practices. Traffic and data volumes continuously increase and apps integrate with more third-party services, increasing interdependency. All of this puts a strain on the ability to efficiently monitor app health and stability, as well as identify root causes of bugs.
Bugsnag offers a good deal of flexibility that can go unnoticed by our users. Although Bugsnag's main functionality can be set up in a matter of minutes by adding our SDK to your application, it also has many features that address problems at scale, and we have collected some smart solutions from working with our power users. Some of these best practices are laid out below for your teams to leverage.
Some bugs prevent users from completing critical functions or revenue-generating activity. Some crashes can quickly be isolated from users if they are known to be specific to a release version or a feature toggle. Other functional problems, like ANRs, app hangs and out-of-memory exceptions (OOMs), could be coming from outside your code entirely.
Some of the visibility and monitoring challenges we often hear from engineering teams, of every size and across all verticals, are:
To provide richer context for your errors, you can add custom metadata. This is incredibly helpful for tracking additional business- or industry-specific attributes, and it makes that information searchable across your errors. Custom metadata opens a range of possibilities, especially since bookmarks allow you to save searches in your team’s inbox, track trends and automate alerting or ticket creation for specific error segments. Some relevant metadata concepts Bugsnag users leverage today are listed below.
Customer Tier/Type: From eCommerce apps with loyalty or membership programs to gaming apps where users accumulate a lifetime spend, logging the customer's loyalty status as metadata is a crucial way to segment and track the experiences of your most important user bases.
We recommend segmenting out errors in production where the user is of high value – this way custom notifications can be leveraged to notify your customer success organization if there are spikes in errors in this segment. This gives the team the ability to proactively respond to any highly valued users who run into significant issues with your application.
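As a rough illustration of the customer tier idea, the sketch below builds a small metadata section that flags high-value users. The tier names, the spend threshold and the field names are all hypothetical placeholders rather than anything Bugsnag defines; attaching the resulting dictionary to an error is done through your platform's SDK and is not shown here.

```python
from typing import Dict

# Hypothetical tier names and spend threshold; substitute your own loyalty model.
HIGH_VALUE_TIERS = {"gold", "platinum"}
HIGH_VALUE_LIFETIME_SPEND = 500.0


def customer_metadata(tier: str, lifetime_spend: float) -> Dict[str, object]:
    """Build a flat metadata section describing the customer's value."""
    normalized = tier.strip().lower()
    is_high_value = (
        normalized in HIGH_VALUE_TIERS
        or lifetime_spend >= HIGH_VALUE_LIFETIME_SPEND
    )
    return {
        "tier": normalized,
        "lifetime_spend": round(lifetime_spend, 2),
        # A single boolean is easy to filter on and to build alerts around.
        "high_value": is_high_value,
    }
```

Keeping a single boolean such as `high_value` alongside the raw tier tends to make saved searches and alert rules simpler than matching on several tier names.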
Device Tier: Understanding whether an ANR, app hang or OOM error comes from a performance issue in the application or from the user's device is a valuable distinction that helps reduce noise and prioritize properly. Some Bugsnag users maintain an internal mapping of device manufacturer or name to a classification (e.g., low, medium, high), and add this as metadata.
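One way such an internal mapping might look is sketched below. The manufacturer and model names are made-up placeholders, and the fallback rules (a per-manufacturer default, then "unknown") are an assumption about how a team might handle devices missing from the table.

```python
from typing import Dict, Tuple

# Hypothetical lookup table keyed by (manufacturer, model); names are placeholders.
DEVICE_TIERS: Dict[Tuple[str, str], str] = {
    ("examplecorp", "budget a1"): "low",
    ("examplecorp", "flagship z9"): "high",
}

# Hypothetical fallback used when a specific model is not in the table.
MANUFACTURER_DEFAULTS: Dict[str, str] = {"examplecorp": "medium"}


def classify_device(manufacturer: str, model: str) -> str:
    """Return 'low', 'medium', 'high', or 'unknown' for a device."""
    key = (manufacturer.strip().lower(), model.strip().lower())
    if key in DEVICE_TIERS:
        return DEVICE_TIERS[key]
    # Fall back to the manufacturer default, then to 'unknown'.
    return MANUFACTURER_DEFAULTS.get(key[0], "unknown")
```

The returned string can then be added as a metadata value so that ANRs, app hangs and OOMs can be filtered by device tier.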
isLaunching: By default, Bugsnag provides metadata on Android, iOS, React Native, Unity and Unreal that identifies whether the app was launching at the time of the exception. The point at which your app is considered “launched” is configurable and can be either time-based or declared manually (see the "Identifying Crashes at Launch" section of our documentation).
Third Party Requests: Network request information can be an important indicator when monitoring errors. Details like the last requested service, last request status and last response duration are powerful metadata to track. This can provide insights for gaming applications connecting with ad networks, or for general live ops issues, such as detecting spikes in errors potentially caused by payment or other third-party service outages.
Bot Traffic: Within web applications, bot traffic will likely be prevalent and might even trigger errors that end up in your Bugsnag inbox. Given that these events aren’t impacting a real user's experience and most bot traffic identifies itself in the user agent string, some Bugsnag users check for bot traffic and add metadata to flag whether a session is from a bot. This way, any exceptions linked to bot traffic can be excluded as part of your search parameters.
These concepts outline just some of the ways in which custom metadata allows you to track more precise segments of errors, both to isolate revenue and ops situations and to reduce noise.
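A minimal sketch of tracking the last third-party request is shown below. The `RequestTracker` class and its field names are hypothetical; the idea is only that your networking layer records each call, and the most recent one is attached as metadata when an error occurs.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class LastRequest:
    service: str        # e.g. the ad network or payment provider that was called
    status: int         # HTTP status code of the response
    duration_ms: float  # time taken to receive the response


class RequestTracker:
    """Remembers only the most recent third-party request (hypothetical helper)."""

    def __init__(self) -> None:
        self._last: Optional[LastRequest] = None

    def record(self, service: str, status: int, duration_ms: float) -> None:
        self._last = LastRequest(service, status, duration_ms)

    def as_metadata(self) -> Dict[str, object]:
        """Return a flat dict suitable for use as an error metadata section."""
        if self._last is None:
            return {}
        return {
            "last_requested_service": self._last.service,
            "last_request_status": self._last.status,
            "last_response_duration_ms": self._last.duration_ms,
        }
```

With these three fields on every error, a spike in errors where the last request status is a 5xx from one service becomes easy to search for.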
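The bot check described above could be sketched as a simple user agent match. The keyword list here is an assumption and deliberately crude: it will miss bots that do not identify themselves and may occasionally flag a legitimate user agent that happens to contain one of these words.

```python
import re
from typing import Optional

# Assumed keyword list; self-identifying crawlers commonly include one of these.
BOT_PATTERN = re.compile(r"bot|crawler|spider|crawling", re.IGNORECASE)


def is_bot(user_agent: Optional[str]) -> bool:
    """Return True when the user agent string looks like a self-identified bot."""
    if not user_agent:
        return False
    return BOT_PATTERN.search(user_agent) is not None


def bot_metadata(user_agent: Optional[str]) -> dict:
    """Build a metadata section flagging whether the session is from a bot."""
    return {"is_bot": is_bot(user_agent)}
```

Flagging the session rather than discarding the event keeps the data available while still letting searches exclude it.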
One of our newest additions to Bugsnag is the Features dashboard, which provides actionable insights into the stability of feature rollouts. One of the key functions within the Features dashboard is the Errors Introduced section, which highlights any errors that are exclusive to a specific feature flag or experiment. Our Releases dashboard provides a similar Errors Introduced section, so developers can see whether a release has errors not seen in any other version.
The Features and Releases dashboards allow developers to quickly understand if a release version or feature needs to be rolled back to preserve stability, and if code within a feature or release is causing specific exceptions. Many Bugsnag users aren't aware that the search builder allows you to filter the inbox for errors exclusive to a certain release version or feature flag.
Our Releases dashboard typically becomes the command center for mobile teams and release managers in the days after their latest release, as this view provides the ability to monitor both stability and adoption by version number in real time.
For most platforms and error types, valuable information can be accessed and changed via callback functions, which give you access to the error data before the event is sent to Bugsnag. This helps teams do things like customize error grouping logic, add and change metadata or context based on breadcrumb properties, and even add information to tricky errors such as OOM errors on iOS. Information on callback functions for your platform can generally be found under the "Customizing error reports" section of our documentation.
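To make the callback idea concrete, here is a library-free sketch of the kind of logic such a callback might contain. The event is modeled as a plain dictionary, and every key used (`breadcrumbs`, `context`, `error_class`, `metadata`, `grouping_hash`) is a stand-in chosen for illustration; the real event object and the way a callback is registered differ per platform, so consult the documentation for yours.

```python
from typing import Any, Dict, List

# Stand-in for an SDK event object; not Bugsnag's real type.
Event = Dict[str, Any]


def on_error(event: Event) -> bool:
    """Hypothetical callback run before an event is sent; returns True to send it."""
    breadcrumbs: List[Dict[str, Any]] = event.get("breadcrumbs", [])

    # Set the context from the most recent navigation breadcrumb, if there is one.
    for crumb in reversed(breadcrumbs):
        if crumb.get("type") == "navigation":
            event["context"] = crumb.get("name", "unknown")
            break

    # Custom grouping: group timeouts by the host that timed out.
    if event.get("error_class") == "TimeoutError":
        host = event.get("metadata", {}).get("request", {}).get("host")
        if host:
            event["grouping_hash"] = f"timeout:{host}"

    return True
```

Grouping timeouts by host is just one possible rule; the point is that the callback sees the whole event, including breadcrumbs and metadata, before anything is sent.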
In addition to implementing many of the metadata ideas mentioned earlier within a callback, some powerful examples of leveraging callback functions are below:
We've shared a lot of valuable examples of how to set up and customize Bugsnag to gain additional insight. Below are some configuration best practices we generally recommend to our users; they are simple to implement yet yield powerful results.