Handling errors in Go is simple and flexible, yet it comes with no structure!
It's supposed to be simple, right? Just return an error, wrapped with a message, and move on. Well, that simplicity quickly turns chaotic as our codebase grows with more packages, more developers, and more "quick fixes" that stay there forever. Over time, the logs are full of "failed to do this" and "unexpected that", and nobody knows if it's the user's fault, the server's fault, buggy code, or just a misalignment of the stars!
Errors are created with inconsistent messages. Each package has its own set of styles, constants, or custom error types. Error codes are added arbitrarily. There is no easy way to tell which errors may be returned from which function without digging into its implementation!
So, I took the challenge of creating a new error framework. We decided to go with a structured, centralized system using namespace codes to make errors meaningful, traceable, and – most importantly – give us peace of mind!
This is the story of how we started with a simple error handling approach, got thoroughly frustrated as the problems grew, and eventually built our own error framework. It covers the design decisions, the implementation, the lessons learned, and why the framework transformed our approach to managing errors. I hope that it will bring some ideas for you too!
Go has a straightforward way to handle errors: errors are just values. An error is any value that implements the error interface, which has a single method, Error() string. Instead of throwing an exception and disrupting the current execution flow, Go functions return an error value alongside other results. The caller can then decide how to handle it: check its value to make a decision, wrap it with new messages and context, or simply return the error, leaving the handling logic to parent callers.
We can make any type an error by adding the Error() string method to it. This flexibility allows each package to define its own error-handling strategy and choose whatever works best for it. This also integrates well with Go's philosophy of composability, making it easy to wrap, extend, or customize errors as required.
The common practice is to return an error value that implements the error interface and let the caller decide what to do next. Here's a typical example:
func loadCredentials() (*Credentials, error) {
    data, err := os.ReadFile("cred.json")
    if errors.Is(err, os.ErrNotExist) {
        return nil, fmt.Errorf("file not found: %w", err)
    }
    if err != nil {
        return nil, fmt.Errorf("failed to read file: %w", err)
    }
    // verify the raw file content and build the credentials
    cred, err := verifyCredentials(data)
    if err != nil {
        return nil, fmt.Errorf("invalid credentials: %w", err)
    }
    return cred, nil
}
Go provides a handful of utilities for working with errors:
errors.New() and fmt.Errorf() for generating simple errors.
fmt.Errorf() and the %w verb for wrapping errors with additional context.
errors.Join() merges multiple errors into a single one.
errors.Is() matches an error with a specific value, errors.As() matches an error to a specific type, and errors.Unwrap() retrieves the underlying error.
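The utilities listed above can be seen working together in a short, self-contained sketch. The names ErrNotFound, QueryError, findUser, isNotFound, and queryOf are made up for illustration and do not come from our codebase.

```go
package main

import (
	"errors"
	"fmt"
	"io/fs"
)

// ErrNotFound is a sentinel error, matched by value with errors.Is.
var ErrNotFound = errors.New("not found")

// QueryError is a custom error type, matched by type with errors.As.
type QueryError struct {
	Query string
	Err   error
}

func (e *QueryError) Error() string { return fmt.Sprintf("query %q: %v", e.Query, e.Err) }

// Unwrap exposes the underlying error to errors.Is and errors.As.
func (e *QueryError) Unwrap() error { return e.Err }

// findUser always fails; it wraps the sentinel twice to build a chain.
func findUser(id string) error {
	qErr := &QueryError{Query: "SELECT ... WHERE id = ?", Err: ErrNotFound}
	return fmt.Errorf("find user %s: %w", id, qErr)
}

// isNotFound reports whether ErrNotFound appears anywhere in the chain.
func isNotFound(err error) bool { return errors.Is(err, ErrNotFound) }

// queryOf extracts the query from the first *QueryError in the chain.
func queryOf(err error) string {
	var qErr *QueryError
	if errors.As(err, &qErr) {
		return qErr.Query
	}
	return ""
}

func main() {
	err := findUser("42")
	fmt.Println(err)
	fmt.Println(isNotFound(err)) // true: the sentinel is two levels down
	fmt.Println(queryOf(err))

	// errors.Join (Go 1.20+) keeps every joined error reachable.
	joined := errors.Join(err, fs.ErrPermission)
	fmt.Println(errors.Is(joined, fs.ErrPermission), isNotFound(joined))
}
```

Note that errors.Is and errors.As only see through wrappers that expose an Unwrap method, which is why QueryError defines one.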
In practice, we usually see patterns such as returning simple errors created with errors.New() or fmt.Errorf().
In the early days, like many Go developers, we followed Go's common practices and kept error handling minimal yet functional. It worked well enough for a couple of years.
Include stacktrace using pkg/errors, a popular package at that time.
Export constants or variables for package-specific errors.
Use errors.Is()
to check for specific errors.
Wrap errors with new messages and context.
For API errors, we define error types and codes with Protobuf enum.
Including stacktrace with pkg/errors
We used pkg/errors, a popular error-handling package at the time, to include stacktrace in our errors. This was particularly helpful for debugging, as it allowed us to trace the origin of errors across different parts of the application.
To create, wrap, and propagate errors with stacktrace, we implemented functions like Newf(), NewValuef(), and Wrapf(). Here's an example of our early implementation:
type xError struct {
    msg   string
    stack *stack // 👈 stacktrace, as returned by callers()
}
func Newf(msg string, args ...any) error {
return &xError{
msg: fmt.Sprintf(msg, args...),
stack: callers(), // 👈 stacktrace
}
}
func NewValuef(msg string, args ...any) error {
return fmt.Errorf(msg, args...) // 👈 no stacktrace
}
func Wrapf(err error, msg string, args ...any) error {
if err == nil { return nil }
stack := getStack(err)
if stack == nil { stack = callers() }
return &xError{
msg: fmt.Sprintf(msg, args...),
stack: stack,
}
}
Exporting error variables
Each package in our codebase defined its own error variables, often with inconsistent styles.
package database
var ErrNotFound = errors.NewValue("record not found")
var ErrMultipleFound = errors.NewValue("multiple records found")
var ErrTimeout = errors.NewValue("request timeout")
package profile
var ErrUserNotFound = errors.NewValue("user not found")
var ErrBusinessNotFound = errors.NewValue("business not found")
var ErrContextCancel = errors.NewValue("context canceled")
Checking errors with errors.Is() and wrapping with additional context
res, err := repo.QueryUser(ctx, req)
switch {
case err == nil:
    // continue
case errors.Is(err, database.ErrNotFound):
    return nil, errors.Wrapf(ErrUserNotFound, "user not found (id=%v)", req.UserID)
default:
    return nil, errors.Wrapf(err, "failed to query user (id=%v)", req.UserID)
}
This helped propagate errors with more detail but often resulted in verbosity, duplication, and less clarity in logs:
internal server error: failed to query user: user not found (id=52a0a433-3922-48bd-a7ac-35dd8972dfe5): record not found: not found
Defining external errors with Protobuf
For external-facing APIs, we adopted a Protobuf-based error model inspired by Meta's Graph API:
message Error {
string message = 1;
ErrorType type = 2;
ErrorCode code = 3;
string user_title = 4;
string user_message = 5;
string trace_id = 6;
map<string, string> details = 7;
}
enum ErrorType {
ERROR_TYPE_UNSPECIFIED = 1;
ERROR_TYPE_AUTHENTICATION = 2;
ERROR_TYPE_INVALID_REQUEST = 3;
ERROR_TYPE_RATE_LIMIT = 4;
ERROR_TYPE_BUSINESS_LIMIT = 5;
ERROR_TYPE_WEBHOOK_DELIVERY = 6;
}
enum ErrorCode {
ERROR_CODE_UNSPECIFIED = 1 [(error_type = UNSPECIFIED)];
ERROR_CODE_UNAUTHENTICATED = 2 [(error_type = AUTHENTICATION)];
ERROR_CODE_CAMPAIGN_NOT_FOUND = 3 [(error_type = NOT_FOUND)];
ERROR_CODE_META_CHOSE_NOT_TO_DELIVER = 4 /* ... */;
ERROR_CODE_MESSAGE_WABA_TEMPLATE_CAN_ONLY_EDIT_ONCE_IN_24_HOURS = 5;
}
This approach helped structure errors, but over time, error types and codes were added without a clear plan, leading to inconsistencies and duplication.
Errors were declared everywhere
It was unclear which errors a function might return: is it gorm.ErrRecordNotFound or user.ErrNotFound or both?
Random error wrapping led to inconsistent and arbitrary logs
unexpected gorm error: failed to find business channel: error received when invoking API: unexpected: context canceled
No standardization led to improper error handling
No categorization made monitoring impossible
A context.Canceled error may be normal behavior when the user closes the browser tab, but it's important if the request is canceled because that query is randomly slow.
To address the growing challenges, we decided to build a better error strategy around the core idea of centralized and structured error codes.
A dedicated Error type with a comprehensive set of helpers.
All error codes are defined at a centralized place with namespace structure.
Use namespaces to create clear, meaningful, and extendable error codes. Example:
PRFL.USR.NOT_FOUND for "User not found."
FLD.NOT_FOUND for "Flow document not found."
DEPS.PG.NOT_FOUND, meaning "Record not found in PostgreSQL."
Each layer of service or library must only return its own namespace codes.
For example, when receiving gorm.ErrRecordNotFound from a dependency, the "database" package must wrap it as DEPS.PG.NOT_FOUND. Later, the "profile/user" service must wrap it again as PRFL.USR.NOT_FOUND.
All errors must implement the Error interface.
This creates a clear boundary between third-party errors (error) and our internal Errors.
An error can wrap one or multiple errors. Together, they form a tree.
[FLD.INVALID_ARGUMENT] invalid argument
→ [TPL.INVALID_PARAMS] invalid input params
1. [TPL.PARAM.EMPTY] name can not be empty
2. [TPL.PARAM.MALFORM] invalid format for param[2]
Always require context.Context. Can attach context to the error.
Otherwise, we end up with logs that lack a trace_id, and have no idea where they come from.
When errors are sent across service boundary, only the top-level error code is exposed.
For external errors, keep using the current Protobuf ErrorCode and ErrorType.
Automap namespace error codes to Protobuf codes, HTTP status codes, and tags.
Each namespace code maps to an ErrorCode, ErrorType, gRPC status, HTTP status, and tags for logging/metrics.
There are a few core packages that form the foundation of our new error-handling framework.
connectly.ai/go/pkgs/
errors: The main package that defines the Error type and codes.
errors/api: For sending errors to the front-end or external API.
errors/E: Helper package intended to be used with dot import.
testing: Testing utilities for working with namespace errors.
Error and Code
The Error interface is an extension of the standard error interface, with additional methods to return a Code. A Code is implemented as a uint16.
package errors // import "connectly.ai/go/pkgs/errors"
type Error interface {
error
Code() Code
}
type Code struct {
code uint16
}
type CodeI interface {
CodeDesc() CodeDesc
}
type GroupI interface { /* ... */ }
type CodeDesc struct { /* ... */ }
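To make the relationship between Error and Code more concrete, here is a minimal, self-contained sketch. It is not the framework's implementation: codeNames, codedError, and HasCode are hypothetical names, the numeric codes are made up, and context, tags, and stacktraces are left out for brevity.

```go
package main

import (
	"errors"
	"fmt"
)

// Code is a compact numeric identifier; its name lives in a lookup table.
type Code struct{ code uint16 }

// codeNames is a hypothetical stand-in for the generated table of codes.
var codeNames = map[uint16]string{
	1: "DEPS.PG.NOT_FOUND",
	2: "PRFL.USR.NOT_FOUND",
}

func (c Code) String() string {
	if name, ok := codeNames[c.code]; ok {
		return name
	}
	return fmt.Sprintf("CODE(%d)", c.code)
}

// Error extends the standard error interface with a Code.
type Error interface {
	error
	Code() Code
}

// codedError is an illustrative implementation of Error.
type codedError struct {
	code  Code
	msg   string
	cause error
}

func (e *codedError) Code() Code    { return e.code }
func (e *codedError) Unwrap() error { return e.cause }
func (e *codedError) Error() string {
	s := fmt.Sprintf("[%v] %s", e.code, e.msg)
	if e.cause != nil {
		s += " → " + e.cause.Error()
	}
	return s
}

// Wrap creates an Error carrying the code c around an optional cause.
func (c Code) Wrap(cause error, msg string) Error {
	return &codedError{code: c, msg: msg, cause: cause}
}

// HasCode reports whether err, or any Error it wraps, carries the code.
func HasCode(err error, want Code) bool {
	for err != nil {
		var e Error
		if !errors.As(err, &e) {
			return false // no more coded errors in the chain
		}
		if e.Code() == want {
			return true
		}
		err = errors.Unwrap(e)
	}
	return false
}

func main() {
	pgNotFound, usrNotFound := Code{1}, Code{2}
	dbErr := pgNotFound.Wrap(errors.New("record not found"), "not found")
	usrErr := usrNotFound.Wrap(dbErr, "user not found")
	fmt.Println(usrErr)
	// [PRFL.USR.NOT_FOUND] user not found → [DEPS.PG.NOT_FOUND] not found → record not found
	fmt.Println(HasCode(usrErr, pgNotFound)) // true
}
```

Keeping Code as a small comparable value means checks like e.Code() == want are cheap, while the human-readable name is only looked up when the error is printed.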
Package errors/E exports all error codes and common types
package E // import "connectly.ai/go/pkgs/errors/E"
import "connectly.ai/go/pkgs/errors"
type Error = errors.Error
var (
DEPS = errors.DEPS
PRFL = errors.PRFL
)
func MapError(ctx context.Context, err error) errors.Mapper { /* ... */ }
func IsErrorCode(err error, codes ...errors.CodeI) bool { /* ... */ }
func IsErrorGroup(err error, groups ...errors.GroupI) bool { /* ... */ }
Example error codes:
// dependencies → postgres
DEPS.PG.NOT_FOUND
DEPS.PG.UNEXPECTED
// sdk → hash
SDK.HASH.UNEXPECTED
// profile → user
PRFL.USR.NOT_FOUND
PRFL.USR.UNKNOWN
// profile → user → repository
PRFL.USR.REPO.NOT_FOUND
PRFL.USR.REPO.UNKNOWN
// profile → auth
PRFL.AUTH.UNAUTHENTICATED
PRFL.AUTH.UNKNOWN
PRFL.AUTH.UNEXPECTED
Package database:
package database // import "connectly.ai/go/pkgs/database"
import "context"
import "gorm.io/gorm"
import . "connectly.ai/go/pkgs/errors/E"
type DB struct { gorm *gorm.DB }
func (d *DB) Exec(ctx context.Context, sql string, params ...any) *DB {
    tx := d.gorm.WithContext(ctx).Exec(sql, params...)
    return wrapTx(tx)
}
func (x *DB) Error(msgArgs ...any) Error {
    return wrapError(x.gorm.Error) // 👈 convert gorm error to 'Error'
}
func (x *DB) SingleRowError(msgArgs ...any) Error {
if err := x.Error(); err != nil { return err }
switch {
case x.RowsAffected == 1: return nil
case x.RowsAffected == 0:
return DEPS.PG.NOT_FOUND.CallerSkip(1).
New(x.Context(), formatMsgArgs(msgArgs))
default:
return DEPS.PG.UNEXPECTED.CallerSkip(1).
New(x.Context(), formatMsgArgs(msgArgs))
}
}
Package pb/services/profile:
package profile // import "connectly.ai/pb/services/profile"
// these types are generated from services/profile.proto
type QueryUserRequest struct {
BusinessId string
UserId string
}
type LoginRequest struct {
Username string
Password string
}
Package service/profile:
package profile
import uuid "github.com/google/uuid"
import . "connectly.ai/go/pkgs/errors/E"
import l "connectly.ai/go/pkgs/logging/l"
import profilepb "connectly.ai/pb/services/profile"
// repository requests
type QueryUserByUsernameRequest struct {
Username string
}
// repository layer → query user
func (r *UserRepository) QueryUserByUsername(
    ctx context.Context, req *QueryUserByUsernameRequest,
) (*User, Error) {
    if req.Username == "" {
        return nil, PRFL.USR.REPO.INVALID_ARGUMENT.New(ctx, "empty request")
    }
    var user User
    sqlQuery := `SELECT * FROM "user" WHERE username = ? LIMIT 1`
    tx := r.db.Exec(ctx, sqlQuery, req.Username).Scan(&user)
    err := tx.SingleRowError()
    switch {
    case err == nil:
        return &user, nil
    case IsErrorCode(err, DEPS.PG.NOT_FOUND):
        return nil, PRFL.USR.REPO.NOT_FOUND.
            With(l.String("username", req.Username)).
            Wrap(ctx, err, "user not found")
    default:
        return nil, PRFL.USR.REPO.UNKNOWN.
            Wrap(ctx, err, "failed to query user")
    }
}
// user service layer → query user
func (u *UserService) QueryUser(
ctx context.Context, req *profilepb.QueryUserRequest,
) (*profilepb.QueryUserResponse, Error) {
// ...
    rr := QueryUserByUsernameRequest{ Username: req.Username }
    user, err := u.repo.QueryUserByUsername(ctx, &rr)
    if err != nil {
        return nil, MapError(ctx, err).
            Map(PRFL.USR.REPO.NOT_FOUND, PRFL.USR.NOT_FOUND,
                "the user %q cannot be found", req.Username,
api.UserTitle("User Not Found"),
api.UserMsg("The requested user id %q can not be found", req.UserId)).
KeepGroup(PRFL.USR).
Default(PRFL.USR.UNKNOWN, "failed to query user")
}
// ...
return resp, nil
}
// auth service layer → login user
func (a *AuthService) Login(
    ctx context.Context, req *profilepb.LoginRequest,
) (*profilepb.LoginResponse, Error) {
    vl := PRFL.AUTH.INVALID_ARGUMENT.WithMsg("invalid request")
    vl.Vl(req.Username != "", "no username", api.Detail("username is required"))
    vl.Vl(req.Password != "", "no password", api.Detail("password is required"))
    if err := vl.ToError(ctx); err != nil {
        return nil, err
    }
    hashpwd, err := hash.Hash(req.Password)
    if err != nil {
        return nil, PRFL.AUTH.UNEXPECTED.Wrap(ctx, err, "failed to calc hash")
    }
usrReq := profilepb.QueryUserByUsernameRequest{/*...*/}
usrRes, err := a.userServiceClient.QueryUserByUsername(ctx, usrReq)
if err != nil {
return nil, MapError(ctx, err).
Map(PRFL.USR.NOT_FOUND, PRFL.AUTH.UNAUTHENTICATED, "unauthenticated").
Default(PRFL.AUTH.UNKNOWN, "failed to query by username")
}
// ...
}
Well, there are a lot of new functions and concepts in the above code. Let's go through them step by step.
First, import package errors/E using dot import
This will allow you to directly use common types like Error instead of errors.Error and access codes as PRFL.USR.NOT_FOUND instead of errors.PRFL.USR.NOT_FOUND.
import . "connectly.ai/go/pkgs/errors/E"
Create new errors using CODE.New()
Suppose you get an invalid request, you can create a new error by:
err := PRFL.USR.INVALID_ARGUMENT.New(ctx, "invalid request")
PRFL.USR.INVALID_ARGUMENT is a Code.
A Code exposes methods like New() or Wrap() for creating a new error.
The New() function receives context.Context as the first argument, followed by message and optional arguments.
Print it with fmt.Print(err):
[PRFL.USR.INVALID_ARGUMENT] invalid request
or with fmt.Printf("%+v", err) to see more details:
[PRFL.USR.INVALID_ARGUMENT] invalid request
connectly.ai/go/services/profile.(*UserService).QueryUser
/usr/i/src/go/services/profile/user.go:1234
connectly.ai/go/services/profile.(*UserRepository).QueryUser
/usr/i/src/go/services/profile/repo/user.go:2341
Wrap an error within a new error using CODE.Wrap()
dbErr := DEPS.PG.NOT_FOUND.Wrap(ctx, gorm.ErrRecordNotFound, "not found")
usrErr := PRFL.USR.NOT_FOUND.Wrap(ctx, dbErr, "user not found")
will produce this output with fmt.Print(usrErr):
[PRFL.USR.NOT_FOUND] user not found → [DEPS.PG.NOT_FOUND] not found → record not found
or with fmt.Printf("%+v", usrErr):
[PRFL.USR.NOT_FOUND] user not found
→ [DEPS.PG.NOT_FOUND] not found
→ record not found
connectly.ai/go/services/profile.(*UserService).QueryUser
/usr/i/src/go/services/profile/user.go:1234
The stacktrace will come from the innermost Error. If you are writing a helper function, you can use CallerSkip(skip) to skip frames:
func mapUserError(ctx context.Context, err error) Error {
switch {
case IsErrorCode(err, DEPS.PG.NOT_FOUND):
return PRFL.USR.NOT_FOUND.CallerSkip(1).Wrap(ctx, err, "...")
default:
return PRFL.USR.UNKNOWN.CallerSkip(1).Wrap(ctx, err, "...")
}
}
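How frame skipping can work is sketched below with the standard runtime package. This is only an illustration of the idea behind CallerSkip(), not the framework's code: callerName and newError are hypothetical helpers, and the error is reduced to a string that names the function it is attributed to.

```go
package main

import (
	"fmt"
	"runtime"
	"strings"
)

// callerName returns the short name of the function `skip` frames above
// the caller of callerName (skip=0 is the direct caller).
//
//go:noinline
func callerName(skip int) string {
	pcs := make([]uintptr, 1)
	// +2 skips runtime.Callers and callerName itself.
	if runtime.Callers(skip+2, pcs) == 0 {
		return "unknown"
	}
	frame, _ := runtime.CallersFrames(pcs).Next()
	return frame.Function[strings.LastIndex(frame.Function, ".")+1:]
}

// newError is a hypothetical stand-in for CODE.CallerSkip(skip).New(...).
// With skip=0 the error is attributed to the function calling newError.
//
//go:noinline
func newError(skip int, msg string) string {
	return fmt.Sprintf("%s (created in %s)", msg, callerName(skip+1))
}

// mapUserError is a helper, so it skips one frame to blame its caller.
//
//go:noinline
func mapUserError() string { return newError(1, "user not found") }

//go:noinline
func queryUser() string { return mapUserError() }

//go:noinline
func createDirectly() string { return newError(0, "invalid request") }

func main() {
	fmt.Println(queryUser())      // attributed to queryUser, not mapUserError
	fmt.Println(createDirectly()) // attributed to createDirectly
}
```

The go:noinline directives are there only to keep the demo's frames easy to reason about.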
Add context to an error using With()
You can add context to an error using .With(l.String(...)).
logging/l is a helper package that exports sugar functions for logging.
l.String("flag", flag) returns a Tag{String: flag} and l.UUID("user_id", userID) returns a Tag{Stringer: userID}.
import l "connectly.ai/go/pkgs/logging/l"
usrErr := PRFL.USR.NOT_FOUND.
With(l.UUID("user_id", req.UserID), l.String("flag", flag)).
Wrap(ctx, dbErr, "user not found")
The tags can be output with fmt.Printf("%+v", usrErr):
[PRFL.USR.NOT_FOUND] user not found
{"user_id": "81febc07-5c06-4e01-8f9d-995bdc2e0a9a", "flag": "ABRW"}
→ [DEPS.PG.NOT_FOUND] not found
{"a number": 42}
→ record not found
Add context to errors directly inside New(), Wrap(), or MapError():
By leveraging the l.String() function and its family, New() and similar functions can smartly detect tags among formatting arguments. No need to introduce different functions.
err := INF.HEALTH.NOT_READY.New(ctx,
    "service %q is not ready (retried %v times)",
    req.ServiceName,
    l.String("flag", flag),
    countRetries,
    l.Number("count", countRetries),
)
will output:
[INF.HEALTH.NOT_READY] service "magic" is not ready (retried 2 times)
{"flag": "ABRW", "count": 2}
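One way such detection could work is a type switch over the variadic arguments. The sketch below is an assumption about the approach, not the framework's code: Tag, splitArgs, and format are hypothetical, and the tags are rendered as key=value pairs instead of JSON.

```go
package main

import (
	"fmt"
	"strings"
)

// Tag is a hypothetical key/value pair, standing in for the values
// returned by helpers such as l.String() and l.Number().
type Tag struct {
	Key   string
	Value any
}

// splitArgs separates tags from ordinary formatting arguments,
// preserving the relative order within each group.
func splitArgs(args []any) (fmtArgs []any, tags []Tag) {
	for _, arg := range args {
		if tag, ok := arg.(Tag); ok {
			tags = append(tags, tag)
			continue
		}
		fmtArgs = append(fmtArgs, arg)
	}
	return fmtArgs, tags
}

// format renders the message with the plain arguments only,
// then appends the tags as key=value pairs.
func format(msg string, args ...any) string {
	fmtArgs, tags := splitArgs(args)
	var sb strings.Builder
	sb.WriteString(fmt.Sprintf(msg, fmtArgs...))
	for _, tag := range tags {
		fmt.Fprintf(&sb, " %s=%v", tag.Key, tag.Value)
	}
	return sb.String()
}

func main() {
	fmt.Println(format("service %q is not ready (retried %v times)",
		"magic", Tag{"flag", "ABRW"}, 2, Tag{"count", 2}))
	// service "magic" is not ready (retried 2 times) flag=ABRW count=2
}
```

Because tags are filtered out before formatting, they can appear anywhere in the argument list without shifting the positions of the formatting verbs.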
Error0, VlError, ApiError
Currently, there are 3 types that implement the Error interface. You can add more types if necessary. Each one can have a different structure, with custom methods for specific needs.
Error is an extension of Go's standard error interface
type Error interface {
    error
    Code() Code
    Message() string
    Fields() []tags.Field
    StackTrace() stacktrace.StackTrace
    _base() *base // a private method
}
It contains a private method to ensure that we don't accidentally implement new Error types outside of the errors package. We may (or may not) lift that restriction in the future when we gain experience with more usage patterns.
Why don't we just use the standard error interface and use type assertion?
Because we want to separate third-party errors from our internal errors. All layers and packages in our internal code must always return Error. This way we can safely know when we have to convert third-party errors, and when we only need to deal with our internal error codes.
It also creates a boundary between migrated packages and not-yet-migrated packages. Back to reality, we cannot just declare a new type, wave a magic wand, whisper a spell prompt, and then have millions of lines of code magically converted and working seamlessly with no bugs! No, that future is not here yet. It may come someday, but for now, we still have to migrate our packages one by one.
Error0 is the default Error type
Most error codes will produce an Error0 value. It contains a base and an optional sub-error. You can use NewX() to return a concrete *Error0 struct instead of an Error interface, but you need to be careful.
type Error0 struct {
base
err error
}
var errA Error   = DEPS.PG.NOT_FOUND.New (ctx, "not found")
var errB *Error0 = DEPS.PG.NOT_FOUND.NewX(ctx, "not found")
base is the common structure shared by all Error implementations, providing common functionality: Code(), Message(), StackTrace(), Fields(), and more.
type base struct {
code Code
msg string
kv []tags.Field
stack stacktrace.StackTrace
}
VlError is for validation errors
It can contain multiple sub-errors, and provides nice methods to work with validation helpers.
type VlError struct {
base
errs []error
}
You can create a VlError similar to other Errors:
err := PRFL.USR.INVALID_ARGUMENT.New(ctx, "invalid request")
Or make a VlBuilder, add errors to it, then convert it to a VlError:
userID, err0 := parseUUID(req.UserId)
err1 := validatePassword(req.Password)
vl := PRFL.USR.INVALID_ARGUMENT.WithMsg("invalid request")
vl.Add(err0, err1)
vlErr := vl.ToError(ctx)
And include key/value pairs as usual:
vl := PRFL.USR.INVALID_ARGUMENT.
With(l.Bool("testingenv", true)).
WithMsg("invalid request")
userID, err0 := parseUUID(req.UserId)
err1 := validatePassword(req.Password)
vl.Add(err0, err1)
vlErr := vl.ToError(ctx, l.String("user_id", req.UserId))
Using fmt.Printf("%+v", vlErr) will output:
[PRFL.USR.INVALID_ARGUMENT] invalid request
{"testingenv": true, "user_id": "A1234567890"}
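The builder pattern behind VlBuilder can be sketched with the standard library alone. Everything here is a simplified assumption: validation and validationError are hypothetical names, codes and tags are omitted, and sub-errors are exposed through Unwrap() []error (Go 1.20+) so that errors.Is can still reach them.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// validation collects failed checks; a hypothetical stand-in for VlBuilder.
type validation struct {
	msg  string
	errs []error
}

// Check records an error when cond is false.
func (v *validation) Check(cond bool, msg string) {
	if !cond {
		v.errs = append(v.errs, errors.New(msg))
	}
}

// Add records every non-nil error, so callers can pass results directly.
func (v *validation) Add(errs ...error) {
	for _, err := range errs {
		if err != nil {
			v.errs = append(v.errs, err)
		}
	}
}

// validationError holds all sub-errors under one top-level message.
type validationError struct {
	msg  string
	errs []error
}

func (e *validationError) Error() string {
	var sb strings.Builder
	sb.WriteString(e.msg)
	for i, err := range e.errs {
		fmt.Fprintf(&sb, "\n  %d. %v", i+1, err)
	}
	return sb.String()
}

// Unwrap exposes the sub-errors to errors.Is and errors.As.
func (e *validationError) Unwrap() []error { return e.errs }

// ToError returns nil when every check passed, so the usual
// `if err != nil` test keeps working.
func (v *validation) ToError() error {
	if len(v.errs) == 0 {
		return nil
	}
	return &validationError{msg: v.msg, errs: v.errs}
}

func main() {
	v := &validation{msg: "invalid request"}
	v.Check("" != "", "no username") // simulate an empty username
	v.Check(len("secret") >= 8, "password too short")
	if err := v.ToError(); err != nil {
		fmt.Println(err)
	}
}
```

Returning a plain nil from ToError, instead of a nil *validationError, avoids the classic typed-nil pitfall where an interface holding a nil pointer compares as non-nil.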
ApiError is an adapter for migrating API errors
Previously, we used a separate api.Error struct for returning API errors to the front-end and external clients. It includes ErrorType and ErrorCode as mentioned before.
package api
import errorpb "connectly.ai/pb/models/error"
// Deprecated
type Error struct {
pbType errorpb.ErrorType
pbCode errorpb.ErrorCode
cause error
msg string
usrMsg string
usrTitle string
// ...
}
This type is now deprecated. Instead, we will declare all the mappings (ErrorType, ErrorCode, gRPC code, HTTP code) in a centralized place, and convert them at the corresponding boundaries. I will discuss code declaration in the next section.
To do the migration to the new namespace error framework, we added a temporary namespace ZZZ.API_TODO. Every ErrorCode becomes a ZZZ.API_TODO code.
ZZZ.API_TODO.UNEXPECTED
ZZZ.API_TODO.INVALID_REQUEST
ZZZ.API_TODO.USERNAME_
ZZZ.API_TODO.META_CHOSE_NOT_TO_DELIVER
ZZZ.API_TODO.MESSAGE_WABA_TEMPLATE_CAN_ONLY_EDIT_ONCE_IN_24_HOURS
And ApiError is created as an adapter. All functions that previously returned *api.Error were changed to return Error (implemented by *ApiError) instead.
package api
import . "connectly.ai/go/pkgs/errors/E"
// previous
func FailPreconditionf(err error, msg string, args ...any) *Error {
return &Error{
pbType: ERROR_TYPE_FAILED_PRECONDITION,
pbCode: ERROR_CODE_MESSAGE_WABA_TEMPLATE_CAN_ONLY_EDIT_ONCE_IN_24_HOURS,
cause: err,
        msg: fmt.Sprintf(msg, args...),
    }
}
// current: this is deprecated, and serves as an adapter
func FailPreconditionf(err error, msg string, args ...any) Error {
ctx := context.TODO()
return ZZZ.API_TODO.MESSAGE_WABA_TEMPLATE_CAN_ONLY_EDIT_ONCE_IN_24_HOURS.
CallerSkip(1). // correct the stacktrace by 1 frame
Wrap(ctx, err, msg, args...)
}
When all the migration is done, the previous usage:
wabaErr := verifyWabaTemplateStatus(tpl)
apiErr := api.FailPreconditionf(wabaErr, "template cannot be edited").
WithErrorCode(ERROR_CODE_MESSAGE_WABA_TEMPLATE_CAN_ONLY_EDIT_ONCE_IN_24_HOURS).
WithUserMsg("According to WhatsApp, the message template can be only edited once in 24 hours. Consider creating a new message template instead.").
ErrorOrNil()
should become:
CPG.TPL.EDIT_ONCE_IN_24_HOURS.Wrap(
    ctx, wabaErr, "template cannot be edited",
    api.UserMsg("According to WhatsApp, the message template can be only edited once in 24 hours. Consider creating a new message template instead."))
Notice that the ErrorCode is implicitly derived from the internal namespace code. No need to explicitly assign it every time. But how do we declare the relationship between codes? It will be explained in the next section.
At this point, you already know how to create new errors from existing codes. It's time to explain codes and how to add a new one.
A Code is implemented as a uint16 value, which has a corresponding string representation.
type Code struct { code uint16 }
fmt.Printf("%q", DEPS.PG.NOT_FOUND)
// "DEPS.PG.NOT_FOUND"
To store those strings, there is an array of all available CodeDesc:
const MaxCode = 321 // 👈 this value is generated
var allCodes [MaxCode]CodeDesc
type CodeDesc struct {
    c    int      // 42
    code string   // DEPS.PG.NOT_FOUND
    tags []string // e.g. postgres
    api  APICodeDesc
}
type APICodeDesc struct {
    ErrorType   errorpb.ErrorType
    ErrorCode   errorpb.ErrorCode
    HTTPCode    int
    DefMessage  string
    UserMessage string
    UserTitle   string
}
Here's how codes are declared:
var DEPS deps // dependencies
var PRFL prfl // profile
var FLD fld // flow document
type deps struct {
PG pg // postgres
RD rd // redis
}
// tag:postgres
type pg struct {
NOT_FOUND Code0 // record not found
CONFLICT Code0 // record already exist
MALFORM_SQL Code0
}
// tag:profile
type prfl struct {
REPO prfl_repo
USR usr
AUTH auth
}
// tag:profile
type prfl_repo struct {
NOT_FOUND Code0 // internal error code
INVALID_ARGUMENT VlCode // internal error code
}
// tag:usr
type usr struct {
NOT_FOUND Code0 `api-code:"USER_NOT_FOUND"`
INVALID_ARGUMENT VlCode `api-code:"INVALID_ARGUMENT"`
DISABLED_ACCOUNT Code0 `api-code:"DISABLED_ACCOUNT"`
}
// tag:auth
type auth struct {
UNAUTHENTICATED Code0 `api-code:"UNAUTHENTICATED"`
PERMISSION_DENIED Code0 `api-code:"PERMISSION_DENIED"`
}
After declaring new codes, you need to run the generation script:
run gen-errors
The generated code will look like this:
// Code generated by error-codes. DO NOT EDIT.
func init() {
// ...
PRFL.AUTH.UNAUTHENTICATED = Code0{Code{code: 143}}
PRFL.AUTH.PERMISSION_DENIED = Code0{Code{code: 144}}
// ...
allCodes[143] = CodeDesc{
c: 143, code: "PRFL.AUTH.UNAUTHENTICATED",
tags: []string{"auth", "profile"},
api: APICodeDesc{
ErrorType: ERROR_TYPE_UNAUTHENTICATED,
ErrorCode: ERROR_CODE_UNAUTHENTICATED,
HTTPCode: 401,
DefMessage: "Unauthenticated error",
UserMessage: "You are not authenticated.",
UserTitle: "Unauthenticated error",
},
}
}
Each Error type has a corresponding Code type
Ever wonder how PRFL.USR.NOT_FOUND.New() creates an *Error0 and PRFL.USR.INVALID_ARGUMENT.New() creates a *VlError? It's because they use different code types.
Each Code type returns a different Error type, and each can have its own extra methods:
type Code0 struct { Code }
type VlCode struct { Code }
func (c Code0) New(/*...*/) Error {
return &Error0{/*...*/}
}
func (c VlCode) New(/*...*/) Error {
return &VlError{/*...*/}
}
// extra methods on VlCode to create VlBuilder
func (c VlCode) WithMsg(msg string, args ...any) *VlBuilder {/*...*/}
type VlBuilder struct {
code VlCode
msg string
args []any
}
func (b *VlBuilder) ToError(/*...*/) Error {
    return &VlError{base: base{code: b.code.Code, /*...*/}}
}
Use api-code to mark the codes available for external API
The namespace error code should be used internally.
To make a code available for returning in external HTTP API, you need to mark it with api-code. The value is the corresponding errorpb.ErrorCode.
If an error code is not marked with api-code, it's an internal code and will be shown as a generic Internal Server Error.
Notice that PRFL.USR.NOT_FOUND is an external code, while PRFL.USR.REPO.NOT_FOUND is an internal code.
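The fallback rule for unmarked codes can be pictured as a table lookup with a default. This is a simplified sketch, not the generated code: Code is reduced to a bare uint16, and apiDesc, apiCodes, toAPI, and the "INTERNAL" API code are hypothetical.

```go
package main

import (
	"fmt"
	"net/http"
)

// Code is reduced to a bare number for this sketch.
type Code uint16

// Hypothetical codes: one marked with api-code, one internal only.
const (
	UsrNotFound     Code = iota + 1 // PRFL.USR.NOT_FOUND
	UsrRepoNotFound                 // PRFL.USR.REPO.NOT_FOUND
)

// apiDesc describes how a code is exposed to external clients.
type apiDesc struct {
	APICode  string
	HTTPCode int
}

// apiCodes holds only the codes that were marked with api-code.
var apiCodes = map[Code]apiDesc{
	UsrNotFound: {APICode: "USER_NOT_FOUND", HTTPCode: http.StatusNotFound},
}

// toAPI returns the external description of a code. Codes without an
// entry are internal and fall back to a generic internal server error.
func toAPI(c Code) apiDesc {
	if desc, ok := apiCodes[c]; ok {
		return desc
	}
	return apiDesc{APICode: "INTERNAL", HTTPCode: http.StatusInternalServerError}
}

func main() {
	fmt.Println(toAPI(UsrNotFound))     // {USER_NOT_FOUND 404}
	fmt.Println(toAPI(UsrRepoNotFound)) // {INTERNAL 500}
}
```

Defaulting to a generic error keeps internal details such as repository-level codes from leaking to external clients by accident.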
Declare mapping between ErrorCode, ErrorType, and gRPC/HTTP codes in protobuf using enum option:
// error/type.proto
ERROR_TYPE_PERMISSION_DENIED = 707 [(error_type_detail_option) = {
type: "PermissionDeniedError",
grpc_code: PERMISSION_DENIED,
http_code: 403, // Forbidden
message: "permission denied",
user_title: "Permission denied",
user_message: "The caller does not have permission to execute the specified operation.",
}];
// error/code.proto
ERROR_CODE_DISABLED_ACCOUNT = 70020 [(error_code_detail_option) = {
error_type: ERROR_TYPE_DISABLED_ACCOUNT,
grpc_code: PERMISSION_DENIED,
http_code: 403, // Forbidden
message: "account is disabled",
user_title: "Account is disabled",
user_message: "Your account is disabled. Please contact support for more information.",
}];
UNEXPECTED and UNKNOWN codes
Each layer usually has 2 generic codes, UNEXPECTED and UNKNOWN. They serve slightly different purposes:
UNEXPECTED code is used for errors that should never happen.
UNKNOWN code is used for errors that are not explicitly handled.
When receiving an error returned from a function, you need to handle it: convert third-party errors to internal namespace errors and map error codes from inner layers to outer layers.
Convert third-party errors to internal namespace errors
How you handle errors depends on what the third-party package returns and what your application needs. For example, when handling database or external API errors:
switch {
case errors.Is(err, sql.ErrNoRows):
// map a database "no rows" error to an internal "not found" error
return nil, PRFL.USR.NOT_FOUND.Wrap(ctx, err, "user not found")
case errors.Is(err, context.DeadlineExceeded):
// map a context deadline exceeded error to a timeout error
return nil, PRFL.USR.TIMEOUT.Wrap(ctx, err, "query timeout")
default:
// wrap any other error as unknown
return nil, PRFL.USR.UNKNOWN.Wrap(ctx, err, "unexpected error")
}
Using helpers for internal namespace errors
IsErrorCode(err, CODES...): Checks if the error contains any of the specified codes.
IsErrorGroup(err, GROUP): Returns true if the error belongs to the input group.
Typical usage pattern:
user, err := queryUser(ctx, userReq)
switch {
case err == nil:
    // continue
case IsErrorCode(err, PRFL.USR.REPO.NOT_FOUND):
    // check for specific error code and convert to external code
    // and return as HTTP 404 Not Found
    return nil, PRFL.USR.NOT_FOUND.Wrap(ctx, err, "user not found")
case IsErrorGroup(err, PRFL.USR):
    // errors belonging to the PRFL.USR group are returned as is
    return nil, err
default:
    return nil, PRFL.USR.UNKNOWN.Wrap(ctx, err, "failed to query user")
}
MapError() makes writing mapping code easier:
Since mapping error codes is a common pattern, there is a MapError() helper to make writing code faster. The above code can be rewritten as:
user, err := queryUser(ctx, userReq)
if err != nil {
    return nil, MapError(ctx, err).
        Map(PRFL.USR.REPO.NOT_FOUND, PRFL.USR.NOT_FOUND, "user not found").
        KeepGroup(PRFL.USR).
        Default(PRFL.USR.UNKNOWN, "failed to query user")
}
You can format arguments and add key/value pairs as usual:
return nil, MapError(ctx, err).
    Map(PRFL.USR.REPO.NOT_FOUND, PRFL.USR.NOT_FOUND,
        "user %v not found", username,
        l.String("flag", flag)).
    KeepGroup(PRFL.USR).
    Default(PRFL.USR.UNKNOWN, "failed to query user",
        l.Any("retries", retryCount))
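For readers curious how such a fluent mapper might be put together, here is a rough sketch. It is an assumption about the shape of the API, not the real MapError(): codes are plain strings, group membership is a prefix check, only the outermost code is inspected, and context, formatting arguments, and tags are omitted.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// Code is a namespace code such as "PRFL.USR.NOT_FOUND" (a string for brevity).
type Code string

// codedError is an illustrative error carrying a Code and an optional cause.
type codedError struct {
	code  Code
	msg   string
	cause error
}

func (e *codedError) Error() string { return fmt.Sprintf("[%s] %s", e.code, e.msg) }
func (e *codedError) Unwrap() error { return e.cause }

// topCode returns the code of the outermost coded error, if any.
func topCode(err error) (Code, bool) {
	var e *codedError
	if errors.As(err, &e) {
		return e.code, true
	}
	return "", false
}

// Mapper accumulates mapping rules; the first matching rule wins.
type Mapper struct {
	err    error
	result error // set once a rule has matched
}

func MapError(err error) *Mapper { return &Mapper{err: err} }

// Map converts an error with code `from` into a new error with code `to`.
func (m *Mapper) Map(from, to Code, msg string) *Mapper {
	if code, ok := topCode(m.err); m.result == nil && ok && code == from {
		m.result = &codedError{code: to, msg: msg, cause: m.err}
	}
	return m
}

// KeepGroup returns the error unchanged when its code belongs to the group.
func (m *Mapper) KeepGroup(group Code) *Mapper {
	code, ok := topCode(m.err)
	if m.result == nil && ok && strings.HasPrefix(string(code), string(group)+".") {
		m.result = m.err
	}
	return m
}

// Default ends the chain, wrapping unmatched errors with the fallback code.
func (m *Mapper) Default(code Code, msg string) error {
	if m.result != nil {
		return m.result
	}
	return &codedError{code: code, msg: msg, cause: m.err}
}

func main() {
	repoErr := &codedError{code: "PRFL.USR.REPO.NOT_FOUND", msg: "no rows"}
	err := MapError(repoErr).
		Map("PRFL.USR.REPO.NOT_FOUND", "PRFL.USR.NOT_FOUND", "user not found").
		KeepGroup("PRFL.USR").
		Default("PRFL.USR.UNKNOWN", "failed to query user")
	fmt.Println(err) // [PRFL.USR.NOT_FOUND] user not found
}
```

Making Default() the only method that returns an error forces every chain to end with an explicit fallback, so no error leaves the layer without a code from that layer.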
Testing namespace Errors
Testing is critical for any serious code base. The framework provides specialized helpers like ΩxError() to make writing and asserting error conditions in tests easier and more expressive.
// 👉 return true if the error contains the message
ΩxError(err).Contains("not found")
// 👉 return true if the error does not contain the message
ΩxError(err).NOT().Contains("not found")
There are many more methods, and you can chain them too:
ΩxError(err).
    MatchCode(DEPS.PG.NOT_FOUND). // match any code in top or wrapped errors
    TopErrorMatchCode(PRFL.TPL.NOT_FOUND). // only match code from the top error
    MatchAPICode(API_CODE.WABA_TEMPLATE_NOT_FOUND). // match errorpb.ErrorCode
    MatchExact("exact message to match")
Why use methods instead of Ω(err).To(testing.MatchCode())?
Because methods are more discoverable. When you're faced with dozens of functions like testing.MatchValues(), it's hard to know which ones will work with Errors and which will not. With methods, you can simply type a dot, and your IDE will list all available methods specifically designed for asserting Errors.
The framework is just half of the story. Writing the code? That’s the easy part. The real challenge starts when you have to bring it into a massive, living codebase where dozens of engineers are pushing changes daily, customers expect everything to work perfectly, and the system just can’t stop running.
Migration comes with responsibility. It's about carefully splitting the work into tiny bits of code, making small changes at a time, and breaking a ton of tests in the process. Then manually inspecting and fixing them one by one, merging into the main branch, deploying to production, watching the logs and alerts. Repeating it over and over...
Here are some tips for migration that we learned along the way:
Start with search and replace: Begin by replacing old patterns with the new framework. Fix any compilation issues that arise from this process.
For example, replace all error in this package with Error.
type ProfileController interface {
LoginUser(req *LoginRequest) (*LoginResponse, error)
QueryUser(req *QueryUserRequest) (*QueryUserResponse, error)
}
The new code will look like this:
import . "connectly.ai/go/pkgs/errors/E"
type ProfileController interface {
LoginUser(req *LoginRequest) (*LoginResponse, Error)
QueryUser(req *QueryUserRequest) (*QueryUserResponse, Error)
}
Migrate one package at a time: Start with the lowest-level packages and work your way up. This way, you can ensure that the lower-level packages are fully migrated before moving on to the higher-level ones.
Add missing unit tests: If parts of the codebase lack tests, add them. If you are not confident in your changes, add more tests. They are helpful to make sure that your changes don’t break existing functionality.
If your package calls back into higher-level packages: Consider marking the related functions as DEPRECATED, then add new functions with the new Error type.
Assume that you are migrating the database package, which has the Transaction() method:
package database
func (db *DB) Transaction(ctx context.Context,
fn func(tx *gorm.DB) error) error {
return db.gorm.Transaction(func(tx *gorm.DB) error {
return fn(tx)
})
}
And it is used in the user service package:
err = s.DB(ctx).Transaction(ctx, func(tx *gorm.DB) error {
    user, usrErr := s.repo.CreateUser(ctx, tx, user)
    if usrErr != nil {
        return usrErr
    }
    // ...
    return nil
})
Since you are migrating the database package first, the user package and dozens of other packages are left as is. The s.repo.CreateUser() call still returns the old error type while the Transaction() method needs to return the new Error type. You can mark the Transaction() method as DEPRECATED and add a new TransactionV2() method:
package database
// Deprecated: use TransactionV2 instead
func (db *DB) Transaction_DEPRECATED(ctx context.Context,
fn func(tx *gorm.DB) error) error {
return db.gorm.Transaction(func(tx *gorm.DB) error {
return fn(tx)
})
}
func (db *DB) TransactionV2(ctx context.Context,
fn func(tx *gorm.DB) error) Error {
err := db.gorm.Transaction(func(tx *gorm.DB) error {
return fn(tx)
})
return adaptToErrorV2(err)
}
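The adapter at the end of TransactionV2() has one job: turn whatever the callback returned into an Error. A minimal sketch of such an adapter follows, under simplifying assumptions: Error exposes its code as a string, and adaptToError, codedError, and the "DEPS.PG.UNKNOWN" code are hypothetical names for illustration.

```go
package main

import (
	"errors"
	"fmt"
)

// Error is a reduced version of the framework's interface.
type Error interface {
	error
	Code() string
}

// codedError is an illustrative implementation of Error.
type codedError struct {
	code  string
	cause error
}

func (e *codedError) Code() string  { return e.code }
func (e *codedError) Error() string { return fmt.Sprintf("[%s] %v", e.code, e.cause) }
func (e *codedError) Unwrap() error { return e.cause }

// errBoom simulates an error from a package that is not migrated yet.
var errBoom = errors.New("boom")

// adaptToError converts a standard error into an Error.
func adaptToError(err error) Error {
	if err == nil {
		// Return a plain nil interface so callers can compare against nil.
		return nil
	}
	if e, ok := err.(Error); ok {
		return e // already produced by a migrated package
	}
	// Anything else is not explicitly handled, so give it a generic code.
	return &codedError{code: "DEPS.PG.UNKNOWN", cause: err}
}

func main() {
	fmt.Println(adaptToError(nil) == nil) // true
	fmt.Println(adaptToError(errBoom))    // [DEPS.PG.UNKNOWN] boom
}
```

Checking for nil first matters: wrapping a nil error would turn a successful transaction into a failure.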
Add new error codes as you go: When you encounter an error that doesn’t fit into the existing ones, add a new code. This will help you build a comprehensive set of error codes over time. Codes from other packages are always available as references.
Error handling in Go can feel simple at first: just return an error and move on. But as our codebase grew, that simplicity turned into a tangled mess of vague logs, inconsistent handling, and endless debugging sessions.
By stepping back and rethinking how we handle errors, we've built a system that works for us, not against us. Centralized and structured namespace codes give us clarity, while tools for mapping, wrapping, and testing errors make our lives easier. Instead of swimming through a sea of logs, we now have meaningful, traceable errors that tell us what's wrong and where to look.
This framework isn’t just about making our code cleaner; it’s about saving time, reducing frustration, and helping us prepare for the unknown. It's just the beginning of a journey — we are still discovering more patterns — but the result is a system that can somehow bring peace of mind to error handling. Hopefully, it can spark some ideas for your projects too! 😊
I'm Oliver Nguyen. A software maker working mostly in Go and JavaScript. I enjoy learning and seeing a better version of myself each day. Occasionally spin off new open source projects. Share knowledge and thoughts during my journey.
The post is also published at blog.connectly.ai and olivernguyen.io 👋