Stop using try/except as control flow in Python services

Last updated: May 15, 2026


Using try/except as service control flow is an antipattern in Python whenever the failure path fires more than rarely, spans more than a single expression, or crosses a subsystem boundary. CPython still builds a traceback at every raise, your APM ingests the event as a real error, static type checkers cannot enforce that callers handle it, and one stray exception inside asyncio.gather() can leave its siblings orphaned or cancelled. Reserve exceptions for genuinely exceptional, caller-actionable conditions; return a tagged result otherwise.

  • Decision rubric: if miss-rate × call-rate × stack-depth is non-trivial, the explicit check beats try/except in CPython 3.12.
  • The “zero-cost” change in 3.11+ removed the entry cost of try, not the cost of raise itself — traceback construction is still eager.
  • asyncio.gather(*tasks) with the default return_exceptions=False propagates the first exception and leaves sibling tasks running with nothing awaiting them; they are cancelled only if gather itself is cancelled. TaskGroup always cancels the remaining tasks.
  • mypy and pyright cannot force callers to handle a raised exception, but can enforce handling of a Result[T, E] union.
  • Single-expression EAFP (d[k], next(it)) is not the antipattern; block-level or cross-subsystem try/except is.

The 70-word verdict and the rubric

The antipattern is not “ever using try/except.” It is using try/except as a substitute for a branch, in a hot path, around a block, or across a subsystem boundary. The rubric is a product: miss-rate × call-rate × stack-depth. If any factor is tiny (a rare error, a cold path, a one-frame helper), exceptions remain cheap. When all three rise together, the cost moves from microseconds to seconds of CPU per minute of traffic and to noise in every alerting tool you own.
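
To make the product concrete, here is a back-of-the-envelope sketch. The function name, the per-raise costs (base_us, per_frame_us), and the traffic figures are all placeholder assumptions rather than measurements; replace them with numbers profiled from your own service.

```python
def raise_cpu_seconds_per_minute(
    miss_rate: float,            # fraction of calls that take the raise path
    calls_per_second: float,     # call rate of the function
    stack_depth: int,            # frames between the raise site and the handler
    base_us: float = 1.0,        # ASSUMED fixed cost of one raise, in microseconds
    per_frame_us: float = 0.25,  # ASSUMED extra cost per frame unwound
) -> float:
    """Estimate CPU seconds spent raising per minute of traffic."""
    raises_per_minute = miss_rate * calls_per_second * 60
    cost_per_raise_us = base_us + per_frame_us * stack_depth
    return raises_per_minute * cost_per_raise_us / 1_000_000


# Hypothetical hot path: half the calls miss, 20k calls/s, handler 4 frames up.
hot = raise_cpu_seconds_per_minute(0.5, 20_000, 4)

# Hypothetical cold path: rare miss, low call rate, handler one frame up.
cold = raise_cpu_seconds_per_minute(0.001, 50, 1)

print(f"hot: {hot:.2f} CPU-s/min, cold: {cold:.6f} CPU-s/min")
```

Under these assumed costs the hot path spends about 1.2 CPU-seconds per minute on raising alone, while the cold path is lost in the noise. The absolute figures matter less than the fact that the three factors multiply.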

EAFP is not one thing: four uses people conflate

The standard advice quotes the EAFP entry in the Python glossary as though it endorsed a single style. It doesn’t. There are four distinct uses, and only two of them are defensible at the service tier.

[Figure: topic diagram of the four uses of try/except.]

The diagram separates the four cases along two axes: scope of the protected expression (single op vs. block) and locality of the failure handler (same function vs. across a boundary). The bottom-right quadrant, a multi-line try block whose except lives in a different module or service, is the case this article argues against most strongly.

Four uses of try/except and where they belong

| Use | Example | Verdict | Why |
| --- | --- | --- | --- |
| Single-expression EAFP | try: d[k]\nexcept KeyError: ... | Fine when miss-rate is low | One opcode is protected; failure path is local; reader sees the failure clearly. |
| Iterator protocol | StopIteration in a generator | Required | The language itself uses exceptions to terminate iteration. |
| Block-level try/except | Wrapping 20 lines of business logic | Antipattern | Catches unrelated exceptions; hides which line failed; defeats type narrowing. |
| Cross-subsystem try/except | HTTP handler catches RepositoryError from a deep call | Antipattern | Implicit contract; impossible to enforce; APM cannot distinguish from a bug. |
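
A small sketch of why the block-level row is marked as an antipattern. The order and price dictionaries and both function names are hypothetical, made up for illustration: the point is that one except KeyError around several lookups cannot tell an expected miss from a data bug.

```python
def price_block_level(orders, prices, order_id):
    """Block-level try: three different lookups share one handler."""
    try:
        order = orders[order_id]          # expected miss: unknown order
        unit = prices[order["sku"]]       # data bug: SKU missing from price list
        return unit * order["qty"]        # data bug: malformed order record
    except KeyError:
        return None                       # all three failures look identical


def price_narrow(orders, prices, order_id):
    """Explicit check for the expected miss; real bugs still raise."""
    order = orders.get(order_id)
    if order is None:
        return None                       # the only outcome treated as normal
    return prices[order["sku"]] * order["qty"]


orders = {1: {"sku": "a", "qty": 2}, 2: {"sku": "zzz", "qty": 1}}
prices = {"a": 5}

print(price_block_level(orders, prices, 2))   # the missing SKU is silently swallowed
print(price_narrow(orders, prices, 99))       # the unknown order is a normal miss
```

With the narrow version, order 2 raises KeyError for the missing SKU instead of being reported as "no such order", which is the behavior you want from a genuine data bug.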

What raise actually costs in CPython 3.12

The “zero-cost exception handling” change in Python 3.11, described in the CPython 3.11 bytecode changes, replaced the old SETUP_FINALLY/POP_BLOCK pair with an out-of-line exception table. The happy path no longer pays for entering a try. But the unhappy path still pays, and that is the cost the usual search-result advice misses.

Compare two equivalent lookups with dis:

import dis

def with_try(d, k):
    try:
        return d[k]
    except KeyError:
        return None

def with_get(d, k):
    return d.get(k)

dis.dis(with_try)
print("---")
dis.dis(with_get)

On CPython 3.12 the with_get body is a straight-line LOAD_FAST, LOAD_ATTR (in its method-loading form, which replaced LOAD_METHOD in 3.12), LOAD_FAST, CALL, RETURN_VALUE. The with_try happy path is similar: that is the zero-cost win. The miss path is the difference: BINARY_SUBSCR raises, the interpreter consults the exception table, builds a traceback object that walks the current frame, and dispatches to the handler. None of this work appears in the dictionary version, where the C method simply returns None.

[Figure: terminal output of the dis example.]

The terminal output shows the two disassemblies side by side. The bodies are similar in opcode count on the happy path; the except arm adds the handler opcodes, and at runtime the interpreter does the exception and traceback construction that the get path skips entirely. As PEP 657 (Include Fine Grained Error Locations in Tracebacks) documents, CPython 3.11+ also records column-level position data for every instruction, which makes tracebacks more useful for debugging at the cost of extra metadata in each code object.

The cost of an individual raise is small in isolation. The point of the rubric is that “small × hot path × deep stack” is not small. Stack depth matters because traceback construction is proportional to the number of frames the exception traverses before it is caught: a request handler that catches an exception raised deep inside a repository layer pays for every frame in between, every time.
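
The link between stack depth and traceback size can be observed directly. This is a minimal sketch with hypothetical helper names (raise_at_depth, traceback_length); it counts the entries in the traceback chain of a caught exception rather than measuring time.

```python
def raise_at_depth(n):
    """Recurse n frames down, then raise a control-flow style KeyError."""
    if n == 0:
        raise KeyError("miss")
    return raise_at_depth(n - 1)


def traceback_length(depth):
    """Catch the exception here and count the traceback entries it collected."""
    try:
        raise_at_depth(depth)
    except KeyError as exc:
        count = 0
        tb = exc.__traceback__
        while tb is not None:      # one entry per frame the exception passed through
            count += 1
            tb = tb.tb_next
        return count
    return 0


for depth in (0, 10, 100):
    print(depth, traceback_length(depth))
```

The count grows by one for every extra frame between the raise and the handler, which is the stack-depth factor in the rubric.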

A useful baseline: write the same code with try/except, with k in d, and with d.get(k), then call each 10 million times at miss rates of 0%, 1%, 50%, and 100%. The qualitative shape is consistent across CPython 3.11 and 3.12: the three are within noise at 0% miss; get and in pull ahead linearly with miss rate; and the gap widens with each additional stack frame the exception must unwind. The exact crossover varies by hardware, but you do not need a precise number to know which side of it your service is on.
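
One way to set up that baseline is sketched below with timeit. The helper names (make_keys, bench) and the scaled-down defaults (1,000 keys, 200 repeats) are assumptions chosen so the script finishes quickly; raise them toward the 10 million calls mentioned above for steadier numbers. The sketch varies miss rate only, not stack depth.

```python
import timeit


def lookup_try(d, k):
    try:
        return d[k]
    except KeyError:
        return None


def lookup_in(d, k):
    return d[k] if k in d else None


def lookup_get(d, k):
    return d.get(k)


def make_keys(n_keys, miss_rate):
    """Build a key list where roughly miss_rate of the keys are absent from the dict."""
    n_miss = int(n_keys * miss_rate)
    hits = list(range(n_keys - n_miss))
    misses = [-(i + 1) for i in range(n_miss)]   # negative keys are never present
    return hits + misses


def bench(fn, miss_rate, n_keys=1_000, repeat=200):
    """Return total seconds for repeat passes over the key list."""
    d = {i: i for i in range(n_keys)}
    keys = make_keys(n_keys, miss_rate)

    def run():
        for k in keys:
            fn(d, k)

    return timeit.timeit(run, number=repeat)


if __name__ == "__main__":
    for rate in (0.0, 0.01, 0.5, 1.0):
        row = {fn.__name__: bench(fn, rate) for fn in (lookup_try, lookup_in, lookup_get)}
        print(f"miss={rate:.0%}", {name: f"{secs:.3f}s" for name, secs in row.items()})
```

No expected timings are given here on purpose: they depend on hardware and interpreter build, and the comparison across miss rates is what carries the argument.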

The observability tax

This is the cost most write-ups on the topic never mention. APMs like Sentry and Datadog, and OpenTelemetry instrumentation, capture exceptions because exceptions are, by definition, the language’s signal for “something exceptional happened.” When your code uses them as a branch, the instrumentation cannot tell the difference.

[Figure: multi-metric dashboard mock, “Exceptions as Control Flow: The Cost”.]

The dashboard mock shows what control-flow exceptions look like in a real issue tracker: an event with a familiar message (“KeyError: 'session_id'”), counted in the thousands per hour, sitting in the same list as a genuine null-pointer regression. Triage becomes archaeology. Error budgets, alert thresholds, and on-call rotations all degrade in proportion to the noise.

You can suppress these in the SDK — most APMs let you filter exception classes or breadcrumbs — but suppression is a band-aid. The real fix is to stop emitting the event, which means stop raising for expected outcomes. If a missing key is a normal answer to “is this user in cache?”, the function should return None or a typed miss, not raise.

Async makes it worse

The behavior of asyncio.gather() is explicit in the standard library docs: if return_exceptions is False (the default), the first raised exception is immediately propagated to the task that awaits on gather(), and the other awaitables “won't be cancelled and will continue to run.” The exception is when gather itself is cancelled by its caller, in which case all submitted awaitables that have not completed yet are cancelled too. In practice, that last case is common: a request handler awaiting gather is cancelled when the client disconnects, when a timeout fires, or when the surrounding task is cancelled by anything else.

The trap has two halves. When one coroutine raises a control-flow exception, gather hands it to the awaiter straight away; the siblings keep running, but nothing is left awaiting them, so their results are dropped. And when a supervising layer cancels the gather (a client disconnect, asyncio.wait_for, a custom timeout), the siblings are cancelled mid-flight. Code that “worked in tests” with a single task loses sibling work in production either way. The Python 3.11+ TaskGroup is more honest about this: it always cancels the remaining tasks when any task raises. Using a raised exception as “no result, try the next branch” inside a gather is a silent data-loss bug waiting for a slow day.
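
The three gather behaviors described above can be reproduced in a few lines. This is a sketch with hypothetical coroutines (miss, slow_ok) and short sleeps standing in for real I/O; it uses only asyncio.gather and asyncio.wait_for so it runs on versions that predate TaskGroup.

```python
import asyncio


async def miss():
    await asyncio.sleep(0)
    raise KeyError("session_id")          # control-flow style "not found"


async def slow_ok(log):
    try:
        await asyncio.sleep(0.05)         # stands in for a slower sibling call
    except asyncio.CancelledError:
        log.append("cancelled")
        raise
    log.append("finished")
    return "value"


async def demo_default():
    """Default gather: the KeyError propagates, the sibling runs on unobserved."""
    log = []
    try:
        await asyncio.gather(miss(), slow_ok(log))
    except KeyError:
        log.append("caught KeyError")
    await asyncio.sleep(0.1)              # give the orphaned sibling time to finish
    return log


async def demo_return_exceptions():
    """return_exceptions=True: the miss comes back as a value next to the result."""
    return await asyncio.gather(miss(), slow_ok([]), return_exceptions=True)


async def demo_cancelled_by_timeout():
    """Cancelling gather itself (here via wait_for) cancels every pending child."""
    log = []
    try:
        await asyncio.wait_for(asyncio.gather(slow_ok(log), slow_ok(log)), timeout=0.01)
    except asyncio.TimeoutError:
        log.append("timed out")
    return log


if __name__ == "__main__":
    print(asyncio.run(demo_default()))
    print(asyncio.run(demo_return_exceptions()))
    print(asyncio.run(demo_cancelled_by_timeout()))
```

In demo_default the sibling does finish, but its return value never reaches the caller, which is the "lost sibling work" case. Returning a value for the miss instead of raising avoids all three variants.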

The type-checker cannot save you

Mypy’s type narrowing understands that a function that raises does not return. What it does not do is force the caller to handle that raise. There is no checked-exception annotation in Python, and the mypy issue for specifying possible unhandled exceptions has been open since 2021 without a path to acceptance.

Compare two equivalent functions:

from dataclasses import dataclass
from typing import Generic, TypeVar, Union

T = TypeVar("T")
E = TypeVar("E")

@dataclass(frozen=True)
class Ok(Generic[T]):
    value: T

@dataclass(frozen=True)
class Err(Generic[E]):
    error: E

Result = Union[Ok[T], Err[E]]


def fetch_user_raising(uid: int) -> str:
    raise LookupError("missing")


def fetch_user_result(uid: int) -> Result[str, LookupError]:
    return Err(LookupError("missing"))


def consumer_raising(uid: int) -> int:
    name = fetch_user_raising(uid)
    return len(name)


def consumer_result(uid: int) -> int:
    res = fetch_user_result(uid)
    return len(res.value)  # intentional: mypy reports a union-attr error, Err has no .value

Run mypy --strict on this file. consumer_raising passes — mypy has no idea the caller forgot to handle LookupError. consumer_result fails: accessing .value on a Union[Ok[str], Err[LookupError]] is a type error until you discriminate the union with isinstance. That is the enforcement an exception-based API cannot give you, and it is the reason “but exceptions are Pythonic” loses force the moment your service surface stabilises into something other people call.
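
For completeness, here is one way the failing consumer could be repaired. The Ok/Err definitions are repeated so the block stands alone; the lookup body (only uid 1 exists) and the name consumer_result_fixed are assumptions for illustration.

```python
from dataclasses import dataclass
from typing import Generic, TypeVar, Union

T = TypeVar("T")
E = TypeVar("E")


@dataclass(frozen=True)
class Ok(Generic[T]):
    value: T


@dataclass(frozen=True)
class Err(Generic[E]):
    error: E


Result = Union[Ok[T], Err[E]]


def fetch_user_result(uid: int) -> Result[str, LookupError]:
    # Hypothetical lookup: only uid 1 exists in this sketch.
    if uid == 1:
        return Ok("ada")
    return Err(LookupError("missing"))


def consumer_result_fixed(uid: int) -> int:
    res = fetch_user_result(uid)
    if isinstance(res, Err):
        return 0               # the miss is handled explicitly
    return len(res.value)      # the checker has narrowed res to Ok[str] here


print(consumer_result_fixed(1), consumer_result_fixed(2))
```

The isinstance check is what the type checker was asking for: after it, only the Ok branch remains and .value is safe to read.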

A decision rubric, not a slogan

The rubric: keep try/except when the failure is rare, local, and caller-actionable. Move to a typed return when the failure is expected, common, or crosses a boundary other engineers maintain.

[Figure: architecture diagram of the refactor.]

The architecture diagram sketches the refactor: a deep repository function returns Result[Row, NotFound] instead of raising; the service layer pattern-matches on the result and converts it to an HTTP 404 at the edge, where the exception class actually carries useful framing for the response. Exceptions live at the edges; the interior speaks in values.

Concrete recipe: identify functions whose error path your tests exercise more than your happy path; replace raise SomeError with return Err(SomeError(...)); type the return as a discriminated union; let mypy walk the callers and fail on every site that forgot to handle the miss. The diff is mechanical, the static analysis pays for itself within a sprint, and the APM goes quiet.

The strongest counter-argument

The best objection is the one PEP 463 (Exception-catching expressions) makes implicitly: Python has no ? operator, no Result in the standard library, no Rust-style match on errors (until PEP 634, which still doesn’t add discriminated unions as a type). Hand-rolling Result[T, E] across a codebase imposes real ergonomic cost: every call site grows an isinstance branch, generic variance bites, and the third-party libraries you depend on still raise.

The rebuttal: the boundary is the point. You don’t convert every internal function; you convert the ones that cross a subsystem line, get monitored by an APM, or compose under asyncio.gather. Inside a tight module — a parser, a single repository class, a generator — keep raising. The criterion is whether the caller is the same person who wrote the raise. When it isn’t, the type system is the only thing left to enforce the contract.

The legitimate cases this article is not arguing against

Iterator termination via StopIteration is the language’s own protocol. Single-expression EAFP on a hot dict with low miss-rate is faster than in-then-index because it does one lookup instead of two; the Real Python best-practices reference on exception handling is correct on this narrow point. Genuinely exceptional cases — disk full, OOM, programmer-error invariants — should keep raising, because they are exactly what the exception machinery was designed for.

The line is whether the failure is part of the function’s expected output. If “not found” is a normal answer, it should travel in the return type. If “the database vanished” is the answer, raise. That distinction, the one most search results refuse to draw, is the only one worth memorising.

Is try/except always bad in Python services?

No. Single-expression EAFP on a low-miss-rate dictionary, iterator termination via StopIteration, and genuinely exceptional conditions like disk-full or OOM are all legitimate. The antipattern is using try/except as a substitute for a branch — wrapping a block of business logic, catching across a subsystem boundary, or replacing what should be a normal return value with a raised exception that travels through several frames before anyone handles it.

Why does raise still have a cost after the 3.11 zero-cost exceptions change?

The 3.11 change removed the entry cost of try by replacing SETUP_FINALLY with an out-of-line exception table. The happy path is now close to free, but the unhappy path still pays: when raise fires, CPython creates the exception object, consults the exception table, adds a traceback entry for each frame the exception passes through, and unwinds every frame between the raise site and the handler. That cost scales with stack depth, not with whether you entered a try.

How do I migrate a function from raising to returning a Result?

Replace raise SomeError(...) with return Err(SomeError(...)), then change the return annotation to a discriminated union like Result[T, E]. Run mypy --strict and let it flag every call site that accessed the value without checking the tag. Fix each one with an isinstance branch or pattern match. Keep exceptions at the edges — convert Err to an HTTP 4xx response inside the request handler, where the framing actually matters.
