Testing Azure retry logic locally: why I stopped mocking 429s and started injecting them
There is a certain category of test that feels good to write but does not actually test what you think it does. Retry logic sits squarely in that category.
The usual pattern is this: inject a fake HttpMessageHandler, make it return a 429 or 503 on the first N calls, assert that the code retried and eventually succeeded. The test passes. You ship with confidence. Then, in production, a real throttling event triggers a path through the Azure SDK that your mock never covered, and the retry policy does not behave the way the test implied.
The issue is not that the mock is wrong. It is that the mock bypasses the entire SDK transport layer. When you return a 429 from a fake handler, you are testing whether your own retry wrapper handles it correctly. You are not testing whether Azure.Core's built-in retry pipeline fires, whether the Retry-After header is respected, or whether the SDK's own exception hierarchy propagates through your application code the way you assumed. That is a different bar entirely.
