Server errors indicate temporary issues with the Serverless Inference service. This page helps you identify these errors, handle them gracefully in your client code, and decide when to escalate to support.
Error types
The following sections describe the error codes that indicate transient server-side problems rather than issues with your request.
500 internal server error
This is a temporary internal error on the server side. The response message is: “The server had an error while processing your request.”
503 service overloaded
The service is experiencing high traffic. The response message is: “The engine is currently overloaded, please try again later.”
Handle server errors
Because these errors are usually transient, the following techniques give the service time to recover before you retry.
- Wait before retrying. Use the following wait times:
  - 500 errors: Wait 30 to 60 seconds.
  - 503 errors: Wait 60 to 120 seconds.
- Use exponential backoff. Increase the wait after each failed attempt, for example by doubling it, instead of retrying at a fixed interval.
- Set appropriate timeouts. Apply the following adjustments:
  - Increase timeout values for your HTTP client.
  - Consider asynchronous requests so that slow responses don't block the rest of your application.
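The techniques above can be combined into one retry helper. The following Python sketch is illustrative only, not an official client. It assumes that doubling the wait up to the top of each recommended range is an acceptable backoff schedule, and the names `RETRY_WINDOWS`, `backoff_delay`, `call_with_retries`, and `post_with_timeout` are hypothetical. The 120-second default timeout is likewise an assumption, not a documented value.

```python
import time
import urllib.error
import urllib.request

# Wait ranges in seconds from the guidance above: (first wait, longest wait).
RETRY_WINDOWS = {500: (30.0, 60.0), 503: (60.0, 120.0)}


def backoff_delay(status, attempt):
    """Return the wait before the next retry; `attempt` is 0-based.

    The wait doubles on each attempt and is capped at the top of the range.
    """
    base, cap = RETRY_WINDOWS[status]
    return min(base * (2 ** attempt), cap)


def call_with_retries(send, max_attempts=4, sleep=time.sleep):
    """Call `send()` until it returns a non-retryable status or attempts run out.

    `send` is a hypothetical zero-argument callable that returns (status, body).
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        status, body = send()
        if status not in RETRY_WINDOWS:
            return status, body
        if attempt < max_attempts - 1:
            sleep(backoff_delay(status, attempt))
    # Every attempt hit a 500 or 503; return the last response for inspection.
    return status, body


def post_with_timeout(url, data, headers, timeout=120.0):
    """Send one POST with an explicit timeout and return (status, body)."""
    request = urllib.request.Request(url, data=data, headers=headers, method="POST")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status, response.read()
    except urllib.error.HTTPError as err:
        # urllib raises on 4xx and 5xx; return the status so the retry loop can inspect it.
        return err.code, err.read()
```

To use it, wrap a request in a zero-argument callable, for example `call_with_retries(lambda: post_with_timeout(url, data, headers))`, where `url`, `data`, and `headers` are your own values. In production code you might also add random jitter to each wait so that many clients don't retry at the same moment.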
Contact support
If retries and backoff don’t resolve the issue, contact support so the team can investigate further. Contact support if any of the following apply:
- Errors persist for more than 10 minutes.
- You see patterns of failures at specific times.
- Error messages contain additional details.
When you contact support, include the following information:
- Error messages and codes.
- Time when errors occurred.
- Your code snippet (remove API keys).
- W&B entity and project names.
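Recording these details as errors occur makes a support request easier to assemble. The following Python sketch is one possible approach under stated assumptions: the helper names `make_error_record`, `persisted_over`, and `redact` are hypothetical, and the 10-minute check only compares the first and last recorded errors, so it doesn't prove that every request in between failed.

```python
from datetime import datetime, timedelta, timezone


def make_error_record(status, message, entity, project, when=None):
    """Build one record with the fields listed above, timestamped in UTC."""
    when = when or datetime.now(timezone.utc)
    return {
        "time": when.isoformat(),
        "status": status,
        "message": message,
        "entity": entity,
        "project": project,
    }


def persisted_over(records, threshold=timedelta(minutes=10)):
    """Return True if the first and last recorded errors are more than `threshold` apart."""
    if len(records) < 2:
        return False
    times = sorted(datetime.fromisoformat(record["time"]) for record in records)
    return times[-1] - times[0] > threshold


def redact(snippet, api_key):
    """Replace a known API key in a code snippet before sharing it."""
    return snippet.replace(api_key, "<REDACTED>")
```

Call `make_error_record` each time a request fails, then use `persisted_over` on the collected records to decide whether the 10-minute threshold has been crossed.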