DigitalOcean Incident History
Serverless Inference: High error rates for open source models (Qwen 3 32B)
Status: Resolved
Duration: 3h 5m
Updates: 2
INCIDENT TIMELINE
✓ Resolved: Tue, Apr 7, 2026, 03:55 PM
This incident has been resolved.
Identified: Tue, Apr 7, 2026, 12:55 PM
We are currently investigating reports of elevated latency and error rates affecting requests to this model through Serverless Inference and Agents.
Earlier observations indicated increased error rates for the open-source Qwen 3 32B model. The Ray dashboard also showed multiple workers in a pending state, suggesting capacity constraints.
Our analysis determined that the model was experiencing higher-than-expected request volume without sufficient resources to scale accordingly. To address this, the node pool size has been increased to improve available capacity. However, there are still insufficient nodes to fully support the desired number of model replicas.
Following the node pool expansion, a new pod-related error has been identified. Our Engineering team is actively working to resolve this issue and restore full service performance.
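The capacity shortfall described above comes down to simple arithmetic: the node pool cannot host the desired number of model replicas. A minimal sketch of that check follows; all numbers (node counts, GPUs per node, GPUs per replica) are illustrative assumptions, not actual figures from this incident.

```python
# Hypothetical capacity check. The figures below are illustrative
# assumptions, not values reported for this incident.
def max_replicas(nodes: int, gpus_per_node: int, gpus_per_replica: int) -> int:
    """Return how many model replicas a node pool can host,
    assuming each replica's GPUs must fit on a single node."""
    replicas_per_node = gpus_per_node // gpus_per_replica
    return nodes * replicas_per_node

# Example: a pool of 6 nodes with 8 GPUs each, serving a model that
# needs 4 GPUs per replica, tops out at 12 replicas. If autoscaling
# asks for more, the extra workers sit in a pending state.
print(max_replicas(nodes=6, gpus_per_node=8, gpus_per_replica=4))  # prints 12
```

If the desired replica count exceeds this ceiling, the surplus pods stay pending, which matches the Ray dashboard observation of multiple workers waiting on capacity.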
Investigating: Tue, Apr 7, 2026, 12:49 PM
Serverless inference for alibaba-qwen3-32b (Qwen 3 32B) in tor1 has been experiencing high error rates since 10:46 UTC.
📊 TECHNICAL DETAILS
Internal ID: 0e306cf3-7a4e-42af-a408-78ce1639a50b
External ID: bx60kdvsvtvb
🕐 Started: Tue, Apr 7, 2026, 12:49 PM
🔄 Last Updated: Tue, Apr 7, 2026, 03:55 PM
✓ Resolved: Tue, Apr 7, 2026, 03:55 PM
Impact Level: Minor
Total Duration: 3h 5m
Status Updates: 2
Affected Components: Gradient AI