major • ✓ Resolved

Slow API requests

Fly.io • Fri, Feb 27, 2026, 06:50 PM

Status

resolved

Duration

1h 34m

INCIDENT TIMELINE

✓ Resolved • Fri, Feb 27, 2026, 08:25 PM

This incident has been resolved.

monitoring • Fri, Feb 27, 2026, 08:05 PM

API and platform operations have normalized. We are continuing to monitor to ensure full and stable recovery. 

Background jobs are almost fully caught up. Users may still see slightly slower requests creating new apps / orgs, but they should complete successfully.

Sprite and MPG cluster creations are processing as normal.

identified • Fri, Feb 27, 2026, 07:41 PM

A second fix has been deployed and database load has returned to normal, so API response times are beginning to normalize. Most Machines API requests should succeed as normal, and deploys to existing apps should also work.

We are working through a backlog of background jobs. New app / organization creations and other operations that depend on these jobs will continue to see increased latency or failures while we work through the backlog. New MPG cluster and new Sprite creations continue to be impacted.

identified • Fri, Feb 27, 2026, 07:23 PM

An initial fix has been deployed and we are seeing improvements in load and API performance. Some operations that rely on the GraphQL API, such as new app creations and some deployments, will continue to fail at this time.

We are continuing to work on restoring full availability.

identified • Fri, Feb 27, 2026, 07:05 PM

We are currently seeing full API failures for requests to our GraphQL API and elevated failures for the Machines API. Direct calls to these APIs may fail, along with many flyctl commands.

We have identified the cause of the issue and are continuing to work on a fix. 

Existing running machines and apps should continue to be reachable, but creates, deploys, or other features relying on platform API calls will fail at this time.

identified • Fri, Feb 27, 2026, 06:59 PM

New Sprite creations are also timing out or failing at this time. We are continuing to work on a fix for this issue.

identified • Fri, Feb 27, 2026, 06:53 PM

We are continuing to work on a fix for this issue.

identified • Fri, Feb 27, 2026, 06:52 PM

We have identified the cause of the increased latency and are working on a fix.

The most common errors we are seeing are timeouts when users attempt to perform an action against a newly created app / machine resource. Those requests may time out or fail with an `app|machine not found` error.
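For clients that act on a resource immediately after creating it, one common way to tolerate this kind of transient timeout or "not found" response is to retry with exponential backoff and jitter. The following is a minimal sketch, not Fly.io's recommended client logic: `TransientError`, `retry_with_backoff`, and all parameter defaults are hypothetical names and values chosen for illustration, and the caller is assumed to translate its own timeout and not-found failures into `TransientError`.

```python
import random
import time


class TransientError(Exception):
    """Hypothetical marker for a timeout or a 'not found' on a just-created resource."""


def retry_with_backoff(op, attempts=5, base_delay=0.5, max_delay=8.0, sleep=time.sleep):
    """Call op() until it succeeds or attempts run out, backing off exponentially.

    All defaults are illustrative assumptions, not values from the incident report.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return op()
        except TransientError:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error to the caller
            # Full jitter: wait a random time up to the capped exponential delay,
            # so many clients retrying at once do not hit the API in lockstep.
            cap = min(max_delay, base_delay * 2 ** attempt)
            sleep(random.uniform(0, cap))
```

A caller would wrap the follow-up request in a function that raises `TransientError` on a timeout or not-found response and pass it as `op`. Any other exception propagates immediately, so permanent failures are not retried.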

investigating • Fri, Feb 27, 2026, 06:50 PM

We are investigating an increase in API request latency and timeouts with the main platform API.
This is impacting multiple operations, including creating, querying, or performing actions against machines, as well as platform-level operations like adding payment methods.

📊 TECHNICAL DETAILS

Internal ID

4d7fcb45-2c1c-4147-b4ac-288b315d2517

External ID

8ncr1mvjcj5h

🕐 Started

Fri, Feb 27, 2026, 06:50 PM

🔄 Last Updated

Fri, Feb 27, 2026, 08:25 PM

✓ Resolved

Fri, Feb 27, 2026, 08:25 PM

Impact Level

MAJOR

Total Duration

1h 34m

Affected Components

Dashboard, Machines API, Deployments, Remote Builds, Sprites
