Back to overview
Degraded

Degraded availability (test runs, prompt playground)

Oct 04 at 06:54pm IST
Affected services
Dashboard
API Service
AI Models

Resolved
Oct 04 at 08:39pm IST

We are up now.

RCA

  • We use a third-party library to acquire distributed locks that expect specific LUA scripts to be cached. At 6:00 AM PT today, we realized that the Redis cache was burst due to disc corruption that led to the deletion of these scripts.
  • We learned that the lib does not reindex the scripts, so we had to update them manually - once updated system is working as expected

Updated
Oct 04 at 07:00pm IST

We have narrowed down onto the exact issue. Triggered a patch.

Created
Oct 04 at 06:54pm IST

We are facing an issue with one of the infra components. We are working on the fix