Back to overview
Degraded
Degraded availability (test runs, prompt playground)
Oct 04 at 06:54pm IST
Affected services
Dashboard
API Service
AI Models
Resolved
Oct 04 at 08:39pm IST
We are up now.
RCA
- We use a third-party library to acquire distributed locks that expect specific LUA scripts to be cached. At 6:00 AM PT today, we realized that the Redis cache was burst due to disc corruption that led to the deletion of these scripts.
- We learned that the lib does not reindex the scripts, so we had to update them manually - once updated system is working as expected
Affected services
Dashboard
API Service
AI Models
Updated
Oct 04 at 07:00pm IST
We have narrowed down onto the exact issue. Triggered a patch.
Affected services
Dashboard
API Service
AI Models
Created
Oct 04 at 06:54pm IST
We are facing an issue with one of the infra components. We are working on the fix
Affected services
Dashboard
API Service
AI Models