Close sincerely apologizes for the interruption of our service. We take the stability of our platform very seriously. Below is an explanation of what happened and how we will prevent another such interruption from occurring.
Between 13:40 and 16:55 UTC on Wednesday, September 28, 2024, the Close App and API experienced degraded performance. Some users may have noticed the App UI and API responding sluggishly.
Concurrently, between 14:33 and 19:30 UTC, background task processing inside the Close app was disrupted. During this time, Workflows and Email sending may not have occurred on schedule.
At 13:14 UTC on Wednesday, September 28, 2024, Close Engineering deployed an updated version of our browser application. A bug in this new version caused a large increase in the number of impactful requests sent to our back end system. By 14:00 UTC, the number of additional requests had grown to the point that our back end database was overloaded, causing poor application performance.
Close Engineering was able to revert the change to our browser application by 14:51 UTC. While waiting for all of our clients’ browsers to update to the fixed version of our app, Close Engineering took several steps between 14:30 UTC and 17:00 UTC to reduce the load on our overloaded database.
The disruption during this time also degraded our ability to collect runtime metrics on our background task processing system. As a result, the background task processing system concluded that it was not under load and scaled down. Close Engineering fixed the issue with metrics gathering by 18:20 UTC, at which point background task processing returned to normal operation.
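The scale-down described above is a common failure mode when an autoscaler reads missing metrics as zero load. The following is a minimal illustrative sketch in Python, not Close's actual implementation: the names MetricSample and desired_workers, and all thresholds, are hypothetical. It shows one way a scaling decision could hold the current worker count when metrics are missing or stale, rather than treating the gap as an idle system.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class MetricSample:
    """Hypothetical queue-depth reading; queue_depth is None if collection failed."""
    queue_depth: Optional[int]
    age_seconds: float


def desired_workers(
    current: int,
    sample: MetricSample,
    tasks_per_worker: int = 100,   # assumed capacity of one worker
    max_staleness: float = 120.0,  # assumed limit before a reading is distrusted
    min_workers: int = 1,
    max_workers: int = 50,
) -> int:
    """Return the target worker count for the next scaling interval."""
    # Missing or stale metrics mean "load unknown", not "load is zero":
    # keep the current capacity instead of scaling down.
    if sample.queue_depth is None or sample.age_seconds > max_staleness:
        return current

    # Ceiling division: enough workers to cover the whole queue.
    needed = -(-sample.queue_depth // tasks_per_worker)

    # Clamp to the configured bounds.
    return max(min_workers, min(max_workers, needed))
```

With these assumed defaults, a failed collection (queue_depth of None) or a reading older than two minutes leaves the worker count unchanged, while a fresh reading of zero still allows a scale-down to the minimum.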
To prevent another incident like this from occurring, Close Engineering will audit our growing data stores for opportunities to better distribute load and prevent the database from becoming overloaded. We will also implement a training regimen for our incident responders to ensure more timely and consistent communication during future incidents.