Application Loading Issue
Incident Report for Close
Postmortem

Close sincerely apologizes for the interruption of our service. We take the stability of our platform very seriously. Below is an explanation of what happened and how we will prevent another such interruption from occurring. 

Impact

Between approximately 1900 and 2000 UTC on September 12, 2024, the Close app and API were severely degraded due to a partial outage of our backend MongoDB database. The database was restored to normal operation without data loss, and the Close app and API returned to normal operation, by approximately 2000 UTC.

Root Cause and Resolution

At 1900 UTC on September 12, 2024, components of our backend MongoDB database in one of our data centers came under anomalous load and became unresponsive. This resulted in widespread, intermittent disruption to the Close app and API. Once the affected components were identified and restarted, performance of the Close app and API returned to normal.

We are in the process of deploying a new architecture for this part of our system that will be more resilient to this class of failure. In the meantime, we are deploying additional monitoring that will reduce the time required to identify and mitigate such issues.

Timeline

  • 1900 UTC: The Close app and API begin to experience elevated error rates
  • 1904 UTC: Close Engineering is alerted to the elevated error rate by automated monitoring
  • 1909 UTC: Close Engineering identifies a network issue affecting one of our data centers
  • 1945 UTC: Close Engineering identifies our backend MongoDB database as being critically impaired
  • 1954 UTC: Close Engineering begins operating on the affected database to restore normal operation
  • 2001 UTC: Close Engineering completes operations on the affected database
  • 2001 UTC: The Close app and API return to normal operation
Posted Sep 13, 2024 - 09:06 PDT

Resolved
This incident has been resolved, and Close is fully operational.
Posted Sep 12, 2024 - 14:01 PDT
Monitoring
The issue has been identified, and our team is currently monitoring performance.
Posted Sep 12, 2024 - 13:06 PDT
Investigating
We are currently seeing intermittent issues with Close loading. We will let you know as soon as this is fixed.
Posted Sep 12, 2024 - 12:29 PDT
This incident affected: Application UI.