Application Loading Issue
Incident Report for Close
Postmortem

Close sincerely apologizes for the interruption of our service. We take the stability of our platform very seriously. Below is an explanation of what happened and how we will prevent another such interruption from occurring. 

Impact

API endpoints and application areas backed by Search infrastructure (leads, contacts, and opportunities list pages, and certain reporting pages) were inaccessible for 20 minutes, from 10:20 to 10:40 UTC.

Root Cause and Resolution

As part of an ongoing effort to increase the security of the Close application, we shipped a change to how authentication is handled between the application and internal search clusters. However, a certain kind of authentication exchange wasn’t handled correctly, which caused most requests to Search infrastructure to fail in production.

We have reverted the change. We will improve our testing setup to catch issues like this during development, and our deployment and monitoring systems to catch production issues earlier.

Timeline

  • 10:20 UTC - An engineer rolls out an incorrect change to how authentication is handled between the application and internal search clusters.
  • 10:21 UTC - Monitoring systems alert the on-call engineer.
  • 10:33 UTC - The change is reverted.
  • 10:40 UTC - Application functionality fully recovers.
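
For readers who want the intervals behind the timeline, the following is a minimal sketch that derives them from the four timestamps above. The labels (`rollout`, `alert`, `revert`, `recovery`) and the function names are hypothetical, chosen only for illustration; they are not names from Close's tooling.

```python
from datetime import datetime

# Timestamps copied from the timeline above (all UTC, same day).
# The keys are illustrative labels, not terms from Close's systems.
TIMELINE = {
    "rollout": "10:20",
    "alert": "10:21",
    "revert": "10:33",
    "recovery": "10:40",
}


def minutes_between(start: str, end: str) -> int:
    """Return whole minutes from start to end, both given as HH:MM."""
    fmt = "%H:%M"
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    return int(delta.total_seconds() // 60)


def incident_intervals(timeline: dict) -> dict:
    """Compute the gaps between consecutive timeline events, in minutes."""
    return {
        "rollout_to_alert": minutes_between(timeline["rollout"], timeline["alert"]),
        "alert_to_revert": minutes_between(timeline["alert"], timeline["revert"]),
        "revert_to_recovery": minutes_between(timeline["revert"], timeline["recovery"]),
        "total_impact": minutes_between(timeline["rollout"], timeline["recovery"]),
    }


if __name__ == "__main__":
    for name, minutes in incident_intervals(TIMELINE).items():
        print(f"{name}: {minutes} min")
```

Under these assumptions the total comes to 20 minutes, matching the impact stated above, with most of that time falling between the alert and the revert.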
Posted May 24, 2024 - 05:08 PDT

Resolved
API endpoints and application areas backed by Search infrastructure (leads, contacts, and opportunities list pages, and certain reporting pages) were inaccessible for 20 minutes.

This has been resolved.
Posted May 24, 2024 - 04:13 PDT
This incident affected: Application UI and API.