The API returned 500 errors for a small set of locations starting this afternoon around 4:05 PM ET. Our monitoring did not detect these issues, but we were alerted to them via our support email at 4:21 PM ET. We fixed the bug and had a fix deployed across all of our systems by 4:29 PM ET.
The root cause of these errors was twofold:
We found and fixed a number of bugs that were causing, in rare cases, artifacts in the data we returned. In fixing these bugs, we exposed some brittle legacy code that only triggered in certain circumstances, and which now caused a runtime error when executed. We have fixed the brittle checks in that legacy code, and will be replacing it entirely with a modern version in the near future.
Runtime errors of this variety have historically been infrequent for us, and so our monitoring and alerting systems for them have unfortunately fallen by the wayside and weren’t tested as carefully as they should have been. We will be updating our monitoring systems to catch these kinds of errors within the next couple days so that we will be notified as soon as they occur.
Thanks for bearing with us as we worked through these issues!