Thanks to all of our users for bearing with us this week through our downtime and database issues. We’ve worked nonstop the past couple of days to make sure the site is up, functional, and behaving as it did before the downtime. A very small number of users are still affected, and we’re working with each of them on a case-by-case basis to make sure they’re happy with their profiles. We thought our downtime could be a learning experience for everyone, so we wanted to let you in on what happened in the aftermath as well.
RECOVERING FROM DATA LOSS
After we had our database issue, we pieced together the data from last week using backups and with the help of some of our external service providers that had cached copies of submissions, posts, achievements, and more. Our entire engineering team worked shifts for 36 hours straight to pull data out of our backups and prepare a fixed version of our database for when the site came back online.
Wednesday evening at 6 or 7pm, after a little more than 5 hours of downtime, we decided not to bring the site up until we were confident we had restored the majority of users’ progress, points, and badges. We spent the next 11 or 12 hours working on that, with all hands on deck.
Thursday morning, we brought the site back with nearly all of the data restored. Most users were unaffected and saw no issues after the site came back up. Linda and Karen, our community management team, designed a form for people to report issues. Ryan, Codecademy’s cofounder and CTO, put up a blog post explaining the reasons behind the downtime. Lots of users reported no problems and offered to lend a hand. Thanks! It’s amazing to work with such a receptive and helpful community.
SWEATING THE SMALL STUFF
After bringing the site back online, we dedicated Thursday and Friday to working with our users to recover their data based on the feedback we got through email, Twitter, Facebook, and the form we designed. We know how important progress, points, and badges are to our users, so we have spent the past couple of days getting everything back to normal. Tonight, we’re running a final script and the site will be down for 15 minutes as we add in a small amount of data that should affect only a handful of users.
We know downtime is not good for our users and we are not proud of it. That said, it happens to everyone. Great companies we admire, like Etsy and Twitter, have had downtime issues in the past. What matters most is how a company handles itself afterwards.
Everyone who was affected by data loss will get a special Codecademy badge.
BUSINESS AS USUAL
Thanks for bearing with us after this outage. Going forward, we’ve put more internal policies in place to make sure this never happens again. We’ve stepped up our daily backup schedule and, in some cases, started backing data up more frequently. Our backups are redundant across several data sources. After this experience, we’ve practiced recovering from a range of circumstances, including data loss and downtime. We want you, our users and learners, to be confident that Codecademy is the best place to learn to code whenever you want to learn. We’re back to building new features and improving old ones. Check back next week for more announcements!