In the last 2.5 weeks, our little side project ispokemongoavailableyet.com served:
In this post, I’ll briefly go over the tech behind this site and the things we’ve done along the way to keep everything running smoothly.
We’ve created this Pokémon Go availability tracker with Rails 5.0. In essence, this app contains a controller to render the list of countries, a controller to handle the signups and a background job that uses the iTunes Search API for every country in the App Store territories list to check if the app is available and queue the notification emails when it is.
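The availability check at the heart of that background job can be sketched roughly as follows. This is a minimal plain-Ruby sketch, not our actual job: it assumes the iTunes lookup endpoint (`https://itunes.apple.com/lookup`) with its `id` and `country` parameters, and the helper names (`lookup_uri`, `available?`, `available_countries`) and the app ID you pass in are purely illustrative.

```ruby
require "json"
require "net/http"
require "uri"

# Assumed endpoint: the iTunes lookup API, queried once per storefront.
LOOKUP_ENDPOINT = "https://itunes.apple.com/lookup"

# Build the lookup URL for one app in one App Store territory.
def lookup_uri(app_id, country_code)
  uri = URI(LOOKUP_ENDPOINT)
  uri.query = URI.encode_www_form(id: app_id, country: country_code.downcase)
  uri
end

# The API answers with JSON containing a "resultCount" field;
# zero results means the app is not listed in that storefront.
def available?(response_body)
  JSON.parse(response_body).fetch("resultCount", 0).positive?
end

# Return the country codes where the app is currently available.
# `fetch` is injectable so the check can be exercised without network access.
def available_countries(app_id, country_codes, fetch: ->(uri) { Net::HTTP.get(uri) })
  country_codes.select { |code| available?(fetch.call(lookup_uri(app_id, code))) }
end
```

In a Rails app, something like `available_countries` would be called from a scheduled job, which then queues the notification emails for every country that has just become available.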
Our application runs on a VPS at TransIP. We're using a fairly powerful server (quad-core, 16GB of RAM), which we share with many other (staging/testing) Rails apps.
The first few days, everything was humming along just fine. Page loads took less than a second and we were handling ~100 concurrent visitors with ease.
On Monday July 11th, we decided to add our project to Product Hunt. This is where things took off. Our project was featured and got picked up all over the world. For example, a national paper in Peru wrote about it and CNN Chile dedicated a small post to our website, as did blogs in Austria, Malaysia, Taiwan and many other countries. There were even some YouTubers who decided to “vlog” about it.
Our 50-100 concurrent visitors quickly turned into 700+ concurrent visitors, and this is where our website started to get into trouble. Our single 5-thread worker wasn’t able to serve all requests anymore. People were getting 502 errors.
This was when we made two simple and quick changes to tackle this problem from two angles:
First, we added more workers. Simply changing one line and redeploying gave us three more workers, which multiplied the number of requests our website could handle by four!
Second, we cut the number of requests per page load. We were using a gem for rendering the flag images, which works well, but in our case we needed 155 flag images on a single page, which basically meant 155 requests to fetch the images. By swapping it for the sprite from another gem, we managed to reduce the number of image requests by (you guessed it) 154.
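For illustration, here is what such a one-line change could look like. This sketch assumes Puma as the application server (the Rails 5 default) and a `config/puma.rb` file; the exact file and values in our setup may differ.

```ruby
# config/puma.rb (illustrative sketch, assuming Puma)

# Number of worker processes: going from 1 to 4 adds three more workers.
workers 4

# Each worker runs a pool of 5 threads (min 5, max 5).
threads 5, 5

# Load the app before forking so the workers share memory where possible.
preload_app!
```

With four workers of five threads each, up to 20 requests can be processed at the same time instead of 5.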
After making these two changes, our website kept running smoothly, even when we hit the 1700+ concurrent visitors mark.
When the first batch of email notifications was supposed to be sent out, we hit a Postgres connection pool limit. With the default pool size being 5 and Sidekiq running on 25 threads by default, up to 25 threads were competing for 5 database connections, so it is easy to see how we exceeded the connection pool limit.
Luckily, this was only a matter of increasing the pool size in our database.yml and everything was connecting fine again!
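To make the mismatch concrete, here is a small sanity check in Ruby. It is only a sketch: the embedded `database.yml` fragment, the constant names and the helper `pool_large_enough?` are illustrative assumptions, and a real `database.yml` usually contains ERB and more settings.

```ruby
require "yaml"

# Illustrative database.yml fragment with the pool raised to match Sidekiq.
DATABASE_YML = <<~YAML
  production:
    adapter: postgresql
    pool: 25
YAML

# Sidekiq's default thread count, as mentioned above.
SIDEKIQ_CONCURRENCY = 25

# Every Sidekiq thread may need its own database connection,
# so the pool should be at least as large as the concurrency.
def pool_large_enough?(database_yml, environment, concurrency)
  # Fall back to 5, the default pool size, when no pool is configured.
  pool = YAML.safe_load(database_yml).fetch(environment).fetch("pool", 5)
  pool >= concurrency
end

puts pool_large_enough?(DATABASE_YML, "production", SIDEKIQ_CONCURRENCY)
```

Lowering Sidekiq's concurrency to match the pool would also work; raising the pool was simply the quicker fix for us.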
It was nice to be able to test our own infrastructure like this. At this point, we’re still serving around 100k page views per day, but the number is slowly declining, simply because there are more and more countries where PokemonGoIsAvailableYet.