We all make mistakes. Sometimes we forget to test something. Sometimes we didn’t think our users would enter ‘zero’ instead of 0. Or we forgot that the system could be in a state where @current_project is nil. Shit happens. Exceptions happen.
When something goes wrong, your app will dutifully write it to the logs and tell your customer, “Oops, something went wrong”. What it won’t do is tell you, the person who can actually fix the problem. So you’ll eventually get an angry email from your customer saying, “I can’t login” or “the site is broken” or something else sufficiently vague, and you’ll have to trawl through logs trying to figure out what the problem was.
An “Exception Tracker” listens for these problems (usually via a small library installed in your app) and notifies you when the shit hits the fan. That’s useful on its own, but the basic functionality goes further. It will also capture information about the page where the problem occurred, the HTTP parameters and the stack trace identifying the problem code. A good implementation will also collate similar exceptions, so if 100 users encounter the same problem, you’ll only get a notification on the first one.
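To make the collation idea concrete, here is a minimal sketch in plain Ruby. It is not how Rollbar works internally; `TinyTracker`, its `report` method and the fingerprint rule (exception class plus top stack frame) are all assumptions made up for illustration.

```ruby
require "digest"

# Hypothetical in-memory tracker: groups exceptions by a fingerprint and
# notifies only the first time a fingerprint is seen.
class TinyTracker
  attr_reader :groups, :notifications

  def initialize
    @groups = Hash.new { |hash, key| hash[key] = [] }
    @notifications = []
  end

  # Fingerprint = exception class + top stack frame. This is an assumption;
  # real trackers use more elaborate grouping rules.
  def fingerprint(error)
    top_frame = (error.backtrace || []).first.to_s
    Digest::SHA1.hexdigest("#{error.class.name}|#{top_frame}")
  end

  # Record one occurrence together with its request context.
  def report(error, url:, params: {})
    key = fingerprint(error)
    first_time = !@groups.key?(key)
    @groups[key] << { message: error.message, url: url, params: params,
                      backtrace: error.backtrace }
    @notifications << "New exception: #{error.class}: #{error.message}" if first_time
    key
  end
end

# Simulate 100 users hitting the same bug on the same line of code.
def load_project(current_project)
  current_project.fetch(:name)
end

tracker = TinyTracker.new
100.times do |i|
  begin
    load_project(nil)
  rescue NoMethodError => e
    tracker.report(e, url: "/projects/#{i}", params: { id: i })
  end
end

puts "#{tracker.groups.size} group, #{tracker.notifications.size} notification"
```

All 100 occurrences are stored with their URL and parameters, but because they share a fingerprint they land in one group and trigger a single notification.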
I’ve settled on Rollbar as my go-to exception tracker not just because it has the basic functionality but because I think it redefines the minimum functionality that all exception trackers need to reach.
Tracking exceptions across your stack is one thing but notifications are the other side of it. Rollbar integrates nicely with both Pivotal Tracker and Trello (and others) so you can automatically have it create development tasks when exceptions occur. You can also do this manually.
It also integrates with Slack and other communication channels.
For a long time, exception tracking has been considered solely a development tool. After all, it’s just the developers that need to know when problems occur, right?
Rollbar automatically captures who the current user is along with their id, name and email address. Knowing which users encountered a problem is so useful in many contexts. For the developer, it makes it easier to spot data integrity problems or problems specific to that user’s account. But the real gem is for customer support. Now that you know which users encountered a problem, you don’t have to wait for them to reach out to you; you can be proactive.
“Hi Dave, I’m really sorry about that problem you were having 20 minutes ago when viewing your project. I just wanted to let you know that we’ve squashed that little bug and you should be good to go.”
How confident does your customer feel now in your service? This ability is a must-have for today’s web applications.
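As a rough sketch of that proactive follow-up, the snippet below picks out each user who hit a given error exactly once. The `Occurrence` struct, the sample records and the `affected_users` helper are hypothetical; they only stand in for whatever per-user data your tracker exposes.

```ruby
# Hypothetical occurrence records, shaped like what a tracker might store.
Occurrence = Struct.new(:error, :user_id, :name, :email, keyword_init: true)

occurrences = [
  Occurrence.new(error: "NoMethodError", user_id: 7, name: "Dave", email: "dave@example.com"),
  Occurrence.new(error: "NoMethodError", user_id: 7, name: "Dave", email: "dave@example.com"),
  Occurrence.new(error: "NoMethodError", user_id: 9, name: "Priya", email: "priya@example.com"),
  Occurrence.new(error: "Timeout::Error", user_id: 3, name: "Sam", email: "sam@example.com"),
]

# Unique users who hit a given error, so support contacts each person once.
def affected_users(occurrences, error)
  occurrences.select { |o| o.error == error }.uniq(&:user_id)
end

affected_users(occurrences, "NoMethodError").each do |user|
  puts "To: #{user.email} - Hi #{user.name}, sorry about that problem; it's fixed now."
end
```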
Rollbar has a pretty good dashboard that lets you slice & dice your data to focus on just the production environment, or just those exceptions happening in the Rails app, or recent exceptions, and so on. You can comment on exceptions, resolve them etc. And like I mentioned above, you can also create tickets in your project management software. There are hooks with GitHub that will automatically resolve exceptions when the fixing commit is shipped. In short, I think there are some pretty great options for workflow in there.
Not every exception is a major problem. Sometimes people mistype a URL, or your new styling still references an old font. None of these are terribly important and they aren’t going to make the top of the priority fix-it list any time soon. Rollbar lets you mute exceptions, which is basically saying, “Yep, I know that keeps happening but I don’t need to be reminded.”
As a freelancer, it’s also important that I can join different projects, invite others in and generally control access between multiple projects and people. Rollbar handles this pretty well, and even has the concept of teams which have different access permissions.
Running ad-hoc queries on your exception data isn’t a top priority for most startups but it can become increasingly important.
As your largest customer is about to renew their annual contract, they mention something about the service having a few problems recently. Uh-oh! You should be able to arm yourself with data: how many problems occurred for that company; who was affected; that the majority of problems were caused by a bad data import; and that the frequency has dramatically reduced in the last 3 months. The difference between having that data and not having it is the difference between a satisfied customer renewing their contract and a suspicious one scuppering the deal.
Or you can head off complaints when you see that the majority of problems are caused by users on IE8, which isn’t a supported browser.
Rollbar has a feature called RQL that lets you query all your historical exception data using an SQL-like query language.
For example, find which pages this exception occurred on, group the occurrences by URL and sort by the worst affected pages:
select request.url, count(*) from item_occurrence where item.counter = <exception id> group by request.url order by count(*) DESC
Or, find exceptions within a date range that aren’t related to fonts:
select * from item_occurrence where timestamp > 1427068800 and timestamp < 1427155200 and request.url not like '%fonts%' order by timestamp ASC
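The two timestamps in that query are Unix epoch seconds, and they correspond to a one-day window starting at midnight UTC on 23 March 2015. If you’d rather not work those out by hand, here is a small Ruby sketch that derives them and assembles the same query string; the variable names are just illustrative.

```ruby
# Midnight UTC at the start and end of 23 March 2015, as epoch seconds.
range_start = Time.utc(2015, 3, 23).to_i
range_end   = Time.utc(2015, 3, 24).to_i

# Build the same RQL query as above with the computed bounds.
query = "select * from item_occurrence " \
        "where timestamp > #{range_start} and timestamp < #{range_end} " \
        "and request.url not like '%fonts%' order by timestamp ASC"

puts query
```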
Ad-hoc querying isn’t a feature you’ll buy an exception tracker for but it’s a feature that you’ll gradually find more and more uses for.
When an exception occurs, a frequent question to ask is: “what’s changed?” This is particularly important when working in a larger team with developers independently deploying changes throughout the day. It can be challenging to figure out which version of the code is actually in production.
Luckily, Rollbar has your back here with a nice clean deploy-tracking screen. It tells you who deployed what and when, and links to the full commit. You can even see a complete diff of changes.
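The underlying idea is simple enough to sketch: given a log of deploys, the version that was live when an exception occurred is the most recent deploy at or before that moment. The `Deploy` struct, the names, revisions and times below are all made up for illustration and are not Rollbar’s data model.

```ruby
# Hypothetical deploy log; names, revisions and times are invented.
Deploy = Struct.new(:who, :revision, :at, keyword_init: true)

deploys = [
  Deploy.new(who: "alice", revision: "a1b2c3d", at: Time.utc(2015, 3, 23, 9, 0)),
  Deploy.new(who: "bob",   revision: "e4f5a6b", at: Time.utc(2015, 3, 23, 13, 30)),
  Deploy.new(who: "carol", revision: "c7d8e9f", at: Time.utc(2015, 3, 23, 16, 45)),
]

# The deploy that was live at a given moment is the latest one at or before it.
# Returns nil if nothing had been deployed yet.
def live_deploy(deploys, moment)
  deploys.select { |d| d.at <= moment }.max_by(&:at)
end

exception_time = Time.utc(2015, 3, 23, 14, 5)
suspect = live_deploy(deploys, exception_time)
puts "Live at #{exception_time}: #{suspect.revision} by #{suspect.who}"
```

With several people deploying throughout the day, that one lookup answers “what’s changed?” by pointing you at the first commit worth reading.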
Sometimes it can be hard to convince young startups to pay for good tools (note: not my favourite clients) but Rollbar makes this easy: they offer a free plan for up to 5000 exception occurrences a month. Many startups will never generate enough traffic to hit this limit, so Rollbar is a really easy sell. And if you’re generating more than 5000 exceptions a month, then $12/mth is the least of your problems.
You should probably be using Rollbar. It’s my preferred option for both personal and client projects, and my required tool if I’ll be responsible for the ongoing support & maintenance.
I’m not going to be mad if you choose something else. However, when evaluating other exception trackers, I think you should be comparing them to Rollbar. Perhaps the other exception trackers have caught up, or offer some advantages I haven’t considered? I don’t think so but please look around and compare your options. I still think Rollbar comes out on top.
I’ll also give an honourable mention to AppSignal which combines error tracking and performance metrics. Their performance metrics were the first that gave me any actionable insights without too many headaches. That said, I’d still use Rollbar to track exceptions.