You’re reading Signal v. Noise, a publication about the web by Basecamp since 1999.

Noah

About Noah

Noah Lorang is the data analyst for Basecamp. He writes about instrumentation, business intelligence, A/B testing, and more.

Behind the Scenes: Twitter, Part 1

Noah
Noah wrote this · 7 comments

This is the first in a three-part series looking at how we manage Twitter as a support channel. In parts 2 and 3, I’ll discuss some of the finer points of how we sort through hundreds of tweets each day to get people answers quickly.

Since the launch of the new Basecamp back in March, we’ve been encouraging the use of Twitter as a support channel. On our help form, we point people with simple questions to Twitter rather than email, and we monitor mentions of 37signals throughout the day. We’ve always received and answered support requests via Twitter, but it’s only this year that we’ve actively encouraged and focused on it.
Our Twitter presence has grown substantially: in October of this year, 37signals was mentioned an average of 443 times every weekday, roughly double what it was in October 2011. Not all of these need an immediate reply from our support team – many are people sharing links or things that they found interesting. The 60 or so replies we do send a day in response to immediate support requests represent a little less than 10% of our total support “interactions”.
Part of my job is improving the speed and quality of the responses we provide to customers, and part of that involves advising the support team on the best tools and processes for the job. As far as Twitter goes, the biggest pain point is the actual tool used to monitor and send tweets.

The search for a Twitter tool

Since we got serious about Twitter, we’ve mostly used the built-in Twitter functionality that our support tool (Desk.com) provides. When I asked the team how it was working for them a couple of months ago, the general reaction was tepid. The consensus was that while it got the job done, it was rather slow to use, and the large number of retweets and links to SvN posts mixed in made it hard to get prompt answers to people with urgent questions. Most of the team was using it, but no one was happy about it.
What did we want in a tool?

Continued…

How I came to love big data (or at least acknowledge its existence)

Noah
Noah wrote this · 10 comments

“Big data” is all the rage these days – there are conferences, journals, and a million consultants. Until a few weeks ago, I mocked the term mercilessly. I don’t mock it anymore.

Not a “big” data problem

Facebook has a big data problem. Google has a big data problem. Even MySpace probably has a big data problem. Most businesses, including 37signals, don’t.
I would guess that among our “peer group” (SaaS businesses), we handle more data than most, but our volume of data is still relatively small: we generate around a terabyte of assorted log data (Rails, Nginx, HAProxy, etc.) every day, and a few gigabytes of higher “density” usage and performance data. I’m strictly talking about non-application data here – not the core data that our apps use, but all of the tangential data that’s helpful for improving the performance and functionality of our products. We’ve only really started using this data in the last couple of years, but it’s invaluable for informing design decisions, finding performance hot spots, and otherwise improving our applications.
The typical analytical workload with this data is a few gigabytes or tens of gigabytes – sometimes small enough to fit in RAM, sometimes not, but generally within the realm of possibility with tools like MySQL and R. There are some predictable workloads to optimize for (add indexes for data stored in MySQL, instrument in order to work with more condensed data, etc.), but most aren’t things you can plan for particularly well in advance. Querying this data can be slow, but it’s all offline, non-customer-facing work, so latency isn’t hugely important.
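To give a flavor of what one of these workloads looks like, here’s a hypothetical example – the database, table, and column names are invented for illustration, not our actual schema:

# A typical offline "medium data" question: which Rails actions were slowest
# yesterday? A slow query is fine here - nothing customer-facing waits on this.
require "mysql2"

client = Mysql2::Client.new(host: "127.0.0.1", username: "analyst", database: "performance")

# One of the few predictable optimizations: we almost always slice by time,
# so an index on the timestamp column pays for itself quickly.
# client.query("CREATE INDEX index_requests_on_logged_at ON requests (logged_at)")

rollup = client.query(<<-SQL)
  SELECT action, COUNT(*) AS requests, AVG(duration_ms) AS avg_ms
  FROM requests
  WHERE logged_at >= CURDATE() - INTERVAL 1 DAY
  GROUP BY action
  ORDER BY avg_ms DESC
  LIMIT 20
SQL

rollup.each do |row|
  puts "#{row['action']}: #{row['avg_ms'].to_f.round(1)}ms average over #{row['requests']} requests"
end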
None of this is an insurmountable problem, and it’s all pretty typical of “medium” data – enough data you have to think about the best way to manage and analyze it, but not “big” data like Facebook or Google.

Technology changes everything

Continued…

The business intelligence scorecard

Noah
Noah wrote this · 9 comments

One way I like to think about the different aspects of “business intelligence” is as an organizational scorecard. Keeping this mental model of what you’re doing and why helps when prioritizing investments of time or money.

On this scorecard, the rows represent analytical competencies of growing sophistication from top to bottom. I classify these competencies as:

  1. Instrumentation / Warehousing – can you measure things, and can you store that data in a clean, retrievable format?
  2. Reporting – can you get the data out of your warehouse and into the hands of people who can use it?
  3. Analytics – can you add value to raw data with analytics, benchmarks, etc.?
  4. Strategic Impact – do the results of your data and analysis impact the direction of the organization in a meaningful, accretive way?

The columns represent different functional areas of relevance to your organization. For our purposes, I use ‘Application Health/Ops’, ‘Support’, ‘Financial’, ‘Marketing’, ‘Retention’, and ‘Product Usage’. This taxonomy isn’t completely clean, and there’s some overlap, but they’re roughly distinct areas.

When you draw this grid out, you end up with a four-by-six grid: one row for each competency, one column for each functional area.

I’ve ordered the columns in what I generally think of as increasing long-term strategic importance. Every column here is critically important, but our long-term success comes from people getting value from using our products, so I put that at the far right. You could make an argument for ordering them differently, but the general idea is the same.

My aspiration is always to spend most of my time and energy in the bottom right few boxes—doing analytics and having impact on things like retention and usage.

The reality is that in order for those to matter at all, you have to have rock solid instrumentation and reporting across the board, and some of the functional areas on the left side of the chart are more pressing – if your applications are falling over and you don’t know why, or your team is buried under thousands and thousands of support tickets, all the wonderful analytics in the world on usage probably won’t keep your company heading in the right direction.

Take a minute and give your organization a letter grade in each of these boxes. Think about what you would have given yourself in each box a year or two ago, and where you’d like to be a year or two from now. Have you made progress? Do you still have work to do?

Picking the right analysis to solve the real problem

Noah
Noah wrote this · 21 comments

My job is to gather, study, and understand data and its implications, and then make recommendations to help the business improve – in short, to deliver business value from data.

One of the things you learn when you work in analytics is that there’s an endless depth to virtually any problem – you can keep digging deeper and deeper forever. One of the most valuable skills you can learn is deciphering what’s needed to solve the real problem – when has the bulk of the business value been delivered, and when are you doing things that are intellectually interesting but not actually valuable?

I’ve found that I end up performing analyses in one of four different levels of detail:

  1. The quick ‘n dirty: These are short and simple – for example, a designer wants to know the distribution of the number of posts on a project because they’re designing a new screen, or David or Jason wants to know how our support ticket response time is trending. These are some mix of data retrieval and analysis, but the results don’t need a lot of explanation or interpretation. Most of the time, the results are communicated via IM or Campfire, and I end up spending between 30 seconds and 30 minutes (there’s a rough sketch of what one of these looks like after this list).

  2. The basic look: The most common analysis I do is a moderate depth one – something like a look at conversion rates and retention by traffic source, or a basic overview of how people are using a specific feature in the new Basecamp compared to how they used a similar feature in Basecamp Classic. The results here are more involved and need some interpretation or “color commentary”, and may come with specific recommendations. This sort of analysis gets written up in a post on one of our Basecamp projects, and usually takes somewhere between a couple hours and a day.

  3. The deep dive: When it comes to understanding root causes and developing significant recommendations, a more in-depth analysis is called for. For things like understanding why people cancel or what drives support cases, the bulk of the work tends to be analysis, interpretation, and then actionable recommendations to address those causes. Frequently, there’s some instrumentation or reporting project that spins off from this as well – I may add a report to our dashboard on the topic so we can more easily track it over time. These analyses usually get written up in a longer document with significantly more detail, and sometimes come with a live or recorded video explanation and discussion as well. This sort of analysis usually takes between 1 and 3 weeks.

  4. The boiled ocean: If you want to understand a substantive issue from every single possible angle, try every statistical technique in the book, and write a report with every possible visualization, then you’re probably looking at investing multiple months in a problem. We haven’t done anything like this in the 18 months I’ve been here at 37signals, and that’s by design: in most cases, this type of analysis ends up providing essentially the same business value as a deep dive that takes a fraction of the time.
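To make the “quick ‘n dirty” level concrete, here’s roughly what one of those looks like. The table and column names are made up for illustration – in practice this is a few minutes in a console, not a polished script:

# Hypothetical quick 'n dirty query: how many messages does a typical project
# have? Bucket projects by order of magnitude and paste the answer into Campfire.
require "mysql2"

client = Mysql2::Client.new(host: "127.0.0.1", username: "analyst", database: "warehouse")

counts = client.query(<<-SQL).map { |row| row["messages"] }
  SELECT project_id, COUNT(*) AS messages
  FROM messages
  GROUP BY project_id
SQL

buckets = counts.group_by { |n| 10 ** Math.log10(n).floor }

buckets.sort.each do |bucket, projects|
  puts "#{bucket}-#{bucket * 10 - 1} messages: #{projects.size} projects"
end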

Next time you’re faced with an analytical problem, ask yourself what the real underlying problem is that you’re trying to solve, and figure out what depth of analysis is required to deliver the bulk of the business value; after all, your job is probably really about improving the business.

A/B Testing: It's not about the results, and it's definitely not about the why

Noah
Noah wrote this · 8 comments

In college, I worked for a couple of years in a lab that tested the effectiveness of surgical treatments for ACL rupture using industrial robotics. Sometimes, the reconstructions didn’t hold. The surgeons involved were sometimes frustrated; it can be hard to look at data showing that something you did didn’t work. But for the scientists and engineers, all that mattered was that we’d followed our testing protocol and gathered some new data. I came to learn that this attitude is exactly what it takes to be a successful scientist over the long term and not merely a one-hit wonder.

Occasionally, when we’re running an A/B test someone will ask me what I call “success” for a given test. My answer is perhaps a bit surprising to some:

  • I don’t judge a test based on what feedback we might have gotten about it.
  • I don’t judge a test based on what we think we might have learned about why a given variation performed the way it did.
  • I don’t judge a test based on the improvement in conversion or any other quantitative measure.

I only judge a test based on whether we designed and administered it properly.

As an industry, we don’t yet have a complete analytical model of how people make decisions, so we can’t know in advance what variations will work. This means that there’s no shame in running variations that don’t improve conversion. We also lack any real ability to understand why a variation may have succeeded, so I don’t care much whether or not we understood the results at a deeper level.

The only thing we can fully control is how we set up the experiment, and so I judge a test based on criteria like:

  • Did we have clear segmentation of visitors into distinct variations?
  • Did we have clear, measurable, quantitative outcomes linked to those segments?
  • Did we determine our sample size using appropriate standards before we started running the test, and then run the test as planned, without succumbing to a testing tool’s biased measure of significance? (There’s a rough sample-size sketch after this list.)
  • Can we run the test again and reproduce the results? Did we?
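As a concrete example of the sample-size criterion, here’s a rough back-of-the-envelope sketch – my own illustration using the standard two-proportion formula at 95% confidence and 80% power, not any particular tool we use:

# Visitors needed in each variation to detect a given lift in a conversion rate.
def sample_size_per_variation(baseline_rate, detectable_rate)
  z_alpha = 1.96   # two-sided, 95% confidence
  z_beta  = 0.84   # 80% power
  p1, p2  = baseline_rate, detectable_rate
  p_bar   = (p1 + p2) / 2.0

  numerator = z_alpha * Math.sqrt(2 * p_bar * (1 - p_bar)) +
              z_beta  * Math.sqrt(p1 * (1 - p1) + p2 * (1 - p2))
  (numerator ** 2 / (p1 - p2) ** 2).ceil
end

# Detecting a lift from a 3.0% to a 3.5% signup rate (hypothetical numbers)
# takes roughly 20,000 visitors in each variation before you stop the test.
puts sample_size_per_variation(0.030, 0.035)

Deciding that number up front – and not stopping early because a dashboard flashes “significant” – is most of what that criterion is asking.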

This might sound a lot like the way a chemist evaluates an experiment about a new drug, and that’s not by accident. The way I look at running an A/B test is much the same as I did when I was working in that lab: if you run well-designed, carefully implemented experiments, the rest will take care of itself eventually.

You might hit paydirt this time, or it might take 100 more tests, but all that matters is that you keep trying carefully. I evaluate the success of our A/B testing regimen as a whole based on whether it improves our overall performance, not on individual tests; individual tests are just one step along what we know will be a much longer road.

Mini tech note: MySQL query comments in Rails

Noah
Noah wrote this · 13 comments

One of the things we’ve added to our applications in the last few months is a little gem that (among other things) adds a comment to each MySQL query that is generated by one of our applications.

Now, when we look at our Rails or slow query logs, our MySQL queries include the application, controller, and action that generated them:

Account Load (0.3ms)  SELECT `accounts`.* FROM `accounts` 
WHERE `accounts`.`queenbee_id` = 1234567890 
LIMIT 1 
/*application:BCX,controller:project_imports,action:show*/

When we’re trying to improve a slow query, or identify a customer problem, we never have to go digging to understand where the query came from—it’s just right there. This comes in handy in development, support, and operations – we used it during a pre-launch review of unindexed queries in the brand new Basecamp which launched a couple months ago. If you combine this with something like pt-query-digest, you end up with a powerful understanding of how each Rails action interacts with MySQL.
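If you’re curious how that works, here’s a stripped-down sketch of the general idea. This isn’t marginalia’s actual implementation, and the module and filter names are invented for illustration: stash the current controller and action for the request, and append them as a comment whenever the adapter executes a query.

# Illustrative only: a thread-local holds the current request's context, and a
# patch on the MySQL adapter appends it to every statement as a SQL comment.
module QueryAnnotations
  def self.comment
    Thread.current[:query_comment]
  end

  def self.comment=(value)
    Thread.current[:query_comment] = value
  end

  module AdapterPatch
    def execute(sql, *args)
      comment = QueryAnnotations.comment
      sql = "#{sql} /*#{comment}*/" if comment
      super(sql, *args)
    end
  end
end

# In an initializer (assuming the mysql2 adapter and a Ruby with Module#prepend):
# ActiveRecord::ConnectionAdapters::Mysql2Adapter.prepend(QueryAnnotations::AdapterPatch)

# In ApplicationController, set and clear the context around each action:
# around_filter do |controller, action|
#   QueryAnnotations.comment =
#     "application:BCX,controller:#{controller.controller_name},action:#{controller.action_name}"
#   begin
#     action.call
#   ensure
#     QueryAnnotations.comment = nil
#   end
# end

The gem takes care of the details (and the differences between Rails versions); the sketch just shows the shape of the idea.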

It’s easy to add these comments to your Rails application in a relatively unobtrusive way. We’ve released our approach, which works in both Rails 2.3.x and Rails 3.x apps, as a gem: marginalia.

marginalia (mar-gi-na-lia, pl. noun) — marginal notes or embellishments
Merriam-Webster

We’ve been using this in production on all of our apps now since December, ranging from Rails 2.3.5 to Rails master and Ruby 1.8.7 to 1.9.3. You should be able to have it running in your application in a matter of minutes.

It’s worth acknowledging that anytime you modify the internals of something outside your direct control there are risks, and that every function call adds some overhead. In our testing, the benefits have been well worth those tradeoffs, but I encourage you to weigh the tradeoff for yourself every time you instrument or log something. Your tradeoffs may well be different, and you should absolutely test on your own application.

Have a suggested improvement to our sample code or another way to do this? We’d love to hear it.

Thanks to Taylor for the original idea, and to Nick for helping to extract it into its own gem.

How's the new Basecamp doing so far?

Noah
Noah wrote this · 15 comments

About three weeks ago we launched the all new Basecamp, and it’s been an exciting few weeks.

Since I’m a numbers kind of guy, I wanted to share some things I’ve seen in looking at the new Basecamp that are particularly exciting:

  • This has been our strongest product launch ever. The new Basecamp is our fifth “big” product launch, and it’s our strongest yet in terms of signups in the period immediately after launch. With two weeks in the books, we had more than three times as many signups as we had in the same period after our last brand new product launch (Highrise, back in 2007). If you go all the way back to Basecamp’s original launch in 2004, signups for the new Basecamp were more than 30 times higher.

  • We’ve brought in lots of new customers. About a third of new Basecamp accounts immediately after launch were from people who migrated their existing account, and about half were from people who already held some sort of 37signals account. While we’re thrilled to see so many of our loyal customers enjoying the new Basecamp, we’re even more excited to see so many new people trying Basecamp for the first time.

  • Usage is fantastic. On a per account basis, new Basecamp accounts are creating twice as many projects and todo items as on Basecamp Classic, as well as more attachments, messages, comments, calendar events, and more.

  • We have a great new marketing site. Jamie, Mig, and Jason F. really knocked it out of the park with our new public site at basecamp.com. We’ve sustained substantially higher traffic levels from all kinds of sources more than two weeks after launch, and conversion rate is up 76%. We’re always testing new ideas here, but the early results are promising.

  • Basecamp Classic continues to perform well. Plenty of existing customers continue to use Basecamp Classic. Retention rates haven’t dipped, and usage levels are right where they were before we launched the new Basecamp. This is great news – our strategy of maintaining two separate Basecamps (Classic and new) seems to be working so far with no ill effects.

We’re excited and encouraged by the first few weeks of the all new Basecamp. We have lots of great improvements planned for it in the coming weeks and months – we’re hard at work on a few already.

If you don’t have an account, get started with a 45-day free trial now, or join us for a free introductory class about the new Basecamp.

What being on the front page of Hacker News does for our bottom line

Noah
Noah wrote this · 32 comments

There’s been some speculation that we significantly increased the number of posts here on SvN in the build-up to the launch of the new Basecamp, and in particular that we targeted the front page of Hacker News with those articles. Some people aren’t happy about this.

I’d like to bring a little context and fact to bear on this to put these speculations to rest.

In the month before the launch of the new Basecamp, we published 25 posts here on Signal vs. Noise. For comparison, during the same period in prior years, we published (before 2007 we used a different blogging engine, so I don’t have those numbers handy):

  • 29 posts in 2011
  • 50 posts in 2010
  • 36 posts in 2009
  • 49 posts in 2008
  • 42 posts in 2007

Relatively speaking, this was actually a pretty low level of posting activity for us. In each of those prior years, we were also maintaining a separate product blog during that period, and its posts aren’t included in these totals.

During that period, there were 24,826 first-time visitors to any of our sites whom we could identify as having first gotten to us via Hacker News (in all, we received more like 105,000 unique visitors from Hacker News, but many of those were repeat visitors). 97 of those visitors signed up, with more than 85% of them choosing the free plan. That conversion rate pales in comparison to our average conversion rate, particularly for non-search-engine traffic.

When all is said and done, what’s our likely financial outcome from Hacker News visitors for those 25 posts? About $300 total per month.

We typically write on SvN because we have an announcement to make, or because we have something we’re thinking about that we’d like to share.

Do we benefit from other people noticing our blog posts and linking them up from their blogs or other outlets? Absolutely – we’ve been talking about the power of word-of-mouth marketing for almost a decade.

As a writer, do I like it when more people read what I’ve written? Sure.

Is there any business value for us in getting on the front page of Hacker News? Not really.

Upvote us, downvote us, ignore us – I don’t care, but I hope you’ll make that decision based on the merits of the content of a given post, not because you think we’re trying to manipulate the front page of Hacker News for our gain.

Pssst... your Rails application has a secret to tell you

Noah
Noah wrote this · 27 comments

What would you say if I told you that, with just a few hours of work, you could get more precise, actionable, and useful information about how your Rails application is performing than any third-party service or log-parsing tool provides?

For years, we’ve used third-party tools like New Relic in all of our apps, and while we still use some of those tools today, we found ourselves wanting more – more information about the distribution of timings, more control over what’s being measured, a more intuitive user interface, and more real-time access to data when something’s going wrong.

Fortunately, there are simple, minimally-invasive options that are available virtually for “free” in Rails. If you’ve ever looked through Rails log files, you’ve probably seen lines like:

Feb  7 11:27:49 bc-06 basecamp[16760]: [projects]   Person Load (0.5ms)  SELECT `people`.* FROM `people` WHERE `people`.`id` = ? LIMIT 1
Feb  7 11:27:49 bc-06 basecamp[16760]: [projects] Rendered events/_post.rhtml (0.4ms)
Feb  7 11:27:50 bc-06 basecamp[16760]: [projects] Rendered project/index.erb within layouts/in_global (447.2ms)
Feb  7 11:27:50 bc-06 basecamp[16760]: [projects] Completed 200 OK in 529ms (Views: 421.7ms | ActiveRecord: 58.0ms)

You could try to parse these log files, or you could tap into Rails’ internals to extract just the numbers, but both of those approaches are somewhat difficult and leave a lot of room for things to go wrong. Fortunately, in Rails 3, you can get all this information and more, in whatever form you want, with just a few lines of code.
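To give a taste of what that looks like, here’s a minimal sketch; where you send the numbers – logs, StatsD, a database – is up to you:

# Rails 3+ publishes an event for every controller action; subscribing to it
# gives you the same numbers as the "Completed 200 OK ..." log line, as data.
ActiveSupport::Notifications.subscribe("process_action.action_controller") do |name, start, finish, id, payload|
  total_ms = (finish - start) * 1000
  Rails.logger.info(
    "timing controller=#{payload[:controller]} action=#{payload[:action]} " \
    "status=#{payload[:status]} total=#{total_ms.round(1)}ms " \
    "view=#{payload[:view_runtime].to_f.round(1)}ms db=#{payload[:db_runtime].to_f.round(1)}ms"
  )
end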

All the details you could want to know, after the jump…

Continued…

No framework needed

Noah
Noah wrote this · 21 comments

It goes without saying that we use Rails a lot here at 37signals. Oftentimes, when we look at a problem, we turn to Rails or something similar, because when you have a high-performance precision screwdriver, everything starts to look like a finely engineered screw. Sometimes, what you really need is a big hammer, because what you’re looking at is a nail.

Our public sites – sites like 37signals.com and basecamphq.com – are a perfect example of this.

Let me tell you about our journey with these sites over the years, and how we’ve landed on a simple solution that boosted conversion rate by about 5%.

Good enough

There’s nothing particularly dynamic about these sites; we might throw a “Happy Monday” in there, or make some tweaks based on a URL parameter, and we A/B test them extensively, but there are no databases or background services involved.

Stretching back to the pre-Basecamp days, the 37signals.com site was written in PHP. There was no Rails back then, Ruby wasn’t commonly used for web development, and DHH and others worked in PHP, so it was the logical choice. As we added sites, they continued to use PHP since it was fast and easy. This worked well for years and years—our public sites were relatively performant and rock-stable, and we didn’t really have many problems. The biggest pain was local development, which was hard to set up on OS X in a way that played nicely with Pow, Passenger, etc.

Getting better

A few years ago, Sam Stephenson and Josh Peek wrote Brochure as a way to translate our marketing sites to Rack apps. This solved the local development challenges, and let us use a language we were all generally more comfortable with. It was a little slower than PHP, and meant dealing with Passenger on deployment, but it was a fair compromise at the time. We moved one site to Brochure, and then ran out of steam to move the rest – work on our applications took a higher priority.

A few months ago I took a serious look at our public sites’ performance. They were making a lot of requests for individual assets and page load times were pretty poor – Basecamp itself loaded much faster than the essentially static signup page for it. Local setup problems with the PHP sites also meant that it was harder to work on the sites, and so we were less productive and less inclined to work on them.

Back to the basics for fun and profit

Our solution to this (in addition to spriting images and cleaning up unused styles and JavaScript) was to switch to totally static HTML pages. We’re using the stasis gem to compile .html.erb files locally and on deploy, along with Sprockets to pre-process and concatenate stylesheets and JavaScript. Our web server ends up serving plain old HTML plus a single CSS file and a single JavaScript file, with no interpretation.
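To make the compile-to-static idea concrete, here’s a minimal sketch using nothing but the standard library – stasis and Sprockets do the real work for us, and the directory names here are invented:

# Render every .html.erb template once, at deploy time, and write out plain
# HTML that the web server can serve with no interpretation at request time.
require "erb"
require "fileutils"

SOURCE = "site"     # where the .html.erb templates live
OUTPUT = "public"   # what the web server actually serves

Dir.glob(File.join(SOURCE, "**", "*.html.erb")).each do |template|
  html = ERB.new(File.read(template)).result(binding)   # "Happy Monday" gets baked in here
  destination = File.join(OUTPUT, template.sub("#{SOURCE}/", "").sub(/\.erb\z/, ""))
  FileUtils.mkdir_p(File.dirname(destination))
  File.write(destination, html)
end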

This makes local development easy, and what you see locally is always what will be deployed. This also makes it trivial to distribute the marketing site to multiple datacenters or distribution networks around the world—just upload the compiled files, rather than worrying about dependencies for running an interpreted site.

While we haven’t done that yet, just from some mild spriting and cleanup and the move to static HTML, we shaved about half a second off the total load time for basecamphq.com, and saw about a 5% improvement in conversion rate as a result (the link between page speed and conversion rate has also been studied more rigorously by the likes of Google, Amazon, etc.).