You’re reading Signal v. Noise, a publication about the web by Basecamp since 1999.

Signal vs. Noise: Programming

Our Most Recent Posts on Programming

Drive development with budgets not estimates

David
David wrote this on 11 comments

Much of software development planning is done through estimates. You give me a description of the feature, I give you a best guess on how long it’s going to take. This model has been broken since the dawn of computer programming, yet we keep thinking it’s going to work. That’s one definition of insanity.

What I’ve found to be a more useful model is simply to state what something is worth. If a feature is worth 5 weeks of development, for example, that’s the budget. Such a budget might well be informed by an estimate of whether some version of that feature can be possibly built in 5 weeks, but it’s not driven by it.

Because most features have scales of implementation that are worlds apart. One version of the feature might take 2 weeks, another might take 6 months. It’s all in where you draw the line, how comprehensive you want to be, and what you’re going to do about all those inevitable edge cases.

The standard response to the estimation approach is to propose a 100% implementation that’s going to take 100% of the effort to build. Sometimes that’s what you need. Nothing less than having everything is going to be good enough. I find that’s a rare case.

A more common case is that you can get 80% of the feature for 20% of the effort. Which in turn means that you can get five 80% features, improvements, or fixes for the price of one 100% implementation. When you look at it like that, it’s often clear that you’d rather get more done, even if it isn’t as polished.

This is particularly true if you don’t have all the money and all the people in the world. When you’re trying to make progress on a constrained budget, you have to pinch your development pennies. If you splurge on gold-plating for every feature, there’s not going to be anything left over to actually ship the damn thing.

That’s what proposing a budget based on worth helps you with. It focuses the mind on what assumptions we can challenge or even ignore. If we only have 5 weeks to do something, it’s just not going to work to go through the swamp to get there. We have to find a well-paved road.

In the moment, though, it can be frustrating. If we just had a little more time, we could do so much better! So much better for whom? Your developer pride? Or the customer? Will the latter actually care about all the spit and grit you poured into these particular corners? Don’t be so sure.

In the end, accepting a budget is about accepting constraints. Here are the borders of scope for our wild dreams and crazy colors. Much of invention lies in the fight within those constraints. Embrace that.

Hybrid sweet spot: Native navigation, web content

David
David wrote this on 41 comments

When we launched the iPhone version of Basecamp in February of last year, it was after many rounds of experimentation on the architectural direction. The default route suggested by most when we got started back in early/mid-2012 was simple enough: Everything Be Native!

This was a sensible recommendation, given the experiences people had in years prior with anything else. Facebook famously folded their HTML5 implementation in favor of going all native to get the speed they craved with the launch of their new app in August of 2012.

Thus their decision was likely driven by what the state of the art in HTML and on mobile looked like circa 2010-2011. In early 2010, people were rocking either the iPhone 3GS or 3G. By modern 2014 standards, these phones are desperately slow. Hence, any architectural decisions based on the speed of those phones are horribly outdated.

Continued…

Dragons on the far side of the histogram

David
David wrote this on 9 comments

Performance tuning is a fun sport, but how you’re keeping score matters more than you think, if winning is to have real impact. When it comes to web applications, the first mistake is to start with what’s easiest to measure: server-side generation times.

In Rails, that’s the almighty X-Runtime header, reported to the 6th decimal of a second for that extra punch of authority. A clear target, easily measured, and in that safe realm of your own code to make it appear fully controllable and scientific. But what good is shaving off milliseconds for a 50ms internal target, if your shit (or non-existent!) CDNs are costing you seconds in New Zealand? Pounds, not pennies, is where the wealth is.

Yet that’s still the easy, level one, part of the answer: Don’t worry too much about your internal performance metrics until you’ve cared enough about the full stack of SSL termination overhead, CDN optimization, JS/CSS asset minimization, and client-side computational overhead. The latter easily catches out people following the “just do a server-side API” approach: the JSON may well generate in 50ms, but then the client-side computation takes a full second on the below-average device. Doh!

Level two, once reasonable efforts have been made to trim the fat around the X-Runtime itself, is getting some big numbers up on the board: Mean and the 90th percentile. Those really are great places to start. If your mean is an embarrassing 500ms+, well, then you have some serious, fundamental problems that need fixing, which will benefit everyone using your app. Get to it.

Keep going beyond even the 99th

Just don’t stop there. Neither at the mean nor the 90th. Don’t even stop at the 99th! At Basecamp, we sorta fell into that trap for a while. Our means were looking pretty at around 60ms, the 90th was 200ms, and even the 99th was a respectable 700ms. Victory, right?

Well, victory for the requests that fell into the 1st to 99th percentile. But when you process about fifty million requests a day, there’s still an awful lot of requests hidden on the far side of the 99th. And there, young ones, is where the dragons lie.

A while back we started shining the light into that cave. And even while I expected there to be dragons, I was still shocked at just how large and plentiful they were at our scale. Just 0.4% of requests took 1-2 seconds to resolve, but that’s still a shocking 200,000 requests when you’re doing those fifty million requests.

Yet it gets worse. Just 0.0025% of requests took 10-30 seconds, but that’s still a whopping 1,250 requests. While some of those come from API requests that users do not see directly, a fair slice is indeed from real, impatient human beings. That’s just embarrassing! And a far, far away land from that pretty picture painted by the 60ms mean. Ugh.

Finally, there was the true elite: The 0.0001%, for a total of 50 instances. Those guys sat and waited between 30 and 60 seconds for their merry request to complete. Triple ugh.
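To make that arithmetic concrete, here is a minimal Ruby sketch that turns the percentages quoted above into daily request counts. The constant and method names are made up for illustration, and the fifty-million total is the round figure from this post, not an exact measurement.

```ruby
# Illustrative only: names are hypothetical, figures are the ones quoted in the post.
DAILY_REQUESTS = 50_000_000

# Share of daily requests per latency range, in percent. Strings keep the
# decimals exact when they are converted to Rational below.
SLOW_SHARES = {
  "1-2 seconds"   => "0.4",
  "10-30 seconds" => "0.0025",
  "30-60 seconds" => "0.0001"
}.freeze

# Converts a percentage of traffic into an absolute number of requests.
def requests_for(percent, total = DAILY_REQUESTS)
  (Rational(percent) / 100 * total).to_i
end

SLOW_COUNTS = SLOW_SHARES.transform_values { |percent| requests_for(percent) }

SLOW_COUNTS.each do |range, count|
  puts "#{range}: #{count} requests a day"
end
```

Tiny percentages stop looking tiny once they are multiplied back out against the real volume.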

Dragon slaying

Since lighting the cave, we’ve already been pointed to big, obvious holes in our setup that we weren’t looking at before. One simple example was file uploads: We’d stage files in one area, then copy them over to their final resting place as part of the record creation process. That’s no problem when it’s a couple of 10MB audio files, but try that again with twenty 400MB video files and it takes a while! So now we stage straight in the final resting place, and cut out the copy process. Voilà: Lots of dragons dead.

There’s still much more work to do. Not just because it sucks for the people who actually hit those monster requests, but also because it’s a real drain on the rest of the system. Maybe it’s an N+1 case that really only appears under very special circumstances, but every time the request hits, it’s still an onslaught on the database, and everyone else’s fast queries might well be slowed down as a result.

But it really does also just suck for those who actually have to sit through a 30 second request. It doesn’t really help them very much to know that everyone else is having a good time. In fact, that might just piss them off.

It’s like going to the lobby of your hotel to complain about the cockroaches, and then seeing the smug smile of the desk clerk saying “oh, don’t worry about that, none of our other 499 guests have that problem… just deal with it”. You wouldn’t come back next summer.

So do have a look at the far side of your histogram. And use actual request counts, not just feel-good percentiles.
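One way to get at those counts is to bucket raw request durations instead of summarizing them. A rough sketch in plain Ruby, assuming you can pull durations in seconds out of your logs; the bucket edges and the names are hypothetical, not what Basecamp actually runs.

```ruby
# Lower bounds (in seconds) of the slow buckets to report. Hypothetical edges.
TAIL_EDGES = [1, 2, 10, 30].freeze

# Counts the requests that fall at or above each edge and below the next one.
# Requests faster than the first edge are skipped: they are not the dragons.
def tail_counts(durations, edges = TAIL_EDGES)
  counts = Hash.new(0)
  durations.each do |seconds|
    lower = edges.reverse.find { |edge| seconds >= edge }
    counts[lower] += 1 if lower
  end
  edges.map { |edge| [edge, counts[edge]] }.to_h
end

sample = [0.06, 0.2, 1.5, 12.0, 45.0, 0.05, 31.0]
tail_counts(sample).each do |edge, count|
  puts "from #{edge}s: #{count} requests"
end
```

Reporting the raw count per bucket keeps the far side of the histogram visible even when its share of traffic rounds to zero.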

Finding your workbench muse

David
David wrote this on 21 comments

Much intellectual capital is spent examining the logical advantages and disadvantages of our programming tools. Much ego invested in becoming completely objective examiners of productivity. The exalted ideal: To have no emotional connection to the workbench.

Hogs and wash. There is no shame in being inspired by your tools. There is no shame in falling in love with your tools. Nobody would chastise a musician for clinging to their favorite, outdated, beat-up guitar for that impossible-to-explain “special” sound. Some authors even still write their manuscripts on actual typewriters, just for the love of it.

This highlights the tension between programmers as either engineers or craftsmen. A false dichotomy, but a prevalent one. It’s entirely possible to dip inspiration and practice from both cans.

I understand where it’s coming from, of course—strong emotions often run counter to good arguments. It’s hard to convince people who’ve declared their admiration or love of something otherwise. Foolhardy, even. It can make other types of progress harder. If we all fell madly in love with Fortran and punch cards, would that still be the state of the art?

I find the benefits far outweigh the risks, though. We don’t have to declare our eternal fidelity to our tools for them to serve as our muse in the moment. And in that moment, we can enjoy the jolt of energy that can come from using a tool fitting your hand or mind just right. It’s exhilarating.

So much so that it’s worth accepting the limitations of your understanding. Why do I enjoy Ruby so very much? Well, there’s a laundry list of specific features and values to point to, but that still wouldn’t add up to the total sum. I’ve stopped questioning it constantly, and instead just embraced it.

Realizing that it’s not entirely rational, or explainable, also frees you from necessarily having to push your muse onto others. It’s understandable to be proud and interested in inviting others to share in your wonder, but mainly if they haven’t already found their own.

If someone is already beholden to Python, and you can sense that glow, then trying to talk them into Ruby isn’t going to get you anywhere. Just be happy that they too found their workbench muse.

At the end of the day, nobody should tell you how to feel about your tools (let alone police it out of you, under the guise of what’s proper for an engineer). There’s no medal for appearances, only great work.

Server-generated JavaScript Responses

David
David wrote this on 29 comments

The majority of Ajax operations in Basecamp are handled with Server-generated JavaScript Responses (SJR). It works like this:

  1. Form is submitted via an XMLHttpRequest-powered form.
  2. Server creates or updates a model object.
  3. Server generates a JavaScript response that includes the updated HTML template for the model.
  4. Client evaluates the JavaScript returned by the server, which then updates the DOM.

This simple pattern has a number of key benefits.

Benefit #1: Reuse templates without sacrificing performance
You get to reuse the template that represents the model for both first-render and subsequent updates. In Rails, you’d have a partial like messages/message that’s used for both cases.

If you only returned JSON, you’d have to implement your templates for showing that message twice (once for first-response on the server, once for subsequent-updates on the client) — unless you’re doing a single-page JavaScript app where even the first response is done with JSON/client-side generation.

That latter model can be quite slow, since you won’t be able to display anything until your entire JavaScript library has been loaded and then the templates generated client-side. (This was the model that Twitter originally tried and then backed out of). But at least it’s a reasonable choice for certain situations and doesn’t require template duplication.

Benefit #2: Less computational power needed on the client
While the JavaScript with the embedded HTML template might result in a response that’s marginally larger than the same response in JSON (although that’s usually negligible when you compress with gzip), it doesn’t require much client-side computation to update.

This means it might well be faster from an end-to-end perspective to send JavaScript+HTML than JSON with client-side templates, depending on the complexity of those templates and the computational power of the client. This is doubly so because the server-generated templates can often be cached and shared amongst many users (see Russian Doll caching).

Benefit #3: Easy-to-follow execution flow
It’s very easy to follow the execution flow with SJR. The request mechanism is standardized with helper logic like form_for @post, remote: true. There’s no need for per-action request logic. The controller then renders the response partial view in exactly the same way it would render a full view; the template is just JavaScript instead of straight HTML.

Complete example
0) First-use of the message template.

<h1>All messages:</h1>
<%# renders messages/_message.html.erb %>
<%= render @messages %>

1) Form submitting via Ajax.

<%= form_for @project.messages.new, remote: true do |form| %>
  ...
  <%= form.submit "Send message" %>
<% end %>

2) Server creates the model object.

class MessagesController < ActionController::Base
  def create
    @message = @project.messages.create!(message_params)

    respond_to do |format|
      format.html { redirect_to @message } # no js fallback
      format.js   # just renders messages/create.js.erb
    end
  end
end

3) Server generates a JavaScript response with the HTML embedded.

<%# renders messages/_message.html.erb %>
$('#messages').prepend('<%=j render @message %>');
$('#<%= dom_id @message %>').highlight();

The final step of evaluating the response is automatically handled by the XMLHttpRequest-powered form generated by form_for, and the view is thus updated with the new message and that new message is then highlighted via a JS/CSS animation.
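To see what actually travels over the wire in step 3, here is a self-contained sketch that uses Ruby's standard ERB library in place of Rails. `js_escape` is a simplified, hypothetical stand-in for the `j` helper, and the two templates are cut-down imitations of messages/_message.html.erb and messages/create.js.erb, so treat it as an illustration rather than what Rails does internally.

```ruby
require "erb"

Message = Struct.new(:id, :body)

# Simplified stand-in for Rails' `j` helper: it only escapes the characters
# this sketch needs (backslashes, quotes, and newlines).
JS_ESCAPES = { "\\" => "\\\\", "'" => "\\'", "\"" => "\\\"", "\n" => "\\n" }.freeze

def js_escape(html)
  html.gsub(/[\\'"\n]/) { |char| JS_ESCAPES.fetch(char) }
end

# Cut-down imitation of the messages/_message.html.erb partial.
MESSAGE_PARTIAL = ERB.new(
  %q{<li id="message_<%= message.id %>"><%= ERB::Util.html_escape(message.body) %></li>}
)

# Cut-down imitation of messages/create.js.erb.
CREATE_JS = ERB.new(%q{$('#messages').prepend('<%= html %>');})

# The same partial serves the first render and later updates.
def render_message(message)
  MESSAGE_PARTIAL.result_with_hash(message: message)
end

# Wraps the rendered, escaped HTML in the JavaScript the client will evaluate.
def render_create_js(message)
  CREATE_JS.result_with_hash(html: js_escape(render_message(message)))
end

puts render_create_js(Message.new(7, "Hello <b>world</b>"))
```

The point of the sketch is that the HTML is produced once, by the same template as the first render, and the JavaScript around it stays a one-liner.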

Beyond RJS
When we first started using SJR, we used it together with a transpiler called RJS, which had you write Ruby templates that were then turned into JavaScript. It was a poor man’s version of CoffeeScript (or Opalrb, if you will), and it erroneously turned many people off the SJR pattern.

These days we don’t use RJS any more (the generated responses are usually so simple that the win just wasn’t big enough for the rare cases where you actually do need something more complicated), but we’re as committed as ever to SJR.

This doesn’t mean that there’s no place for generating JSON on the server and views on the client. We do that for the minority case where UI fidelity is very high and lots of view state is maintained, like our calendar. When that route is called for, we use Sam’s excellent Eco template system (think ERB for CoffeeScript).

If your web application is all high-fidelity UI, it’s completely legit to go this route all the way. You’re paying a high price to buy yourself something fancy. No sweat. But if your application is more like Basecamp or GitHub or the majority of applications on the web that are proud of their document-based roots, then you really should embrace SJR with open arms.

The combination of Russian Doll-caching, Turbolinks, and SJR is an incredibly powerful cocktail for making fast, modern, and beautifully coded web applications. Enjoy!

Tom Ward & Zach Waugh join 37signals

David
David wrote this on 5 comments

We’d like to welcome two new members of our programming team.

Tom Ward hails from London and has been active in the Rails community since 2005. He was responsible for the SQLServerAdapter back in the day and has a cool 37 commits(!) under his belt for Rails. He’s formerly of Go Free Range, the team behind big parts of the GOV.UK project. We’re very happy to have him here, thanks to the great recommendation of fellow UK employee Pratik, who made the connection.

Zach Waugh is from Baltimore and the creator of the awesome Flint iOS and OS X clients for Campfire. Much of 37signals has already been enjoying Campfire through Zach’s clients, so we are thrilled that he’ll be able to join.

Both guys will of course stay where they’ve chosen to live and work remotely. We have a lot of both web and mobile projects to dig into, so great to have them both here. That now makes 41 of us!

Rethinking Agile in an office-less world

David
David wrote this on 28 comments

Much of agile software development lore exalts the virtue of in-person collaboration. From literal stand-up meetings to at-the-same-desk pair programming. It’s an optimization for the assumption that we’re all going to be in the same place at the same time. Under that assumption, it’s a great set of tactics.

But assumptions change. “Everyone in the same office” is less true now than it ever was. People are waking up to the benefits of remote working. From quality of life to quality of talent. It’s a new world, and thus a new set of assumptions.

The interesting, and tricky, part of choosing a work pattern is comparing these different worlds. What’s the value of a group of people who a) can only be picked from amongst those within a 30-mile radius of a specific office, b) who have to deal with the indignity of an hour-long daily commute, c) but who are Agile with that capital A?

Versus a team composed of a) the best talent you could find, regardless of where they live, and b) who have the freedom to work their own schedule, c) but can’t do the literal daily stand-up meeting or pair in front of the same physical computer?

Obviously we’ve made our pick, and the latter setup won by a landslide. But it also made discussions about methodology more complicated.

When you’re talking to someone who thinks that the world is already defined by that first “everyone in the same office” assumption, they’ll naturally champion the hacks that make that setup more workable. Without necessarily rethinking the overall value of the advice outside that assumption-laden context. Local maxima and all that.

It’s time for a reset. We need the same care and diligence that was put into documenting the agile practices of an office-centric world applied to an office-less world. There’s a new global maximum to be found. Let’s chart its path.

My observations about teaching and learning programming

Dan Kim
Dan Kim wrote this on 6 comments

Over the past 6 months, I’ve had the unique opportunity to observe a wide range of programmers – students, teachers, and world-class experts.

I was curious if there were any patterns and characteristics that made for successful programming teachers and students. My goal was simple: to make myself a better teacher and a better student, so I could help others too.

Here’s what I’ve found so far.

Great programming teachers…

…point students in the right direction, but don’t give out all the answers.

They coach students to be curious and independent – to search for their own answers. They let students stumble and rewrite code a few times. They let students discover valuable patterns, problem solving techniques, and the sheer joy of shipping something that works.

…are encouraging, but never critical.

They give praise when students write great code, and offer suggestions for refactoring clunky code. They give pointers and offer direction, but never criticize a line of thinking.

…promote writing as little code as possible.

They will direct students away from complexities and toward simple, elegant solutions. The fewer lines of code, the better.

…are patient and remember what it’s like to be a beginner.

They remember that everyone was a really bad programmer at some point and needed help (including themselves).

Great programming students…

…learn how to learn.

They don’t just look for answers, but look for patterns and techniques. They search for root causes by breaking down problems. They use Google and Stack Overflow constantly.

…talk less, listen more, and observe.

They realize they can learn a ton by just watching how their teacher solves problems, communicates, and teaches others. They literally watch and learn.

…tend to move more slowly, and are thoughtful in their analysis.

They take the time to think about a problem, craft smart questions and succinctly explain the issues they’re running into. They get better and better at embracing slow time.

…keep their code tidy.

Messy code is often the result of messy thinking. Having neat code makes it easier for others to help them, and as a result, they learn faster and understand more deeply.

…are persistent.

They are absolutely relentless in the belief that they can solve any problem (and usually do). Hours or days can go by with little or no progress, so it’s crucial to have this mindset.

…have a “ship it” attitude.

They set reasonable goals against relatively short time frames. They recognize that momentum is critically important because it provides the feeling of accomplishment and energy to keep pushing forward.

…start coding when all else fails.

There are so many places to get stuck – syntax, concepts, gems, libraries, tools, books, tutorials – the list is endless. They realize the sooner there’s real code, the sooner they’ll have something to react to, adjust, and make better.

And great programmers…

…never stop learning or teaching.

Beyond the default Rails environments

David
David wrote this on 22 comments

Rails ships with a default configuration for the three most common environments that all applications need: test, development, and production. That’s a great start, and for smaller apps, probably enough too. But for Basecamp, we have another three:

  • Beta: For testing feature branches on real production data. We often run feature branches on one of our five beta servers for weeks at a time to evaluate them while placing the finishing touches. By running the beta environment against the production database, we get to use the feature as part of our own daily use of Basecamp. That’s the only reliable way to determine if something is genuinely useful in the long term, not just cool in the short term.
  • Staging: This environment runs virtually identical to production, but on a backup of the production database. This is where we test features that require database migrations and time those migrations against real data sizes, so we know how to roll them out once ready to go.
  • Rollout: When a feature is ready to go live, we first launch it to 10% of all Basecamp accounts in the rollout environment. This will catch any issues with production data from accounts other than our own without subjecting the whole customer base to a bug.

These environments all get a file in config/environments/ and they’re all based off the production defaults.
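How accounts get picked for that 10% rollout is not covered here. As a sketch under that caveat, one simple option is to bucket accounts by id so the same account always lands in the same environment. The method names and the modulo rule are assumptions for illustration, not Basecamp's actual routing.

```ruby
# Share of accounts that should see the rollout environment first.
ROLLOUT_PERCENT = 10

# Hypothetical rule: spread integer account ids over 100 buckets and send the
# lowest `percent` buckets to the rollout environment.
def rollout_account?(account_id, percent = ROLLOUT_PERCENT)
  account_id % 100 < percent
end

# Names the environment that should serve a given account.
def environment_for(account_id)
  rollout_account?(account_id) ? "rollout" : "production"
end

puts environment_for(1207) # id 1207 falls in bucket 7
puts environment_for(1242) # id 1242 falls in bucket 42
```

Keeping the rule deterministic means an account does not flip between old and new behavior from one request to the next.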

Continued…