Some new features are sexy. They come with shiny new interfaces, extra buttons, more power. These are obvious and easy to spot. They are fun to develop and fun to release.
However, there’s another side to improving a product that doesn’t get as much respect. It’s the optimization. Nothing new, but everything better. Small tweaks here, hardware upgrades there. Everything runs more smoothly but you don’t really notice it. You feel it, but there’s nothing pretty to point to as evidence of the hard work.
The speed initiative
We want to treat speed as a feature. It should be one of our best features. So, for the past few months Jeremy and Mark have been working hard on speeding up our apps through software optimizations, caching, and network and hardware improvements. They deserve a huge round of applause for the results. They’ve made a big difference.
Let’s talk about Basecamp
We’re rolling these optimizations out to different 37signals products at different times, but let’s start with Basecamp, our most popular product. Basecamp gets used a lot by a lot of people. It’s also the type of product that people are in and out of all day long so speed is a critical factor. We rolled out a series of optimizations this week.
Some data
Here are some charts generated by New Relic that shine a light on the results of the hard work.
These charts compare an hour of traffic this morning with the same hour last week. As you can see, the changes we’ve implemented have made a dramatic difference. Our overall response time was cut very nearly in half, meaning that pages are loading roughly twice as fast as they were for the same time period last week. At the same time, we’ve managed to cut CPU usage by about a third and database time by about half.
How we did it
These gains were achieved using a variety of techniques including:
- Analysis: We relied heavily on New Relic’s outstanding RPM performance management suite to give us insight about the parts of Basecamp that were accessed the most as well as those that were most in need of improvement.
- Caching: We’ve begun using Memcached in a variety of spots. Caching can be tricky with dynamic apps like Basecamp since different people often see different things, but we’ve implemented it carefully where it could be used to its best advantage.
- MySQL optimizations: We’ve been working with a MySQL performance consultant to help us optimize our database calls and queries. We’re still early in the process but we’ve learned a lot so far.
- Hardware upgrades: We recently made some significant upgrades to our database servers. We went from servers with 2 x Dual Core 2GHz processors, 32GB of RAM, and 6×73GB 15,000 RPM SAS drives to servers with 2 x Quad Core 3GHz processors, 128GB of RAM, and 8×73GB 15,000 RPM SAS drives. We’ve also upgraded our load balancers and have new switches coming soon as well.
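To make the caching point concrete, here is a minimal sketch of the cache-aside pattern with per-viewer keys. It is not Basecamp's actual code: `TinyCache` is a hypothetical in-process stand-in for a Memcached client, and `fragment_key` and `cached_render` are illustrative names. The idea is that when different people see different things, the cache key has to include who is looking and which version of the data they are looking at.

```python
import hashlib
import time
from typing import Callable, Dict, Optional, Tuple


class TinyCache:
    """Hypothetical in-process stand-in for a Memcached client (get/set with a TTL)."""

    def __init__(self) -> None:
        self._store: Dict[str, Tuple[float, str]] = {}

    def get(self, key: str) -> Optional[str]:
        entry = self._store.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if time.monotonic() >= expires_at:
            # Expired entries behave like misses.
            del self._store[key]
            return None
        return value

    def set(self, key: str, value: str, ttl_seconds: float) -> None:
        self._store[key] = (time.monotonic() + ttl_seconds, value)


def fragment_key(project_id: int, viewer_id: int, updated_at: str) -> str:
    """Build a key that differs per viewer and changes whenever the project changes."""
    raw = f"project:{project_id}:viewer:{viewer_id}:updated:{updated_at}"
    return hashlib.sha1(raw.encode("utf-8")).hexdigest()


def cached_render(
    cache: TinyCache,
    key: str,
    render: Callable[[], str],
    ttl_seconds: float = 300.0,
) -> str:
    """Return the cached fragment if present; otherwise render it and store it."""
    hit = cache.get(key)
    if hit is not None:
        return hit
    html = render()  # the expensive part: database queries plus templating
    cache.set(key, html, ttl_seconds)
    return html
```

Because the key includes the last-updated timestamp, a change to the project produces a new key and the stale fragment is simply never read again, which avoids explicit invalidation at the cost of some unused entries waiting to expire.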
Change you can feel
While you may not immediately notice speed increases like you’d notice a big new feature, we think that over time you’ll see your productivity increase due to these speed increases. Less time for pages to load, less waiting for results. Everything’s just smoother. It’s change you can feel. The more you use Basecamp the more you’ll feel it.
Kevin
on 07 Aug 08
While often overlooked, speed and responsiveness is a great feature. Thanks for taking the time to work on that (it was already pretty good).
If you have any tips or insights on making MySQL queries faster or more efficient that you want to share I’d love to hear them. There is so much to know and learn on that topic and it is hugely important.
Jule
on 07 Aug 08
Living in the boonies, satellite is my best option. Every ms helps (a lot). Thx!
GeeIWonder
on 07 Aug 08
So… let me get this straight. Instead of writing Basecamp with a framework that actually, y’know, scales and is performant, you’re creating a patchwork Frankenstein of ridiculously overpowered servers and caching?
That’s a house of cards if I’ve ever seen one.
Russell Quinn
on 07 Aug 08
Do people only have to say scales every time someone else says rails because it rhymes? Like some illogical synaptic response?
Roy Tomeij
on 07 Aug 08
@GeeIWonder: You’re actually blaming non-optimized MySQL queries on RoR? That doesn’t happen with PHP, does it? Just like PHP apps don’t need memcached.
DHH
on 07 Aug 08
GeeIWonder, if you weren’t so obviously trolling, I would have bothered to reply that this is exactly what scaling means. Adding more hardware to deal with growth in usage. And that caching is exactly how any major site and application achieves lower response times.
Two years ago, I was talking to the Facebook guys about their 200-server memcached farm and how, if their caching servers went down, the site would basically stop working.
Anyway, all of this would be a discussion of facts and best practices between people actually interested in learning something and exchanging ideas.
You’re obviously just interested in playing a caricature of the armchair know-it-all nerd with nothing but snark. Please go back to hiding under your rock.
Beth Long
on 07 Aug 08
Thanks both for the optimization and for the peek behind the curtain.
Adam
on 07 Aug 08
Good effort! Can’t say I had noticed until you mentioned it, but that’s some impressive speed improvement, and 128 gigs of memory, blimey, I didn’t realise it was possible!
Are the servers whiteboxes or from a vendor? Be interesting to know who’s behind the hardware.
Also I second Kevin’s request on the MySQL techniques!
I’m launching a website next week after a rebuild (I don’t like rebuilding apps, but it’s a .NET app and really did need a rewrite; it was awful). Our current pages were about 800 KB to 1.1 MB in size and I’ve managed to get them down to around 300 KB. Still not great, but better.
This is a website that sees over 1,000,000 visitors a month, so hopefully the bandwidth that’s been wasted on ugly markup and large images can be spent on nice new customers!
Anonymous Coward
on 07 Aug 08
That’s what I call a spank in the **....
James
on 07 Aug 08
Congratulations on the speed improvements.
Daniel Miller
on 07 Aug 08
So the product that births a framework that births a product that improves the product that…..? Circle metaphors abound, but my gut says that’s awesome.
Tony
on 07 Aug 08
That New Relic looks like a really nice package. Does anyone happen to know if there’s an equivalent for PHP? (please no language warriors – I think by now Rails & PHP have both proven they work).
DHH
on 07 Aug 08
Adam, I believe we’re using Dell boxes. Rackspace is responsible for the actual hardware procurement, though.
Having 128GB of memory is really nice when the dataset for Basecamp is about that size already and will soon be even bigger.
But it’s all about scaling to your needs. For the first year of Basecamp, we were hosted on a single machine that ran everything. Database, web server, apps.
Obviously, once you have a hit on your hands, it’s good to spend the cash rolling in on some redundancy.
Prakash S
on 07 Aug 08
Are you using a hardware or software Load Balancer? If so which one?
mkb
on 07 Aug 08
A skilled database engineer is a wonderful thing. They’re hard to find. We looked for 6 months before we found the right guy.
DHH
on 07 Aug 08
Prakash, we’re using both. On the software side we’re using HAProxy. We just changed our HW load balancers to be gigabit, so I’m not sure what brand they are.
Greg
on 07 Aug 08
Thanks Jason, it’s much appreciated.
Tor Løvskogen Bollingmo
on 07 Aug 08
Nice tweaks.
Have you ever considered hosting Basecamp on nationally located servers? I really notice speed improvements when visiting sites on Norwegian servers versus sites on US servers. What are your thoughts on this?
Peter Urban
on 07 Aug 08
congrats on the speed update. highly appreciated.
GeeIWonder
on 07 Aug 08
Hey, that wasn’t me!
GeeIWonder
on 07 Aug 08
For the record, I think people (including myself) spend way too much time optimizing code (or, in fact, not optimizing it for the application at hand) rather than considering improvements via hardware.
I’m presenting a paper in Chicago/October about commodity computing and how it should shift our paradigms.
Prakash S
on 07 Aug 08
Thanks, David!
Typically, turning on compression, caching and persistent connections on the hardware LB will give you great benefits. That and other tweaks.
Brad Fults
on 07 Aug 08
Neat to see the specs behind your machines. I’d also be interested in any patterns you recognized when applying memcached that would be useful to other typical Rails apps.
Also, it’s “its” when it’s a possessive!
Megan M.
on 07 Aug 08
Fantastic! I actually noticed a significant difference in speed at some point in the last few days and was fascinated & delighted by it. The speed of 37signals apps had been something that felt a bit tedious for me, and having it changed so dramatically was a great boost in my workflow. Thank you for all your hard work!
SSS
on 07 Aug 08
Did the recent changes break your RSS aggregator (the signon user name/password)?
Broke SharpReader for me.
Cat Mikkelsen
on 07 Aug 08
Thank you for this article. We have an analytics suite (and a “data warehouse in the sky”) and our awesome coders just finished a significant performance rev. At the meeting where they demonstrated it, the entire company got up and clapped.
I like your message about treating speed as a feature. It’s nice to read about some of the amazing performances that happen deep in the internals of engineering departments. Even though they don’t come with a nifty new button. :-)
Bobby Lehew
on 07 Aug 08
Screaming fast. I did not know about the upgrade until a few minutes ago but the speed was visibly noticeable this morning when I logged on (I wondered what happened). Our team is in and out of Basecamp all day long … excellent work!
Jason Grigsby
on 07 Aug 08
You didn’t mention this, but you’ve got a very nice score on the YSlow plugin. Looks like you’ve taken pains to move images across multiple domains, etc. That makes a big difference.
Thanks for focusing on this.
Chris Martin
on 07 Aug 08
Thanks for the improvements.
Speaking of speed, are you thinking about copying the “add 10 milestones at a time” feature in similar ways in other areas? That speeds up the addition of milestones, for sure, but I find myself adding a lot of to-dos at one time far more often than I do milestones. So “add 10 to-do items at a time” would probably save me an hour in a typical work week.
Don Schenck
on 07 Aug 08
I’m glad to see that your strategy includes hardware.
Seriously. I’ve been in meetings (ugh) about performance and I most often suggest “upgrade the hardware” as a FIRST option. They laugh, then when I explain the cost of software optimization versus, say, another GB of RAM, suddenly I’m a frickin’ genius.
(Truth be told, I’ve always been a frickin’ genius. A gorgeous, frickin’ genius …)
Anyhoo … GREAT post. Really gives Real World insight.
Bob Moore
on 07 Aug 08
Very nice!
Years ago I wrote a DOS TSR that drove a plotter as a background process for AutoCAD users. By handcrafting the code in tight assembly language, making it interrupt driven off the UART (utilizing the hardware the way it should!) and playing around with disk buffer size, it was amazing to see the difference that software optimizations made. It will always be important. Nice to see that someone still thinks about it. I wonder if Microsoft ever has?
Don Schenck
on 07 Aug 08
That’s funny; Bob and I posted back-to-back with totally different takes on this post.
Great stuff.
Ron Steckly
on 07 Aug 08
So about the speed thing… what about file uploads? I’m writing a file upload feature in my application based on Rails and am thinking I might need to change frameworks because of how long it takes to get large, uploaded files up.
It’s also been a barrier to some of the staff where I work using Basecamp. I’m basically the only user because people don’t want to wait too long to upload files. Is there any way around this issue both for Basecamp and for Rails in general?
Thanks!
Ron
Anonymous Coward
on 07 Aug 08
Ron, file upload speed is a function of bandwidth. Rails can’t make file uploading faster if your connection is only sending the file at 300k/sec.
GeeIWonder
on 07 Aug 08
@Ron
There was a discussion (well, a mini-rant) on the file upload during the video thing the other day.
I think a good point was made, which I’ll try to paraphrase: it’s not just the length of time it takes to upload, it’s also the lack of responsiveness in the UI. Unfortunately, the UI basically means the browser in this case.
Henrik Lied
on 07 Aug 08
Ron, be sure that your application is writing in chunks to the disk. If you upload large files to memory, both the memory and the CPU usage will go through the roof, and eventually start blocking and causing “slowness”. Chunk writing is iiimportant. :)
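As a rough illustration of the chunk-writing advice, here is a minimal Python sketch (not Rails code, and not how Basecamp handles uploads). The function name `save_upload` and the 64 KB chunk size are assumptions chosen for the example; the point is only that the copy loop never holds more than one chunk in memory, whatever the size of the file.

```python
import io
import os
import tempfile
from typing import BinaryIO

CHUNK_SIZE = 64 * 1024  # assumed chunk size; tune for your workload


def save_upload(source: BinaryIO, dest_path: str, chunk_size: int = CHUNK_SIZE) -> int:
    """Copy an uploaded stream to disk one chunk at a time.

    Returns the number of bytes written. Memory use stays bounded by
    chunk_size instead of growing with the size of the upload.
    """
    total = 0
    with open(dest_path, "wb") as dest:
        while True:
            chunk = source.read(chunk_size)
            if not chunk:  # empty bytes means end of stream
                break
            dest.write(chunk)
            total += len(chunk)
    return total


def demo() -> int:
    """Write a fake 200,001-byte upload to a temp file and return its size on disk."""
    fake_upload = io.BytesIO(b"x" * 200_001)
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "upload.bin")
        save_upload(fake_upload, path)
        return os.path.getsize(path)
```

In a real web stack the `source` would be the request body stream handed over by the server, and, as noted elsewhere in this thread, the wall-clock time is still bounded by the uploader's bandwidth.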
Matt Radel
on 07 Aug 08
Sweeeet. Speed is indeed a cherished enhancement – I’m sure folks will notice. It’s amazing what a little tweaking can do. :)
someone
on 07 Aug 08
If I remember correctly, Rails doesn’t support prepared statements. How big a role does that play in response times?
richallum
on 07 Aug 08
Don’t understand any of the technical stuff but sounds like I am going to work faster which is good news to me!
Stephen Jenkins
on 07 Aug 08
As somebody that spends all day in Basecamp and Campfire, thank you!
Kevin Compton
on 07 Aug 08
I can handle slower, personally, if this had some of the basic features I need. Basecamp is a necessary evil for me at the moment and doesn’t do some basic things I really miss. Like Tasks being handled more intuitively, passing them back and forth with comments. Message board is pointless.
Tim
on 07 Aug 08
So do the graphs represent performance increases based solely on software optimization, or do they reflect the new hardware as well?
Tim
on 07 Aug 08
To expand on my point, it’s really not a fair comparison to show both software optimizations and hardware improvements on the same graph since you really don’t know which contributed to the performance increase.
Just looking at the hardware ALONE, you 4x the RAM and 2x the CPU.
So it makes sense, just looking at hardware, that the graphs indicate about 1/2 the CPU usage … b/c you double the # of CPUs (cores) you now have.
MI
on 07 Aug 08
The graphs represent both changes, but honestly the database server upgrade didn’t provide us much additional performance, it just increased our scalability as our data set gets larger. The bulk of the performance boost is due to Jeremy’s hard work integrating caching where possible.
The CPU utilization in the graph is from the application servers, not the database servers, so the CPU reduction is directly attributable to software improvements and not the database upgrade.
Salman
on 07 Aug 08
David,
With the stats you provided, it is hard to distinguish between the improvements you guys made via MySQL/code and the new server hardware. A better comparison would be using the same hardware, and then a before/after graph with JUST the code improvements.
BTW, how many page views are you guys doing on the average day for Basecamp (excluding the chat app, of course, since it probably polls)?
Scott Ruthfield
on 07 Aug 08
Hi Jason -
Nice to see your team treating speed as a feature!
Your description above shows that you focused your performance energy on the backend. This is standard and often smart, and as you can see, you can get real performance improvements out of optimizing backend performance.
Customers, though, deal with performance problems that come from both the backend and the frontend – so optimizing your speed of client execution can be just as important, and the toolset for that is much weaker.
That’s why we built and open-sourced Jiffy, and released it this year at the O’Reilly Velocity conference – it’s a toolset for getting real-world (rather than just developer desktop) client-side data and optimizing client performance. Might be worth checking out.
EH
on 07 Aug 08
Both this story and Adam’s comment elide something very important and measurable when it comes to performance optimization: money. Shrinking the size of your pages and using hardware/code more efficiently allows you to focus on the application while taking a step back due to not having to worry so often about hardware upgrades and bandwidth expenditures. These things aren’t always invisible!
Mark Phillips
on 07 Aug 08
Thanks a million! We love Basecamp, but the number one complaint on our team was loading speed. These improvements are just what we were looking for!
Terinea Weblog
on 07 Aug 08
Now you mention it, yeah I’ve noticed a difference in speed. I was thinking about turning off SSL to speed things up.
Well done guys, keep up the work.
Jamie
Emil
on 07 Aug 08
Great job with the speed update. Any plans for speed upgrades for the Writeboard integration? It feels very slow, especially since it redirects through a second redirect page (“Just a moment….”).
You should be able to get pretty fast document open/close since you don’t need to load a lot of javascript compared to google docs etc.
Seth / Subimage
on 07 Aug 08
Also forgot to mention – but if you’re looking for other ways to gauge performance of your rails app this is a wonderful logging tool.
I’m using that on Cashboard and it’s provided me with wonderful information on areas that I need to address.
Tobin Harris
on 07 Aug 08
“We want to treat speed as a feature”. I love that, it’s the kind of bare-metal thinking you guys seem so good at.
I was really shocked to learn that the Basecamp IT has to be scaled up rather than out. Couldn’t you just add N more commodity servers to handle growth?
Actually, I’m strictly talking from the “armchair” here, I don’t have a clue about your system, domain or such scalability challenges. But, I was surprised that you had to beef up the server to handle performance problems, rather than add more servers.
Michael Boyer
on 08 Aug 08
Thank you for the increased speed. The team will love it.
Kevin Kim
on 08 Aug 08
Am I the only person who has not noticed the performance increases in Basecamp?
Don’t get me wrong, running on brand new shiny hardware is always nice.
I’m just not seeing what others are talking about.
Luke Flood
on 08 Aug 08
no difference here… any chance of getting some Australian servers to help us aussies? you must have enough Australian customers to at least consider this. We are on a lightning fast connection here, and frequently wait 5-10 seconds for basecamp pages to load. Great product, load times seriously suck.
Chris Cavallucci
on 08 Aug 08
I definitely feel the pages loading faster. Thank you for the optimizations; our clients will appreciate them, too.
Will you be optimizing the javascript, XHTML, and CSS?
Evan Leonard
on 08 Aug 08
Great work. Another good place to look is at the js/images download process. The YUI blog is a good source of inspiration. Here’s a good one that it looks like you’re not doing yet:
The rendering of the entire page, including downloading other resources, blocks while javascript files load. They block even if they’re loading from the cache. Take a look at the Net tab in firebug for evidence, and this article for some tips on speeding it up: http://yuiblog.com/blog/2008/07/22/non-blocking-scripts/
Keep up the great work, Evan Leonard
Tayfun Ozturkmen
on 08 Aug 08
Feels WAY faster! I was about to write an email about the speed issues, lol.
Ben Blakley
on 08 Aug 08
@Ron, I’d recommend looking at the new Nginx upload module to improve handling large file uploads with Rails. I’ve been trying it out and have been really impressed with it.
http://brainspl.at/articles/2008/07/20/nginx-upload-module
Ross Kimbarovsky
on 08 Aug 08
Jason – thanks so much for posting. We’ve been dealing with hardware scaling at crowdSPRING (http://www.crowdpsring.com) as we’re growing quickly (we also host with Rackspace) and during a conversation last week, wondered what 37signals was doing to respond to the tremendous success of its products (we’re big fans and users!). Very nice of you to share this type of information and data! New Relic looks like a great tool; wishing there was a PHP equivalent…
Ross
Roy Tomeij
on 08 Aug 08
@Evan: Yeah, the Yahoo guys wrote some excellent guides (and a book) and give great talks about the subject of front-end optimization (which actually accounts for 90% of the time people wait for a page to load). A must-read for every developer.
Michael Cohen
on 08 Aug 08
Apparently nobody has clued you all in to the yahoo performance primitives :) halving response time does not halve the time it takes for a page to load!
MI
on 08 Aug 08
We’re very aware of the various client side optimizations that can be done, we make use of some of the techniques already. If you look at a YSlow report for a Basecamp account, you’ll see that we have a score of A in all but one area: we don’t currently use a content distribution network. We did have a problem with ETags for static assets that has since been fixed, but we’ve historically done very well with the YSlow results.
Michael Cohen
on 08 Aug 08
YSlow’s not all of their suggestions… take a look at the “After a YSlow A” video on YDN Theater.
Anyway, the point I was trying to make is that halving response times doesn’t halve page load times at all unless you really have managed to magically shave off 80%+ of rendering time as well.
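The arithmetic behind this point can be sketched with Amdahl's law. The function below is illustrative only: the name `remaining_load_time` is made up for the example, and the server-side fractions are assumptions taken from figures mentioned in this thread (about 10% if the front end accounts for 90% of the wait, or about 20% as a more generous case), not measurements of Basecamp.

```python
def remaining_load_time(server_fraction: float, server_speedup: float) -> float:
    """Fraction of the original page load time left after speeding up the server part.

    server_fraction: share of total load time spent waiting on the server (0..1).
    server_speedup:  factor by which server response time improved (2.0 = halved).
    """
    if not 0.0 <= server_fraction <= 1.0:
        raise ValueError("server_fraction must be between 0 and 1")
    if server_speedup <= 0.0:
        raise ValueError("server_speedup must be positive")
    # The untouched front-end share stays as-is; only the server share shrinks.
    return (1.0 - server_fraction) + server_fraction / server_speedup


# Assumed scenarios: server share of total load time -> remaining load time
# when the server response time is halved.
scenarios = {share: remaining_load_time(share, 2.0) for share in (0.10, 0.20, 1.00)}
```

Under these assumptions, halving the server response time leaves about 95% of the original load time when the server accounts for 10% of the wait, about 90% when it accounts for 20%, and exactly half only in the case where the server is the entire wait.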
Michael Cohen
on 08 Aug 08
Actually, I would be really quite intrigued by the corresponding dip in page load times, if you are able to benchmark it.
Data is always fun!
MiSc
on 08 Aug 08
very obamaesque last heading ; )
Manik
on 08 Aug 08
Please share the MySQL optimization tips with the world. There might be things that go back into the framework, but there might be some, like MySQL settings, which would help all.
Thanks.
Rick Vugteveen
on 08 Aug 08
On the client side performance front, have you thought about supporting Gears? Wordpress now uses it to cache commonly used components locally.
Derek Hoshiko
on 08 Aug 08
Let’s make the connection between performance of web servers and carbon emissions (thus climate change), and energy usage (electrical power, and thus impact due to rising energy costs). Algorithms directly impact us socially, environmentally, and economically. This makes this work an even greater selling point. Thanks!
Morten
on 08 Aug 08
Very nice work! One question (to you and your readers): What percentage of your revenue is spent on hosting/hardware?
We’re a small SaaS shop (compared to 37s) and are at about 8-10% which I find reasonable, but then again, I don’t have much material to benchmark us against.
Owen van Dijk
on 08 Aug 08
I found it interesting that you emphasize the ‘from servers’ and ‘to servers’ part. Is it to note that you’re not running your apps ‘in the cloud’? :)
Oliver
on 08 Aug 08
This is change I can believe in.
Martin Carrion
on 08 Aug 08
Basecamp is faster indeed. Thanks Jason!
Martin
Tom Reitz
on 09 Aug 08
WTG Guys! An improvement on something that is just working so ridiculously fantastic for me. Love it! Keep it up! And yes do share some MYSQL optimization tricks… EVERYONE wants to know that one!
Peter
on 11 Aug 08
To be honest, Basecamp never seemed slow for me. I guess this is a pre-emptive rather than totally necessary update, with a little boost in speed for good measure.
Luca Guidi
on 11 Aug 08
Thanks for taking care of speed. I really appreciate the “speed as a feature” concept.
I think it’s time to start a community-wide discussion about scalability recipes. I noticed the lack of articles, tutorials, and books about these kinds of issues. Rails can only win if most of the community owns these concepts. It’s a crucial point. We know Rails can scale, but not all teams have JK, MI or DHH. What do you think, guys?
J C
on 12 Aug 08
Talk about bogus smoke and mirrors stats… anyhow
Here’s the main MySQL optimisation trick: switch to PostgreSQL (or anything else, e.g. Informix, FrontBase, Sybase, etc). When that kind of hardware can be purchased on a whim (for example, 128GB RAM to store the db in memory), why not invest in and build upon a decent database server? MySQL apologists can get their knickers in a twist but MySQL is the worst DBMS selection that can be made and that is the truth. Move on.
Hire website designer
on 12 Aug 08
Many, many congratulations on the speed improvements. I have been using Basecamp for a year and am glad to have a product like this. My clients are also very happy with this new speed improvement. Thanks again.
Kurt
on 12 Aug 08
As a reply to Tobin Harris regarding scaling vertically or horizontally-
Horizontal scaling is fine, but there’s definitely a point at which managing all those servers gets challenging, especially for companies with a smaller staff (datacenter operations get expensive quickly). They’ve already shown their smarts in that they’re using a managed facility (Rackspace, who is awesome, IMHO), most likely because managing a ton of computers isn’t at the top of most people’s “fun list”. Plus, application architecture has to be taken into account: can your architecture effectively utilize more servers, what bang do you get for your buck, etc.
The company I work at always tried to scale horizontally (at Rackspace as well), and now after a few years we realize we’re running on a bunch of very old machines. We just figured out we can scale 3-4x and cut our number of machines in half just by getting some updated hardware. It’s a no-brainer.
Cyril
on 13 Aug 08
I wondered if you also did some measurements of typical scenarios from an end-to-end point of view? Like every five minutes access a Basecamp account, then go into one project, then add a to-do, then switch to another project, add a message, go see a whiteboard (this one takes time).
A mean performance metric doesn’t reveal the actions that are at the extremes. An overall improvement doesn’t mean you have improved in the areas where you had real problems.
Did you take this approach too? Which were the biggest improvements?
This discussion is closed.