Next up in the Nuts & Bolts series, I want to cover storage. There were a number of questions about our storage infrastructure after my new datacenter post asking about the Isilon storage cluster that is pictured.
To set the stage, I’ll share some file statistics from Basecamp. On an average weekday, around 100,000 files are uploaded to Basecamp, with an average file size that is currently 2MB, for a total of about 200GB of uploaded content per day. And that’s just Basecamp! We have a number of other apps that handle tens of thousands of uploaded files per day as well. Based on that, you’d expect we’d need to handle maybe 60TB of uploaded files over the next 12 months, but those numbers don’t account for the acceleration in the amount of data uploaded. Just since January, the average uploaded file size has grown from 1.88MB to 2MB, and our overall storage consumption rate has increased by 50% with no signs of slowing down.
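The back-of-the-envelope arithmetic is easy to check. A quick sketch, where the 260-weekday year is my assumption (weekend uploads and the other apps make up the gap to the ~60TB estimate):

```python
# Rough projection of Basecamp upload volume, using the figures above.
# The 260-weekday year is an assumption, not a figure from the post.
files_per_weekday = 100_000
avg_file_mb = 2
weekdays_per_year = 260  # assumed: 52 weeks * 5 days

gb_per_day = files_per_weekday * avg_file_mb / 1000   # uploaded GB per weekday
tb_per_year = gb_per_day * weekdays_per_year / 1000   # TB per year, before growth

print(f"{gb_per_day:.0f} GB/day, ~{tb_per_year:.0f} TB/year")
# prints "200 GB/day, ~52 TB/year"
```

Weekday Basecamp uploads alone come to roughly 52TB a year, before the 50% growth in consumption rate is factored in.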
When I sat down to begin planning our move from Rackspace to our new environment, I looked at a variety of options. Our previous environment consisted of a mix of MogileFS and Amazon S3. When a customer uploaded a file to one of our applications we would immediately store the file in our local MogileFS cluster and it would be immediately available for download. Asynchronously, we would upload the file to S3, and after around 20 minutes, we would begin serving it directly from S3. The staging of files in MogileFS was necessary to account for the eventually consistent nature of S3.
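The staged flow above can be sketched in a few lines. This is a hypothetical illustration, not our actual code: the store classes, the thread-based async copy, and treating the 20-minute window as a fixed timer are all simplifying assumptions.

```python
# Sketch of the staged upload flow: write locally first, copy to S3
# asynchronously, and only serve from S3 after a consistency window.
import threading
import time

S3_CONSISTENCY_DELAY = 20 * 60  # seconds before the S3 copy is trusted

class UploadPipeline:
    def __init__(self, local_store, s3_store):
        self.local = local_store
        self.s3 = s3_store
        self.s3_ready_at = {}  # file_id -> time when the S3 copy is trusted

    def upload(self, file_id, data):
        # 1. Write to local MogileFS-style storage; downloadable immediately.
        self.local.put(file_id, data)
        # 2. Copy to S3 asynchronously.
        threading.Thread(target=self._to_s3, args=(file_id, data)).start()

    def _to_s3(self, file_id, data):
        self.s3.put(file_id, data)
        self.s3_ready_at[file_id] = time.time() + S3_CONSISTENCY_DELAY

    def url_for(self, file_id):
        # Serve from S3 only once the consistency window has passed;
        # until then, fall back to the local copy.
        ready = self.s3_ready_at.get(file_id)
        if ready is not None and time.time() >= ready:
            return self.s3.url(file_id)
        return self.local.url(file_id)
```

The staging step exists only because of S3’s eventual consistency; dropping S3 from the serving path, as described below, removes this entire dance.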
While we’ve been generally happy with that configuration, I thought that we could save money over the long term by moving our data out of S3 and onto local storage. S3 is a phenomenal product, and it allows you to expand storage without having to worry much about capacity planning or redundancy, but it is priced at a comparative premium. With that premise in mind I crunched some numbers and was even more convinced that we could save money on our storage needs without sacrificing reliability and while reducing the complexity of our file workflow at the same time.
The main contenders for our new storage platform were either an expanded MogileFS cluster or a commercial NAS. We knew we did not want to juggle LUNs or a layer like GFS to manage our storage, so we eliminated traditional SAN storage as a contender fairly early on. We’ve had generally good luck with MogileFS, but we’ve seen ongoing memory growth issues on some of our nodes and at least a couple of storage-related outages over the past couple of years. The user community around MogileFS is great, but the lack of commercial support options becomes painfully apparent when you have an outage.
After weighing all of the options, we decided to purchase a commercial solution, and we settled on Isilon as the vendor for our storage platform. Protecting our customers’ data is our most important job, and we wanted a system that we could be confident in over the long term. We initially purchased a 4 node cluster of their 36NL nodes, each with a raw capacity of 36TB. The usable capacity of our current cluster with the redundancy level we have set is 108TB. We’ve already ordered another node to expand our usable space to 144TB in order to keep pace with the storage growth that took place between the time we planned the move and when we implemented it.
The architecture of the Isilon system is very interesting. The individual nodes interconnect with one another over an InfiniBand network (SDR or 10 Gbps right now) to form a cluster. With the consistency level we chose, each block of data that is written to the cluster is stored on a minimum of two nodes in the cluster. This means that we’re able to lose an entire node without affecting the operation of our systems. In addition, the nodes cooperate with one another to present the pooled storage to our clients as a single very large filesystem over NFS. Isilon also has all the features like snapshots, replication, quotas, and so on that you would expect from a commercial NAS vendor. These weren’t absolute requirements, but they certainly make management simpler for us and are a welcome addition to the toolbox.
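A toy model of that protection scheme, with the caveat that OneFS actually uses per-file FEC and mirroring policies rather than this naive round-robin placement:

```python
# Toy illustration of the protection level described above: every block
# is written to two distinct nodes, so the cluster tolerates the loss
# of any single node without losing data.

def place_blocks(num_blocks, nodes):
    """Assign each block to two distinct nodes, round-robin."""
    placement = {}
    for b in range(num_blocks):
        primary = nodes[b % len(nodes)]
        secondary = nodes[(b + 1) % len(nodes)]
        placement[b] = {primary, secondary}
    return placement

def readable_after_failure(placement, failed_node):
    """Every block survives if at least one copy is on a live node."""
    return all(copies - {failed_node} for copies in placement.values())

nodes = ["node1", "node2", "node3", "node4"]
placement = place_blocks(100, nodes)
# Any single node can fail and every block is still readable.
assert all(readable_after_failure(placement, n) for n in nodes)
```

The same two-copy property is also what lets a new node join the pool transparently: it simply becomes another placement target for future blocks.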
As we grow, it’s very simple to expand the capacity of the cluster. You just rack up another node, connect it to the InfiniBand backend network and to the network your NFS clients are connected to and push a button. The node configures itself into the existing cluster, its internal storage is added to the global OneFS filesystem, its onboard memory is added to the globally coherent cache, and its CPU is available to help process I/O operations. All in about a minute. It’s pretty awesome stuff, and we had fun testing these features in our datacenter when we were deploying it.
For now, we continue to use Amazon S3 as a backup, but we intend to replace it with a second Isilon cluster in a secondary datacenter which we’ll keep in sync via replication within the next several months.
Michael
on 28 Jul 10Why did you guys move away from Rackspace?
Plaqq
on 28 Jul 10Awesome posts! I was not familiar with Isilon. It sounds like you made some great decisions. Is S3 always like that? I.e., do stored assets take a while to become available for download from S3? Also, does your storage cluster use any compression when storing files? Thanks
Richard Nyström
on 28 Jul 10Michael: http://37signals.com/svn/posts/2471-nuts-bolts-new-datacenter
Eric Anderson
on 28 Jul 10Awesome post. Great to see the inner workings of high end apps from big companies like 37Signals.
Viktoras
on 29 Jul 10@Eric Anderson: big companies? It’s all about being small here! :p
I really enjoy the Nuts & Bolts series (OK, just like I enjoy 90% of the Signal vs. Noise blog), but it’s really inspiring/useful/awesome to have insight into different aspects of 37signals.
The overall setup is impressive, but I must admit I thought you would prefer custom solutions over commercial ones, although it seems the right choice given the importance and complexity of your data.
alex
on 29 Jul 10so primarily you have moved away from amazon because of price?
Brian Armstrong
on 29 Jul 10I’m actually a little surprised to read this as (at least the way I see it) the 37Signals mantra seems to be trade a little efficiency for simplicity. Simple lets you stay agile, keep staff low, etc.
Switching this over does seem like a lot of work. Do you feel like it will actually simplify it in the long run or is the money side just too big to ignore here?
Would be awesome to see how the numbers play out with people, up front costs, when it pays off, etc. Fascinating to watch and great example of emulating chefs by giving us an inside look – thanks for sharing it!
Gothy
on 29 Jul 10Does it really make storage that much cheaper (especially when there will be a second backup setup in another DC) than S3? That’s a lot of migration work, too, among other things.
Shashikant
on 29 Jul 10For S3, what is the split between storage cost and bandwidth cost?
Since your products are collaborative in nature, I suppose a single file gets downloaded multiple times. So I suppose the bandwidth cost is non-trivial.
Pies
on 29 Jul 10Nuts & Bolts is a great series, keep it coming and thanks :) Our projects don’t require systems at anything like this scale, but it’s very interesting nonetheless.
MI
on 29 Jul 10alex, Brian: I wouldn’t say we moved to local storage entirely because of cost. I’d say the cost savings allowed us to consider it. Keeping the data locally is significantly simpler than using S3 since we can avoid the previous step of staging uploads. The only reason we didn’t do it all along is that it was difficult to manage the data volumes that we needed with the options that were available to us. One of the major draws of the Isilon equipment was ease of management. Since it’s all really just one big filesystem, a lot of the management overhead that you see in SAN-like environments evaporates.
Having the data in a local file store also makes some features that we’ve wanted to implement significantly simpler conceptually, particularly when those features would need to touch a lot of files. Local NFS calls over gigabit ethernet are lots faster than calls to S3.
Don’t get me wrong, though, I think S3 is a great service and I’m sure there are things we’ll continue to use it for. In our particular case we wanted more control over our data than we could get in that environment. The fact that we could get that control while saving money was the icing on the cake that made the decision easy.
MI
on 29 Jul 10Shashikant: For us, the ratio of storage cost to bandwidth cost is on the order of 10:1 with storage being 10x more expensive.
Bret
on 29 Jul 10Really enjoying these—keep ‘em coming!
Bret
on 29 Jul 10How has the decision to store files locally affected your bandwidth requirements?
Anonymous Coward
on 29 Jul 10Awesome Mark. Love this content.
I don’t think this violates the 37Signals mantra of simple – I think you just have to realize who gets the simplicity. Mark’s team takes on A LOT of complexity & redundancy & speed issues so that all of us out here get to have Basecamp work quickly and simply.
This is right out of the Data Warehouse world. Users want speed and no downtime and yet simple. In fact, they (and I) really want to be able to take a system like Basecamp for granted. AKA It just works. Kudos on achieving that.
BillP
on 29 Jul 10Oops. I am the Anonymous Coward post above…
MI
on 29 Jul 10Thanks, Bill, I really appreciate that.
Tom
on 29 Jul 10Yep, really loving these posts.
Would you be able to talk about how you manage monitoring and alerts? Do you have a dashboard of crazy graphs you could show us?
Thanks,
Bryan Batchelder
on 29 Jul 10Very interesting. I am surprised that a local setup is cheaper than S3, given the stories that I have heard (particularly SmugMug) that said they saved a ton of money by using S3 rather than a local setup.
Has the storage industry changed so much since S3 in the last 3-4 years that this is no longer the case?
Does this mean Amazon needs to re-evaluate their pricing?
Chris Nagele
on 29 Jul 10Great post Mark. When it comes to NFS, did you test or benchmark any settings on the client or server side? For instance, NFSv3 compare to v4. From our experience it varies, especially when optimizing for millions of small files.
Walt
on 29 Jul 10Has 37signals looked at data duplication for your backup infrastructure?
Walt
on 29 Jul 10Sorry, that should be “data de-duplication” for your backup infrastructure.
Rob
on 31 Jul 10For now, we continue to use Amazon S3 as a backup, but we intend to replace it with a second Isilon cluster in a secondary datacenter which we’ll keep in sync via replication within the next several months.
BAD 37S!
Repeat after me. Replication is not backup! That does not in any way sound like a backup solution. It’ll give you availability and a fast RTO if you have HW issues on one site – but your RPO options are very limited.
Bruno
on 31 Jul 10Do you have any needs to expunge content? If so, how do you deal with that (pre-expunge, backup, lower cost storage solutions)?
MI
on 31 Jul 10Rob: We’re planning to use a combination of replication and snapshots for our backup requirements. You’re right that replication alone isn’t a backup solution.
Andrew Richards
on 01 Aug 10@Bryan Batchelder
That blog post from Smug Mug is over three years old. It also talks about a very pedestrian storage growth model (direct attached storage on a host for each expansion) versus S3. Isilon scales in every important aspect for the growth of 37signals’ products. It may appear expensive when only considering $/TB, but there is much more to the total cost of storage than that lone metric. The ease of expansion with Isilon is particularly important when you consider the mostly automated deployment methodology they are shooting for in the new datacenter. Between Chef and Isilon, they can rack up either a server or more storage (or both) and have it online in a matter of minutes, not hours.
If you listen to the 37signals podcast #12, the sysadmins discuss the cost issue and point out that the linear pricing model of S3 crosses the curve of local storage around 80TB (at least by their reckoning). They also talk about the pricing shenanigans of the big storage vendors, which I can attest to as well.
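That crossover argument can be sketched as a toy model. All dollar figures here are illustrative assumptions (2010-era S3 pricing, a made-up cluster cost, and a 36-month amortization), chosen only so the breakeven lands near the ~80TB figure mentioned in the podcast:

```python
# Toy cost-crossover model: S3 scales linearly with stored TB, while
# a local cluster is (roughly) a fixed cost amortized over its life.
# Every dollar figure below is an illustrative assumption.
s3_per_gb_month = 0.15   # assumed 2010-era S3 storage price, $/GB-month
cluster_cost = 432_000   # assumed hardware + support, amortized
months = 36              # assumed amortization period

def s3_monthly(tb):
    """Monthly S3 storage bill for the given number of TB."""
    return tb * 1000 * s3_per_gb_month

cluster_monthly = cluster_cost / months

# Capacity at which S3 becomes more expensive than the fixed cluster cost.
crossover_tb = cluster_monthly / (1000 * s3_per_gb_month)
print(f"crossover = {crossover_tb:.0f} TB")
# prints "crossover = 80 TB"
```

Below the crossover, S3’s pay-as-you-go model wins; above it, the fixed cluster cost is cheaper per TB, and that gap widens as data growth accelerates.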
Bryan Batchelder
on 02 Aug 10Thanks Andrew, I will definitely listen to that podcast.
This discussion is closed.