You’re reading Signal v. Noise, a publication about the web by Basecamp since 1999. Happy !

Taylor

About Taylor

Husband. Father. Son. Brother. Ops Manager at Basecamp. I have a Vizsla named Ruby. I've been rebuilding an old boat (19' Mako) for two years. I sharpen my own knives. I don't have as many guns as Jamie says I do.

When Disaster Strikes

Taylor
Taylor wrote this on 14 comments

Safety Camper Saves Basecamp From Disaster

Nearly 3 years ago we asked “What would happen if a truck crashed into the datacenter?” The resulting discussion could be summarized as “Well we would probably be offline for days, if not weeks or months. We wouldn’t have many happy customers by the time Basecamp was back online.” No one was satisfied with that answer and, in fact, we were embarrassed. So we worked really hard to be prepared with an answer that made us proud.

This past Sunday, February 15th 2015, we demonstrated that answer in public. With one command we moved Basecamp’s live traffic out of our Chicago, IL datacenter and in to our Ashburn, VA datacenter in about 60 seconds.

Chart: Traffic Swapping Between Sites

Not one of our customers even noticed the change, which is exactly as we planned it. A few hours later we ran one command and moved it all back. Again, no one noticed.

This probably qualifies as the least publicly visible project at Basecamp. And we hope it stays that way. But if it doesn’t, just know Basecamp will be online even if disaster strikes.

(There’s much to share about how we accomplished this and what we learned along the way. I’ll share the technical nitty gritty in future posts.)

December 4th Basecamp Classic, Campfire and Highrise Outage

Taylor
Taylor wrote this on 13 comments

Basic Explanation

Some background

On Dec. 4 around 5:30 p.m. CT, a number of our sites began throwing errors and were basically unusable. Specifically, Basecamp Classic was briefly impacted as it was very slow. Campfire users experienced elevated errors and transcripts were not updated for quite some time. Highrise was the most significantly impacted: For two hours every page view produced an error.

Why our sites failed

When you visit a site like Basecamp it sends you information that’s generated from a number of database and application servers. These servers all talk to each other to share and consume data via connections to the same network.

Recently, we’ve been working to improve download speeds for Basecamp. On Tuesday afternoon we set up one server with software that simulates a user with a bad Internet connection. This bad traffic tickled a bug in a number of the database and application servers which caused them to become inaccessible. Ultimately this is why users received error messages while visiting our sites.

How we fixed the sites

We powered off the server sending out the bad traffic. We powered back on the database and application servers that were affected. We checked the consistency of the data and then restarted each affected site.

How we will prevent this from happening again

  • We successfully duplicated this problem so we have an understanding of the cause and effect.
  • We asked all staff not to run that specific piece of software again.
  • We know someone might forget or make a mistake, so we set up alerts to notify us if the software is running anywhere on the network. We verified the check works too.
  • We are working with our vendors to remove the bugs that caused the servers to go offline.


In-Depth Explanation

Topology

Our network is configured with multiple redundant switches in the core, two top of rack (TOR) switches per cabinet, and every server has at least 2×10Gbe or 2×1Gbe connections split over the TOR switches. Servers are are spread among cabinets to isolate the impact of a loss of network or power in any given cabinet. As such, application servers are spread throughout multiple cabinets; master and slave database pairs are separated, etc. Finally the cabinets are physically divided into two “compute rooms” with separate power and cooling.

Before the failure

We’ve been investigating ways to improve the user experience for our customers located outside the U.S. Typically these customers are located far enough away that best case latency is around 200 ms to the origin and many traverse circuits and peering points with high levels of congestion/packet loss. To simulate this type of connectivity we used netem. Other significant changes preceding the event included: an update to our knife plugin that allows us to make network reconfiguration changes, the decomm of a syslog server, and an update of check_mk.

Failure

At 5:25 p.m. CT, Nagios alerted us that two database and two bigdata hosts were down. A few second later Nagios notified us that 10 additional hosts were down. A “help” notification was posted in Campfire and all our teams followed the documented procedure to join a predefined (private) Jabber chat.

One immediate effect of the original problem was that we lost both our internal DNS servers. To address this we added two backup DNS servers to the virtual server on the load balancer. While this issue was being addressed other engineers identified that the affected applications and servers were in multiple cabinets. Since we were unable to access the affected servers via out of band management, we suspected a possible power issue. Because the datacenter provides remote hands service, we immediately contacted them to request a technician go to one of our cabinets and inspect the affected servers.

Recovery

We prioritized our database and nosql (redis) servers first, since they were preventing some applications from working even in a degraded mode. (Both our master and slave servers were affected, and even our backup db host was affected. Talk about bad luck …) About five minutes after we had a few of the servers online, they stopped responding again. We asked the onsite technician to reboot them again, and we began copying data off to hosts that were unaffected. But the servers failed again before the data was successfully copied.

From our network graphs we could see that broadcast traffic was up. We ran tcpdump on a few hosts that weren’t affected, but nothing looked amiss. Even though we didn’t have a ton of supporting evidence it was the problem, we decided to clear the arp cache on our core, in case we had some how poisoned it with bad records. That didn’t seem to change anything.

We decided to regroup and review any information we might have missed in our earlier diagnosis: “Let’s take a few seconds and review what every person worked on today … just name everything you did even if it’s something obvious.” We each recited our work. It became clear we had four likely suspects: “knife switch,” our knife plugin for making changes to our network; syslog-02, which had just been decommisioned; an upgraded version of the check_mk plugin that was rolled out to some hosts; and the chef-testing-01 box with netem for simulating end user performance.

It seemed pretty likely that knife-switch or chef-testing-01 were the culprits. We reviewed our chef configuration and manually inspected a few hosts to rule out syslog-02. We were able to determine that the check_mk plugin wasn’t upgraded everywhere, and that there were no errors logged.

We shut down chef-testing-01 and had the remote hands technician power on the servers that had just gone awol again. We decided that since we were pretty sure this was a networking issue, and it very likely was related to lacp/bonding/something related, we should shut down one interface on each server in case that too prevented a repeat performance. We disabled a single port in each bond both on the switch and on the server. Then we waited 15 long minutes (about 10 minutes after the server was booted and we had confirmed the ports were shut down correctly) before we called the all-clear. During this time we let the databases reload their lru dumps so they were “warm.” We also restarted replication and let it catch up and got the redis instances started up.

With these critical services back online our sites began functioning normally again. Almost 2.5 long hours had passed at this point.

Finally, we made a prioritized list of application hosts that were still offline. For those with working out-of-band management, we used our internal tools to reboot them. For the rest we had the datacenter technician power cycle them in person.

Resolution

  • We were able to reproduce this failure with the same hardware during our after-incident testing. We know what happens on the network, but we have not identified the specific code paths that cause this failure. (The change logs for the network drivers leave lots to be desired!)
  • We have adjusted the configuration of the internal DNS virtual server to automatically serve via the backup servers if the two primary servers are unavailable.
  • We have added additional redis slaves on hosts that were not previously affected by the outage.
  • We are continuing to pursue our investigation with the vendor and through our own testing.
  • Everyone on the operations team has made a commitment to halt further testing (with netem) until we can demonstrate it will not cause this failure again.
  • We have added “netem” to our Nagios check for blacklisted modules in case anyone forgets about that commitment.
  • We are updating our tools so that physically locating servers when Campfire (and thus our Campfire bot) is broken isn’t a hassle.

Additional information

We’ve built a Google spreadsheet which outlines information about the hosts that were affected. We’re being a bit cautious with reporting every single configuration detail because this could easily be used to maliciously impact someone’s (internal) network. If you’d like more information please contact netem (at) 37signals and we’ll vet each request individually.

Using Information About Our Network to Remove Monitoring Noise

Taylor
Taylor wrote this on 3 comments

Our team adds new checks and alerts every week so that we can stay ahead of new issues. We try very hard to make sure that each alert is configured and tested such that it provides timely and credible evidence of a real problem. Sometimes though, when things go wrong we are inundated with alert information which actually hinders and confuses our problem identification and resolution.

A real world example

A server with two 10 Gigabit network connections experiences a hardware failure and spontaneously reboots. Our Campfire room is filled with alerts not just for the host being down, but also for the switch (ports) the host is connected to.

We monitor the switch ports because we want to know that they are at the correct speed, that there are no individual failures, and that no “foreign” devices have been plugged into the network. In the case of a host failure, the information about the switch ports is secondary to the information about the host—but it represents 2x the volume of alert data we receive.

In cases like this we need to make our monitoring system more aware of the dependencies exist between these checks so that we can eliminate the noise. To do so we use a number of open source technologies:

Continued…

Scaling Your Database via InnoDB Table Compression

Taylor
Taylor wrote this on 7 comments

Basecamp Classic’s database is actually split across two sets of servers. One set contains a single table which is approximately 430 Gbs or more than half the entire volume of data (across both sets) in total.

Two years ago we separated this table because of its growth and size compared to the other tables. By separating the table we could scale the database hardware more closely to data growth, and we kept InnoDB buffer pool evictions to a minimum which made performance more stable.

Recently our monitoring showed some less than desirable metrics regarding this database pair: the least of which was that free storage would be exhausted in about 90 days. There was also a number of slow queries due to insufficient buffer pool space and slow queries from data “on disk”. We had already exhausted the normal tuning approaches and we needed to find a solution for these problems that didn’t involve significant time or money expenditures.

There are two common methods used to keep growing MySQL databases peforming optimally: buying new hardware or reducing the volume of data such that approximately 80% fits in memory. Buying new hardware is expensive and usually incurs a high time and staffing penalty. In most situations reducing the amount of the data is impossible because the database is actually growing through active use.

Continued…

Making Application Maintenance Fun

Taylor
Taylor wrote this on Discuss

Today I made intermission public. As I mentioned in my post about mysql_role_swap we’ve been working hard to limit / eliminate the impact our operations maintenance tasks have on our customer’s experience.

A few people noticed the /tmp/hold “leftover” in mysql_role_swap script. intermission is a product of that early exploration with coordinating database maintenance with request pausing in the web application tier. I’ve done a good bit of non production testing with intermission, but only limited production testing.

Last Friday we used intermission with mysql_role_swap to move Writeboard’s database to a new server. We had a single user facing exception, and we think it was likely caused by something other than the maintenance. For Friday’s maintenance we enabled request pausing via intermission, ran mysql_role_swap, restarted the unicorn (rails) processes, and then unpaused the requests. Total maintenance time was just a few seconds!

Making Database Maintenance Fun: mysql_role_swap

Taylor
Taylor wrote this on 10 comments

We’ve come a long way in the last year in the way we operate our sites. We’ve stabilized our applications, improved their response time, and increased their availability.

To accomplish these improvements we’ve done a series of database maintenances that varied from upgraded hardware, to new database servers, to configuration changes that required restart. In each of these operations we had one common goal: minimize the interruption to our customers.

Today we are releasing a small script that has made our lives, and our customer’s lives a whole lot better. We use this script to change the roles of our databases from replication masters to slaves, and vice versa. The fact that the script does all the steps previously performed by a human in a more timely and perfect manner is where we achieve all the gain.

Without this script we used to spend minutes accomplishing these maintenance tasks. With the script we’ve swapped databases under production load with no user noticeable interruption!

The script has lots of hard coded paths and users and other assumptions. But this is too good to keep to ourselves. We’re sharing it with you with the hope that it will improve your operations experience, and that you will contribute back changes that make it even better.

Behind the Scenes: The Hardware that Powers Basecamp, Campfire and Highrise

Taylor
Taylor wrote this on 20 comments

A few of us recently attended Velocity Conference in San Jose, CA. In the “hallway track sessions” a number people asked about the hardware that powers Basecamp, Campfire and Highrise.

Application Servers
All of our Ruby/Rails application roles run on Dell C Series 5220 servers. We chose C5220 servers because they provide high density, high performance, and low cost compute sleds at a decent cost point. The C5220 sleds replaced invidual Dell R710 servers which consumed a greater amount of power and rack space in addition to offering expandability we were not utilizing.

We use an 8 sled configuration with E31270 3.40GHz processors, 32/16G of ram, an LSI raid card and 2 non Dell SSDs. (For those of you thinking of ordering these … get the LSI raid card. The built in Intel raid is unreliable.) Each chassis with 8 sleds takes up 4U of rackspace: 3 for the chassis and 1 for cabling.

Job / Utility Servers
We use a combination of C6100 and C6220 servers to power our utility/jobs and API roles. We exclusively use the 4 sled version (of each) which means we get 4 “servers” in 2U. Each sled has 2x X5650 processors, 48-96G of ram, 2-6 ssds, and 4×1G or 1×10G network interfaces. This design allows to have up to 24 disks in a single chassis while consuming the same space as a single R710 server (which holds 8 disks max).

Search
For Solr we run R710s filled with SSDs. Each instance varies, but a common configuration is 2x E5530 processors, 48G of ram, 4-8 ssds, and 4×1g network interfaces. For Elastic Search we run a mix of Poweredge 2950 servers and C 5220 sleds with 12-16G of ram and 2×400G ssds in a raid 1.

Database and Memcache/Redis Servers
For Database roles we use R710s with 2x X5670 processors, 1.2TB Fusion-IO duo cards and varying amounts of memory. (Varies based on the database size.) We also have a number of older R710s powering Memcache and Redis instances. Each of these has has 2x E5530 processors and 2-4 disks with 4×1G network interfaces.

Storage
We have around 400TB / 9 nodes of Isilon 36 and 72NL storage. We serve all of the user uploaded content off this storage with backups to S3.

OS Choice
Database servers run RHEL or Centos 6 while application and utility servers run Ubuntu LTS.

Matt Kent joins 37signals as Sysop

Taylor
Taylor wrote this on 11 comments

Today Matt Kent started at 37signals as the sixth member of our operations team. Previously Matt worked for 10 years(!) at Bravenet as a System Administrator handling application deployment and scaling tasks. Some of Matt’s managers and coworkers used phrases like “backbone of our team” and “a great person, and a fantastic engineer” to describe what working with Matt is like.

Recently Matt was selected as an Opscode MVP for his contributions to Chef not once, not twice, but three times. Less than a day after Matt’s first interview, he resolved both our Chef bug reports endearing him to the entire team. In addition, Matt has experience with multiple data center setups, automated deployment, monitoring, and just about every other area we touch.

Welcome Matt!

I heard you like numbers...

Taylor
Taylor wrote this on 61 comments

In 2011

  • Our support team responded to 100,000 cases
  • Our syslog server logged an average of 1,500,000,000+ messages each day
  • Our solr indexer processed 428,000,000 indexing requests for Highrise alone
  • We hit our 100,000,000th person/company creation in Highrise
  • And a Basecamp user uploaded the 100,000,000th file (It was a picture of a cat!)

In the first 10 business days of 2012

  • We stored 100,000,000 unique statsd measurements
  • Our syslog server logged over 20,000,000,000 messages
  • Our applications sent more than 2,000 email notifications per minute
  • And Basecamp accepted an average of 75,000 file uploads from users per day

Interested in numbers other than revenue and profit? Leave a comment and we (mostly Noah) will dig them up for a future post.

Why Attacking Application Exceptions is Important

Taylor
Taylor wrote this on 14 comments

Background

Every week, a few of our programmers focus on responding to customer problems that might indicate a bigger issue with one of our applications. In addition, we’re constantly looking at issues (Rails exceptions, performance “hotspots”, etc.) that don’t bubble up to the customer level, but that are just “good housekeeping”. Taking care of these issues can have both real and measurable impacts — for example, we reduced application exceptions by 43% in December, allowing us to proactively fix even more problems before customers noticed.

Lets enter the house of Campfire, our real time chat product. Campfire uses memcache in a number of ways, one of which is to store fragments of messages that have been “spoken” in a room. When a user requests the transcript for that room we attempt to build a list of these messages from the cache before we hit the database. To do this, we build an array of message keys which are dispatched to a memcache-client multi_get request.

For a long time Campfire has generated exceptions showing multi_get failures, but the errors aren’t bringing down the site or causing users problems. When I first came to 37signals the exceptions were described as being caused by a bug in older versions of memcached (server), but no one knew “for sure”. Lacking any open issues or similar bug reports, as a work around our developers had patched the memcache-client to rescue ENOPIPE errors, raise by these failed get requests. The patch didn’t make the problem go away, but it kept it from being customer impacting…

Identifying the Problem

While testing a new branch of Campfire in staging, one of our developers was repeatedly encountering this error to the point where an exception was being raised for almost every page. Remember how I said these things pop up at the most inconvenient times? I decided to try and figure out these exceptions once and for all. (Ultimately, it turns out we hadn’t bundled the patched gem and that’s why the exception was being raised so often.)

After reviewing the code for memcache-client, I decided there wasn’t anything obviously wrong with its implementation for multi_get requests. To start my debugging, I fired up telnet, set two keys, and issued a “get keyone keytwo”. This worked fine. I had established that multi gets in the most basic form do work.

Next I used strace to attach to the unicorn (Ruby on Rails) process and see what the client was sending:

write(15, "get enter_messages/343555238/hide_author=false&hide_date=false&last_read=343841341 timestamp_messages/343575826/hide_author=false&hide_date=false&last_read=343841341 enter_messages/343575827/hide_author=false&hide_date=true&last_read=343841341 timestamp_messages/343582929/hide_author=false&hide_date=true&last_read=343841341 enter_messages/343582930/hide_author=false&hide_date=true&last_read=343841341 timestamp_messages/343589192/hide_author=false&hide_date=true&last_read=343841341 kick_messages/343589193/hide_author=false&hide_date=true&last_read=343841341 timestamp_messages/343592436/hide_author=false&hide_date=true&last_read=343841341 enter_messages/343592437/hide_author=false&hide_date=true&last_read=343841341 timestamp_messages/343606646/hide_author=false&hide_date=true&last_read=343841341 enter_messages/343606647/hide_author=false&hide_date=true&last_read=343841341 timestamp_messages/343654455/hide_author=false&hide_date=true&last_read=343841341 upload_messages/343654456/hide_author=false&hide_date=true&last_read=343841341 timestamp_messages/343663288/hide_author=false&hide_date=true&last_read=343841341 text_messages/343663289/hide_author=false&hide_date=true&last_read=343841341 text_messages/343663294/hide_author=true&hide_date=true&last_read=343841341 text_messages/343663298/hide_author=true&hide_date=true&last_read=343841341 text_messages/343663301/hide_author=true&hide_date=true&last_read=343841341 text_messages/343663459/hide_author=false&hide_date=true&last_read=343841341 text_messages/343663763/hide_author=true&hide_date=true&last_read=343841341 text_messages/343663833/hide_author=true&hide_date=true&last_read=343841341 timestamp_messages/343664243/hide_author=false&hide_date=true&last_read=343841341 text_messages/343664244/hide_author=false&hide_date=true&last_read=343841341 text_messages/343664249/hide_author=false&hide_date=true&last_read=343841341 text_messages/343664928/hide_author=true&hide_date=true&last_read=343841341 text_messages/343665538/hide_author=false&hide_date=true&last_read=343841341 text_messages/343665582/hide_author=true&hide_date=true&last_read=343841341 text_messages/343665661/hide_author=false&hide_date=true&last_read=343841341 text_messages/343665730/hide_author=true&hide_date=true&last_read=343841341 timestamp_messages/343668494/hide_author=false&hide_date=true&last_read=343841341 text_messages/343668495/hide_author=false&hide_date=true&last_read=343841341 timestamp_messages/343670646/hide_author=false&hide_date=true&last_read=343841341 kick_messages/343670647/hide_author=false&hide_date=true&last_read=343841341 timestamp_messages/343674727/hide_author=false&hide_date=true&last_read=343841341 upload_messages/343674728/hide_author=false&hide_date=true&last_read=343841341 text_messages/343675113/hide_author=true&hide_date=true&last_read=343841341 text_messages/343675330/hide_author=true&hide_date=true&last_read=343841341 text_messages/343675512/hide_author=true&hide_date=true&last_read=343841341 text_messages/343675662/hide_author=false&hide_date=true&last_read=343841341 text_messages/343675783/hide_author=true&hide_date=true&last_read=343841341 text_messages/343675850/hide_author=false&hide_date=true&last_read=343841341 text_messages/343675926/hide_author=false&hide_date=true&last_read=343841341 text_messages/343675962/hide_author=false&hide_date=true&last_read=343841341 text_messages/343676003/hide_author=true&hide_date=true&last_read=343841341 text_messages/343676419/hide_author=false&hide_date=true&last_read=343841341 text_messages/343676627/hide_author=true&hide_date=true&last_read=343841341 timestamp_messages/343683775/hide_author=false&hide_date=true&last_read=343841341 enter_messages/343683778/hide_author=false&hide_date=true&last_read=343841341 timestamp_messages/343697190/hide_author=false&hide_date=true&last_read=343841341 kick_messages/343697191/hide_author=false&hide_date=true&last_read=343841341 text_messages/343698503/hide_author=false&hide_date=true&last_read=343841341 text_messages/343698601/hide_author=true&hide_date=true&last_read=343841341 text_messages/343698676/hi", 4096) = 4096
rt_sigprocmask(SIG_BLOCK, [RT_27 RT_28], [], 8) = 0
select(17, [], [15], [], {30, 525125})  = 1 (out [15], left {30, 525122})
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
write(15, "de_author=false&hide_date=true&last_read=343841341 text_messages/343698734/hide_author=false&hide_date=true&last_read=343841341 upload_messages/343699275/hide_author=true&hide_date=true&last_read=343841341 text_messages/343699336/hide_author=true&hide_date=true&last_read=343841341 text_messages/343699454/hide_author=true&hide_date=true&last_read=343841341 text_messages/343699578/hide_author=false&hide_date=true&last_read=343841341 text_messages/343699907/hide_author=false&hide_date=true&last_read=343841341 text_messages/343700196/hide_author=true&hide_date=true&last_read=343841341 text_messages/343700303/hide_author=false&hide_date=true&last_read=343841341 timestamp_messages/343700490/hide_author=false&hide_date=true&last_read=343841341 text_messages/343700491/hide_author=false&hide_date=true&last_read=343841341 text_messages/343700960/hide_author=true&hide_date=true&last_read=343841341 text_messages/343701033/hide_author=true&hide_date=true&last_read=343841341 text_messages/343701249/hide_author=true&hide_date=true&last_read=343841341 text_messages/343701495/hide_author=true&hide_date=true&last_read=343841341 paste_messages/343701499/hide_author=true&hide_date=true&last_read=343841341 text_messages/343701536/hide_author=true&hide_date=true&last_read=343841341 text_messages/343701811/hide_author=true&hide_date=true&last_read=343841341 timestamp_messages/343705859/hide_author=false&hide_date=true&last_read=343841341 upload_messages/343705860/hide_author=false&hide_date=true&last_read=343841341 upload_messages/343705927/hide_author=true&hide_date=true&last_read=343841341 timestamp_messages/343707289/hide_author=false&hide_date=true&last_read=343841341 text_messages/343707290/hide_author=false&hide_date=true&last_read=343841341 text_messages/343707541/hide_author=false&hide_date=true&last_read=343841341 text_messages/343707657/hide_author=false&hide_date=true&last_read=343841341 text_messages/343708101/hide_author=false&hide_date=true&last_read=343841341 timestamp_messages/343711297/hide_author=false&hide_date=true&last_read=343841341 upload_messages/343711298/hide_author=false&hide_date=true&last_read=343841341 upload_messages/343711359/hide_author=true&hide_date=true&last_read=343841341 timestamp_messages/343714551/hide_author=false&hide_date=true&last_read=343841341 text_messages/343714552/hide_author=false&hide_date=true&last_read=343841341 text_messages/343714779/hide_author=false&hide_date=true&last_read=343841341 text_messages/343715190/hide_author=true&hide_date=true&last_read=343841341 text_messages/343715303/hide_author=true&hide_date=true&last_read=343841341 text_messages/343715380/hide_author=true&hide_date=true&last_read=343841341 text_messages/343715452/hide_author=true&hide_date=true&last_read=343841341 text_messages/343715558/hide_author=true&hide_date=true&last_read=343841341 text_messages/343716023/hide_author=false&hide_date=true&last_read=343841341 text_messages/343716257/hide_author=false&hide_date=true&last_read=343841341 text_messages/343716298/hide_author=false&hide_date=true&last_read=343841341 text_messages/343716338/hide_author=true&hide_date=true&last_read=343841341 text_messages/343716577/hide_author=false&hide_date=true&last_read=343841341 text_messages/343716745/hide_author=false&hide_date=true&last_read=343841341 text_messages/343716816/hide_author=true&hide_date=true&last_read=343841341 text_messages/343716900/hide_author=false&hide_date=true&last_read=343841341 text_messages/343716999/hide_author=false&hide_date=true&last_read=343841341 timestamp_messages/343717255/hide_author=false&hide_date=true&last_read=343841341 text_messages/343717256/hide_author=false&hide_date=true&last_read=343841341 text_messages/343717265/hide_author=false&hide_date=true&last_read=343841341 text_messages/343717555/hide_author=true&hide_date=true&last_read=343841341 text_messages/343717960/hide_author=true&hide_date=true&last_read=343841341 text_messages/343718088/hide_author=false&hide_date=true&last_read=343841341 text_messages/343718097/hide_author=false&hide_date=true&last_read=343841341 text_messages/34371819", 4096) = 4096
--- SIGRT_27 (Real-time signal 25) @ 0 (0) ---
getrusage(RUSAGE_SELF, {ru_utime={4, 270000}, ru_stime={0, 190000}, ...}) = 0
timer_settime(0x14, 0, {it_interval={0, 12712000}, it_value={0, 12712000}}, NULL) = 0
rt_sigreturn(0x4b5412a22c0ef)           = 4096
rt_sigprocmask(SIG_BLOCK, [RT_27 RT_28], [], 8) = 0
select(17, [], [15], [], {30, 517198})  = 1 (out [15], left {30, 517196})
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
write(15, "2/hide_author=false&hide_date=true&last_read=343841341 text_messages/343718248/hide_author=false&hide_date=true&last_read=343841341 text_messages/343718670/hide_author=false&hide_date=true&last_read=343841341 text_messages/343718779/hide_author=true&hide_date=true&last_read=343841341 text_messages/343718955/hide_author=false&hide_date=true&last_read=343841341 timestamp_messages/343725746/hide_author=false&hide_date=true&last_read=343841341 text_messages/343725747/hide_author=false&hide_date=true&last_read=343841341 text_messages/343725757/hide_author=true&hide_date=true&last_read=343841341 text_messages/343725762/hide_author=true&hide_date=true&last_read=343841341 timestamp_messages/343729823/hide_author=false&hide_date=true&last_read=343841341 text_messages/343729824/hide_author=false&hide_date=true&last_read=343841341 timestamp_messages/343758537/hide_author=false&hide_date=true&last_read=343841341 enter_messages/343758538/hide_author=false&hide_date=true&last_read=343841341 timestamp_messages/343764428/hide_author=false&hide_date=true&last_read=343841341 text_messages/343764429/hide_author=false&hide_date=true&last_read=343841341 text_messages/343764474/hide_author=true&hide_date=true&last_read=343841341 text_messages/343764528/hide_author=true&hide_date=true&last_read=343841341 text_messages/343764620/hide_author=true&hide_date=true&last_read=343841341 timestamp_messages/343773529/hide_author=false&hide_date=true&last_read=343841341 text_messages/343773530/hide_author=false&hide_date=true&last_read=343841341 timestamp_messages/343783468/hide_author=false&hide_date=true&last_read=343841341 text_messages/343783469/hide_author=false&hide_date=true&last_read=343841341 upload_messages/343783615/hide_author=true&hide_date=true&last_read=343841341 text_messages/343783650/hide_author=true&hide_date=true&last_read=343841341 text_messages/343783875/hide_author=true&hide_date=true&last_read=343841341 text_messages/343784004/hide_author=true&hide_date=true&last_read=343841341 text_messages/343784277/hide_author=false&hide_date=true&last_read=343841341 text_messages/343784372/hide_author=false&hide_date=true&last_read=343841341 timestamp_messages/343784578/hide_author=false&hide_date=true&last_read=343841341 text_messages/343784579/hide_author=false&hide_date=true&last_read=343841341 timestamp_messages/343802940/hide_author=false&hide_date=true&last_read=343841341 text_messages/343802941/hide_author=false&hide_date=true&last_read=343841341 paste_messages/343802954/hide_author=true&hide_date=true&last_read=343841341 text_messages/343803300/hide_author=true&hide_date=true&last_read=343841341 timestamp_messages/343812622/hide_author=false&hide_date=true&last_read=343841341 text_messages/343812623/hide_author=false&hide_date=true&last_read=343841341 text_messages/343813469/hide_author=false&hide_date=true&last_read=343841341 paste_messages/343813500/hide_author=true&hide_date=true&last_read=343841341 text_messages/343813695/hide_author=true&hide_date=true&last_read=343841341 text_messages/343813765/hide_author=false&hide_date=true&last_read=343841341 text_messages/343814066/hide_author=true&hide_date=true&last_read=343841341 enter_messages/343814107/hide_author=false&hide_date=true&last_read=343841341 upload_messages/343814360/hide_author=false&hide_date=true&last_read=343841341 text_messages/343814556/hide_author=false&hide_date=true&last_read=343841341 timestamp_messages/343814680/hide_author=false&hide_date=true&last_read=343841341 text_messages/343814681/hide_author=false&hide_date=true&last_read=343841341 text_messages/343815559/hide_author=false&hide_date=true&last_read=343841341 text_messages/343815889/hide_author=true&hide_date=true&last_read=343841341 paste_messages/343815893/hide_author=false&hide_date=true&last_read=343841341 text_messages/343815915/hide_author=true&hide_date=true&last_read=343841341 text_messages/343815953/hide_author=true&hide_date=true&last_read=343841341 text_messages/343816359/hide_author=false&hide_date=true&last_read=343841341 text_messages/343816657/hide_author=true&hide_date=true&last_read=3438413", 4096) = -1 EPIPE (Broken pipe)
--- SIGPIPE (Broken pipe) @ 0 (0) ---
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
sendto(9, "<13>Dec 29 20:43:11 campfire[17363]: [37s] Generic failure: Errno::EPIPE: Broken pipe", 85, MSG_NOSIGNAL, NULL, 0) = 85
rt_sigprocmask(SIG_BLOCK, [RT_27 RT_28], [], 8) = 0
select(17, [], [15], [], {30, 516190})  = 1 (out [15], left {30, 516188})
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
rt_sigprocmask(SIG_BLOCK, [RT_27 RT_28], [], 8) = 0
select(17, [], [15], [], {30, 516071})  = 1 (out [15], left {30, 516069})
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
write(15, "get enter_messages/343555238/hide_author=false&hide_date=false&last_read=343841341 timestamp_messages/343575826/hide_author=false&hide_date=false&last_read=343841341 enter_messages/343575827/hide_author=false&hide_date=true&last_read=343841341 timestamp_messages/343582929/hide_author=false&hide_date=true&last_read=343841341 enter_messages/343582930/hide_author=false&hide_date=true&last_read=343841341 timestamp_messages/343589192/hide_author=false&hide_date=true&last_read=343841341 kick_messages/343589193/hide_author=false&hide_date=true&last_read=343841341 timestamp_messages/343592436/hide_author=false&hide_date=true&last_read=343841341 enter_messages/343592437/hide_author=false&hide_date=true&last_read=343841341 timestamp_messages/343606646/hide_author=false&hide_date=true&last_read=343841341 enter_messages/343606647/hide_author=false&hide_date=true&last_read=343841341 timestamp_messages/343654455/hide_author=false&hide_date=true&last_read=343841341 upload_messages/343654456/hide_author=false&hide_date=true&last_read=343841341 timestamp_messages/343663288/hide_author=false&hide_date=true&last_read=343841341 text_messages/343663289/hide_author=false&hide_date=true&last_read=343841341 text_messages/343663294/hide_author=true&hide_date=true&last_read=343841341 text_messages/343663298/hide_author=true&hide_date=true&last_read=343841341 text_messages/343663301/hide_author=true&hide_date=true&last_read=343841341 text_messages/343663459/hide_author=false&hide_date=true&last_read=343841341 text_messages/343663763/hide_author=true&hide_date=true&last_read=343841341 text_messages/343663833/hide_author=true&hide_date=true&last_read=343841341 timestamp_messages/343664243/hide_author=false&hide_date=true&last_read=343841341 text_messages/343664244/hide_author=false&hide_date=true&last_read=343841341 text_messages/343664249/hide_author=false&hide_date=true&last_read=343841341 text_messages/343664928/hide_author=true&hide_date=true&last_read=343841341 text_messages/343665538/hide_author=false&hide_date=true&last_read=343841341 text_messages/343665582/hide_author=true&hide_date=true&last_read=343841341 text_messages/343665661/hide_author=false&hide_date=true&last_read=343841341 text_messages/343665730/hide_author=true&hide_date=true&last_read=343841341 timestamp_messages/343668494/hide_author=false&hide_date=true&last_read=343841341 text_messages/343668495/hide_author=false&hide_date=true&last_read=343841341 timestamp_messages/343670646/hide_author=false&hide_date=true&last_read=343841341 kick_messages/343670647/hide_author=false&hide_date=true&last_read=343841341 timestamp_messages/343674727/hide_author=false&hide_date=true&last_read=343841341 upload_messages/343674728/hide_author=false&hide_date=true&last_read=343841341 text_messages/343675113/hide_author=true&hide_date=true&last_read=343841341 text_messages/343675330/hide_author=true&hide_date=true&last_read=343841341 text_messages/343675512/hide_author=true&hide_date=true&last_read=343841341 text_messages/343675662/hide_author=false&hide_date=true&last_read=343841341 text_messages/343675783/hide_author=true&hide_date=true&last_read=343841341 text_messages/343675850/hide_author=false&hide_date=true&last_read=343841341 text_messages/343675926/hide_author=false&hide_date=true&last_read=343841341 text_messages/343675962/hide_author=false&hide_date=true&last_read=343841341 text_messages/343676003/hide_author=true&hide_date=true&last_read=343841341 text_messages/343676419/hide_author=false&hide_date=true&last_read=343841341 text_messages/343676627/hide_author=true&hide_date=true&last_read=343841341 timestamp_messages/343683775/hide_author=false&hide_date=true&last_read=343841341 enter_messages/343683778/hide_author=false&hide_date=true&last_read=343841341 timestamp_messages/343697190/hide_author=false&hide_date=true&last_read=343841341 kick_messages/343697191/hide_author=false&hide_date=true&last_read=343841341 text_messages/343698503/hide_author=false&hide_date=true&last_read=343841341 text_messages/343698601/hide_author=true&hide_date=true&last_read=343841341 text_messages/343698676/hi", 4096) = -1 EPIPE (Broken pipe)
--- SIGPIPE (Broken pipe) @ 0 (0) ---
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
sendto(9, "<13>Dec 29 20:43:11 campfire[17363]: [37s] Generic failure: Errno::EPIPE: Broken pipe", 85, MSG_NOSIGNAL, NULL, 0) = 85
close(15)                               = 0
munmap(0x7f7114948000, 4096)            = 0
close(15)                               = -1 EBADF (Bad file descriptor)

Here’s a basic run down of what you see above: The unicorn process (Rails) opens a socket to our memcache server in staging, and then issues a get command which is broken up into multiple 4096 byte packets. However, instead of receiving a reply from the server, our client socket gets a Broken Pipe error and it’s closed.

“OK…” I thought. I need to reference the memcache protocol specification and make sure this request conforms to the “get” protocol command. Was this request within in the protocol description? Yes.

At this point I knew the client was sending a legitimate request, so I needed to debug the server. Since our application uses multiple instances, I downed all but one instance, and fired up strace (strace -s 64000 -fp 7133).

[pid  7136] epoll_wait(19, {{EPOLLIN, {u32=32, u64=32}}}, 32, 4294967295) = 1
[pid  7136] rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
[pid  7136] read(32, "get enter_messages/343555238/hide_author=false&hide_date=false&last_read=343841341 timestamp_messages/343575826/hide_author=false&hide_date=false&last_read=343841341 enter_messages/343575827/hide_author=false&hide_date=true&last_read=343841341 timestamp_messages/343582929/hide_author=false&hide_date=true&last_read=343841341 enter_messages/343582930/hide_author=false&hide_date=true&last_read=343841341 timestamp_messages/343589192/hide_author=false&hide_date=true&last_read=343841341 kick_messages/343589193/hide_author=false&hide_date=true&last_read=343841341 timestamp_messages/343592436/hide_author=false&hide_date=true&last_read=343841341 enter_messages/343592437/hide_author=false&hide_date=true&last_read=343841341 timestamp_messages/343606646/hide_author=false&hide_date=true&last_read=343841341 enter_messages/343606647/hide_author=false&hide_date=true&last_read=343841341 timestamp_messages/343654455/hide_author=false&hide_date=true&last_read=343841341 upload_messages/343654456/hide_author=false&hide_date=true&last_read=343841341 timestamp_messages/343663288/hide_author=false&hide_date=true&last_read=343841341 text_messages/343663289/hide_author=false&hide_date=true&last_read=343841341 text_messages/343663294/hide_author=true&hide_date=true&last_read=343841341 text_messages/343663298/hide_author=true&hide_date=true&last_read=343841341 text_messages/343663301/hide_author=true&hide_date=true&last_read=343841341 text_messages/343663459/hide_author=false&hide_date=true&last_read=343841341 text_messages/343663763/hide_author=true&hide_date=true&last_read=343841341 text_messages/343663833/hide_author=true&hide_date=true&last_read=343841341 timestamp_messages/343664243/hide_author=false&hide_date=true&last_read=343841341 text_messages/343664244/hide_author=false&hide_date=true&last_read=343841341 text_messages/343664249/hide_author=false&hide_date=true&last_read=343841341 text_messages/343664928/hide_author=true&hide_date=true&last_read=343841341 text_messages/343665538/hide_author=false&hide_date=true&last_read=343841341", 2048) = 2048
[pid  7136] read(32, " text_messages/343665582/hide_author=true&hide_date=true&last_read=343841341 text_messages/343665661/hide_author=false&hide_date=true&last_read=343841341 text_messages/343665730/hide_author=true&hide_date=true&last_read=343841341 timestamp_messages/343668494/hide_author=false&hide_date=true&last_read=343841341 text_messages/343668495/hide_author=false&hide_date=true&last_read=343841341 timestamp_messages/343670646/hide_author=false&hide_date=true&last_read=343841341 kick_messages/343670647/hide_author=false&hide_date=true&last_read=343841341 timestamp_messages/343674727/hide_author=false&hide_date=true&last_read=343841341 upload_messages/343674728/hide_author=false&hide_date=true&last_read=343841341 text_messages/343675113/hide_author=true&hide_date=true&last_read=343841341 text_messages/343675330/hide_author=true&hide_date=true&last_read=343841341 text_messages/343675512/hide_author=true&hide_date=true&last_read=343841341 text_messages/343675662/hide_author=false&hide_date=true&last_read=343841341 text_messages/343675783/hide_author=true&hide_date=true&last_read=343841341 text_messages/343675850/hide_author=false&hide_date=true&last_read=343841341 text_messages/343675926/hide_author=false&hide_date=true&last_read=343841341 text_messages/343675962/hide_author=false&hide_date=true&last_read=343841341 text_messages/343676003/hide_author=true&hide_date=true&last_read=343841341 text_messages/343676419/hide_author=false&hide_date=true&last_read=343841341 text_messages/343676627/hide_author=true&hide_date=true&last_read=343841341 timestamp_messages/343683775/hide_author=false&hide_date=true&last_read=343841341 enter_messages/343683778/hide_author=false&hide_date=true&last_read=343841341 timestamp_messages/343697190/hide_author=false&hide_date=true&last_read=343841341 kick_messages/343697191/hide_author=false&hide_date=true&last_read=343841341 text_messages/343698503/hide_author=false&hide_date=true&last_read=343841341 text_messages/343698601/hide_author=true&hide_date=true&last_read=343841341 text_messages/343698676/hi", 2048) = 2048
[pid  7136] read(32, "de_author=false&hide_date=true&last_read=343841341 text_messages/343698734/hide_author=false&hide_date=true&last_read=343841341 upload_messages/343699275/hide_author=true&hide_date=true&last_read=343841341 text_messages/343699336/hide_author=true&hide_date=true&last_read=343841341 text_messages/343699454/hide_author=true&hide_date=true&last_read=343841341 text_messages/343699578/hide_author=false&hide_date=true&last_read=343841341 text_messages/343699907/hide_author=false&hide_date=true&last_read=343841341 text_messages/343700196/hide_author=true&hide_date=true&last_read=343841341 text_messages/343700303/hide_author=false&hide_date=true&last_read=343841341 timestamp_messages/343700490/hide_author=false&hide_date=true&last_read=343841341 text_messages/343700491/hide_author=false&hide_date=true&last_read=343841341 text_messages/343700960/hide_author=true&hide_date=true&last_read=343841341 text_messages/343701033/hide_author=true&hide_date=true&last_read=343841341 text_messages/343701249/hide_author=true&hide_date=true&last_read=343841341 text_messages/343701495/hide_author=true&hide_date=true&last_read=343841341 paste_messages/343701499/hide_author=true&hide_date=true&last_read=343841341 text_messages/343701536/hide_author=true&hide_date=true&last_read=343841341 text_messages/343701811/hide_author=true&hide_date=true&last_read=343841341 timestamp_messages/343705859/hide_author=false&hide_date=true&last_read=343841341 upload_messages/343705860/hide_author=false&hide_date=true&last_read=343841341 upload_messages/343705927/hide_author=true&hide_date=true&last_read=343841341 timestamp_messages/343707289/hide_author=false&hide_date=true&last_read=343841341 text_messages/343707290/hide_author=false&hide_date=true&last_read=343841341 text_messages/343707541/hide_author=false&hide_date=true&last_read=343841341 text_messages/343707657/hide_author=false&hide_date=true&last_read=343841341 text_messages/343708101/hide_author=false&hide_date=true&last_read=343841341 timestamp_messages/343711297/hide_author=false&hide_date=true&last_read=343841341 upload_messages/343711298/hide_author=false&hide_date=true&last_read=343841341 upload_messages/343711359/hide_author=true&hide_date=true&last_read=343841341 timestamp_messages/343714551/hide_author=false&hide_date=true&last_read=343841341 text_messages/343714552/hide_author=false&hide_date=true&last_read=343841341 text_messages/343714779/hide_author=false&hide_date=true&last_read=343841341 text_messages/343715190/hide_author=true&hide_date=true&last_read=343841341 text_messages/343715303/hide_author=true&hide_date=true&last_read=343841341 text_messages/343715380/hide_author=true&hide_date=true&last_read=343841341 text_messages/343715452/hide_author=true&hide_date=true&last_read=343841341 text_messages/343715558/hide_author=true&hide_date=true&last_read=343841341 text_messages/343716023/hide_author=false&hide_date=true&last_read=343841341 text_messages/343716257/hide_author=false&hide_date=true&last_read=343841341 text_messages/343716298/hide_author=false&hide_date=true&last_read=343841341 text_messages/343716338/hide_author=true&hide_date=true&last_read=343841341 text_messages/343716577/hide_author=false&hide_date=true&last_read=343841341 text_messages/343716745/hide_author=false&hide_date=true&last_read=343841341 text_messages/343716816/hide_author=true&hide_date=true&last_read=343841341 text_messages/343716900/hide_author=false&hide_date=true&last_read=343841341 text_messages/343716999/hide_author=false&hide_date=true&last_read=343841341 timestamp_messages/343717255/hide_author=false&hide_date=true&last_read=343841341 text_messages/343717256/hide_author=false&hide_date=true&last_read=343841341 text_messages/343717265/hide_author=false&hide_date=true&last_read=343841341 text_messages/343717555/hide_author=true&hide_date=true&last_read=343841341 text_messages/343717960/hide_author=true&hide_date=true&last_read=343841341 text_messages/343718088/hide_author=false&hide_date=true&last_read=343841341 text_messages/343718097/hide_author=false&hide_date=true&last_read=343841341 text_messages/34371819", 4096) = 4096
[pid  7136] rt_sigprocmask(SIG_BLOCK, [RT_27 RT_28], [], 8) = 0
[pid  7136] read(32, "2/hide_author=false&hide_date=true&last_read=343841341 text_messages/343718248/hide_author=false&hide_date=true&last_read=343841341 text_messages/343718670/hide_author=false&hide_date=true&last_read=343841341 text_messages/343718779/hide_author=true&hide_date=true&last_read=343841341 text_messages/343718955/hide_author=false&hide_date=true&last_read=343841341 timestamp_messages/343725746/hide_author=false&hide_date=true&last_read=343841341 text_messages/343725747/hide_author=false&hide_date=true&last_read=343841341 text_messages/343725757/hide_author=true&hide_date=true&last_read=343841341 text_messages/343725762/hide_author=true&hide_date=true&last_read=343841341 timestamp_messages/343729823/hide_author=false&hide_date=true&last_read=343841341 text_messages/343729824/hide_author=false&hide_date=true&last_read=343841341 timestamp_messages/343758537/hide_author=false&hide_date=true&last_read=343841341 enter_messages/343758538/hide_author=false&hide_date=true&last_read=343841341 timestamp_messages/343764428/hide_author=false&hide_date=true&last_read=343841341 text_messages/343764429/hide_author=false&hide_date=true&last_read=343841341 text_messages/343764474/hide_author=true&hide_date=true&last_read=343841341 text_messages/343764528/hide_author=true&hide_date=true&last_read=343841341 text_messages/343764620/hide_author=true&hide_date=true&last_read=343841341 timestamp_messages/343773529/hide_author=false&hide_date=true&last_read=343841341 text_messages/343773530/hide_author=false&hide_date=true&last_read=343841341 timestamp_messages/343783468/hide_author=false&hide_date=true&last_read=343841341 text_messages/343783469/hide_author=false&hide_date=true&last_read=343841341 upload_messages/343783615/hide_author=true&hide_date=true&last_read=343841341 text_messages/343783650/hide_author=true&hide_date=true&last_read=343841341 text_messages/343783875/hide_author=true&hide_date=true&last_read=343841341 text_messages/343784004/hide_author=true&hide_date=true&last_read=343841341 text_messages/343784277/hide_author=false&hide_date=true&last_read=343841341 text_messages/343784372/hide_author=false&hide_date=true&last_read=343841341 timestamp_messages/343784578/hide_author=false&hide_date=true&last_read=343841341 text_messages/343784579/hide_author=false&hide_date=true&last_read=343841341 timestamp_messages/343802940/hide_author=false&hide_date=true&last_read=343841341 text_messages/343802941/hide_author=false&hide_date=true&last_read=343841341 paste_messages/343802954/hide_author=true&hide_date=true&last_read=343841341 text_messages/343803300/hide_author=true&hide_date=true&last_read=343841341 timestamp_messages/343812622/hide_author=false&hide_date=true&last_read=343841341 text_messages/343812623/hide_author=false&hide_date=true&last_read=343841341 text_messages/343813469/hide_author=false&hide_date=true&last_read=343841341 paste_messages/343813500/hide_author=true&hide_date=true&last_read=343841341 text_messages/343813695/hide_author=true&hide_date=true&last_read=343841341 text_messages/343813765/hide_author=false&hide_date=true&last_read=343841341 text_messages/343814066/hide_author=true&hide_date=true&last_read=343841341 enter_messages/343814107/hide_author=false&hide_date=true&last_read=343841341 upload_messages/343814360/hide_author=false&hide_date=true&last_read=343841341 text_messages/343814556/hide_author=false&hide_date=true&last_read=343841341 timestamp_messages/343814680/hide_author=false&hide_date=true&last_read=343841341 text_messages/343814681/hide_author=false&hide_date=true&last_read=343841341 text_messages/343815559/hide_author=false&hide_date=true&last_read=343841341 text_messages/343815889/hide_author=true&hide_date=true&last_read=343841341 paste_messages/343815893/hide_author=false&hide_date=true&last_read=343841341 text_messages/343815915/hide_author=true&hide_date=true&last_read=343841341 text_messages/343815953/hide_author=true&hide_date=true&last_read=343841341 text_messages/343816359/hide_author=false&hide_date=true&last_read=343841341 text_messages/343816657/hide_author=true&hide_date=true&last_read=343841341 paste_messages/343816830/hide_author=false&hide_date=true&last_read=343841341 text_messages/343817036/hide_author=true&hide_date=true&last_read=343841341 paste_messages/343817042/hide_author=true&hide_date=true&last_read=343841341 text_messages/343817189/hide_author=false&hide_date=true&last_read=343841341 text_messages/343817229/hide_author=true&hide_date=true&last_read=343841341 text_messages/343817955/hide_author=false&hide_date=true&last_read=343841341 timestamp_messages/343818321/hide_author=false&hide_date=true&last_read=343841341 text_messages/343818322/hide_author=false&hide_date=true&last_read=343841341 text_messages/343819079/hide_author=false&hide_date=true&last_read=343841341 timestamp_messages/343825796/hide_author=false&hide_date=true&last_read=343841341 text_messages/343825797/hide_author=false&hide_date=true&last_read=343841341 text_messages/343825802/hide_author=true&hide_date=true&last_read=343841341 text_messages/343825809/hide_author=true&hide_date=true&last_read=343841341 text_messages/343826383/hide_author=false&hide_date=true&last_read=343841341 paste_messages/343826403/hide_author=true&hide_date=true&last_read=343841341 text_messages/343826708/hide_author=true&hide_date=true&last_read=343841341 paste_messages/343826722/hide_author=true&hide_date=true&last_read=343841341 text_messages/343827244/hide_author=true&hide_date=true&last_read=343841341 text_messages/343827298/hide_author=true&hide_date=true&last_read=343841341 text_messages/343827299/hide_author=false&hide_date=true&last_read=343841341 timestamp_messages/343830049/hide_author=false&hide_date=true&last_read=343841341 text_messages/343830050/hide_author=false&hide_date=true&last_read=343841341 text_messages/343830187/hide_author=false&hide_date=true&last_read=343841341 timestamp_messages/343831000/hide_author=false&hide_date=true&last_read=343841341 text_messages/343831001/hide_author=false&hide_date=true&last_read=343841341 text_messages/343831123/hide_author=true&hide_date=true&last_read=343841341 text_messages/343831211/hide_author=true&hide_date=true&last_read=343841341 text_messages/343831487/hide_author=false&hide_date=true&last_read=343841341 paste_messages/343831492/hide_author=true&hide_date=true&last_read=343841341 text_messages/343831675/hide_author=false&hide_date=true&last_read=343841341 text_messages/343831725/hide_author=true&hide_date=true&last_read=343841341 text_messages/343831788/hide_author=true&hide_date=true&last_read=343841341 text_messages/343831979/hide_author=false&hide_date=true&last_read=343841341 text_messages/343833552/hide_author=false&hide_date=true&last_read=343841341 timestamp_messages/343834848/hide_author=false&hide_date=true&last_read=343841341 paste_messages/343834849/hide_author=false&hide_date=true&last_read=343841341 text_messages/343834896/hide_author=true&hide_date=true&last_read=343841341 text_messages/343835031/hide_author=false&hide_date=true&last_read=343841341 text_messages/343836063/hide_author=true&hide_date=true&last_read=343841341 timestamp_messages/343840446/hide_author=false&hide_date=true&last_read=343841341 kick_messages/343840447/hide_author=false&hide_date=true&last_read=343841341 text_messages/343840829/hide_author=false&hide_date=true&last_read=343841341 text_messages/343840885/hide_author=false&hide_date=true&last_read=343841341 text_messages/343841050/hide_author=true&hide_date=true&last_read=343841341 paste_messages/343841076/hide_author=true&hide_date=true&last_read=343841341 text_messages/343841236/hide_author=false&hide_date=true&last_read=343841341 text_messages/343841341/hide_author=true&hide_date=true&last_read=343841341 text_messages/343841351/hide_author=false&hide_date=true&last_read=343841341 text_messages/343841428/hide_author=true&hide_date=true&last_read=343841341 text_messages/343841582/hide_author=true&hide_date=true&last_read=343841341 text_messages/343841603/hide_author=false&hide_date=true&last_read=343841341 text_messages/343841692/hide_author=true&hide_date=true&last_read=343841341 text_messages/343841855/hide_author=true&hide_date=true&last_read=343841341", 8192) = 8192
[pid  7136] rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
[pid  7136] rt_sigprocmask(SIG_BLOCK, [RT_27 RT_28], [], 8) = 0
[pid  7136] read(32, " text_messages/343841924/hide_author=false&hide_date=true&last_read=343841341 text_messages/343841969/hide_author=false&hide_date=true&last_read=343841341 text_messages/343842042/hide_author=false&hide_date=true&last_read=343841341 text_messages/343842523/hide_author=true&hide_date=true&last_read=343841341 text_messages/343842910/hide_author=false&hide_date=true&last_read=343841341 text_messages/343843039/hide_author=true&hide_date=true&last_read=343841341 text_messages/343843171/hide_author=false&hide_date=true&last_read=343841341 text_messages/343843284/hide_author=false&hide_date=true&last_read=343841341 timestamp_messages/343843427/hide_author=false&hide_date=true&last_read=343841341 text_messages/343843428/hide_author=false&hide_date=true&last_read=343841341 timestamp_messages/343843476/hide_author=false&hide_date=true&last_read=343841341 text_messages/343843477/hide_author=false&hide_date=true&last_read=343841341 text_messages/343843547/hide_author=false&hide_date=true&last_read=343841341 text_messages/343843686/hide_author=false&hide_date=true&last_read=343841341 paste_messages/343843878/hide_author=true&hide_date=true&last_read=343841341 text_messages/343843929/hide_author=false&hide_date=true&last_read=343841341 text_messages/343844070/hide_author=false&hide_date=true&last_read=343841341 text_messages/343844547/hide_author=false&hide_date=true&last_read=343841341 text_messages/343844842/hide_author=true&hide_date=true&last_read=343841341 text_messages/343844983/hide_author=true&hide_date=true&last_read=343841341 timestamp_messages/343847757/hide_author=false&hide_date=true&last_read=343841341 text_messages/343847758/hide_author=false&hide_date=true&last_read=343841341 text_messages/343848887/hide_author=true&hide_date=true&last_read=343841341 text_messages/343849266/hide_author=false&hide_date=true&last_read=343841341 text_messages/343849330/hide_author=false&hide_date=true&last_read=343841341 timestamp_messages/343850692/hide_author=false&hide_date=true&last_read=343841341 text_messages/343850693/hide_author=false&hide_date=true&last_read=343841341 text_messages/343850994/hide_author=false&hide_date=true&last_read=343841341 paste_messages/343851014/hide_author=true&hide_date=true&last_read=343841341 text_messages/343851040/hide_author=false&hide_date=true&last_read=343841341 text_messages/343851100/hide_author=true&hide_date=true&last_read=343841341 text_messages/343851171/hide_author=false&hide_date=true&last_read=343841341 text_messages/343852048/hide_author=false&hide_date=true&last_read=343841341 paste_messages/343852052/hide_author=true&hide_date=true&last_read=343841341 text_messages/343852107/hide_author=true&hide_date=true&last_read=343841341 text_messages/343852129/hide_author=true&hide_date=true&last_read=343841341 text_messages/343852168/hide_author=true&hide_date=true&last_read=343841341 text_messages/343852282/hide_author=false&hide_date=true&last_read=343841341 text_messages/343852487/hide_author=false&hide_date=true&last_read=343841341 text_messages/343852631/hide_author=true&hide_date=true&last_read=343841341 text_messages/343852765/hide_author=false&hide_date=true&last_read=343841341 timestamp_messages/343855843/hide_author=false&hide_date=true&last_read=343841341 text_messages/343855844/hide_author=false&hide_date=true&last_read=343841341 timestamp_messages/343856603/hide_author=false&hide_date=true&last_read=343841341 text_messages/343856604/hide_author=false&hide_date=true&last_read=343841341 text_messages/343856802/hide_author=false&hide_date=true&last_read=343841341 text_messages/343856970/hide_author=false&hide_date=true&last_read=343841341 text_messages/343857838/hide_author=false&hide_date=true&last_read=343841341 text_messages/343858343/hide_author=false&hide_date=true&last_read=343841341 text_messages/343858380/hide_author=false&hide_date=true&last_read=343841341 text_messages/343858474/hide_author=true&hide_date=true&last_read=343841341 text_messages/343858544/hide_author=true&hide_date=true&last_read=343841341 text_messages/343858596/hide_author=true&hide_date=true&last_read=343841341 timestamp_messages/343863819/hide_author=false&hide_date=true&last_read=343841341 text_messages/343863820/hide_author=false&hide_date=true&last_read=343841341 paste_messages/343863844/hide_author=true&hide_date=true&last_read=343841341 text_messages/343864027/hide_author=true&hide_date=true&last_read=343841341 text_messages/343864037/hide_author=false&hide_date=true&last_read=343841341 text_messages/343864071/hide_author=false&hide_date=true&last_read=343841341 text_messages/343864588/hide_author=false&hide_date=true&last_read=343841341 text_messages/343864630/hide_author=true&hide_date=true&last_read=343841341 paste_messages/343864755/hide_author=false&hide_date=true&last_read=343841341 text_messages/343865034/hide_author=true&hide_date=true&last_read=343841341 timestamp_messages/343866397/hide_author=false&hide_date=true&last_read=343841341 text_messages/343866398/hide_author=false&hide_date=true&last_read=343841341 text_messages/343866591/hide_author=true&hide_date=true&last_read=343841341 paste_messages/343866596/hide_author=true&hide_date=true&last_read=343841341 text_messages/343867002/hide_author=true&hide_date=true&last_read=343841341 text_messages/343867646/hide_author=false&hide_date=true&last_read=343841341 text_messages/343868189/hide_author=false&hide_date=true&last_read=343841341 text_messages/343868443/hide_author=false&hide_date=true&last_read=343841341 text_messages/343868603/hide_author=false&hide_date=true&last_read=343841341 text_messages/343868688/hide_author=true&hide_date=true&last_read=343841341 timestamp_messages/343868946/hide_author=false&hide_date=true&last_read=343841341 paste_messages/343868947/hide_author=false&hide_date=true&last_read=343841341 paste_messages/343869494/hide_author=false&hide_date=true&last_read=343841341 text_messages/343869517/hide_author=true&hide_date=true&last_read=343841341 timestamp_messages/343875073/hide_author=false&hide_date=true&last_read=343841341 text_messages/343875074/hide_author=false&hide_date=true&last_read=343841341 paste_messages/343875398/hide_author=true&hide_date=true&last_read=343841341 text_messages/343875689/hide_author=false&hide_date=true&last_read=343841341 timestamp_messages/343877689/hide_author=false&hide_date=true&last_read=343841341 text_messages/343877690/hide_author=false&hide_date=true&last_read=343841341 text_messages/343878129/hide_author=false&hide_date=true&last_read=343841341 text_messages/343878477/hide_author=false&hide_date=true&last_read=343841341 text_messages/343878485/hide_author=true&hide_date=true&last_read=343841341 text_messages/343878491/hide_author=true&hide_date=true&last_read=343841341 text_messages/343878499/hide_author=false&hide_date=true&last_read=343841341 text_messages/343878588/hide_author=true&hide_date=true&last_read=343841341 timestamp_messages/343880810/hide_author=false&hide_date=true&last_read=343841341 text_messages/343880811/hide_author=false&hide_date=true&last_read=343841341 paste_messages/343880830/hide_author=true&hide_date=true&last_read=343841341 text_messages/343880919/hide_author=false&hide_date=true&last_read=343841341 text_messages/343880994/hide_author=false&hide_date=true&last_read=343841341 timestamp_messages/343885259/hide_author=false&hide_date=true&last_read=343841341 text_messages/343885260/hide_author=false&hide_date=true&last_read=343841341 text_messages/343885374/hide_author=true&hide_date=true&last_read=343841341 text_messages/343885436/hide_author=true&hide_date=true&last_read=343841341 text_messages/343885452/hide_author=true&hide_date=true&last_read=343841341 enter_messages/343885679/hide_author=false&hide_date=true&last_read=343841341 text_messages/343885711/hide_author=false&hide_date=true&last_read=343841341 text_messages/343885723/hide_author=true&hide_date=true&last_read=343841341 text_messages/343885744/hide_author=true&hide_date=true&last_read=343841341 text_messages/343885878/hide_author=true&hide_date=true&last_read=343841341 text_messages/343886065/hide_author=true&hide_date=true&last_read=343841341 timestamp_messages/343886343/hide_author=false&hide_date=true&last_read=343841341 text_messages/343886344/hide_author=false&hide_date=true&last_read=343841341 text_messages/343888444/hide_author=false&hide_date=true&last_read=343841341 timestamp_messages/343890769/hide_author=false&hide_date=true&last_read=343841341 text_messages/343890770/hide_author=false&hide_date=true&last_read=343841341 timestamp_messages/343894891/hide_author=false&hide_date=true&last_read=343841341 text_messages/343894892/hide_author=false&hide_date=true&last_read=343841341 text_messages/343895056/hide_author=false&hide_date=true&last_read=343841341 text_messages/343895130/hide_author=true&hide_date=true&last_read=343841341 text_messages/343895152/hide_author=true&hide_date=true&last_read=343841341 text_messages/343895269/hide_author=true&hide_date=true&last_read=343841341 text_messages/343895393/hide_author=true&hide_date=true&last_read=343841341 text_messages/343895542/hide_author=true&hide_date=true&last_read=343841341 timestamp_messages/343902958/hide_author=false&hide_date=true&last_read=343841341 text_messages/343902959/hide_author=false&hide_date=true&last_read=343841341 text_messages/343903053/hide_author=true&hide_date=true&last_read=343841341 text_messages/343903267/hide_author=false&hide_date=true&last_read=343841341 text_messages/343903512/hide_author=false&hide_date=true&last_read=343841341 text_messages/343903628/hide_author=false&hide_date=true&last_read=343841341 text_messages/343903725/hide_author=true&hide_date=true&last_read=343841341 text_messages/343903864/hide_author=true&hide_date=true&last_read=343841341 text_messages/343903882/hide_author=false&hide_date=true&last_read=343841341 text_messages/343903968/hide_author=false&hide_date=true&last_read=343841341 text_messages/343904055/hide_author=false&hide_date=true&last_read=343841341 text_messages/343904090/hide_author=true&hide_date=true&last_read=343841341 text_messages/343904553/hide_author=false&hide_date=true&last_read=343841341 text_messages/343904624/hide_author=true&hide_date=true&last_read=343841341 text_messages/343904678/hide_author=true&hide_date=true&last_read=343841341 timestamp_messages/343908400/hide_author=false&hide_date=true&last_read=343841341 text_messages/343908401/hide_author=false&hide_date=true&last_read=343841341 text_messages/343908458/hide_author=false&hide_date=true&last_read=343841341 text_messages/343908510/hide_author=false&hide_date=true&last_read=343841341 text_messages/343908566/hide_author=true&hide_date=true&last_read=343841341 text_messages/343908629/hide_author=true&hide_date=true&last_read=343841341 timestamp_messages/343910155/hide_author=false&hide_date=true&last_read=343841341 text_messages/343910156/hide_author=false&hide_date=true&last_read=343841341 text_messages/343910180/hide_author=true&hide_date=true&last_read=343841341 text_messages/343910261/hide_author=true&hide_date=true&last_read=343841341 text_messages/343910355/hide_author=true&hide_date=true&last_read=343841341 text_messages/343910536/hide_author=true&hide_date=true&last_read=343841341 text_messages/343910565/hide_author=true&hide_date=true&last_read=343841341 timestamp_messages/343912945/hide_author=false&hide_date=true&last_read=343841341 text_messages/343912946/hide_author=false&hide_date=true&last_read=343841341 text_messages/343912955/hide_author=true&hide_date=true&last_read=343841341 timestamp_messages/343917638/hide_author=false&hide_date=true&last_read=343841341 kick_messages/343917639/hide_author=false&hide_date=true&last_read=343841341 text_messages/343919040/hide_author=false&hide_date=true&last_read=343841341 timestamp_messages/343928070/hide_author=false&hide_date=true&last_read=343841341 text_messages/343928071/hide_author=false&hide_date=true&last_read=343841341 timestamp_messages/343944056/hide_author=false&hide_date=true&last_read=343841341 text_messages/343944057/hide_author=false&hide_date=true&last_read=343841341 timestamp_messages/343946376/hide_author=false&hide_date=true&last_read=343841341 text_messages/343946377/hide_author=false&hide_date=true&last_read=343841341 text_messages/343946786/hide_author=true&hide_date=true&last_read=343841341 text_messages/343947443/hide_author=true&hide_date=true&last_read=343841341 text_messages/343948263/hide_author=true&hide_date=true&last_read=343841341 timestamp_messages/343951411/hide_author=false&hide_date=true&last_read=343841341 kick_messages/343951412/hide_author=false&hide_date=true&last_read=343841341 timestamp_messages/343958275/hide_author=false&hide_date=true&last_read=343841341 text_messages/343958276/hide_author=false&hide_date=true&last_read=343841341 paste_messages/343958310/hide_author=true&hide_date=true&last_read=343841341 text_messages/343959039/hide_author=false&hide_date=true&last_read=343841341 text_messages/343959484/hide_author=true&hide_date=true&last_read=343841341 timestamp_messages/343961703/hide_author=false&hide_date=true&last_read=343841341 text_messages/343961704/hide_author=false&hide_date=true&last_read=343841341 timestamp_messages/343972558/hide_author=false&hide_date=true&last_read=343841341 kick_messages/343972559/hide_author=false&hide_date=true&last_read=343841341 timestamp_messages/343976120/hide_author=false&hide_date=true&last_read=343841341 enter_messages/343976121/hide_author=false&hide_date=true&last_read=343841341 timestamp_messages/343982780/hide_author=false&hide_date=true&last_read=343841341 enter_messages/343982781/hide_author=false&hide_date=true&last_read=343841341 text_messages/343982834/hide_author=false&hide_date=true&last_read=343841341 timestamp_messages/343984059/hide_author=false&hide_date=true&last_read=343841341 kick_messages/343984060/hide_author=false&hide_date=true&last_read=343841341 timestamp_messages/343987418/hide_author=false&hide_date=true&last_read=343841341 enter_messages/343987419/hide_author=false&hide_date=true&last_read=343841341 timestamp_messages/343997513/hide_author=false&hide_date=true&last_read=343841341 kick_messages/343997514/hide_author=false&hide_date=true&last_read=343841341 kick_messages/343997666/hide_author=false&hide_date=true&last_read=343841341 timestamp_messages/344008833/hide_author=false&hide_date=true&last_read=343841341 enter_messages/344008834/hide_author=false&hide_date=true&last_read=343841341 timestamp_messages/344009870/hide_author=false&hide_date=true&last_read=343841341 text_messages/344009871/hide_author=false&hide_date=true&last_read=343841341 text_messages/344010043/hide_author=false&hide_date=true&last_read=343841341 upload_messages/344010066/hide_author=true&hide_date=true&last_read=343841341 upload_messages/344010083/hide_author=true&hide_date=true&last_read=343841341 text_messages/344010323/hide_author=true&hide_date=true&last_read=343841341 timestamp_messages/344015438/hide_author=false&hide_date=true&last_read=343841341 kick_messages/344015439/hide_author=false&hide_date=true&last_read=343841341 timestamp_messages/344018437/hide_author=false&hide_date=true&last_read=343841341 kick_messages/344018438/hide_author=false&hide_date=true&last_read=343841341 timestamp_messages/344033626/hide_author=false&hide_date=true&last_read=343841341 enter_messages/344033627/hide_author=false&hide_date=true&last_read=343841341 text_messages/344033767/hide_author=false&hide_date=true&last_read=343841341 text_messages/344033769/hide_author=true&hide_date=true&last_read=343841341 text_messages/344033772/hide_author=true&hide_date=true&last_read=343841341 text_messages/344033774/hide_author=true&hide_date=true&last_read=343841341 timestamp_messages/344041143/hide_author=false&hide_date=true&last_read=343841341 upload_messages/344041144/hide_author=false&hide_date=true&last_read=343841341 text_messages/344041191/hide_author=true&hide_date=true&last_read=343841341 text_messages/344041227/hide_author=true&hide_date=true&last_read=343841341 timestamp_messages/344041414/hide_author=false&hide_date=true&last_read=343841341 text_messages/344041415/hide_author=false&hide_date=true&last_read=343841341 text_messages/344041637/hide_author=true&hide_date=true&last_read=343841341 text_messages/344041654/hide_author=true&hide_date=true&last_read=343841341 timestamp_messages/344052470/hide_author=false&hide_date=true", 16384) = 16384
[pid  7136] rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
[pid  7136] --- SIGRT_27 (Real-time signal 25) @ 0 (0) ---
[pid  7136] getrusage(RUSAGE_SELF, {ru_utime={12, 90000}, ru_stime={13, 600000}, ...}) = 0
[pid  7136] rt_sigreturn(0x4b54187fd1996) = 0
[pid  7136] epoll_ctl(19, EPOLL_CTL_DEL, 32, {EPOLLIN, {u32=32, u64=32}}) = 0
[pid  7136] close(32)                   = 0
[pid  7136] listen(28, 1024)            = 0
[pid  7136] listen(27, 1024)            = 0
[pid  7136] brk(0)                      = 0x1c52000
[pid  7136] brk(0)                      = 0x1c52000
[pid  7136] brk(0)                      = 0x1c52000
[pid  7136] brk(0x1c4b000)              = 0x1c4b000
[pid  7136] brk(0)                      = 0x1c4b000
[pid  7136] brk(0)                      = 0x1c4b000
[pid  7136] brk(0)                      = 0x1c4b000
[pid  7136] brk(0)                      = 0x1c4b000
[pid  7136] brk(0x1c49000)              = 0x1c49000
[pid  7136] brk(0)                      = 0x1c49000
[pid  7136] rt_sigprocmask(SIG_BLOCK, [RT_27 RT_28], [], 8) = 0
[pid  7136] epoll_wait(19,  <unfinished ...>
[pid  7133] <... epoll_wait resumed> {}, 32, 85) = 0
[pid  7133] rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
[pid  7133] --- SIGRT_27 (Real-time signal 25) @ 0 (0) ---
[pid  7133] getrusage(RUSAGE_SELF, {ru_utime={12, 90000}, ru_stime={13, 600000}, ...}) = 0
[pid  7133] timer_settime(0x104, 0, {it_interval={0, 12812000}, it_value={0, 12812000}}, NULL) = 0
[pid  7133] rt_sigreturn(0x4b54187fe0371) = 0
[pid  7133] rt_sigprocmask(SIG_BLOCK, [RT_27 RT_28], [], 8) = 0
[pid  7133] epoll_wait(3, {}, 32, 1)    = 0
[pid  7133] rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
[pid  7133] rt_sigprocmask(SIG_BLOCK, [RT_27 RT_28], [], 8) = 0
[pid  7133] epoll_wait(3, {}, 32, 1000) = 0
[pid  7133] rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
[pid  7133] --- SIGRT_27 (Real-time signal 25) @ 0 (0) ---
[pid  7133] getrusage(RUSAGE_SELF, {ru_utime={12, 90000}, ru_stime={13, 600000}, ...}) = 0
[pid  7133] timer_settime(0x104, 0, {it_interval={0, 12787000}, it_value={0, 12787000}}, NULL) = 0
[pid  7133] rt_sigreturn(0x4b541880d6c8e) = 0

(Strace Tip: I used -f to follow the forked processes, and -s 64000 to make sure I got enough string output to see the full read/write data sent to the socket.)

You can see that the memcache server reads the first 2048 bytes, then reads another 2048 bytes, then reads another 4096 bytes, then reads 8192 bytes, and finally reads another 16384 bytes. So after the first read, it doubles the amount of data it’s reading for each iteration. However, after the fourth read, the epoll call returns 0, and the client’s connection is closed.

Identifying the Root Cause

In this case, I knew Jeremy had worked on this bug before and I thought he would understand the all the data I had gathered. So with Jeremy’s help, I started looking at the memcache code on Github. Within minutes Jeremy pinged me saying he had noticed this code that’s designed to nuke junk connections. Which makes sense, but also doesn’t help us here!

This code path does exactly what I had seen in the strace above: It reads from buffer 4 times, while doubling the amount read each time. After 4 times, it stops reading and returns.

Conclusion and Resolution

Although memcache’s behavior is by design, it’s not compatible with the expected behavior of the protocol. In the issue I opened, I recommended a configuration option to set the amount of data to read, or the number of times to attempt reading more data. I’m not sure if this is a realistic solution though.

Hypothetically we could work around this by increasing the initial buffer size and recompiling memcache so that in 4 iterations the server would receive all of the get command. We could also patch the memcache-client multi_get method to break the multi_get requests into multiple requests with fewer keys. For now we’ve left our patched client in place, with hope that the memcache team will be able to get this resolved soon.

As I said previously, the goal is to identify a root cause. And, with Jeremy’s help, I’ve done just that. It feels great to have a complete understanding of this exception, and where the break down is in our stack.

We now know, we don’t have to be afraid of upgrading memcache with a fear of causing more multi_get problems. We also know that this is a legitimate problem that’s causing a significant performance penalty in our application. (Since large multi_gets fail, that means we aren’t getting the data back from memcache and hitting the database more than we need to.) Finally, for me, it shows that I don’t have to be bothered by debugging with strace. I’m no C code expert, but even with my limited understanding, strace provided enough clues for me to follow this all the way to the source in less than one hour. And since you all have an extra hour to spare, I hope this encourages you to take a look at that problem that’s been lingering in your stack.