I’m going to give away all of my secret sauce, and tell you the three things that are required to be successful at business analytics (or data science, or whatever you want to call it):
1) Know at least a little bit about what you’re talking about.
At its foundation, business analytics is about converting a rather broad question (like “why do people cancel?”) into a set of specific questions that you can answer with data. This takes a bit of knowledge of what you’re talking about—you need the ability to construct hypotheses (e.g., people might cancel because they left the company, or because their corporate IT department rejected it, or…), and you need to know where you can get the answers. The best statistician in the world will be useless if they don’t get the context of the business.
A theoretician might call this “domain knowledge”, but all it really means is knowing what you’re looking for and where to find it. There are no real shortcuts here: it takes exposure to your subject matter to pick this up. Just open up the firehose: read as much as you can about your business and industry, stay involved in every conversation that’s even tangentially related, and be patient. Domain expertise will follow.
2) Make it easy.
There’s actually very little about data science that’s technically or intellectually challenging – it’s mostly about execution. Sure, machine learning uses advanced concepts that are worth understanding before you build a predictive model, and a passing understanding of statistics is a big deal, but most business questions can be answered with little more than simple arithmetic—what portion of customers do X? How long does it take people to do Y?
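To make the “simple arithmetic” point concrete, here is a minimal sketch in Python. The customer records, field names, and dates are all made up for illustration; the two helpers just answer “what portion of customers do X?” and “how long does it take people to do Y?”.

```python
from datetime import date
from statistics import median

# Hypothetical sample data: signup date and first-project date
# (None means the customer never created a project).
customers = [
    {"id": 1, "signed_up": date(2011, 8, 1), "first_project": date(2011, 8, 2)},
    {"id": 2, "signed_up": date(2011, 8, 3), "first_project": None},
    {"id": 3, "signed_up": date(2011, 8, 5), "first_project": date(2011, 8, 9)},
    {"id": 4, "signed_up": date(2011, 8, 7), "first_project": date(2011, 8, 7)},
]


def portion_who_did(rows, field):
    """Fraction of rows where `field` is set: what portion of customers do X?"""
    if not rows:
        return 0.0
    return sum(1 for r in rows if r[field] is not None) / len(rows)


def days_until(rows, start_field, end_field):
    """Days from start to end, only for rows where the end event happened."""
    return [
        (r[end_field] - r[start_field]).days
        for r in rows
        if r[end_field] is not None
    ]


portion = portion_who_did(customers, "first_project")
delays = days_until(customers, "signed_up", "first_project")

print(f"{portion:.0%} of customers created a project")
print(f"median days to first project: {median(delays)}")
```

Nothing here goes beyond counting, dividing, and taking a median, which is roughly the level of math most business questions need.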
What then differentiates good data analysts from the rest? The ability to do this simple analysis quickly and easily, with minimal “friction”, so you can do more analysis faster. A few things help with this:
- Set up “one click” access to data. Shell scripts to drop you directly as a read-only user in your database, libraries to get your data from your database into a clean format in your preferred analysis environment, robots, and data exploration / dashboard type tools all help with this. These aren’t rocket science, but being able to get to your data in under ten seconds means you’ll be able to do it more.
- Have common definitions. Know what “active” means for each of users, accounts, etc. Try to pick consistent time periods for how far back you look at data. If you can skip figuring out what someone means, data analysis becomes easier.
- Keep a calculator handy. Seriously, the friction difference between a calculator that you’re familiar with at arm’s reach and having to find one in a drawer, or opening Calculator.app, affects how easily you can do the simple arithmetic.
- Memorize your database schema. This isn’t a joke. You’ll know what you can find from the database alone, and you’ll save a ton of time by not running SHOW TABLES and DESCRIBE TABLE X all the time. At the very least, know what tables are named and generally what the relationship between them is. I’d hazard a guess that if you surveyed the business analytics people at a handful of brand-name internet companies, they could all write out the schemas for their main databases from memory.
3) Look at lots of data.
The only way to know what “normal” is when it comes to your data is to look at it, a lot. Look at graphs of different things over different time periods, review log files, read customer emails—be the most voracious consumer of new information and data that you can. Keep finding that next source of information.
Try to spend an hour a day consuming new data – just reading it in, absorbing it, maybe making some notes. It doesn’t need to be for any specific investigation of the moment, but it will pay dividends down the line. This is very similar to the “Rule of 10,000 hours” and other notions of “Just do, a lot”.
Kris Jenkins
on 08 Sep 11
Learn SQL. Seriously, importing into Excel may be comfortable, but if you really want to open up your data, learn some SQL. It’s easy.
JD
on 08 Sep 11
Having Noah analyze my design tests this way is great. I understand that this may sound “rah rah go team” and completely biased.
Before working with Noah I participated in design tests. The way we implemented and analyzed them left me feeling skeptical of A/B testing’s value. Now working with Noah I truly appreciate testing. These are great tips.
Nirav Mehta
on 08 Sep 11
Appreciate you sharing these insights. Like you suggested, real insights come from a continuing study of a variety of your statistics.
At the same time, I was let down to find no secret sauce here ;-) The title promised three secrets, so I was looking forward to some great insights. But I guess the greatest things are simple!
I’ve been bugged by the analytics for my online business. I wanted to find out how well sales are going, what’s my customer loyalty, what are the trends – basic KPIs. I didn’t find anything out there that would do this so I developed my own! This software has grown well over the last two years and we now even offer it to others. It’s called Putler – and analyzes your PayPal transactions to show you useful stats about your business.
Frankly, it’s doing these analyses that told me that PlannerX, our Basecamp extension, was not doing as well as our activeCollab extensions. So we put more resources on them. And then the trend showed that we do have consistent new sign ups for PlannerX, so we are allocating more resources to PlannerX and considering building new extensions.
Thought about sharing these because I felt your post was beyond a typical entrepreneur’s understanding or needs!
Rich
on 08 Sep 11
Haha! You said, ”...do do…”
(there’s a typo in your post)
B Wagner
on 08 Sep 11
This is a great post! I’ll be sharing this with many associates. I can’t emphasize the importance of the items listed in ‘Secret 2’ enough. The minute you have more than one definition enter the conference room, you’re no longer analyzing data, you’re arguing semantics. And without a solid, easily defined and universally understood database schema, no one trusts analytics’ findings.
Anonymous Coward
on 08 Sep 11
Is this blog still down from taking new comments?
I was getting lots of 500 errors
Alex Anglin
on 08 Sep 11
As someone who works with analytics software on a daily basis, I would add to Noah’s advice: Build a data warehouse using dimensional modeling principles. Dimensional models are based around the notion of making the data available in an easy to use format, focused around business processes or subjects. If you are asking questions which pertain to data from database X and Y and spreadsheet Z, the data should be pre-integrated prior to analysis. Otherwise one runs the risk of integrating between these sources differently, depending on the analyst. This could lead to inconsistent answers to business questions.
Jay Schumacher
on 08 Sep 11
While I agree with all of these points, I think #3 is the most important… Getting to know your data sounds cheesy – but it’s incredibly powerful. As Noah mentions, the more you look at it, the more you will see trends, see patterns, see issues, etc. Further, it will help you potentially learn 1) better ways to ask questions of the data and 2) questions you aren’t asking but should be.
I’ve had this experience 100 times over, learning something new by watching data patterns and realizing that there were even more valuable questions I could be asking.
NL
on 08 Sep 11
@Kris Jenkins – you’re absolutely right. The fewer data handoffs you can make, the easier analysis is (and to Alex’s point, the fewer points to introduce transformation errors). If there’s some analysis you can do entirely in SQL, that will almost always be the fastest way (assuming reasonable indices, hardware, etc.).
To be honest, I don’t even have Excel currently installed on this computer…
Rishi
on 08 Sep 11
Where do you go to consume data on a regular basis?
Equipment appraiser
on 08 Sep 11
What’s the difference between Excel & SQL? Aren’t both programs the same???
Ben Garvey
on 09 Sep 11
Memorize your database schema
procmer
on 09 Sep 11
1. Wish you had not started your post by saying you could not give away any “secret sauce” – seems like the antithesis to 37 philosophy to say that at least – and maybe even if you meant it.
2. I’m not sure there is any “secret sauce” to analysis when you have great products developed by great designers and developers.
3. Is it the great products driving success or the business analysis being done after the fact driving the success? food for thought and analysis
4. enjoyed your post – please give us more of your insight please!
Jamis
on 11 Sep 11
@procmer, I think you misread the start of the post. Noah said, in the first sentence, “I’m going to give away all of my secret sauce”. :)
Filip
on 12 Sep 11
A lot of businesses use statistical methods like data mining, factorial analysis, etc., that help detect new connections. They use software like SAS or SPSS :-)
filip (at) secretarylist.com
procmer
on 12 Sep 11
@Jamis – the post was changed to reflect “i’m going to give away…” – But, that is not what it said at first. It seems seedy to change the post and then make it look like I can not read. Thought I was crazy for a minute until I checked the cache. That’s part of my secret sauce to debunk. Anyways, whatever, more posts from Noah (and from Jamis) please! I really enjoy each and every post.
This discussion is closed.