You’ve probably seen this diagram before, or a variation on it: data science happens at the conjunction of statistics, domain knowledge and computer science. Sometimes the domain knowledge part is swapped for data visualisation, but it still works.
I came to data science from statistics – my first job was as an econometrician – and I hope 20 years in and around marketing has given me enough domain knowledge to get by. Computer science though? Yeah, about that… I started with self-taught VBA macros, followed by a crash course in naming conventions and code structure from a friendly but exasperated IT department, who had been asked to rebuild one of my early efforts.
I don’t have a background in good development practices. But I’m trying. And I’ve started to understand where the gaps are.
There are lots of us R developers, certainly in my company, who have two-thirds of the data science triangle but badly need the last part. Get into Shiny and you’ll quickly get yourself into trouble, building a fragile tower of difficult-to-maintain app code. So I tweeted this recently, and helpful people replied with some fantastic resources. I thought I’d capture them here in case they’re helpful for others too.
Isn’t the #rstats hashtag brilliant?