Great data scientists are like great chefs: they know that the best results come from having access to the highest-quality ingredients. So just as the best business executives are desperate to hire the best data scientists, data scientists are desperate to hire the best data engineers, the people who provide them with the tools and the data they need to get their jobs done. As important and interdependent as the two functions are, there is still a great deal of misunderstanding about the boundaries between the roles and the different constraints that each operates under. I would like to spend some time describing how I think data engineers and data scientists should be organized within a company, and discussing how you can be the kind of data scientist/engineer that every data engineer/scientist wants to work with.
Data Engineering and Data Science: Bridging the Gap
Josh Wills is the head of data engineering at Slack. Prior to Slack, he built and led data science teams at Cloudera and Google. He is the founder of the Apache Crunch project, co-authored an O'Reilly book on advanced analytics with Apache Spark, and wrote a popular tweet about data scientists. This is the only hat he owns.