Information Visualization for Large-Scale Data Workflows

Friday, May 9, 2014 - 5:00 pm

The ability to instrument and interrogate data as it moves through a processing pipeline is fundamental to effective machine learning at scale. Applied in this capacity, information visualization technologies drive product innovation, shorten iteration cycles, reduce uncertainty, and ultimately improve the performance of predictive models. It can be challenging, however, to decide where in a workflow to employ data visualization, and, once committed to doing so, it can be just as daunting to develop revealing visualizations that suggest clear next steps.

In this talk we’ll describe the role that information visualization technologies play in the LinkedIn data science ecosystem, and explore best practices for understanding the structure of large-scale data in a production environment. From hypothesis generation and feature development to model evaluation and tooling, visualization is at the heart of LinkedIn’s machine learning workflows, enabling our data scientists to reason and communicate more effectively. Organized around clear, structured insights based on proven technology and workflow patterns, this talk will help you understand how to apply information visualization to the analytical challenges you encounter every day.

Michael Conover
Senior Data Scientist
LinkedIn

A senior data scientist at LinkedIn, Michael Conover develops machine-learning infrastructure that leverages the relationships and behavior of hundreds of millions of individuals. He has a Ph.D. from Indiana University in complex systems analysis with a focus on information propagation in large-scale social networks. His research interests run towards understanding the political process and the structure of economic opportunity, and his work has appeared in the New York Times, the Wall Street Journal, Science, and MIT Technology Review, as well as on National Public Radio.