How and why do some emerging open source projects become winners, while others are relegated to the dustbin of history?
The big data open source movement is a mosaic, ranging from mainstream technologies like HDFS and MapReduce, to maturing technologies like Spark and Storm, to bleeding-edge projects. And there’s plenty to choose from: GitHub currently has over 9 million users collaborating across over 20 million repositories! So why do certain platforms bubble up into winners and earn mindshare from the open source community? Is it Darwinian survival of the fittest, or are there other forces at play?
This session will discuss the impact of third-party companies and the open source ecosystem on the growth and maturation of platforms such as Spark, Storm, YARN, HBase, MongoDB, Presto, Phoenix, and more. The presentation will examine today’s big data ecosystem, with an emphasis on the characteristics of winners vs. losers, based on hands-on experience, benchmarking, and ecosystem support. Which open source projects are overhyped? Which ones are risky bets? And how can we predict what’s next in big data?