Last month’s Blendo Data Monthly was all about the first-ever Kafka Summit in San Francisco and our usual data sources.
This month we begin with a question: who would win in Captain America: Civil War? Pure awesomeness! The great people at FiveThirtyEight used data science to give Captain America 4-to-1 odds against winning the Civil War.
Source: MARVEL / AP / FiveThirtyEight
I see dead people: Making simulations of deceased people
#Marketing & Customer Data
The preview release of Apache Spark 2.0 is available now! A lot has changed since Spark 1.0 came out, and Spark 2.0 builds on what the project has learned from the community. Databricks has published a great blog post about the release and also provides a Technical Preview of Apache Spark 2.0.
Source: Databricks
Twitter processes billions of events every day, and analyzing these events in real time is a huge challenge. In the beginning Twitter used Storm, which was open sourced in 2011. In 2015 they introduced Heron, a real-time distributed stream computation system. This month Twitter open sourced Heron under the Apache v2.0 license!
Heron has been powering all of Twitter’s real-time analytics for over two years as a heavyweight real-time stream processing engine, and it is backward compatible with Storm.
+ Martin Kleppmann explores using event streams and Kafka for keeping data in sync across heterogeneous systems. Video of Staying in Sync: from Transactions to Streams at InfoQ.
#Machine Learning / AI
This post is about Reinforcement Learning and its challenges. It ranges from computers learning to win at games like Go or Pong to robots learning to perform complex tasks, and explains what Reinforcement Learning has to do with it. Great post from Andrej Karpathy, a PhD student at Stanford working on Deep Learning.
Data is at its most powerful when it is interconnected. A major challenge for modern data is interconnecting different data types to obtain a fuller picture of the data subject. The idea of Data Trusts, explained.