Latest Industry News

5 steps to a modern data architecture

Modern data systems still mainly process data in batch. The next stage is to move to “real time” technologies and make the entire company operate on “events” instead of by the year, the quarter, and the month. We have all become accustomed to batch processing by calendar period. […]


Apache Storm 1.0 packs a punch

When big data mavens debate the merits of using Apache Spark versus Apache Storm for streaming data processing, the argument usually sounds like this: Sure, Storm has great scale and speed, but it’s hard to use. Plus, it’s slowly being overtaken by Spark, so why go with old and busted when there’s new […]


Machine learning’s biggest job

When Satya Nadella made machine learning the centerpiece of the Microsoft Build conference, I think it became official: 2016 is the year of machine learning. All the major clouds now have (or will soon have) machine learning APIs. In fact, InfoWorld’s Martin Heller has already reviewed the machine learning services offered by […]


Get started with Apache Spark

Apache Spark is an open source cluster computing framework for batch and stream processing. The framework originated at the AMPLab at UC Berkeley in 2009, became an Apache project in 2013, and emerged as one of the organization’s top priorities in 2014. It is currently supported by Databricks, which was founded […]
