A few months ago, Microsoft announced its commitment to Apache Hadoop™ and provided details on interoperability between SQL Server and Hadoop. As we have noted in the past, businesses facing the data deluge increasingly need to store and analyze vast amounts of unstructured data, including data from sensors, devices, bots and crawlers, and this volume is predicted to grow exponentially over the next decade. Customers have been asking us to help store, manage, and analyze these new types of data, in particular data stored in Hadoop environments.
During Ted Kummert’s Day 1 keynote at SQL Server PASS Summit 2011, we disclosed an end-to-end roadmap for Big Data that embraces Apache Hadoop™.
To deliver on this roadmap, we announced:
- The general availability (GA), or release to manufacturing, of the Hadoop connector for SQL Server and the Hadoop connector for SQL Server Parallel Data Warehouse, free to licensed SQL Server and PDW customers. These connectors enable bi-directional data movement between SQL Server and Hadoop, so customers can work effectively with both structured and unstructured data.
- Plans to deliver a Hadoop-based distribution for Windows Server and a Hadoop-based service for Windows Azure. By enabling organizations to deploy Hadoop-based big data analytic solutions in hybrid IT scenarios (on premises, in the cloud, or both), we give customers the flexibility to process data wherever it is born and wherever it lives. Both distributions will offer a simplified acquisition, installation and configuration experience for several Hadoop-based technologies such as HDFS, Hive and Pig; enhanced security through integration with Active Directory; unified management through integration with System Center; and a familiar and productive development platform through integration with Visual Studio and .NET, all optimized to provide best-in-class performance in Windows environments.
- Plans to integrate Hadoop with Microsoft’s industry-leading Business Intelligence platform, so that users can analyze Hadoop datasets in an immersive and interactive way with familiar productivity tools such as Microsoft Excel and award-winning BI clients such as PowerPivot for Excel and Power View. Our first set of deliverables here will include a Hive ODBC Driver and a Hive Add-in for Excel.
- A strategic partnership with Hortonworks that lets us build on the experience and expertise of the Hadoop ecosystem to make Hadoop run well on Windows Server and Windows Azure. Hortonworks was formed in June 2011 by the key architects and core Hadoop committers from the Yahoo! Hadoop software engineering team, and the team is a major driving force behind the next generation of Apache Hadoop.
Building on our leading Business Intelligence and Data Warehousing platform, we are extending our mission to ‘provide business insight to all users, not only from the structured and unstructured data that exists in databases and data warehouses today, but also from non-traditional data sources, such as file systems, that include large volumes of data that have not previously been activated to provide new business value.’
Microsoft hopes to deliver on this mission by making Hadoop accessible to a broader class of developers, IT professionals and end users, by providing enterprise-class Hadoop-based distributions on Windows, and by enabling all users to derive breakthrough insights from any data.
Exciting times ahead! We hope you join us for this ride!
For more information on Microsoft’s Big Data solution, visit microsoft.com/bigdata.