This blog post first appeared here.

One of the non-Pentaho side projects I’ve become interested in is Apache Drill. I like all the different aspects of it and hope to contribute in some meaningful way shortly :) As a first step, without touching any code, I thought I’d see if I could configure PDI and Drill to play nicely together. My proof of concept was a single Table Input step in PDI, using a Generic driver to point at my local Drill instance. I was able to query the data sources as described in the Drill wiki.
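As a rough illustration of what such a Generic connection needs, the sketch below assembles the two values a generic JDBC connection typically asks for: a connection URL and a driver class name. It is only a sketch under assumptions, not the configuration used in the original post: it assumes a Drill instance registered with ZooKeeper at `localhost:2181`, the helper names are hypothetical, and the field labels are my reading of PDI's Generic database dialog.

```python
# Driver class shipped in the Apache Drill JDBC jar.
DRILL_DRIVER_CLASS = "org.apache.drill.jdbc.Driver"


def drill_jdbc_url(zk_hosts="localhost:2181"):
    """Build a Drill JDBC URL that locates drillbits through ZooKeeper.

    The default host and port are assumptions for a local setup.
    """
    return "jdbc:drill:zk=" + zk_hosts


def generic_connection_settings(zk_hosts="localhost:2181"):
    """Hypothetical helper: the values to type into a generic JDBC connection."""
    return {
        "Custom Connection URL": drill_jdbc_url(zk_hosts),
        "Custom Driver Class Name": DRILL_DRIVER_CLASS,
    }


if __name__ == "__main__":
    for field, value in generic_connection_settings().items():
        print(f"{field}: {value}")
```

The Drill JDBC driver jar also has to be on the client's classpath before a Table Input step can use the connection.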

At first I wanted to start Drill in embedded mode, as my end goal was to provide a PDI plugin so that “Apache Drill” would show up in the Database types list in the Database Connection dialog. However, I ran into a bunch of classloading issues (see previous post), so I tried a different approach that worked much better. Here’s my procedure:

Read More


This blog post was originally published on the MapR blog.

Q & A from “The Future of Hadoop Analytics: Total Data Warehouses and Self-Service Data Exploration” Webinar

The recent MapR webinar titled “The Future of Hadoop Analytics: Total Data Warehouses and Self-Service Data Exploration” proved to be a highly informative, in-depth look at the future of data warehouses and how SQL-on-Hadoop technologies will play a pivotal role in those settings. Matt Aslett, Research Director for 451 Research, along with Apache Drill architect Jacques Nadeau, discussed what lies ahead for enterprise data warehouse architects and BI users in 2015 and beyond.

If you missed the webinar, you can watch the replay here. Following are the answers to the questions that were asked at the end of the webinar:

Read More

There are several projects for SQL-on-Hadoop. What makes Drill different? What are the top 10 reasons why Drill is a valuable and innovative technology in your tool belt for interactive data exploration on big data? 

To read more, the whole blog post can be found here.


It is our pleasure to announce the 0.5.0 release of Apache Drill. This is Drill’s first beta release and the second in our iterative monthly release cycle. It includes more than 100 issues addressed since last month’s release and more than 1,000 addressed since Drill’s inception. This is a great release to start exploring your data, wherever and whatever it is.

To read more on this release, please visit the original Apache blog post by committer Jacques Nadeau.

How would you use Drill?

What questions or comments do you have about the design of Drill? 

What are your thoughts or suggestions for the Drill community?

The Bay Area Apache Drill User Group is going to meet in San Jose, California, next Monday, 24 February, at 6pm, and wherever you may live, we want to hear from you.

Please tweet your ideas or comments using the hashtag #drilltalk by Monday evening Pacific time (you can follow Drill on Twitter as @ApacheDrill).  Or add a comment or question here.

Read More

By Ellen Friedman, on Twitter as @Ellen_Friedman

As we welcome the new year, Apache Drill has two new committers: Timothy Chen and Julian Hyde.  Their hard work on behalf of Drill has earned the notice and gratitude of the project and community.

Tim Chen is an engineer at Microsoft in Seattle who recently spoke at the Bay Area Apache Drill User Group meet-up about the end-to-end lifetime of a Drill query. Tim’s presentation was part of the celebration of the first milestone release for Drill. Please see the earlier post here at the Drill User blog for details. You can find out more at Tim’s blog or follow him on Twitter @tnachen.

Julian Hyde was an engineer at Pentaho who recently moved to Hortonworks. For Drill, Julian has worked on SQL support. Julian is also lead developer of the Mondrian OLAP engine and the Optiq data platform, and is one of the authors of the Manning book Mondrian in Action. Julian will be one of the speakers at the next Bay Area Apache Drill User Group meet-up, planned for 24 Feb 2014. Stay tuned for details. You can follow Julian on Twitter @julianhyde.

Drill is also fortunate to have the help of a new project mentor, Sebastian Schelter. Sebastian is a PhD student and research associate at TU Berlin, with expertise in machine learning, especially recommendation. Sebastian is active with the Apache Foundation, being a PMC member and committer for the Apache Mahout project. Sebastian is on Twitter as @sscdotopen

And a Happy New Year for 2014 to you all!

Follow the Apache Drill community on Twitter @ApacheDrill

Check out the Apache Drill project website at


By Ellen Friedman, Twitter ID: @Ellen_Friedman

Co-organizer of Bay Area Apache Drill User Group

The Apache Drill project is building an innovative tool for ad hoc, interactive queries on a time scale of 100ms to 20 minutes on large, distributed data systems. Participants in the open source Apache Drill community recently came together to take a look at how Drill works now and what the next steps in the project will be.

Read More

Since we released the M1 alpha version of Apache Drill, a lot has happened:

UPCOMING: The Bay Area Apache Drill User Group will have a meet-up this coming Monday, 4 Nov 2013, on Apache Drill: First Milestone Release.

Michael giving an Apache Drill talk at JAX London 2013

In his recent blog post, Yash Sharma provides a detailed account of how to contribute to Apache Drill: Implementing Drill Math Functions. The article is geared towards Java developers, but I’d argue that Apache Drill users in general would also benefit from studying it.

Huge congrats to the Apache Drill team! The alpha release is being shipped now and Drill has won its first award: it’s one of the best open source big data tools of 2013.

BTW, last week I gave an Apache Drill talk and demo at the HUG Stockholm; slides and a video recording are available.