NoSQL matters 2013 in Cologne, Germany: lots of good discussions and great people around for the Apache Drill training day. Thank you to everybody involved, and I hope to ‘see’ you on the mailing list, on Twitter or F2F next time!
There are many things happening in parallel at the moment. We’re making great progress in terms of APIs and storage engines.
Also, there are a number of events where Drill is discussed. For example, I recently gave a status talk at the HUG Munich. If you want to get active in the development, be it via code or test data and test queries, consider joining us on our weekly Google+ hangout at 9am PT/5pm UTC.
Note that the top vote getters in each track will automatically be added to the Hadoop Summit agenda, so please spread the word and let’s make this happen!
The Apache Drill Design Meeting on 13 Sep was a huge success, with some 60 people attending. Since then, Camuel Gilyadov has shared his thoughts on the design considerations, and Ted Dunning spoke about Drill on 19 Sep at the Chicago Hadoop User Group. Check out the video and his slide set.
Note: we now have an official hashtag, #apachedrill, which you can search for on Twitter.
While more and more people are joining the mailing list, the first design-related discussions have been taking place.
Design meeting.
On 2012-09-13 there was a design meeting with some 60 attendees. Jason Frantz and Julian Hyde presented design guidelines and suggestions. Check out Jason’s slide set, Julian’s slide set and/or watch the meeting.
After the meeting, a number of people signed up to drive certain tasks:
Wire formats.
Last week I raised an issue re supporting Thrift as a wire format. It turns out that protobuf seems to perform best {1}, {2}, {3}, and hence the internal wire format will be protobuf, but Avro and Thrift might well be supported externally.
Front-end?
So, I’m currently working on an Apache Drill front-end (essentially, an HTML5/Ajax browser app like you might know from BigQuery).
Some 10 days ago I stumbled upon the proposal for the Apache Drill Incubator group. Drill is a distributed system for interactive analysis of large-scale datasets, inspired by Google’s Dremel.
We’re talking about querying hundreds or even thousands of billions of records in a few seconds, working against HDFS and supporting a number of nested data wire formats, including JSON, Avro, ProtoBuf, and Thrift.
Since then, quite some things have happened:
In a recent GDG meeting, Ryan Boyd from Google and Tomer Shiran from MapR talked about BigQuery - Google’s hosted-Dremel version - as well as about Drill’s positioning, requirements and design decisions. Check out the video from the meeting and Tomer’s slides - his talk starts roughly 1 hour into the video.
This site, drill-user.org, aims at documenting the development of Drill, from its early days in incubation to, hopefully, one day becoming an Apache top-level project and as successful as Hadoop. At least ;)
Consider joining us on this journey. Submissions and comments are more than welcome! For now you might as well sign up on the drill-dev@incubator.apache.org mailing list …