Apache Hadoop Get Together

10 Mar 2010 - 16:00
10 Mar 2010 - 19:00

This is to announce the next Apache Hadoop Get Together, which will take place at the newthinking store in Berlin.

As always, there will be slots of 20 minutes each for talks on your Hadoop topic. After each talk there will be plenty of time for discussion. You can order drinks directly at the bar in the newthinking store. If you like, you can order pizza. We will go to Cafe Aufsturz after the event for some beer and something to eat.

Talks scheduled so far:

Chris Male (JTeam/ Amsterdam): Spatial Search

Abstract: The rise in popularity of Google Maps and mobile devices with GPS has resulted in a trend in the search field. People are no longer content with finding results that match a text query; they also want to find results that are near a location. So-called spatial search differs considerably from traditional free-text search in that it cannot be achieved through common search techniques such as inverted indexes. Instead, new algorithms and data structures had to be developed that achieve efficient and accurate spatial search and that also allow spatial search to play a role in determining a result's relevance. This technology has primarily been found in proprietary closed-source search applications. In the last 12-18 months, however, considerable effort has been invested in bringing open-source spatial search support to Apache Solr and Lucene. While much is still left to be done, this talk will introduce how spatial search is currently supported in Solr, what work is currently under way, and a roadmap for future developments.

Dragan Milosevic (zanox/ Berlin): Product Search and Reporting powered by Hadoop

Abstract: To efficiently process and index 80 million products, as well as store and analyse 30 billion clicks and 500 billion views daily, Zanox AG is using Hadoop HDFS and MapReduce technologies. This talk will present product-processing and reporting frameworks running on a 17-node Hadoop cluster that are able to (1) robustly store products and tracking data in a distributed manner, (2) rapidly consolidate, normalise and categorise products, (3) merge and aggregate tracking data and (4) efficiently build indexes for supporting distributed search and reporting, running in several search clusters.

Bob Schulze (eCircle/ Munich): Database and Table Design Tips with HBase

Abstract: Recurring design patterns for the BigTable/HBase storage model.

A big thanks goes to Nokia Gate 5 (http://maps.ovi.com) for sponsoring videos of the talks. Links to the videos will be posted here.

Please indicate whether you are planning to attend, as this makes planning (and booking tables at Aufsturz) easier.

Signup: https://www.xing.com/events/apache-hadoop-march-2010-459305