
You are currently visiting an old archive website with limited functionality. If you are looking for the current Berlin Buzzwords website, please visit https://berlinbuzzwords.de

Urania Berlin, June 6-7, 2011

Kleistsaal (Pink)

Search Analytics: What? Why? How?

Location: 
Kleistsaal
Date and time: 
Mon, 2011-06-06 11:00 - 11:40
Speaker: 
Otis Gospodnetić

You've indexed your data and people are searching it. But how do you know if they are happy with the results? How do you know if they are finding what they need?

Distributed search of heterogeneous collections with Solr

Location: 
Kleistsaal
Date and time: 
Mon, 2011-06-06 11:50 - 12:30
Speaker: 
Andrzej Białecki

This talk will discuss challenges of distributed searching, scoring and ranking, and merging partial result lists. Common issues in such situations include: global IDF calculation, global faceting, assigning ranks when merging partial lists, etc. Several effective approaches to these problems will be presented using Solr as a search platform.
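
As a rough illustration of two of the issues named above, the following sketch computes a collection-wide IDF from per-shard statistics and merges per-shard result lists into a global top k. It is not taken from the talk and does not use Solr's API: the function names, the shape of the shard statistics, and the smoothed IDF formula are all assumptions made for the example.

```python
import heapq
import math
from itertools import islice


def global_idf(shard_stats, term):
    """Combine per-shard counts into one collection-wide IDF value.

    shard_stats is a hypothetical list of dicts, one per shard:
    {"num_docs": int, "doc_freq": {term: int}}.
    """
    total_docs = sum(s["num_docs"] for s in shard_stats)
    doc_freq = sum(s["doc_freq"].get(term, 0) for s in shard_stats)
    # Smoothed IDF; the exact formula is an illustrative choice.
    return math.log((total_docs + 1) / (doc_freq + 1)) + 1.0


def merge_partial_results(partial_lists, k):
    """Merge per-shard hit lists into a global top k.

    Each list holds (score, doc_id) tuples sorted by descending score,
    and the scores are assumed to be comparable across shards.
    """
    merged = heapq.merge(*partial_lists, key=lambda hit: -hit[0])
    return list(islice(merged, k))


if __name__ == "__main__":
    stats = [
        {"num_docs": 100, "doc_freq": {"solr": 10}},
        {"num_docs": 900, "doc_freq": {"solr": 1}},
    ]
    print(global_idf(stats, "solr"))
    print(merge_partial_results([[(0.9, "a"), (0.4, "b")], [(0.7, "c")]], 2))
```

The merge step only produces a meaningful ranking if every shard scored its hits with the same global statistics, which is why the two problems tend to be discussed together.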

Solr: Peak Performance

Location: 
Kleistsaal
Date and time: 
Mon, 2011-06-06 13:30 - 14:10
Speaker: 
Mark Miller

Mark Miller examines and expounds on some of the most relevant performance factors for a large-scale, single-server Solr installation. Learn about FieldCaches, FilterCaches, FileSystem caches, Garbage Collection, and which knobs you can turn when you are battling performance problems. This talk will teach you the basics you need to know to get the best performance out of your Solr installation.

Integrating Solr with JEE Applications

Location: 
Kleistsaal
Date and time: 
Mon, 2011-06-06 14:20 - 15:00
Speaker: 
Chris Male

So you have downloaded Solr, configured it, indexed your data and are now ready to integrate it with the rest of your enterprise Java application. For most situations, this process will begin with Solr's Java client library SolrJ, but how do you configure SolrJ? What are best practices for using it? And how can you use it to build a robust search service tailored to your application's needs? In this presentation we'll show how to configure SolrJ in a Spring application and walk through a powerful design for incorporating Solr/Lucene and SolrJ in your application.

Hadoop Introduction

Location: 
Kleistsaal
Date and time: 
Mon, 2011-06-06 15:20 - 16:00
Speaker: 
Kai Voigt

Kai will present the basic ideas behind Hadoop: how to store large amounts of data in HDFS, the Hadoop distributed filesystem, and how to run distributed jobs with MapReduce.
The talk will cover an introduction to the theoretical background and hands-on examples of how to store data in Hadoop and run jobs.
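
To give a feel for the MapReduce model mentioned above, here is a minimal in-process word count sketch. It is not from the talk and does not use the Hadoop API: the map, shuffle, and reduce steps are simulated in plain Python, and all function names are made up for the example.

```python
from collections import defaultdict


def map_phase(line):
    """Map step: emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs):
    """Shuffle step: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce step: sum the counts collected for one word."""
    return key, sum(values)


def run_job(lines):
    """Run map, shuffle, and reduce over the input lines in one process."""
    pairs = (pair for line in lines for pair in map_phase(line))
    grouped = shuffle(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    print(run_job(["Hadoop stores data", "hadoop runs jobs"]))
```

In a real cluster the map and reduce steps run on many machines and the framework performs the shuffle between them; the sketch only shows the shape of the computation.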

Highly Available Data Storage & Processing for Socorro

Location: 
Kleistsaal
Date and time: 
Mon, 2011-06-06 16:10 - 16:50
Speaker: 
Daniel Einspanjer

The Mozilla Metrics team, in its role as data services provider for Mozilla, is developing a new data storage and processing infrastructure for the Socorro crash report system.

Kafka - Bringing reliable stream processing to a confusing, dark world

Location: 
Kleistsaal
Date and time: 
Mon, 2011-06-06 17:00 - 17:20
Speaker: 
Jakob Homan

Kafka is a new distributed publish-subscribe system, recently open-sourced by LinkedIn and designed to provide extremely high-throughput, reliable message passing. Kafka is focused on processing high-volume, site-critical stream activity and provides integration for loading data into Hadoop for further processing. Kafka is under heavy development and in active production use at LinkedIn.

The Multiple Uses of HBase

Location: 
Kleistsaal
Date and time: 
Mon, 2011-06-06 17:30 - 17:50

HBase has been deployed in production to solve often very different problems at companies such as StumbleUpon, Mozilla, Facebook, and Twitter. It helps scale web applications, it stores insanely high amounts of historical data, it’s used as a source or sink of analytical jobs, it keeps track of millions of counters, and so on. This presentation aims to help people interested in HBase, as well as current users, understand how it can be adapted to their own use case by analyzing the current deployments and their characteristics.

Oh Leonhard, Where art thou?

Location: 
Kleistsaal
Date and time: 
Tue, 2011-06-07 11:00 - 11:40
Speaker: 
Jim Webber

Most NoSQL stores focus on throughput and resilience to failure. The trade-off in achieving those goals has been a substantial decrease in the sophistication of data models compared to RDBMS systems, and a corresponding increase in processing frameworks to deal with large volumes of unconnected data.

Realtime Big Data at Facebook with Hadoop and HBase

Location: 
Kleistsaal
Date and time: 
Tue, 2011-06-07 11:50 - 12:30
Speaker: 
Jonathan Gray

Buzzwords: NoSQL, Big Data, Realtime, Hadoop, HBase

Facebook has one of the largest Apache Hadoop data warehouses in the world, primarily queried through Apache Hive for offline data processing and analytics. However, the need for realtime analytics and end-user access has led to the development of several new systems built using Apache HBase. This talk will cover specific use cases and the work done at Facebook around building large scale, low latency and high throughput realtime services with Hadoop and HBase.
