Urania Loft White

Node.js for heavy I/O

Location:

Einsteinsaal

Date and time:

Mon, 2011-06-06 11:00 - 11:40

Speaker:

Felix Geisendörfer

"Server-side JavaScript has been around since 1996, but despite the huge success on the client, it never quite managed to attract a significant audience.

This all changed in 2009, when Ryan Dahl presented his non-blocking I/O framework for Google's V8 JavaScript engine at JSConf.eu. The radical new approach, combined with one of the fastest JavaScript engines, has attracted an explosive number of early adopters ever since.

This talk gives you a brief introduction to node.js, exploring the technology behind the buzz.

Building next-generation Web Apps with WebSocket

Location:

Einsteinsaal

Date and time:

Mon, 2011-06-06 12:10 - 12:30

Speaker:

Matthias Wessendorf

To enable server-side push in web applications, many hacks (such as Comet/Bayeux) have been used in the past. The WebSocket spec fixes that by introducing a bi-directional, full-duplex communication channel over a single TCP connection. This session gives an overview of the WebSocket API and shows how to use other protocols like AMQP or STOMP on top of it to build powerful and future-proof web applications.

Messaging Patterns with RabbitMQ

Location:

Einsteinsaal

Date and time:

Mon, 2011-06-06 13:30 - 14:10

Speaker:

Alvaro Videla

RabbitMQ is emerging as a good solution for open source messaging in modern architectures. This raises new challenges in implementing integration patterns that go beyond simple produce/consume applications. In this presentation we will show how to implement several messaging patterns using RabbitMQ as the backend technology, such as:

- Competing consumers
- RPC
- Parallel RPC
- Smart Proxy
- Publish/Subscribe

The examples will be given using standard AMQP commands so the attendee can later translate them to her favorite RabbitMQ/AMQP client.

The other Apache technologies your big data solution needs!

Location:

Einsteinsaal

Date and time:

Mon, 2011-06-06 14:20 - 15:00

Speaker:

Nick Burch

You've gone to the talks on Hadoop / SOLR / NoSQL / etc., and now you're ready to start building your own solution on top of that! What you might not realise is that you may end up reinventing some bits of the wheel whilst building your system.

In this talk we'll take a whistle-stop tour through a number of the projects from the Apache Software Foundation that aren't "big data" projects, but which could prove very helpful to you in building your big data solution.

Time Series or Causal Analysis Without Limits!

Location:

Einsteinsaal

Date and time:

Mon, 2011-06-06 15:20 - 16:00

Speaker:

Shevek

Analysis of a causal or time series relationship between two data sets (or functions) is important for fields from yield optimization to signal processing, stock market analysis to functional genomics, and many other applications.

This talk describes an algorithm developed by Karmasphere Labs for performing the entire family of cross correlation algorithms on arbitrarily large data sets. The algorithm supports wide or even unbounded windowing functions.

Real-time Analytics With Cassandra

Location:

Einsteinsaal

Date and time:

Mon, 2011-06-06 16:10 - 16:50

Speaker:

Sylvain Lebresne

Real-time analytics requires counting a lot of things very quickly.
This talk will present the Apache Cassandra approach to distributed
counters, starting with the difficulties inherent to partitioning
counters across multiple machines, and continuing to the solution
adopted by Cassandra. We will wrap up with a use-case: how Twitter
leverages this counter design in their Rainbird system for real-time
engagement analytics.

Kafka - Bringing reliable stream processing to a confusing, dark world

Location:

Kleistsaal

Date and time:

Mon, 2011-06-06 17:00 - 17:20

Speaker:

Jakob Homan

Kafka is a new distributed publish-subscribe system, recently open sourced by LinkedIn, designed to provide extremely high-throughput, reliable message passing. Kafka is focused on processing high-volume, site-critical stream activity and provides integration for loading data into Hadoop for further processing. Kafka is under heavy development and in active production use at LinkedIn.

The Multiple Uses of HBase

Location:

Kleistsaal

Date and time:

Mon, 2011-06-06 17:30 - 17:50

Speaker:

Jean-Daniel Cryans

HBase has been deployed in production to solve often very different problems at companies such as StumbleUpon, Mozilla, Facebook, and Twitter. It helps scale web applications, stores very large amounts of historical data, serves as a source or sink for analytical jobs, keeps track of millions of counters, and so on. This presentation aims to help people interested in HBase, as well as current users, understand how it can be adapted to their own use case by analyzing the current deployments and their characteristics.

Heavy Committing: Flexible Indexing in Lucene 4.0

Location:

Einsteinsaal

Date and time:

Tue, 2011-06-07 11:00 - 11:40

Speaker:

Uwe Schindler

Apache Lucene's next major release, 4.0, will introduce lots of flexibility into indexing, but also fundamental changes to the well-known APIs: It features a new and consistent, 4-dimensional iteration API on top of a low-level, pluggable codec API giving applications full control over the postings data. Terms are now arbitrary opaque bytes enabling users to store terms in any encoding, not necessarily UTF-8, natively in the index (e.g. numeric fields).

Heavy Committing: Lucene DocValues aka. Column Stride Fields

Location:

Einsteinsaal

Date and time:

Tue, 2011-06-07 11:50 - 12:30

This talk will introduce one of the new features in the next, intentionally backwards-incompatible Apache Lucene release. Besides Flexible Indexing and Realtime Search, Lucene 4.0 will ship with an alternative column-based per-document storage called DocValues or Column Stride Fields. DocValues provide efficient storage for per-document values used for sorting, result rendering, or scoring. Unlike FieldCache, DocValues need neither be indexed nor un-inverted, and they allow users to specify how values are stored per field.

Berlin Buzzwords 2011 is a conference for developers and users of open source software projects, focussing on the issues of scalable search, data analysis in the cloud and NoSQL databases. Berlin Buzzwords presents more than 30 talks and presentations by international speakers on the three tags "search", "store" and "scale".