Hakuna MapData!

Slides from “Hadoop Operations Powered By … Hadoop” given at Hadoop Summit 2014 in Amsterdam

| Posted in Community, Presentations, Troubleshooting |

“Hadoop Operations Powered By … Hadoop” accepted for Hadoop Summit 2014 in Amsterdam! ;)

| Posted in Community, Presentations, Troubleshooting |

I am extremely happy to say that my proposal was accepted for Hadoop Summit 2014 in Amsterdam ;) The title of my presentation is “Hadoop Operations Powered By … Hadoop”, and I will talk about the various metrics, logs and files that Hadoop generates and how to analyze them … using Hadoop (and open-source tools and simple scripts) to learn more about Hadoop and avoid guesstimates!

Beetest – a simple utility for testing Apache Hive scripts locally for non-Java developers

| Posted in Programming, Testing, Troubleshooting |

Recently, I have been refactoring one of our Hive scripts. Because I introduced significant changes, the query is resource-intensive (it processes almost two terabytes of data) and … I wanted to iterate fast, I decided to test it locally.

To make it easier, I implemented Beetest – a super simple utility that helps you to test your Apache Hive scripts locally without any Java knowledge.

Celebrate failure(s) – a real-world Hadoop example (HDFS issues)

| Posted in Failures, Troubleshooting |

At Spotify, we have a company-wide culture of celebrating successes and … failures. Because we want to iterate fast, we accept that failures can happen. On the other hand, we cannot afford to make the same mistake more than once. One way of preventing that is sharing our failures, mistakes and learnings across the company.

Today, however, I would like to share my failures … outside of the company ;) While my failures relate to my recent work with an Apache Hadoop cluster, I think that the lessons I have learned are generic enough that many people can benefit from them.

Slides from “Apache Hadoop In Theory And Practice”

| Posted in Uncategorized |

A presentation that I gave at the Distributed Systems Seminar at the University of Warsaw (the university that I graduated from). I wanted to make this presentation academically interesting, but also to show a bit of how everything looks in practice on a large Hadoop cluster at Spotify. I hope you will like this combination! ;)