Beetest – a simple utility for testing Apache Hive scripts locally for non-Java developers

| Posted in Programming, Testing, Troubleshooting |


Recently, I have been refactoring one of our Hive scripts. Because I introduced significant changes, the query is resource-intensive (it processes almost two terabytes of data), and … I wanted to iterate fast, I decided to test it locally.

To make this easier, I implemented Beetest – a super simple utility that helps you test your Apache Hive scripts locally without any Java knowledge.

Pigitos – MapKeysToBag, MapSize and more UDFs to manipulate maps in Apache Pig

| Posted in Programming |


I have created a project called Pigitos, a set of tiny but highly useful Java UDFs for Apache Pig.

Currently, Pigitos contains a couple of UDFs for working with maps. They calculate the size of a map and return the map’s keys (or values, or key/value pairs) as a bag. Such UDFs are very useful when working with dynamically created column qualifiers in Apache HBase tables, where the qualifier names themselves hold meaningful information that you want to process.
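
To make the idea concrete, here is a minimal sketch of what "map size" and "map keys as a bag" mean, written with plain Java collections instead of Pig's types. It is not the Pigitos source: the class name `MapUdfSketch`, the method names, the sample qualifiers and the null handling are all assumptions made for illustration, and a `List` stands in for a Pig bag.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; not the actual Pigitos implementation.
public class MapUdfSketch {

    // Number of entries in the map; a null map counts as 0 (an assumption of this sketch).
    public static int mapSize(Map<String, ?> map) {
        return map == null ? 0 : map.size();
    }

    // Keys of the map as a list, which stands in for a Pig bag here.
    public static List<String> mapKeysToBag(Map<String, ?> map) {
        return map == null ? new ArrayList<>() : new ArrayList<>(map.keySet());
    }

    public static void main(String[] args) {
        // Made-up column qualifiers that carry information (a date) in their names.
        Map<String, Object> row = new LinkedHashMap<>();
        row.put("click:2013-01-01", 3);
        row.put("click:2013-01-02", 5);

        System.out.println(mapSize(row));
        System.out.println(mapKeysToBag(row));
    }
}
```

Once the keys are available as a collection, the information encoded in the qualifier names can be processed like any other data, which is the use case described above.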