
Underlying Hive Engines

One thing to note is that Hadoop MapReduce programming is fairly low level, and it takes a lot of boilerplate and logic code to complete even a simple task like WordCount. Here is the program to aggregate the occurrences of different words in a given input, which gives an idea of how complex MapReduce programs can become.
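To make the shape of the WordCount task concrete, here is a minimal sketch in plain Python that imitates the map, shuffle and reduce phases in memory. It is not the Hadoop API, and the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are made up for illustration. A real MapReduce job additionally needs Mapper and Reducer classes, a driver, and job configuration, which is where most of the boilerplate comes from.

```python
from collections import defaultdict


def map_phase(line):
    """Map step: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle step: group all emitted values by their key (the word)."""
    grouped = defaultdict(list)
    for word, count in pairs:
        grouped[word].append(count)
    return grouped


def reduce_phase(word, counts):
    """Reduce step: sum the counts collected for a single word."""
    return word, sum(counts)


def word_count(lines):
    """Run the three phases in sequence over an iterable of text lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    grouped = shuffle(pairs)
    return dict(reduce_phase(word, counts) for word, counts in grouped.items())


if __name__ == "__main__":
    print(word_count(["hive pig hive", "Pig tez"]))
```

In Hadoop the framework performs the shuffle between the map and reduce steps across machines; the sketch only shows the logical data flow on a single machine.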

For these reasons, Facebook started Apache Hive (note that Facebook is moving towards Presto from Hive) and Yahoo started Apache Pig. Both are higher-level abstractions that convert their own languages, HiveQL for Hive and Pig Latin for Pig, into low-level MapReduce jobs. Hive and Pig provide better developer productivity than raw MapReduce, so most companies are opting for higher-level abstractions like these for data processing.

As mentioned in the previous blog post, MapReduce is batch oriented in nature, so frameworks built on top of it, like Hive and Pig, are batch oriented as well. There is therefore work in progress to let Hive support multiple processing engines, such as Apache Incubator Tez and Apache Spark, besides the old MapReduce. With simple configuration changes, Hive queries can be converted into Tez, Spark or MR programs and executed on the appropriate platform.
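As a rough illustration of what such a configuration change looks like, the sketch below builds a `hive` command line that sets the `hive.execution.engine` property (documented values: `mr`, `tez`, `spark`) for a single query. The helper name `hive_command` and the table name `words` are hypothetical, and the sketch assumes the `hive` CLI is on the PATH and that the installed Hive version and cluster actually support the chosen engine.

```python
# Values accepted by Hive's hive.execution.engine property.
SUPPORTED_ENGINES = ("mr", "tez", "spark")


def hive_command(query, engine="mr"):
    """Build the argument list for running one query on a chosen engine.

    hive_command is a hypothetical helper; it only assembles the command
    and does not run anything.
    """
    if engine not in SUPPORTED_ENGINES:
        raise ValueError(f"unsupported engine: {engine!r}")
    return [
        "hive",
        "--hiveconf", f"hive.execution.engine={engine}",
        "-e", query,
    ]


if __name__ == "__main__":
    # 'words' is a made-up table name used only for illustration.
    query = "SELECT word, COUNT(*) FROM words GROUP BY word"
    for engine in SUPPORTED_ENGINES:
        print(" ".join(hive_command(query, engine)))
```

The resulting list can be passed to `subprocess.run` on a machine where Hive is installed. The same property can also be set inside a Hive session with `SET hive.execution.engine=tez;` before running the query.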