Tuesday, August 23, 2016

Hive vs PIG

Hive                                            PIG
----------------------------------------------  ----------------------------------------------
Developed at Facebook                           Developed at Yahoo!
Best for structured data                        Best for semi-structured data
Used for reporting                              Used for programming
Declarative, SQL-like language (HiveQL)         Procedural data-flow language (Pig Latin)
Supports partitions                             Does not support partitions
Can start an optional Thrift-based server       Cannot start a server
Defines tables (schema) beforehand and stores   Has no dedicated metadata database
  the schema information in a database
Supports Avro through the AvroSerDe             Supports Avro through AvroStorage
Has no COGROUP operator                         Supports the additional COGROUP operator,
                                                  which can perform outer joins

Both Hive and PIG can join, order, and sort data dynamically.
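
To make the COGROUP point concrete, here is a minimal sketch in plain Python (not Pig Latin) of what a cogroup does and how an outer join can be built from it. The function names `cogroup` and `full_outer_join` and the sample `users` and `orders` relations are hypothetical and chosen only for illustration; the sketch assumes the join key is the first field of each row.

```python
from collections import defaultdict


def cogroup(left, right, key=lambda row: row[0]):
    """Group two relations by key: each key maps to (left_bag, right_bag).

    A key found in only one relation gets an empty bag for the other.
    """
    groups = defaultdict(lambda: ([], []))
    for row in left:
        groups[key(row)][0].append(row)
    for row in right:
        groups[key(row)][1].append(row)
    return dict(groups)


def full_outer_join(left, right, key=lambda row: row[0]):
    """Flatten the cogroup result, padding an empty bag with None."""
    joined = []
    for _, (left_bag, right_bag) in sorted(cogroup(left, right, key).items()):
        for l in (left_bag or [None]):
            for r in (right_bag or [None]):
                joined.append((l, r))
    return joined


# Hypothetical sample relations: (id, value)
users = [(1, "ann"), (2, "bob")]
orders = [(2, "book"), (3, "pen")]

for pair in full_outer_join(users, orders):
    print(pair)
```

The grouping step keeps unmatched keys with an empty bag on one side, which is why an outer join falls out of it so easily: flattening the bags and substituting a null for an empty one yields the unmatched rows.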