Hive Performance Tuning

The Apache Hive™ data warehouse software facilitates reading, writing, and managing large datasets residing in distributed storage using SQL syntax. To learn how to use Hive, see https://cwiki.apache.org/confluence/display/Hive/Tutorial…

Apache Spark Unit Testing

Apache Spark is a lightning-fast unified analytics engine for big data and machine learning. It is included in almost all Hadoop distributions. Apache Spark is the hottest…


Hadoop to explore data

Big data, by definition, denotes datasets so large or complex that traditional data processing frameworks and software cannot handle them. Hadoop is the answer…
