Tag Archives: Oozie

Build dataflow to get monthly top price of Land Trading in UK

The dataset is downloaded from UK government data web(The total data size is more than 3GB). And, I am using Apache Oozie to run Hive and Sqoop job periodically. The Hive script “land_price.hql”:

We want Hive job to run on queue “root.default” in YARN (and other jobs in “root.mr”),… Read more »

Use Oozie to run terasort

      No Comments on Use Oozie to run terasort

The better choice of “Action” for running terasort test case in Oozie is “Java Action” instead of “Mapreduce Action” because terasort need to run

first and then load ‘partitonFile’ by “TotalOrderPartitioner”. It’s not a simple Mapreduce job which need merely a few propertyies. The directory of this”TerasortApp” which using… Read more »