Lab 05: MapReduce in the Cloud
Pedagogical Objectives
- Perform data analysis in the cloud using a dynamically allocated cluster of machines
- Write a MapReduce program
- Become familiar with Hadoop
Tasks
In this lab you will perform a number of tasks and document your progress in a lab report. Each task specifies one or more deliverables to be produced. Collect all the deliverables in your lab report. Give the lab report a structure that mimics the structure of this document.
Task 1 - Using Elastic MapReduce
Copy a screenshot of the EMR console into the report.
Copy the bar chart of maximum temperature by year into the report.
What is the overall highest temperature in the data set?
The overall highest temperature is 38.0 degrees. This temperature was reached in 2003.
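The maximum-by-year computation behind this answer can be sketched as a map step and a reduce step. This is a minimal in-memory sketch, not the job that was actually run on EMR. It assumes each input record is already parsed into a `(year, temperature)` pair, and the names `map_record`, `reduce_max` and `run_job` are hypothetical. The sample values are invented for illustration, except the 38.0 degrees in 2003 reported above.

```python
from collections import defaultdict


def map_record(record):
    """Map step: emit one (year, temperature) pair per input record."""
    year, temp = record  # assumed record layout, not the real input format
    yield year, temp


def reduce_max(year, temps):
    """Reduce step: keep the highest temperature seen for one year."""
    return year, max(temps)


def run_job(records):
    """Run map, shuffle (group by year) and reduce in memory."""
    grouped = defaultdict(list)
    for record in records:
        for year, temp in map_record(record):
            grouped[year].append(temp)
    return dict(reduce_max(year, temps) for year, temps in grouped.items())


# Hypothetical sample; only the 2003 maximum of 38.0 comes from the report.
sample = [(2002, 35.1), (2003, 38.0), (2003, 31.2), (2004, 34.4)]

max_by_year = run_job(sample)
hottest_year = max(max_by_year, key=max_by_year.get)
overall_max = max_by_year[hottest_year]
```

The per-year maxima are what the bar chart plots, and the overall highest temperature is simply the largest of those values.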
How many EC2 instances were created to run the job?
Three EC2 instances were created to run this job, as shown in the following screenshot.
This pricing test was run with 3 EMR instances of type m1.small. The job took 19 minutes to complete, so we were charged for one full hour. The price was about $0.18.
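The arithmetic behind this charge can be checked with a short sketch. It assumes that each instance is billed per started hour, which is what the one-hour charge for a 19-minute job suggests. The per-instance-hour rate is not a quoted price: it is only the value implied by the figures in this report. The function name `billed_hours` is hypothetical.

```python
import math


def billed_hours(runtime_minutes):
    """Round a runtime up to whole hours (assumed billing rule)."""
    return math.ceil(runtime_minutes / 60)


instances = 3          # m1.small instances, from the report
runtime_minutes = 19   # job duration, from the report
total_cost = 0.18      # approximate charge in dollars, from the report

hours = billed_hours(runtime_minutes)
instance_hours = instances * hours
# Rate implied by the report's figures, not an official price.
implied_rate = total_cost / instance_hours
```

Under these assumptions the job consumed 3 instance-hours, which puts the implied rate at roughly $0.06 per instance-hour.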