The sample syllabus includes expectations, the topics covered, and details of the grading scheme.
The project will have three parts.
In part one, you will install Hadoop, design and develop a MapReduce job to solve the given problem, and learn to store the data in HDFS and/or HBase.
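To preview the MapReduce model before you write a real Hadoop job, here is a minimal sketch of the word-count pattern in plain Python. The data, function names, and the in-memory "shuffle" step are illustrative; an actual Hadoop job implements Mapper and Reducer classes (in Java) and lets the framework do the grouping across machines.

```python
from collections import defaultdict

# Hypothetical input: one record per line, as HDFS would feed each mapper.
lines = [
    "big data on hadoop",
    "hadoop stores data in hdfs",
]

def mapper(line):
    """Emit (word, 1) pairs, like the map phase of a word-count job."""
    for word in line.split():
        yield (word, 1)

def reducer(word, counts):
    """Sum the values for one key, like the reduce phase."""
    return (word, sum(counts))

# Shuffle: group intermediate pairs by key.
# (On a cluster, the Hadoop framework performs this step for you.)
grouped = defaultdict(list)
for line in lines:
    for word, one in mapper(line):
        grouped[word].append(one)

result = dict(reducer(w, c) for w, c in grouped.items())
print(result["hadoop"])  # "hadoop" occurs in both lines
```

The key idea carries over directly: map emits key-value pairs, the framework groups them by key, and reduce aggregates each group independently, which is what makes the job parallelizable.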
In part two, you will learn to install and work with Hive and/or Pig to access and process this data.
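Hive and Pig let you express such processing declaratively instead of writing MapReduce code by hand; for example, a HiveQL query like `SELECT user, COUNT(*) FROM log GROUP BY user` compiles down to MapReduce jobs. As a rough, hypothetical illustration of what such a grouped aggregation computes (the table and rows here are invented, not from the course data):

```python
from collections import Counter

# Hypothetical rows, as if read from a table stored in HDFS.
rows = [
    {"user": "alice", "action": "click"},
    {"user": "bob",   "action": "click"},
    {"user": "alice", "action": "view"},
]

# Roughly what `SELECT user, COUNT(*) FROM log GROUP BY user`
# would compute in Hive over this table.
counts = Counter(row["user"] for row in rows)
print(counts["alice"])  # alice has two rows
```

The point of the tools is that you state *what* aggregate you want and the engine plans *how* to distribute the work.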
In part three, you will learn to design and develop an algorithm in conjunction with a distributed columnar database.
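The core idea behind columnar storage can be sketched in a few lines: store each column contiguously rather than each row, so an aggregate over one column never touches the others. This is a minimal conceptual sketch with invented data; real systems (HBase, for instance, is organized around column families) add distribution, compression, and versioning on top of this layout.

```python
# Row-oriented layout: each record's fields are stored together.
rows = [
    {"id": 1, "city": "NYC", "temp": 30},
    {"id": 2, "city": "LA",  "temp": 75},
    {"id": 3, "city": "NYC", "temp": 28},
]

# Column-oriented layout: each column's values are stored contiguously.
columns = {key: [row[key] for row in rows] for key in rows[0]}

# An aggregate over one column reads only that column's data --
# the main reason columnar stores are fast for analytic queries.
avg_temp = sum(columns["temp"]) / len(columns["temp"])
print(avg_temp)
```

Algorithms you design against such a store typically exploit this: scan only the columns the computation needs.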
There will be extra-credit components involving other technologies, such as Mahout and Solr; these will be determined in individual consultation with the instructor, based on your interests.
You will use Hadoop for the programming assignments. We will try to provide you with a Linux VM with Hadoop pre-installed. You can also use Cloudera's open-source utilities to manage your Hadoop installation.