Hadoop

Hadoop Course Details

Duration: 45 Days (1 hr 30 min)

Hadoop Introduction:-

  • What is Hadoop? Why Hadoop?
  • History of Hadoop
  • Components of the Hadoop Ecosystem
  • HDFS, MapReduce, PIG, Hive, SQOOP, HBASE, OOZIE, Flume, Zookeeper and so on…
  • What is the scope of Hadoop?

Hadoop Distributed File System (HDFS) (for Storing the Data):-

  • Introduction to HDFS
  • Features of HDFS
  • Daemons of Hadoop
    • Name Node
    • Secondary Name Node
    • Job Tracker
    • Data Node
    • Task Tracker
  • Basic Configuration for HDFS
  • Data Organization and Replication
  • Rack Awareness, Heartbeat Signal
  • How to Store the Data into HDFS
  • Accessing HDFS (Introduction to Basic UNIX Commands)
  • CLI commands
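
As a rough illustration of the "Data Organization and Replication" topic, the sketch below works out how many blocks a file is split into and how much raw cluster storage its replicas occupy. The 128 MB block size and replication factor of 3 are assumptions chosen for the example (both are configurable, and older releases used a smaller default block size); the class and method names are hypothetical and not part of any Hadoop API.

```java
public class HdfsBlockMath {

    // Number of HDFS blocks needed for a file: ceil(fileBytes / blockBytes).
    public static long blockCount(long fileBytes, long blockBytes) {
        if (fileBytes < 0 || blockBytes <= 0) {
            throw new IllegalArgumentException("sizes must be non-negative / positive");
        }
        return (fileBytes + blockBytes - 1) / blockBytes;
    }

    // Raw bytes stored across the cluster. The last block only occupies
    // its actual length, so the total is file size times replication.
    public static long rawStorageBytes(long fileBytes, int replication) {
        if (replication <= 0) {
            throw new IllegalArgumentException("replication must be positive");
        }
        return fileBytes * replication;
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        long fileBytes = 300 * mb;   // assumed example file size
        long blockBytes = 128 * mb;  // assumed block size
        int replication = 3;         // assumed replication factor

        long blocks = blockCount(fileBytes, blockBytes);
        System.out.println("Blocks: " + blocks);
        System.out.println("Block replicas: " + blocks * replication);
        System.out.println("Raw storage (MB): " + rawStorageBytes(fileBytes, replication) / mb);
    }
}
```

With these assumed values, a 300 MB file is split into 3 blocks (two full blocks and one partial block), giving 9 block replicas and 900 MB of raw storage.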

MapReduce using Java (Processing the Data):-

  • Introduction to MapReduce
  • MapReduce Architecture
  • Data flow in MapReduce
    • Splits
    • Mapper
    • Partitioning
    • Sort and shuffle
    • Combiner
    • Reducer
  • Basic Configuration of MapReduce
  • MapReduce life cycle
  • Writing and Executing the Basic MapReduce Program using Java
  • File Input Formats
  • Joins
    • Map-side Joins
    • Reducer-side Joins
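
The data flow listed above (splits, mapper, partitioning, sort and shuffle, reducer) can be sketched in plain Java without a cluster. This is a minimal in-memory word-count simulation, not the Hadoop API: the class and method names are hypothetical, each input string stands in for one split, the partition step uses simple key hashing, and the optional combiner step is omitted for brevity.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;

public class MapReduceFlowSketch {

    // Mapper: emit a (word, 1) pair for every word in one input split.
    public static List<Map.Entry<String, Integer>> map(String split) {
        List<Map.Entry<String, Integer>> out = new ArrayList<>();
        for (String word : split.toLowerCase().split("\\s+")) {
            if (!word.isEmpty()) {
                out.add(new AbstractMap.SimpleEntry<>(word, 1));
            }
        }
        return out;
    }

    // Partitioning: pick the reducer for a key by hashing it.
    public static int partition(String key, int numReducers) {
        return (key.hashCode() & Integer.MAX_VALUE) % numReducers;
    }

    // Reducer: sum all counts collected for one key.
    public static int reduce(String key, List<Integer> values) {
        int sum = 0;
        for (int v : values) {
            sum += v;
        }
        return sum;
    }

    // Runs map -> partition -> shuffle/sort -> reduce; returns one sorted
    // result map per (simulated) reducer.
    public static List<SortedMap<String, Integer>> run(List<String> splits, int numReducers) {
        // Shuffle and sort: group values by key inside each partition;
        // TreeMap keeps the keys sorted, as reducers see them.
        List<SortedMap<String, List<Integer>>> shuffled = new ArrayList<>();
        for (int i = 0; i < numReducers; i++) {
            shuffled.add(new TreeMap<>());
        }
        for (String split : splits) {
            for (Map.Entry<String, Integer> pair : map(split)) {
                int p = partition(pair.getKey(), numReducers);
                shuffled.get(p)
                        .computeIfAbsent(pair.getKey(), k -> new ArrayList<>())
                        .add(pair.getValue());
            }
        }
        List<SortedMap<String, Integer>> results = new ArrayList<>();
        for (SortedMap<String, List<Integer>> part : shuffled) {
            SortedMap<String, Integer> reduced = new TreeMap<>();
            for (Map.Entry<String, List<Integer>> e : part.entrySet()) {
                reduced.put(e.getKey(), reduce(e.getKey(), e.getValue()));
            }
            results.add(reduced);
        }
        return results;
    }

    public static void main(String[] args) {
        List<String> splits = Arrays.asList("the quick brown fox", "the lazy dog", "the fox");
        List<SortedMap<String, Integer>> results = run(splits, 2);
        for (int i = 0; i < results.size(); i++) {
            System.out.println("Reducer " + i + ": " + results.get(i));
        }
    }
}
```

Every occurrence of a given word lands in the same partition, which is why each reducer can compute a final count for its keys without talking to the others.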

PIG:-

  • Introduction to Apache PIG
  • MapReduce vs PIG
  • Basic PIG programming
  • Modes of Execution in PIG
    • Local Mode
    • MapReduce Mode
  • Execution Mechanisms
    • Grunt Shell
    • Script
    • Embedded
  • Operators in PIG
  • PIG UDFs

SQOOP:-

  • Introduction to SQOOP
  • Connecting to a MySQL Database
  • SQOOP commands
    • Import
    • Export
    • Eval
    • Codegen, etc.
  • Joins in SQOOP

HIVE:-

  • Introduction to HIVE
  • HIVE Architecture
  • Tables in HIVE
    • Managed Tables
    • External Tables
  • Partition
  • Joins in HIVE
  • HIVE UDFs and UDAFs

HBASE:-

  • Introduction to HBASE and Basic Configurations of HBASE
  • HBASE Architecture
  • SQL vs NOSQL
  • How HBASE Differs from an RDBMS
  • Client-side Buffering or Bulk Uploads

Cluster Setup:-

  • Downloading and Installing Hadoop
  • Creating Cluster
  • Increasing and Decreasing the Cluster Size
  • Monitoring the Cluster Health
  • Starting and Stopping the Nodes

Introduction to OOZIE, FLUME, and ZOOKEEPER, with some sample programs.