Hadoop
Hadoop Online Training
- Understanding Big Data
- What is Big Data?
- Big Data characteristics
- Hadoop Distributions
- Hortonworks
- Cloudera
- Pivotal HD
- Greenplum
- Introduction to Apache Hadoop
- Flavors of Hadoop: BigInsights, Google BigQuery, etc.
- Hadoop Eco-system components: Introduction
- MapReduce
- HDFS
- Apache Pig
- Apache Hive
- HBase
- Apache Oozie
- Flume
- Sqoop
- Apache Mahout
- Kiji
- Lucene
- Solr
- KiteSDK
- Impala
- Chukwa
- Shark
- Cascading
- Understanding Hadoop Cluster
- Hadoop Core-Components
- NameNode
- JobTracker
- TaskTracker
- DataNode
- SecondaryNameNode
- HDFS Architecture
- Why a 64 MB block size?
- Why Block?
- Why replication factor 3?
- Discuss NameNode and DataNode
- Discuss JobTracker and TaskTracker
- Typical workflow of Hadoop application
- Rack Awareness
- Network Topology
- Assignment of Blocks to Racks and Nodes
- Block Reports
- Heart Beat
- Block Management Service
- Anatomy of File Write
- Anatomy of File Read
- Heart Beats and Block Reports
- Discuss Secondary NameNode
- Usage of FsImage and Edits log
- Map Reduce Overview
- Best practices to set up a Hadoop cluster
- Cluster Configuration
- core-default.xml
- hdfs-default.xml
- mapred-default.xml
- hadoop-env.sh
- slaves
- masters
- Need for the *-site.xml files
- Map Reduce Framework
- Why Map Reduce?
- Use cases where Map Reduce is used
- Hello world program with Weather Use Case
- Setup environment for the programs
- Possible ways of writing a MapReduce program, with sample code; compare the approaches and discuss the best one
- Configured, Tool, GenericOptionsParser, and queue usage
- Demo for calculating maximum and minimum temperature
- Limitations of the traditional way of solving word count on a large dataset
- The MapReduce way of solving the problem (see the word-count sketch after this list)
- Complete overview of MapReduce
- Split Size
- Combiners
- Multi Reducers
- Parts of Map Reduce
- Algorithms
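The word-count discussion above maps directly onto code. Here is a minimal sketch of the classic Hadoop MapReduce word-count job in Java, written against the org.apache.hadoop.mapreduce API; the class names (WordCount, TokenizerMapper, IntSumReducer) are illustrative, and the input and output paths are taken from the command line.

    import java.io.IOException;
    import java.util.StringTokenizer;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.Reducer;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    public class WordCount {

      // Mapper: emit (word, 1) for every token in the input split.
      public static class TokenizerMapper extends Mapper<Object, Text, Text, IntWritable> {
        private static final IntWritable ONE = new IntWritable(1);
        private final Text word = new Text();

        @Override
        public void map(Object key, Text value, Context context)
            throws IOException, InterruptedException {
          StringTokenizer itr = new StringTokenizer(value.toString());
          while (itr.hasMoreTokens()) {
            word.set(itr.nextToken());
            context.write(word, ONE);
          }
        }
      }

      // Reducer (also reused as the combiner): sum the counts per word.
      public static class IntSumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
        @Override
        public void reduce(Text key, Iterable<IntWritable> values, Context context)
            throws IOException, InterruptedException {
          int sum = 0;
          for (IntWritable v : values) {
            sum += v.get();
          }
          context.write(key, new IntWritable(sum));
        }
      }

      public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "word count");
        job.setJarByClass(WordCount.class);
        job.setMapperClass(TokenizerMapper.class);
        job.setCombinerClass(IntSumReducer.class);   // combiner, as covered above
        job.setReducerClass(IntSumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
      }
    }

Packaged into a jar and run with hadoop jar wordcount.jar WordCount <input> <output>; the same skeleton extends to the maximum/minimum temperature demo by swapping the map and reduce logic.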
- Apache Hadoop Single Node Installation Demo
- Namenode format
- Apache Hadoop Multi Node Installation Demo
- Add nodes dynamically to a cluster, with demo
- Remove nodes dynamically from a cluster, with demo
- Safe Mode
- Hadoop cluster modes
- Standalone Mode
- Pseudo-distributed Mode
- Fully distributed mode
- Revision
- HDFS Practicals (HDFS Commands)
- Map Reduce Anatomy
- Job Submission
- Job Initialization
- Task Assignments
- Task Execution
- Schedulers
- Quiz
- Map Reduce Failure Scenarios
- Speculative Execution
- Sequence File
- Input File Formats
- Output File Formats
- Writable DataTypes
- Custom Input Formats
- Custom keys and values using Writables (see the sketch after this list)
- Walkthrough of the installation process using Cloudera Manager
- Example list: sample programs available with the installation
- Demo of the TeraGen, WordCount, and inverted index examples
- Debugging Map Reduce Programs
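As a companion to the custom keys and Writable data types listed above, this is a minimal sketch of a custom WritableComparable composite key. The StationYearKey name and its station/year fields are hypothetical; the pattern is what matters: write and read the fields in the same order, and give the key an ordering for the shuffle sort.

    import java.io.DataInput;
    import java.io.DataOutput;
    import java.io.IOException;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.io.WritableComparable;

    public class StationYearKey implements WritableComparable<StationYearKey> {
      private final Text station = new Text();
      private final IntWritable year = new IntWritable();

      public void set(String stationId, int yr) {
        station.set(stationId);
        year.set(yr);
      }

      @Override
      public void write(DataOutput out) throws IOException {
        station.write(out);     // serialize fields in a fixed order
        year.write(out);
      }

      @Override
      public void readFields(DataInput in) throws IOException {
        station.readFields(in); // deserialize in the same order
        year.readFields(in);
      }

      @Override
      public int compareTo(StationYearKey other) {
        int cmp = station.compareTo(other.station);
        return cmp != 0 ? cmp : Integer.compare(year.get(), other.year.get());
      }

      @Override
      public int hashCode() {
        // Used by the default HashPartitioner to spread keys across reducers.
        return station.hashCode() * 163 + year.get();
      }

      @Override
      public boolean equals(Object o) {
        if (!(o instanceof StationYearKey)) return false;
        StationYearKey k = (StationYearKey) o;
        return station.equals(k.station) && year.get() == k.year.get();
      }

      @Override
      public String toString() {
        return station + "\t" + year;
      }
    }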
- Map Reduce Advanced Concepts
- Partitioning and custom Partitioners (illustrated after this list)
- Joins
- Multi outputs
- Counters
- MRUnit test cases
- MR Design patterns
- Distributed Cache
- Command line implementation
- MapReduce API implementation
- Map Reduce advanced concepts: examples
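A minimal custom Partitioner sketch for the partitioning topic above, assuming Text keys and IntWritable values from a job like the word count; the routing rule (first letter of the key) is purely illustrative.

    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Partitioner;

    public class AlphabetPartitioner extends Partitioner<Text, IntWritable> {
      @Override
      public int getPartition(Text key, IntWritable value, int numPartitions) {
        if (numPartitions <= 1) {
          return 0;
        }
        // Illustrative rule: keys starting with a-m go to the first reducer,
        // everything else goes to the last one.
        String k = key.toString().toLowerCase();
        return (!k.isEmpty() && k.charAt(0) <= 'm') ? 0 : numPartitions - 1;
      }
    }

It is wired into a job with job.setPartitionerClass(AlphabetPartitioner.class) and job.setNumReduceTasks(2) or more, which is also where the multi-reducer behaviour from the earlier list comes into play.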
- Introduction to course Project
- Data loading techniques
- Hadoop Copy commands
- put, get, copyFromLocal, copyToLocal, mv, chmod, rmr, rmr -skipTrash, distcp, ls, lsr, df, du, cp, moveFromLocal, moveToLocal, text, touchz, tail, mkdir, help (a Java FileSystem equivalent of put/get appears after this list)
- Flume
- Sqoop
- Demo for Hadoop Copy Commands
- Sqoop Theory
- Demo for Sqoop
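The shell copy commands above also have Java equivalents through the HDFS FileSystem API; this short sketch mirrors put and get. The paths are hypothetical, and the Configuration object is expected to pick up core-site.xml and hdfs-site.xml from the classpath.

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class HdfsCopy {
      public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();   // reads *-site.xml from the classpath
        FileSystem fs = FileSystem.get(conf);

        // Equivalent of: hadoop fs -put /tmp/localfile.txt /data/
        fs.copyFromLocalFile(new Path("/tmp/localfile.txt"), new Path("/data/localfile.txt"));

        // Equivalent of: hadoop fs -get /data/localfile.txt /tmp/copy-back.txt
        fs.copyToLocalFile(new Path("/data/localfile.txt"), new Path("/tmp/copy-back.txt"));

        fs.close();
      }
    }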
- Need for Pig
- Why was Pig created?
- Introduction to skew Join
- Why use Pig when MapReduce is available?
- Pig use cases
- Pig built in operators
- Pig storage schema
- Operators
- Load
- Store
- Dump
- Filter
- Distinct
- Group
- CoGroup
- Join
- Stream
- Foreach Generate
- Parallel
- Limit
- ORDER
- CROSS
- UNION
- SPLIT
- Sampling
- Complex
- Bag
- Tuple
- Atom
- Map
- Integers
- Float
- Chararray
- byteArray
- Double
- Describe
- Explain
- Illustrate
- Filter Function
- Eval Function
- Macros
- Demo
- Storage Handlers
- Pig Practicals and Usecases
- Demo with a schema
- Demo without a schema (an embedded-Pig sketch follows this list)
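To keep the examples in one language, here is an embedded-Pig sketch using the PigServer Java API rather than the Grunt shell. The weather.csv input, its schema, and the output directory are assumptions; the script itself exercises LOAD with a schema, FILTER, GROUP, FOREACH ... GENERATE, and STORE from the operator list above.

    import org.apache.pig.ExecType;
    import org.apache.pig.PigServer;

    public class EmbeddedPig {
      public static void main(String[] args) throws Exception {
        // LOCAL mode for a quick test; ExecType.MAPREDUCE would run on the cluster.
        PigServer pig = new PigServer(ExecType.LOCAL);
        pig.registerQuery("records = LOAD 'weather.csv' USING PigStorage(',') "
            + "AS (station:chararray, yr:int, temp:int);");
        pig.registerQuery("valid = FILTER records BY temp IS NOT NULL;");
        pig.registerQuery("grouped = GROUP valid BY station;");
        pig.registerQuery("max_temp = FOREACH grouped GENERATE group, MAX(valid.temp);");
        pig.store("max_temp", "max_temp_out");   // writes the result relation to a directory
      }
    }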
- Hive Background
- What is Hive?
- Pig Vs Hive
- Where to Use Hive?
- Hive Architecture
- Metastore
- Hive execution modes
- External, Managed, Native, and Non-native tables
- Hive Partitions
- Dynamic Partitions
- Static Partitions
- Buckets
- Hive DataModel
- Hive DataTypes
- Primitive
- Complex
- Create Managed Table
- Load Data
- Insert overwrite table
- Insert into Local directory
- CTAS
- Insert Overwrite table select
- Inner Joins
- Outer Joins
- Skew Joins
- Multi-table Inserts
- Multiple files, directories, table inserts
- SerDe
- View
- Index
- UDF
- UDAF
- Hive Practicals (a JDBC sketch follows)
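For the Hive practicals above, a hedged Java sketch using JDBC against HiveServer2. The connection string (localhost:10000, the default database), the credentials, the weather table, and the /data/weather.csv path are all assumptions; the statements illustrate CREATE TABLE, LOAD DATA, and a grouped SELECT in HiveQL.

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.ResultSet;
    import java.sql.Statement;

    public class HiveJdbcExample {
      public static void main(String[] args) throws Exception {
        Class.forName("org.apache.hive.jdbc.HiveDriver");
        // Assumed HiveServer2 endpoint and credentials.
        try (Connection con = DriverManager.getConnection(
                 "jdbc:hive2://localhost:10000/default", "hive", "");
             Statement stmt = con.createStatement()) {

          stmt.execute("CREATE TABLE IF NOT EXISTS weather (station STRING, yr INT, temp INT) "
              + "ROW FORMAT DELIMITED FIELDS TERMINATED BY ','");
          stmt.execute("LOAD DATA INPATH '/data/weather.csv' OVERWRITE INTO TABLE weather");

          try (ResultSet rs = stmt.executeQuery(
                   "SELECT station, MAX(temp) FROM weather GROUP BY station")) {
            while (rs.next()) {
              System.out.println(rs.getString(1) + "\t" + rs.getInt(2));
            }
          }
        }
      }
    }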
- Oozie Architecture
- Workflow designing in Oozie
- Oozie practicals
- YARN Architecture
- Hadoop Classic vs YARN
- YARN Demo
- Flume Architecture
- Flume Practicals
- ZooKeeper
- Introduction to NoSQL Databases
- The NoSQL Landscape
- Introduction to HBase
- HBase vs RDBMS
- Create a table in HBase using the HBase shell
- Where to use HBase?
- Where not to use HBase?
- Write files to HBase (see the Java client sketch after this list)
- Major Components of HBase
- HBase Master
- HRegionServer
- HBase Client
- Zookeeper
- Region
- HBase Practicals
- HBase -ROOT- catalog table
- CAP Theorem
- Compaction
- Sharding
- Sparse Datastore
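A minimal Java client sketch for writing to HBase, as referenced above. The weather table, the cf column family, and the row-key format are hypothetical, and hbase-site.xml is assumed to be on the classpath so the client can locate ZooKeeper.

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.TableName;
    import org.apache.hadoop.hbase.client.Connection;
    import org.apache.hadoop.hbase.client.ConnectionFactory;
    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.client.Table;
    import org.apache.hadoop.hbase.util.Bytes;

    public class HBasePutExample {
      public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();   // picks up hbase-site.xml
        try (Connection connection = ConnectionFactory.createConnection(conf);
             Table table = connection.getTable(TableName.valueOf("weather"))) {
          Put put = new Put(Bytes.toBytes("station1-2014"));              // row key
          put.addColumn(Bytes.toBytes("cf"), Bytes.toBytes("max_temp"),   // family:qualifier
              Bytes.toBytes("42"));                                       // value
          table.put(put);
        }
      }
    }

The table and column family must already exist, for example created from the HBase shell with create 'weather', 'cf'.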
- Cassandra Architecture
- Bigtable and Dynamo
- Distributed hash tables, P2P, fault tolerance
- Data Modelling
- Column Families
- Installation Demo on Cassandra
- Practicals
- Real time Project Analysis
- Design
- Implementation
- Execution
- Debugging
- Optimization Techniques
- Which one to use where
- Amazon Web Services (Hadoop on the Cloud): multi-node installation
- EMR and S3
- Storm Architecture
- Real time use case with Storm
- Spark
- What is Spark?
- Understanding Spark
- Spark Architecture
- RDD
- Hadoop RDD
- RDDs Partitioning
- Lazy Evaluation
- Caching
- Spark Context
- Transformations: map, flatMap, filter
- Actions
- Serialization
- Scala
- Scala Features
- Scala Functions
- Collections and Combiners
- Spark with Scala
- Spark with Yarn
- Spark on Cluster mode
- Spark CLI
- Spark programming with the Java API (see the sketch after this list)
- Spark Streaming
- Spark SQL
- Spark SQL Context
- Spark SQL with Hive
- Spark MLlib algorithms (K-Means clustering, ...)
- Spark GraphX Overview
- Hands On and Usecases
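A short word-count sketch with the Spark Java API mentioned above (Spark 2.x style, where flatMap returns an Iterator), tying together an RDD, transformations (flatMap, mapToPair, reduceByKey), an action (saveAsTextFile), and local mode. The input and output paths come from the command line; setMaster("local[*]") is only for the sketch and would normally be left to spark-submit.

    import java.util.Arrays;
    import org.apache.spark.SparkConf;
    import org.apache.spark.api.java.JavaPairRDD;
    import org.apache.spark.api.java.JavaRDD;
    import org.apache.spark.api.java.JavaSparkContext;
    import scala.Tuple2;

    public class SparkWordCount {
      public static void main(String[] args) {
        SparkConf conf = new SparkConf().setAppName("SparkWordCount").setMaster("local[*]");
        JavaSparkContext sc = new JavaSparkContext(conf);

        JavaRDD<String> lines = sc.textFile(args[0]);          // lazily evaluated, as discussed above
        JavaRDD<String> words = lines.flatMap(l -> Arrays.asList(l.split("\\s+")).iterator());
        JavaPairRDD<String, Integer> counts = words
            .mapToPair(w -> new Tuple2<>(w, 1))
            .reduceByKey((a, b) -> a + b);

        counts.saveAsTextFile(args[1]);                        // the action triggers the computation
        sc.stop();
      }
    }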
- Impala Architecture
- Impala Practicals
- Adhoc Querying in Impala
- Compression Techniques (a job configuration sketch follows this list)
- Snappy
- LZO
- Bzip2
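A small configuration sketch in Java for the compression techniques above, shown against the MapReduce Job API. It turns on Snappy for intermediate map output and for the final job output; the helper class name is hypothetical and Snappy is just one of the codecs listed.

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.io.compress.CompressionCodec;
    import org.apache.hadoop.io.compress.SnappyCodec;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    public class CompressionConfig {
      public static void configure(Job job) {
        Configuration conf = job.getConfiguration();

        // Compress intermediate map output to cut shuffle traffic.
        conf.setBoolean("mapreduce.map.output.compress", true);
        conf.setClass("mapreduce.map.output.compress.codec", SnappyCodec.class, CompressionCodec.class);

        // Compress the final job output as well.
        FileOutputFormat.setCompressOutput(job, true);
        FileOutputFormat.setOutputCompressorClass(job, SnappyCodec.class);
      }
    }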
- Image processing in Hadoop
- Certification Preparation Guidelines
- Best practices to set up a Hadoop cluster
- Commissioning and Decommissioning Nodes
- Benchmarking the Hadoop cluster
- Admin monitoring tools
- Routine Admin tasks
- Kafka Architecture
- Kafka use-case execution (a Java producer sketch follows)
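A minimal Kafka producer sketch in Java for the use case above. The broker address (localhost:9092), the topic name, and the message contents are assumptions; the properties shown are the standard bootstrap.servers and serializer settings.

    import java.util.Properties;
    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.Producer;
    import org.apache.kafka.clients.producer.ProducerRecord;

    public class SimpleProducer {
      public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");   // assumed broker address
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");

        try (Producer<String, String> producer = new KafkaProducer<>(props)) {
          // Hypothetical topic and message for illustration.
          producer.send(new ProducerRecord<>("clickstream", "user42", "page=/home"));
        }
      }
    }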