Hadoop

Hadoop Online Training 

  • Understanding Big Data
  • What is Big Data?
  • Big Data characteristics
  • Hadoop Distributions
  • Hortonworks
  • Cloudera
  • Pivotal HD
  • Greenplum
  • Introduction to Apache Hadoop
  • Flavors of Hadoop: BigInsights, Google Query, etc.
  • Hadoop Eco-system components: Introduction
  • MapReduce
  • HDFS
  • Apache Pig
  • Apache Hive
  • HBase
  • Apache Oozie
  • Flume
  • Sqoop
  • Apache Mahout
  • Kiji
  • Lucene
  • Solr
  • Kite SDK
  • Impala
  • Chukwa
  • Shark
  • Cascading
  • Understanding Hadoop Cluster
  • Hadoop Core-Components
  • NameNode
  • JobTracker
  • TaskTracker
  • DataNode
  • SecondaryNameNode
  • HDFS Architecture
  • Why 64MB?
  • Why Block?
  • Why replication factor 3?
  • Discuss NameNode and DataNode
  • Discuss JobTracker and TaskTracker
  • Typical workflow of Hadoop application
  • Rack Awareness
  • Network Topology
  • Assignment of Blocks to Racks and Nodes
  • Block Reports
  • Heartbeat
  • Block Management Service
  • Anatomy of File Write
  • Anatomy of File Read
  • Heartbeats and Block Reports
  • Discuss Secondary NameNode
  • Usage of FsImage and edits log
  • MapReduce Overview
  • Best practices to set up a Hadoop cluster
  • Cluster Configuration
    • core-default.xml
    • hdfs-default.xml
    • mapred-default.xml
    • hadoop-env.sh
    • slaves
    • masters
  • Need for *-site.xml
  • MapReduce Framework
  • Why MapReduce?
  • Use cases where MapReduce is used
  • Hello world program with Weather Use Case
    • Set up the environment for the programs
    • Possible ways of writing a MapReduce program, with sample code; find the best code and discuss
    • Configured, Tool, GenericOptionsParser and queues usage
    • Demo for calculating maximum and minimum temperature
  • Limitations of the traditional way of solving word count with a large dataset
  • MapReduce way of solving the problem
  • Complete overview of MapReduce
  • Split Size
  • Combiners
  • Multi Reducers
  • Parts of MapReduce
  • Algorithms
  • Apache Hadoop Single Node Installation Demo
  • NameNode format
  • Apache Hadoop Multi Node Installation Demo
  • Add nodes dynamically to a cluster with Demo
  • Remove nodes dynamically from a cluster with Demo
  • Safe Mode
  • Hadoop cluster modes
    • Standalone mode
    • Pseudo-distributed mode
    • Fully distributed mode
  • Revision
  • HDFS Practicals (HDFS Commands)
  • MapReduce Anatomy
    • Job Submission
    • Job Initialization
    • Task Assignments
    • Task Execution
  • Schedulers
  • Quiz
  • MapReduce Failure Scenarios
  • Speculative Execution
  • Sequence File
  • Input File Formats
  • Output File Formats
  • Writable DataTypes
  • Custom Input Formats
  • Custom keys and values using Writables
  • Walk through the installation process using Cloudera Manager
  • Example list: show the sample example list for the installation
  • Demo on teragen, wordcount, inverted index, examples
  • Debugging MapReduce Programs
  • MapReduce Advanced Concepts
  • Partitioning and Custom Partitioner
  • Joins
  • Multi outputs
  • Counters
  • MRUnit test cases
  • MR Design patterns
  • Distributed Cache
    • Command line implementation
    • MapReduce API implementation
  • MapReduce Advanced Concepts examples
  • Introduction to course Project
  • Data loading techniques
    • Hadoop Copy commands
    • put, get, copyFromLocal, copyToLocal, mv, chmod, rmr, rmr -skipTrash, distcp, ls, lsr, df, du, cp, moveFromLocal, moveToLocal, text, touchz, tail, mkdir, help
    • Flume
    • Sqoop
  • Demo for Hadoop Copy Commands
  • Sqoop Theory
  • Demo for Sqoop
  • Need for Pig?
  • Why was Pig created?
  • Introduction to skew Join
  • Why go for Pig when MapReduce is there?
  • Pig use cases
  • Pig built-in operators
  • Pig store schema
  • Operators
    • Load
    • Store
    • Dump
    • Filter
    • Distinct
    • Group
    • CoGroup
    • Join
    • Stream
    • Foreach Generate
    • Parallel
    • Limit
    • Order
    • Cross
    • Union
    • Split
    • Sampling
  • Dump vs Store
  • DataTypes
    • Complex
    • Bag
    • Tuple
    • Atom
    • Map
  • Primitives
    • Integers
    • Float
    • Chararray
    • Bytearray
    • Double
  • Diagnostic Operators
    • Describe
    • Explain
    • Illustrate
  • UDFs
    • Filter Function
    • Eval Function
    • Macros
    • Demo
  • Storage Handlers
  • Pig Practicals and Use Cases
  • Demo using schema
  • Demo without schema
  • Hive Background
  • What is Hive?
  • Pig vs Hive
  • Where to Use Hive?
  • Hive Architecture
  • Metastore
  • Hive execution modes
  • External, Managed, Native and Non-native tables
  • Hive Partitions
    • Dynamic Partitions
    • Static Partitions
  • Buckets
  • Hive DataModel
  • Hive DataTypes
    • Primitive
    • Complex
  • Queries
    • Create Managed Table
    • Load Data
    • Insert overwrite table
    • Insert into Local directory
    • CTAS
    • Insert Overwrite table select
  • Joins
    • Inner Joins
    • Outer Joins
    • Skew Joins
  • Multi-table Inserts
  • Multiple files, directories, table inserts
  • SerDe
  • View
  • Index
  • UDF
  • UDAF
  • Hive Practicals
  • Oozie Architecture
  • Workflow designing in Oozie
  • Oozie practicals
  • YARN Architecture
  • Hadoop Classic vs YARN
  • YARN Demo
  • Flume Architecture
  • Flume Practicals
  • ZooKeeper
  • Introduction to NoSQL Databases
  • NoSQL Landscape
  • Introduction to HBase
  • HBase vs RDBMS
  • Create Table on HBase using HBase shell
  • Where to use HBase?
  • Where not to use HBase?
  • Write Files to HBase
  • Major Components of HBase
    • HBase Master
    • HRegionServer
    • HBase Client
    • ZooKeeper
    • Region
  • HBase Practicals
  • HBase -ROOT- Catalog table
  • CAP Theorem
  • Compaction
  • Sharding
  • Sparse Datastore
  • Cassandra Architecture
  • Big Table and Dynamo
  • Distributed Hash Table, P2P Fault Tolerant
    • Data Modelling
    • Column Families
  • Installation Demo on Cassandra
  • Practicals
  • Real time Project Analysis
  • Design
  • Implementation
  • Execution
  • Debugging
  • Optimization Techniques
  • Which one to use where
  • Amazon Web Services (Hadoop on Cloud): multi-node installations
  • EMR and S3
  • Storm Architecture
  • Real time use case with Storm
  • Spark
    • What is Spark?
    • Understanding Spark
    • Spark Architecture
    • RDD
    • Hadoop RDD
    • RDDs Partitioning
    • Lazy Evaluation
    • Caching
    • Spark Context
    • Map, flatMap, filter
    • Actions
    • Serialization
    • Scala
    • Scala Features
    • Scala Functions
    • Collections and Combiners
    • Spark with Scala
    • Spark with YARN
    • Spark on Cluster mode
    • Spark CLI
    • Spark programming with Java API
    • Spark Streaming
    • Spark SQL
    • Spark SQL Context
    • Spark SQL with Hive
    • Spark MLlib Algorithms (K-Means, Clustering, ...)
    • Spark GraphX Overview
    • Hands On and Use Cases
  • Impala Architecture
  • Impala Practicals
  • Ad hoc Querying in Impala
  • Compression Techniques
    • Snappy
    • LZO
    • Bgzip
  • Image processing in Hadoop
  • Certification Preparation Guidelines
  • Best practices to set up a Hadoop cluster
  • Commissioning and Decommissioning Nodes
  • Benchmarking the Hadoop cluster
  • Admin monitoring tools
  • Routine Admin tasks
  • Kafka Architecture
  • Kafka Use Case Execution
