Big Data with Hadoop Course
Duration: 80 Hrs.
Prerequisite: Strong knowledge of Java & J2EE
- Introduction to Big Data and Hadoop
- What Big Data is and its impact
- How Hadoop helps manage & process Big Data
- Hadoop Ecosystem
- Getting Started with Hadoop
- Hadoop Architecture
- Hadoop Components
- HDFS
- Hadoop Deployment
- Get comfortable with a single-node cluster (no setup); understand the configuration of important files
- Data loading & processing
- Introduction to MapReduce
- Mapper
- Reducer
- Driver
- Input split, Partitioner, Combiner, Shuffling
- Advanced HDFS and MapReduce
- Pig
- About Pig
- Pig Latin scripting
- Complex data types
- When to use Pig instead of MapReduce
- Operations & transformations
- Compilation
- Load
- Filter, Join, Foreach
- Hadoop scripting
- Hive
- About Hive
- Managed tables
- External tables
- Complex data types
- Execution engine
- Partitioning & bucketing
- Hive queries (sorting, aggregation, joins)
- Commercial Distribution of Hadoop
- Theory-only introduction to ZooKeeper and Flume
- Ecosystem and its Components
- Introduction to MongoDB
- Overview, advantages
- MongoDB environment
- Create database
- Drop database
- Create Collection
- Drop Collection
- Data types