TÜV Rheinland Certified
+91 995 887 3874

Big Data, Hadoop, Hive, Spark and Data Science Training

The Apache Hadoop software library is a framework that allows for the distributed processing of large data sets across clusters of computers using simple programming models. It is designed to scale up from single servers to thousands of machines, each offering local computation and storage. Rather than relying on hardware to deliver high availability, the library itself is designed to detect and handle failures at the application layer, thereby delivering a highly available service on top of a cluster of computers, each of which may be prone to failure.

Big Data & Data Science Courses

Certified Hadoop Architect Engineer

Certified Big Data Engineer

Certified Data Scientist

Understanding Big Data and Hadoop

  • Limitations and Solutions of existing Data Analytics Architecture
  • Hadoop Features
  • Hadoop Ecosystem
  • Hadoop 2.x core components
  • Hadoop Storage: HDFS
  • Hadoop Processing: MapReduce Framework
  • Different Hadoop Distributions

Hadoop Architecture and HDFS

  • Hadoop 2.x Cluster Architecture
  • Federation and High Availability
  • A Typical Production Hadoop Cluster
  • Hadoop Cluster Modes
  • Common Hadoop Shell Commands
  • Hadoop 2.x Configuration Files
  • Single-Node and Multi-Node Cluster Setup, Hadoop Administration

Hadoop MapReduce Framework

  • MapReduce Use Cases
  • Hadoop 2.x MapReduce Architecture
  • YARN MR Application Execution Flow
  • Anatomy of MapReduce Program
  • Input Splits
  • Relation between Input Splits and HDFS Blocks
  • MapReduce: Combiner & Partitioner
  • Counters, Distributed Cache
  • MRUnit, Reduce-Side Join
  • Custom Input Format
  • SequenceFile Input Format
  • XML File Parsing Using MapReduce
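
As a rough illustration of how the mapper, combiner, partitioner and reducer topics listed above fit together, the following single-process Python sketch simulates a word count job. It does not use the Hadoop API. The function names (`map_phase`, `combine`, `partition`, `reduce_phase`, `run_job`), the CRC32-based partitioning and the sample input are all assumptions made for this example.

```python
from collections import defaultdict
import zlib


def map_phase(line):
    """Mapper: emit (word, 1) for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def combine(pairs):
    """Combiner: pre-aggregate counts on the map side to shrink shuffle traffic."""
    local = defaultdict(int)
    for word, count in pairs:
        local[word] += count
    return list(local.items())


def partition(key, num_reducers):
    """Partitioner: route a key to a reducer using a stable hash (assumed scheme)."""
    return zlib.crc32(key.encode("utf-8")) % num_reducers


def reduce_phase(word, counts):
    """Reducer: sum all partial counts received for one word."""
    return word, sum(counts)


def run_job(splits, num_reducers=2):
    # One bucket per reducer; each bucket groups values by key (the shuffle).
    buckets = [defaultdict(list) for _ in range(num_reducers)]
    for split in splits:  # each split is handled by one simulated map task
        mapped = [pair for line in split for pair in map_phase(line)]
        for word, count in combine(mapped):
            buckets[partition(word, num_reducers)][word].append(count)
    result = {}
    for bucket in buckets:  # each bucket is handled by one simulated reduce task
        for word in sorted(bucket):
            key, total = reduce_phase(word, bucket[word])
            result[key] = total
    return result


# Hypothetical input: two splits, each a list of text lines.
splits = [["big data big cluster"], ["data node", "big data"]]
counts = run_job(splits)
print(counts)
```

In this sketch every occurrence of a given word reaches the same simulated reducer, because the partitioner depends only on the key. The combiner only reduces the number of pairs sent through the shuffle and does not change the final totals.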

Hive

  • Hive Background
  • Hive vs Pig
  • Hive Architecture and Components
  • Metastore in Hive, Limitations of Hive
  • Comparison with Traditional Database
  • Hive Data Types and Data Models
  • Partitions and Buckets
  • Hive Tables (Managed Tables and External Tables)
  • Importing Data, Querying Data, Managing Outputs
  • Hive Script, Hive UDF
  • Retail Use Case in Hive
  • Hive Demo on Healthcare Data Set
  • HiveQL: Joining Tables, Dynamic Partitioning
  • Custom Map/Reduce Scripts
  • Hive Indexes and Views, Hive Query Optimizers
  • Hive: Thrift Server, User Defined Functions
  • HBase: Introduction to NoSQL Databases and HBase, HBase vs RDBMS, HBase Components, HBase Architecture, Run Modes & Configuration, HBase Cluster Deployment

HBase

  • HBase Data Model
  • HBase Shell
  • HBase Client API
  • Data Loading Techniques
  • ZooKeeper Data Model
  • ZooKeeper Service
  • ZooKeeper, Demos on Bulk Loading
  • Getting and Inserting Data, Filters in HBase

Apache Spark & Scala

  • What is Apache Spark
  • Spark Ecosystem
  • Spark Components
  • Spark as a Polyglot
  • Why Scala
  • SparkContext
  • RDD

Apache Pig

  • About Pig
  • MapReduce vs Pig
  • Programming Structure in Pig
  • Pig Running Modes
  • Pig components, Pig Execution
  • Pig Latin Program, Data Models in Pig
  • Pig Data Types, Shell and Utility Commands
  • Pig Latin: Relational Operators, File Loaders, Group Operator, COGROUP Operator, Joins and COGROUP, Union, Diagnostic Operators
  • Specialized Joins in Pig
  • Built-in Functions (Eval Functions, Load and Store Functions, Math Functions, String Functions, Date Functions)
  • Pig UDF, Piggybank
  • Parameter Substitution (Pig Macros and Pig Parameter Substitution)
  • Pig Streaming
  • Testing Pig Scripts with PigUnit
  • Aviation Use Case in Pig
  • Pig Demo on Healthcare Data Set

Oozie, Sqoop and Flume

  • Flume and Sqoop
  • Oozie Components, Oozie Workflow
  • Scheduling with Oozie
  • Oozie Co-ordinator
  • Oozie Commands, Oozie Web Console
  • Oozie for MapReduce, Pig, Hive, and Sqoop
  • Combined Flow of MapReduce, Pig, and Hive in Oozie
  • Hadoop Project Demo
  • Hadoop Integration with Talend