5 Months (weekend) | Live Online Classes
Eligibility: Graduates & Working Professionals
The IITM Pravartak and AWS certified Advanced Professional course in Big Data and Cloud Analytics is a leading-edge technology program paving your way to a lucrative career. It is an integrated course directed by GUVI, an IIT Madras incubated company. Taught by the industry's best technical experts and founders (ex-PayPal employees), the program offers mentorship from Data Engineering experts and connects you to Fortune 500 companies. The vision is to help premium organizations discover the right talent.
Zen Class is one of the industry's leading project-based career programs offered by GUVI, promising placement assistance on completing the course. Conducted by an IIT Madras incubated company and designed by our founders (ex-PayPal employees), it also offers mentoring from experts at companies like Google, Microsoft, Flipkart, Zoho & Freshworks to place you in top companies with high salaries.
Check the only eligibility criterion we have for our Zen Class: Students & Working Professionals seeking opportunities to upskill their Data Engineering proficiency for faster career growth.
"Most Trusted Vernacular Edtech Brand"
Awarded by ZEE Digital during ZEE National Achievers Awards 2022.
AI-for-India 1.0 - Guinness World Record Holder
Broke the record for the most users taking an online computer programming lesson in 24 hours.
"Best Online Personalised Learning Programs"
Awarded by ENTREPRENEUR INDIA for having the best online personalized learning programs
Get personalized mentorship and guidance from industry experts who work in leading companies.
Get introduced to the different aspects of Data Engineering and stay abreast of the advanced tools and technologies that top Data Engineers rely on every day. Master the Data Engineering skills that today's industry leaders follow!
Students explore Python, a versatile and beginner-friendly programming language. Python is known for its readability and wide range of applications, from web development and data analysis to artificial intelligence and automation. It offers a rich ecosystem of libraries and tools, making it a popular choice for both novice and experienced programmers.
● Why Python?
● Python IDE
● Hello World Program
● Variables & Names
● String Basics
● List
● Tuple
● Dictionaries
● Conditional Statements
● For and While Loops, Try and Except
● Numbers and Math Functions
● Common Errors in Python
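A minimal sketch tying these basics together; the names and values below are illustrative, not from the course material:

```python
# Variables, strings, lists, tuples, dictionaries, conditionals,
# loops, and try/except in one small runnable example.
name = "GUVI"                                # variable holding a string
scores = [72, 85, 91]                        # list
point = (3, 4)                               # tuple
student = {"name": name, "scores": scores}   # dictionary

for s in scores:                             # for loop + conditional
    if s >= 80:
        print(f"{student['name']} scored well: {s}")

count = 0
while count < len(scores):                   # while loop
    count += 1

try:
    average = sum(scores) / count            # numbers and math
except ZeroDivisionError:                    # handling a common error
    average = 0.0
print(f"average: {average:.1f}, point: {point}")
```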
Students will dive into advanced concepts like comprehensions, file handling, regular expressions, object-oriented programming, pickling, and many more essential topics, as sketched after the list below.
● Functions, Lambda, Filter and Map
● Functions as Arguments
● List Comprehension
● Debugging in Python
● Classes and Objects
● Inheritance, Polymorphism, Abstraction
● Linear and Non-Linear Data Structures
● Singly, Doubly, and Circular Linked Lists; Binary Trees
● Bubble, Insertion, Merge, Quick, and Heap Sort
● File Handling (Text, JSON, CSV)
● Iterators
● Pickling, Multithreading
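A compact sketch of several of these topics together; the Shape/Square classes are hypothetical examples, not course code:

```python
# Lambda/filter/map, list comprehensions, classes with inheritance
# and polymorphism, and pickling objects to disk.
import pickle

nums = [1, 2, 3, 4, 5]
evens = list(filter(lambda n: n % 2 == 0, nums))  # lambda + filter
squares = [n * n for n in nums]                   # list comprehension
doubled = list(map(lambda n: n * 2, nums))        # map

class Shape:                                      # base class
    def area(self):
        raise NotImplementedError

class Square(Shape):                              # inheritance; area() is polymorphic
    def __init__(self, side):
        self.side = side

    def area(self):
        return self.side ** 2

shapes = [Square(2), Square(3)]
print([s.area() for s in shapes], evens, squares, doubled)

with open("shapes.pkl", "wb") as f:               # pickling objects to a file
    pickle.dump(shapes, f)
```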
Students dive into SQL (Structured Query Language) to acquire the skills needed for
managing and querying relational databases. SQL enables them to retrieve, update, and
manipulate data, making it a fundamental tool for working with structured data in
various applications.
Sub module 1 (4 Hr)
Sub module 2 (2 Hr)
Students explore RDBMS (Relational Database Management System) to understand the
database technology that organizes data into structured tables with defined
relationships.
● MySQL
● SQL Keys
● PRIMARY KEY
● FOREIGN KEY
● UNIQUE KEY
● Composite Key
● Triggers
● Indexes
● Transactions
● Views
● Load Balancers and High Availability
● Horizontal vs Vertical Scaling
● Monolithic vs Microservices
● Distributed Messaging Services and AWS SQS
● CDN (Content Delivery Network)
● Caching, Scalability
● AWS API Gateway
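The course works with MySQL; the sketch below uses Python's built-in sqlite3 purely so the key concepts run without a server, and the table names are hypothetical:

```python
# PRIMARY KEY, FOREIGN KEY, UNIQUE KEY, and a composite key, shown with
# sqlite3 as a stand-in for MySQL (SQLite needs foreign keys enabled).
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")

conn.execute("""
    CREATE TABLE students (
        id    INTEGER PRIMARY KEY,        -- primary key
        email TEXT UNIQUE NOT NULL        -- unique key
    )""")
conn.execute("""
    CREATE TABLE enrollments (
        student_id INTEGER REFERENCES students(id),  -- foreign key
        course     TEXT,
        PRIMARY KEY (student_id, course)             -- composite key
    )""")

conn.execute("INSERT INTO students (id, email) VALUES (1, 'a@example.com')")
conn.execute("INSERT INTO enrollments VALUES (1, 'Data Engineering')")

for row in conn.execute("""
        SELECT s.email, e.course
        FROM students s JOIN enrollments e ON e.student_id = s.id"""):
    print(row)
```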
Students delve into MongoDB to understand this popular NoSQL database, which stores
data in flexible, JSON-like documents. They learn how MongoDB's scalability and speed
make it suitable for handling large volumes of unstructured data.
Sub module 1 (4 Hr)
● CAP Theorem
● Structured and Unstructured Data
● OLTP vs OLAP
● Schema vs Schemaless
● Dimensional Modelling
● Cluster Setup and Monitoring
● Insert First Data
● CRUD Operations
● Insert Many
● Update and Update Many
● Delete and Delete Many
Sub module 2 (4 Hr)
● Projection
● Intro to Embedded Documents
● Embedded Documents in Action
● Adding Arrays
● Fetching Data from Structured Data
● Schema Types
● Types of Data in MongoDB
● Relationships between Data
● Aggregation
Sub module 3 (2 Hr)
● One to One using Embed Method
● One to One using Reference Method
● One to Many using Embed Method
● One to Many using Reference Method
● Assessment: MongoDB
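A minimal CRUD-and-aggregation sketch with pymongo, assuming a MongoDB server on localhost:27017; the database and collection names are placeholders:

```python
# Insert, update, delete, projection, and aggregation with pymongo.
from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017")
movies = client["zen_demo"]["movies"]            # hypothetical db/collection

movies.insert_one({"title": "Inception", "year": 2010})   # insert first data
movies.insert_many([                                       # insert many
    {"title": "Interstellar", "year": 2014},
    {"title": "Tenet", "year": 2020},
])
movies.update_one({"title": "Tenet"},                      # update
                  {"$set": {"director": "Nolan"}})

# Projection: fetch only the title field
for doc in movies.find({"year": {"$gte": 2014}}, {"title": 1, "_id": 0}):
    print(doc)

# Aggregation: count movies per year
print(list(movies.aggregate([{"$group": {"_id": "$year", "n": {"$sum": 1}}}])))

movies.delete_many({})                                     # delete many
```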
Students explore shell scripting in the Linux environment, where they learn to write and
execute scripts using the command-line interface. Shell scripts are text files containing a series of commands, and students discover how to automate tasks
● Introduction to Linux
● Basic Shell script commands
● Creating Frameworks
● Cron jobs, Email alerts
● Running Batch jobs
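This module itself uses shell scripts; to keep every example here in one language, below is a rough Python analogue of a cron-driven batch job (the command and log path are placeholders):

```python
# Run a batch command, then append a timestamped status line and its
# output to a log file - the kind of job you would schedule with cron,
# e.g.  0 2 * * * /usr/bin/python3 /path/to/batch_job.py
import subprocess
from datetime import datetime

result = subprocess.run(["df", "-h"], capture_output=True, text=True)

with open("batch.log", "a") as log:
    log.write(f"[{datetime.now().isoformat()}] exit={result.returncode}\n")
    log.write(result.stdout)
```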
Students study Git, a distributed version control system, to learn how it tracks changes in software code. Git allows collaborative development, enabling multiple people to work on the same project simultaneously while managing different versions of code. It is essential for software development, as it tracks revisions, facilitates collaboration, and helps in code management.
● Introduction to Git
● Git commands
● Cloning a repository in VS Code
● Working with cloning, branches, commit, push, add, and merge from VS Code
Students delve into cloud computing, which involves delivering various computing
services (such as servers, storage, databases, networking, software, and analytics) over
the internet.
Sub module 1 (4 Hr)
● Introduction to Cloud
● AWS Services overview
● Server vs serverless
● IAM, Roles, Policies
● EC2, VMs
● S3
● RDS: MySQL free-tier database
● Integrating RDS with the local system and with a Python environment
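A minimal S3 sketch with boto3, assuming AWS credentials are already configured locally; the bucket and file names are placeholders:

```python
# List buckets and upload a file to S3 with boto3.
import boto3

s3 = boto3.client("s3")

# Buckets visible to the configured credentials
for bucket in s3.list_buckets()["Buckets"]:
    print(bucket["Name"])

# Upload a local file into a bucket under a key (placeholder names)
s3.upload_file("report.csv", "my-demo-bucket", "reports/report.csv")
```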
Sub module 2 (4 Hr)
● Cloud Data Warehouse
● Cloud Data Lake
● Cloud Database (DynamoDB)
● Lambda
● CloudWatch
● Integrating all the above components with RDS
● Monitoring an ETL pipeline with Step Functions
● Glue, Data Crawler, Athena
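A minimal AWS Lambda handler sketch; the event shape below is a hypothetical example of the records an S3 or SQS trigger would pass in:

```python
# Lambda calls this handler with the triggering event and a context
# object; print() output lands in CloudWatch Logs.
import json

def lambda_handler(event, context):
    records = event.get("Records", [])           # e.g. S3/SQS records
    print(f"received {len(records)} record(s)")
    return {
        "statusCode": 200,
        "body": json.dumps({"processed": len(records)}),
    }
```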
Students study Snowflake to grasp modern cloud-based data warehousing, focusing on its architecture, data sharing, scalability, and data analytics applications.
● Introduction to Snowflake
● Difference between Data Lake, Data Warehouse, Delta Lake, and Database
● Dimension and Fact Tables
● Roles and Users
● Data Modeling, Snowpipe
● MOLAP and ROLAP
● Partitioning and Indexing
● Data Marts, Data Cubes & Caching
● Data Masking
● Handling JSON Files
● Data Loading from S3 and Transformation
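A minimal connection sketch using the snowflake-connector-python package; every credential and object name below is a placeholder:

```python
# Connect to Snowflake and run a query via a cursor.
import snowflake.connector

conn = snowflake.connector.connect(
    account="your_account_identifier",   # placeholders throughout
    user="your_user",
    password="your_password",
    warehouse="COMPUTE_WH",
    database="DEMO_DB",
    schema="PUBLIC",
)
cur = conn.cursor()
try:
    cur.execute("SELECT CURRENT_VERSION()")
    print(cur.fetchone())
finally:
    cur.close()
    conn.close()
```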
Students explore Airflow to understand its role in orchestrating and automating
workflows, scheduling tasks, managing data pipelines, and monitoring job execution.
● Why and What Is Airflow
● Airflow UI
● Running Your First DAG
● Grid View
● Graph View
● Landing Times View
● Calendar View
● Gantt View
● Code View
● Core Concepts of Airflow
● DAGs
● Scope
● Operators
● Control Flow
● Tasks and Task Instances
● Database and Executors
● ETL/ELT Process Implementation
● Monitoring an ETL Pipeline with Airflow
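A minimal DAG sketch for Airflow 2.x (older releases spell the schedule argument schedule_interval); the task callables are hypothetical placeholders:

```python
# Two PythonOperator tasks wired extract >> load, run daily.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator

def extract():
    print("pulling source data")      # placeholder extract step

def load():
    print("writing to warehouse")     # placeholder load step

with DAG(
    dag_id="etl_sketch",
    start_date=datetime(2024, 1, 1),
    schedule="@daily",                # schedule_interval on Airflow < 2.4
    catchup=False,
) as dag:
    extract_task = PythonOperator(task_id="extract", python_callable=extract)
    load_task = PythonOperator(task_id="load", python_callable=load)
    extract_task >> load_task         # control flow: extract before load
```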
Students delve into big data to learn about handling and analyzing vast datasets, using tools like Hadoop, Hive, HDFS, and Pig for insights and decision-making.
Sub module 1 (4 Hr)
● Installing Hive and MySQL locally
● Running Hive queries to integrate the local and HDFS file systems
● Installing Pig
● Working with Pig scripts and integrating with the local and HDFS file systems
● Installing HBase and working with HBase queries
● Installing Cassandra and working with Cassandra
Sub module 2 (4 Hr)
● Installing Sqoop and Flume and performing data migration:
● Local RDBMS to HDFS
● Local RDBMS to Hive
● Local RDBMS to HBase
● HDFS to Local RDBMS
● Hive to RDBMS
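The class drives Hive from its own shell; as one Python-side option (an assumption, not the course's tooling), the PyHive package can query a running HiveServer2:

```python
# Query Hive over HiveServer2 with PyHive (assumes it listens on
# localhost:10000 and that the pyhive package is installed).
from pyhive import hive

conn = hive.Connection(host="localhost", port=10000, username="hive")
cursor = conn.cursor()
cursor.execute("SHOW TABLES")         # any HiveQL statement works here
for table in cursor.fetchall():
    print(table)
cursor.close()
conn.close()
```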
Students learn about Kafka, an open-source stream processing platform. Kafka is used for ingesting, storing, processing, and distributing real-time data streams. Students explore Kafka's architecture, topics, producers, consumers, and its role in handling large volumes of data with low latency.
● Introduction to Kafka
● Producers, Consumers, Consumer Groups
● Topics, Offsets, Partitions, Brokers
● ZooKeeper, Replication
● Batch vs Real-Time Streaming
● Real-Time Streaming Process
● Assignment and Task
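A minimal producer/consumer sketch using the kafka-python package (an assumption; other clients exist), assuming a broker on localhost:9092 and a topic named events:

```python
# Produce one message, then consume from the beginning of the topic.
from kafka import KafkaConsumer, KafkaProducer

producer = KafkaProducer(bootstrap_servers="localhost:9092")
producer.send("events", b"hello from the producer")
producer.flush()                       # block until delivery

consumer = KafkaConsumer(
    "events",
    bootstrap_servers="localhost:9092",
    auto_offset_reset="earliest",      # start from the oldest offset
    consumer_timeout_ms=5000,          # stop iterating after 5s of silence
)
for message in consumer:
    print(message.topic, message.partition, message.offset, message.value)
```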
Students will explore Spark, an open-source, distributed computing framework that provides high-speed, in-memory data processing for big data analytics.
Sub module 1 (4 Hr)
● Introduction to Apache Spark
● Spark Architecture
● Hadoop vs Spark
● RDDs, DAGs, Transformations, Actions
● Data Partitioning and Shuffling
● DataFrames & Spark SQL
● Streaming Data Handling in Spark
Sub module 2 (2 Hr)
● Spark Batch Data Processing (CSV, JSON, Parquet files)
● AWS Data Management Tools [AWS EMR, Glue jobs]
● Assignment & Assessments
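A minimal PySpark batch sketch; the file name and columns are hypothetical:

```python
# Read CSV into a DataFrame, aggregate, and write Parquet.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("batch-sketch").getOrCreate()

# Transformations are lazy; the show() action triggers execution.
df = spark.read.csv("sales.csv", header=True, inferSchema=True)
df.groupBy("region").agg(F.sum("amount").alias("total")).show()

# Parquet is the columnar format common in data lakes.
df.write.mode("overwrite").parquet("sales_parquet")
spark.stop()
```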
Students engage in data cleaning to understand the process of identifying and
correcting errors or inconsistencies in datasets, ensuring data accuracy and reliability
for analysis and reporting.
● Structured vs Unstructured Data using Pandas
● Common Data Issues and How to Clean Them
● Data Cleaning with Pandas and PySpark
● Handling JSON Data
● Meaningful Data Transformation (Scaling and Normalization)
● Example: Movies Dataset Cleaning
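A small pandas cleaning sketch in the spirit of the movies example; the file and column names are stand-ins:

```python
# Deduplicate, impute, drop incomplete rows, and min-max scale a column.
import pandas as pd

df = pd.read_csv("movies.csv")

df = df.drop_duplicates()                                   # duplicate rows
df["rating"] = df["rating"].fillna(df["rating"].median())   # impute missing
df = df.dropna(subset=["title"])                            # require a title

# Min-max scaling squeezes the column into the 0..1 range.
df["budget_scaled"] = (df["budget"] - df["budget"].min()) / (
    df["budget"].max() - df["budget"].min()
)
print(df.head())
```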
Students study Prometheus to explore its role as an open-source monitoring and
alerting toolkit, used for collecting and visualizing metrics from various systems, aiding
in performance optimization and issue detection.
Sub module 1 (4 Hr)
● Server and Architecture
● Installation
● Understanding the Prometheus UI
● Node Exporters
● PromQL (aggregations, functions, operators, data types)
● Integrating Python with Prometheus
● Counters, Gauges, Summaries, Histograms
● Recording Rules
● Alerting Rules
● Alertmanager and Its Installation
● Grouping, Inhibiting, Throttling, and Silencing Alerts
Sub module 2 (3 Hr)
● Slack Integration with Prometheus via Alertmanager
● PagerDuty Integration with Alertmanager
● Blackbox Exporter and Its Installation
● MySQL Exporter
● Integrating AWS and Prometheus
● AWS CloudWatch and Prometheus
● Implementing Grafana Dashboards on Top of Prometheus
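A minimal instrumentation sketch with the prometheus_client package; the metric names are hypothetical, and Prometheus would scrape the endpoint this script exposes:

```python
# Expose a Counter and a Gauge at http://localhost:8000/metrics.
import random
import time

from prometheus_client import Counter, Gauge, start_http_server

REQUESTS = Counter("app_requests_total", "Total requests handled")
QUEUE_DEPTH = Gauge("app_queue_depth", "Items waiting in the queue")

if __name__ == "__main__":
    start_http_server(8000)                      # serve /metrics
    while True:
        REQUESTS.inc()                           # count one request
        QUEUE_DEPTH.set(random.randint(0, 10))   # simulate queue churn
        time.sleep(1)
```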
● Metrics
● Dashboards
● Alerts
● Monitors
● Tracing
● Logs monitoring
● Integrations
Students learn about Docker to understand containerization technology, which allows
them to package applications and their dependencies into portable, efficient containers. Docker facilitates easy deployment, scaling, and management of applications across various environments.
● What Is Docker
● Installation of Docker
● Docker Images, Containers
● Dockerfile
● Docker Volumes
● Docker Registry
● Containerizing an Application with Docker (Hands-On)
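The hands-on work uses the docker CLI; the sketch below shows the same ideas through the Docker SDK for Python (an assumption, not the course's tooling) and needs a running Docker daemon:

```python
# Run a container and list local images via the Docker SDK.
import docker

client = docker.from_env()                       # talk to the local daemon

# Pull and run a small public image, capturing its stdout
output = client.containers.run("hello-world", remove=True)
print(output.decode())

# Analogous to `docker images`
for image in client.images.list():
    print(image.tags)
```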
● Nodes
● Pods
● ReplicaSets
● Deployments
● Namespaces
● Ingress
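Kubernetes objects are usually declared in YAML manifests; as a Python-side peek (an assumption, using the official kubernetes client), the sketch below lists pods the way `kubectl get pods -A` would, given a reachable cluster and kubeconfig:

```python
# List pods across all namespaces via the Kubernetes API.
from kubernetes import client, config

config.load_kube_config()                        # read ~/.kube/config
v1 = client.CoreV1Api()

for pod in v1.list_pod_for_all_namespaces().items:
    print(pod.metadata.namespace, pod.metadata.name, pod.status.phase)
```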
With IIT-M Pravartak certification for an Advanced Programming Professional Certificate &
Official AWS Certification
Pay just Rs 8000 (Refundable) and Pre-Book your seat
Program Fee
₹89,999
(without IIT-M Pravartak Certification)
Program Fee
₹1,23,900 (discounted from ₹1,30,095)
(with IIT-M Pravartak Certification)
Start learning today & Pay in easy EMIs! Get maximum flexibility to learn at your own pace. EMI Options available
Hurry up. Limited seats only!
Data engineering is the technique of designing and creating systems that can efficiently collect, store, and interpret data for analytical or operational uses. It is an aspect of data science that focuses on practical data applications.
There is just one eligibility criterion for Zen Class: Students & Working Professionals seeking an opportunity to upskill their Data Engineering proficiency for faster career growth. To keep the chances fair, we provide a Pre-Bootcamp session for interested Zen Class students to understand how ready they are to be a Data Engineer. A small eligibility test conducted right after the Pre-Bootcamp will provide you with the final ticket to be part of the Zen Bootcamp.
With the objective of creating as many job opportunities for our students as possible, we intend to help every student who is willing to "make the extra catching up needed" in terms of understanding Business & Data Analysis. We assess this via a comprehensive Pre-Bootcamp where you can understand how ready you are for the Zen Bootcamp. In case you don't meet the eligibility, our mentors will chart the course ahead for you with some GUVI lessons.
Along with 100% job placement support, this Data Engineering Program extends an industry-recognized certification from the IIT-M incubated company GUVI. Procure the essential skills and make an extraordinary career in Data Engineering!
Our comprehensive program is a 3 Months (weekday) / 5 Months (weekend) Certification Bootcamp that will facilitate live online subject-matter expert-driven classes.
Promising hands-on training, our program comprises live sessions + 20+ Projects, guaranteeing 50+ interviews complemented with resume reviews and a real-time project portfolio.
The tools & technologies covered in this program include Python, SQL, Shell Script, Orchestrator, Cloud Services, Big Data, Data Cleaning, Data Visualization, etc.
Request a Callback. An expert from the admissions office will call you in the next 24 working hours. You can also reach out to us at [email protected] or +91-9736097320