Description
WHAT YOU WILL LEARN
A Certified Big Data Engineer has demonstrated proficiency in designing and utilizing Big Data solutions (using Hadoop, MapReduce and other tools), with an emphasis on Big Data mechanisms used to enable data processing, data storage and the establishment of Big Data pipelines. Depending on the exam format chosen, attaining the Big Data Engineer Certification can require passing a single exam or multiple exams. Those who achieve this certification receive an official digital Certificate of Excellence, as well as a digital Certification Badge from Acclaim/Credly with an account that supports the online verification of certification status.
MODULE OVERVIEW
The Big Data Engineer certification track comprises the following courses, each of which can be delivered via instructor-led training.
BDSCP Module 1: Fundamental Big Data
This foundational course provides a high-level overview of essential Big Data topic areas. A basic understanding of Big Data from business and technology perspectives is provided, along with an overview of common benefits, challenges, and adoption issues. The course content is divided into a series of modular sections, each of which is accompanied by one or more hands-on exercises.
The following primary topics are covered:
– Understanding Big Data
– Fundamental Terminology & Concepts
– Big Data Business & Technology Drivers
– Traditional Enterprise & Technologies Related to Big Data
– Characteristics of Data in Big Data Environments
– Dataset Types in Big Data Environments
– Fundamental Analysis and Analytics
– Machine Learning Types
– Business Intelligence & Big Data
– Data Visualization & Big Data
– Big Data Adoption & Planning Considerations
Duration: 1 Day
BDSCP Module 2: Big Data Analysis & Technology Concepts
This course explores a range of the most relevant topics that pertain to contemporary analysis practices, technologies and tools for Big Data environments. The course content does not get into implementation or programming details, but instead keeps coverage at a conceptual level, focusing on topics that enable participants to develop a comprehensive understanding of the common analysis functions and features offered by Big Data solutions, as well as a high-level understanding of the back-end components that enable these functions.
The following primary topics are covered:
– Big Data Analysis Lifecycle (from business case evaluation to data analysis and visualization)
– A/B Testing, Correlation
– Regression, Heat Maps
– Time Series Analysis
– Traditional Enterprise
– Network Analysis
– Spatial Data Analysis
– Classification, Clustering
– Filtering (including collaborative filtering & content-based filtering)
– Sentiment Analysis, Text Analytics
– Processing Workloads, Clusters
– Cloud Computing & Big Data
– Foundational Big Data Technology Mechanisms
Duration: 1 Day
BDSCP Module 7: Fundamental Big Data Engineering
This course covers engineering-related concepts, techniques and technologies for the processing and storage of Big Data datasets. It highlights the unique challenges faced when processing and storing large, volatile and disparate sets of data. NoSQL is covered and the MapReduce data processing engine is explained in detail as a base framework for high-volume batch data processing.
The following primary topics are covered:
– Big Data Engineering Techniques and Challenges
– Big Data Storage, including Sharding, Replication, CAP Theorem, ACID and BASE
– Master-Slave, Peer-to-Peer Replication, Combining Replication with Sharding
– Big Data Storage Requirements, Scalability, Redundancy and Availability
– Fast Access, Long-term Storage, Schema-less Storage and Inexpensive Storage
– On-Disk Storage, including Distributed File System and Databases
– Introduction to NoSQL and NewSQL
– NoSQL Rationale and Characteristics
– NoSQL Database Types, including Key-Value, Document, Column-Family and Graph Databases
– Big Data Processing Engines
– Distributed/Parallel Data Processing, Schema-less Data Processing
– Multi-Workload Support, Linear Scalability and Fault-Tolerance
– Big Data Processing Requirements, including Batch, Cluster and Realtime Modes
– MapReduce for Big Data Processing, including Map, Combine, Partition, Shuffle and Sort, and Reduce
– MapReduce Algorithm Design
– Task Parallelism, Data Parallelism
Duration: 1 Day
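As an informal illustration of the MapReduce stages named in the Module 7 topics (Map, Combine, Partition, Shuffle and Sort, Reduce), the following is a minimal single-process sketch of a word count. It is not taken from the course materials and does not use Hadoop; the function names (`map_phase`, `combine`, `partition`, `shuffle_and_sort`, `reduce_phase`, `run_job`) and the partitioning rule are assumptions chosen for readability.

```python
from collections import defaultdict


def map_phase(record):
    # Map: emit a (word, 1) pair for every word in one input record.
    return [(word.lower(), 1) for word in record.split()]


def combine(pairs):
    # Combine: pre-aggregate counts on the "mapper" before data is moved.
    local = defaultdict(int)
    for key, value in pairs:
        local[key] += value
    return list(local.items())


def partition(key, num_reducers):
    # Partition: pick a reducer for a key (a simple deterministic rule,
    # assumed here purely for illustration).
    return sum(key.encode("utf-8")) % num_reducers


def shuffle_and_sort(mapper_outputs, num_reducers):
    # Shuffle and Sort: route pairs to reducers, group values by key,
    # and sort the keys within each reducer's bucket.
    buckets = [defaultdict(list) for _ in range(num_reducers)]
    for pairs in mapper_outputs:
        for key, value in pairs:
            buckets[partition(key, num_reducers)][key].append(value)
    return [sorted(bucket.items()) for bucket in buckets]


def reduce_phase(key, values):
    # Reduce: collapse all values for one key into a single result.
    return key, sum(values)


def run_job(records, num_reducers=2):
    # Each record plays the role of one input split handled by one mapper.
    mapper_outputs = [combine(map_phase(record)) for record in records]
    results = {}
    for bucket in shuffle_and_sort(mapper_outputs, num_reducers):
        for key, values in bucket:
            word, total = reduce_phase(key, values)
            results[word] = total
    return results


if __name__ == "__main__":
    print(run_job(["big data big", "data pipelines"]))
```

In a real cluster each stage runs on different machines and the shuffle moves data over the network; the sketch only shows the order of the stages and what each one hands to the next.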
BDSCP Module 8: Advanced Big Data Engineering
This course builds upon Module 7 by exploring advanced engineering topics pertaining primarily to the storage and processing of Big Data datasets. Specifically, it covers advanced Big Data engineering mechanisms, in-memory data storage and realtime data processing.
The course presents further considerations for building MapReduce algorithms and introduces the Bulk Synchronous Parallel (BSP) processing engine, along with a discussion of graph data processing. It also explores the mechanisms required for developing Big Data pipelines, the stages of those pipelines, and the design process involved in building Big Data processing solutions.
The following primary topics are covered:
– Advanced Big Data Engineering Mechanisms
– Serialization and Compression Engines
– In-Memory Storage Devices
– In-Memory Data Grids and In-Memory Databases
– Read-Through, Read-Ahead, Write-Through and Write-Behind Integration Approaches
– Polyglot Persistence
– Explanation, Issues and Recommendations
– Realtime Big Data Processing
– Speed Consistency Volume (SCV)
– Event Stream Processing (ESP)
– Complex Event Processing (CEP)
– The SCV Principle
– General Realtime Big Data Processing and MapReduce
– Advanced MapReduce Algorithm Designs
– Bulk Synchronous Parallel (BSP) Processing Engine
– BSP vs. MapReduce
– BSP Synchronous Parallel
– Graph Data and Graph Data Processing using BSP (Supersteps)
– Big Data Pipelines, including Definition and Stages
– Big Data with Extract-Load-Transform (ELT)
– Big Data Solution Characteristics, Design Considerations and Design Process
Duration: 1 Day
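To make the idea of BSP supersteps from the Module 8 topics more concrete, here is a small single-process sketch in which every vertex of a graph learns the largest value held by any vertex. It is an assumption-laden illustration, not course material: the function name `bsp_max_value`, the graph representation (a dictionary of neighbour lists) and the choice of problem are all hypothetical.

```python
def bsp_max_value(graph, values):
    """Propagate the maximum vertex value through a graph in supersteps.

    graph:  dict mapping each vertex to a list of its neighbours
            (undirected edges are listed in both directions).
    values: dict mapping each vertex to its starting value.
    """
    current = dict(values)

    # First superstep: every vertex sends its value to its neighbours.
    inbox = {vertex: [] for vertex in graph}
    for vertex, neighbours in graph.items():
        for neighbour in neighbours:
            inbox[neighbour].append(current[vertex])

    # Each loop iteration is one superstep. Swapping inbox for outbox at
    # the end stands in for the synchronization barrier: messages sent in
    # one superstep are only visible in the next.
    while any(inbox.values()):
        outbox = {vertex: [] for vertex in graph}
        for vertex, messages in inbox.items():
            if messages and max(messages) > current[vertex]:
                current[vertex] = max(messages)
                # Only vertices whose value changed send new messages.
                for neighbour in graph[vertex]:
                    outbox[neighbour].append(current[vertex])
        inbox = outbox

    return current


if __name__ == "__main__":
    graph = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
    print(bsp_max_value(graph, {"a": 3, "b": 6, "c": 1}))
```

The computation stops when a superstep produces no messages, which is the usual termination condition in this style of graph processing.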
BDSCP Module 9: Big Data Engineering Lab
This course module covers a series of exercises and problems designed to test the participant’s ability to apply knowledge of topics covered previously in course modules 7 and 8. Completing this lab will help highlight areas that require further attention, and will further prove hands-on proficiency in Big Data engineering practices as they are applied and combined to solve real-world problems.
As a hands-on lab, this course incorporates a set of detailed exercises that require participants to solve various inter-related problems, with the goal of fostering a comprehensive understanding of how different data engineering technologies, mechanisms and techniques can be applied to solve problems in Big Data environments.
For instructor-led delivery of this lab course, the Certified Trainer works closely with participants to ensure that all exercises are carried out completely and accurately. Attendees can voluntarily have exercises reviewed and graded as part of the class completion. For individual completion of this course as part of the Module 9 Study Kit, a number of supplements are provided to help participants carry out exercises with guidance and numerous resource references.
Duration: 1 Day
PREREQUISITES
- There are no formal prerequisites for the certification exam
EXAM & CERTIFICATION
You can take exams anywhere in the world via Pearson VUE testing centers, Pearson VUE online proctoring, or Arcitura on-site exam proctoring at your location.
You are provided with three flexible exam format options:
- Complete Exam B90.BDE, a single combined exam for the entire Big Data Engineer certification track. Recommended for those who want to take only a single exam that encompasses all course modules within this track.
- Complete the partial version of Exam B90.BDE. Recommended for those who have already obtained a BDSCP certification and would like to achieve the Big Data Engineer Certification without having to be retested on BDSCP Modules 1 and 2.
- Complete one module-specific exam for each course module in the Big Data Engineer Certification track. Recommended for those who want to progress gradually through the track and who would like to be assessed after each course module before proceeding to the next.