## Data Expert – Level 2

## Applied Data Science with Machine Learning

- Bring your own laptop.
- Knowledge of **Level 1** is required.

Note: The training is entirely practical and uses real-world industry datasets, so that you build the mastery the international market expects.

In this training, you’ll get hands-on practice with the core technical skills of a data scientist, including object-oriented and functional programming with Python and libraries such as scikit-learn, Matplotlib, NumPy, and pandas. You’ll also master web scraping and database queries, deep learning and machine learning, and predictive analysis.

To help you stand out, we have included Git and GitHub so you can collaborate efficiently. Best of all, you’ll learn by doing, applying your skills to several projects based on realistic business scenarios to build your portfolio and prepare for the international market.

- Experienced foreign and local trainers
- Hands-on training in a real working environment
- Internship opportunities
- Affordable cost
- Continued support for participants after the training sessions via WhatsApp groups
- Guided projects, profile building, and tailored resume design, usually done in the gap between two levels

As data science and data literacy are set to become mandatory and regulatory requirements across all domains and industries, any professional from Finance, HR & Admin, Audit, Supply Chain, Engineering, Computer Science, Health Care, etc. is eligible for this training.

**Databases & SQL:**

- Why SQL Is Important
- Introduction to Databases
- Understanding Query
- Table Preview
- The LIMIT Clause
- Selecting Specific Columns
- Filtering Rows Using WHERE
- Expressing Multiple Filter Criteria Using AND
- Returning One of Several Conditions With OR
- Grouping Operators
- Ordering Results
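
As a taste of the topics above, here is a minimal sketch that combines column selection, `WHERE` with `AND`/`OR`, grouping with parentheses, `ORDER BY`, and `LIMIT`. It uses Python's built-in `sqlite3` module; the `employees` table and its rows are made up for illustration and are not course data.

```python
import sqlite3

# In-memory database with a small, hypothetical table (illustrative data only).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (name TEXT, department TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?)",
    [
        ("Ayesha", "Finance", 5200),
        ("Bilal", "Engineering", 6100),
        ("Chen", "Engineering", 4800),
        ("Dana", "HR", 4500),
        ("Elif", "Finance", 7000),
    ],
)

# Select two columns, filter with grouped OR plus AND,
# sort by salary (highest first), and keep the top two rows.
query = """
    SELECT name, salary
    FROM employees
    WHERE (department = 'Finance' OR department = 'Engineering')
      AND salary > 5000
    ORDER BY salary DESC
    LIMIT 2
"""
rows = conn.execute(query).fetchall()
conn.close()

print(rows)  # [('Elif', 7000), ('Bilal', 6100)]
```

The parentheses matter here: without them, `AND` binds more tightly than `OR`, and the salary filter would apply only to the Engineering rows.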

**Statistics in SQL:**

- Aggregate Functions
- Missing Values
- Combining Multiple Aggregation Functions
- Customizing the Results
- Counting Unique Values
- Data Types
- String Functions and Operations
- Arithmetic in SQL

**Group Statistics in SQL:**

- If/Then in SQL
- Dissecting CASE
- Calculating Group-Level Summary Statistics
- GROUP BY Visual Breakdown
- Multiple Summary Statistics by Group
- Multiple Group Columns
- Querying Virtual Columns
- Order of Execution
- Nesting Functions
- Casting

**Subqueries in SQL:**

- Subqueries
- Subquery in SELECT
- The IN Operator
- Multiple Results in Subqueries
- Building Complex Subqueries
- Integrating A Subquery with The Outer Query

**Joining Data in SQL:**

- Introducing Joins
- Inner Joins
- Left Joins
- Right Joins and Outer Joins
- Combining Joins with Subqueries

**Intermediate Joins in SQL:**

- Combining Multiple Joins with Subqueries
- Self-joins
- Pattern Matching Using LIKE

**Building and Organizing Complex Queries in SQL:**

- The WITH Clause
- Creating Views
- Combining Rows with UNION
- Combining Rows Using INTERSECT and EXCEPT
- Multiple Named Subqueries

**Sampling:**

- Introduction
- Populations and Samples
- Sampling Error
- Simple Random Sampling
- Importance of Sample Size
- Stratified Sampling
- Proportional Stratified Sampling
- Choosing the Right Strata
- Cluster Sampling
- Sampling in Data Science Practice
- Descriptive and Inferential Statistics

**Variables in Statistics:**

- Quantitative and Qualitative Variables
- Scales of Measurement
- The Nominal Scale
- The Ordinal Scale
- The Interval and Ratio Scales
- Difference Between Ratio and Interval Scales
- Common Examples of Interval Scales
- Discrete and Continuous Variables
- Real Limits

**Frequency Distributions:**

- Frequency Distribution Tables
- Proportions and Percentages
- Percentiles and Percentile Ranks
- Grouped Frequency Distribution Tables
- Information Loss
- Frequency Tables and Continuous Variables
- Visualizing Distributions
- Statistical Bar and Pie Charts and Their Customization
- Histograms and the Statistics Behind Them
- Histograms as Modified Bar Plots
- Binning for Histograms
- Skewed Distributions
- Symmetrical Distributions
- Comparing Frequency Distributions
- Grouped Bar Plots
- Comparing Histograms
- Kernel Density Estimate Plots
- Drawbacks of Kernel Density Plots
- Strip Plots
- Box Plots
- Outliers

**Averages:**

- The Mean as a Balance Point
- Mean Algebraically
- Estimating the Population Mean
- Estimates from Low-Sized Samples
- Variability Around the Population Mean
- The Sample Mean as an Unbiased Estimator
- The Weighted Mean
- The Median for Open-ended Distributions
- Distributions with Even Number of Values
- The Median as a Resistant Statistic
- The Median for Ordinal Scales
- Sensitivity to Changes
- The Mode for Ordinal Variables
- The Mode for Nominal Variables
- The Mode for Discrete Variables

**Measures of Variability & Z-Scores:**

- The Range
- Mean Absolute Deviation
- Variance
- Standard Deviation
- Average Variability Around the Mean
- Measure of Spread
- The Sample Standard Deviation
- Bessel’s Correction
- Standard Notation
- Sample Variance — Unbiased Estimator
- Z-scores
- Locating Values in Different Distributions
- Transforming Distributions
- The Standard Distribution
- Standardizing Samples
- Standardization for Comparisons
- Converting Back from Z-scores
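
To illustrate a few of the topics above, the following sketch computes z-scores, converts a z-score back to the original scale, and compares the population and sample standard deviations (the latter uses Bessel's correction). It relies only on Python's standard `statistics` module; the data values and the helper names `z_scores` and `from_z` are assumptions made for this example.

```python
import statistics

def z_scores(values):
    """Standardize values using the population mean and standard deviation."""
    mean = statistics.mean(values)
    sd = statistics.pstdev(values)  # population SD: divides by n
    return [(v - mean) / sd for v in values]

def from_z(z, mean, sd):
    """Convert a z-score back to the original scale."""
    return z * sd + mean

data = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up values with mean 5 and population SD 2

mean = statistics.mean(data)
population_sd = statistics.pstdev(data)
sample_sd = statistics.stdev(data)  # Bessel's correction: divides by n - 1

standardized = z_scores(data)
restored = from_z(standardized[-1], mean, population_sd)

print(standardized)
print(restored)
print(population_sd, sample_sd)
```

Because the sample standard deviation divides by n - 1 instead of n, it always comes out slightly larger than the population version on the same data.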

**Probabilities:**

- Probability Introduction
- The Empirical Probability
- Probability as Relative Frequency
- Repeating an Experiment
- The True Probability Value
- The Theoretical Probability
- Events vs. Outcomes

**Probability Rules:**

- Sample Space
- Probability of Events
- Certain and Impossible Events
- The Addition Rule
- Venn Diagrams
- Exceptions to the Addition Rule
- Mutually Exclusive Events
- Set Notation

**Complex Probability:**

- Complex Probability Problems
- Opposite Events
- Set Complements
- Multiplication Rule
- Independent Events
- Combining Formulas
- Sampling With(out) Replacement

**Permutations and Combinations:**

- The Rule of Product
- Extended Rule of Product
- Permutations
- Unique Arrangements
- Combinations
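
The counting rules listed above can be sketched with Python's standard library (`math.perm` and `math.comb` require Python 3.8 or later). The scenario, picking from 5 items, 3 shirts, and 2 trousers, is a made-up example.

```python
import itertools
import math

# Rule of product: 3 shirts and 2 trousers give 3 * 2 possible outfits.
outfits = 3 * 2

# Permutations: ordered arrangements.
all_orderings = math.factorial(5)   # arrange all 5 items: 5!
ordered_picks = math.perm(5, 3)     # pick 3 of 5 where order matters: 5! / 2!

# Combinations: selections where order does not matter.
unordered_picks = math.comb(5, 3)   # 5! / (3! * 2!)

# Cross-check by brute-force enumeration.
enumerated_perms = len(list(itertools.permutations(range(5), 3)))
enumerated_combs = len(list(itertools.combinations(range(5), 3)))

print(outfits, all_orderings, ordered_picks, unordered_picks)  # 6 120 60 10
```

Each combination of 3 items corresponds to 3! = 6 permutations, which is why 60 / 6 = 10.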

**Conditional Probability:**

- Updating Probabilities
- Conditional Probability Formula
- Complements
- Order of Conditioning
- The Multiplication Rule
- Statistical Independence
- Statistical Dependence

**Bayes Theorem:**

- Independence vs. Exclusivity
- The Law of Total Probability
- Bayes’ Theorem
- Prior and Posterior Probability
- The Naive Bayes Algorithm
- Conditional Independence
- Edge Cases
- Additive Smoothing
- Multinomial Naive Bayes
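
As a rough illustration of the topics above, this sketch applies the law of total probability and Bayes' theorem to a spam-filter style question, then shows additive (Laplace) smoothing for a word that was never seen in a class. All probabilities, counts, and function names are hypothetical values chosen for the example.

```python
def total_probability(prior, likelihood, likelihood_given_not):
    """P(B) = P(B|A) * P(A) + P(B|not A) * P(not A)."""
    return likelihood * prior + likelihood_given_not * (1 - prior)

def posterior(prior, likelihood, likelihood_given_not):
    """Bayes' theorem: P(A|B) = P(B|A) * P(A) / P(B)."""
    evidence = total_probability(prior, likelihood, likelihood_given_not)
    return likelihood * prior / evidence

def smoothed_probability(count, total, vocabulary_size, alpha=1):
    """Additive smoothing: avoids zero probabilities for unseen words."""
    return (count + alpha) / (total + alpha * vocabulary_size)

# Hypothetical inputs: 20% of messages are spam; the word appears in
# 60% of spam messages and in 5% of non-spam messages.
p_spam = 0.2
p_word_given_spam = 0.6
p_word_given_not_spam = 0.05

p_word = total_probability(p_spam, p_word_given_spam, p_word_given_not_spam)
p_spam_given_word = posterior(p_spam, p_word_given_spam, p_word_given_not_spam)

# A word seen 0 times among 8 words in a class, with a 2-word vocabulary.
p_unseen = smoothed_probability(count=0, total=8, vocabulary_size=2)

print(round(p_word, 4), round(p_spam_given_word, 4), p_unseen)
```

With these inputs the prior of 0.2 is updated to a posterior of about 0.75 once the word is observed, and the unseen word gets probability 1/10 instead of 0.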

**Significance Testing:**

- Hypothesis Testing
- Research Design
- Statistical Significance
- Test Statistic
- Permutation Test
- P Value

**Chi-Squared Tests:**

- Observed and Expected Frequencies
- Statistical Significance
- Sampling Distribution Equality
- Degrees of Freedom
- Chi-squared
- Cross Tables
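
For a sense of how observed and expected frequencies feed into the test, here is a minimal sketch of the chi-squared statistic for a goodness-of-fit check. The counts are made up, and the helper name `chi_squared` is an assumption for this example; in practice a library routine would also return the p-value.

```python
def chi_squared(observed, expected):
    """Sum of (observed - expected)^2 / expected over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Hypothetical example: 100 coin flips, with a fair coin as the null hypothesis.
observed = [55, 45]   # heads, tails
expected = [50, 50]

statistic = chi_squared(observed, expected)
degrees_of_freedom = len(observed) - 1  # categories minus one

print(statistic, degrees_of_freedom)  # 1.0 1
```

The statistic alone does not decide significance; it is compared against the chi-squared sampling distribution for the given degrees of freedom.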

**APIs:**

- What’s an API?
- API Requests
- Types of Requests
- Status Codes
- Endpoints
- Query Parameters
- JSON Format
- Content Type
- API Authentication
- Endpoints and Objects
- Pagination
- User-Level Endpoints
- POST Requests
- PUT/PATCH Requests
- DELETE Requests

**Web Scraping:**

- Web Page Structure
- Retrieving Elements from a Page
- Find All
- Element IDs
- Element Classes
- CSS Selectors
- Nesting
- Selenium
- Beautiful Soup
- The Requests Library
- Playwright (Microsoft's browser automation tool)
- Automating Clicks

## Overview

## Course Modality

- On-site
- Online

## Course Duration

- 40 Hours

## Course Level

- Level 2 – Data Expert

## Course Prerequisites

- Knowledge of Level 1 is required.

## Course Language

- English