8280 Reviews (4.5 out of 5 stars)

Data Science and Data Engineering for Architects

$2,620 USD
Duration 4 days
Course Code WA3057
Available Formats Classroom

Overview

This intensive Data Science training course covers both the theory and the practical application of the principles and methods of Data Science and Data Engineering. Students are introduced to the relevant concepts, terminology, theory, and tools used in the field.

  • The course is complemented by a variety of hands-on exercises that help attendees reinforce their theoretical knowledge of the material.

Skills Gained

  • Applied data science, business analytics, and data engineering
  • Common data science/machine learning algorithms for supervised and unsupervised machine learning
  • NumPy, pandas, matplotlib, seaborn, scikit-learn
  • Python REPLs
  • Jupyter notebooks
  • Data analytics life-cycle phases
  • Data repairing and normalizing
  • Data aggregation and grouping
  • Data visualization and EDA
  • Operational data analytics
  • Distributed and scalable data processing
  • Cloud machine learning and data engineering capabilities

Who Can Benefit

  • IT architects and technical managers

Prerequisites

Participants should have a working knowledge of Python (or the programming background to quickly pick up Python’s syntax) and be familiar with core statistical concepts (variance, correlation, etc.).

Course Details

Outline

Chapter 1. Python for Data Science

  • Python Data Science-Centric Libraries
  • SciPy
  • NumPy
  • pandas
  • Scikit-learn
  • Matplotlib
  • Seaborn
  • Python Dev Tools and REPLs
  • IPython
  • Jupyter Notebooks
  • Anaconda
  • Summary

Chapter 2. Data Visualization in Python

  • Why Do I Need Data Visualization?
  • Data Visualization in Python
  • Getting Started with matplotlib
  • A Basic Plot
  • Scatter Plots
  • Figures
  • Saving Figures to a File
  • Seaborn
  • Getting Started with seaborn
  • Histograms and KDE
  • Plotting Bivariate Distributions
  • Scatter Plots in seaborn
  • Pair plots in seaborn
  • Heatmaps
  • A Seaborn Scatterplot with Varying Point Sizes and Hues
  • Summary

Chapter 3. Introduction to NumPy

  • What is NumPy?
  • The First Take on NumPy Arrays
  • The ndarray Data Structure
  • Understanding Axes
  • Indexing Elements in a NumPy Array
  • Re-Shaping
  • Commonly Used Array Metrics
  • Commonly Used Aggregate Functions
  • Sorting Arrays
  • Vectorization
  • Vectorization Visually
  • Broadcasting
  • Broadcasting Visually
  • Filtering
  • Array Arithmetic Operations
  • Reductions: Finding the Sum of Elements by Axis
  • Array Slicing
  • 2-D Array Slicing
  • The Linear Algebra Functions
  • Summary

Chapter 4. Introduction to pandas

  • What is pandas?
  • The DataFrame Object
  • The DataFrame's Value Proposition
  • Creating a pandas DataFrame
  • Getting DataFrame Metrics
  • Accessing DataFrame Columns
  • Accessing DataFrame Rows
  • Accessing DataFrame Cells
  • Deleting Rows and Columns
  • Adding a New Column to a DataFrame
  • Getting Descriptive Statistics of DataFrame Columns
  • Getting Descriptive Statistics of DataFrames
  • Sorting DataFrames
  • Reading From CSV Files
  • Writing to a CSV File
  • Summary

Chapter 5. Repairing and Normalizing Data

  • Repairing and Normalizing Data
  • Dealing with the Missing Data
  • Sample Data Set
  • Getting Info on Null Data
  • Dropping a Column
  • Interpolating Missing Data in pandas
  • Replacing the Missing Values with the Mean Value
  • Scaling (Normalizing) the Data
  • Data Preprocessing with scikit-learn
  • Scaling with the scale() Function
  • The MinMaxScaler Object
  • Summary
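
As a rough illustration of two topics listed in this chapter (replacing missing values with the mean, and min-max scaling), here is a minimal pure-Python sketch. It is not taken from the course materials, which use pandas and scikit-learn; the function names `fill_missing_with_mean` and `min_max_scale` and the sample readings are hypothetical.

```python
from statistics import mean


def fill_missing_with_mean(values):
    """Replace None entries with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("cannot impute: no observed values")
    fill = mean(observed)
    return [fill if v is None else v for v in values]


def min_max_scale(values):
    """Linearly rescale values to the [0, 1] range."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values are identical, so there is no range to scale over.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical sample column with one missing reading.
readings = [10.0, None, 30.0, 20.0]
repaired = fill_missing_with_mean(readings)
scaled = min_max_scale(repaired)
print(repaired)
print(scaled)
```

In practice the same steps would more likely be done with a pandas DataFrame and the scikit-learn MinMaxScaler object named in the outline above.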

Chapter 6. Defining Data Science

  • What is Data Science?
  • Data Science, Machine Learning, AI?
  • The Data Science Ecosystem
  • Tools of the Trade
  • The Data-Related Roles
  • Data Scientists at Work
  • Examples of Data Science Projects
  • The Concept of a Data Product
  • Applied Data Science at Google
  • Data Science and ML Terminology: Features and Observations
  • Terminology: Labels and Ground Truth
  • Label Examples
  • Terminology: Continuous and Categorical Features
  • Encoding Categorical Features using One-Hot Encoding Scheme
  • Example of 'One-Hot' Encoding Scheme
  • Gartner's Magic Quadrant for Data Science and Machine Learning Platforms (a Labeling Example)
  • Machine Learning in a Nutshell
  • Common Distance Metrics
  • The Euclidean Distance
  • Decision Boundary Examples (Object Classification)
  • What is a Model?
  • Training a Model to Make Predictions
  • Types of Machine Learning
  • Supervised vs Unsupervised Machine Learning
  • Supervised Machine Learning Algorithms
  • Unsupervised Machine Learning Algorithms
  • Which ML Algorithm to Choose?
  • Bias-Variance (Underfitting vs Overfitting) Trade-off
  • Underfitting vs Overfitting (a Regression Model Example) Visually
  • ML Model Evaluation
  • Mean Squared Error (MSE) and Mean Absolute Error (MAE)
  • Coefficient of Determination
  • Confusion Matrix
  • The Binary Classification Confusion Matrix
  • The Typical Machine Learning Process
  • A Better Algorithm or More Data?
  • The Typical Data Processing Pipeline in Data Science
  • Data Discovery Phase
  • Data Harvesting Phase
  • Data Cleaning/Priming/Enhancing Phase
  • Exploratory Data Analysis and Feature Selection
  • ML Model Planning Phase
  • Feature Engineering
  • ML Model Building Phase
  • Capacity Planning and Resource Provisioning
  • Communicating the Results
  • Production Roll-out
  • Data Science Gotchas
  • Summary
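
Several metrics named in this chapter (the Euclidean distance, Mean Squared Error, and Mean Absolute Error) have short standard definitions. The sketch below shows them in plain Python as an illustration only; it is not drawn from the course materials, and the function names and sample values are hypothetical.

```python
import math


def euclidean_distance(a, b):
    """Straight-line distance between two points of equal dimension."""
    if len(a) != len(b):
        raise ValueError("points must have the same number of coordinates")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def mean_squared_error(y_true, y_pred):
    """Average of the squared differences between labels and predictions."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def mean_absolute_error(y_true, y_pred):
    """Average of the absolute differences between labels and predictions."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical ground-truth labels and model predictions.
y_true = [3.0, 5.0, 7.0]
y_pred = [2.0, 5.0, 9.0]

distance = euclidean_distance((0.0, 0.0), (3.0, 4.0))
mse = mean_squared_error(y_true, y_pred)
mae = mean_absolute_error(y_true, y_pred)
print(distance, mse, mae)
```

MSE penalizes large errors more heavily than MAE because each error is squared before averaging, which is one reason the two are often reported together.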

Chapter 7. Overview of the scikit-learn Library

  • The scikit-learn Library
  • The Navigational Map of ML Algorithms Supported by scikit-learn
  • Developer Support
  • scikit-learn Estimators, Models, and Predictors
  • Annotated Example of the LinearRegression Estimator
  • Annotated Example of the Support Vector Classification Estimator
  • Data Splitting into Training and Test Datasets
  • Data Splitting in scikit-learn
  • Cross-Validation Technique
  • Summary

Chapter 8. Classification Algorithms (Supervised Machine Learning)

  • Classification (Supervised ML) Use Cases
  • Classifying with k-Nearest Neighbors
  • k-Nearest Neighbors Algorithm Visually
  • Decision Trees
  • Decision Tree Terminology
  • Decision Tree Classification in the Context of Information Theory
  • Using Decision Trees
  • Properties of the Decision Tree Algorithm
  • The Simplified Decision Tree Algorithm
  • Random Forest
  • Properties of the Random Forest Algorithm
  • Support Vector Machines (SVMs)
  • SVM Classification Visually
  • Properties of SVMs
  • Dealing with Non-Linear Class Boundaries
  • Logistic Regression (Logit)
  • The Sigmoid Function
  • Logistic Regression Classification Example
  • Logistic Regression's Problem Domain
  • Naive Bayes Classifier (Supervised Learning)
  • Naive Bayesian Probabilistic Model in a Nutshell
  • Bayes Formula
  • Document Classification with Naive Bayes
  • Summary

Chapter 9. Unsupervised Machine Learning Algorithms

  • PCA
  • PCA and Data Variance
  • PCA Properties
  • Importance of Feature Scaling Visually
  • Unsupervised Learning Type: Clustering
  • Clustering vs Classification
  • Clustering Examples
  • k-means Clustering
  • k-means Clustering in a Nutshell
  • k-means Characteristics
  • Global vs Local Minimum Explained
  • Summary

Lab Exercises

  • Lab 1. Learning the Colab Jupyter Notebook Environment
  • Lab 2. Data Visualization in Python
  • Lab 3. Understanding NumPy
  • Lab 4. Data Repairing
  • Lab 5. Understanding Common Metrics
  • Lab 6. Coding kNN Algorithm in NumPy (Optional)
  • Lab 7. Understanding Machine Learning Datasets in scikit-learn
  • Lab 8. Building Linear Regression Models
  • Lab 9. Spam Detection with Random Forest
  • Lab 10. Spam Detection with Support Vector Machines
  • Lab 11. Spam Detection with Logistic Regression
  • Lab 12. Comparing Classification Algorithms
  • Lab 13. Feature Engineering and EDA
  • Lab 14. Understanding PCA

FAQ

Does the course schedule include a lunch break?

Classes typically include a 1-hour lunch break around midday. However, the exact break times and duration can vary depending on the specific class. Your instructor will provide detailed information at the start of the course.

What languages are used to deliver training?

Most courses are conducted in English unless otherwise specified. Some courses have the word "FRENCH" marked in red beside the scheduled date(s), indicating the language of instruction.

What does GTR stand for?

GTR stands for Guaranteed to Run; if you see a course with this status, it means this event is confirmed to run. View our GTR page to see our full list of Guaranteed to Run courses.

Does Ascendient Learning deliver group training?

Yes, we provide training for groups and individuals, as well as private on-site training. View our group training page for more information.

What does vendor-authorized training mean?

As a vendor-authorized training partner, we offer a curriculum that our partners have vetted. We use the same course materials and facilitate the same labs as our vendor-delivered training. These courses are considered the gold standard and, as such, are priced accordingly.

Is the training too basic, or will you go deep into technology?

It depends on your requirements, your role in your company, and your depth of knowledge. The good news is that many of our learning paths let you progress from the fundamentals to highly specialized training.

How up-to-date are your courses and support materials?

We continuously work with our vendors to evaluate and refresh course material to reflect the latest training courses and best practices.

Are your instructors seasoned trainers who have deep knowledge of the training topic?

Ascendient Learning instructors have an average of 27 years of practical IT experience and have also served as consultants for an average of 15 years. To stay current, instructors spend at least 25 percent of their time learning new, emerging technologies and courses.

Do you provide hands-on training and exercises in an actual lab environment?

Lab access is dependent on the vendor and the type of training you sign up for. However, many of our top vendors will provide lab access to students to test and practice. The course description will specify lab access.

Will you customize the training for our company’s specific needs and goals?

We will work with you to identify training needs and areas of growth. We offer a variety of training methods, such as private group training, on-site training at a location of your choice, and virtual delivery. We provide courses and certifications that are aligned with your business goals.

How do I get started with certification?

Getting started on a certification pathway depends on your goals and the vendor you choose to get certified with. Many vendors offer IT certifications, from entry level to advanced, that can boost your career. To get access to certification vouchers and discounts, please contact info@ascendientlearning.com.

Will I get access to content after I complete a course?

You will get access to the PDF of course books and guides, but access to the recording and slides will depend on the vendor and type of training you receive.

How do I request a W9 for Ascendient Learning?

View our filing status and how to request a W9.

Reviews

The class was very fast-paced; however, the teacher was very good at checking in on us while giving us time to complete the labs.

They were very good. They made sure everyone was able to get into the training and got all of the material needed for class.

Great class. I learned a great deal from the material. There would seem to be a large amount that I need to learn about.

Overall experience is very nice. The online training platform is very advanced.

I found this course informative. It was easy to follow and provided some good information.