Databricks ML Engineer Skill Path
End-to-end industry-focused skill path to help gain essential skills for running machine learning workloads on Databricks
Pre-Requisites
✅ Model Pre-processing
✅ Model building and Evaluation
✅ NLP tasks and RNN
✅ Vision tasks and CNN
Key Highlights
✅ Multiple raw and clean data sources for ingestion
✅ Aligned to Databricks ML Associate certificate
✅ Intricacies of Model serving and MLOps
Skill Path
Introduction to Databricks
This module explains the need for a unified analytics platform like Databricks and how to use it to tackle Data + AI challenges. We will look at the Databricks architecture and how to set it up on Azure Databricks. We will also cover the different types of clusters needed for various analytical workloads.
Exploratory Analysis and data preparation using Databricks
This module emphasizes techniques for data exploration and visualization using Spark and Python within Databricks.
AutoML using Databricks
This module explores Databricks' AutoML capabilities, which automate parts of the machine learning workflow.
Feature Engineering using Databricks
This module focuses on preparing data for machine learning, including handling missing values, encoding categorical features, engineering domain-specific features, and other data pre-processing techniques.
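As a rough illustration of two of the steps named above (handling missing values and encoding categorical features), here is a minimal pure-Python sketch. The helper names `impute_median` and `one_hot` and the sample data are hypothetical and chosen only for this example; in the module itself, the equivalent steps would be done with Spark or Python libraries inside Databricks.

```python
from statistics import median

def impute_median(values):
    """Replace None entries with the median of the observed values."""
    observed = [v for v in values if v is not None]
    fill = median(observed)
    return [fill if v is None else v for v in values]

def one_hot(values):
    """One-hot encode category labels; columns are ordered by sorted label."""
    categories = sorted(set(values))
    rows = [[1 if v == c else 0 for c in categories] for v in values]
    return categories, rows

# Hypothetical sample columns: a numeric one with gaps and a categorical one.
ages = [34, None, 29, 41, None]
cities = ["Pune", "Delhi", "Pune", "Mumbai", "Delhi"]

ages_filled = impute_median(ages)       # gaps filled with the median of 34, 29, 41
categories, encoded = one_hot(cities)   # one 0/1 column per distinct city
```

Median imputation is used here because it is less sensitive to outliers than the mean; other strategies may suit a given dataset better.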
Feature Management with Feature stores
This module covers the Databricks Feature Store, which helps in storing and accessing features for machine learning pipelines.
Feature Management with Unity Catalog
This module explores the use of Unity Catalog for managing and sharing features across different teams and projects within Databricks.
Model Logging with MLflow on Databricks
This module delves into MLflow, focusing on experiment tracking, model management, and deployment within Databricks.
ML Model Building
This module covers the training of machine learning models, including hyperparameter tuning and parallelization techniques.
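To give a feel for what parallelized hyperparameter tuning looks like, the sketch below runs a small grid search with Python's standard thread pool. The `validation_loss` function is a hypothetical stand-in for "train a model and return its validation loss", and the grid values are made up for illustration; on a cluster, the same pattern would be spread across worker nodes rather than local threads.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product

def validation_loss(params):
    """Hypothetical stand-in for training a model and scoring it."""
    lr, depth = params["learning_rate"], params["max_depth"]
    return (lr - 0.1) ** 2 + (depth - 4) ** 2 / 100

# Illustrative search space: every combination is one trial.
grid = {"learning_rate": [0.01, 0.1, 0.3], "max_depth": [2, 4, 8]}
candidates = [dict(zip(grid, combo)) for combo in product(*grid.values())]

# Trials are independent of each other, so they can be evaluated concurrently.
with ThreadPoolExecutor(max_workers=4) as pool:
    losses = list(pool.map(validation_loss, candidates))

# Keep the configuration with the lowest validation loss.
best_loss, best_params = min(zip(losses, candidates), key=lambda pair: pair[0])
```

Grid search is only one option; it parallelizes trivially because no trial depends on another, whereas adaptive search methods trade some parallelism for fewer trials.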
Distributed Model Training
This module discusses the challenges and techniques for scaling machine learning models using Spark.
Model Serving
This module covers the deployment and serving of machine learning models in Databricks, ensuring models are accessible for predictions in production environments.
Model Drift Analysis
This module focuses on techniques and tools for monitoring and analyzing model drift in machine-learning applications. It covers the detection of changes in data and model performance over time to ensure ongoing model accuracy and reliability.
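One common way to quantify a change in data over time is the Population Stability Index (PSI), which compares how a feature is distributed in a baseline sample versus a recent one. The sketch below is a minimal pure-Python version under stated assumptions: equal-width bins taken from the baseline range, a small floor to avoid dividing by zero, and synthetic data. It is not necessarily the method used in the module.

```python
import math

def psi(expected, actual, bins=10):
    """Population Stability Index between a baseline and a recent sample."""
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / bins or 1.0  # fall back to 1.0 if the baseline is constant

    def proportions(sample):
        counts = [0] * bins
        for x in sample:
            idx = int((x - lo) / width)
            idx = min(max(idx, 0), bins - 1)  # clamp values outside the baseline range
            counts[idx] += 1
        eps = 1e-6  # floor so empty bins do not produce log(0) or division by zero
        return [max(c / len(sample), eps) for c in counts]

    e, a = proportions(expected), proportions(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))

# Synthetic data: a uniform baseline and a copy shifted to the right.
baseline = [i / 100 for i in range(100)]
shifted = [x + 0.3 for x in baseline]

psi_same = psi(baseline, baseline)
psi_shifted = psi(baseline, shifted)
```

A frequently cited rule of thumb treats a PSI below about 0.1 as stable and above about 0.25 as a significant shift, though the right thresholds depend on the application.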