In this course, you will gain theoretical and practical knowledge of Apache Spark’s architecture and its application to machine learning workloads on Databricks. You will learn when to use Spark for data preparation, model training, and deployment, and gain hands-on experience with Spark ML and the pandas API on Spark. The course also introduces advanced topics such as hyperparameter tuning and scaling Optuna with Spark, and it builds on features and concepts from the associate course, such as MLflow and Unity Catalog, for comprehensive model packaging and governance.

What You'll Learn

  • Machine Learning Development with Spark
  • Distributed Model Tuning on Databricks
  • Deploying Machine Learning Models with Spark
  • Pandas on Spark

Who Should Attend

This course is designed for professionals who:

  • Are working as data scientists, machine learning engineers or advanced analytics practitioners looking to scale machine-learning workloads using the Databricks lakehouse platform and Apache Spark.
  • Are responsible for preparing large-scale datasets and for training and deploying models in production, and want to use Spark ML, the pandas API on Spark, and tools such as MLflow and Optuna for hyperparameter tuning at scale.
  • Have a basic understanding of Python and machine-learning concepts (such as classification, regression, and metrics like F1-score) and now wish to extend these skills to distributed ML pipelines and scale-out environments.
  • Are part of teams tasked with operationalizing ML workflows, that is, scaling from experimentation to production and enabling repeatable, governed, high-volume model training and inference in the enterprise.
  • Want to deepen their platform-specific expertise in Databricks ML workflows and integrate Spark-based processing into their machine-learning lifecycle.

Prerequisites

This content was developed for participants with the following skills, knowledge, and abilities:

  • A beginner-level understanding of Python.
  • Basic understanding of DS/ML concepts (e.g. classification and regression models), common model metrics (e.g. F1-score), and Python libraries (e.g. scikit-learn and XGBoost).

1. Machine Learning Development with Spark

  • A Brief Overview of Spark Architecture for Machine Learning
  • Introduction to Spark ML for Model Development
  • Model Tracking and Packaging with MLflow and Unity Catalog on Databricks
  • Model Development with Spark

2. Distributed Model Tuning on Databricks

  • Overview of Hyperparameter Tuning
  • Scalable HPO Frameworks on Databricks
  • Optuna and Hyperopt with Spark ML
  • HPO with Ray Tune

3. Deploying Machine Learning Models with Spark

  • Deployment with Spark
  • Inference with Spark
  • Model Deployment with Spark
  • Optimization Strategies with Spark and Delta Lake

4. Pandas on Spark

  • Scaling with Pandas APIs
  • Pandas UDFs and Function APIs
  • Pandas APIs
