
Monday, May 30, 2022

Spark, Ray, and Python for Scalable Data Science

Colleagues, according to Salary.com the average Data Scientist salary in the United States is $136,309. The Spark, Ray, and Python for Scalable Data Science program equips you to scale machine learning and artificial intelligence projects using Python, Spark, and Ray. Learn to integrate Python with distributed computing, scale data processing with Spark, conduct exploratory data analysis with PySpark, use parallel computing with Ray, and scale machine learning and artificial intelligence applications with Ray. Skill-based training modules include:

1) Introduction to Distributed Computing in Python - you get some experience with one of Spark's primary data structures, the resilient distributed dataset (RDD). Next come key-value pairs and how Spark performs MapReduce-style operations on them. The lesson finishes with a bit of Spark internals and the overall Spark application lifecycle.

2) Exploratory Data Analysis with PySpark - a large data science workflow centered on natural language processing (NLP). The instructor starts with a general introduction to exploratory data analysis (EDA), followed by a quick tour of Jupyter notebooks. Next he discusses how to do EDA with Spark at scale and shows you how to create statistics and data visualizations that summarize data sets. Finally, he tackles the NLP example, showing you how to transform a large corpus of text into a numerical representation suitable for machine learning.

3) Parallel Computing with Ray - the Ray programming API, with Jonathan comparing the similarities and differences between the Ray and Spark APIs. You learn how to distribute functions with Ray.

4) Scaling AI Applications with Ray - scale up machine learning and artificial intelligence applications with Python. The lesson starts with the general model training and evaluation process in Python, then turns to how Ray enables you to scale both the evaluation and tuning of your models. You see how Ray makes very efficient hyperparameter tuning possible and how, once you have a trained model, Ray can serve predictions from it.
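To give a feel for the MapReduce-style key-value operations mentioned in the first module, here is a minimal plain-Python sketch of the pattern. It is not taken from the course and does not use Spark; the function names `map_phase` and `reduce_by_key` and the sample sentences are made up for illustration. On a PySpark RDD, the analogous steps would typically go through `flatMap`, `map`, and `reduceByKey`.

```python
from collections import defaultdict
from functools import reduce


def map_phase(lines):
    """Map step: emit a (word, 1) key-value pair for every word."""
    for line in lines:
        for word in line.lower().split():
            yield (word, 1)


def reduce_by_key(pairs, func):
    """Reduce step: group values by key, then fold each group with func."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return {key: reduce(func, values) for key, values in groups.items()}


# Hypothetical sample data, standing in for a distributed text dataset.
lines = ["spark and ray", "ray and python"]
counts = reduce_by_key(map_phase(lines), lambda a, b: a + b)
print(counts)
```

Everything here runs on one machine. What Spark adds is that the pairs are partitioned across a cluster and each key's values are combined in parallel.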

Enroll today (teams & execs welcome): https://tinyurl.com/4pydnt23 


Download your complimentary Data Science - Career Transformation Guide.


Much career success, Lawrence E. Wilson - Artificial Intelligence Academy (subscribe & share)

