Scaling Machine Learning with Spark • Adi Polak & Holden Karau • GOTO 2023

admin

May 18, 2023

This interview was recorded for the GOTO Book Club. #GOTOcon #GOTObookclub
http://gotopia.tech/bookclub

Read the full transcription of the interview here:
https://gotopia.tech/bookclub/episodes/234/Scaling-Machine-Learning-with-Spark

Adi Polak - VP of Developer Experience at Treeverse & Contributing to lakeFS OSS @polakadi
Holden Karau - Co-Author of "Kubeflow for Machine Learning" & many more books & Open Source Engineer at Netflix @HoldenKarau

RESOURCES
Adi

https://adipolak.substack.com
https://mastodon.online/@adipolak
https://blog.adipolak.com
https://www.linkedin.com/in/-adi-polak-68548365

Holden

https://www.twitch.tv/holdenkarau
https://tech.lgbt/@holden
http://www.holdenkarau.com

DESCRIPTION
Learn how to build end-to-end scalable machine learning solutions with Apache Spark. With this practical guide, author Adi Polak introduces data and ML practitioners to creative solutions that supersede today's traditional methods. You'll learn a more holistic approach that takes you beyond specific requirements and organizational goals--allowing data and ML practitioners to collaborate and understand each other better.

Scaling Machine Learning with Spark examines several technologies for building end-to-end distributed ML workflows based on the Apache Spark ecosystem with Spark MLlib, MLflow, TensorFlow, and PyTorch. If you're a data scientist who works with machine learning, this book shows you when and why to use each technology.

You will:
• Explore machine learning, including distributed computing concepts and terminology
• Manage the ML lifecycle with MLflow
• Ingest data and perform basic preprocessing with Spark
• Explore feature engineering, and use Spark to extract features
• Train a model with MLlib and build a pipeline to reproduce it
• Build a data system to combine the power of Spark with deep learning
• Get a step-by-step example of working with distributed TensorFlow
• Use PyTorch to scale machine learning and its internal architecture

* Book description: © O'Reilly:
https://www.oreilly.com/library/view/scaling-machine-learning/9781098106812

The interview is based on the book "Scaling Machine Learning with Spark": https://amzn.to/3ppdUkB

TIMECODES
00:00 Intro
02:25 Lead with the tools & resources you have
04:06 The Apache Spark ecosystem
08:44 Book chapter overview
12:22 Exploring the glue spaces in ML & data engineering
19:18 Navigating the trade-offs of distributed ML
29:37 Challenges of keeping up with Open Source software
35:22 Can we expect another book?
38:11 Outro

RECOMMENDED BOOKS
Adi Polak • Scaling Machine Learning with Spark • https://amzn.to/3ppdUkB
Holden Karau, Trevor Grant, Boris Lublinsky, Richard Liu & Ilan Filonenko • Kubeflow for Machine Learning • https://amzn.to/3JVngcx
Holden Karau • Distributed Computing 4 Kids • https://www.distributedcomputing4kids.com
Holden Karau • Scaling Python with Dask • https://www.oreilly.com/library/view/scaling-python-with/9781098119867
Holden Karau & Boris Lublinsky • Scaling Python with Ray • https://amzn.to/44GU6cC
Holden Karau & Rachel Warren • High Performance Spark • https://amzn.to/3v2eLbn
Holden Karau, Andy Konwinski, Patrick Wendell & Matei Zaharia • Learning Spark • https://amzn.to/397e2NE
Holden Karau & Krishna Sankar • Fast Data Processing with Spark 2nd Edition • https://amzn.to/3xKhXKu
Holden Karau • Fast Data Processing with Spark 1st Edition • https://amzn.to/3rHQgOu

https://www.linkedin.com/company/goto-
https://www.facebook.com/GOTOConferences
#Spark #ApacheSpark #ML #MachineLearning #MLlib #TensorFlow #PyTorch #DataScience #AI #ComputerScience #AdiPolak #HoldenKarau #Programming #SoftwareEngineering

Looking for a unique learning experience?
Attend the next GOTO conference near you! Get your ticket at https://gotopia.tech

SUBSCRIBE TO OUR CHANNEL - new videos posted almost daily.
https://www.youtube.com/user/GotoConferences/?sub_confirmation=1
