VCDX #181 Marc Huppert

Standalone Spark on VMware vSphere with a Machine Learning Test Run

This demo shows a VMware vSphere environment with 16 virtual machines hosting the Apache Spark distributed platform in standalone mode. A test run, using the Spark Perf test suite, executes a machine learning linear regression algorithm to train a model on a dataset with 10,000 features and 1 million examples. The test executes across all the virtual machines in the cluster and finishes in 3 minutes 49 seconds of wall-clock time. The key measure, model training time, comes in at 40 seconds. This shows that Spark and MLlib execute well in a vSphere environment and that Spark can be used there by data scientists and data engineers in standalone mode as well as in YARN mode.
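To put those figures in perspective, here is a rough back-of-the-envelope sketch in Python. It is not part of the demo. It assumes the dataset is stored as dense 8-byte doubles, that the data is spread evenly over all 16 virtual machines, and that the 40 seconds of training time is counted inside the 3 minutes 49 seconds of wall-clock time. None of these points is stated in the demo description, and the function names are made up for illustration.

```python
# Figures taken from the demo description.
NUM_VMS = 16
NUM_FEATURES = 10_000
NUM_EXAMPLES = 1_000_000
WALL_CLOCK_SECONDS = 3 * 60 + 49  # 3 minutes 49 seconds
TRAINING_SECONDS = 40

# Assumption (not from the source): every feature value is a dense 8-byte double.
BYTES_PER_VALUE = 8


def dataset_bytes(examples: int, features: int, bytes_per_value: int) -> int:
    """Raw size of a dense examples-by-features matrix, in bytes."""
    return examples * features * bytes_per_value


def per_vm_bytes(total_bytes: int, vms: int) -> float:
    """Share of the dataset per VM, assuming an even split across all VMs."""
    return total_bytes / vms


def training_share(training_s: float, wall_clock_s: float) -> float:
    """Fraction of the wall-clock time spent in model training."""
    return training_s / wall_clock_s


if __name__ == "__main__":
    total = dataset_bytes(NUM_EXAMPLES, NUM_FEATURES, BYTES_PER_VALUE)
    print(f"Dense dataset size: {total / 1e9:.0f} GB")
    print(f"Per-VM share:       {per_vm_bytes(total, NUM_VMS) / 1e9:.0f} GB")
    share = training_share(TRAINING_SECONDS, WALL_CLOCK_SECONDS)
    print(f"Training share:     {share:.1%} of wall-clock time")
```

Under these assumptions the dataset comes to about 80 GB in total, or roughly 5 GB per virtual machine, and model training accounts for about 17% of the run. The rest of the wall-clock time would go to other phases of the test, which the description does not break down.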


VMware Social Media Advocacy
