How To Run Spark On Windows
Apache Spark is a fast, unified analytics engine for cluster computing over large data sets, such as those found in big data and Hadoop environments, with the aim of running programs in parallel across multiple nodes. It combines a stack of libraries including Spark SQL and DataFrames, GraphX, MLlib, and Spark Streaming.
Spark operates in four different modes:
- Standalone Mode: Here all processes run within the same JVM process.
- Standalone Cluster Mode: In this mode, Spark uses its built-in job-scheduling framework.
- Apache Mesos: In this mode, the worker nodes run on various machines, but the driver runs only on the master node.
- Hadoop YARN: In this mode, the driver runs inside the application master and is managed by YARN on the cluster.
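To make these modes concrete, here is a sketch of how the --master URL passed to spark-submit or spark-shell typically distinguishes them; HOST and the ports are placeholders for your own cluster, and the default ports shown may differ in your setup:

--master local[*] (standalone mode, everything in one JVM)
--master spark://HOST:7077 (standalone cluster mode, Spark's own scheduler)
--master mesos://HOST:5050 (Apache Mesos)
--master yarn (Hadoop YARN)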
In this article, we will explore Apache Spark installation in standalone mode. Apache Spark is developed in the Scala programming language and runs on the JVM, so Java is a mandatory prerequisite for Spark. Let's start with the Java installation.
Installing Java:
Step 1: Download the Java JDK.
Step 2: Open the downloaded Java SE Development Kit installer and follow the instructions to complete the installation.
Step 3: Open the Environment Variables dialog by typing "environment variables" in the Windows search bar.
Set the JAVA_HOME Variable:
To set the JAVA_HOME variable, follow the steps below:
- Under User variables, add JAVA_HOME with the value C:\Program Files\Java\jdk1.8.0_261.
- Under System variables, add C:\Program Files\Java\jdk1.8.0_261\bin to the PATH variable.
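If you prefer the command line, the same user variable can be set with Windows' built-in setx command; this is an equivalent alternative to the dialog, and it only takes effect in newly opened Command Prompt windows:

setx JAVA_HOME "C:\Program Files\Java\jdk1.8.0_261"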
Open a Command Prompt and type "java -version"; if the installation succeeded, version information like the sample below will appear, verifying the Java installation.
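The exact text depends on the JDK build you installed; for the JDK 8 release used in this article it will look roughly like this:

java version "1.8.0_261"
Java(TM) SE Runtime Environment (build 1.8.0_261-b12)
Java HotSpot(TM) 64-Bit Server VM (build 25.261-b12, mixed mode)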
Installing Scala:
To install Scala on your local machine, follow the steps below:
Step 1: Download Scala.
Step 2: Run the downloaded .exe file and follow the instructions to customize the setup according to your needs.
Step 3: Accept the agreement and click the Next button.
Set environment variables:
- Under User variables, add SCALA_HOME with the value C:\Program Files (x86)\scala.
- Under System variables, add C:\Program Files (x86)\scala\bin to the PATH variable.
Verify Scala installation:
In the Command Prompt, use the command below to verify the Scala installation:
scala
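If the installation succeeded, the Scala REPL starts with a welcome banner and a scala> prompt; the session below is a sketch (version numbers will vary), with a one-line expression to confirm the REPL works:

Welcome to Scala 2.12.x (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_261).
Type in expressions for evaluation. Or try :help.

scala> 1 + 1
res0: Int = 2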
Installing Spark:
Download a pre-built version of Spark and extract it to the C drive, for example C:\Spark. Then follow the steps below to finish setting up Spark.
Set environment variables:
- Under User variables, add SPARK_HOME with the value C:\spark\spark-2.4.6-bin-hadoop2.7.
- Under System variables, add %SPARK_HOME%\bin to the PATH variable.
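To confirm the variables are visible, open a new Command Prompt (already-open windows do not pick up the change) and echo them; the first command should print the Spark folder you configured above:

echo %SPARK_HOME%
echo %PATH%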
Download Windows Utilities:
If you wish to operate on Hadoop data, follow the steps below to download the winutils utility for Hadoop:
Step 1: Download the winutils.exe file.
Step 2: Copy the file to the bin folder of your Spark installation, e.g. C:\spark\spark-2.4.6-bin-hadoop2.7\bin (the path must match the Spark version you extracted).
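One extra step that is often needed, though not part of the original instructions: some Spark setups locate winutils.exe via %HADOOP_HOME%\bin, so if spark-shell later complains about winutils, set HADOOP_HOME to the Spark folder that now contains it (the path below is an example; match it to your install):

setx HADOOP_HOME "C:\spark\spark-2.4.6-bin-hadoop2.7"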
Step 3: Now execute "spark-shell" in the Command Prompt to verify the Spark installation, as shown below:
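Once the shell is up and shows a scala> prompt, a quick sanity check is to run a small job on the SparkContext sc that spark-shell creates automatically; the output formatting below is a sketch and varies by Spark version:

scala> sc.parallelize(1 to 100).sum()
res0: Double = 5050.0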
Source: https://www.geeksforgeeks.org/install-apache-spark-in-a-standalone-mode-on-windows/