Java Spark disable Hadoop discovery
So the final "trick" I've used is a mix of sandev's and Vipul's answers.
Create a 'fake' winutils in your project root:

    mkdir <java_project_root>/bin
    touch <java_project_root>/bin/winutils.exe

(On Windows, where cmd has no touch command, "type NUL > bin\winutils.exe" creates the same empty file.)
Then, in your Spark configuration, provide the 'fake' HADOOP_HOME:

    public SparkConf sparkConfiguration() {
        SparkConf cfg = new SparkConf();
        // Point hadoop.home.dir at the project root, which now holds bin/winutils.exe
        File hadoopStubHomeDir = new File(".");
        System.setProperty("hadoop.home.dir", hadoopStubHomeDir.getAbsolutePath());
        cfg.setAppName("ScalaPython")
           .setMaster("local")
           .set("spark.executor.instances", "2");
        return cfg;
    }
Keep in mind this is still a 'trick' that placates Hadoop discovery; it doesn't actually turn it off.
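The same trick can be scripted from Java itself, e.g. in a test fixture, so nobody has to remember the shell commands. This is a minimal sketch; WinutilsStub and install are hypothetical names, and it assumes the JVM's working directory is the project root:

    import java.io.File;
    import java.io.IOException;

    // Hypothetical helper: creates the empty winutils.exe stub from Java
    // instead of mkdir/touch, then points hadoop.home.dir at it.
    public class WinutilsStub {
        public static void install() throws IOException {
            File home = new File(System.getProperty("user.dir")); // project root when run from the IDE
            File bin = new File(home, "bin");
            if (!bin.isDirectory() && !bin.mkdirs()) {
                throw new IOException("Could not create " + bin);
            }
            // An empty file is enough to satisfy the discovery check.
            new File(bin, "winutils.exe").createNewFile();
            System.setProperty("hadoop.home.dir", home.getAbsolutePath());
        }
    }

Call WinutilsStub.install() once, before sparkConfiguration() runs.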
Spark just needs winutils. Create a folder, for example C:\hadoop\bin, and put winutils.exe in it, then define the environment variable HADOOP_HOME = C:\hadoop and append C:\hadoop\bin to the PATH variable. Then you can use Spark's functionality.
It's not that Spark wants Hadoop to be installed; it just wants that particular file.
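If you want a clearer error than Hadoop's own stack trace when that file is missing, you can check for it up front. A minimal sketch, assuming HADOOP_HOME is set as described above; HadoopHomeCheck and verify are hypothetical names:

    import java.io.File;

    // Hypothetical guard: fails fast with a readable message if the
    // HADOOP_HOME\bin\winutils.exe layout described above is missing.
    public class HadoopHomeCheck {
        public static void verify() {
            String home = System.getenv("HADOOP_HOME");
            if (home == null || home.isEmpty()) {
                throw new IllegalStateException("HADOOP_HOME environment variable is not set");
            }
            File winutils = new File(home, "bin" + File.separator + "winutils.exe");
            if (!winutils.isFile()) {
                throw new IllegalStateException("winutils.exe not found at " + winutils);
            }
        }
    }

Run verify() before creating the SparkContext so the failure points at the real cause.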
First, you have to run the code with spark-submit. Are you doing that? Please stick to that as a first approach, since it runs into fewer library-related issues. Once that works, you can add this to your pom file to be able to run it directly from the IDE (I use IntelliJ, but it should work in Eclipse as well):
    <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-common</artifactId>
        <version>2.6.5</version>
    </dependency>
Second, if it still doesn't work:
Download the winutils file from http://public-repo-1.hortonworks.com/hdp-win-alpha/winutils.exe.
Create a new directory named bin inside some_other_directory and put winutils.exe in it.

In your code, add this line before creating the context (a fuller sketch follows below):

    System.setProperty("hadoop.home.dir", "full path to some_other_directory");
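Putting those steps together, a minimal runnable sketch looks roughly like this. The path is the same placeholder as above (it must be the directory that contains bin\winutils.exe, not the bin directory itself), and WinutilsExample is a hypothetical class name:

    import java.util.Arrays;
    import org.apache.spark.SparkConf;
    import org.apache.spark.api.java.JavaSparkContext;

    public class WinutilsExample {
        public static void main(String[] args) {
            // Placeholder path: the directory that CONTAINS bin\winutils.exe.
            // Set it before any Spark/Hadoop class touches the filesystem.
            System.setProperty("hadoop.home.dir", "full path to some_other_directory");

            SparkConf conf = new SparkConf()
                    .setAppName("WinutilsExample")
                    .setMaster("local[*]");
            JavaSparkContext sc = new JavaSparkContext(conf);

            // Quick smoke test that the context actually works.
            long count = sc.parallelize(Arrays.asList(1, 2, 3, 4)).count();
            System.out.println("count = " + count);

            sc.stop();
        }
    }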
Pro tip: switch to Scala. It's not strictly necessary, but that's where Spark feels most at home, and it wouldn't take you more than a day or two to get basic programs running just right.