How to remove the parentheses around records when saveAsTextFile on RDD[(String, Int)]? [duplicate]

Apply a map transformation before saving the records to the outputfiles directory, e.g.

wordcountRDD.map { case (k, v) => s"$k, $v" }.saveAsTextFile("/user/cloudera/outputfiles")

See Spark's documentation about map.

I strongly recommend using Datasets instead.

scala> words.toSeq.toDS.groupBy("value").count().show
+-----+-----+
|value|count|
+-----+-----+
|  HOW|    1|
|  ARE|    1|
|   HI|    1|
+-----+-----+

scala> words.toSeq.toDS.groupBy("value").count.write.csv("outputfiles")

$ cat outputfiles/part-00199-aa752576-2f65-481b-b4dd-813262abb6c2-c000.csv
HI,1

See Spark SQL, DataFrames and Datasets Guide.


The parentheses come from the default string representation of a Tuple. You can define your own format manually:

val wordcountRDD = keyvalueRDD.reduceByKey((x, y) => x + y)
  // here we set custom format
  .map(x => x._1 + "," + x._2)
wordcountRDD.saveAsTextFile("/user/cloudera/outputfiles")
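
To see where the parentheses come from without starting Spark, here is a minimal sketch in plain Scala. saveAsTextFile writes each element using its toString, and for a tuple that includes the surrounding parentheses. The object name TupleFormatDemo, the helper formatRecord and the sample data are hypothetical, introduced only for illustration; they are not part of the original code.

```scala
object TupleFormatDemo {
  // Hypothetical helper: renders one (word, count) pair as "word,count",
  // the same shape the map step above produces before saving.
  def formatRecord(record: (String, Int)): String = {
    val (word, count) = record
    s"$word,$count"
  }

  def main(args: Array[String]): Unit = {
    // Assumed sample data standing in for the contents of the RDD.
    val wordCounts: Seq[(String, Int)] = Seq(("HI", 1), ("HOW", 1), ("ARE", 1))

    // Default rendering: Tuple2.toString wraps the fields in parentheses,
    // so the first line printed is (HI,1).
    wordCounts.foreach(pair => println(pair.toString))

    // Custom rendering: the first line printed is HI,1.
    wordCounts.map(formatRecord).foreach(println)
  }
}
```

The same function could be passed to map on the RDD, e.g. keyvalueRDD.reduceByKey(_ + _).map(TupleFormatDemo.formatRecord), assuming the RDD holds (String, Int) pairs as in the question.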

