
Caused by: ERROR XSDB6: Another instance of Derby may have already booted the database


I was getting the same error while creating DataFrames in spark-shell:

Caused by: ERROR XSDB6: Another instance of Derby may have already booted the database /metastore_db.

Cause:

I found that this was happening because multiple other instances of spark-shell were already running and holding the Derby DB, so when I started yet another spark-shell and created a DataFrame in it using RDD.toDF(), it threw the error.

Solution:

I ran the ps command to find other instances of Spark-Shell:

ps -ef | grep spark-shell

and I killed them all using the kill command:

kill -9 spark-shell-processID (example: kill -9 4848)

After all the spark-shell instances were gone, I started a new spark-shell, reran my DataFrame function, and it ran just fine :)
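The find-and-kill steps above can be sketched as a short script (the grep pattern and the PID column position are assumptions about typical `ps -ef` output on Linux):

```shell
# Find lingering spark-shell processes; the '[s]park-shell' bracket trick
# keeps the grep process itself out of the match list.
pids=$(ps -ef | grep '[s]park-shell' | awk '{print $2}')

# Kill each one so the Derby lock on metastore_db is released.
for pid in $pids; do
  kill -9 "$pid"
done
```

On systems with procps installed, `pkill -9 -f spark-shell` does the same thing in one command.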


If you're running in spark-shell, you shouldn't instantiate a HiveContext; one is created automatically, called sqlContext (the name is misleading - if you compiled Spark with Hive, it will be a HiveContext). See the similar discussion here.

If you're not running in the shell, this exception means you've created more than one HiveContext in the same JVM, which isn't supported - you can only create one.


Another case where you can see the same error is the Spark REPL of an AWS Glue dev endpoint, when you try to convert a dynamic frame into a DataFrame.

There are actually several different exceptions like:

  • pyspark.sql.utils.IllegalArgumentException: u"Error while instantiating 'org.apache.spark.sql.hive.HiveSessionState':"
  • ERROR XSDB6: Another instance of Derby may have already booted the database /home/glue/metastore_db.
  • java.sql.SQLException: Failed to start database 'metastore_db' with class loader org.apache.spark.sql.hive.client.IsolatedClientLoader

The solution is hard to find with Google, but it is eventually described here.

The loaded REPL contains an instantiated SparkSession in a variable called spark; you just need to stop it before creating a new SparkContext:

>>> spark.stop()
>>> from pyspark.context import SparkContext
>>> from awsglue.context import GlueContext
>>>
>>> glue_context = GlueContext(SparkContext.getOrCreate())
>>> glue_frame = glue_context.create_dynamic_frame.from_catalog(database=DB_NAME, table_name=T_NAME)
>>> df = glue_frame.toDF()