FutureWarning: elementwise comparison failed; returning scalar, but in the future will perform elementwise comparison


This FutureWarning isn't from pandas; it comes from NumPy, and the bug also affects matplotlib and others. Here's how to reproduce the warning nearer to the source of the trouble:

    import numpy as np
    print(np.__version__)   # NumPy version '1.12.0'
    'x' in np.arange(5)     # FutureWarning thrown here
    # FutureWarning: elementwise comparison failed; returning scalar instead, but in the future will perform elementwise comparison
    # False

Another way to reproduce this bug using the double equals operator:

    import numpy as np
    np.arange(5) == np.arange(5).astype(str)    # FutureWarning thrown here

An example of Matplotlib affected by this FutureWarning under their quiver plot implementation: https://matplotlib.org/examples/pylab_examples/quiver_demo.html

What's going on here?

There is a disagreement between NumPy and native Python on what should happen when you compare a string to NumPy's numeric types. Notice that the right operand is Python's turf, a primitive string, and the operator in the middle is Python's turf, but the left operand is NumPy's turf. Should the result be a Python-style scalar or a NumPy-style ndarray of booleans? NumPy says ndarray of bool; Pythonic developers disagree. Classic standoff.

Should it be an elementwise comparison, or a scalar saying whether the item exists in the array?

If your code or library uses the in or == operators to compare a Python string to a NumPy ndarray, the two aren't compatible, so when you try it, it returns a scalar, but only for now. The warning indicates that this behavior might change in the future, so your code pukes all over the carpet if Python/NumPy decide to adopt the NumPy style.
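If it isn't obvious which line (yours or a library's) performs the mixed comparison, one option is to record warnings while the suspect code runs and print where each one came from. This is only a sketch using the standard warnings module; the expression inside the block stands in for whatever code you suspect, and on a NumPy version that doesn't emit the warning the list is simply empty.

```python
import warnings
import numpy as np

# Record every warning raised inside the block instead of printing it.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")   # don't let earlier filters hide anything
    result = 'x' in np.arange(5)      # stand-in for the suspect code

# Each recorded entry carries the category, message, file and line number.
for w in caught:
    print(w.category.__name__, w.filename, w.lineno, w.message)
```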

Submitted Bug reports:

Numpy and Python are in a standoff, for now the operation returns a scalar, but in the future it may change.

https://github.com/numpy/numpy/issues/6784

https://github.com/pandas-dev/pandas/issues/7830

Two workaround solutions:

Either lock down your versions of Python and NumPy, ignore the warnings, and expect the behavior not to change, or convert both the left and right operands of == and in so that both are NumPy types or both are primitive Python numeric types.
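A minimal sketch of the second workaround, assuming the string is something like a value read from a file (the name needle is made up for illustration). Either side can be converted; what matters is that both operands end up with the same kind of type before the comparison happens.

```python
import numpy as np

arr = np.arange(5)
needle = "3"   # hypothetical value that arrived as text

# Option 1: turn the array into strings, so str is compared with str.
found_as_str = needle in arr.astype(str)

# Option 2: turn the string into a number, so int is compared with int.
found_as_int = int(needle) in arr

print(found_as_str, found_as_int)
```

Option 2 avoids copying the array, but it raises ValueError if the text isn't a valid integer, so which one fits depends on where the string comes from.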

Suppress the warning globally:

    import warnings
    import numpy as np
    warnings.simplefilter(action='ignore', category=FutureWarning)
    print('x' in np.arange(5))   # returns False, without warning

Suppress the warning on a line-by-line basis:

    import warnings
    import numpy as np
    with warnings.catch_warnings():
        warnings.simplefilter(action='ignore', category=FutureWarning)
        print('x' in np.arange(2))   # returns False, warning is suppressed
    print('x' in np.arange(10))   # returns False, throws FutureWarning

Just suppress the warning by name, then put a loud comment next to it that mentions the current versions of Python and NumPy, says the code is brittle and requires these versions, and links here. Kick the can down the road.
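One way to read "by name" is to filter on the warning's message text rather than on the whole FutureWarning category, so unrelated FutureWarnings still show up. A sketch, assuming the message still starts with the wording quoted in this thread:

```python
import warnings
import numpy as np

# BRITTLE: written against one specific NumPy version (this thread used 1.12.0).
# The result of str-vs-number comparisons may change in later releases.
with warnings.catch_warnings():
    # `message` is a regular expression matched against the start of the text.
    warnings.filterwarnings(
        "ignore",
        message="elementwise comparison failed",
        category=FutureWarning,
    )
    result = 'x' in np.arange(5)

print(result)
```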

TLDR: pandas are Jedi; numpy are the hutts; and python is the galactic empire.


I get the same error when I try to set index_col while reading a file into a pandas DataFrame:

    import pandas as pd
    df = pd.read_csv('my_file.tsv', sep='\t', header=0, index_col=['0'])
    ## or same with the following
    df = pd.read_csv('my_file.tsv', sep='\t', header=0, index_col=[0])

I have never encountered such an error before. I am still trying to figure out the reason behind it (using @Eric Leschinski's explanation and others).

Anyhow, the following approach solves the problem for now until I figure the reason out:

    df = pd.read_csv('my_file.tsv', sep='\t', header=0)  ## not setting the index_col
    df.set_index(['0'], inplace=True)
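For anyone who wants to try this without the original file, here is a self-contained sketch of the same two-step approach. The in-memory TSV and its contents are made up; the only thing taken from the snippet above is that the header row contains a column literally named 0.

```python
import io
import pandas as pd

# Made-up stand-in for 'my_file.tsv': the first header cell is the text "0".
tsv = "0\tvalue\na\t1\nb\t2\n"

# Step 1: read without index_col, so no index lookup happens during parsing.
df = pd.read_csv(io.StringIO(tsv), sep='\t', header=0)

# Step 2: set the index afterwards, using the column label as a string.
df = df.set_index('0')

print(df)
```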

I will update this as soon as I figure out the reason for such behavior.


In my experience, the same warning message came together with a TypeError:

TypeError: invalid type comparison

So, you may want to check the data type of the values in the Unnamed: 5 column:

    for x in df['Unnamed: 5']:
        print(type(x))  # are they 'str' ?

Here is how I can replicate the warning message:

    import pandas as pd
    import numpy as np
    df = pd.DataFrame(np.random.randn(3, 2), columns=['num1', 'num2'])
    df['num3'] = 3
    df.loc[df['num3'] == '3', 'num3'] = 4  # TypeError and the Warning
    df.loc[df['num3'] == 3, 'num3'] = 4  # No Error
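If the value you compare against really does arrive as text, a possible fix is to look at the column's dtype and convert the value to match before building the mask. This is a sketch on the same toy frame; the variable value is a made-up stand-in for wherever the string comes from.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.randn(3, 2), columns=['num1', 'num2'])
df['num3'] = 3

# Check the column's dtype once instead of looping over every value.
print(df['num3'].dtype)

value = '3'   # hypothetical input that arrived as a string

# Convert the string to a number so both sides of == are numeric.
mask = df['num3'] == int(value)
df.loc[mask, 'num3'] = 4

print(df)
```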

Hope it helps.