How to turn a NumPy array into a set efficiently?
The current state of your question (it can change any time): how can I efficiently extract the unique elements from each row of a large 2-D array?
    import numpy as np

    rng = np.random.default_rng()
    arr = rng.random((3000, 30000))

    out1 = list(map(np.unique, arr))
    # or
    out2 = [np.unique(subarr) for subarr in arr]
Runtimes in an IPython shell:
    >>> %timeit list(map(np.unique, arr))
    5.39 s ± 37.9 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
    >>> %timeit [np.unique(subarr) for subarr in arr]
    5.42 s ± 58.7 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
Update: as @hpaulj pointed out in his comment, my dummy example is biased, since floating-point random numbers will almost certainly all be unique. So here's a more realistic example with integers:
    >>> arr = rng.integers(low=1, high=15000, size=(3000, 30000))
    >>> %timeit list(map(np.unique, arr))
    4.98 s ± 83.7 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
    >>> %timeit [np.unique(subarr) for subarr in arr]
    4.95 s ± 51.3 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
In this case the elements of the output list have varying lengths, since there are actual duplicates to remove.
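Since the title asks about turning arrays into sets: Python's built-in `set` is another way to express the same per-row deduplication, although it gives unordered sets of Python ints rather than sorted arrays. A minimal sketch of the equivalence (small sizes chosen for illustration, not benchmarking):

    import numpy as np

    rng = np.random.default_rng(0)
    arr = rng.integers(low=1, high=50, size=(4, 100))

    # Per-row Python sets; .tolist() converts elements to plain ints
    sets = [set(row.tolist()) for row in arr]

    # np.unique returns the same values, as a sorted array per row
    uniques = [np.unique(row) for row in arr]
    assert all(set(u.tolist()) == s for u, s in zip(uniques, sets))

Whether `set` or `np.unique` is faster per row depends on dtype and duplicate density, so it is worth timing both on your actual data.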
A couple of earlier 'row-wise' unique questions:
vectorize numpy unique for subarrays
Numpy: Row Wise Unique elements
Count unique elements row wise in an ndarray
In a couple of these the count is more interesting than the actual unique values.
If the number of unique values per row differs, the result cannot be a 2-D array. That's a strong indication that the problem cannot be fully vectorized: you need some kind of iteration over the rows.
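That said, you can push most of the work into a single vectorized call and keep only a cheap slicing loop per row: sort the whole array once along `axis=1`, then drop consecutive duplicates in each sorted row. This is a sketch, not a guaranteed speedup; whether it beats per-row `np.unique` on your data is an assumption you should benchmark:

    import numpy as np

    rng = np.random.default_rng(0)
    arr = rng.integers(low=1, high=15000, size=(30, 300))  # small demo sizes

    # One vectorized sort over all rows
    s = np.sort(arr, axis=1)

    # Keep the first element of each row, plus any element that
    # differs from its left neighbour (i.e. drop consecutive duplicates)
    keep = np.concatenate(
        [np.ones((s.shape[0], 1), dtype=bool), s[:, 1:] != s[:, :-1]],
        axis=1,
    )

    # Ragged result: one 1-D array of sorted uniques per row
    out = [row[k] for row, k in zip(s, keep)]

    # Matches np.unique row by row
    assert all(np.array_equal(u, np.unique(r)) for u, r in zip(out, arr))

The per-row loop here does only boolean indexing; the sorting and neighbour comparison, which dominate the cost, run as whole-array NumPy operations.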