What exactly is the point of memoryview in Python

python buffer memoryview

One reason memoryviews are useful is that they can be sliced without copying the underlying data, unlike bytes/str.

Consider the following toy example.

import time

for n in (100000, 200000, 300000, 400000):
    data = b'x'*n
    start = time.time()
    b = data
    while b:
        b = b[1:]
    print(f'     bytes {n} {time.time() - start:0.3f}')

for n in (100000, 200000, 300000, 400000):
    data = b'x'*n
    start = time.time()
    b = memoryview(data)
    while b:
        b = b[1:]
    print(f'memoryview {n} {time.time() - start:0.3f}')

On my computer, I get

     bytes 100000 0.211
     bytes 200000 0.826
     bytes 300000 1.953
     bytes 400000 3.514
memoryview 100000 0.021
memoryview 200000 0.052
memoryview 300000 0.043
memoryview 400000 0.077

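You can clearly see the quadratic complexity of the repeated bytes slicing: every b[1:] copies all the remaining data, so doubling n roughly quadruples the running time. Even with only 400000 iterations, it's already unmanageable. Meanwhile, slicing the memoryview copies nothing, so that version has linear complexity and is lightning fast.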

Edit: Note that this was done in CPython. There was a bug in PyPy up to 4.0.1 that caused memoryviews to have quadratic performance.


memoryview objects are great when you need subsets of binary data that only need to support indexing. Instead of having to take slices (and create new, potentially large objects) to pass to another API, you can just take a memoryview object.
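As a rough sketch of the difference, the snippet below passes a region of a buffer to a consumer without copying it. The checksum function is a made-up stand-in for "another API", not something from a real library; any function that accepts bytes-like objects would behave the same way.

```python
def checksum(buf):
    """Hypothetical consumer: works on any bytes-like object."""
    return sum(buf) % 256

data = bytearray(range(256)) * 4    # 1024 bytes of sample data
view = memoryview(data)

region = view[256:512]    # a view: no bytes are copied
copied = data[256:512]    # a slice: allocates a new 256-byte bytearray

data[256] = 255           # change the underlying buffer afterwards
region_first = region[0]  # the view sees the change
copied_first = copied[0]  # the copy still holds the old value

region_sum = checksum(region)  # the view is passed like any bytes-like object
```

Because region still refers to data (region.obj is data), it reflects later changes to the buffer, while the sliced copy does not.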

One such API example would be the struct module. Instead of passing in a slice of the large bytes object to parse out packed C values, you pass in a memoryview of just the region you need to extract values from.
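Here is a minimal sketch of that idea. The binary layout (a 16-byte header followed by little-endian records of an unsigned 16-bit id and a 32-bit float) is invented purely for illustration; only the struct calls themselves are real.

```python
import struct

# Assumed record layout for this example: uint16 id + float32 value, little-endian.
record = struct.Struct('<Hf')

# Build some sample data: a 16-byte header followed by four packed records.
header = b'\x00' * 16
payload = b''.join(record.pack(i, i * 0.5) for i in range(4))
blob = header + payload

view = memoryview(blob)
body = view[16:]          # skip the header without copying the payload

# unpack_from accepts any bytes-like object, including a memoryview.
records = [
    record.unpack_from(body, offset)
    for offset in range(0, len(body), record.size)
]
print(records)
```

With a plain bytes slice, blob[16:] would copy the whole payload before a single value is parsed; the memoryview slice only creates a small view object.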

memoryview objects, in fact, support struct unpacking natively; you can target a region of the underlying bytes object with a slice, then use .cast() to 'interpret' the underlying bytes as long integers, or floating point values, or n-dimensional lists of integers. This makes for very efficient binary file format interpretations, without having to create more copies of the bytes.
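A small sketch of .cast() in action, assuming a made-up buffer that holds four native C ints followed by two native doubles. Note that cast() only accepts native single-character format codes, so the sample data is packed in native mode as well.

```python
import struct

# Sample data (layout assumed for illustration): 4 C ints, then 2 C doubles.
raw = struct.pack('4i', 1, 2, 3, 4) + struct.pack('2d', 0.5, 2.0)
split = struct.calcsize('4i')      # byte offset where the doubles start

view = memoryview(raw)

ints = view[:split].cast('i')      # reinterpret the first region as C ints
floats = view[split:].cast('d')    # reinterpret the rest as C doubles

# The same int region seen as a 2x2 grid; still no copy of the bytes.
grid = view[:split].cast('i', shape=[2, 2])

print(ints.tolist(), floats.tolist(), grid.tolist())
```

Only the tolist() calls build new Python objects here; the slices and casts themselves all refer to the original bytes.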


Let me spell out where the misunderstanding lies.

The questioner, like myself, expected to be able to create a memoryview that selects a slice of an existing array (for example a bytes or bytearray). We therefore expected something like:

desired_slice_view = memoryview(existing_array, start_index, end_index)

Alas, there is no such constructor, and the docs don't make a point of what to do instead.

The key is that you have to first make a memoryview that covers the entire existing array. From that memoryview you can create a second memoryview that covers a slice of the existing array, like this:

whole_view = memoryview(existing_array)
desired_slice_view = whole_view[10:20]

In short, the purpose of the first line is simply to provide an object whose slice implementation (dunder-getitem) returns a memoryview.
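To check that the sliced view really refers to the original storage, here is a small sketch using a bytearray with made-up contents. Writing through the slice view changes the underlying array.

```python
existing_array = bytearray(b'0123456789abcdefghijklmnop')

whole_view = memoryview(existing_array)
desired_slice_view = whole_view[10:20]

before = bytes(desired_slice_view)      # b'abcdefghij'

# Writing through the slice view modifies the original array in place.
desired_slice_view[0] = ord('X')
after = bytes(existing_array[10:20])    # b'Xbcdefghij'

# The two steps can also be chained in a single expression.
one_liner = memoryview(existing_array)[10:20]
```

The chained form is probably the closest thing to the constructor call one might have hoped for.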

That might seem untidy, but one can rationalize it a couple of ways:

Our desired output is a memoryview that is a slice of something. Normally we get a sliced object from an object of that same type, by using the slice operator [10:20] on it. So there's some reason to expect that we need to get our desired_slice_view from a memoryview, and that therefore the first step is to get a memoryview of the whole underlying array.
The naive expectation of a memoryview constructor with start and end arguments fails to consider that the slice specification really needs all the expressivity of the usual slice operator (including things like [3::2] or [:-4] etc). There is no way to just use the existing (and understood) operator in that one-liner constructor. You can't attach it to the existing_array argument, as that will make a slice of that array, instead of telling the memoryview constructor some slice parameters. And you can't use the operator itself as an argument, because it's an operator and not a value or object.
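For illustration, here is a sketch of the richer slice forms mentioned above applied to a memoryview, using ten sample bytes so the results are easy to read.

```python
existing_array = bytes(range(10))       # sample data: the values 0..9
whole_view = memoryview(existing_array)

every_other = whole_view[3::2]          # stepped slice
trimmed = whole_view[:-4]               # negative end index
backwards = whole_view[::-1]            # negative step

print(every_other.tolist())             # [3, 5, 7, 9]
print(trimmed.tolist())                 # [0, 1, 2, 3, 4, 5]
print(backwards.tolist())               # [9, 8, 7, 6, 5, 4, 3, 2, 1, 0]

# A stepped view is still a view, but it is no longer contiguous in memory.
print(every_other.contiguous, trimmed.contiguous)   # False True
```

One thing to keep in mind: some operations, such as cast(), require a contiguous view, so a stepped slice like whole_view[3::2] cannot be cast directly.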

Conceivably, a memoryview constructor could take a slice object:

desired_slice_view = memoryview(existing_array, slice(1, 5, 2))

... but that's not very satisfactory, since users would have to learn about the slice object and what its constructor's parameters mean, when they already think in terms of the slice operator's notation.
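For what it's worth, a slice object can already be used with the subscript syntax, so the effect of that imagined constructor is easy to get. In the sketch below, view_of is a hypothetical helper written for this example, not part of Python.

```python
def view_of(buffer, slice_spec):
    """Hypothetical helper: return a memoryview over a slice of buffer."""
    return memoryview(buffer)[slice_spec]

existing_array = b'abcdefgh'

# A slice object in square brackets is equivalent to the operator notation.
via_object = view_of(existing_array, slice(1, 5, 2))
via_operator = memoryview(existing_array)[1:5:2]

print(via_object.tolist(), via_operator.tolist())   # [98, 100] [98, 100]
```

Both views select the bytes at indexes 1 and 3, that is b'b' and b'd'.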
