How to convert sRGB to NV12 format using NumPy?


Converting sRGB to NV12 format using NumPy

The purpose of this post is to demonstrate the conversion process.
The Python implementation below uses NumPy and deliberately avoids OpenCV.

RGB to NV12 conversion stages:

  • Color space conversion - convert from sRGB to YUV color space:
    Use the sRGB to YCbCr conversion formula.
    Multiply each RGB triple by a 3x3 conversion matrix, and add a vector of 3 offsets.
    The post shows both BT.709 and BT.601 conversions (the only difference is the coefficient matrix).
  • Chroma downsampling - shrink the U and V channels by a factor of 2 in each axis (converting from YUV444 to YUV420).
    The implementation resizes U and V by a factor of 0.5 in each axis using bi-linear interpolation.
    Note: bi-linear interpolation is not the optimal downsampling method, but it is usually good enough.
    Instead of calling cv2.resize, the code averages every 2x2 block of pixels (when shrinking by exactly 0.5, the result is equivalent to bi-linear interpolation).
    Note: the implementation fails if the input width or height is odd.
  • Chroma elements interleaving - arrange the U and V elements as U,V,U,V...
    Implemented by array indexing (strided slicing).
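The "3x3 matrix plus offsets" description of the first stage can also be written as a single matrix product. The following is a minimal sketch, not the code used in this post: the names rgb_to_yuv444, BT709_MATRIX, BT601_MATRIX and YUV_OFFSET are made up for illustration, and the coefficients are copied from the formulas in this post.

```python
import numpy as np

# Coefficients copied from this post ("limited range" 8 bit YUV).
# Rows are the Y, U and V equations; columns multiply R, G and B.
BT709_MATRIX = np.array([[ 0.18258588,  0.61423059,  0.06200706],
                         [-0.10064373, -0.33857195,  0.43921569],
                         [ 0.43921569, -0.39894216, -0.04027352]])

BT601_MATRIX = np.array([[ 0.25678824,  0.50412941,  0.09790588],
                         [-0.14822290, -0.29099279,  0.43921569],
                         [ 0.43921569, -0.36778831, -0.07142737]])

# Offsets added to Y, U and V respectively.
YUV_OFFSET = np.array([16.0, 128.0, 128.0])


def rgb_to_yuv444(rgb, matrix=BT709_MATRIX, offset=YUV_OFFSET):
    """Hypothetical helper: rgb is a (rows, cols, 3) float array in [0, 255].

    Returns a (rows, cols, 3) float array holding Y, U, V per pixel.
    """
    # Each RGB triple (a row vector) is multiplied by the transposed matrix,
    # then the offset vector is broadcast over all pixels.
    return rgb @ matrix.T + offset


white = np.full((2, 2, 3), 255.0)
black = np.zeros((2, 2, 3))
print(rgb_to_yuv444(white)[0, 0])                # approximately [235, 128, 128]
print(rgb_to_yuv444(black, BT601_MATRIX)[0, 0])  # [16, 128, 128]
```

White maps to Y=235 and black to Y=16, which is the "limited range" the formulas target; gray pixels always get U=V=128.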

Here is a Python code sample for converting RGB to the NV12 format:

import numpy as np
import matplotlib.pyplot as plt
import matplotlib.image as mpimg

do_use_bt709 = True  # True for BT.709, False for BT.601

# Read RGB input image (a PNG is read as float in [0, 1]), multiply by 255 (set RGB range to [0, 255]).
RGB = mpimg.imread('rgb_input.png')*255.0
R, G, B = RGB[:, :, 0], RGB[:, :, 1], RGB[:, :, 2]  # Split RGB to R, G and B numpy arrays.
rows, cols = R.shape

# I. Convert RGB to YUV (convert sRGB to YUV444)
#################################################
if do_use_bt709:
    # Convert sRGB to YUV, BT.709 standard.
    # Conversion formula used: 8 bit sRGB to "limited range" 8 bit YUV (BT.709).
    Y =  0.18258588*R + 0.61423059*G + 0.06200706*B + 16.0
    U = -0.10064373*R - 0.33857195*G + 0.43921569*B + 128.0
    V =  0.43921569*R - 0.39894216*G - 0.04027352*B + 128.0
else:
    # Convert sRGB to YUV, BT.601 standard.
    # Conversion formula used: 8 bit sRGB to "limited range" 8 bit YUV (BT.601).
    Y =  0.25678824*R + 0.50412941*G + 0.09790588*B + 16.0
    U = -0.14822290*R - 0.29099279*G + 0.43921569*B + 128.0
    V =  0.43921569*R - 0.36778831*G - 0.07142737*B + 128.0

# II. U,V Downsampling (convert YUV444 to YUV420)
##################################################
# Shrink U and V channels by a factor of 2 in each axis (use bi-linear interpolation).
#shrunkU = cv2.resize(U, dsize=(cols//2, rows//2), interpolation=cv2.INTER_LINEAR)
#shrunkV = cv2.resize(V, dsize=(cols//2, rows//2), interpolation=cv2.INTER_LINEAR)

# Each element of shrunkU is the mean of 2x2 elements of U.
# Result is equivalent to resize by a factor of 0.5 with bi-linear interpolation.
# Note: rows and cols must both be even, otherwise the slices below have mismatched shapes.
shrunkU = (U[0::2, 0::2] + U[1::2, 0::2] + U[0::2, 1::2] + U[1::2, 1::2]) * 0.25
shrunkV = (V[0::2, 0::2] + V[1::2, 0::2] + V[0::2, 1::2] + V[1::2, 1::2]) * 0.25

# III. U,V Interleaving
########################
# Size of UV plane is half the number of rows, and same number of columns as Y plane.
UV = np.zeros((rows//2, cols))  # Use // for integer division.

# Interleave shrunkU and shrunkV and build UV plane (each row of UV plane is u,v,u,v,...)
UV[:, 0::2] = shrunkU
UV[:, 1::2] = shrunkV

# Place Y plane at the top, and UV plane at the bottom (the NV12 matrix has rows*1.5 rows).
NV12 = np.vstack((Y, UV))

# Round NV12, and cast to uint8 (use floor(x+0.5) instead of round to avoid "bankers rounding").
NV12 = np.floor(NV12 + 0.5).astype('uint8')

# Write NV12 array to binary file.
NV12.tofile('nv12_output.raw')

# Display NV12 result (display as Grayscale image).
plt.figure()
plt.axis('off')
plt.imshow(NV12, cmap='gray', interpolation='nearest')
plt.show()
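The limitation noted earlier (the code fails when the width or height is odd) can be worked around by padding the planes to even dimensions before downsampling. The sketch below is one possible approach and not part of the original implementation: pad_to_even and downsample_2x2 are hypothetical helper names, and replicating the last row/column is an assumption (other padding choices are possible). The same padding would have to be applied to Y as well, so that the Y and UV planes keep matching widths.

```python
import numpy as np


def pad_to_even(plane):
    """Hypothetical helper: replicate the last row/column so both dimensions are even."""
    rows, cols = plane.shape
    # rows % 2 is 1 for an odd size and 0 for an even size.
    return np.pad(plane, ((0, rows % 2), (0, cols % 2)), mode='edge')


def downsample_2x2(plane):
    """Hypothetical helper: mean of every 2x2 block, after padding to even size."""
    p = pad_to_even(plane)
    rows, cols = p.shape
    # View the plane as (rows/2, 2, cols/2, 2) blocks and average inside each block.
    return p.reshape(rows // 2, 2, cols // 2, 2).mean(axis=(1, 3))


# Example with an odd number of rows (3x4): the last row is duplicated before averaging.
U = np.arange(12, dtype=np.float64).reshape(3, 4)
print(downsample_2x2(U))
# [[ 2.5  4.5]
#  [ 8.5 10.5]]
```

For even-sized input, the reshape-and-mean form gives the same result as summing the four strided slices and multiplying by 0.25. A decoder needs to know the original size in order to crop the padded row or column.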

Sample RGB input image:
RGB input

NV12 Result (displayed as Grayscale image):
NV12 output as Grayscale
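
In the grayscale view, the top two thirds are the Y plane and the bottom third is the interleaved UV plane. A raw NV12 file stores no width or height, so the reader has to supply them. The sketch below shows one way to load such a file and split it back into planes. It is illustrative only: split_nv12 is a hypothetical helper, and a small synthetic frame written to a temporary file stands in for nv12_output.raw.

```python
import os
import tempfile

import numpy as np


def split_nv12(nv12, rows, cols):
    """Hypothetical helper: split an NV12 buffer into Y, U and V planes.

    rows and cols are the dimensions of the Y plane (both assumed even).
    Returns Y with shape (rows, cols), U and V with shape (rows/2, cols/2).
    """
    frame = np.asarray(nv12).reshape(rows * 3 // 2, cols)
    y = frame[:rows, :]       # Top part: luma.
    uv = frame[rows:, :]      # Bottom part: interleaved u,v,u,v,...
    return y, uv[:, 0::2], uv[:, 1::2]


# Build a tiny synthetic 4x6 NV12 frame: Y=100, U=50, V=200.
rows, cols = 4, 6
y_plane = np.full((rows, cols), 100, dtype=np.uint8)
uv_plane = np.tile(np.array([50, 200], dtype=np.uint8), (rows // 2, cols // 2))
frame = np.vstack((y_plane, uv_plane))

# Write and read back, the same way the raw output file would be handled.
path = os.path.join(tempfile.mkdtemp(), 'nv12_demo.raw')
frame.tofile(path)
data = np.fromfile(path, dtype=np.uint8)

y, u, v = split_nv12(data, rows, cols)
print(y.shape, u.shape, v.shape)  # (4, 6) (2, 3) (2, 3)
```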