Simultaneously diagonalize matrices with numpy Simultaneously diagonalize matrices with numpy numpy numpy

Simultaneously diagonalize matrices with numpy


There is a fairly simple and very elegant simultaneous diagonalization algorithm based on Givens rotation that was published by Cardoso and Soulomiac in 1996:

Cardoso, J., & Souloumiac, A. (1996). Jacobi Angles for Simultaneous Diagonalization. SIAM Journal on Matrix Analysis and Applications, 17(1), 161–164. doi:10.1137/S0895479893259546

I've attached a numpy implementation of the algorithm at the end of this response. Caveat: It turns out simultaneous diagonalization is a bit of a tricky numerical problem, with no algorithm (to the best of my knowledge) that guarantees global convergence. However, the cases in which it does not work (see the paper) are degenerate and in practice I have never had the Jacobi angles algorithm fail on me.

#!/usr/bin/env python2.7# -*- coding: utf-8 -*-"""Routines for simultaneous diagonalizationArun Chaganty <arunchaganty@gmail.com>"""import numpy as npfrom numpy import zeros, eye, diagfrom numpy.linalg import normdef givens_rotate( A, i, j, c, s ):    """    Rotate A along axis (i,j) by c and s    """    Ai, Aj = A[i,:], A[j,:]    A[i,:], A[j,:] = c * Ai + s * Aj, c * Aj - s * Ai     return Adef givens_double_rotate( A, i, j, c, s ):    """    Rotate A along axis (i,j) by c and s    """    Ai, Aj = A[i,:], A[j,:]    A[i,:], A[j,:] = c * Ai + s * Aj, c * Aj - s * Ai     A_i, A_j = A[:,i], A[:,j]    A[:,i], A[:,j] = c * A_i + s * A_j, c * A_j - s * A_i     return Adef jacobi_angles( *Ms, **kwargs ):    r"""    Simultaneously diagonalize using Jacobi angles    @article{SC-siam,       HTML =   "ftp://sig.enst.fr/pub/jfc/Papers/siam_note.ps.gz",       author = "Jean-Fran\c{c}ois Cardoso and Antoine Souloumiac",       journal = "{SIAM} J. Mat. Anal. Appl.",       title = "Jacobi angles for simultaneous diagonalization",       pages = "161--164",       volume = "17",       number = "1",       month = jan,       year = {1995}}    (a) Compute Givens rotations for every pair of indices (i,j) i < j             - from eigenvectors of G = gg'; g = A_ij - A_ji, A_ij + A_ji            - Compute c, s as \sqrt{x+r/2r}, y/\sqrt{2r(x+r)}    (b) Update matrices by multiplying by the givens rotation R(i,j,c,s)    (c) Repeat (a) until stopping criterion: sin theta < threshold for all ij pairs    """    assert len(Ms) > 0    m, n = Ms[0].shape    assert m == n    sweeps = kwargs.get('sweeps', 500)    threshold = kwargs.get('eps', 1e-8)    rank = kwargs.get('rank', m)    R = eye(m)    for _ in xrange(sweeps):        done = True        for i in xrange(rank):            for j in xrange(i+1, m):                G = zeros((2,2))                for M in Ms:                    g = np.array([ M[i,i] - M[j,j], M[i,j] + M[j,i] ])                    G += np.outer(g,g) / len(Ms)                # Compute the eigenvector directly                t_on, t_off = G[0,0] - G[1,1], G[0,1] + G[1,0]                theta = 0.5 * np.arctan2( t_off, t_on + np.sqrt( t_on*t_on + t_off * t_off) )                c, s = np.cos(theta), np.sin(theta)                if abs(s) > threshold:                    done = False                    # Update the matrices and V                    for M in Ms:                        givens_double_rotate(M, i, j, c, s)                        #assert M[i,i] > M[j, j]                    R = givens_rotate(R, i, j, c, s)        if done:            break    R = R.T    L = np.zeros((m, len(Ms)))    err = 0    for i, M in enumerate(Ms):        # The off-diagonal elements of M should be 0        L[:,i] = diag(M)        err += norm(M - diag(diag(M)))    return R, L, err


I am not aware of any direct solution. But why not just getting the eigenvalues and the eigenvectors of the first matrix, and using the eigenvectors to transform all other matrices to the diagonal form? Something like:

eigvals, eigvecs = np.linalg.eig(matrix1)eigvals2 = np.diagonal(np.dot(np.dot(transpose(eigvecs), matrix2), eigvecs))

You can the add the columns to an array via hstack if you like.

UPDATE: As pointed out below, this is only valid if no degenerate eigenvalues occur. Otherwise one would have to check first for the degenerate eigenvalues, then transform the 2nd matrix to a blockdiagonal form, and diagonalize eventual blocks bigger than 1x1 separately.


If you know something about the size of the eigenvalues of the two matrices in advance, you can diagonalize a linear combination of the two matrices, with coefficients chosen to break the degeneracy. For example, if the eigenvalues of both lie between -10 and 10, you could diagonalize 100*M1 + M2. There's a slight loss of precision, but for many purposes it's good enough--and quick and easy!