10x Faster Matrix Determination Using Numba Instead of Numpy

INFO

This function can be 10x faster with the exact result as Numpy.linalg.det()

Background

The speed of numpy.linalg.det() is slow, let’s have a test:

b = 256**2
k = 3
A = np.random.randn(b,k,k)
npDet = np.linalg.det(A)
print("The Numpy Determinant of A is", round(sum(npDet),9))

Output: The Numpy Determinant of A is 176.93690024

Test the speed:

%timeit npDet = np.linalg.det(A)

Output: 10.8 ms ± 90 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

Code

@numba.jit(nopython=True)
def determinant(A):
    """A shape should be (B,n,n) return shape (B,)  """
    B,n,_ = A.shape
    AM = A.copy()
    # Section 2: Row ops on A to get in upper triangle form
    product = np.ones(B,dtype=A.dtype)
    for b in range(B):
        for fd in range(n): # A) fd stands for focus diagonal
            for i in range(fd+1,n): # B) only use rows below fd row
                if AM[b][fd][fd] == 0: # C) if diagonal is zero ...
                    AM[b][fd][fd] = 1.0e-18 # change to ~zero
                # D) cr stands for "current row"
                crScaler = AM[b][i][fd] / AM[b][fd][fd] 
                # E) cr - crScaler * fdRow, one element at a time
                for j in range(n): 
                    AM[b][i][j] = AM[b][i][j] - crScaler * AM[b][fd][j]
        # Section 3: Once AM is in upper triangle form ...
        for i in range(n):
            # ... product of diagonals is determinant
            product[b] *= AM[b][i][i] 
 
    return product

Test

b = 256**2
k = 3
A = np.random.randn(b,k,k)
Det = determinant(A)
npDet = np.linalg.det(A)
torch_Det = torch.linalg.det(torch.Tensor(A))
print("Determinant of A is", round(sum(Det),9))
print("The Numpy Determinant of A is", round(sum(npDet),9))
print("The Torch Determinant of A is", round(sum(npDet),9))
%timeit Det = determinant(A)
%timeit npDet = np.linalg.det(A)
%timeit torch_Det = torch.linalg.det(torch.Tensor(A))

Output:

Determinant of A is -397.275054658
The Numpy Determinant of A is -397.275054658
The Torch Determinant of A is -397.275054658
1.73 ms ± 10.4 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)
10.8 ms ± 90 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
3.73 ms ± 381 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

The new function is 10x faster than numpy and 2x faster than pytorch

INFO

Background

Code

Test

Enjoy Reading This Article?