An MPI-CUDA Implementation and Optimization for Parallel Sparse Equations and Least Squares (LSQR)