Abstract: Sparse Matrix Multiplication (SpMM) is a crucial algorithm in modern workloads such as Artificial Intelligence (AI), Graph Neural Networks (GNNs), Graph Convolutional Networks (GCNs), and neural ...
In industrial recommendation systems, the shift toward Generative Retrieval (GR) is replacing traditional embedding-based nearest neighbor search with Large Language Models (LLMs). These models ...
python main.py    # modify benchmarklist.csv if necessary
COO and CSR data are stored in coo_data/ and csr_data/, respectively.
make sim_start    # Run simulation (default: ./sim_coo2csr)
Output waveforms ...
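For readers unfamiliar with the two storage formats named above, the conversion from COO (coordinate triplets) to CSR (compressed sparse rows) can be sketched in plain Python. This is a software reference sketch only, not the repository's main.py or its simulated design; the function name `coo_to_csr` and its parameters are hypothetical, and it assumes zero-based indices with no duplicate entries.

```python
def coo_to_csr(n_rows, rows, cols, vals):
    """Convert COO triplets into CSR arrays (row_ptr, col_idx, data).

    Hypothetical helper for illustration; assumes zero-based indices
    and that every row index is smaller than n_rows.
    """
    nnz = len(vals)

    # Step 1: count the nonzeros in each row (stored one slot to the right).
    row_ptr = [0] * (n_rows + 1)
    for r in rows:
        row_ptr[r + 1] += 1

    # Step 2: a prefix sum turns the counts into row start offsets.
    for i in range(n_rows):
        row_ptr[i + 1] += row_ptr[i]

    # Step 3: scatter each entry into the next free slot of its row.
    next_slot = row_ptr[:-1].copy()
    col_idx = [0] * nnz
    data = [0] * nnz
    for r, c, v in zip(rows, cols, vals):
        dst = next_slot[r]
        col_idx[dst] = c
        data[dst] = v
        next_slot[r] += 1

    return row_ptr, col_idx, data


# Example: a 3x3 matrix with four nonzeros, given in arbitrary COO order.
row_ptr, col_idx, data = coo_to_csr(
    3,
    rows=[0, 2, 0, 1],
    cols=[0, 1, 2, 1],
    vals=[1, 5, 2, 3],
)
```

Entries within a row keep the order in which they appear in the COO input, so the column indices of a row are sorted only if the input was already sorted by column within each row.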
* Program re-ordering for improved L2 cache hit rate.
* Automatic performance tuning.

# Motivations #

Matrix multiplications are a key building block of most modern high-performance computing systems.
Abstract: This research proposes and evaluates a novel approach to optimizing matrix multiplication (MatMul) on Huawei Ascend NPUs, motivated by a key insight: during matrix-vector multiplication ...