NEC Vector Engine

GitHub

Introduction

  • NEC Vector Engine Processor:
    • 8 vector cores
    • 64 vector registers per core
    • First to use 6 HBM2 memory modules
    • 1.2 TB/s memory bandwidth
  • SpGEMM Implementation:
    • Novel hybrid method with sparse vectors
    • Optimized for vector architectures
    • 139% average improvement over CPU
    • Up to 6.43x performance gain
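
For readers unfamiliar with SpGEMM (sparse matrix times sparse matrix), the following is a minimal Python sketch of the classic row-wise formulation, Gustavson's algorithm, on CSR inputs. It is not the hybrid sparse-vector method listed above, whose details are not given on this page. The function name `spgemm_csr` and the `(indptr, indices, data, n_cols)` tuple layout are assumptions made for illustration only.

```python
def spgemm_csr(a, b):
    """Multiply two CSR matrices row by row (Gustavson's algorithm).

    Each matrix is a tuple (indptr, indices, data, n_cols). This layout
    is an assumption for the sketch, not the project's data format.
    """
    a_ptr, a_idx, a_val, _ = a
    b_ptr, b_idx, b_val, b_cols = b
    c_ptr, c_idx, c_val = [0], [], []
    for i in range(len(a_ptr) - 1):
        acc = {}  # sparse accumulator for row i of C
        for p in range(a_ptr[i], a_ptr[i + 1]):
            k, a_ik = a_idx[p], a_val[p]
            # Scale row k of B by A[i, k] and merge it into the accumulator.
            for q in range(b_ptr[k], b_ptr[k + 1]):
                j = b_idx[q]
                acc[j] = acc.get(j, 0.0) + a_ik * b_val[q]
        # Emit the accumulated row with column indices in ascending order.
        for j in sorted(acc):
            c_idx.append(j)
            c_val.append(acc[j])
        c_ptr.append(len(c_idx))
    return c_ptr, c_idx, c_val, b_cols


# A = [[1, 0, 2],
#      [0, 3, 0],
#      [4, 0, 0]]
A = ([0, 2, 3, 4], [0, 2, 1, 0], [1.0, 2.0, 3.0, 4.0], 3)
A_squared = spgemm_csr(A, A)
```

The inner loops have irregular, data-dependent lengths, which is the main obstacle to running this formulation efficiently on long vector registers. Note that the sketch keeps entries that cancel to zero as explicit stored values.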

Performance Evaluation

  • Testing Details:
    • Compared with Intel MKL
    • Evaluated using A^2 calculation
    • Excludes I/O time
    • Shows better scalability with 2 sockets
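
The measurement protocol above (time the A^2 product only, not matrix loading) can be sketched as follows. This is a minimal pure-Python illustration under stated assumptions, not the benchmark used in the evaluation: `load_matrix`, `square`, the triplet input format, and the dict-of-rows representation are all hypothetical names and choices made for the example.

```python
import time


def load_matrix(lines):
    """Parse 'row col value' triplets into a dict of rows (the I/O phase)."""
    rows = {}
    for line in lines:
        i, j, v = line.split()
        rows.setdefault(int(i), {})[int(j)] = float(v)
    return rows


def square(rows):
    """Compute A^2 for a dict-of-rows sparse matrix."""
    result = {}
    for i, row in rows.items():
        out = {}
        for k, a_ik in row.items():
            # Row k of A contributes a_ik * A[k, j] to entry (i, j).
            for j, a_kj in rows.get(k, {}).items():
                out[j] = out.get(j, 0.0) + a_ik * a_kj
        if out:
            result[i] = out
    return result


triplets = ["0 0 1", "0 2 2", "1 1 3", "2 0 4"]

matrix = load_matrix(triplets)   # I/O and parsing: deliberately not timed

start = time.perf_counter()      # the timer covers only the A^2 product
product = square(matrix)
elapsed = time.perf_counter() - start

nonzeros = sum(len(r) for r in product.values())
print(f"A^2 has {nonzeros} nonzeros; compute time {elapsed:.6f} s (I/O excluded)")
```

Squaring the input matrix is a convenient benchmark because it needs only one input file and the product's sparsity pattern depends on the matrix structure itself.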

Publications

B. Peng, J. Li, S. Akkas, T. Araki, O. Yoshiyuki, and J. Qiu, “Rank Position Forecasting in Car Racing,” in Proceedings of the 35th IEEE International Parallel & Distributed Processing Symposium (IPDPS’21), 2021.

J. Li, F. Wang, T. Araki, and J. Qiu, “Generalized sparse matrix-matrix multiplication for vector engines and graph applications,” in MCHPC’19: Workshop on Memory Centric High Performance Computing, ACM, 2019.