[12-7] "High-Performance Scientific Computing" Frontier Series Invited Talk 3 -- Reliable Matrix Computations via Algorithm-Based Fault Tolerance
Title: Reliable Matrix Computations via Algorithm-Based Fault Tolerance
Speaker: Prof. Zizhong Chen
Department of Computer Science and Engineering
University of California, Riverside
Time: 13:30, Wednesday, Dec. 7, 2016
Venue: Mid Conference Room, Level 4, Building 5,
Institute of Software, Chinese Academy of Sciences.
Abstract: Errors are common in today's computer systems. When an error occurs and the affected application keeps running, we call it a fail-continue error; otherwise, we call it a fail-stop error. In this talk, I will discuss our recent work on algorithm-based fault tolerance for reliable matrix computations. We have developed highly efficient error correction techniques for selected widely used matrix computation algorithms, tolerating both fail-continue and fail-stop errors. By leveraging the specific algorithmic characteristics of these algorithms, the proposed techniques can achieve much higher efficiency than the traditional general techniques (i.e., Triple Modular Redundancy for fail-continue errors and checkpoint/restart for fail-stop errors).
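As a rough illustration of the general idea behind algorithm-based fault tolerance (not the specific techniques presented in the talk), the sketch below uses the classic checksum scheme for matrix multiplication: the inputs are extended with checksum rows and columns, the product then carries its own checksums, and a single corrupted entry can be located and repaired without recomputing the product three times as Triple Modular Redundancy would. All function names are hypothetical, and integer matrices are assumed so that checksums can be compared exactly; floating-point data would need a tolerance.

```python
def matmul(a, b):
    """Plain matrix product of two lists of lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def add_column_checksum(a):
    """Append a row holding the sum of each column of a."""
    return a + [[sum(col) for col in zip(*a)]]


def add_row_checksum(b):
    """Append a column holding the sum of each row of b."""
    return [row + [sum(row)] for row in b]


def locate_and_correct(cf):
    """Check a full-checksum product and fix one corrupted data entry in place.

    Returns the (row, col) of the corrected entry, or None if all checksums
    are consistent. Assumes at most one data entry is wrong and that the
    checksum entries themselves are intact.
    """
    n, m = len(cf) - 1, len(cf[0]) - 1
    bad_rows = [i for i in range(n) if sum(cf[i][:m]) != cf[i][m]]
    bad_cols = [j for j in range(m)
                if sum(cf[i][j] for i in range(n)) != cf[n][j]]
    if not bad_rows and not bad_cols:
        return None
    if len(bad_rows) != 1 or len(bad_cols) != 1:
        raise ValueError("error pattern is not a single corrupted data entry")
    i, j = bad_rows[0], bad_cols[0]
    # Rebuild the entry from the row checksum and the other entries in row i.
    cf[i][j] = cf[i][m] - sum(cf[i][k] for k in range(m) if k != j)
    return (i, j)


A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]

# The product of the checksum-extended inputs carries checksums of A*B.
Cf = matmul(add_column_checksum(A), add_row_checksum(B))

Cf[0][1] += 100  # simulate a fail-continue error in one result entry
fixed_at = locate_and_correct(Cf)

# Strip the checksum row and column to recover the data part of the product.
C = [row[:-1] for row in Cf[:-1]]
```

The extra cost here is one additional row and column in the multiplication plus the checksum comparisons, which is the kind of saving over full replication that the abstract refers to.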
Bio: Zizhong Chen is an Associate Professor in the Department of Computer Science and Engineering at the University of California, Riverside. He specializes in reliable and high-performance scientific computing, numerical algorithms and software, and algorithm-based fault tolerance. He has published over 80 papers, many of them in highly competitive conferences and journals such as HPDC, PPoPP, SC, ICS, IPDPS, TPDS, TC, JPDC, PARCO, SIMAX, SISC, and IBMRD. He has received a CAREER Award from the U.S. National Science Foundation and a Best Paper Award from the International Supercomputing Conference. Dr. Chen is a Senior Member of the IEEE and a Life Member of the ACM. He currently serves as a Subject Area Editor for the Elsevier journal Parallel Computing and as an Associate Editor for IEEE Transactions on Parallel and Distributed Systems.
All are welcome!