Siva Kumar Sastry Hari

Principal Research Scientist
Architecture Research Group
NVIDIA
IEEE Senior Member
Email: shari [at] nvidia [dot] com
Google Scholar page
My NVIDIA page

Siva Hari is a Principal Research Scientist in the Architecture Research Group at NVIDIA. His research interests are in computer architecture, artificial intelligence, and systems, with a current focus on autonomous and high-performance computing systems. He obtained his Ph.D. and M.S. in Computer Science from the University of Illinois at Urbana-Champaign and his B.Tech. in Computer Science and Engineering from the Indian Institute of Technology (IIT) Madras.

He received the 2023 Rising Star in Dependability Award at the DSN conference and the 2014 David J. Kuck Outstanding Ph.D. Thesis Award from the Computer Science Department at the University of Illinois at Urbana-Champaign. In 2012, he received the W.J. Poppelbaum Memorial Award from the same department for academic merit and creativity in computer hardware or architecture. His work has received the following recognitions: two papers selected as IEEE Top Picks in Test and Reliability in 2023, one paper selected as an IEEE Micro Top Pick in 2022, the Best Research Paper Award at ISSRE 2020, the Best Paper Award Runner-Up at DSN 2018, one paper selected as an IEEE Micro Top Pick in 2013, and the Margarida Jacome Best Poster Award at the GSRC Annual Symposium in 2012.




Conference and Journal Publications

  1. ALBERTA: ALgorithm-Based Error Resilience in Transformer Architectures
    H. Liu, V. Singh, M. Filipiuk, S. K. S. Hari
    IEEE Open Journal of the Computer Society, 2024

  2. Safety-Critical Scenario Generation Via Reinforcement Learning Based Editing
    H. Liu, L. Zhang, S. K. S. Hari, J. Zhao
    ICRA'24: IEEE International Conference on Robotics and Automation, 2024

  3. VaPr: Variable-Precision Tensors to Accelerate Robot Motion Planning
    Y. S. Hsiao, S. K. S. Hari, B. Sundaralingam, J. Yik, T. Tambe, C. Sakr, S. W. Keckler, V. J. Reddi
    IROS'23: IEEE/RSJ International Conference on Intelligent Robots and Systems, 2023

  4. CuRobo: Parallelized Collision-Free Robot Motion Generation
    B. Sundaralingam, S. K. S. Hari, A. Fishman, C. Garrett, K. Van Wyk, A. Millane, H. Oleynikova, A. Handa, F. Ramos, N. Ratliff, and D. Fox
    ICRA'23: IEEE International Conference on Robotics and Automation, 2023

  5. Zhuyi: Perception Processing Rate Estimation for Safety of Autonomous Vehicles
    Y. S. Hsiao, S. K. S. Hari, M. Filipiuk, T. Tsai, M. B. Sullivan, V. J. Reddi, V. Singh, and S. W. Keckler
    DAC'22: Design Automation Conference, 2022

  6. Exploiting Temporal Data Diversity for Detecting Safety-critical Faults in AV Compute Systems
    S. Jha, S. Cui, T. Tsai, S. K. S. Hari, M. B. Sullivan, Z. T. Kalbarczyk, S. W. Keckler, R. K. Iyer
    DSN'22: IEEE/IFIP International Conference on Dependable Systems and Networks, 2022

  7. Characterizing and Mitigating Soft Errors in GPU DRAM
    M. B. Sullivan, M. O’Connor, D. Lee, P. Racunas, S. Hukerikar, N. Saxena, T. Tsai, S. K. S. Hari, and S. W. Keckler
    TopPicks'22: IEEE Micro, Special Issue on Top Picks from the 2021 Computer Architecture Conferences, 2022

  8. Suraksha: A Framework to Analyze the Safety Implications of Perception Design Choices in AVs
    H. Zhao, S. K. S. Hari, T. Tsai, M. B. Sullivan, S. W. Keckler, and J. Zhao
    ISSRE'21: IEEE International Symposium on Software Reliability Engineering, 2021

  9. Optimizing Selective Protection for CNN Resilience
    A. Mahmoud, S. K. S. Hari, C. Fletcher, S. Adve, C. Sakr, N. Shanbhag, P. Molchanov, M. B. Sullivan, T. Tsai, and S. W. Keckler
    ISSRE'21: IEEE International Symposium on Software Reliability Engineering, 2021

  10. Characterizing and Mitigating Soft Errors in GPU DRAM
    M. B. Sullivan, M. O’Connor, D. Lee, P. Racunas, S. Hukerikar, N. Saxena, T. Tsai, S. K. S. Hari, and S. W. Keckler
    MICRO'21: IEEE/ACM International Symposium on Microarchitecture, 2021
    Selected as an IEEE Top Pick in Test and Reliability, 2023

  11. Generating and Characterizing Scenarios for Safety Testing of Autonomous Vehicles
    Z. Ghodsi, S. K. S. Hari, I. Frosio, T. Tsai, A. Troccoli, S. W. Keckler, S. Garg, and A. Anandkumar
    IEEE IV'21: IEEE Intelligent Vehicles Symposium, 2021

  12. NVBitFI: Dynamic Fault Injection for GPUs
    T. Tsai, S. K. S. Hari, M. B. Sullivan, O. Villa, and S. W. Keckler
    DSN'21: IEEE/IFIP International Conference on Dependable Systems and Networks, 2021

  13. Making Convolutions Resilient via Algorithm-Based Error Detection Techniques
    S. K. S. Hari, M. B. Sullivan, T. Tsai, and S. W. Keckler
    TDSC'21: IEEE Transactions on Dependable and Secure Computing, 2021

  14. Demystifying GPU Reliability: Comparing and Combining Beam Experiments, Fault Simulation, and Profiling
    F. F. do Santos, S. K. S. Hari, P. M. Basso, L. Carro, and P. Rech
    IPDPS'21: IEEE International Parallel & Distributed Processing Symposium, 2021

  15. AV-FUZZER: Finding Safety Violations in Autonomous Driving Systems
    G. Li, Y. Li, S. Jha, T. Tsai, M. B. Sullivan, S. K. S. Hari, Z. T. Kalbarczyk, and R. K. Iyer
    ISSRE'20: IEEE International Symposium on Software Reliability Engineering, 2020
    Best Research Paper Award

  16. GPU-TRIDENT: Efficient Modeling of Error Propagation in GPU Programs
    A. R. Anwer, G. Li, K. Pattabiraman, M. B. Sullivan, T. Tsai, and S. K. S. Hari
    SC'20: International Conference for High-Performance Computing, Networking, Storage and Analysis, 2020

  17. GPU Snapshot: Checkpoint Offloading for GPU-Dense Systems
    K. Lee, M. B. Sullivan, S. K. S. Hari, T. Tsai, S. W. Keckler, and M. Erez
    ICS'19: International Conference on Supercomputing, 2019

  18. ML-based Fault Injection for Autonomous Vehicles: A Case for Bayesian Fault Injection
    S. Jha, S. S. Banerjee, T. Tsai, S. K. S. Hari, M. B. Sullivan, Z. T. Kalbarczyk, S. W. Keckler, R. K. Iyer
    DSN'19: IEEE/IFIP International Conference on Dependable Systems and Networks, 2019

  19. Optimizing Software-Directed Instruction Replication for GPU Error Detection
    A. Mahmoud, S. K. S. Hari, M. Sullivan, T. Tsai, and S. Keckler
    SC'18: International Conference for High-Performance Computing, Networking, Storage and Analysis, 2018 (acceptance rate: ~19%)

  20. SwapCodes: Error Codes for Hardware-Software Cooperative GPU Pipeline Error Detection
    M. Sullivan, S. K. S. Hari, B. Zimmer, T. Tsai, and S. Keckler
    MICRO'18: IEEE/ACM International Symposium on Microarchitecture, 2018 (acceptance rate: ~21%)

  21. Modeling Soft Error Propagation in Programs
    G. Li, K. Pattabiraman, S. K. S. Hari, M. Sullivan, and T. Tsai
    DSN'18: IEEE/IFIP International Conference on Dependable Systems and Networks, 2018 (acceptance rate: ~25%)
    Best Paper Award Runner-Up

  22. Understanding Error Propagation in Deep-Learning Neural Networks (DNN) Accelerators and Applications
    G. Li, S. K. S. Hari, M. Sullivan, T. Tsai, K. Pattabiraman, J. Emer, S. Keckler
    SC'17: International Conference for High-Performance Computing, Networking, Storage and Analysis, 2017 (acceptance rate: ~19%)
    Selected as an IEEE Top Pick in Test and Reliability, 2023

  23. SASSIFI: An Architecture-level Fault Injection Tool for GPU Application Resilience Evaluation
    S. K. S. Hari, T. Tsai, M. Stephenson, S. Keckler, J. Emer
    ISPASS'17: IEEE International Symposium on Performance Analysis of Systems and Software, 2017 (acceptance rate: ~30%)

  24. Approxilyzer: Towards A Systematic Framework for Instruction-Level Approximate Computing and its Application to Hardware Resiliency
    R. Venkatagiri, A. Mahmoud, S. K. S. Hari, S. Adve
    MICRO'16: IEEE/ACM International Symposium on Microarchitecture, 2016 (acceptance rate: ~21%)

  25. Flexible Software Profiling of GPU Architectures
    M. Stephenson, S. K. S. Hari, Y. Lee, E. Ebrahimi, D. Johnson, D. Nellans, M. O’Connor, S. W. Keckler
    ISCA'15: International Symposium on Computer Architecture, 2015 (acceptance rate: ~19%)

  26. Locality-Driven Dynamic GPU Cache Bypassing
    C. Li, S. L. Song, H. Dai, A. Sidelnik, S. K. S. Hari, and H. Zhou
    ICS'15: International Conference on Supercomputing, 2015 (acceptance rate: ~25%)

  27. Hardware Fault Recovery for I/O Intensive Applications
    P. Ramachandran, S. K. S. Hari, M. Li, and S. V. Adve
    TACO'14: Transactions on Architecture and Code Optimization, 2014

  28. GangES: Gang Error Simulation for Hardware Resiliency Evaluation
    S. K. S. Hari, R. Venkatagiri, S. V. Adve, and H. Naeimi
    ISCA'14: International Symposium on Computer Architecture, 2014

  29. Relyzer: Application Resiliency Analyzer for Transient Faults
    S. K. S. Hari, S. V. Adve, H. Naeimi, and P. Ramachandran
    TopPicks'13: IEEE Micro, Special Issue on Top Picks from the 2012 Computer Architecture Conferences, 2013

  30. Low-cost Program-level Detectors for Reducing Silent Data Corruptions
    S. K. S. Hari, S. V. Adve, and H. Naeimi
    DSN'12: IEEE/IFIP International Conference on Dependable Systems and Networks, 2012 (acceptance rate: ~17%)

  31. Relyzer: Exploiting Application-level Fault Equivalence to Analyze Application Resiliency to Transient Faults
    S. K. S. Hari, S. V. Adve, H. Naeimi, and P. Ramachandran
    ASPLOS '12: International Conference on Architectural Support for Programming Languages and Operating Systems, 2012 (acceptance rate: ~21%)

  32. CrashTest'ing SWAT: Accurate, Gate-Level Evaluation of Symptom-Based Resiliency Solutions
    A. Pellegrini, R. Smolinski, L. Chen, X. Fu, S. K. S. Hari, J. Jiang, S. V. Adve, T. Austin, V. Bertacco
    DATE'12: Design, Automation and Test in Europe, 2012

  33. Architectures for Online Error Detection and Recovery in Multicore Processors
    D. Gizopoulos, M. Psarakis, S. V. Adve, P. Ramachandran, S. K. S. Hari, D. Sorin, A. Meixner, A. Biswas, X. Vera
    DATE'11: Design, Automation and Test in Europe, 2011 (acceptance rate: ~25%)

  34. mSWAT: Low-Cost Hardware Fault Detection and Diagnosis for Multicore Systems
    S. K. S. Hari, M. Li, P. Ramachandran, B. Choi, S. V. Adve
    MICRO'09: IEEE/ACM International Symposium on Microarchitecture, 2009 (acceptance rate: ~24%)

  35. Accurate Microarchitecture-level Fault Modeling for Studying Wear-out Faults
    M. Li, P. Ramachandran, U. Karpuzcu, S. K. S. Hari, S. V. Adve
    HPCA'09: International Symposium on High-Performance Computer Architecture, 2009 (acceptance rate: ~19%)

  36. Automatic Constraint Based Test Generation for Behavioral HDL Models
    S. K. S. Hari, V. V. Konda, V. Kamakoti, V. Vedula, K. S. Maneperambil
    TVLSI'08: IEEE Transactions on VLSI Systems in the special section on Design Verification and Validation: Theory and Techniques, 2008

  37. Power Virus Generation Using Behavioural Models of Circuits
    K. Najeeb, V. V. Konda, S. K. S. Hari, V. Kamakoti, V. Vedula
    VTS'07: IEEE VLSI Test Symposium, 2007 (acceptance rate: ~35%)

  38. Constructing Online Testable Circuits using Reversible Logic
    N. Mahammad, S. K. S. Hari, S. Shroff, V. Kamakoti
    VDAT'06: IEEE VLSI Design and Test Symposium, 2006

  39. Efficient Building Blocks for Reversible Sequential Circuit Design
    S. K. S. Hari, S. Shroff, N. Mahammad, V. Kamakoti
    MWSCAS'06: IEEE International Midwest Symposium on Circuits and Systems, 2006


arXiv and Workshop Publications

  41. Towards Precision-Aware Fault Tolerance Approaches for Mixed-Precision Applications
    B. Fang, S. K. S. Hari, T. Tsai, X. Li, G. Gopalakrishnan, I. Laguna, K. Barker, and A. Li
    FTXS'22: Workshop on Fault-Tolerance for HPC at Extreme Scale, 2022

  42. Suraksha: A Quantitative AV Safety Evaluation Framework to Analyze Safety Implications of Perception Design Choices
    H. Zhao, S. K. S. Hari, T. Tsai, M. B. Sullivan, S. W. Keckler, and J. Zhao
    SSIV'21: Workshop on Safety and Security of Intelligent Vehicles, 2021

  43. Simulation Driven Design and Test for Safety of AI Based Autonomous Vehicles
    V. Singh, S. K. S. Hari, T. Tsai, M. Pitale
    SAIAD'21: Workshop on Safe Artificial Intelligence for Automated Driving, 2021

  44. Generating and Characterizing Scenarios for Safety Testing of Autonomous Vehicles
    Z. Ghodsi, S. K. S. Hari, I. Frosio, T. Tsai, A. Troccoli, S. W. Keckler, S. Garg, and A. Anandkumar
    arXiv'21
    ART'20: Shorter version accepted at the IEEE International Workshop on Automotive Reliability and Test, 2020

  45. Making Convolutions Resilient via Algorithm-Based Error Detection Techniques
    S. K. S. Hari, M. B. Sullivan, T. Tsai, S. W. Keckler
    arXiv'20

  46. PyTorchFI: A Runtime Perturbation Tool for DNNs
    A. Mahmoud, N. Aggarwal, A. Nobbe, J. R. S. Vicarte, S. V. Adve, C. W. Fletcher, I. Frosio, and S. K. S. Hari
    DSN-S'20: IEEE/IFIP International Conference on Dependable Systems and Networks – Supplemental Volume, 2020, presented at the Workshop on Dependable and Secure Machine Learning (DSML), 2020

  47. Estimating Silent Data Corruption Rates Using a Two-Level Model
    S. K. S. Hari, P. Rech, T. Tsai, M. Stephenson, A. Zulfiqar, M. B. Sullivan, P. Shirvani, P. Racunas, J. Emer, and S. W. Keckler
    arXiv'20

  48. Feature Map Vulnerability Evaluation in CNNs
    A. Mahmoud, S. K. S. Hari, C. Fletcher, S. Adve, C. Sakr, N. Shanbhag, P. Molchanov, M. B. Sullivan, T. Tsai, and S. W. Keckler
    SARA'20: Workshop on Secure and Resilient Autonomy (SARA), 2020

  49. HarDNN: Feature Map Vulnerability Evaluation in CNNs
    A. Mahmoud, S. K. S. Hari, C. Fletcher, S. Adve, C. Sakr, N. Shanbhag, P. Molchanov, M. B. Sullivan, T. Tsai, and S. W. Keckler
    arXiv'20
    An updated version appeared in SRC TECHCON'20 with the title "HarDNN: Fine-Grained Vulnerability Evaluation and Protection for Convolutional Neural Networks"

  50. Towards analytically evaluating the error resilience of GPU Programs
    A. R. Anwer, G. Li, K. Pattabiraman, S. K. S. Hari, M. B. Sullivan, T. Tsai
    SELSE'19: IEEE Workshop on Silicon Errors in Logic - System Effects, 2019

  51. On the Trend of Resilience for GPU-Dense Systems
    K. Lee, M. B. Sullivan, S. K. S. Hari, T. Tsai, S. W. Keckler, M. Erez
    DSN-S'19: IEEE/IFIP International Conference on Dependable Systems and Networks – Supplemental Volume, 2019, also presented at the IEEE Workshop on Silicon Errors in Logic - System Effects, 2019 and received Best of SELSE Award

  52. Kayotee: A Fault Injection-based System to Assess the Safety and Reliability of Autonomous Vehicles to Faults and Errors
    S. Jha, T. Tsai, S. K. S. Hari, M. Sullivan, Z. Kalbarczyk, S. W. Keckler, and R. Iyer
    ART'18: IEEE International Workshop on Automotive Reliability & Test, 2018

  53. An Analytical Model for Hardened Latch Selection and Exploration
    M. Sullivan, B. Zimmer, S. K. S. Hari, T. Tsai, S. Keckler
    SELSE'16: IEEE Workshop on Silicon Errors in Logic - System Effects, 2016

  54. SASSIFI: Evaluating Resilience of GPU Applications
    S. K. S. Hari, T. Tsai, M. Stephenson, S. W. Keckler, and J. Emer
    SELSE'15: IEEE Workshop on Silicon Errors in Logic - System Effects, 2015

  55. Measuring the Radiation Reliability of SRAM Structures in GPUs Designed for HPC
    P. Rech, L. Carro, N. Wang, T. Tsai, S. K. S. Hari, and S. W. Keckler
    SELSE'14: IEEE Workshop on Silicon Errors in Logic - System Effects, 2014

  56. Relyzer: Application Resiliency Analyzer for Transient Faults
    S. K. S. Hari, H. Naeimi, P. Ramachandran, S. V. Adve
    SELSE'11: IEEE Workshop on Silicon Errors in Logic - System Effects, 2011

  57. Understanding When Symptom Detectors Work by Studying Data-Only Application Values
    P. Ramachandran, S. K. S. Hari, S. V. Adve, H. Naeimi
    SELSE'11: IEEE Workshop on Silicon Errors in Logic - System Effects, 2011

  58. CrashTest'ing SWAT: Accurate, Gate-Level Evaluation of Symptom-Based Resiliency Solutions
    A. Pellegrini, R. Smolinski, X. Fu, L. Chen, S. K. S. Hari, J. Jiang, S. V. Adve, T. Austin, V. Bertacco
    SELSE'11: IEEE Workshop on Silicon Errors in Logic - System Effects, 2011


Patents:

Theses:

Software:
