Graduate Seminar

Speaker:

Gil Shomron

Affiliation:

The Andrew and Erna Viterbi Faculty of Electrical & Computer Engineering

Embracing Sparsity and Resiliency for Deep Learning Acceleration

Deep neural networks (DNNs) have gained tremendous momentum in recent years, both in academia and industry. DNNs deliver state-of-the-art results in numerous applications, such as image classification, object segmentation, and speech recognition. Yet DNNs are compute-intensive and may require billions of multiply-and-accumulate (MAC) operations for a single input query. Limited resources, such as those in IoT devices, latency constraints, and high input throughput all drive research and development of efficient computing methods for DNN execution.

In our research, we rethink two well-known CPU methods, simultaneous multithreading (SMT) and value prediction, and map them to the new environment introduced by DNNs by leveraging two of their unique characteristics. First, DNNs are resilient: they can tolerate noise in parameters (e.g., quantization) or during MAC operations with only a "graceful degradation" in accuracy. Second, DNNs usually comprise many zero-valued activations and weights. Leveraging activation sparsity is particularly challenging, since activation values are input dependent, i.e., zero-valued activations are both dynamic and unstructured.

With SMT, we propose the new concept of non-blocking SMT (NB-SMT), in which execution units are shared among several computational flows to avoid idle MAC operations due to zero-valued operands. In the event of a structural hazard on a shared execution unit, we propose to temporarily and locally "squeeze in" the operations by reducing precision. We present and discuss the path from a data-driven "blocking" SMT design, to the concept of NB-SMT, to a fine-tuned sparsity-aware quantization method.

As for value prediction, we present prediction schemes that leverage the inherent spatial correlation in CNN feature maps to predict zero-valued activations. By speculating which activations will be zero-valued, we can potentially reduce the number of required MAC operations.

* PhD seminar under the supervision of Prof. Uri Weiser.
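To make the value-prediction idea concrete, here is a minimal sketch, not the speaker's actual scheme: it speculates that a post-ReLU activation is zero whenever all of its 4-neighbors are zero, and counts how many MAC-producing positions such a predictor would skip. The 4-neighbor rule and the toy feature map are assumptions for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy post-ReLU feature map: ReLU zeroes out roughly half the
# activations, and the zeros are dynamic and unstructured.
fmap = np.maximum(rng.standard_normal((16, 16)), 0.0)

def predict_zero(fmap, y, x):
    """Speculate that activation (y, x) is zero when all of its
    in-bounds 4-neighbors are zero.

    Illustrative only: a stand-in for spatial-correlation-based
    prediction, not the scheme presented in the talk.
    """
    h, w = fmap.shape
    neighbors = [(y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)]
    return all(
        fmap[ny, nx] == 0.0
        for ny, nx in neighbors
        if 0 <= ny < h and 0 <= nx < w
    )

# Count positions where such a predictor would skip the MAC work.
skipped = sum(
    predict_zero(fmap, y, x)
    for y in range(fmap.shape[0])
    for x in range(fmap.shape[1])
)
```

A mispredicted skip (the activation was actually nonzero) is what the talk's notion of DNN resiliency absorbs: the network tolerates the resulting noise with only a graceful degradation in accuracy.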
Zoom link: https://technion.zoom.us/j/96766030020

Date: Sun 01 Aug 2021

Start Time: 14:30

End Time: 15:30
