Mark Horowitz
Stanford University
Parallel computing, Electronic engineering, Computer science, CMOS, Real-time computing
583 Publications
91 H-index
36.7k Citations
Publications 404
While hardware generators have drastically improved design productivity, they have introduced new challenges for the task of verification. To effectively cover the functionality of a sophisticated generator, verification engineers require tools that provide the flexibility of metaprogramming. However, flexibility alone is not enough; components must also be portable in order to encourage the proliferation of verification libraries as well as enable new methodologies. This paper introduces fault,...
Mar 9, 2020 in ASPLOS (Architectural Support for Programming Languages and Operating Systems)
#1 Xuan Yang (Stanford University), H-Index: 5
#2 Mingyu Gao (THU: Tsinghua University), H-Index: 7
Last. Priyanka Raina (Stanford University), H-Index: 3
view all 12 authors...
We show that DNN accelerator micro-architectures and their program mappings represent specific choices of loop order and hardware parallelism for computing the seven nested loops of DNNs, which enables us to create a formal taxonomy of all existing dense DNN accelerators. Surprisingly, the loop transformations needed to create these hardware variants can be precisely and concisely represented by Halide's scheduling language. By modifying the Halide compiler to generate hardware, we create a syst...
6 Citations
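As a rough illustration of the "seven nested loops" mentioned in the abstract above, here is a minimal sketch of a dense convolution layer written as a plain loop nest. The loop names (`n`, `k`, `c`, `y`, `x`, `r`, `s`), the function name `conv_seven_loops`, and the stride-1, no-padding setup are assumptions made for this example; the abstract does not spell them out, and this is not the paper's generated hardware or its Halide schedule.

```python
def conv_seven_loops(inp, weights):
    """Dense convolution as seven nested loops (illustrative sketch).

    inp[n][c][y][x]     : batch, input channel, row, column
    weights[k][c][r][s] : output channel, input channel, filter row, filter column
    Assumes stride 1 and no padding.
    """
    N, C = len(inp), len(inp[0])
    H, W = len(inp[0][0]), len(inp[0][0][0])
    K = len(weights)
    R, S = len(weights[0][0]), len(weights[0][0][0])
    OY, OX = H - R + 1, W - S + 1  # output height and width

    out = [[[[0] * OX for _ in range(OY)] for _ in range(K)] for _ in range(N)]

    for n in range(N):                          # 1: batch
        for k in range(K):                      # 2: output channels
            for c in range(C):                  # 3: input channels
                for y in range(OY):             # 4: output rows
                    for x in range(OX):         # 5: output columns
                        for r in range(R):      # 6: filter rows
                            for s in range(S):  # 7: filter columns
                                out[n][k][y][x] += (
                                    inp[n][c][y + r][x + s] * weights[k][c][r][s]
                                )
    return out


# One 3x3 single-channel image and one 2x2 all-ones filter.
image = [[[[1, 2, 3], [4, 5, 6], [7, 8, 9]]]]
kernel = [[[[1, 1], [1, 1]]]]
result = conv_seven_loops(image, kernel)
```

Every iteration performs an independent multiply-accumulate, so with integer inputs the loops can be reordered without changing the result. Choosing which loops run in which order, and which are spread across parallel hardware, is the kind of design choice the abstract describes.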
#1 Ankita Nayak (Stanford University), H-Index: 1
Last. Priyanka Raina (Stanford University), H-Index: 3
view all 10 authors...
#1 Kevin Kiningham (Stanford University), H-Index: 2
#2 Philip Levis (Stanford University), H-Index: 45
Last. Maurice Shih (Stanford University)
view all 6 authors...
Internet of Things (IoT) devices, once deployed, must remain secure for their entire lifetime, which can be as long as 20 years. Over this lifetime, devices must be able to update which ciphers they use to meet evolving security requirements. However, devices cannot rely on software updates for their cryptography because software implementations consume too much energy. At the same time, fixed function hardware accelerators such as an AES engine cannot support new ciphers. This paper presents Fa...
Sep 23, 2019 in ICIP (International Conference on Image Processing)
#2 Edward H. Lee (Stanford University), H-Index: 5
Last. Mark Horowitz (Stanford University), H-Index: 91
view all 4 authors...
Real-time CNN-based object detection models for applications like surveillance can achieve high accuracy but are computationally expensive. Recent works have shown 10 to 100× reduction in computation cost for inference by using domain-specific networks. However, prior works have focused on inference only. If the domain model requires frequent retraining, training costs can pose a significant bottleneck. To address this, we propose Dataset Culling: a pipeline to reduce the size of the dataset for...
Developing and middle-income countries increasingly emphasize higher education and entrepreneurship in their long-term development strategy. Our work focuses on the influence of higher education institutions (HEIs) on startup ecosystems in Brazil, an emerging economy. First, we describe regional variability in entrepreneurial network characteristics. Then we examine the influence of elite HEIs in economic hubs on entrepreneur networks. Second, we investigate the influence of the academic trajec...
Apr 4, 2019 in ASPLOS (Architectural Support for Programming Languages and Operating Systems)
#1 Mingyu Gao (Stanford University), H-Index: 7
#2 Xuan Yang (Stanford University), H-Index: 5
Last. Christos Kozyrakis (Stanford University), H-Index: 60
view all 5 authors...
The use of increasingly larger and more complex neural networks (NNs) makes it critical to scale the capabilities and efficiency of NN accelerators. Tiled architectures provide an intuitive scaling solution that supports both coarse-grained parallelism in NNs: intra-layer parallelism, where all tiles process a single layer, and inter-layer pipelining, where multiple layers execute across tiles in a pipelined manner. This work proposes dataflow optimizations to address the shortcomings of existin...
11 Citations
#2 Edward H. Lee (Stanford University), H-Index: 5
Last. Mark Horowitz (Stanford University), H-Index: 91
view all 4 authors...
Real-time CNN-based object detection models for applications like surveillance can achieve high accuracy but are computationally expensive. Recent works have shown 10 to 100x reduction in computation cost for inference by using domain-specific networks. However, prior works have focused on inference only. If the domain model requires frequent retraining, training costs can pose a significant bottleneck. To address this, we propose Dataset Culling: a pipeline to reduce the size of the dataset for...
#1 Byong Chan Lim (Stanford University), H-Index: 6
#2 Mark Horowitz (Stanford University), H-Index: 91
Real number models, which are computationally efficient analog functional models, are now indispensable in verifying complex mixed-signal systems on chip (SoCs); yet, creating and validating these models remain difficult. To remove this problem, we created a framework for building analog functional model templates. Each template covers a class of circuits (e.g., oscillators or amplifiers) and can generate functional models for any implementation of this circuit class, regardless of pin configura...