ACM Transactions on Reconfigurable Technology and Systems
Papers: 284
In this article, we present a novel type of medium-grained reconfigurable architecture that we term the Field Programmable Operation Array (FPOA). This device has been designed specifically for the implementation of HLS-generated circuitry. At the core of the FPOA is the OP-block. Unlike a standard LUT, an OP-block performs multi-bit operations through gate-based logic structures, translating into greater speed and efficiency in digital circuit implementation. Our device is not optimized for a s...
#1 Yaman Umuroglu (Xilinx), H-Index: 7
#2 Davide Conficconi (Polytechnic University of Milan)
Last: Magnus Själander (NTNU: Norwegian University of Science and Technology), H-Index: 2
(5 authors in total)
Matrix-matrix multiplication is a key computational kernel for numerous applications in science and engineering, with ample parallelism and data locality that lends itself well to high-performance implementations. Many matrix multiplication-dependent applications can use reduced-precision integer or fixed-point representations to increase their performance and energy efficiency while still offering adequate quality of results. However, precision requirements may vary between different applicatio...
#1 Ibrahim Ahmed (U of T: University of Toronto), H-Index: 10
#2 Shuze Zhao (U of T: University of Toronto), H-Index: 5
Last: Vaughn Betz (U of T: University of Toronto), H-Index: 29
(5 authors in total)
In earlier technology nodes, FPGAs had low power consumption compared to other compute chips such as CPUs and GPUs. However, in the 14nm technology node, FPGAs are consuming unprecedented power in the 100+W range, making power consumption a pressing concern. To reduce FPGA power consumption, several researchers have proposed deploying dynamic voltage scaling. While the previously proposed solutions show promising results, they have difficulty guaranteeing safe operation at reduced voltages for a...
#1 Muhsen Owaida (ETH Zurich), H-Index: 8
#2 Amit Kulkarni (ETH Zurich), H-Index: 6
Last: Gustavo Alonso (ETH Zurich), H-Index: 59
Given the growth in data inputs and application complexity, it is often the case that a single hardware accelerator is not enough to solve a given problem. In particular, the computational demands and I/O of many tasks in machine learning often require a cluster of accelerators to make a relevant difference in performance. In this article, we explore the efficient construction of FPGA clusters using inference over Decision Tree Ensembles as the target application. The article explores several le...
#1 Ilias Giechaskiel (University of Oxford), H-Index: 4
#2 Ken Eguro (Microsoft), H-Index: 11
Last: Kasper Bonne Rasmussen (University of Oxford), H-Index: 17
1 citation
#1 Jiliang Zhang (Hunan University), H-Index: 1
#2 Gang Qu (UMD: University of Maryland, College Park), H-Index: 1
2 citations
Top fields of study
Embedded system
Parallel computing
Computer science
Field-programmable gate array
Real-time computing