
A Case for Low Bitwidth Floating Point Arithmetic on FPGA for Transformer Based DNN Inference

Jiajun Wu, Mo Song, Jingmin Zhao, Hayden Kwok-Hay So

2024 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), 2024

Keywords
Deep Neural Network, Deep Neural Network Inference, Low Bit-width, Transformer Model, Linear Layer, Floating-point Operations, 32-bit Floating-point, Exponent, Nonlinear Function, Processing Unit, Per Cycle, Lookup Table, Partial Products, Deep Neural Network Model, Product Term, Arithmetic Operations, Partial Sums, Least Significant Bit, Hardware Architecture, Minimal Overhead, Sign Bit, Hardware Overhead
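The keywords point at the core operation behind low bit-width floating-point inference: a multiply decomposes into a sign-bit XOR, an exponent addition, and a small significand (partial-product) multiplication. As a rough illustration only, here is a minimal C sketch of that decomposition for a toy 8-bit format (1 sign bit, 4 exponent bits, 3 mantissa bits, bias 7). The format, the truncating rounding, and the clamped overflow handling are assumptions chosen for brevity; none of this is taken from the paper itself.

```c
/* Illustrative sketch only: multiplying two toy 8-bit floats
 * laid out as [s:1][e:4][m:3] with exponent bias 7. This format
 * and its corner-case handling are assumptions, not the paper's design. */
#include <stdint.h>
#include <stdio.h>

typedef uint8_t fp8;

static fp8 fp8_mul(fp8 a, fp8 b) {
    uint8_t sign = ((a >> 7) ^ (b >> 7)) & 1;      /* sign bit: XOR */
    int ea = (a >> 3) & 0xF, eb = (b >> 3) & 0xF;  /* biased exponents */
    int ma = (a & 7) | 8,    mb = (b & 7) | 8;     /* restore implicit leading 1 */

    int e    = ea + eb - 7;   /* exponent add, remove one bias */
    int prod = ma * mb;       /* 4x4-bit significand product, range [64, 225] */

    /* Normalize: the leading one of prod sits at bit 7 or bit 6. */
    if (prod & 0x80) { prod >>= 4; e += 1; }  /* value in [2,4): shift, bump exp */
    else             { prod >>= 3; }          /* value in [1,2) */

    if (e <= 0)  return (fp8)(sign << 7);     /* flush underflow to (signed) zero */
    if (e >= 15) e = 15, prod = 0xF;          /* clamp overflow; no inf/NaN here */

    return (fp8)((sign << 7) | ((e & 0xF) << 3) | (prod & 7));
}

static double fp8_to_double(fp8 x) {
    int s = (x >> 7) & 1, e = (x >> 3) & 0xF, m = x & 7;
    if (e == 0) return 0.0;                   /* no subnormals in this sketch */
    double v = (1.0 + m / 8.0) * (double)(1 << e) / 128.0;  /* 2^(e - 7) */
    return s ? -v : v;
}

int main(void) {
    fp8 a = 0x40;  /* exponent 8, mantissa 0 -> 2.0 */
    fp8 b = 0x3C;  /* exponent 7, mantissa 4 -> 1.5 */
    printf("2.0 * 1.5 ~= %g\n", fp8_to_double(fp8_mul(a, b)));  /* prints 3 */
    return 0;
}
```

Note the hardware-motivated structure: the significand multiply is only 4x4 bits, so on an FPGA it can be realized with a handful of lookup tables rather than a DSP slice, while the exponent path is a narrow adder.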