A Case for Low Bitwidth Floating Point Arithmetic on FPGA for Transformer Based DNN Inference
2024 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), 2024
Keywords
Deep Neural Network, Deep Neural Network Inference, Low Bit-width, Transformer Model, Linear Layer, Floating-point Operations, 32-bit Floating-point, Exponent, Nonlinear Function, Processing Unit, Per Cycle, Lookup Table, Partial Products, Deep Neural Network Model, Product Term, Arithmetic Operations, Partial Sums, Least Significant Bit, Hardware Architecture, Minimal Overhead, Sign Bit, Hardware Overhead