Task Driven Sensor Layouts - Joint Optimization of Pixel Layout and Network Parameters

2024 IEEE International Conference on Computational Photography (ICCP 2024)

Univ Siegen | Univ Mannheim

Abstract
Computational imaging concepts based on integrated edge AI and neural sensors solve vision problems in an end-to-end, task-specific manner by jointly optimizing algorithmic and hardware parameters to sense data with high information value. They yield energy-, data-, and privacy-efficient solutions but rely on novel hardware concepts that have yet to be scaled up. In this work, we present the first truly end-to-end trained imaging pipeline that optimizes imaging sensor parameters, available in standard CMOS design methods, jointly with the parameters of a given neural network on a specific task. Specifically, we derive an analytic, differentiable approach to sensor layout parameterization that allows for task-specific, locally varying pixel resolutions. We present two pixel layout parameterization functions: rectangular and curvilinear grid shapes that retain a regular topology. We provide a drop-in module that approximates sensor simulation from existing high-resolution images, directly connecting our method with existing deep learning models. We show for two downstream tasks, classification and semantic segmentation, that network predictions benefit from learnable pixel layouts.
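The rectangular-grid idea from the abstract, a layout with regular topology but locally varying pixel sizes, can be illustrated with a minimal sketch. This is an assumption-laden illustration, not the paper's method: the function names `rectangular_layout` and `simulate_sensor` are hypothetical, the implementation below uses plain NumPy and hard cell boundaries, whereas the actual approach is analytic and differentiable so that layout parameters can be trained by backpropagation.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax; turns unconstrained logits into
    # positive weights that sum to 1.
    e = np.exp(x - x.max())
    return e / e.sum()

def rectangular_layout(logits_x, logits_y, H, W):
    """Hypothetical parameterization of a rectangular pixel layout.

    A softmax maps unconstrained per-column/per-row parameters to positive
    cell widths that sum to the sensor extent, so the grid keeps a regular
    rectangular topology while resolution varies locally.
    Returns cumulative cell boundaries along x (in [0, W]) and y (in [0, H]).
    """
    bx = np.concatenate([[0.0], np.cumsum(softmax(logits_x) * W)])
    by = np.concatenate([[0.0], np.cumsum(softmax(logits_y) * H)])
    return bx, by

def simulate_sensor(image, bx, by):
    """Crude sensor simulation: average the high-resolution image over
    each (variable-size) cell. The paper's drop-in module plays this role
    differentiably; here boundaries are simply rounded to integer indices."""
    out = np.empty((len(by) - 1, len(bx) - 1))
    for i in range(len(by) - 1):
        for j in range(len(bx) - 1):
            r0 = int(round(by[i]))
            r1 = max(int(round(by[i + 1])), r0 + 1)
            c0 = int(round(bx[j]))
            c1 = max(int(round(bx[j + 1])), c0 + 1)
            out[i, j] = image[r0:r1, c0:c1].mean()
    return out
```

With all-zero logits the softmax is uniform and the sketch reduces to ordinary average pooling; training the logits (in a differentiable framework) would let a downstream task pull resolution toward informative image regions.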
Key words
Sensors, Sensor Optimization, Computer Vision, Semantic Segmentation
Key points: This work presents the first truly end-to-end trained imaging pipeline that jointly optimizes imaging sensor parameters available in standard CMOS design methods together with the parameters of a task-specific neural network, and introduces an analytic, differentiable sensor layout parameterization that enables task-specific, locally varying pixel resolutions.

Methods: The study uses a differentiable approach to parameterize the sensor layout, allowing task-specific, locally varying pixel resolutions, and proposes two pixel layout parameterization functions: rectangular and curvilinear grid shapes that retain a regular topology.

Experiments: The study implements a drop-in module that approximates sensor simulation from given high-resolution images and connects directly to existing deep learning models. On two downstream tasks, classification and semantic segmentation, network predictions are shown to benefit from learnable pixel layouts; the specific datasets used are not named here.