RegionGPT: Towards Region Understanding Vision Language Model

CVPR 2024 (2024)

Citations: 37 | Views: 103
Keywords
Language Model, Vision Language, Vision-language Models, Object Classification, Understanding Of The Complexity, Spatial Awareness, Visual Encoding, Regional Level, Feature Maps, Visual Features, Bounding Box, Vision Tasks, Word Embedding, Two-stage Approach, Training Efficiency, Output Format, COCO Dataset, Details In Regions, Image Captioning, Ground-truth Box, Pre-training Stage, Visual Question Answering, Fine-tuning Stage, High-resolution Feature Maps, Object Classification Tasks, Entire Training Process, Low-resolution Map, Artificial Intelligence, Output Patterns, Contextual Information