CogAgent: A Visual Language Model for GUI Agents

CVPR 2024 (2024)

Citations: 341 | Views: 1912
Keywords
Graphical User Interface, Language Model, Visual Model, High-resolution Images, Low-resolution Images, Image Encoder, Visual Question Answering, Image Features, Natural Language, Multi-agent, Sequence Of Actions, Visual Features, Original Structure, Bounding Box, Web Page, Model Architecture, Natural Images, Description Task, Residual Connection, Optical Character Recognition, Pre-training Data, Text Sequence, Hidden Size, Floating-point Operations, Decoder Layer, Domain Generalization, Side Of The Image, Data Augmentation