This post collects notes on multimodal papers I have read.
Learning to Prompt for Vision-Language Models
Zhou, Yang et al. (2022) propose a method that learns the class prompts fed to CLIP automatically, instead of relying on hand-written prompt templates. (Citation: Zhou, K., Yang, J., Loy, C. & Liu, Z. (2022). Learning to Prompt for Vision-Language Models. International Journal of Computer Vision, 130(9), 2337–2348. https://doi.org/10.1007/s11263-022-01653-1)
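To make the idea concrete, here is a minimal sketch of the forward pass as I understand it: a shared set of context vectors is prepended to each class-name embedding, the resulting prompt goes through the text encoder, and the image is classified by cosine similarity. Everything named here is an assumption for illustration only: `toy_text_encoder` is a hypothetical mean-pooling stand-in for CLIP's frozen text encoder, the 4-dimensional embeddings and the `0.07` temperature are made-up toy values, and no training step is shown. In the paper the context vectors are the trainable part while the CLIP encoders stay frozen.

```python
import math
import random


def toy_text_encoder(token_vectors):
    """Hypothetical stand-in for CLIP's frozen text encoder: mean-pool the tokens."""
    dim = len(token_vectors[0])
    count = len(token_vectors)
    return [sum(tok[d] for tok in token_vectors) / count for d in range(dim)]


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def class_probabilities(context, class_embeddings, image_feature, temperature=0.07):
    """Score an image against prompts of the form [V_1, ..., V_M, CLASS]."""
    logits = []
    for class_vec in class_embeddings:
        # The same context vectors are shared by every class;
        # only the trailing class token differs between prompts.
        prompt = context + [class_vec]
        text_feature = toy_text_encoder(prompt)
        logits.append(cosine(image_feature, text_feature) / temperature)
    # Numerically stable softmax over the class logits.
    max_logit = max(logits)
    exps = [math.exp(logit - max_logit) for logit in logits]
    total = sum(exps)
    return [e / total for e in exps]


rng = random.Random(0)
DIM, NUM_CONTEXT = 4, 2

# Context vectors start as small random values; in a real setup these
# would be updated by gradient descent on a classification loss.
context = [[rng.gauss(0.0, 0.02) for _ in range(DIM)] for _ in range(NUM_CONTEXT)]

# Toy class-name embeddings and a toy image feature (made-up values).
class_embeddings = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
]
image_feature = [0.9, 0.1, 0.0, 0.0]

probs = class_probabilities(context, class_embeddings, image_feature)
predicted_class = probs.index(max(probs))
print(predicted_class, probs)
```

The point of the sketch is the prompt layout: nothing in `context` is tied to a human-readable word, so the "prompt" becomes a set of continuous parameters that can be fitted to data rather than written by hand.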