Submitted by Yilun Zhao
Can Multimodal Foundation Models Understand Schematic Diagrams? An Empirical Study on Information-Seeking QA over Scientific Papers
Yale NLP Lab