CFBenchmark-MM: Chinese Financial Assistant Benchmark for Multimodal Large Language Model

Apr 7, 2026

Jiangtong Li

Yiyun Zhu

Dawei Cheng

Zhijun Ding

Changjun Jiang

Abstract

Multimodal Large Language Models (MLLMs) have rapidly evolved with the growth of Large Language Models (LLMs) and are now applied in various fields. In finance, the integration of diverse modalities such as text, charts, and tables is crucial for accurate and efficient decision-making. In this paper, we introduce CFBenchmark-MM, a Chinese multimodal financial benchmark with over 9,000 image-question pairs featuring tables, histogram charts, line charts, pie charts, and structural diagrams. Additionally, we develop a staged evaluation system to assess MLLMs in handling multimodal information by providing different visual content step by step.

Type

Journal article

Publication

Big Data Mining and Analytics (BDMA 2026)

Last updated on May 28, 2026

Large Language Model, Financial Benchmark, Multimodal
