VISTA

Visual Language Understanding

Last updated: April 4, 2025

Performance Comparison

| Rank | Model | Score |
|------|-------|-------|
| 1 | | 54.65±1.46 |
| 2 | | 51.79±0.63 |
| 2 | | 51.66±1.08 |
| 2 | | 50.78±0.57 |
| 2 | | 50.07±1.14 |
| 4 | | 49.59±0.66 |
| 5 | | 49.15±0.36 |
| 5 | | 47.32±1.78 |
| 6 | | 48.23±0.70 |
| 8 | | 46.97±1.29 |
| 8 | | 46.96±0.95 |
| 9 | | 45.50±1.20 |
| 9 | | 45.34±0.91 |
| 10 | | 45.49±0.21 |
| 11 | | 45.25±0.40 |
| 14 | | 43.53±1.24 |
| 14 | | 43.25±1.26 |
| 16 | | 43.21±0.52 |
| 16 | | 43.02±1.14 |
| 16 | | 42.11±1.39 |
| 20 | | 41.14±0.58 |
| 20 | | 39.95±0.80 |
| 21 | | 39.85±0.71 |
| 22 | Claude 3.5 Sonnet (October 2024) | 38.72±0.51 |
| 24 | Claude 3.5 Sonnet (June 2024) | 38.37±0.70 |
| 24 | | 38.33±0.55 |
| 24 | ChatGPT-4o-latest (November 2024) | 37.99±0.48 |
| 24 | Gemini 1.5 Pro | 37.07±1.34 |
| 29 | GPT-4o (August 2024) | 34.94±0.23 |
| 29 | | 34.59±1.12 |
| 29 | Gemini 1.5 Flash 002 | 34.03±1.41 |
| 30 | Pixtral Large (November 2024) | 33.89±0.69 |
| 30 | | 32.69±1.40 |
| 34 | Qwen2-VL-72B-Instruct | 28.56±1.37 |
| 34 | Claude 3 Opus | 27.82±0.55 |
| 36 | | 26.55±0.35 |
| 36 | Nova Pro | 26.27±0.61 |
| 36 | Pixtral 12B (September 2024) | 25.97±0.74 |
| 36 | Nova Lite | 25.50±0.77 |
| 38 | Llama 3.2 90B Vision Instruct | 24.61±0.80 |
| 41 | Llama 3.2 11B Vision-Instruct | 20.47±0.15 |
| 42 | Phi 3.5 Vision-Instruct | 15.18±0.81 |