Beyond Pattern Recognition: Assessing AI Through Abstract Visual Reasoning
Abstract: Abstract Visual Reasoning (AVR) encompasses a class of tasks that require discovering shared underlying concepts across sets of images through analogy-making, similar to the processes humans employ when solving IQ tests. In the first part of the talk, I will provide an overview of the main types of AVR problems and discuss potential solution approaches. In the second part, I will focus on Bongard Problems (BPs), which represent a fundamental challenge in AVR, primarily due to the need to integrate visual reasoning with verbal description. Specifically, I will investigate whether multimodal large language models (MLLMs), which are explicitly designed to combine vision and language, are capable of solving BPs. To address this question, I will present and analyze results from applying several MLLMs to BPs composed of both synthetic and real-world images. The findings reveal clear limitations in the abstract visual reasoning capabilities of contemporary models.
Speaker Bio: Jacek Mańdziuk is a Full Professor at the Faculty of Mathematics and Information Science, Warsaw University of Technology, and Head of the Division of Artificial Intelligence and Computational Methods. He is a Member of the Polish Academy of Sciences, a Fulbright Senior Research Scholar (visiting UC Berkeley and ICSI Berkeley, USA, 1996–1997), an IEEE Senior Member, and a recipient of the Robert Schuman Foundation Fellowship (visiting CNRS, Besançon, France, 1994). In 2015–2016, he was a Visiting Professor at Nanyang Technological University (Singapore). He served as General Chair of the 5th Polish Conference on Artificial Intelligence (2024), General Co-Chair of the IEEE Congress on Evolutionary Computation (2021), and Chair of the annual IEEE SSCI Symposium on Computational Intelligence for Human-like Intelligence (2013–2023). He publishes in leading journals and top-tier AI/ML conferences, including ICML, ICLR, AAAI, IJCAI, and AAMAS. His research interests include the application of AI to security and social good, bilevel optimization, abstract visual reasoning, games, human–machine cooperation, and human-like learning and problem solving.