The Art of (Artificial) Reasoning
Scaling laws suggest that “more is more”: brute-force scaling of data and compute leads to stronger AI capabilities. However, despite rapid progress on benchmarks, state-of-the-art models still exhibit “jagged intelligence,” indicating that current scaling approaches may have limitations in terms of sustainability and robustness. Additionally, while the volume of papers on arXiv continues to grow rapidly, our scientific understanding of artificial intelligence hasn't kept pace with engineering advances, and the current literature presents seemingly contradictory findings that can be difficult to reconcile. In this talk, I will discuss key insights into the strengths and limitations of LLMs, examine when reinforcement learning succeeds or struggles in reasoning tasks, and explore methods for enhancing reasoning capabilities in smaller language models to help them close the gap against their larger counterparts in specific domains.
Queer in AI
Creative AI Session 3
Emergent Stories: AI Agents and New Narratives
Abstract: This panel explores how AI agents are reshaping contemporary storytelling across games, installations, film, and performance. From multi-agent LLM architectures that generate narratives through simulated social dynamics to environments where humans and AI construct meaning together under uncertainty, each work examines how stories emerge between actors. An interactive performance exposes the limits of computational empathy through a malfunctioning therapeutic agent, while Little Martians turns ceramic sculptures into autonomous creators releasing daily films. We will discuss how these and related projects consider storytelling as a distributed, agent-driven process that displaces the current norms of coherence, authorship and narrative control.
On the Science of “Alien Intelligences”: Evaluating Cognitive Capabilities in Babies, Animals, and AI
Today’s generative AI systems—termed by some researchers as “alien intelligences”—have exceeded human performance on many benchmarks meant to test humanlike cognitive capabilities. However, these systems still struggle in unhumanlike ways on real-world tasks requiring these capabilities. This disconnect may be due in part to neglect in the AI community of well-founded experimental protocols for evaluating cognition. In this talk I will summarize several recommendations on experimental methods from developmental and comparative psychology—fields that study the “alien intelligences” of babies and non-human animals—and demonstrate the application of such methods in two case studies of cognitive abilities in LLMs: analogical reasoning and visual abstraction.
Creative AI Session 4
This panel explores how generative and embodied AI are transforming design into a materially intelligent practice. Through projects spanning olfactory storytelling, edible architecture, autonomous instruments, and robotic co-fabrication, the discussion investigates how intelligence manifests through physical form, sensory interaction, and situated behavior. Bringing together researchers from architecture, robotics, music, and computational design, the panel examines how physical AI systems reconfigure the relationship between cognition and materiality, suggesting new paradigms for co-creation and human experience in an embodied, multisensory design ecology.
Panelists: Selina Khan, Alexander Htet Kyaw, Cyrus Clarke, Masatoshi Hamanaka, Ethan Chang
Value Chain from Research to ROI
We aim to understand: How can research be ready to accelerate into products/applied solutions? What kind of environments, processes & catalysts are needed to establish pathways from research to larger ecosystems consisting of products & businesses? We will have a panel discussion on these topics followed by a hands-on group activity where attendees will ideate sector-specific product applications from their own research. The event is structured to help researchers articulate their research’s broader potential, emphasizing ecosystem thinking and practical steps needed to scale research impact.
NeuroEd: Artificial Intelligence and the Future of Learning
NeuroEd: Artificial Intelligence and the Future of Learning explores how educational technology and AI are transforming teaching and learning. The social invites participants to discuss AI-driven tools that personalize education, accelerate learning, and enhance educational outcomes. Attendees will connect across disciplines (AI, EdTech, ethics, and beyond) to share ideas, talk about implications, and imagine a future where technology empowers educators and learners alike.
When Errors Dream: Exploring Collective Creativity through AI Hallucination
When Errors Dream reframes AI hallucination as creative material. In a two-hour open-space jam in festival style, attendees rotate through small groups to generate surprising AI outputs—text, image, or sound—and transform them into collaborative artworks and interactive experiences using digital and analog media. No prior skills are required; emphasis is on the joy of making with machines, not polish. Designed to be drop-in friendly, the format scales to 150–200 participants through science-fair-style stations and quick exquisite-corpse creation cycles, fostering inclusive networking through shared play rather than one-directional talks.
Evaluating Agentic Systems: Bridging Research Benchmarks and Real-World Impact
Agentic AI systems (LLM-driven agents capable of autonomous planning, tool use, and multi-step task execution) are rapidly advancing, yet methods for evaluating them remain underdeveloped. Traditional metrics for static or single-turn tasks fail to capture the complexity of open-ended, long-horizon interactions where goals evolve and behaviors emerge dynamically. This social aims to bridge research and industry perspectives on designing frameworks, simulation environments, and metrics that assess reliability, alignment, and safety in autonomous agents. Through lightning talks, panel discussions, and networking, the event fosters an interactive exchange on how to meaningfully evaluate and benchmark the next generation of agentic AI systems.