https://openreview.net/forum?id=Q5pVZCrrKr&referrer=%5Bthe%20profile%20of%20Kai%20Yan%5D(%2Fprofile%3Fid%3D~Kai_Yan1)
Inductive program synthesis, or programming by example, requires synthesizing functions from input-output examples that generalize to unseen inputs. While...
llm agents, benchmarking, reasoning, capabilities, inductive
https://openreview.net/forum?id=GQNojroNCH&referrer=%5Bthe%20profile%20of%20Kaivalya%20Hariharan%5D(%2Fprofile%3Fid%3D~Kaivalya_Hariharan1)
Benchmarks for large language models (LLMs) have predominantly assessed short-horizon, localized reasoning. Existing long-horizon suites (e.g. SWE-lancer) rely...
stress testing, llm agents, breakpoint, systems, level
https://labelbox.com/blog/announcing-r-constraintbench-a-novel-way-to-stress-test-llm-reasoning-abilities-under-interacting-constraints/
stress test, announcing, novel, way, llm
https://openreview.net/group?id=ICLR.cc/2026/Workshop/LLM_Reasoning&referrer=%5BHomepage%5D(%2F)
Welcome to the OpenReview homepage for ICLR 2026 Workshop LLM Reasoning
llm reasoning, iclr, workshop, openreview
https://arxiv.org/abs/2601.10332
Abstract page for arXiv paper 2601.10332: Think-Then-Generate: Reasoning-Aware Text-to-Image Diffusion with LLM Encoders
think, generate, reasoning, aware, text
https://nousresearch.com/introducing-the-forge-reasoning-api-beta-and-nous-chat-an-evolution-in-llm-inference/
May 29, 2025 - The Forge Reasoning API contains some of our latest advancements in inference-time AI research, building on our journey from the original Hermes model.
nous chat, introducing, forge, reasoning, api
https://aclanthology.org/2025.acl-demo.2/
Xin Quan, Marco Valentino, Danilo Carvalho, Dhairya Dalal, Andre Freitas. Proceedings of the 63rd Annual Meeting of the Association for Computational...
formal reasoning, peirce, material, via, llm
https://openreview.net/forum?id=t2227VT4RJ&referrer=%5Bthe%20profile%20of%20Nan%20Yin%5D(%2Fprofile%3Fid%3D~Nan_Yin4)
Knowledge Graph Question Answering (KGQA) aims to interpret natural language queries and perform structured reasoning over knowledge graphs by leveraging their...
adaptive reasoning, via, llm, guided, mcts
https://openreview.net/forum?id=ITuuEaXcSB&referrer=%5Bthe%20profile%20of%20Zidi%20Xiong%5D(%2Fprofile%3Fid%3D~Zidi_Xiong2)
The rapid advancement of large language model (LLM) agents has raised new concerns regarding their safety and security, which cannot be addressed by...
llm agents, safeguard, via, knowledge, enabled
https://arxiv.org/abs/2506.12509v3
Abstract page for arXiv paper 2506.12509v3: Graph of Verification: Structured Verification of LLM Reasoning with Directed Acyclic Graphs
llm reasoning, graph, verification, structured
https://www.graphcore.ai/posts/june-papers-gradient-norms-llm-reasoning-and-video-generation
This month: why gradients spike late in training, how prolonged RL boosts LLM reasoning, and using diffusion models for real-time video generation.
llm reasoning, video generation, june, papers, gradient
https://arxiv.org/abs/2402.10963
Abstract page for arXiv paper 2402.10963: GLoRe: When, Where, and How to Improve LLM Reasoning via Global and Local Refinements
glore, improve, llm
https://openreview.net/forum?id=IZHWvvmYwx&referrer=%5Bthe%20profile%20of%20Kai%20Yan%5D(%2Fprofile%3Fid%3D~Kai_Yan1)
The ability to recognize patterns from examples and apply them to new ones is a primal ability for general intelligence, and is widely studied by psychology...
mirbench, llm, recognize, complicated
https://arxiv.org/abs/2601.14209
Abstract page for arXiv paper 2601.14209: InT: Self-Proposed Interventions Enable Credit Assignment in LLM Reasoning
int, self, proposed, enable, credit
https://www.preprints.org/manuscript/202507.1289
Large Language Models (LLMs) have demonstrated significant capabilities in answering questions using techniques such as Chain of Thought (CoT) and...
llm reasoning, irat, controlled, retrieval, robust