Fall 2025 Submitted January 2026

Importance Lenses for Reasoning Models Chain of Thoughts

Tommaso Derossi

Mentored by Rauno Arike

Working report from the SPAR program. May not reflect the authors' current views.

Abstract

Chain-of-thought (CoT) monitoring has been shown to offer an opportunity for detecting misbehavior in Large Language Models (LLMs). However, how the CoT process is structured and how it unfolds is still poorly understood. A crucial step toward better understanding is to decompose these often long CoTs into connected subparts and to attribute importance to each part. Although several ways of performing such decompositions have been proposed, it is unclear how the chosen decomposition affects the scores attributed to the different parts of the CoT. In this context, we explore the sensitivity of attribution scores to the granularity at which they are computed; in particular, we compare attribution scores obtained at the sentence level with those obtained at the token level.