Exploiting Memorization: Understanding the CLM Objective for Knowledge Encoding in LLMs

Written by reckoning | Published 2025/10/24
Tech Story Tags: llms | transformer-models | deep-learning | memorization | causal-language-modeling | parametric-knowledge | contextual-knowledge | loss-function

TLDR: This article provides the necessary background and notation for reasoning research, defining problems as tuples.

Abstract and 1. Introduction

  2. Background

  3. Method

  4. Experiments

    4.1 Multi-hop Reasoning Performance

    4.2 Reasoning with Distractors

    4.3 Generalization to Real-World Knowledge

    4.4 Run-time Analysis

    4.5 Memorizing Knowledge

  5. Related Work

  6. Conclusion, Acknowledgements, and References

A. Dataset

B. In-context Reasoning with Distractors

C. Implementation Details

D. Adaptive Learning Rate

E. Experiments with Large Language Models

2 Background

Notation. We use $f : X \times \theta \to Y$ to refer to parameterised functions in which $X$ is the set of possible inputs and $\theta$ are their possible weights (parameters). We use $f_\theta : x \mapsto f(x, \theta)$ to refer to any $f$ with a given set of parameters $\theta$. We describe reasoning problems using tuples $(K, x, y^*, Y)$ such that $y^* \in Y$ is the correct answer for the question $x$ given facts $K$, and use $D$ to refer to sets of such problems. When it is clear from context, we drop $Y$ and use only $(K, x, y^*)$.
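As an illustration of this notation, the sketch below represents one problem tuple as a small Python data structure and shows how fixing the parameters of f gives fθ. The class name `ReasoningProblem`, its field names, the helper `bind`, the toy linear function, and the example facts are all assumptions made for illustration; none of them come from the paper.

```python
from dataclasses import dataclass
from functools import partial
from typing import Sequence, Tuple


@dataclass(frozen=True)
class ReasoningProblem:
    """One problem (K, x, y*, Y). Field names are illustrative."""
    facts: Tuple[str, ...]        # K: the facts the question depends on
    question: str                 # x: the question
    answer: str                   # y*: the correct answer
    candidates: Tuple[str, ...]   # Y: the set of possible answers

    def __post_init__(self):
        # The notation requires y* to be an element of Y.
        if self.answer not in self.candidates:
            raise ValueError("answer must be one of the candidates")


def f(x: float, theta: Sequence[float]) -> float:
    """Toy parameterised function f(x, theta): a line with slope and bias."""
    slope, bias = theta
    return slope * x + bias


def bind(fn, theta):
    """Return f_theta : x -> fn(x, theta) for a fixed theta."""
    return partial(fn, theta=theta)


problem = ReasoningProblem(
    facts=("Alice is Bob's mother.", "Bob is Carol's father."),
    question="Who is Carol's grandmother?",
    answer="Alice",
    candidates=("Alice", "Bob", "Carol"),
)
dataset = [problem]  # D: a collection of such problems

f_theta = bind(f, theta=(2.0, 1.0))
print(f_theta(3.0))  # evaluates 2.0 * 3.0 + 1.0
```

In a real setting f would be a language model and θ its weights; the linear function here only makes the partial-application idea concrete.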

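Language Modeling and Memorization. In the causal language modeling (CLM) objective, a parameterized model $f_\theta$ is trained to estimate the conditional probability of each token in a sequence $x = (x_1, \ldots, x_T)$ given its predecessors, $p(x_t \mid x_1, \ldots, x_{t-1})$. Specifically, we train $f_\theta$ to approximate $p$ using the CLM loss:

$$\mathcal{L}_{\mathrm{CLM}}(f_\theta, x) = -\sum_{t=1}^{T} \log f_\theta(x_t \mid x_1, \ldots, x_{t-1})$$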

This training objective allows language models to memorize individual training examples [10, 11], and we will exploit this ability to memorize and draw on contextual knowledge in our work.
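To make the link between the CLM loss and memorization concrete, here is a minimal sketch that sums the negative log-probability of each token given its prefix. It assumes a smoothed bigram counter as a stand-in for fθ, which conditions only on the previous token rather than the full prefix a Transformer would use. The names `fit_bigram`, `clm_loss`, `BOS`, the smoothing value, and the example sentences are illustrative and not taken from the paper.

```python
import math
from collections import Counter, defaultdict

BOS = "<s>"  # hypothetical start-of-sequence marker


def fit_bigram(sequences, vocab, smoothing=0.01):
    """Fit a smoothed bigram model; a toy stand-in for f_theta."""
    counts = defaultdict(Counter)
    for seq in sequences:
        prev = BOS
        for tok in seq:
            counts[prev][tok] += 1
            prev = tok

    def model(prefix):
        # Only the last token of the prefix is used (bigram assumption).
        prev = prefix[-1] if prefix else BOS
        total = sum(counts[prev].values()) + smoothing * len(vocab)
        return {tok: (counts[prev][tok] + smoothing) / total for tok in vocab}

    return model


def clm_loss(model, tokens):
    """Sum over t of -log model(x_t | x_1, ..., x_{t-1})."""
    loss = 0.0
    for t, tok in enumerate(tokens):
        probs = model(tokens[:t])  # distribution over the next token
        loss -= math.log(probs[tok])
    return loss


seen = ["alice", "is", "bob's", "mother"]    # sequence used for fitting
unseen = ["bob's", "is", "mother", "alice"]  # same tokens, different order
vocab = sorted(set(seen))

model = fit_bigram([seen], vocab)
print("loss on the fitted sequence:", clm_loss(model, seen))
print("loss on the reordered sequence:", clm_loss(model, unseen))
```

The fitted sequence gets a much lower loss than the reordered one, which is the sense in which minimizing this objective lets a model memorize a training example.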

Authors:

(1) Zeming Chen, EPFL ([email protected]);

(2) Gail Weiss, EPFL ([email protected]);

(3) Eric Mitchell, Stanford University ([email protected]);

(4) Asli Celikyilmaz, Meta AI Research ([email protected]);

(5) Antoine Bosselut, EPFL ([email protected]).


This paper is available on arXiv under the CC BY 4.0 DEED license.


Published by HackerNoon on 2025/10/24