STAR + FAR
Research · 2025
Continual learning for LLMs that stays fresh without forgetting
Role · Creator (design & evaluation); first author on the paper
Budget-conscious continual-learning pipelines for LLMs built on two mechanisms, Sparse Temporal Adapter Routing (STAR) and Freshness-Aware Replay (FAR), that acquire fresh knowledge from day-wise data streams while preserving legacy knowledge, at minutes-per-day update cost. Now written up as a paper under review at ACM TIST.
The stale-vs-forgetting tradeoff
Keeping a model both fresh and stable pulls in two directions: updates that add new knowledge tend to overwrite old knowledge, and the methods that protect old knowledge tend to dull or delay the new. STAR + FAR attacks both sides at once. Sparse temporal adapters keep new information from clobbering old capabilities, and freshness-aware replay decides what to rehearse and how often. Throughout, cost (wall-clock train time, RAM/VRAM, buffer size) is treated as a first-class constraint rather than an afterthought.
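To make "what to rehearse and how often" concrete, here is a minimal sketch of one way a replay sampler could trade recency against retention. It is not the FAR algorithm from the paper, which this page does not specify. The exponential half-life weighting, the `legacy_floor` parameter, and the names `ReplayItem`, `replay_weights`, and `sample_replay` are all assumptions made for illustration.

```python
import random
from dataclasses import dataclass


@dataclass
class ReplayItem:
    text: str
    day: int  # day index on which the example entered the stream


def replay_weights(items, today, half_life_days=7.0, legacy_floor=0.2):
    """Hypothetical weighting: recency decays by half every `half_life_days`,
    but never drops below `legacy_floor`, so old items are still rehearsed."""
    weights = []
    for item in items:
        age = max(0, today - item.day)
        recency = 0.5 ** (age / half_life_days)
        weights.append(max(recency, legacy_floor))
    return weights


def sample_replay(items, today, k, half_life_days=7.0, legacy_floor=0.2, seed=0):
    """Draw k items (with replacement) in proportion to their weights.
    k is capped by the caller, which keeps the rehearsal budget explicit."""
    if not items:
        return []
    rng = random.Random(seed)
    weights = replay_weights(items, today, half_life_days, legacy_floor)
    return rng.choices(items, weights=weights, k=k)
```

In this sketch the floor is what guards against forgetting: without it, legacy items would almost never be drawn once the stream has run for a few half-lives.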
The system runs a rolling daily loop over reproducible day-wise streams (encyclopedic, news, and StackExchange-style), measuring a Factual Freshness Index on same-day evals and legacy retention on a fixed pre-stream holdout. Full method and results are in the paper, currently under review at ACM Transactions on Intelligent Systems and Technology.
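The shape of that rolling loop can be sketched as follows. This is an illustrative skeleton under stated assumptions, not the paper's evaluation harness. `rolling_daily_loop`, `update_fn`, and `score_fn` are hypothetical names, and a plain score returned by `score_fn` stands in for the Factual Freshness Index, whose definition is not given here.

```python
import time


def rolling_daily_loop(days, update_fn, score_fn, legacy_holdout):
    """Run one update per day and log freshness, retention, and cost.

    days:           iterable of (day_id, train_examples, same_day_eval)
    update_fn:      callable that updates the model on train_examples (assumed)
    score_fn:       callable that returns a score for an eval set (assumed)
    legacy_holdout: fixed eval set drawn before the stream starts
    """
    baseline = score_fn(legacy_holdout)  # retention reference, pre-stream
    log = []
    for day_id, train_examples, same_day_eval in days:
        start = time.perf_counter()
        update_fn(train_examples)
        elapsed = time.perf_counter() - start

        retention = score_fn(legacy_holdout)
        log.append({
            "day": day_id,
            "freshness": score_fn(same_day_eval),    # same-day eval
            "retention": retention,                  # fixed holdout
            "retention_delta": retention - baseline, # negative means forgetting
            "update_seconds": elapsed,               # wall-clock update cost
        })
    return log
```

Scoring the same fixed holdout every day is what makes forgetting visible: any drop in `retention_delta` can be attributed to the updates rather than to a shifting test set.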