Library

This is where I keep the collections: nine libraries I curate on NASA ADS, three thematic literature explorers, a podcast library, essays, and a live feed of recent additions. It's the reading side of the research.

Research libraries

NASA ADS ↗
  • Coding Agents

    85 documents

    Software-engineering agents: architectures, multi-agent coding, and how developers work with them.

    • Controllable Memory Usage: Balancing Anchoring and Innovation in Long-Term Human-Agent Interaction
    • Developer Interaction Patterns with Proactive AI: A Five-Day Field Study

    Browse →

  • Benchmarks

    62 documents

    Evaluating coding agents and code models on real software work.

    • Towards Comprehensive Benchmarking Infrastructure for LLMs In Software Engineering
    • IDE-Bench: Evaluating Large Language Models as IDE Agents on Real-World Software Engineering Tasks

    Browse →

  • Code Generation & Retrieval

    41 documents

    Code generation, context retrieval, and localization for coding agents.

    • ShortCoder: Knowledge-Augmented Syntax Optimization for Token-Efficient Code Generation
    • Compressed code: the hidden effects of quantization and distillation on programming tokens

    Browse →

  • Agent Memory

    35 documents

    Long-horizon memory for LLM agents: storage, consolidation, and forgetting.

    • From Storage to Experience: A Survey on the Evolution of LLM Agent Memory Mechanisms
    • Same Ranking, Different Winner: How Scoring Targets Shape LLM Memory Benchmarks

    Browse →

  • Scientific Search & SciX

    84 documents

    Navigating scientific literature: NASA ADS / SciX information systems, scientific language models, and fine-grained classification of research text.

    • Decades of Transformation: Evolution of the NASA Astrophysics Data System's Infrastructure
    • Improving astroBERT Using Semantic Textual Similarity

    Browse →

Thematic explorers

Navigable, themed maps of recent literature. Each one starts from an ADS library and structures the papers into themes using SciX MCP.

  • Agentic Memory Systems

    108 papers · 9 themes

    Procedural & Skills · Reflection & Experience · Benchmarks · Eval Methodology · Synthetic Data · Architectures · Security & Governance · Applications & Personalization · Forgetting & Consolidation
  • Memory Design Considerations

    51 papers · 11 themes

    Retrieval & Ranking · Consolidation & Distillation · Knowledge Representation · Temporality & Updating · Forgetting & Lifecycle · Storage Substrate · Multi-Agent & Shared Memory · Working Memory & Context · Evaluation & Cost · Interop, Schema & Governance · Foundations & Landscape
  • Enterprise Multi-Agent Reliability

    50 papers · 8 themes

    Reliability & failure modes · Recovery & durable state · Observability & tracing · Evaluation & assurance · Cost, routing & scheduling · Topology & coordination · Security & governance · Human oversight & collaboration

Resource libraries

What's new

Recent papers across all libraries, newest first by publication date. The ordering reflects the recency prior from the hybrid scorer in my code-intelligence-digest app.

| Paper | Library | Date | Cites |
| --- | --- | --- | --- |
| From Storage to Experience: A Survey on the Evolution of LLM Agent Memory Mechanisms ↗ | Agent Memory | 2026-05-00 | 0 |
| Same Ranking, Different Winner: How Scoring Targets Shape LLM Memory Benchmarks ↗ | Agent Memory | 2026-05-00 | 0 |
| MemConflict: Evaluating Long-Term Memory Systems Under Memory Conflicts ↗ | Agent Memory | 2026-05-00 | 0 |
| Hijacking Agent Memory: Stealthy Trojan Attacks Through Conversational Interaction ↗ | Agent Memory | 2026-05-00 | 0 |
| MemTrace: Tracing and Attributing Errors in Large Language Model Memory Systems ↗ | Agent Memory | 2026-05-00 | 0 |
| Harness Updating Is Not Harness Benefit: Disentangling Evolution Capabilities in Self-Evolving LLM Agents ↗ | Agent Memory | 2026-05-00 | 0 |
| SkillEvolBench: Benchmarking the Evolution from Episodic Experience to Procedural Skills ↗ | Agent Memory | 2026-05-00 | 0 |
| GRAVITY: Architecture-Agnostic Structured Anchoring for Long-Horizon Conversational Memory ↗ | Agent Memory | 2026-05-00 | 0 |
| STALE: Can LLM Agents Know When Their Memories Are No Longer Valid? ↗ | Agent Memory | 2026-05-00 | 0 |
| Adaptive Memory Crystallization for Autonomous AI Agent Learning in Dynamic Environments ↗ | Agent Memory | 2026-04-00 | 0 |
| HyperMem: Hypergraph Memory for Long-Term Conversations ↗ | Agent Memory | 2026-04-00 | 2 |
| CLEAR: Context Augmentation from Contrastive Learning of Experience via Agentic Reflection ↗ | Agent Memory | 2026-04-00 | 0 |
| Learning to Forget -- Hierarchical Episodic Memory for Lifelong Robot Deployment ↗ | Agent Memory | 2026-04-00 | 0 |
| Towards Real-world Human Behavior Simulation: Benchmarking Large Language Models on Long-horizon, Cross-scenario, Heterogeneous Behavior Traces ↗ | Agent Memory | 2026-04-00 | 0 |
| SEA-Eval: A Benchmark for Evaluating Self-Evolving Agents Beyond Episodic Assessment ↗ | Agent Memory | 2026-04-00 | 0 |
| EngramaBench: Evaluating Long-Term Conversational Memory with Structured Graph Retrieval ↗ | Agent Memory | 2026-04-00 | 0 |
| Controllable Memory Usage: Balancing Anchoring and Innovation in Long-Term Human-Agent Interaction ↗ | Coding Agents | 2026-01-00 | 0 |
| Developer Interaction Patterns with Proactive AI: A Five-Day Field Study ↗ | Coding Agents | 2026-01-00 | 2 |
| CodeEval: A pedagogical approach for targeted evaluation of code-trained Large Language Models ↗ | Coding Agents | 2026-01-00 | 0 |
| Towards Comprehensive Benchmarking Infrastructure for LLMs In Software Engineering ↗ | Benchmarks | 2026-01-00 | 0 |
| IDE-Bench: Evaluating Large Language Models as IDE Agents on Real-World Software Engineering Tasks ↗ | Benchmarks | 2026-01-00 | 1 |
| ShortCoder: Knowledge-Augmented Syntax Optimization for Token-Efficient Code Generation ↗ | Code Generation & Retrieval | 2026-01-00 | 1 |
| Compressed code: the hidden effects of quantization and distillation on programming tokens ↗ | Code Generation & Retrieval | 2026-01-00 | 1 |
| From Large AI Models to Agentic AI: A Tutorial on Future Intelligent Communications ↗ | Coding Agents | 2026-00-00 | 43 |
| Beyond Elicitation: Provision-Based Prompt Optimization for Knowledge-Intensive Tasks ↗ | Code Generation & Retrieval | 2026-00-00 | 0 |
| Trusted AI Agents in the Cloud ↗ | Coding Agents | 2025-12-00 | 4 |
| Building from Scratch: A Multi-Agent Framework with Human-in-the-Loop for Multilingual Legal Terminology Mapping ↗ | Coding Agents | 2025-12-00 | 0 |
| A Practical Guide for Designing, Developing, and Deploying Production-Grade Agentic AI Workflows ↗ | Coding Agents | 2025-12-00 | 11 |
| From monoliths to modules: Decomposing transducers for efficient world modelling ↗ | Coding Agents | 2025-12-00 | 1 |
| Please Don't Kill My Vibe: Empowering Agents with Data Flow Control ↗ | Coding Agents | 2025-12-00 | 2 |
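
For readers curious how a recency prior like the one mentioned above could produce this ordering, here is a minimal sketch. It is not the scorer from code-intelligence-digest, whose internals aren't shown on this page. The function names, the exponential-decay form, the 180-day half-life, and the `today` value are all assumptions made for illustration. The one detail taken from the feed itself is the date format, where a month or day of `00` stands for an unknown value.

```python
from datetime import date


def parse_ads_date(pubdate: str) -> date:
    """Parse 'YYYY-MM-DD' where MM or DD may be '00' (unknown).

    Unknown parts fall back to 1, so '2026-00-00' becomes 2026-01-01.
    """
    year, month, day = (int(part) for part in pubdate.split("-"))
    return date(year, month or 1, day or 1)


def recency_prior(pubdate: str, today: date, half_life_days: float = 180.0) -> float:
    """Hypothetical exponential-decay prior.

    Returns 1.0 for a paper dated today (or in the future) and 0.5 for a
    paper exactly one half-life old. The half-life is an assumed value.
    """
    age_days = max((today - parse_ads_date(pubdate)).days, 0)
    return 0.5 ** (age_days / half_life_days)


def rank_by_recency(papers: list[dict], today: date) -> list[dict]:
    """Sort newest first; ties on the prior are broken by the raw date string."""
    return sorted(
        papers,
        key=lambda p: (recency_prior(p["date"], today), p["date"]),
        reverse=True,
    )


# A few rows from the table above, deliberately shuffled.
papers = [
    {"title": "Trusted AI Agents in the Cloud", "date": "2025-12-00"},
    {"title": "HyperMem: Hypergraph Memory for Long-Term Conversations", "date": "2026-04-00"},
    {"title": "Beyond Elicitation: Provision-Based Prompt Optimization for Knowledge-Intensive Tasks", "date": "2026-00-00"},
    {"title": "IDE-Bench: Evaluating Large Language Models as IDE Agents on Real-World Software Engineering Tasks", "date": "2026-01-00"},
]

ranked = rank_by_recency(papers, today=date(2026, 5, 15))  # assumed "today"
for paper in ranked:
    print(paper["date"], paper["title"])
```

The tie-break on the raw date string is what places `2026-01-00` ahead of `2026-00-00`, matching the table: both parse to 1 January 2026 and get the same prior, so the string comparison decides. In a real hybrid scorer this prior would presumably be blended with a relevance signal, which the sketch leaves out.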