<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"><channel><title>Stephanie Jarmak — Digest</title><description>Daily and weekly digests on agentic coding, evals, multi-agent orchestration, agent memory, and information retrieval.</description><link>https://sjarmak.ai/</link><item><title>The week the benchmarks broke</title><link>https://sjarmak.ai/digest/daily-2026-06-09/</link><guid isPermaLink="true">https://sjarmak.ai/digest/daily-2026-06-09/</guid><description>Opus 4.8 scores 13.8% on FrontierCode Diamond, and METR says over half of passing SWE-bench results are unmergeable slop. The field spent the week rebuilding its measuring sticks: cheating-resistant evals, exploration and memory benchmarks, and the finding that orchestration is a skill distinct from coding.</description><pubDate>Tue, 09 Jun 2026 00:00:00 GMT</pubDate><category>evals</category><category>agentic-coding</category><category>information-retrieval</category><category>agent-memory</category><category>multi-agent-orchestration</category><enclosure url="https://sjarmak.ai/media/digests/daily-2026-06-09.mp3" length="0" type="audio/mpeg"/></item><item><title>Enhancing Developer Productivity with Google Colab CLI and Agentic Observability</title><link>https://sjarmak.ai/digest/manual-enhancing-developer-productivity-with-google-colab-cli-and-agentic-observability/</link><guid isPermaLink="true">https://sjarmak.ai/digest/manual-enhancing-developer-productivity-with-google-colab-cli-and-agentic-observability/</guid><description>The insights from this digest suggest that teams should actively explore integrating advanced tools like Google’s Colab CLI for resource management, adopt agentic observability for improved operational oversight, and consider the implications of SWE-Marathon for evaluating coding agents’ capabilities. Additionally, focusing on MEnvAgent may significantly bolster productivity and success rates in coding tasks. 
Adopting these tools can streamline workflows and improve both developer productivity and code quality.</description><pubDate>Tue, 09 Jun 2026 00:00:00 GMT</pubDate><category>developer-productivity</category><category>evals</category><category>agent-memory</category><category>infrastructure</category><category>knowledge-bases</category><category>benchmarks</category><category>reliability</category></item><item><title>Agents Get Graded on Process, Not Just Pass/Fail</title><link>https://sjarmak.ai/digest/weekly-2026-06-09/</link><guid isPermaLink="true">https://sjarmak.ai/digest/weekly-2026-06-09/</guid><description>A week of instrumentation: benchmarks broke the binary resolved/unresolved score into exploration, maintainability, and handoff cost, while a Sonnet 4.6 judge that flags agents contradicting their own reasoning predicted failure 94% of the time. Memory research converged on agent-controlled storage over fixed pipelines, self-evolving agents started learning from their own traces, and multi-agent orchestration finally got a cost accounting. Adoption more than doubled in the same window.</description><pubDate>Tue, 09 Jun 2026 00:00:00 GMT</pubDate><category>evals</category><category>agent-memory</category><category>multi-agent</category><category>agentic-coding</category><category>information-retrieval</category><enclosure url="https://sjarmak.ai/media/digests/weekly-2026-06-09.mp3" length="0" type="audio/mpeg"/></item><item><title>Weekly: the orchestration stack consolidates</title><link>https://sjarmak.ai/digest/weekly-2026-06-08/</link><guid isPermaLink="true">https://sjarmak.ai/digest/weekly-2026-06-08/</guid><description>This week the multi-agent orchestration tooling started to converge on a few shared patterns: typed message contracts, deterministic fan-out, and adversarial review as a default stage. 
Plus a strong week for coding-agent benchmarks and a quietly important retrieval-eval release.</description><pubDate>Mon, 08 Jun 2026 00:00:00 GMT</pubDate><category>multi-agent</category><category>agentic-coding</category><category>evals</category><category>information-retrieval</category><enclosure url="https://sjarmak.ai/media/podcasts/podcast-agentic-memory-ep2-the-memory-stack.mp3" length="0" type="audio/mpeg"/></item><item><title>Curated: what I&apos;m actually reading on agentic coding</title><link>https://sjarmak.ai/digest/manual-agentic-coding-roundup/</link><guid isPermaLink="true">https://sjarmak.ai/digest/manual-agentic-coding-roundup/</guid><description>A hand-picked set from the items I starred in code-intelligence-digest this week — the agentic-coding pieces I keep coming back to, with a note on why each one stuck.</description><pubDate>Fri, 05 Jun 2026 00:00:00 GMT</pubDate><category>agentic-coding</category><category>evals</category></item></channel></rss>