AI Weekly Trends: Highly Opinionated Signals from the Week [CY26W10]
🔗 Learn more about me, my work, and how to connect: maeste.it – personal bio, projects, and social links.
Welcome to this week’s newsletter. There’s plenty of important news: OpenAI releases GPT-5.4 with remarkable performance, while Alibaba launches the Qwen 3.5 family, whose 9-billion-parameter model beats the 120-billion-parameter open-source gpt-oss (the model news of the week). Google responds with Gemini 3.1 Flash-Lite, focused on speed and low cost. On the agents front, OpenClaw’s influence is felt everywhere, from Cursor Automations to the thesis that MCP is dead in favor of CLIs. For AI-assisted coding, we present LINCE, our open source project, and discuss how traditional code reviews are bound to change. Finally, in business, the Anthropic-Pentagon clash evolves, OpenAI raises 110 billion dollars, and Amodei reminds us that critical thinking remains our last true advantage.
Before I leave you to read the news and my analysis of what happened this week, let me tell you what has happened, is about to happen, or will happen in my public agenda, for those who want to follow my talks or meet me in person (I love exchanging opinions with anyone who wants to):
Podcast with Alessio and Paolo:
On March 12th we’ll be at JUG Milan to record our first live episode. Don’t miss it
On Saturday we released an episode where we talk extensively about AI coding, agents, and lots of news
We’re working on other interviews and episodes with very interesting guests
We’ve created a GitHub repository with tools and configurations for AI coding from the Linux terminal. Obviously open source, so take a look and contribute: LINCE - Linux Intelligent Native Coding Environment
On my own:
I was interviewed again on the Open Source podcast. This time I talk about agents, AI, AGI. Released here on the 26th. Listen to it and send me your comments
On March 24th I’ll be at Voxxed Day in Zurich, where Alessio and I will present a talk on AI-assisted coding
On March 25th I’ll be a speaker at this Milan meetup on Vibe Coding and Agentic Engineering
On May 30th I’ll have the honor of being one of the PyCon Italia speakers
But let’s start with AI research, because this week there’s also relevant news on the models front.
🧠 AI Models News and Research
Takeaways for AI Engineers
Takeaway 1: The competition on models is played on three axes: pure performance (GPT-5.4), efficiency/cost (Gemini 3.1 Flash-Lite), and small high-performance models (Qwen 3.5). There’s no longer a single winner.
Takeaway 2: Qwen 3.5-9B beating gpt-oss-120B demonstrates that model size matters less than architecture: Chinese open source models are redefining the performance/parameters ratio.
Takeaway 3: NotebookLM with cinematic videos marks the shift from “AI that summarizes” to “AI that produces complete multimedia content”.
Action Items:
Try GPT-5.4 and Qwen 3.5-9B on a concrete task in your workflow to compare real performance and costs.
Experiment with NotebookLM Cinematic Video Overviews to evaluate if it can replace traditional presentation tools.
What’s happening this week?
As always, there’s plenty of news in the world of models. Let’s start with OpenAI’s new model, GPT-5.4, which brings many new features and truly remarkable performance, judging both by the benchmarks and by the first reactions I’ve seen on social media. It looks like OpenAI’s answer to the good work Anthropic is doing with its Claude models, both Opus and Sonnet. Google responds with Gemini 3.1 Flash-Lite, focused mainly on speed and low cost: it’s the smallest and fastest model of the Gemini 3 series.

But if we’re talking about small, fast models, the standout news of the week (perhaps the model news of the week, period) is the release of the Qwen 3.5 family. As usual for Qwen, it’s a family of natively multimodal models in different sizes, from 0.8 billion parameters up to very large mixture-of-experts models with great performance. Particularly interesting is the comparison showing the 9-billion-parameter version beating the open-source gpt-oss-120B on reasoning benchmarks. A truly remarkable result for Chinese models; remember that Qwen comes from Alibaba, one of the world’s big tech companies and certainly the largest in China.

Finally, I’d point out a new NotebookLM feature, currently in testing for a limited set of users, which uses Gemini 3 and Veo to generate videos that are no longer just slide presentations but real documentaries.
This week’s links
Introducing GPT-5.4 — GPT-5.4 with 1M token context, tool search, improved vision and 33% fewer factual errors compared to GPT-5.2.
Deep Dive: Qwen 3.5 — Qwen 3.5 with native multimodality, 262K token context and hybrid architecture for edge deployment from 0.8B parameters.
Gemini 3.1 Flash-Lite — The fastest and most affordable model in the Gemini 3 series, starting at $0.25/M input tokens.
Cinematic Video Overviews in NotebookLM — NotebookLM generates cinematic videos from user sources using Gemini 3 and Veo 3.
Qwen3.5-9B Beats gpt-oss-120B — Qwen3.5-9B surpasses gpt-oss-120B on benchmarks, available open source with Apache 2.0 license.
🤖 Agentic AI
Takeaways for AI Engineers
Takeaway 1: OpenClaw’s influence is visible everywhere: persistent memory, goal-driven automation, and CLIs as the interface for agents are becoming dominant patterns in the industry.
Takeaway 2: The “MCP is dead” thesis finds practical confirmation in the explosion of dedicated CLIs (Google Workspace CLI at the forefront): existing, well-documented tools beat new protocols.
Takeaway 3: GAM demonstrates that effective agentic memory isn’t just “save everything”, but on-demand synthesis with an approach inspired by just-in-time compilation.
Action Items:
Explore Cursor Automations to build recurring agents in your development workflow.
Read the GAM paper to evaluate how to apply the Memorizer/Researcher approach to memory management in your agents.
What’s happening this week?
In this week’s agent news, OpenClaw’s influence seems quite evident. Let me explain. One of OpenClaw’s defining traits is strong automation grounded in what it learns from memory. We find that trait in Cursor Automations, where the parallel is quite strong: it introduces agents that run on schedules, create their own sandboxes, and, crucially, have access to a memory tool that lets them learn from past executions. It’s exactly the logic of the “factory that produces software”, a concept anyone following OpenClaw will immediately recognize. We also find it in Google’s tests of what they call Learning Hub, where agents learn autonomously based on defined objectives. Both closely resemble what OpenClaw does in this area; or at least we can say that, if it isn’t direct influence, the same forces that led to OpenClaw’s development are pushing others to research in the same direction.

Another OpenClaw trait is using CLIs instead of MCP. Here, two things stand out: the article that takes the strong position that MCP is dead, and especially the wave of CLIs released in recent weeks for the most varied tasks. The article’s key point is simple but powerful: LLMs are already good at figuring things out on their own; all they need is a CLI and documentation. New protocols aren’t needed when tools already exist and work well for both humans and agents. Last but not least, there’s the CLI from Google engineers for interacting with all of Workspace, from Gmail to Calendar to Google Drive.

Finally, I’d point you to a piece of research called General Agentic Memory via Deep Research that’s worth reading: it’s a memory framework for agents.
There are several interesting ideas in this research and I invite you to read it, particularly paying attention to the approach that somewhat resembles the just-in-time compilation found in some languages, but applied to context and natural language. In practice, GAM uses two components — a Memorizer that maintains lightweight summaries and a Researcher that retrieves and synthesizes relevant information only when needed. It’s an elegant alternative to the classic “save everything in a vector store” approach that dominates today’s agentic landscape.
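To make the Memorizer/Researcher split concrete, here is a minimal sketch in Python. The class names follow the paper’s terminology, but everything else (the one-line note compression, the keyword-based retrieval) is an illustrative stand-in for the LLM calls the real framework would make:

```python
from dataclasses import dataclass, field

@dataclass
class Memorizer:
    """Keeps lightweight running notes plus the raw history (illustrative sketch)."""
    notes: list = field(default_factory=list)   # compact per-event notes
    pages: list = field(default_factory=list)   # full raw history, stored cheaply

    def memorize(self, event: str) -> None:
        self.pages.append(event)
        # Stand-in for an LLM call that compresses the event to one short note.
        self.notes.append(event.split(".")[0][:80])

@dataclass
class Researcher:
    """Synthesizes an answer on demand, JIT-style, instead of pre-indexing everything."""
    memory: Memorizer

    def research(self, query: str) -> str:
        # Stand-in for iterative deep research over the raw pages:
        # select only the pages relevant to this query, then synthesize.
        hits = [p for p in self.memory.pages
                if any(w in p.lower() for w in query.lower().split())]
        return " | ".join(hits) if hits else "no relevant memory"

mem = Memorizer()
mem.memorize("Deployed build 42 to staging. Rollback plan documented.")
mem.memorize("User prefers tabs over spaces. Confirmed twice.")
print(Researcher(mem).research("staging deploy"))
```

The point of the design is visible even in this toy version: nothing is summarized or indexed ahead of time for a particular question; the expensive synthesis happens only when a query arrives, exactly like JIT compilation defers work until a code path is actually executed.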
This week’s links
Cursor Automations — Always-on agents on schedule or events with cloud sandbox and memory to learn from past executions.
Google Workspace CLI — Unified CLI for all Google Workspace services, designed for humans and AI agents with 100+ skills.
GAM: General Agentic Memory Via Deep Research — Agentic memory framework with JIT approach: Memorizer for lightweight summaries and Researcher for on-demand synthesis.
Google Tests Learning Hub with Goal-Based Actions — Gemini “Goal Scheduled Actions”: AI autonomously adjusts tasks toward defined objectives.
MCP Is Dead. Long Live the CLI — CLIs are more practical than MCP for humans and agents: existing, well-documented, and universal tools.
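The “CLI plus documentation” thesis above can be sketched in a few lines: instead of a protocol layer, the agent gets a tool’s own `--help` text as documentation and a guarded way to shell out. This is a minimal illustration (the allow-list and tool names are my own choices, not from any of the linked projects):

```python
import shlex
import subprocess

# Allow-list: the agent only gets access to vetted, pre-existing CLIs.
ALLOWED = {"git", "ls", "echo"}

def describe(tool: str) -> str:
    """Feed the agent the tool's own docs: no new protocol, just --help output."""
    out = subprocess.run([tool, "--help"], capture_output=True, text=True)
    return out.stdout or out.stderr

def run_tool(command: str) -> str:
    """Execute a command the model proposed, after checking the allow-list."""
    argv = shlex.split(command)
    if not argv or argv[0] not in ALLOWED:
        raise PermissionError(f"tool not allowed: {command}")
    result = subprocess.run(argv, capture_output=True, text=True, timeout=30)
    return result.stdout.strip()

print(run_tool("echo hello agents"))  # → hello agents
```

Everything the agent needs (discovery, documentation, execution) comes from tools that already exist and that humans already use, which is precisely the article’s argument against inventing a new protocol.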
💻 AI Assisted Coding
Takeaways for AI Engineers
Takeaway 1: LINCE demonstrates that Linux and the terminal remain the most natural environment for AI-assisted development: sandbox, backlog, and voice assistance integrated in a single session.
Takeaway 2: Traditional code reviews are becoming the bottleneck of AI-assisted development: we need new models based on intentions and acceptance criteria, not line-by-line inspection.
Takeaway 3: The introduction of evals in Anthropic’s skill-creator marks a paradigm shift: tested context beats reduced context for improving coding agents’ performance.
Action Items:
Try LINCE and contribute to the open source project with feedback, bug reports, or pull requests.
Register for the Packt Publishing workshop on AI refactoring with ast-grep and Claude Code (March 14, 2026).
What’s happening this week?
Let’s start with something a bit self-referential, but that I care a lot about. Together with the other guys from the Risorse Artificiali podcast, we’ve started an open source project called LINCE, which stands for Linux Intelligent Native Coding Environment: a complicated name for making public what we use every day. In practice, we’ve put together a set of scripts, configurations, and some software (developed with the help of intelligent agents) that supports our daily work as AI-assisted developers. Essentially, it integrates Claude Code with a backlog, with a voice assistant we called VoxCode, and with a sandbox built entirely in Linux user space. You’re probably thinking that many of these things already exist natively in Claude Code, but the point is that most of them don’t work, or don’t work well, on a Linux system. We believe Linux can be the system of choice for development in the terminal, since it’s the operating system with the deepest terminal integration of all. If you’re curious, take a look, try it, leave us comments and feedback, file bugs if you find any, or send us a pull request. Thanks.
One of the things we may look into is how to automate, or at least simplify and streamline, code reviews. Code reviews are becoming the real bottleneck, and they are among the practices that will need rethinking as the volume of code generated by intelligent agents keeps growing. The article I recommend, titled “How to Kill the Code Review,” is very interesting: it argues that the traditional code review model can become unsustainable in the AI era, and that we should instead focus on other criteria, including the acceptance of so-called development intentions. Read it, it’s worth it.
It’s also worth taking a look at the article on Anthropic adding evals to its skill-creator. As the article explains well, it’s an important paradigm shift that’s worth exploring.
Finally, I’d point you to a 90-minute workshop organized by Packt Publishing on refactoring with AI coding. The trainer is the author of a library called ast-grep and uses exactly that AST-based approach to guide Claude Code during refactoring, greatly reducing regressions and making the process much more predictable. I believe mixed approaches like this can be genuinely significant for further improving the AI-assisted coding experience. I’m already registered for the workshop; if you’re interested, take a look.
This week’s links
How to Kill the Code Review — Traditional code reviews are unsustainable with AI: need to focus on specifications and acceptance criteria.
LINCE - Linux Intelligent Native Coding Environment — Agentic workstation on Linux terminal with sandboxed Claude Code, task board, and voice assistance.
Anthropic Brings Evals to Skill-Creator — Evals integrated into Anthropic’s skill-creator to automatically test and validate AI skills.
Safely Refactor Production Codebases with AI — Workshop — Workshop (March 14, 2026) on safe refactoring with ast-grep and Claude Code on production codebases.
🏢 Business and Society
Takeaways for AI Engineers
Takeaway 1: Amodei reiterates that critical thinking is the last true human competitive advantage, while coding becomes a commodity: a clear message about where to invest your skills.
Takeaway 2: The Anthropic-Pentagon clash and the subsequent OpenAI-Department of War agreement raise unresolved questions: are ethical red lines real constraints or negotiating tools?
Takeaway 3: OpenAI at $730B valuation and 900M weekly users demonstrates that the consumer AI market is now mainstream, regardless of the technical debate on models.
Action Items:
Read the Anthropic paper on labor market impacts to understand where AI is already changing employment dynamics in your sector.
Watch Amodei’s full interview in Bangalore to deepen your understanding of his vision on power concentration in AI.
What’s happening this week?
By now you’ll have figured it out: when Dario Amodei speaks, I don’t miss the interview. A new one is out, a full interview he gave in Bangalore, where he focuses on how coding skills are declining in value and how critical thinking may be the competitive advantage human beings should preserve. He also discusses how the concentration of power that comes with AI could become an even bigger problem than it has been in the past.
And indeed, as you well know, Amodei and Anthropic have recently taken a position regarding the U.S. Department of War and the Pentagon, withdrawing from a million-dollar agreement. The story has since developed on several fronts: I’m linking the latest response from Amodei and Anthropic in this clash, in which they were accused of being a risk to the national security supply chain. Their response is very strong, and you can find it in the article I’ve linked. The American Department, of course, isn’t standing still and has meanwhile struck other agreements, notably with OpenAI, which declares it has put red lines in place precisely on the issues that sparked the controversy and pushed Anthropic to withdraw. Personally, it seems strange that Anthropic withdrew because those red lines were crossed, and the American Department of War then signed an agreement with another company that supposedly respects them. Something doesn’t add up.
Meanwhile, OpenAI continues to do business and has raised another 110 billion dollars, reaching a 730-billion-dollar valuation: 900 million weekly active users, 50 million consumer subscribers, and 9 million business subscribers. I’d say the company is doing just fine, despite the doubts raised in recent months as competitors seemed to be moving forward faster.
I conclude by pointing you to another paper from Anthropic that analyzes the labor market and the real impacts that AI is already having: different from what we perhaps expected a few months ago, but extremely interesting to read nonetheless.
This week’s links
Where Things Stand with the Department of War — Anthropic responds to the designation as a national security supply chain risk, announces legal action.
Labor Market Impacts of AI — New Anthropic framework to measure the real impacts of AI on labor markets.
Dario Amodei — Full Interview — Full interview in Bangalore: coding in decline, critical thinking as human advantage, power concentration in AI.
OpenAI’s Agreement with the Department of War — Classified agreement with red lines against mass surveillance, autonomous weapons, and automated decision-making.
OpenAI Raises $110B at $730B Valuation — $110B round, 900M weekly users, 50M consumer subscribers, 9M paying business users.



The OpenClaw connection is interesting, but I’d flip the causality slightly. The forces that shaped gws (CLI-first design, Discovery-Document-driven command generation, SKILL.md files for composable agent skills) aren’t downstream of OpenClaw; they’re converging independently on the same architecture. Justin Poehnelt wrote up the design thesis in a post called ‘You Need to Rewrite Your CLI for AI Agents’, and it reads like a parallel derivation, not an influence.
LINCE is doing the same thing from the bottom up. Terminal as native habitat, CLIs as the composition layer. Same destination, different route.
Full gws architecture breakdown here: https://reading.sh/google-workspace-finally-has-a-cli-and-its-built-for-agents-5f5fe87d0425 if you want the internals. The two-phase argument parsing from Discovery Documents is the piece that most directly connects to the ‘MCP is dead’ thesis: when the command surface is generated from the same docs the agent reads anyway, the discovery problem collapses.
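To illustrate what “command surface from Discovery Documents” could mean, here is a deliberately simplified Python sketch. The embedded document loosely mirrors the shape of Google’s real Discovery format (real documents nest resources more deeply), and the flattening logic is my own illustration, not gws’s actual implementation:

```python
# Simplified, hand-written snippet loosely shaped like a Google API
# Discovery Document. Real documents nest resources recursively; this
# flat version exists only to make the mapping visible.
DISCOVERY = {
    "name": "gmail",
    "resources": {
        "users.messages": {
            "methods": {
                "list": {"parameters": {"maxResults": {"type": "integer"},
                                        "q": {"type": "string"}}},
                "get": {"parameters": {"id": {"type": "string"}}},
            }
        }
    },
}

def commands_from_discovery(doc: dict) -> dict:
    """Map each API method to a CLI subcommand with typed flags.

    Because the same document is what an agent would read as documentation,
    the command surface and the docs cannot drift apart.
    """
    surface = {}
    for resource, spec in doc["resources"].items():
        for method, mspec in spec["methods"].items():
            cmd = f"{doc['name']} {resource} {method}"
            flags = [f"--{p} <{s['type']}>"
                     for p, s in mspec.get("parameters", {}).items()]
            surface[cmd] = flags
    return surface

for cmd, flags in commands_from_discovery(DISCOVERY).items():
    print(cmd, *flags)
```

Even in this toy form, the collapse of the discovery problem is visible: there is no separate tool manifest to invent or keep in sync, because the API’s own machine-readable description already enumerates every command and flag.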