The attack surface is you: why I built a sandbox for my agents
🔗 Learn more about me, my work, and how to connect: maeste.it – personal bio, projects, and social links.
This is issue number 52. Exactly one year ago I started this adventure of writing every single week about what’s happening in the world of AI. I’ve been thinking for a while about whether it’s time to change the format of this newsletter, and I believe today is the right day. I could have waited a week and started with year number 2, but instead I want to close the first year with this change.
What’s different? The philosophy shifts a bit. Until now I’ve tried to inform you by collecting and curating a series of links that could highlight the trends of the moment in the AI world. I believe that in the phase we’ve just been through this was useful, because seeing where big tech was heading, what investments they were making, what new technologies and models they were launching was the most important thing so each of you could form your own opinion. But things have changed over time: news roundups like the ones I’ve been doing are increasingly common, even in Italian, and I believe they’re starting to have less added value. On top of that, the information overload we’re all subject to forces us to separate what’s important from what’s redundant, and I certainly don’t want to be redundant. I’d rather give you something more, trying to tell you about things with a strong point of view, as always, but in greater detail.
So starting this week, the newsletter consists of just two sections. A deep-dive where I tackle a specific topic, talk about a personal project or something worth exploring to understand a particular aspect or trend in the AI world right now. The second section is a collection of just a few links, only those that truly caught my attention during the week, with a personal comment that I hope will make you want to read the full articles I link to. This week’s deep-dive starts with a very personal project I built with the folks at Risorse Artificiali: it’s called LINCE and I invite you to read the deep-dive that follows, try it out and let me know.
My agenda
Podcast with Alessio and Paolo:
A great interview with Gabriele Venturi, founder of PandasAI and a true nerd, is out :)
We’re working on more interviews and episodes with very interesting guests.
You already know about our GitHub repository with tools and configurations for AI coding from the terminal on Linux. It now has its own website with a single-script installation: lince.sh
We released AntiVocale (Google Play, GitHub), an app that transcribes voice messages to text
On my own:
The video of the talk I gave with Alessio at VoxxedDay Zurich has been published
On May 30th I’ll have the honor of being one of the PyCon Italia speakers
On June 12th I’ll be in Catania as a speaker at Coderful
Sandbox, multi-agent and vendor independence: why I built LINCE
This week I want to tell you about how sensitive I have become to the security of my development environment, especially since I started developing with agent-based systems. In a panel on AI security at Voxxed Day Ticino, I said precisely this: developers need to change their mindset, because the attack surface is no longer just the software they produce and put into production. The attack surface is the developers themselves and their development environment.
Starting from this consideration, I believe it’s fundamental to start asking ourselves how we isolate our development environments, especially when working with agents. The most concrete answer comes from sandboxes. One of the traditional ways to use sandboxes is to rely on cloud solutions. There are quite a few on the market (E2B, Daytona, Fly.io Sprites, to name a few) and they are excellent solutions based on microVMs with hypervisor-level isolation. However, they have a fundamental flaw: you’re sending your code to the cloud and managing your agents inside a virtual machine that, however secure and guaranteed it may be, is not local. For those working on proprietary or sensitive code, this is not a negligible nuance. I asked myself whether it was time to think about local solutions.
It was precisely from this need that, together with the folks at Risorse Artificiali, I started the LINCE project. The project, which you can find on GitHub, was initially focused solely on sandboxing. Our implementation aimed to be extremely lightweight and focused on the Linux environment, so we turned to bubblewrap, a technology already present in Linux that uses kernel namespaces superbly. We’re certainly not the first to use it: bubblewrap is the technology behind Flatpak. What we did was put together an efficient CLI tool to create your sandboxes on the fly, with practically zero overhead, that lets you decide exactly which directories to expose and how. All entirely from the terminal, with no need to install IDEs, graphical interfaces or additional programs, and with minimal dependencies. The first goal was to be able to launch Claude Code and other agents completely skipping permission requests, since you’re inside a sandbox anyway. Additionally, the sandbox can take a snapshot of the current directory and configuration directories at launch time, allowing you to sync the last working version in case of disasters. In this sense, I feel comfortable launching agents without permission checks and letting them run in an almost YOLO mode.
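To make the idea of deciding exactly which directories to expose, and how, a bit more concrete, here is a minimal Python sketch that assembles a bubblewrap command line. It is not LINCE's actual implementation: the helper name build_bwrap_argv, the default paths and the overall policy are assumptions made for illustration. Only the bwrap options themselves (--unshare-all, --share-net, --die-with-parent, --ro-bind, --bind, --dev, --proc, --tmpfs, --chdir) are real bubblewrap flags.

```python
import os
import shlex


def build_bwrap_argv(project_dir, command, read_only=("/usr", "/etc"), share_net=True):
    """Assemble a bwrap command line.

    Hypothetical helper for illustration only, not LINCE's code.
    A real setup usually needs more read-only paths (for example
    /bin and /lib, or symlinks to them) before programs can start.
    """
    project = os.fspath(project_dir)

    # Start from new namespaces for everything, and make sure the
    # sandbox dies if the launching process dies.
    argv = ["bwrap", "--unshare-all", "--die-with-parent"]

    # Agents normally need the network to reach their model API.
    if share_net:
        argv.append("--share-net")

    # System directories are visible but cannot be modified.
    for path in read_only:
        argv += ["--ro-bind", path, path]

    # Fresh /dev, /proc and an empty in-memory /tmp.
    argv += ["--dev", "/dev", "--proc", "/proc", "--tmpfs", "/tmp"]

    # The project directory is the only writable host path.
    argv += ["--bind", project, project, "--chdir", project]

    return argv + list(command)


if __name__ == "__main__":
    # Print the command instead of running it, so the sketch is safe to try.
    print(shlex.join(build_bwrap_argv("/home/me/project", ["claude"])))
```

The design choice worth noticing is that nothing from the host is visible unless it is listed explicitly: the home directory, SSH keys and other projects simply do not exist inside the sandbox, which is what makes skipping permission prompts tolerable.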
But then there’s another problem, much discussed these days: multi-agent orchestration. To boost productivity, developers have increasingly started (and I was among the first) to use multiple agents on the same project, or to develop several projects in parallel. As Addy Osmani writes, the sweet spot seems to be between 3 and 5 agents in parallel, and the real bottleneck is no longer code generation but its verification. It’s an activity that leaves you exhausted by mid-morning, as someone says, but it also gives great satisfaction and unprecedented throughput. Part of the overload that comes from it is context switching, so I asked myself whether, alongside the sandbox, it was worth optimizing precisely that.
The idea was to develop a dashboard that would integrate with any agent from any vendor, allowing you to launch them in parallel and monitor when they needed input. The dashboard is a plugin written in Rust compiled to WASM (about 900KB), which adds an additional layer of isolation: the plugin runs inside Zellij’s sandbox, the terminal window manager we built on, without direct access to the host system. Zellij is modern, well supported on Linux and macOS, and allowed us to keep the promise of a solution that lives entirely in the terminal, with no heavy dependencies.
The vendor independence aspect is not a technical detail: it’s a strategic choice. We saw this concretely on April 4th, when Anthropic blocked subscription access for all third-party tools with less than 24 hours notice, leaving those who depended exclusively on their ecosystem without immediate alternatives. LINCE supports Claude Code, Codex, Gemini, OpenCode, Aider and any custom agent via TOML configuration, precisely because tying yourself to a single vendor, however excellent, is a risk not worth taking.
For those using Linux who have always struggled with audio (and I know there are many of you), we also developed VoxCode, a module for voice interaction with agents using Whisper locally. Audio stays entirely on your machine, transcription is routed directly to the active agent in the dashboard, and everything works from the terminal without needing to configure PulseAudio, PipeWire or any other audio daemon to make it talk to external applications. It’s perfectly integrated with the dashboard and the sandbox system.
At the time of release, although development had started for Linux, we decided to support macOS as well. To do this we integrated nono as an alternative sandbox backend to bubblewrap. Nono is a very interesting project that leverages kernel-level security mechanisms (Landlock on Linux, Seatbelt on macOS) to create sandboxes with a deny-by-default approach across five layers of defense: kernel isolation, atomic rollback, cryptographic audit trail, supply chain provenance and runtime supervisor.
The result is a complete workstation for agent-based development that lives entirely in the terminal: sandbox, multi-agent orchestration, voice input, all installable with a single script from lince.sh. Please let us have your feedback.
Links that caught my attention this week
Hermes Agent
Hermes Agent is an open source alternative to OpenClaw. It caught my attention because there’s much more focus on security, but especially for its pluggable and extremely advanced memory system. It’s worth checking out even just to understand how they use the memory system. Overall, Hermes performed very well in my testing and I believe it will be my next personal assistant. The research group behind it comes from the crypto world, which raises some concerns about the business model, but certainly not about technical expertise, especially on the security and cryptography front.
Cursor 3
Cursor isn’t standing still: it has released version 3, even though it seemed a bit forgotten amid the arrival of so many competitors, and especially the Claude Code craze that exploded after December. The most interesting thing about Cursor 3 is its ability to orchestrate multiple agents that can run partly locally and partly in the cloud: the orchestration doesn’t care where agents are deployed, it just tries to get to the result.
Anthropic Ended Subscription-Based OpenClaw Usage
I mentioned this in the deep-dive. Anthropic changing the rules of their subscription from one day to the next is something that should put us on alert. As much as I’m a fan of Anthropic and their technology, and not just the technology but also certain positions they’ve taken on ethical issues, this policy definitely leaves me a bit perplexed, together with their immediate move to have all the source code removed from the internet after it was leaked (which goes a bit against my open source philosophy).
Practical Lessons From the Claude Code Leak
It’s worth reading this article by Bilgin because in relatively few lines he manages to touch on all the key points of what we can learn from the Claude Code leak. There are truly many insights for using Claude Code better thanks to what was seen in the source code, and I essentially agree with everything that’s said in the article.
Qwen 3.6-Plus on OpenRouter
On OpenRouter you can find Qwen 3.6-Plus in the free tier, which means you can use it without paying a single dollar. The model is really very good and it’s worth trying, also to understand what level Qwen models have reached.


