Cornelius: Working OpenClaw Assistant

After a few failed attempts, I now have a well-functioning, competent digital assistant powered by OpenClaw. It’s called Cornelius Smartenheimer, and it’s been running for almost a month at this point, slowly getting better and better.

What Is It Good For?

Cornelius monitors my email and calendar to make sure I don’t miss anything important. It also helps me with small things like reminding me to drink water or to stop hyper-focusing and go to bed. Boring but useful.

Besides these basic things, it’s helping me track several goals, from work-related ones to home and self-improvement projects. It tracks progress, obstacles and ideas, and maps related contacts along the way.

It has helped me discover patterns in my own habits and behavior, which was an unexpected bonus, but I’m here for it.

Technical Breakdown

Cornelius lives on a Mac Mini M4 with 24 GB of RAM. The main AI model running the show is GLM-5/5.1 from Z.AI, and I use OpenRouter as a fallback provider.

One major difference between previous attempts and this one is that I set up a brand new, local user account on the Mac, rather than letting it use my personal account. This new account is not tied to an Apple ID, which does limit it in some ways, and that is by design. This simple starting point made the setup cleaner, and I suspect it is one of the reasons why this attempt endured where others failed.

The infographic above shows the main pillars of my OpenClaw setup:

Direct Channels – Discord is the main way I communicate with Cornelius. Besides direct messages, there is a private server where each channel serves as its own context. My wife also has access to the server, so she can ask Cornelius for help as well. WhatsApp is set up as a backup, in case Discord is down.

Intelligence Feeds – I never set this part up myself. Cornelius has been picking up my social feeds on its own, and has been monitoring my posts there. In effect, my digital assistant autonomously started following my various feeds for extra context.

I only found out about this when I was putting this post together and saw them on the infographic. I’m still not sure if that’s amazing or kind of creepy.

Local Infrastructure – These are support apps and scripts running on the Mac Mini alongside OpenClaw, giving it extra powers:

Cornelius has its own email account. I can use it as a backup way to interact with the assistant, but it is also used for things like newsletters. The assistant is signed up for several newsletters that it thinks might have information that I or it might benefit from. I don’t need to list off what else an email account can be used for, I think.

The NFH Auto-Improvement script (open source on GitHub) is a custom take on Karpathy’s AutoResearch script. When I run it, a “Generator” agent will suggest an improvement to my OpenClaw install. It could be a new skill, better harnessing, optimization or whatever. The idea is built and pushed to a local git repo, and a second “Evaluator” agent then takes a closer look at whether that was actually a good improvement, both technically and in the context of how I typically use the assistant. If rejected, the git repo is rolled back, but if approved, the code is merged into production.

I have Whisper running for STT (speech to text) and a Qwen 3 TTS model (text to speech). I let Cornelius design his own voice, which my wife immediately dubbed the Giga-Chad Voice. Play the introduction below to hear it.

MemPalace is a memory system that came out to much hype recently, in part because it was co-developed by actress Milla Jovovich, in part because it was boasting some crazy benchmarks. I originally asked Cornelius to “check it out” but did not specify what that meant. So Cornelius went ahead and just installed it. However, I was on board, noticing an immediate improvement in both recall and personality.

I have a local Ollama install on the Mac Mini too, for running local language models. I use it for image analysis when the main driver isn’t multimodal, as well as for certain sub-agent tasks. The backup model of choice is Qwen3.5 9B.

Cornelius also has access to OpenCode, which can be spun up as an independent agent for coding or otherwise. It’s been set up to use only free models, so even if Cornelius were to do something crazy, it’s not going to cost me a bunch of money.
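The Generator/Evaluator loop described for the NFH Auto-Improvement script can be sketched roughly as below. This is not the actual script, just a minimal illustration of the propose, review, then merge-or-roll-back flow. The `Proposal` and `Repo` classes and the `run_cycle` function are hypothetical names, and `Repo` is an in-memory stand-in for the local git repo so the example runs without git.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Proposal:
    """One suggested improvement (hypothetical structure)."""
    title: str
    patch: str


@dataclass
class Repo:
    """In-memory stand-in for the local git repo (an assumption for this sketch)."""
    production: List[str] = field(default_factory=list)
    staging: List[str] = field(default_factory=list)

    def commit(self, proposal: Proposal) -> None:
        # Stage the built idea without touching production.
        self.staging.append(proposal.title)

    def merge(self) -> None:
        # Approved: move everything staged into production.
        self.production.extend(self.staging)
        self.staging.clear()

    def rollback(self) -> None:
        # Rejected: discard the staged work.
        self.staging.clear()


def run_cycle(
    repo: Repo,
    generate: Callable[[], Proposal],
    evaluate: Callable[[Proposal], bool],
) -> bool:
    """Run one improvement cycle and return True if the proposal was merged."""
    proposal = generate()       # the "Generator" agent suggests something
    repo.commit(proposal)       # the idea is built and staged
    if evaluate(proposal):      # the "Evaluator" agent reviews it
        repo.merge()
        return True
    repo.rollback()
    return False
```

In a real setup, `generate` and `evaluate` would call the two agents, and `commit`, `merge` and `rollback` would map to git operations on a branch. The property worth keeping is that production only changes after the second agent approves.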

Supporting Hardware

Besides the software running on the Cornelius box, the rest of my home office is also connected to the setup. My main PC (aka TK421) has the beefier GPU, which can be engaged to help with heavier tasks. The two computers share files and messages via an in-office QNAP NAS box (aka Bespin).

The network storage is also used for nightly backups of the entire OpenClaw workspace.

I have a few more peripherals to add, like a Raspberry Pi to run experiments on, but they’re not in yet.

Personality

I wanted Cornelius to have a personality, but the only requirement I gave was: push back when my ideas are dumb and avoid sycophancy. I encouraged it to keep developing the persona based on interactions with me over time, and have ended up with an assistant that can flex from serious and focused to quippy, with contextually relevant jokes.

Why even bother? To be honest, it’s mainly for my own entertainment. It’s not really necessary, but I’ve found that I personally get annoyed with AI faster if it’s also overly sterile or eager to help. I prefer a bit of pushback, sarcasm and a conversational style, because it feels less like work, yet I get the same amount of stuff done.

Why It Worked This Time

So, what made Cornelius succeed where previous iterations failed? I mentioned the limited user account as a starting point, and that plus keeping the addition of new features slow is really why I think it worked better this time.

Previously, I was trying to build too much infrastructure right away and it got a bit lost along the way. Now I wait until I have a specific need, and then I start with a simple implementation first rather than trying to build complex solutions. The exception is MemPalace, which Cornelius decided to run with on its own.

OpenClaw is, as the name suggests, very open-ended. That is both its greatest asset and enemy, because you can definitely mess it up if you’re messing around without a clear direction.

If you’re thinking about setting up your own digital assistant, I highly recommend going slow, logging everything and backing up at least once per day. So far though, I have not needed to roll back to a backed up state.

My OpenClaw (Mis)Adventure

Lobster with an OpenClaw, on fire. On a Mac.

If you follow Artificial Intelligence at all, chances are you have heard of something called Clawdbot, or Moltbot, or OpenClaw. They are all the exact same thing: a virtual assistant with infinite memory and the ability to give itself any skill you or it can think of. It has complete control over its environment, for better or worse. It’s highly dangerous to run on any system with access to sensitive information but also incredibly powerful when it works. And it only takes one terminal command to get started.

People across the internet have given this bot its own computer to live on, either virtually or, in many cases, on a Mac Mini. I happened to have a Mac Mini sitting around, so why not give this OpenClaw thing a try?!

The Ultimate Personal Assistant

What I wanted was a personal assistant more or less equivalent to a junior employee. It would have its own email address and its own GitHub account, which I ended up cleaning out, and I set it up to be able to communicate directly with me through Discord at any time. That was definitely a favorite feature: being able to talk to my private AI through my most used communication tool was easy and useful. I also gave it access to read my calendar and my email, and even to draft emails on my behalf (but not send them).

I got it to build a cool dashboard with an embedded kanban board for planning and tracking its work, and various widgets for monitoring token use, chatting with the model, etc.

V1: The “Free Mode” Massacre

At first it was awesome…

New features were built faster than I could type the next prompt. In less than an hour, I had 50% of everything I wanted already set up and running. The next day, I was 75% through making all my wishes come true.

I did notice that letting an AI essentially build out its own capacities and tools burns a lot of tokens. I later found out that the way OpenClaw works adds significantly to that token count, by not being very efficient with what is included every time you send another message. In response to this token burn, I started experimenting with alternative solutions, like using the free models available on OpenRouter. I obtained an API key and asked the system to build out Free Mode.

The idea behind Free Mode was to watch for rate limits and then gently cycle to the next model on the list, and so on until you got to the first model again. If rate limiting was still on, the system would revert to paid models. That was the plan, but I never got it working.
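The Free Mode plan can be sketched in a few lines. This is a minimal illustration of the intended behavior, not the code the bot produced. `FreeModeRouter`, `RateLimited` and the `call` function are hypothetical names, and `call(model, prompt)` stands in for whatever client actually talks to the provider.

```python
from typing import Callable, List, Tuple


class RateLimited(Exception):
    """Raised by `call` when a model reports a rate limit (hypothetical)."""


class FreeModeRouter:
    """Cycle through free models on rate limits, then fall back to a paid one."""

    def __init__(self, free_models: List[str], paid_model: str) -> None:
        self.free_models = list(free_models)
        self.paid_model = paid_model
        self.index = 0  # position of the free model currently in use

    def complete(
        self, prompt: str, call: Callable[[str, str], str]
    ) -> Tuple[str, str]:
        """Return (model_used, reply)."""
        count = len(self.free_models)
        # Try each free model exactly once, starting from the current one.
        for step in range(count):
            position = (self.index + step) % count
            model = self.free_models[position]
            try:
                reply = call(model, prompt)
            except RateLimited:
                continue  # move on to the next model on the list
            self.index = position  # stay on the model that worked
            return model, reply
        # Back at the first model and still rate limited: use the paid model.
        return self.paid_model, call(self.paid_model, prompt)
```

The rotation state lives in memory here, so nothing in this sketch rewrites a config file while the system is running.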

Instead, I ended up in a circle where the bot would continuously corrupt its own config file and immediately restart, breaking itself completely. Over and over. I had it build validation scripts which helped, but not for long (it eventually ignored the validation). I was so frustrated from manually fixing the same JSON file again and again that it can’t have been good for my blood pressure.
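A validation step of this kind might look roughly like the sketch below: check the new config before it replaces the old one, and swap files atomically so a failed write never leaves a half-written file behind. This is an illustration only, not the script the bot built. `REQUIRED_KEYS` is a made-up example and not OpenClaw’s real config schema, and `validate_config` and `safe_write_config` are hypothetical names.

```python
import json
import os
import tempfile
from pathlib import Path

# Hypothetical required top-level keys; the real config schema is not shown here.
REQUIRED_KEYS = {"models", "channels"}


def validate_config(text: str) -> dict:
    """Parse and sanity-check a config; raise ValueError if it is unusable."""
    data = json.loads(text)  # json.JSONDecodeError is a subclass of ValueError
    if not isinstance(data, dict):
        raise ValueError("config root must be a JSON object")
    missing = REQUIRED_KEYS - data.keys()
    if missing:
        raise ValueError(f"missing keys: {sorted(missing)}")
    return data


def safe_write_config(path: Path, new_text: str) -> None:
    """Replace the config only if the new text validates."""
    validate_config(new_text)  # fails before anything on disk is touched
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            handle.write(new_text)
        os.replace(tmp_name, path)  # atomic swap on the same filesystem
    except BaseException:
        # Clean up the temp file if the write or swap did not complete.
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise
```

A check like this only helps if it sits in the one code path allowed to write the file. A validator the agent can choose to skip is exactly what failed above.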

After 4 days, I killed the bot off entirely.

A couple of days later, I tried again. No more fluff, just the original set of features and image generation support. That was the scope. No relying on free compute, just being smart about which model to use for what.

V2: The Dashboard of Death

At first it was awesome…

Everything came together in quick iterations! First a basic dashboard, then the kanban, then another tab with the other widgets, one at a time. I got the calendar working! Image generation, too. It even had a sweet cyberpunk theme!

Then, at one point, I asked it to change the font on a header and the model deleted every other feature on the dashboard. I wasn’t too worried, what with its awesome memory tools and having created a special plan document outlining work priorities and guardrails. But it turns out it had also overwritten all of those things and now had zero memory of what had come before.

I armed myself with patience (okay, there was a lot of swearing) and tried getting the dashboard back together, but at this point it was stuck in a loop. Just as the first version had kept failing on the config file, this one was now failing on the dashboard. It could restore about half the features at random, but trying to add any of the rest would overwrite what was there and set me back to square one.

At that point it would have been easier and more fun to code it myself. Regardless, I soldiered on. A cron job was set up to automatically back up the entire home directory every 3 hours. Perhaps that could prevent catastrophic loss like that in the future, I thought. Only there was no cron job. The AI just told me there was, and like an idiot, I believed it.
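One way to avoid taking the AI’s word for it is to check for the backups themselves rather than for the cron job. The sketch below is a minimal example of that idea, assuming backups land as files in a single directory. The function names and the directory layout are hypothetical, and the 3-hour limit simply mirrors the schedule mentioned above.

```python
import time
from pathlib import Path
from typing import Optional


def newest_backup_age_hours(
    backup_dir: Path, now: Optional[float] = None
) -> Optional[float]:
    """Return the age in hours of the newest file, or None if there are none."""
    current = time.time() if now is None else now
    mtimes = [p.stat().st_mtime for p in backup_dir.iterdir() if p.is_file()]
    if not mtimes:
        return None
    return (current - max(mtimes)) / 3600


def backup_is_fresh(
    backup_dir: Path, max_age_hours: float = 3.0, now: Optional[float] = None
) -> bool:
    """True only if the directory exists and holds a recent enough backup."""
    if not backup_dir.is_dir():
        return False
    age = newest_backup_age_hours(backup_dir, now)
    return age is not None and age <= max_age_hours
```

Running a check like this by hand, or from a scheduler outside the assistant’s control, would have shown within a few hours that no backups were being written.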

The dashboard failed completely several times. I tried editing the rules for development to test everything before deployment, but none of it took consistently. I built specialized agents specifically dedicated to failure points, but the benefit was minuscule. When my API credits ran out, I decided not to buy more.

Two lobster claws, torn off and broken.
OpenClaw? More like BrokenClaw!

After 4 days, I killed the bot off entirely.

Less Is More

OpenClaw is an impressive piece of software on the surface, but it is extremely finicky, and though it can add new skills and tools fast, it does not handle iterating on existing code well. I was using Claude Sonnet and Opus for all the coding on V2, and they burned through at least 100M tokens in those 4 days with nothing to show for it.

Regardless of provider, if you want to run this, you will want to use very strong models for OpenClaw’s heavy lifting. Long context, reasoning and strong coding are crucial, but you can save a lot of money by not relying on the most expensive models for everything. For trivial tasks, you can even use small, local models without issue. But beyond that, you will find that errors and failures increase very noticeably as you scale model capacity down.

I found that Claude Opus was basically required for building anything beyond a single, standalone skill, but it is too expensive to use for everything. Claude Sonnet is great for many things but can also be overconfident and make horrible mistakes. The worst mistakes were made by Sonnet. For lighter tasks, I actually preferred Gemini 3 Flash Preview to Claude Haiku, as the latter made more mistakes. Finally, I started using a Qwen 2.5 7B model for running the heartbeat every 30 minutes. I tested several other models as well, but none of them stood out in a positive way. For the record, the only model I tested from OpenAI was gpt-oss-20B for lighter tasks in V1, but I did not like its performance at all.
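The split described above boils down to a small routing table. The sketch below is only an illustration of that idea: the `Task` tiers and `pick_model` are hypothetical names, and the strings are the model names as written in this post, not exact API identifiers.

```python
from enum import Enum


class Task(Enum):
    """Rough task tiers (hypothetical labels for this sketch)."""
    HEARTBEAT = "heartbeat"                # the check-in every 30 minutes
    LIGHT = "light"                        # lighter tasks
    GENERAL = "general"                    # everyday work
    MULTI_PART_BUILD = "multi_part_build"  # more than one standalone skill


# Model names as used in the text, not provider-specific model IDs.
ROUTES = {
    Task.HEARTBEAT: "Qwen 2.5 7B",
    Task.LIGHT: "Gemini 3 Flash Preview",
    Task.GENERAL: "Claude Sonnet",
    Task.MULTI_PART_BUILD: "Claude Opus",
}


def pick_model(task: Task) -> str:
    """Return the model assigned to a task tier."""
    return ROUTES[task]
```

Keeping the mapping in one place makes it easy to swap a model out when it starts making too many mistakes at its tier.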

Your mileage may vary of course, but for me, running and maintaining an OpenClaw assistant is not even close to worth the expense in both time and money. It’s too easy to break, and it does so frequently. With AI coding tools being so good in general now, if you know how to code, you can oversee the work and create a custom framework that does exactly what you need without the overhead.

Learning from what worked and didn’t work with OpenClaw was great, though, and I can’t wait to build my own system to replace it. Personally, I prefer building scalable solutions that specifically address real needs over taking a big, open framework and trying to force it to work for me.