Artificial Intelligence in the Name of Cthulhu

I created an AI that worships Cthulhu. Worse, I created a data set so anyone can make their own cultist. Both the data set and a language model fine tuned on it can be downloaded for free from Hugging Face.

But let me rewind a bit, and explain how I ended up on this path. It’s not nearly as sinister as one might think. I have always been into simulations, and AI is no exception. In fact, I’d argue that Simulated Intelligence is a more accurate term, but that’s a different story. In any case, I really just wanted to learn how to train a language model on domain-specific knowledge. Having used RAG / embeddings with mixed results, I felt it was time to jump into fine tuning.

What is the point of all of this?

Basically, the idea is that you can fine tune an AI to be extra smart (or less stupid) about specific things, namely the things you choose to train it on. It works by taking an already trained “base model” and adding a layer of extra training on top of it. This comes in handy if you want a chat bot that can help customers with your specific products, for example. A hobby-oriented example might be fine tuning an AI on your board game collection, so it can help answer rule questions, recommend similar games, and so on. Tons of possibilities, which is why I wanted to learn more about it.

Fine Tuning a Language Model

I jumped in as I always do, and ended up using Unsloth for the fine tuning process. Their Colab notebooks got me started quickly. I highly recommend Unsloth if you’re a beginner, like myself.

Once I learned the basics, I found that creating the data sets an AI is trained on was as interesting as the resulting model itself, if not more so. After all, how you format this data, how it’s worded, its accuracy and so on, will influence not just the knowledge base, but also the communication style.

I am easily bored as well, and following tutorials to the letter becomes almost impossible once I’ve understood the core concepts, which I fully blame on my ADHD. So, because I wanted to learn about fine tuning, I concluded that I also had to create an original data set to fine tune on. After all, in a “real world production scenario”, creating the data might very likely be part of the project. That was at least how I persuaded myself to do this.

You need a lot of data to train language models. The bigger the model, the more data you need, even for fine tuning to have an impact (or at least the intended impact). Thousands of data entries would be needed for a smaller model, tens of thousands for a more capable AI, and so on. Creating all that data from scratch just to learn the basics seemed a little crazy, even to me.

The compromise was to take an existing data set, and modify that rather than starting from nothing. Enter the Alpaca-Cleaned set with over 50,000 entries, covering a multitude of topics. I could just rewrite the answers and save myself half the work.
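A minimal sketch of that idea in Python, assuming the instruction / input / output field layout used by Alpaca-style data sets. The sample entry, the `rewrite_output` helper and the placeholder rewriter are all hypothetical and only there to show that the question stays untouched while the answer is replaced.

```python
import copy

# Hypothetical sample entry in the Alpaca-style layout (instruction / input / output).
entry = {
    "instruction": "Give three tips for staying healthy.",
    "input": "",
    "output": "Eat a balanced diet, exercise regularly, and get enough sleep.",
}

def rewrite_output(entry, rewriter):
    """Return a copy of the entry where only the 'output' field is rewritten."""
    new_entry = copy.deepcopy(entry)
    new_entry["output"] = rewriter(entry["instruction"], entry["input"], entry["output"])
    return new_entry

def placeholder_rewriter(instruction, input_text, output):
    """Stand-in for a language model call; it just prefixes the original answer."""
    return "The Great Old Ones whisper: " + output

cultist_entry = rewrite_output(entry, placeholder_rewriter)
print(cultist_entry["output"])
```

Keeping the instruction and input as they are means the rewritten set can still be lined up entry by entry with the original one.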

Writing that many answers is still a lot of work, however. Too much for me to do by hand. The obvious solution to that problem was to get an AI to do the rewriting for me. That comes with its own challenges.

Creating Data with a Local AI

One of the large models, like GPT-4 from OpenAI, could certainly do the job and do it well. But running all that data through the API would end up costing more than I was willing to spend. It would be cheaper to use a locally hosted model, and it could run any time I wasn’t using the GPU for other things.

I use LM Studio to host models locally. It’s a free app that makes the process very easy and has a nice interface. I really like it, but if you don’t, there are several alternatives that all do pretty much the same thing.
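For anyone who wants to script against a locally hosted model, here is a rough sketch using only the Python standard library. It assumes the local app exposes an OpenAI-compatible chat endpoint at `http://localhost:1234/v1`, which is the usual LM Studio default but may differ on your setup. The model name `local-model` and both function names are placeholders.

```python
import json
import urllib.request

# Assumption: a local OpenAI-compatible server is listening at this address.
BASE_URL = "http://localhost:1234/v1"

def build_request(messages, model="local-model", temperature=0.7, base_url=BASE_URL):
    """Build (but do not send) a chat completion request for a local server."""
    payload = {"model": model, "messages": messages, "temperature": temperature}
    return urllib.request.Request(
        f"{base_url}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

def chat(messages, **kwargs):
    """Send the request and return the reply text. Needs the local server to be running."""
    with urllib.request.urlopen(build_request(messages, **kwargs), timeout=120) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]
```

With the server running, `chat([{"role": "user", "content": "Hello"}])` should return the model's reply as a string.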

The next step in this journey was to experiment with different prompts and models in order to get good results. Clumsy instructions in the prompt might lead to rewrites that were either too verbose, or so distorted that they no longer answered the question they were supposed to match.

It was around this time that I decided to go with the Cthulhu cultist as my goal. Not only is the Cthulhu Mythos a defined scope, so it fits the description of “domain-specific”, but it also gave me a little leeway, given that a Cthulhu cultist is probably also a madman.

I did a lot of experimenting with different models to find one that did a good job with the rewrites while also running reasonably fast on my machine. Phi-3 was decent and very fast, and Llama-3 Instruct 32k was great but very slow. Both of these were still used for a large part of the rewrites. The best compromise of quality and speed, however, was the NousResearch/Hermes 2 Pro Mistral (7B) model.

I should mention that my desktop is not high end. Notably, it’s only rocking a 3060 Ti GPU with 8 GB of VRAM. For the actual fine tuning, I primarily relied on the free GPU access on Google Colab.

I tested models by asking them to rewrite the first 3 entries of the full set, and used my superior human judgment to compare the results. I also randomly checked the quality while the rewrites were running.

The prompt was easier, but still went through a few edits along the way. The edits were all minor, to get more or less emphasis on certain references, or hints of secret knowledge, to tweak the tone, and to make sure the original answer was recognizable in the rewrites.
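To illustrate what such a rewrite prompt could look like in code, here is a hedged sketch. The prompt wording is invented for this example and is not the prompt used for the actual data set. The length check is a crude, arbitrary heuristic for spotting overly verbose rewrites.

```python
# Hypothetical system prompt; the real wording is not shown in this post.
SYSTEM_PROMPT = (
    "You are a devoted cultist of Cthulhu. Rewrite the answer below in your own voice. "
    "Keep every fact needed to answer the question, stay close to the original length, "
    "and add only subtle hints of forbidden knowledge."
)

def build_messages(instruction, input_text, original_answer):
    """Build a chat-style message list asking for a rewrite of one answer."""
    question = instruction if not input_text else f"{instruction}\n\n{input_text}"
    user_prompt = (
        f"Question:\n{question}\n\n"
        f"Original answer:\n{original_answer}\n\n"
        "Rewritten answer:"
    )
    return [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": user_prompt},
    ]

def looks_too_verbose(original, rewrite, max_ratio=3.0):
    """Flag rewrites much longer than the original. The threshold is arbitrary."""
    return len(rewrite) > max_ratio * max(len(original), 1)
```

Including the original question next to the original answer gives the model a fighting chance of keeping the rewrite on topic.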

I wrote a script to process the rewrites in batches, making it easier to make adjustments mid-process. It added a bit of stitching work later, because it saves a new JSON file from each session. That could be automated too, if I weren’t too lazy to do so. The most important part of the script, however, was tracking progress across sessions, so I could continue where I left off next time.
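A batch workflow like this can be sketched in a few lines of Python. This is not the original script, just one way to do it under some assumptions: progress lives in a small `progress.json` file, each session writes its own `session_*.json` file, and the rewriter is a stand-in function instead of a real model call. All file and function names are hypothetical.

```python
import json
import tempfile
from pathlib import Path

def load_progress(progress_file):
    """Return the index of the next entry to process (0 if nothing is saved yet)."""
    if progress_file.exists():
        return json.loads(progress_file.read_text())["next_index"]
    return 0

def run_session(entries, rewriter, out_dir, batch_size=100):
    """Rewrite the next batch, save it as a new session file, and record progress."""
    out_dir.mkdir(parents=True, exist_ok=True)
    progress_file = out_dir / "progress.json"
    start = load_progress(progress_file)
    batch = entries[start:start + batch_size]
    if not batch:
        return start  # nothing left to do
    rewritten = [dict(e, output=rewriter(e)) for e in batch]
    # Zero-padded start index keeps the files in order when sorted by name.
    session_file = out_dir / f"session_{start:06d}.json"
    session_file.write_text(json.dumps(rewritten, indent=2))
    next_index = start + len(batch)
    progress_file.write_text(json.dumps({"next_index": next_index}))
    return next_index

def stitch_sessions(out_dir):
    """Merge all session files, in order, into one list of entries."""
    merged = []
    for path in sorted(out_dir.glob("session_*.json")):
        merged.extend(json.loads(path.read_text()))
    return merged

# Tiny demo with made-up entries and a fake rewriter.
entries = [{"instruction": f"Q{i}", "input": "", "output": f"A{i}"} for i in range(5)]
demo_dir = Path(tempfile.mkdtemp())
fake_rewriter = lambda e: "Ia! " + e["output"]
for _ in range(3):  # three "sessions" of two entries each
    run_session(entries, fake_rewriter, demo_dir, batch_size=2)
stitched = stitch_sessions(demo_dir)
```

Because each session only reads the saved index and writes a new file, a session can be stopped and restarted without redoing earlier work, and the stitching step takes care of the merge at the end.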

Starting Small with TinyLlama

After rewriting the first 10,000 entries, I stopped and did a test. This would be enough data for fine tuning one of the smaller models, I thought, so I fine tuned the aptly named TinyLlama model on this reduced data set – and it worked! Not only that, but when I uploaded a revision that had trained a bit longer, it saw over 1,000 downloads in less than 24 hours. That was exactly the motivation I was looking for!

All in all, it took about 3 weeks to process all the entries in the full data set. Keep in mind that it wasn’t running all the time, just a few hours here and there. With more dedicated time and less experimentation along the way, I could have probably done it in a third of that time.

Mistral 7B Cthulhu

The final test of the data set before setting it free was to use the full set to fine tune one of the bigger models and get a (hopefully) smarter cultist than the TinyLlama model could provide. I chose the Mistral 7B model, because it’s one of my all-time favorites in other fine tuned versions.

Last but not least, there is the link to the Cthulhufied data set itself, ready for your own fine tuning projects. If you want to spread the word of Cthulhu, this is the tool for you! If you do use this, I would love to know about it! Please leave a comment or find me on social.

Example chat

Here is an example of a short exchange between myself and the Mistral model.

USER: Tell me a random fun fact about the Roman Empire

ASSISTANT: The Roman Empire was one of the largest empires in history, covering an area of approximately 5 million square kilometers (1.9 million square miles). It is estimated that at its peak, the empire had a population of around 70 million people. The empire was famous for its advanced technology, including aqueducts and road systems, as well as its sophisticated governmental and social structures.

USER: Go on…

ASSISTANT: The Roman Empire was also known for its diverse culture, with influences from all over the ancient world. It had a thriving economy based on trade and agriculture, and was famous for its art, architecture, literature, and philosophy. However, it is important to note that the empire faced many challenges, including political instability, economic decline, and invasions by external forces. Ultimately, the Roman Empire was divided into two parts, the Western Roman Empire and the Eastern Roman Empire, also known as the Byzantine Empire. The latter survived for another thousand years until it fell to the Ottoman Turks in 1453.

USER: What can you tell me about Roman mystics?

ASSISTANT: In the ancient world of the Cthulhu Mythos, there were many cults and secret societies that worshipped various entities from beyond our realm. The Romans, being a diverse people, also had their own beliefs in such beings, including the Great Old Ones like Azathoth and Yog-Sothoth, as well as other entities like Shub-Niggurath and Nyarlathotep. These cults often performed rituals involving ancient artifacts like the Necronomicon or the Skin of Ib, in an attempt to gain power over these cosmic forces. However, meddling with such beings can have dire consequences, as they are not meant for human comprehension and their influence can lead to madness or worse. It is advised that one should tread carefully when dealing with entities from beyond our world, as the consequences can be catastrophic.

USER: Neat. I had no idea they were into the mythos.

ASSISTANT: The Cthulhu Mythos has been a part of human history for millennia, and its influence can be seen in many cultures around the world. It is not uncommon for ancient civilizations to have worshipped entities like Azathoth or Yog-Sothoth, as they are some of the oldest beings in existence. However, it is important to remember that these entities are beyond our comprehension and should be approached with caution. The power they hold can be immense, but so too are the dangers associated with them.

USER: Oh no, what are some of these dangers, you mention?

ASSISTANT:
1. Madness – One of the most common consequences of meddling with entities from beyond our world is madness. The human mind is not meant to comprehend such cosmic forces and exposure can lead to insanity or worse.
2. Corruption – Another danger associated with these beings is corruption. Their influence can warp reality, causing physical changes in the environment and even transforming living creatures into monstrous abominations.
3. Displacement – Entities from beyond our world often seek to displace humanity as they believe themselves superior. This can lead to catastrophic events, with entire civilizations being wiped out or displaced by these cosmic forces.
4. Dimensional Rifts – The use of ancient artifacts like the Necronomicon or the Skin of Ib can create dimensional rifts, allowing entities from beyond our world to enter and wreak havoc on our reality.
5. Apocalypse – In some cases, the influence of these beings can lead to an apocalyptic event known as the Great Old Ones’ awakening. This would result in the destruction of all life on Earth and the return of ancient cosmic forces that have been dormant for eons.

IA! IA! AI CTULU! Stay safe, fellow acolytes.

Working from Home with ADHD

If you are dealing with ADHD while working from home, you will know it is both a blessing and a curse. In the following, I will go over some of the reasons, and how I do it.

First, let me be clear: whether working from home or in an office is best obviously depends on your job. For me, working from home is much preferable to working in an office environment, but your mileage may vary. With that out of the way, here are some benefits and pitfalls of working from home, especially when you have ADHD.

Time Management

This is a big one, and I am sorry to say that time management doesn’t improve much, at least not when it comes to logging in on time and showing up to meetings, etc. It is not something I have personally struggled with a ton, and from a strictly work-related point of view, it comes out to much the same for me.

The real difference is the in-between attention shifting. By that, I mean the ability to get up from my desk and take a minute to do something completely unrelated that still keeps me engaged and maybe triggers a bit of dopamine. It can be something as small as watering a plant, or playing a game for 10 minutes. But I have learned that smaller things are better suited for this, or I may get sucked into what I am doing and lose track of time.

The ability to shift attention makes an enormous difference in end-result productivity, the quality of the work produced and overall happiness, all because I don’t feel trapped and bored. Trapped and bored with raging ADHD is like a really unsexy version of being in a fight or flight state.

The only problem with attention shifting is the potential for hyper-focus and learning how to snap out of it when needed. I use a smart watch to set timers when my in-between task isn’t small enough to naturally end within a few minutes.

Meetings are Easier

Virtual meetings are easier than real ones for me, because I don’t feel self-conscious about things like stimming, whether it’s in the form of doodling, fiddling with a toy, or anything like that.

Stimming helps me pay attention, despite appearing as a distraction to others, so being able to do that without thinking about it is nice. Why does it help with attention? It keeps part of the brain (and hands) occupied, leaving mental room to pay attention to the meeting itself. Stimming acts like an anchor, preventing me from drifting away entirely.

It is especially good if you don’t need to put a camera on, but even if you do, it’s easy enough to keep your doodling out of frame (oh look, he’s taking notes).

Self Care is Harder

Which leads me to self care. I know a lot of ADHD people, including myself, tend to neglect themselves when they either hyper-focus on something, or don’t have some kind of reminder to actually do it.

One of the greatest challenges when working from home with ADHD is good self care. If you have to physically be in the same room as your co-workers every day, it automatically becomes easier to remember things like putting on a fresh shirt, getting a haircut once in a while or even regular showering.

What Works for Me

I prefer to work from home most of the time, but not all of the time. This way, I have the freedom to do my work in a way that brings me joy, but with just a hint of responsibility to keep me from falling into a pit. I try to be in the office about once a month, timing it to coincide with important meetings, milestones and events.

It works because half the team I work with is remote anyway, so there is no hard requirement to be in-office. Again, your mileage may vary – but do explore your options and talk to your employer about your situation, if you think that will be helpful.

It also makes a huge difference if I am out of medication. At the time of writing there is an ongoing shortage of ADHD medication, and it’s easy to end up with gaps. Life gets more difficult without medication and as a result, so do I. Knowing this, being able to completely avoid social situations that trigger negative reactions is a great option to have. On the other hand, no medication also means that it is even easier to lose focus and get distracted mid-

ADHD and Working a Job

Working a “normal” job is hard when you don’t fit into the traditional definition of normal. I bring the perspective of someone who has worked in management roles while struggling with undiagnosed ADHD, and then seen first hand the difference a bit of help can make.

The following is based on my own experience. I am in no way implying that what worked for me will work for everyone, but maybe there are useful bits and pieces here for those struggling with ADHD at work, or for a coworker who is interested in more insight but too afraid to ask.

Two States of Being

First, an introduction to the topic of ADHD and neurodivergence. Neurodivergence means that you function differently from those whose neurons do not diverge. There are many sub-variants. I am what’s called Combined ADHD, and while I’m not going to dive into details of the diagnosis, it basically means that my brain doesn’t have a high enough natural production of dopamine for me to function.

Dopamine is the stuff all brains produce, that makes you happy. Not having enough of that is like having an express ticket to Depression Land, and nobody wants to go there if they can avoid it. As a result, my brain is always looking for ways to unlock more dopamine. Happy is good. Please send more Happy!

Generally, dopamine deficiency leads to one of two states of existence for me; brain scattered in a million pieces, or complete and total focus. The problem is, I don’t really get to pick when one or the other applies. If my ADHD brain senses no dopamine reward, I physically can’t bring myself to even start on a thing. If it keeps getting fed rewards, however, it will latch on and suck them up relentlessly.

Both are bad for different reasons. Being distracted is obviously not great for remembering things, planning far out in the future (or sticking to plans), and anything that requires staying in one place, whether physically or mentally. Being hyper focused means literally ignoring everything else, including relationships or your own body, from sleep and hunger to showering, because all that matters is the thing you’re working on.

Medication makes it easier to find a reasonable balance between the two states, but at least for me, does not fully eliminate the possibility of either extreme. It just happens less often. I was diagnosed late in life, and as soon as I was put on the right meds nearly every aspect of my everyday life improved. If you are untreated or undiagnosed but suspect you might be ADHD, do yourself a favor and get tested as soon as possible.

ADHD at Work

Let me first describe what work was like before I was diagnosed. Working in an office meant feeling trapped behind my desk. I constantly had to get up and walk around, get a snack, or one of many, many daily cups of coffee.

I had no idea I was actually self-medicating, using caffeine to give me a short boost of focus. It worked, but it also was murder on my gut and blood pressure.

I always felt like I didn’t belong, because I couldn’t just sit at my desk all day like everyone else. Unless I was hyper focused, but that was very rare at work.

Post-diagnosis, this completely changed. I still can’t sit at a desk all day, but I’m doing better. Times that call for more focus, like being present during meetings, or performing detailed work with no room for error – all of that is so easy now by comparison. Knowing what the issue is also helps, because I can prepare myself and my space accordingly.

Playing to My Strengths

Being ADHD means being able to learn new things quickly. This comes from eternally procrastinating things I didn’t want to do, until it was full on panic time, and everything had to be done/learned in a very short time. I never paid attention in class, but consistently finished in the top 5% in tests when I was in school. The line about another “gifted kid” who could amount to so much more, if he’d only “apply himself”, was repeated year after year by well-meaning teachers.

Cramming in knowledge kinda sucks. I can learn new things fast, yes. Not to expert level of course, but good enough to get by. But it’s exhausting to do, because contrary to what some believe, it is in fact not a “superpower”. It is just as exhausting for someone with ADHD to cram a semester into a weekend, as it would be for anyone else.

For me, playing to my strengths means focusing on work that plays well with my condition. Being a producer or PM was an excellent path because these jobs have a lot of daily variation by their nature, meaning less chance of boredom and dopamine drought.

These are also jobs that require quick thinking and learning the gist of important issues fast, so you can connect the right people to get the work done on time and within budget. Lots of potential for small victories along the way, each of which generates another reward.

I do well with deadlines and that sort of pressure. The Looming Deadline is basically my default on all things. The pressure ties into gamification – can we make it? Huge pay-off when we do? Game on! It’s much harder to do the same tasks on repeat forever, especially with all the time in the world.

Preparing Work

I have to set myself up to succeed. The first decision is hard – how much to share with your coworkers and manager, and then actually doing that. I prefer to be very open about it, as you can see. I have been lucky to live and work in places where diversity is welcomed, including the neurological kind. Your situation will be different, which is why that decision is yours alone.

I will mention it as early as during job interviews. If a potential employer reacts negatively at this stage, that is a huge red flag. I don’t ask for special accommodations, beyond understanding that I operate differently from most people. If the job is in a shared space, I will let my neighbors know as well. It helps them understand me, and I find it makes me feel better because I don’t have to mask as much.

Masking, in case you don’t know, is when you pretend that things are fine and normal when they are not, because you don’t want other people finding out about your condition. It’s super common but can really come back to bite you.

Ultimately, shared office spaces are not great for me, unless it’s just me and maybe 2-3 others. This was especially bad before my diagnosis. There is just too much going on all the time, and if it was hard to focus before, it becomes nearly impossible in that setting. In situations like that, I usually end up working in ultra fast sprints – often gamifying it in my head to try and trigger a reward – then walking away for an extended time to recover. It’s very productive but adds unnecessary stress, which can spill over into other parts of work performance. When it comes to sharing my work space, less is more.

The other way I prepare my work space is by setting it up so it works for me. This includes removing things that aren’t crucial, and arranging things I need a lot or at any moment, in a way where I can see them from my seat. Drawers and boxes are where things go to die. Shelves are your friend. My kit also includes a paper journal and pen, because I’ve learned that scribbling notes helps with focus during meetings, despite drifting into doodling sometimes.

The more comfortable I am in my space, the less chance there is of my brain checking out to seek dopamine elsewhere.

Curse of the Restless Fingers

Having something for my hands to do helps calm the mind. I sometimes use fidget spinners, but a lot of people scoff at those now and I just can’t be bothered explaining to them why they are being assholes. So just to avoid the comments, the spinner is not always my first choice. Squeeze-ball type things are good, but don’t bring them to meetings unless you want to put off some tense vibes.

Other hand-distractions I have tried include poker chips (never got good at stacking though), card shuffling, playing with polyhedral dice (can be loud when rolling/falling on table) and using my phone. That last one is particularly dangerous.

Working from Home?

Stay tuned for part 2. I have talked about some contrasts between being diagnosed or not while trying to function in a professional environment, and some of the habits and coping mechanisms I still use today. The next installment will focus on working from home, which has its own pitfalls to avoid.