Posted on

The rise of homelabs: Running your own AI server at home

In the battle between Amazon Web Services and Google Cloud, a quiet contender is silently encroaching on the battlefield. The homelab, a computing space previously reserved for the closet or garage, is now beginning to be a larger part of people’s homes and small offices. 

In my early years as a budding software engineer in the 1990s, I would often take home expired or junked office computers, quietly assembling Frankenstein’s file server. Stacks of cords, cables, and components confused my friends who would often ask why I was hoarding so much computer equipment.

The simple answer was that I loved to build private home versions of the servers I helped maintain at work. But now, an even bigger draw is pulling even non-technical people into running their own private server. 

One of the biggest current drivers for homelabs is the development of open source easily accessible machine learning algorithms. Specifically, large language models (LLMs). What was once reserved for Universities and research labs can now be run on simple hardware quickly and easily using open software. Even I have let go of my Frankenstein File Server in favour of smaller, lower-wattage single board computers that take up less space and power in my own homelab. 

How can I start my own AI homelab?

Through my journey setting up my own personal LLM, let me share the top five things you need to know in order to get started with running your own private homelab LLM.

Docker

Once a mysterious tool used by backend engineers for development and testing, Docker containers are now the backbone for beginners looking to quickly launch a machine learning application quickly and easily. A Docker container is simply a shrink wrapped package of all the software you need to run an application.

If a chef, menu, vegetables, and noodles are everything you need to make a stir fry, the Docker container version would be all these things in a box, with a simple command to start the fire, cut the vegetables, and cook the meal. 

For example, you can run your own private LLM using Docker by typing this Docker command:

docker run -d -v ollama:/root/.ollama -p 11434:11434 –name ollama ollama/ollama

Ollama

As we saw with the previous recommendation, Docker allows us to install Ollama with a single command prompt. But what does Ollama do? And why have over 5 million people downloaded it? Large Language Models come in many sizes, and using different models can be confusing to set up and configure.

Also Read: Securing tomorrow’s metaverse today: Why safety in the new frontier must leverage on hardware

Ollama provides a common interface for communicating to these LLMs using a simple application programming interface (API). This means software can be developed that “plugs into” Ollama to provide functionality, decoupled from the LLM itself. For example you can use the Ollama API in your own Jupyter Notebook to send natural language prompts to your own LLM. 

Jupyter Notebook

Almost half of all Data Scientists use Jupyter Notebooks, for good reason. Notebooks provide an easy way to both see and comment on code, and plenty of examples exist on how to use machine learning algorithms in python code, as shareable Notebooks. With a Notebook, you can easily plug into OpenAI’s ChatGPT API, for a fee.

However, if you run your own API, as shown in the above example with Ollama, you can send LLM prompts to your own homelab for privately and for free. A Notebook can be a very hands-on “learning” approach to running your own private homelab LLM. However, a more hands-off approach is also available. 

Open WebUI

If you have no interest in learning data science but just want to run your own large language model on your own private network, with minimal tinkering, Open WebUI provides an entirely self-hosted AI interface that works seamlessly with Ollama, and plenty of other LLM API services (including OpenAI’s ChatGPT).

Similarly to Ollama, the easiest way to run Open WebUI is through Docker. Once it is running, you can see the local address on your home network, and it looks and functions very similarly to OpenAI’s ChatGPT service. You even have the choice of uploading your own documents and running prompts against the text inside them.

A healthy community of developers is constantly updating functionality and features in this software in the open source community. This means you are free to download, use, and contribute as much as you like, for free. 

Single board computer

Any new modern computer can be used to run a Large Language Model, though these models run in different sizes and the computer you have may only be able to run a smaller sized one.

Also Read: Why building user communities is far better than paid advertising

The top three things that will influence how well a system fits into your homelab are the following:

  • How much power does the computer consume? If you run a powerful computer running a 800+ watt power supply, be prepared for equally large sized power bills. There’s a reason many AI companies are looking into using Nuclear Power – these computers are typically very hungry for electricity and this can translate to high operating costs. Keep this in mind when you are weighing pros and cons for a big system.
  • How much RAM does the computer have? Even the lowest end LLMs require at least 8GB of RAM. Some can operate with 4GB but performance will be very poor. Ideally, a system should have a lot of RAM, with 8GB minimal and 16GB substantially better. Even more will allow access to larger models. 
  • Some kind of acceleration helps. This could be a GPU, NPU, or TPU. Though, to keep things simple, the best option is to find the fastest CPU within your Power (see 1.) and financial budget. In my experience, configuring machine learning algorithms to fully take advantage of acceleration is a very technical topic outside the scope of what is defined here. But if you like to spend time “tuning” your hardware to run as fast as possible, this could be a great project you can sink many hours into.

Conclusion 

Though, no matter which direction you eventually take, many options are available to customise your homelab with an increasing number of consumer centric devices. The Raspberry Pi is one of the most popular computers for homelab enthusiasts, with a low cost, low wattage, and 8GB options. The Jetson Orin is a GPU enabled single board computer, also with 8GB options though more expensive. The RapidAnalysis Darius is a low cost, low wattage Intel-based single board computer which also has an 8GB option. 

The cheapest and most accessible option is the computer you have with you at home right now. Though, most people will not want to run memory-hungry software continuously on a machine they are doing serious work on. Much like getting on a crowded runway, applications fighting for takeoff on a PC that sits right next to you, whirring its fans like a jetliner, can become annoying quickly. But there is another option. 

With so many computers heading for the junkyard daily, a little time in the “lab” can resurrect old machines into new workstations. Often, computers that struggle with Microsoft Windows are perfectly capable at running a single application in a cluster of homelab Docker containers.

For example, you can run Ollama on one e-waste machine, OpenWebUI on another separate e-waste machine, and Jupyter Notebook on a third e-waste machine, for a fully integrated homelab server cluster, and access them via a web interface locally. If you have the space, time, and patience (much like I did as a young engineer) you could slowly assemble a capable homelab using e-waste and commercially expired parts. 

Editor’s note: e27 aims to foster thought leadership by publishing views from the community. Share your opinion by submitting an article, video, podcast, or infographic.

Join us on InstagramFacebookX, and LinkedIn to stay connected.

Image credit: Canva Pro

The post The rise of homelabs: Running your own AI server at home appeared first on e27.