Self Hosting is a rabbit hole
My Infrastructure#
Today’s post is going to be about my infrastructure and what I self host. And by doing that I’m following the prophecy:
Why do I self host?#
Self hosting is fun by itself, but for me it’s more than just that. As a productivity tool, self hosting allows me to freely use technology to automate my boring daily tasks (ever heard of accounting?). It also gives me complete control over my entire stack, which is very rewarding in the age of enshittification.
Self hosting started as a hobby, but as time went on I realised how useful it is. Maybe this post can inspire you to do the same, or maybe you’ll spot things that I haven’t done yet and you can let me know in the comments, or reach out to me at @yerik@social.yrk06.dev.
The Box(es) under the desk#
First, these are the devices in my setup:
- Main Desktop
- Homelab
- MacBook Pro
- Cloud VPS
- iPad Air
You may be looking at this and thinking “Why 3 different computers and an iPad?” and the answer is that they all serve different purposes. My main desktop is my workstation: it runs Windows and it’s the main device I use on a daily basis. It’s also my gaming computer. The MacBook Pro is my portable device. I can take it to classes or on trips, and since I work from home, I can work on the MacBook when I’m outside my house. The iPad is in a weird spot: it’s the middle point between a phone and a laptop, so it ended up becoming an entertainment and drawing device while I’m home. The biggest strength of the iPad is its form factor. I can take it anywhere and I can use it to remotely access all my other computers.
My homelab is the star of the show (it runs Arch Linux btw). I bought it as a prebuilt entry level gaming PC, then upgraded it to 64GB of RAM, and it runs most of my self hosted software. It also has an xrdp server which allows me to connect to graphical sessions remotely, so for example I can use my iPad as a dumb terminal for a graphical session.
But now, how do I connect all these devices, and what is the Cloud VPS for?
Networking#
When self hosting, we’d generally like to be able to access our services from outside our LAN (or house). One way to achieve this is port forwarding, but that opens up an attack vector on your home LAN.
Port forwarding is a router setting that sends all inbound traffic on a specific port (for example 443 for HTTPS) to a specific internal IP.
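As a rough mental model, the forwarding table on a router can be sketched as a lookup from external port to internal address. This is only an illustration: the IP addresses and ports below are made up, and a real router does this at the packet level, not in Python.

```python
from typing import NamedTuple, Optional


class Target(NamedTuple):
    ip: str
    port: int


# Hypothetical forwarding rules: external port -> internal host and port.
FORWARDING_RULES = {
    443: Target("192.168.1.50", 443),      # HTTPS to a made-up home server
    25565: Target("192.168.1.50", 25565),  # a made-up game server
}


def forward(external_port: int) -> Optional[Target]:
    """Return the internal target for an inbound connection, or None if no rule matches."""
    return FORWARDING_RULES.get(external_port)
```

Every entry in that table is a door from the open internet straight into the LAN, which is why I’d rather not have any.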
The best way to have all these devices connected is to create a virtual LAN between them, or more accurately: a Virtual Private Network. While I suspect most people reading this blog will know what a VPN is, VPN advertisements have been all over the place, so I’d like to make a quick clarification:
VPNs are virtual networks that connect devices which aren’t in the same physical LAN. The commercially advertised VPNs usually offer tunnelling, which means sending all your traffic privately to an exit node, which then forwards it to the open internet.
Usually the best choice for a VPN is WireGuard, but I decided to go with Tailscale instead (which uses WireGuard under the hood). The only downside of Tailscale for my use is that they host the central server that is used to connect your devices. This isn’t a problem by itself, but since this is a critical part of the infrastructure I wanted to have full control over it. Besides that, Tailscale is commercial software and I don’t think it’s worth paying for in my case right now.
I have no problem with paid software, even with the subscription model. But the cost of maintaining it myself is currently less than what Tailscale costs.
I found Headscale, which is an open source implementation of the Tailscale central server. With it I could have my own VPN using Tailscale. Being a central server also requires being exposed to the open internet. Since I’m not doing port forwarding, the next option was hosting it on a VPS, which costs me about $6/month.
I already had the VPS for other things, which I’ll go into in detail in the following sections.
With Headscale deployed on Docker I can connect all my devices and rely only on one central server which I also control. So how are my devices supposed to find this central server?
Central Server Networking#
In order to reach the central server from anywhere, I have a subdomain pointed to the server where it runs. This server also acts as the public gateway for the services in the entire infra. Since each service runs on its own subdomain, I need a reverse proxy to handle the domain based routing, and I went with Caddy.
Caddy is an excellent piece of software. I can point a wildcard subdomain to my central server and then add the service subdomains as required in Caddy. It takes care of certificates, and it means I only need one interface between the public internet and my infra (no need to mess with port forwarding, and fewer attack vectors).
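To make “domain based routing” a bit more concrete, here is a minimal sketch of the idea in Python. It is not how Caddy is implemented and not my actual config: the subdomains, ports and the VPN address are all hypothetical. The proxy looks at the Host header of each request and picks the upstream service that matches.

```python
from typing import Optional

# Hypothetical subdomain -> upstream map. An upstream can live on the same
# machine or on any other device reachable through the VPN.
UPSTREAMS = {
    "git.example.com": "127.0.0.1:3000",
    "tasks.example.com": "127.0.0.1:3456",
    "social.example.com": "100.64.0.2:8080",  # made-up VPN address
}


def pick_upstream(host_header: str) -> Optional[str]:
    """Return the upstream for a request's Host header, or None if unknown."""
    # Drop an optional ":port" suffix and normalise case before the lookup.
    host = host_header.split(":")[0].strip().lower()
    return UPSTREAMS.get(host)
```

The nice part of this design is that adding a new service is just adding one more entry, while the only thing exposed to the internet is still the proxy itself.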
With Caddy in place I can:
- Connect all my devices in Tailscale using Headscale as the login server
- Access all my services through subdomains anywhere on the internet
- Deploy services on any device in my VPN and make them accessible anywhere
On my other devices I just log in to Tailscale and select my login server.
Before hosting Headscale I used Tailscale and this VPS to bypass my university firewall.
Services#
Now that I have the network all wired up and ready to go, what are the services I actually self host? While most services are hosted on my homelab, I still have some hosted on my VPS because they are simple enough to run on the basic DigitalOcean droplet and moving them somewhere else would add some unnecessary latency. All of these run in Docker except when they don’t.
Here are the VPS services:
Caddy#
I know I’ve talked about Caddy already, but this blog and some other static sites are all hosted through Caddy. Just scp or rsync the files to the correct directory and it’s deployed and ready to be served. Caddy runs directly on the OS without Docker. It’s just easier that way since it has to communicate with all sorts of different services.
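If you want to script that deploy step, a small wrapper around rsync could look like the sketch below. The host name and paths are placeholders, not my real ones, and I’m assuming the site was already built into a local folder (Hugo writes to public by default).

```python
import subprocess
from typing import List


def build_rsync_command(local_dir: str, remote: str, remote_dir: str) -> List[str]:
    """Build the rsync command that mirrors local_dir into remote_dir."""
    # A trailing slash on the source copies the folder's contents,
    # not the folder itself.
    source = local_dir.rstrip("/") + "/"
    # -a keeps permissions and timestamps, -z compresses in transit,
    # --delete removes remote files that no longer exist locally.
    return ["rsync", "-az", "--delete", source, f"{remote}:{remote_dir}"]


def deploy(local_dir: str, remote: str, remote_dir: str) -> None:
    """Run the sync and raise if rsync exits with an error."""
    subprocess.run(build_rsync_command(local_dir, remote, remote_dir), check=True)
```

Something like deploy("public", "user@vps.example.com", "/var/www/blog") would then push the site. Be careful with --delete: pointed at the wrong directory it will happily wipe it.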
This blog uses Hugo to transform the markdown content into posts, but my other static websites are all made with Bootstrap Studio. Bootstrap Studio is one of the best tools I’ve ever paid for. It’s a WYSIWYG HTML editor that is very tightly coupled with Bootstrap, so I wouldn’t recommend it if you prefer other CSS frameworks. That being said, it supports exporting the static website or publishing through SCP. It’s pretty neat and it has a Lifetime License for $60.
Gitea#
For those looking to have control over their git repositories, Gitea is a good option. I’ve been using it to host source code for some of my private projects. Since I’m working alone I can’t say much about features other than pull requests, because I don’t use anything else. I think it’s a bit fiddly to set up SSH (I don’t do git over HTTPS) because Gitea is running inside a Docker container and I don’t know that much about the SSH authentication process and how it talks to the container. This SSH passthrough is the only reason Gitea is still running on the VPS and not on the homelab. If anyone knows how to set up SSH passthrough to another server entirely, let me know.
Vikunja#
Vikunja is card management software. I can create Kanban boards for my projects, track deadlines and all that other jazz related to project management. I wish it had better support for integrating with outside services (tracking Gitea PR status, branches and commits), but once again I’m not a power user and it fits my needs very well. Now why is it running on the VPS and not on the homelab? When I started using it I didn’t have my homelab, and I didn’t care to migrate.
If you are using your services and not thinking too much about infra, it means you’re doing it right. This is why I prefer simpler methods: they let me focus on the actual services and on using them instead of having to fix infrastructure problems. A more complex infrastructure can still be good if you are studying how things scale in a safe environment.
Here are the services running on the Homelab server:
GoToSocial#
The latest addition to my stack was GoToSocial. I have an entire rant about why I moved to the fediverse and to blogging, but in short GoToSocial is my own fediverse/mastodon instance, and by automatically publishing my blog posts there it serves as a comment platform too. Why use this instead of a dedicated comment platform? Because this allows anyone on the entire fediverse to comment without having to create a specific The Keypress blog account. That’s the strength of the fediverse. I also host my personal fediverse account there, which as of now has 1 toot.
n8n#
The other day I found out n8n is trending among some non technical folk (and tech bros and AI bros), so I decided to take a look and it’s quite neat. I’ll be honest, I’ve found only one use case for it so far, but I’m certain I’ll be able to optimize some other stuff with it in the future. Also, n8n is commercial software, but you can self host the community version for free, whatever that means. A big positive for n8n is that it has a ton of integrations out of the box, which lets me write only small amounts of code to automate tasks, and for software it has no integration for it provides HTTP nodes to call their APIs if available. I’ll go into more detail about what I automate with n8n after finishing the other services.
Actual Budget#
Before using Actual I had a Google Docs spreadsheet for keeping track of all my expenses, and while it’s a good enough method (or at least it was before Gemini anyway), it became too cumbersome and I stopped taking care of it, which had me going over budget for a few months in a row. I had to write all the formulas by hand and I’m no accountant. I’d have conflicting data and numbers that I wasn’t sure exactly what they meant. After some time I realized that someone must have made something better by that point.
Actual Budget delivers on that promise (and it’s the only service so far I had to write 2 paragraphs about). It made my expense tracking and budgeting way, way better with automatic graphs and expense tracking based on accounts (trust me, I have a lot), and my experience working on financial systems really came in handy to set it up just the way I wanted. Of course, after some months it started becoming cumbersome to add new transactions and I looked up ways to automate it. They have an API but…
Actual HTTP API#
Actual’s API is provided through a JavaScript module. It doesn’t provide HTTP endpoints.
One of those pedantic moments I find fun in the world of software: when I see an API in the wild, 90% of the time it means an HTTP API, and I’ve noticed people around me use API and HTTP API interchangeably. While there’s nothing wrong with it, Actual Budget was the one time someone had to write a note to clarify the difference between an API and an HTTP API, and I think that is worth a note here too.
This Docker container is a translator of sorts that exposes Actual Budget’s API through an HTTP interface. This service specifically is not exposed to the internet, only to my VPN. Currently I have only one use for it (and it’s related to n8n, can you guess it before I say?).
Ollama#
There are some really powerful people claiming that GenAI (LLMs more specifically) is the future of tech, and while I think that’s mostly bullshit, I found some use cases where LLMs can help me daily. If you really hate LLMs, maybe consider looking into Ollama. It’s a good way to use LLMs and possibly make tech bros mad that you’re not using their service? Eventually I’ll write a blog post about my thoughts on GenAI, but for now the summarized version is: I think it can be useful if used properly. With Ollama I can use open source models which run 100% on my CPU (take that, overpriced Nvidia GPUs) and RAM (that’s why I bought 64GB), and most of the energy in my area comes from hydroelectric sources, which are clean (take that, tech bros). What do I use GenAI for? Rubber duck debugging in projects where I don’t have any experience or the codebase is a mess, converting natural language into structured data (foreshadowing the n8n and budget integration) and basically that’s it. Oh, I almost forgot: another positive point for Ollama is that all your data stays local and it’s not used to improve the model, so you won’t get anyone sniffing on your chats.
GenAI models use a ton of resources and they tend to be slow on CPU+RAM, but there are some nice models optimized to run on a single H100 that have OK performance in the CPU only scenario. And a reminder that I’m running this on an entry level gaming PC, so your mileage will vary.
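Because Ollama serves a local HTTP API (by default on port 11434), talking to a model from a script is just a POST request. The sketch below uses only the Python standard library and assumes Ollama is running on the same machine with a model already pulled. The model name you pass in is up to you, I’m not showing my own here.

```python
import json
import urllib.request

# Ollama's default local address; adjust it if yours listens elsewhere.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_generate_request(model: str, prompt: str) -> urllib.request.Request:
    """Build a non-streaming generate request for a local Ollama server."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(model: str, prompt: str) -> str:
    """Send the prompt and return the generated text (needs a running server)."""
    with urllib.request.urlopen(build_generate_request(model, prompt)) as resp:
        return json.loads(resp.read())["response"]
```

With "stream" set to False the server answers with a single JSON object instead of a stream of chunks, which is easier to handle in small scripts. On CPU only, expect to wait a bit for that answer.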
Open Web UI#
I don’t find this name too descriptive, but Open Web UI is a web client for Ollama. It has support for saving chats, authentication with multiple users, knowledge bases and other things you might want from a “self hosted ChatGPT”. Some time after I started using it, I almost forgot I was using my own models and not ChatGPT. I think it’s really useful if you like LLMs but don’t want to support OpenAI and their shady practices.
Jenkins#
Ever since I started in the world of DevOps I’ve been a fan of Jenkins. I think it’s really easy to set up and to automate DevOps tasks with, mainly building and deploying. It’s a bit tricky to set up some SSH related parts of Jenkins (I’m looking at you, known_hosts) and it’s also a bit hacky to build Docker containers inside Jenkins inside a Docker container, but overall it works really well.
I’ve heard people tell me that Jenkins is overkill for simple deployments, and I’d like to hear your thoughts about it and what other DevOps tools you use. Leave them in the comments.
Octoprint#
Octoprint is a must have if you own a 3D printer. You don’t need to use a Raspberry Pi, just spin up a Docker container and pass through the printer’s USB. Octoprint is great for monitoring and managing your 3D printer. It integrates with PrusaSlicer, allowing me to slice and start printing a model in minutes. Just remember that Octoprint cannot put out fires for you, so don’t leave your printer running unattended.
Jupyter Notebook#
I’m studying for a Master’s degree in AI, which makes Python notebooks essential. Currently the standard is Google Colab, which is fine, but by using a self hosted Jupyter Notebook I have my entire homelab’s processing power at my disposal and no arbitrary limits set by Google. Also, if you are doing Machine Learning tasks with sensitive information (like we usually are), this removes the need to upload the dataset to Google Drive (and you aren’t limited by Google Drive’s size).
Bonus#
Minecraft#
As a bonus, whenever my group of friends gets hit by that 2 week Minecraft phase, I have 2 containers ready to be deployed
- itzg/minecraft-server: Docker deployable Minecraft server with support for mods and very easy to use
- Infrared: A reverse proxy for Minecraft that I can deploy on my VPS to open my server to my friends. Just remember to use some form of authentication or black/white lists to limit who can access it
My single n8n workflow#
As you might have guessed by now, the single workflow I’m using n8n for is to convert natural language text into transactions on my budget automatically. But there’s more to it: I have a Siri shortcut which turns what I say into text, then sends that text through a webhook to n8n, which uses my Ollama models to extract the important information from the text into structured JSON and then sends it through the HTTP API wrapper to my budget service. These are the sorts of things I love in tech. I have a bunch of parts which do simple things on their own, but I can orchestrate them to produce complex behaviors and automate boring activities in my day to day life.
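The fragile part of a pipeline like this is trusting whatever the model returns, so it’s worth validating the JSON before it gets anywhere near the budget. Here is a minimal sketch of that step in Python. My real workflow lives in n8n nodes, so treat this as an illustration only: the field names (payee, amount, account) and the choice to store amounts as integer cents are assumptions I made for the example.

```python
import json
from dataclasses import dataclass


@dataclass
class Transaction:
    payee: str
    amount_cents: int  # integer cents avoid float rounding surprises
    account: str


def parse_transaction(llm_output: str) -> Transaction:
    """Turn the model's JSON answer into a validated Transaction."""
    data = json.loads(llm_output)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    # Hypothetical required fields for this example.
    missing = {"payee", "amount", "account"} - data.keys()
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")
    return Transaction(
        payee=str(data["payee"]).strip(),
        amount_cents=round(float(data["amount"]) * 100),
        account=str(data["account"]).strip(),
    )
```

If the model hallucinates a field or answers with prose instead of JSON, this fails loudly instead of quietly creating a broken transaction.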
Final Note#
Self Hosting is a way to free yourself from the grasp of the Big Tech companies. It’s also a fun way to tinker with computers, experiment with software and improve your daily activities. You don’t need a lot to start: if you have an old computer, get a copy of some lightweight Linux distro and give it a go.
We all can make great things with technology, and maybe by starting with improving our lives we can improve the lives of many others.
Song Suggestion#
This one is great and I’m currently listening to it on vinyl while finishing this blog post