Lately there has been a need for a private chatbot service as a complete alternative to OpenAI's ChatGPT. So I decided to implement one at home and make it accessible to everyone in my household, alongside my network printer and NAS (OpenMediaVault).

In the past, I used to recommend the Llama series for English tasks and the Qwen series for Chinese tasks; no open-source model was strong enough at multilingual tasks compared to proprietary ones (GPT/Claude). However, as we all know, things have changed recently. I had been using DeepSeek-V2 occasionally whenever I got tired of Qwen2.5, and I fell behind on DeepSeek V2.5 and V3 for lack of hardware. But DeepSeek didn't let me down: R1 performs impressively and comes in sizes as small as 1.5B! This means we can run it even on a CPU with a reasonable user experience, and since many people have GPUs for gaming, speed is not an issue. Letting local LLMs process uploaded documents and images is a big advantage, since OpenAI limits this usage for free accounts.

Installing Open WebUI with bundled Ollama support is very easy with the official one-line command:

```bash
docker run -d -p 3000:8080 --gpus=all -v ollama:/root/.ollama -v open-webui:/app/backend/data --name open-webui --restart always ghcr.io/open-webui/open-webui:ollama
```

But getting RAG (web search) working is not easy for most people, so I wanted an out-of-the-box solution. As I mentioned in my last post, harbor is a great testbed for experimenting with different LLM stacks. But it is not only great for that; it's also an all-in-one solution for self-hosting local LLMs with RAG working out of the box. So let's implement it from scratch, and feel free to skip steps, since most people don't start from OS installation.
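(As an aside: if you stop at the bundled one-liner above, you can already pull and test the smallest R1 distill from inside that container. A minimal sketch, assuming the `:ollama` image exposes the `ollama` CLI; `deepseek-r1:1.5b` is the tag on Ollama's registry.)

```bash
# Assumes the bundled :ollama image ships the ollama binary inside the container.
docker exec -it open-webui ollama pull deepseek-r1:1.5b
# One-shot prompt to confirm it generates tokens (works on CPU too, just slower).
docker exec -it open-webui ollama run deepseek-r1:1.5b "Hello!"
```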
## System Preparation

(Optional) As before, go through the install process using `debian-11.6.0-amd64-netinst.iso`.

Add your user to sudoers with `usermod -aG sudo username`, then reboot.

(Optional) Add extra swap:

```bash
fallocate -l 64G /home/swapfile
chmod 600 /home/swapfile
mkswap /home/swapfile
swapon /home/swapfile
```

and make the swapfile persistent with `nano /etc/fstab`:

```
UUID=xxxxx-xxx swap swap defaults,pri=100 0 0
/home/swapfile swap swap defaults,pri=10 0 0
```

Check with `swapon --show` or `free -h`.

Disable the Nouveau driver:

```bash
bash -c "echo blacklist nouveau > /etc/modprobe.d/blacklist-nvidia-nouveau.conf"
bash -c "echo options nouveau modeset=0 >> /etc/modprobe.d/blacklist-nvidia-nouveau.conf"
update-initramfs -u
update-grub
reboot
```

Install dependencies:

```bash
apt install linux-headers-`uname -r` build-essential libglu1-mesa-dev libx11-dev libxi-dev libxmu-dev gcc software-properties-common sudo git python3 python3-venv pip libgl1 git-lfs -y
```

(Optional) Perform an uninstall first if needed:

```bash
apt-get purge nvidia*
apt remove nvidia*
apt-get purge cuda*
apt remove cuda*
rm /etc/apt/sources.list.d/cuda*
apt-get autoremove && apt-get autoclean
rm -rf /usr/local/cuda*
```

Install the CUDA toolkit and drivers:

```bash
wget https://developer.download.nvidia.com/compute/cuda/12.4.1/local_installers/cuda-repo-debian11-12-4-local_12.4.1-550.54.15-1_amd64.deb
sudo dpkg -i cuda-repo-debian11-12-4-local_12.4.1-550.54.15-1_amd64.deb
sudo cp /var/cuda-repo-debian11-12-4-local/cuda-*-keyring.gpg /usr/share/keyrings/
sudo add-apt-repository contrib
sudo apt-get update
sudo apt-get -y install cuda-toolkit-12-4
sudo apt install libxnvctrl0=550.54.15-1
sudo apt-get install -y cuda-drivers
```

Install the NVIDIA Container Toolkit, since harbor is docker-based:

```bash
curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg \
  && curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list | \
    sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
    sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list
```

Then `sudo apt-get update` and `sudo apt-get install -y nvidia-container-toolkit`.

Perform the CUDA post-install actions in `nano ~/.bashrc`:

```bash
export PATH=/usr/local/cuda-12.4/bin${PATH:+:${PATH}}
export LD_LIBRARY_PATH=/usr/local/cuda-12.4/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
```

Then run `sudo update-initramfs -u`, `ldconfig`, or `source ~/.bashrc` to apply the changes. After a reboot, confirm with `nvidia-smi` and `nvcc --version`.

(Optional, not needed for harbor) Install Miniconda:

```bash
wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh && sudo chmod +x Miniconda3-latest-Linux-x86_64.sh && bash Miniconda3-latest-Linux-x86_64.sh
```

## Docker & Harbor

Install docker:

```bash
# Add Docker's official GPG key:
sudo apt-get update
sudo apt-get install ca-certificates curl
sudo install -m 0755 -d /etc/apt/keyrings
sudo curl -fsSL https://download.docker.com/linux/debian/gpg -o /etc/apt/keyrings/docker.asc
sudo chmod a+r /etc/apt/keyrings/docker.asc

# Add the repository to Apt sources:
echo \
  "deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.asc] https://download.docker.com/linux/debian \
  $(. /etc/os-release && echo "$VERSION_CODENAME") stable" | \
  sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
sudo apt-get update
sudo apt-get install docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin
```
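With Docker in place, NVIDIA's current documentation also recommends registering the Container Toolkit as a Docker runtime, and a throwaway CUDA container makes a good smoke test before bringing up harbor. A minimal sketch (the test image tag is only an example):

```bash
# Register the NVIDIA runtime with Docker (per NVIDIA's container-toolkit docs), then restart the daemon.
sudo nvidia-ctk runtime configure --runtime=docker
sudo systemctl restart docker

# Smoke test: this should print the same table as nvidia-smi on the host.
# The image tag is just an example; any CUDA base image compatible with your driver works.
sudo docker run --rm --gpus all nvidia/cuda:12.4.1-base-ubuntu22.04 nvidia-smi
```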
Perform the post-install steps so docker runs without sudo:

```bash
sudo groupadd docker
sudo usermod -aG docker $USER
newgrp docker
docker run hello-world
```

Manually install harbor:

```bash
git clone https://github.com/av/harbor.git && cd harbor
./harbor.sh ln
```

Verify with `harbor --version`.

Add RAG (web search) support to the defaults with `harbor defaults add searxng`. Use `harbor defaults list` to check; there are now three services active: ollama, webui, searxng.

Run `harbor up` to bring up these services in docker. Use `harbor ps` like `docker ps`, and `harbor logs` to see tailing logs.

Now the open-webui frontend is served at `0.0.0.0:33801` and can be accessed at http://localhost:33801, or by LAN clients via the server's IP address.

Monitor VRAM usage with `watch -n 0.3 nvidia-smi`. Monitor logs with `harbor up ollama --tail` or `harbor logs`. All ollama commands are usable, such as `harbor ollama list`.

It's time to access it from other devices (desktop/mobile) to register an admin account and download models.

## Using Local LLM

After logging in with the admin account, click the top-right avatar icon and open Admin Panel, then Settings, or simply go to `http://ip:33801/admin/settings`. Click Models, and at the top right click Manage Models, which looks like a download button. Put `deepseek-r1` (or any other model) in the textbox below "Pull a model from Ollama.com" and click the download button on the right side.

After the model is downloaded, it may require a refresh; the newly downloaded model will then be selectable in the drop-down menu on the New Chat (home) page.

Now this is not only a chatbot alternative to ChatGPT, but also a fully functional API alternative to the OpenAI API, plus a private search engine alternative to Google!

- webui is accessible within the LAN via: http://ip:33801
- ollama is accessible within the LAN via: http://ip:33821
- searxng is accessible within the LAN via: http://ip:33811

Call the Ollama API from any application with LLM API integration:

- http://ip:33821/api/ps
- http://ip:33821/v1/models
- http://ip:33821/api/generate
- http://ip:33821/v1/chat/completions
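For instance, you can hit both endpoints from any LAN machine with curl (replace `ip` with the server's address; the model must already be pulled, and the payloads follow Ollama's documented native and OpenAI-compatible schemas):

```bash
# Native Ollama endpoint
curl http://ip:33821/api/generate -d '{
  "model": "deepseek-r1",
  "prompt": "Why is the sky blue?",
  "stream": false
}'

# OpenAI-compatible endpoint
curl http://ip:33821/v1/chat/completions -H "Content-Type: application/json" -d '{
  "model": "deepseek-r1",
  "messages": [{"role": "user", "content": "Why is the sky blue?"}]
}'
```

Any application that speaks the OpenAI API can likewise be pointed at `http://ip:33821/v1` as its base URL.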