This is the breakout year for Generative AI!
Yes, this article’s been written by a violinist and there’s new content! Spectacular new and awesome content!
Well; to say the very least, this year, I’ve been spoiled for choice as to how to run an LLM Model locally.
Let’s start!
Hugging Face Transformers is a state-of-the-art machine learning library that provides easy access to a wide range of pre-trained models for Natural Language Processing (NLP), Computer Vision, Audio tasks, and more. It’s an open-source library developed by Hugging Face, a company that has built a strong community around machine learning and NLP.
Here are some key features of Hugging Face Transformers:
In summary, Hugging Face Transformers is a powerful tool that makes the complexities of language technology and machine learning accessible to everyone, from beginners to experts.
GPT4All is an open-source ecosystem developed by Nomic AI that allows you to run powerful and customized large language models (LLMs) locally on consumer-grade CPUs and any GPU. It aims to be the best instruction-tuned assistant-style language model that any person or enterprise can freely use, distribute, and build on.
Here are some key features of GPT4All:
In summary, GPT4All is a tool that democratizes access to AI resources by allowing anyone to run large language models locally on their own hardware.
The website is (unsurprisingly)
https://gpt4all.io
Like all the LLMs on this list (when configured correctly), gpt4all does not require Internet or a GPU.
Ollama is an open-source command line tool that allows you to run, create, and share large language models on your computer. It’s designed to run open-source large language models locally on your machine2. Ollama supports various models such as Llama 3, Mistral, Gemma, and others.
It simplifies the process by bundling model weights, configuration, and data into a single package defined by a Modelfile. This means you can run large language models, such as Llama 2 and Code Llama, without any registration or waiting list.
Ollama is available for macOS, Linux, and Windows. It started by supporting Llama, then expanded its model library to include models like Mistral and Phi-25. So, if you’re interested in running large language models on your local machine, Ollama could be a great tool to consider!
Ollama’s multimodal capabilities allow it to process both text and image inputs together. This means you can ask the model questions or give it prompts that involve both text and images, and it will generate responses based on both types of input.
To use this feature, you simply type your prompt and then drag and drop an image. There is a new images
parameter for both Ollama’s Generate API & Chat API. The images parameter takes a list of base64 encoded PNG or JPEG format images. Ollama supports image sizes up to 100MB.
For example, you can run a multimodal model like LLaVA by typing ollama run llava
in the terminal. In the background, Ollama will download the LLaVA 7B model and run it. If you want to use a different parameter size, you can try the 13B model using ollama run llava:13b
.
More multimodal models are becoming available, such as BakLLaVA 7B. You can run it by typing ollama run bakllava
in the terminal.
This multimodal capability makes Ollama a powerful tool for generating creative and contextually relevant responses based on a combination of text and image inputs. It’s a great way to leverage the power of large language models in a more interactive and engaging way.
I find that this is the most convenient way of all. The full explanation is given on the link below:
localllm
is a tool developed by Google Cloud Platform that allows you to run Large Language Models (LLMs) locally on Cloud Workstations. It’s particularly useful for developers who want to leverage the power of LLMs without the constraints of GPU availability.
Here are some key points about localllm
:
In essence, localllm
is a game-changer for developers seeking to leverage LLMs in their applications, offering a flexible and efficient approach to application development.
Meta’s Llama 3, often referred to as Llama 3, is the latest iteration of Meta’s Large Language Model (LLM). It’s a powerful AI tool that has made significant strides in the field of Natural Language Processing (NLP) and Machine Learning (ML).
Llama 3 is an open-source large language model designed for developers, researchers, and businesses to build, experiment, and responsibly scale their generative AI ideas. It’s part of a foundational system and serves as a bedrock for innovation in the global community.
This model is available in both 8B and 70B pretrained and instruction-tuned versions to support a wide range of applications. It excels at understanding language nuances and performing complex tasks like translation and dialogue generation.
Llama 3 has been trained on over 15 trillion tokens of data, a training dataset 7 times larger than that used for Llama 2, including 4 times more code. This results in the most capable Llama model yet, which supports an 8K context length that doubles the capacity of Llama 2.
With the release of Llama 3, Meta has updated the Responsible Use Guide (RUG) to provide the most comprehensive information on responsible development with LLMs. Their system-centric approach includes updates to their trust and safety tools with Llama Guard 2, optimized to support the newly announced taxonomy published by MLCommons expanding its coverage to a more comprehensive set of safety categories, Code Shield, and Cybersec Eval 2.
In summary, Llama 3 represents a significant advancement in AI technology, offering enhanced capabilities and improved performance for a broad range of applications.
Link: https://huggingface.co/meta-llama
The llm
Python script from PyPI is a command-line utility and Python library for interacting with Large Language Models (LLMs), including OpenAI, PaLM, and local models installed on your own machine.
Here’s a brief overview of its functionalities:
pip install llm
or using Homebrew with the command brew install llm
.llm-gpt4all
plugin.llm "Five cute names for a pet penguin"
.llm chat
command.-s/--system
option to set a system prompt, providing instructions for processing other input to the tool.
A llamafile is an executable Large Language Model (LLM) that you can run on your own computer. It contains the weights for a given open LLM, as well as everything needed to actually run that model on your computer.
The goal of a llamafile is to make open LLMs much more accessible to both developers and end users. It combines the model with a framework into one single-file executable (called a “llamafile”) that runs locally on most computers, with no installation1. This means all the operations happen locally and no data ever leaves your computer.
For example, you can download an example llamafile for the LLaVA model, which is a new LLM that can do more than just chat; you can also upload images and ask it questions about them.
In addition to hosting a web UI chat server, when a llamafile is started, it also provides an OpenAI API compatible chat completions endpoint. This is designed to support the most common OpenAI API use cases, in a way that runs entirely locally.
This initiative is part of an effort to ensure that AI remains free and open, and to increase both trust and safety. It aims to lower barriers to entry, increase user choice, and put the technology directly into the hands of the people.
ChatGPT-Next-Web, also known as NextChat, is an open-source application that provides a user interface for interacting with advanced language models like GPT-3.5, GPT-4, and Gemini-Pro. It’s built on Next.js, a popular JavaScript framework.
Here are some key features of ChatGPT-Next-Web:
Deployment: It can be deployed for free with one-click on Vercel in under 1 minute.
It’s a versatile tool that allows users to interact with AI models in a more personalized and controlled manner. It’s being used by developers and AI enthusiasts around the world to create and deploy their own ChatGPT applications.
Website: https://github.com/ChatGPTNextWeb/ChatGPT-Next-Web
PrivateGPT is an open-source, production-ready AI project that allows you to interact with your documents using the power of Large Language Models (LLMs), 100% privately.
Here are some key features of PrivateGPT:
PrivateGPT is a versatile tool that allows users to interact with AI models in a more personalized and controlled manner. It’s being used by developers and AI enthusiasts around the world to create and deploy their own AI applications.
Text-Generation-WebUI, also known as TGW or “oobabooga”, is an open-source project that provides a Gradio-based web interface for interacting with Large Language Models.
Here are some key features of Text-Generation-WebUI:
H2O.ai is a company that provides AI solutions and is known for democratizing AI. They offer a range of products and services, including:
H2O Open Source: This is a fully open source, distributed in-memory machine learning platform with linear scalability. It supports the most widely used statistical & machine learning algorithms including gradient boosted machines, generalized linear models, deep learning and more. It also has an industry leading AutoML functionality that automatically runs through all the algorithms and their hyperparameters to produce a leaderboard of the best models.
H2O.ai Cloud Platform: This is a cloud-based platform for creating and deploying generative AI solutions with customizable large language models (LLMs). It offers features like information retrieval on internal data, data extraction, summarization, and other batch processing tasks, as well as code/SQL generation for data analysis. It also provides a customizable evaluation and validation framework that is model agnostic.
Enterprise Support: When AI becomes mission critical for enterprise success, H2O.ai provides the services you need to optimize your investments in people and technology to deliver on your AI vision.
PrivateGPT: This is a product that allows you to interact with your documents using the power of Large Language Models (LLMs), 100% privately. It provides an API offering all the primitives required to build private, context-aware AI applications.
H2O.ai is used by over 18,000 organizations globally and is extremely popular in both the R & Python communities. It’s a powerful tool for those who want to run AI models locally without the need for internet connectivity or expensive hardware.
LightLLM is a Python-based Large Language Model (LLM) inference and serving framework. It’s known for its lightweight design, easy scalability, and high-speed performance.
Here are some key features of LightLLM:
LightLLM supports a wide range of models including BLOOM, LLaMA, LLaMA V2, StarCoder, Qwen-7b, ChatGLM2-6b, Baichuan-7b, Baichuan2-7b, Baichuan2-13b, Baichuan-13b, InternLM-7b, Yi-34b, Qwen-VL, Qwen-VL-Chat, Llava-7b, Llava-13b, Mixtral, Stablelm, MiniCPM.
GPT Academic, also known as gpt_academic, is an open-source project that provides a practical interaction interface for Large Language Models (LLMs) like GPT and GLM. It’s particularly optimized for academic reading, writing, and code explanation.
Here are some key features of GPT Academic:
Now, let’s take a moment to appreciate the sheer genius of these LLMs. They’re like the Swiss Army knives of the AI world - versatile, handy, and always ready to impress with their multitude of uses. And the best part? They’re local! That’s right, no need to venture into the vast and sometimes scary world of the internet. These models are right at home on your local machine, ready to serve at a moment’s notice.
But let’s not forget the real heroes of our story - the programmers. Those tireless warriors of the digital realm, armed with nothing but their keyboards and an unquenchable thirst for knowledge. They’re the ones who’ve tamed these wild beasts of AI, turning lines of incomprehensible code into something we can all use and appreciate. So here’s to you, dear programmers. May your coffee be strong, your bugs few, and your StackOverflow answers always upvoted.
So, here’s to Local LLMs - the truly unsung heroes of the AI world. May they continue to inspire, amaze, and occasionally confuse us for many years to come. And remember, in the world of AI, the only limit is your imagination. So dream big, code hard, and don’t forget to laugh along the way. After all, as they say in the world of programming, “Laughter is the best exception handler.”
Cheers!
All Images Generated With Bing Image Creator. It’s Awesome! (DALL-E-3)