A List of Projects Software Engineers Should Undertake to Learn More About LLMs

Natural Language Processing (NLP) is a rapidly evolving field, and Large Language Models (LLMs) are playing a significant role in shaping how we interact with human language. However, the creation of LLMs is often considered the domain of machine learning experts, leaving software engineers to take a back seat.

This article challenges this idea and highlights the critical role that software engineers with strong programming skills can play in driving LLMs' growth and innovation, even without in-depth knowledge of machine learning.

The Power of LLMs in NLP

Before we dive into how software engineers can make their mark in LLM development, let's take a moment to appreciate these models' profound impact on NLP. Beyond just deciphering complex language structures, LLMs have simplified intricate processes in various domains. They have empowered chatbots to provide exceptional customer support, revolutionized language translation, automated text summarization, and refined sentiment analysis. In doing so, they've replaced cumbersome pipelines with elegant, data-driven solutions.

Now, let's explore how software engineers can actively contribute:

Creating Pet Projects

You don't need a machine learning PhD to embark on exciting LLM projects. Consider the possibilities of generating personalized workout plans, crafting creative writing prompts, or developing unique conversational agents. The scope is boundless. By initiating and sharing these projects, you not only showcase the versatility of LLMs but also expand the realm of practical applications. These projects serve as a bridge between software engineering and NLP, demonstrating the tangible impact of LLMs.
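As a rough illustration of how small such a pet project can start, here is a minimal sketch of the workout-plan idea. Everything in it is an assumption made for the example: the WorkoutRequest fields, the prompt wording, and the generate callable, which stands in for whatever model API or local runtime you end up using. The fake_generate function only exists so the sketch runs offline.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class WorkoutRequest:
    """User preferences the prompt is built from (hypothetical fields)."""
    goal: str
    days_per_week: int
    minutes_per_session: int
    equipment: list[str]


def build_prompt(req: WorkoutRequest) -> str:
    """Turn structured preferences into a plain-text instruction for a model."""
    equipment = ", ".join(req.equipment) if req.equipment else "no equipment"
    return (
        "You are a fitness coach. Write a weekly workout plan.\n"
        f"Goal: {req.goal}\n"
        f"Days per week: {req.days_per_week}\n"
        f"Minutes per session: {req.minutes_per_session}\n"
        f"Available equipment: {equipment}\n"
        "Return one line per training day."
    )


def make_plan(req: WorkoutRequest, generate: Callable[[str], str]) -> list[str]:
    """Call the model through `generate` and keep the non-empty reply lines."""
    reply = generate(build_prompt(req))
    return [line.strip() for line in reply.splitlines() if line.strip()]


def fake_generate(prompt: str) -> str:
    """Stand-in for a real model call (assumption: any str -> str function works)."""
    return "Day 1: squats\n\nDay 2: push-ups\n"
```

Passing the model call in as a plain function keeps the project testable without network access and makes it easy to swap one model for another later.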

For some inspiration, check out these project ideas:

Elevating Open Source Visualization Tools

Visualization is a pivotal component of the LLM ecosystem. Here, software engineers can contribute to the development and enhancement of open-source visualization tools tailored for LLMs. These tools serve as essential aids in comprehending model behavior and results. By actively participating in the improvement of such visualization tools, frontend software engineers can make LLMs more accessible and user-friendly for a wider audience.

Some ideas for initial contribution:

TensorBoard - a web-based tool for visualizing TensorFlow runs and graphs.
Netron - a viewer for neural network, deep learning, and machine learning models. Netron supports a variety of popular model formats such as TensorFlow, PyTorch, ONNX, Keras, and Caffe2.
Manifold - a model-agnostic visual debugging tool for machine learning. Manifold allows you to explore your model’s behavior by visualizing its decision boundaries in high-dimensional space.

Reimplementation Challenges

Recreating established LLMs from scratch offers a valuable learning journey. While the result may not be intended for production use, the exercise provides invaluable insights into the inner workings of models like GPT and BERT. Through this hands-on approach, software engineers can better understand LLMs and explore their capabilities firsthand.

Explore these reimplementation challenges:

llama.cpp: A C/C++ port of Facebook's LLaMA model that runs on your own laptop with decent performance.
Whisper-Burn: A Rust implementation of OpenAI's Whisper model using the Burn framework.
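To give a feel for what a from-scratch exercise involves, here is a minimal sketch of causal scaled dot-product attention, the core operation in GPT-style decoders, written in plain Python with no dependencies. It is a simplification under stated assumptions: a single head, no learned projection matrices, and no batching. The function names are chosen for this example and do not come from any of the projects listed above.

```python
import math


def softmax(scores: list[float]) -> list[float]:
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def causal_attention(q, k, v):
    """Scaled dot-product attention where position i only sees positions <= i.

    q, k and v are lists of vectors, one vector per token.
    Returns the output vectors and the attention weights per position.
    """
    d = len(q[0])
    width = len(v[0])
    outputs = []
    weights = []
    for i, query in enumerate(q):
        # Score the current query against keys 0..i only (the causal mask).
        scores = [
            sum(a * b for a, b in zip(query, k[j])) / math.sqrt(d)
            for j in range(i + 1)
        ]
        w = softmax(scores)
        weights.append(w)
        # Each output is the weighted sum of the visible value vectors.
        outputs.append(
            [sum(w[j] * v[j][c] for j in range(i + 1)) for c in range(width)]
        )
    return outputs, weights
```

The first token can only attend to itself, so its output equals its own value vector. Checking small invariants like this one is a practical way to validate a reimplementation before comparing it against a reference model.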

Developing Model Converters and Wrappers

Another practical avenue for contribution is developing model converters and wrappers for popular programming languages. These tools simplify the integration of LLMs into various projects, making them more accessible to a wider audience. By creating effective bindings for languages like Go or .NET, software engineers bridge the gap for developers who prefer those languages.

Consider these converter and wrapper projects:

TorchSharp: A .NET library granting access to the core of PyTorch.
go-torch: A library designed primarily for running inference against serialized models from Python versions of PyTorch.
ONNX: An open-source format for AI models spanning deep learning and traditional machine learning.
TensorFlow.js: A library for crafting and training machine learning models in JavaScript, deployable in browsers or on Node.js.
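The core of a converter is often just a well-defined byte layout that another language can read. The sketch below shows the idea with a toy format invented for this example (the TOYW signature and field order are assumptions, not a real standard): named float32 vectors are written in little-endian order, so a reader in Go or .NET could mirror unpack_weights field by field.

```python
import struct

MAGIC = b"TOYW"  # hypothetical 4-byte file signature for this sketch


def pack_weights(tensors: dict[str, list[float]]) -> bytes:
    """Serialize named float32 vectors into one little-endian byte string."""
    chunks = [MAGIC, struct.pack("<I", len(tensors))]
    for name, values in tensors.items():
        encoded = name.encode("utf-8")
        # Per tensor: name length, name bytes, element count, float32 payload.
        chunks.append(struct.pack("<I", len(encoded)))
        chunks.append(encoded)
        chunks.append(struct.pack("<I", len(values)))
        chunks.append(struct.pack(f"<{len(values)}f", *values))
    return b"".join(chunks)


def unpack_weights(blob: bytes) -> dict[str, list[float]]:
    """Read back what pack_weights wrote, walking the same field order."""
    if blob[:4] != MAGIC:
        raise ValueError("not a toy weights file")
    (count,) = struct.unpack_from("<I", blob, 4)
    offset = 8
    tensors = {}
    for _ in range(count):
        (name_len,) = struct.unpack_from("<I", blob, offset)
        offset += 4
        name = blob[offset:offset + name_len].decode("utf-8")
        offset += name_len
        (n,) = struct.unpack_from("<I", blob, offset)
        offset += 4
        tensors[name] = list(struct.unpack_from(f"<{n}f", blob, offset))
        offset += 4 * n
    return tensors
```

Real formats such as ONNX carry far more metadata (shapes, data types, graph structure), but the same discipline applies: fix the byte order, length-prefix every variable-size field, and verify with a round trip.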

Conclusion

Software engineers bring a unique perspective to the development of large language models. While expertise in machine learning is undoubtedly valuable, programming skills and creativity can also drive progress.

By initiating projects, enhancing open-source visualization tools, reimplementing established models, and developing essential converters and wrappers, software engineers actively contribute to the evolution of LLMs. This collaborative synergy, where diverse skill sets converge, fosters inclusivity, accessibility, and innovation within the field.

In essence, whether you're a seasoned software engineer or just embarking on this journey, remember that your role in shaping the future of LLMs is significant. By marrying the power of technology with the ingenuity of software engineering, we collectively redefine the boundaries of what's possible in NLP.