Bias is a problem in large language models (LLMs). But how can bias be solved, or at least reduced? The question, put technically, is what can be done in training base models or fine-tuning assistant models to address bias in LLMs.
AI bias carries disadvantages across several use cases, which suggests there are organizations with an interest in mitigating it. However, is there a place they can go to find technical research paths for addressing bias in AI? A sketch of one such path follows below.
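As one illustration of the kind of technical path such a forum could list, here is a minimal sketch of counterfactual data augmentation for fine-tuning data. It is one documented debiasing approach among several, not a prescription; the swap lexicon and sample text below are illustrative assumptions.

```python
# Minimal sketch of counterfactual data augmentation (CDA), one documented
# technique for reducing social bias in fine-tuning data. The swap lexicon
# and sample text are illustrative assumptions, not a fixed standard.

import re

# Term pairs swapped in both directions; a real effort would use a curated,
# much larger lexicon and handle pronoun-case ambiguity (e.g. her/his/hers).
SWAP_PAIRS = [
    ("he", "she"),
    ("him", "her"),
    ("his", "her"),
    ("man", "woman"),
    ("men", "women"),
    ("father", "mother"),
    ("son", "daughter"),
]

def build_swap_map(pairs):
    """Create a bidirectional lookup so each term maps to its counterpart."""
    swap = {}
    for a, b in pairs:
        swap[a] = b
        swap[b] = a
    return swap

SWAP_MAP = build_swap_map(SWAP_PAIRS)

def counterfactual(text):
    """Return a copy of the text with each listed term replaced by its pair."""
    def replace(match):
        word = match.group(0)
        swapped = SWAP_MAP.get(word.lower(), word)
        # Preserve the capitalization of the original token.
        return swapped.capitalize() if word[0].isupper() else swapped
    return re.sub(r"\b\w+\b", replace, text)

def augment(dataset):
    """Pair every original example with its counterfactual counterpart."""
    return [ex for text in dataset for ex in (text, counterfactual(text))]

if __name__ == "__main__":
    for line in augment(["The doctor said he would call his son."]):
        print(line)
```

The sketch simply pairs each training text with its term-swapped counterpart; the general idea in published work on counterfactual data augmentation is that fine-tuning on such balanced pairs weakens measured associations between demographic terms and particular contexts.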
There is a lot of vacillation around what counts as open source in LLMs: weights, datasets, or code. But even if everything used to train a language model is open source, how much can technically be gleaned from it to solve bias, given that closed-source LLMs have teams with access to everything and are still unable to solve it?
A list of what can be done about bias, technically, available for those interested to pursue, could be a more important open source path for problems in AI than debating whether weights or other components are open source.
The same applies to AI safety, alignment, ethics, governance, and regulation. What is technically doable to make AI less misused and less harmful, now and in the future, could be made available on a forum for those who want options for approaching these problems.
The US has an AI safety institute, which also has an AI safety consortium. The UK has an AI safety institute. The EU has an AI office. Japan and Singapore also have their own AI safety teams. They can host forums for technical approaches as options that other labs may explore, which they may or may not also be exploring themselves. These paths need not be extensively defined, but each must have a technical entry point as well as anchors to some existing approaches.
The objective is to open source possible solutions to AI challenges, with technical paths visible in a public forum or safety marketplace, so to speak, so that those interested in building can see what is available and use it to pursue advances.
There is so much that has to be done about AI, and the few teams at the front also have benchmarks and commercial ends as priorities [OpenAI o1], so safety is sometimes confined to a few of their core approaches. Open sourcing what can be done may redefine the field more than debating which LLM is open source and which is closed source.
There is a recent update from Open Source Initiative, The Open Source AI Definition – draft v. 0.0.9, stating that, "When we refer to a “system,” we are speaking both broadly about a fully functional structure and its discrete structural elements. To be considered Open Source, the requirements are the same, whether applied to a system, a model, weights and parameters, or other structural elements. An Open Source AI is an AI system made available under terms and in a way that grant the freedoms to: Use the system for any purpose and without having to ask for permission. Study how the system works and inspect its components. Modify the system for any purpose, including to change its output. Share the system for others to use with or without modifications, for any purpose. These freedoms apply both to a fully functional system and to discrete elements of a system. A precondition to exercising these freedoms is to have access to the preferred form to make modifications to the system."
There is a recent story from The NYTimes, OpenAI’s Fund-Raising Talks Could Value Company at $150 Billion, stating that, "OpenAI is in talks to raise about $6.5 billion as part of a deal that would value the company in the vicinity of $150 billion, a nearly $70 billion increase from its valuation nine months ago, according to five people with knowledge of the conversations. OpenAI is also working to transform its A.I. technologies into products that pull in revenue. Building this kind of technology requires billions of dollars in raw computing power, and by some estimates the company is burning through $7 billion every year. OpenAI is pulling in more than $2 billion a year through subscriptions to ChatGPT and other technologies, according to a person familiar with its finances."
There is a new story on Reuters, US to convene global AI safety summit in November, stating that, “The Biden administration plans to convene a global safety summit on artificial intelligence, it said on Wednesday, as Congress continues to struggle with regulating the technology. The network members include Australia, Canada, the European Union, France, Japan, Kenya, South Korea, Singapore, Britain, and the United States. Generative AI - which can create text, photos and videos in response to open-ended prompts - has spurred excitement as well as fears it could make some jobs obsolete, upend elections and potentially overpower humans and have catastrophic effects. The goal of the San Francisco meeting is to jumpstart technical collaboration before the AI Action Summit in Paris in February.”