paint-brush
Strategies for Responsible AI Governanceby@hackerclup7sajo00003b6s2naft6zw
673 reads
673 reads

Strategies for Responsible AI Governance

by Priyanka NeelakrishnanMay 7th, 2024
Read on Terminal Reader
Read this story w/o Javascript

Too Long; Didn't Read

As AI's influence grows, robust governance becomes essential. From discriminating between generative and discriminative AI models to implementing comprehensive risk assessment and compliance frameworks, AI governance ensures transparency, data security, and ethical AI practices, laying the foundation for long-term benefits and regulatory compliance.
featured image - Strategies for Responsible AI Governance
Priyanka Neelakrishnan HackerNoon profile picture


The widespread adoption of AI necessitates methodical guardrails to govern, manage, and secure its use.


In recent times, the world has witnessed a significant increase in the use of artificial intelligence, permeating every aspect of the digital landscape. From automated processing to advanced algorithms, artificial intelligence is slowly becoming an integral part of our daily lives and business operations. The use of artificial intelligence technologies in various industries and sectors is increasing at an unprecedented scale and exponentially. This has also resulted in profound impacts on society, as well as dangers and risks to the core rights of individuals.

What is Artificial Intelligence?

Artificial Intelligence (AI) is a broad field encompassing various machine learning, logic, and knowledge-based techniques and approaches to create systems that can perform tasks typically performed by humans or require human cognitive abilities. This includes tasks such as natural language processing, image recognition, problem-solving, and decision-making. As per the European Union’s AI Act and OECD Report on AI Risk Management, an AI system is a machine-based system that, for explicit or implicit objectives, infers from the input it receives how to generate outputs such as predictions, content, recommendations, or decisions that can influence physical or virtual environments.


There are two broad classifications:

  • Discriminative AI - could only do data classification. Example: logistic regression, k-nearest neighbors, support vector machines, and gradient-boosted decision trees. Neural architectures such as convolutional neural networks (CNN) and long short-term memory (LSTM) units are often used to build reasonably-sized discriminative models for very long and varying-length inputs.
  • Generative AI - generating new content resembling the content on which it is trained. Example: Generative Adversarial Networks (GANs), diffusion models, and autoregressive models.


GANs are machine-learning frameworks that consist of two neural networks, a generator and a discriminator. The generator generates data by shaping random noise fed to it into a target format. Generators alone cannot assess the quality of their output. This is where the discriminator model comes in. The discriminator aims to differentiate between real data and fake data generated by the generator. The two are trained simultaneously, with the discriminator trained to differentiate real and generator data, and the generator trained to confuse the discriminator by making increasingly realistic data. As training progresses, each model becomes increasingly better at its task resulting in the generator being able to create realistic-looking content. The challenge with GANs is training them. For example, GANs can undergo model collapse in training, in which the generator only learns to generate a small variety of samples sufficient to confuse the discriminator but not sufficient to be useful. This is where the diffusion model comes in. In essence, diffusion models are trained to recover training data from noisy-field versions of it. After training, diffusion may ideate entirely new images from a pure noise input. It iteratively constructs an image through a gradual denoising process.


Next, Autoregressive models are rooted in statistics. It generates sequences of data by modeling the probability of the next element in a sequence conditioned on the prior elements. The next element is then randomly selected from this distribution, using a “temperature” parameter can nudge the results to be more deterministic or more random, and the process is repeated. Popular neural network components for autoregressive models include LSTMs and transformers (which allow neural networks to learn patterns in very large volumes of text training data). Instead of just completing a sequence fed to it, we add an alignment stage to autoregressive models. Here, model is additionally trained to prefer certain input-output pairs to others based on human feedback. Eg, In LLMs alignment, has successfully taught models how to respond to questions and commands (reinforcement learning).


Key Benefits of AI

  • Automation for Efficiency - automate repetitive tasks, leading to increased productivity and operational efficiency;
  • Data-Driven Insights - extract valuable insights from large datasets, providing businesses with a competitive edge through data-driven decision-making;
  • Creative Problem Solving - generate innovative solutions and ideas, even when provided with ambiguous or incomplete instructions, enhancing problem-solving and creativity;
  • Content Creation - produce high-quality content swiftly and on a large scale, benefiting content marketing, advertising, and customer engagement;
  • Autonomous Decision-Making - enables levels of autonomous decision-making that were not possible with prior generations of AI.

Importance of Data in Generative AI

Data plays a central role in the development of generative AI models, particularly Large Language Models (LLMs). These models rely on vast quantities of data for training and refinement. For example, OpenAI’s ChatGPT was trained on an extensive dataset comprising over 45 terabytes of text data collected from the internet, including digitized books and Wikipedia entries. However, the extensive need for data collection in generative AI can raise significant concerns, including the inadvertent collection and use of personal data without the consent of individuals. Google AI researchers have also acknowledged that these datasets, often large and sourced from various places, may contain sensitive personal information, even if derived from publicly available data.


There are broadly two common sources for data collection:

  • Publicly Accessible Data - Web scraping is the most common method used to collect data. It involves extracting large volumes of information from publicly accessible web pages. This data is then utilized for training purposes or may be repurposed for sale or made freely available to other AI developers. Data obtained through web scraping often includes personal information shared by users on social media platforms like Facebook, Twitter, LinkedIn, Venmo, and other websites. While individuals may post personal information on such platforms for various reasons, such as connecting with potential employers or making new friends, they typically do not intend for their data to be used for training generative AI models.


  • User Data - Data shared by users with generative AI applications, such as chatbots, may be stored and used for training without the knowledge or consent of the data subjects. For example, users interacting with chatbots providing healthcare advice, therapy, financial services, and other services might divulge sensitive personal information. While such chatbots may provide terms of service mentioning that user data may be used to “develop and improve the service,” critics could argue that generative AI models should seek affirmative consent from users or provide clear disclosures about the collection, usage, and retention of user data.


Many organizations have also embedded generative AI models into their products or services to enhance their offerings. Such integration, in some cases, can also serve as a source of data, including the personal data of consumers, for the training and fine-tuning of these models.


Potential threats include:

  • Unauthorized mass surveillance of individuals and societies
  • Unexpected and unintentional breaches of individuals’ personal information
  • Manipulation of personal data on a massive scale for various purposes
  • Generation of believable and manipulative deep fakes of individuals
  • Amplifying while masking the influences of cultural biases, racism, and prejudices in legal and socially significant outcomes
  • Violation of data protection principles of purpose limitation, storage limitation, and data minimization
  • Discrimination against specific groups of individuals and societal bias
  • Disinformation and presenting factually inaccurate information
  • Intellectual property and copyright infringements


AI Governance

As we enter an era heavily influenced by generative AI technologies, the governance of artificial intelligence becomes an increasingly vital priority for businesses that want to enable the safe use of data and AI while meeting legal and ethical requirements. In October 2023, the “safe, secure, and trustworthy” use of artificial intelligence warranted an executive order from the Biden-Harris administration in the US, an issuance that followed closely on the heels of the EU’s AI Act, the world’s first comprehensive AI law on the books. Other countries, like China, the UK, and Canada, and even a number of US states have drawn their own lines in the sand either proposing or enacting legislation that highlights the importance of safety, security, and transparency in AI.


Product managers and in general enterprise leaders need to adopt this secure AI use mindset while incorporating AI into their business practices. Effective AI governance provides control and oversight, ensuring that businesses develop and manage their AI services responsibly, ethically, and in compliance with both internal policies and external regulations in a documented, efficient, and demonstrable manner. It will enable enterprises to maintain trust and also add accountability.


AI governance refers to the imposition of frameworks, rules, standards, legal requirements, policies, and best practices that govern, manage, and monitor the use of artificial intelligence. It involves directing, managing, and monitoring AI activities to meet legal and ethical requirements. On the ethical front, businesses are to be focused on ensuring a high level of transparency, safety, and security in their AI models to build and maintain customer trust. On the legal front, enterprises must conform to legal requirements and satisfy regulators or risk substantial financial penalties and damaged brand reputation.


McKinsey research estimates that generative AI could contribute between $2.6 trillion and $4.4 trillion in annual value going forward. However, to realize this potential, organizations must implement AI in a way that is transparent, secure, and trustworthy. In fact, Gartner suggests that organizations that successfully operationalize secure and trustworthy AI could see a 50% increase in their AI adoption and attainment of business goals.


Key Drivers of AI Governance in Enterprises

These include the following:

  • Innovation - AI governance provides a structured, yet flexible, framework that encourages responsible innovation.
  • Efficiency - by standardizing and optimizing AI development and deployment, AI governance enables enterprises to bring AI products to market faster while reducing costs.
  • Compliance - AI governance aligns AI solutions and decision-making with industry regulations and global legal standards. This ensures that AI practices meet legal requirements, reducing legal risks for the business and furthering its regulatory compliance.
  • Trust - AI governance focuses on building trustworthy and transparent AI systems. This practice is crucial for maintaining customer rights and satisfaction while also protecting the organization’s brand value. Trustworthy AI enhances customer confidence and loyalty while reducing the risk of regulatory action.


An example of an AI governance framework developed by Gartner is AI TRiSM - AI Trust, Risk, and Security Management framework that focuses on risk mitigation and alignment with data privacy laws in the use of AI. It has four pillars, 1) Explainability and model monitoring - to ensure transparency and reliability. 2) Model operations - involves developing processes and systems for managing AI models throughout their lifecycle. 3) AI application security - to keep models secure and protected against cyber threats. 4) Model privacy - to protect the data used to train or test AI models by managing data flows as per privacy laws (data purpose/storage limitations, data minimization/protection principles). Overall, TRiSM is an approach to enhance AI models’ reliability, trustworthiness, security, and privacy.


Actions to Take for Better AI Governance

  • Enhanced visibility into AI systems - discover and catalog AI models. The aim here is to give businesses a complete and comprehensive overview of their AI usage by identifying and recording details of all AI models used in public clouds, private environments, and third-party apps. It covers the models’ purposes, training data, architecture, inputs, outputs and interactions including undocumented or unsanctioned AI models. Creating a centralized catalog of this information enhances transparency, governance, and the effective use of AI, supporting better decisions and risk management. It’s essential for revealing the full range of AI applications and breaking down operational silos within the organization.


  • Comprehensive Risk Assessment - assess risks and classify AI models. The aim here is to assess the risks of their AI systems at pre-development and development stages and implement risk mitigation steps. It involves leveraging model cards that offer predefined risk evaluations for AI models, including model descriptions, intended use, limitations, and ethical considerations. These risk ratings provide comprehensive details covering aspects such as toxicity, maliciousness, bias, copyright considerations, hallucination risks and even model efficiency in terms of energy consumption and inference runtime. Based on these ratings organizations can decide which models to sanction for deployment and use, which models to block and which models need additional guardrails before consumption.


  • Transparent Data Practices - map and monitor data to AI flows. Data flows into the AI systems for training, tuning and inference and data flows out of AI systems as the output. It allows businesses to uncover the full context around their AI models and AI systems. That is map AI models and systems to associated data sources and systems, data processing, SaaS applications, potential risks, and compliance obligations. This comprehensive mapping enables privacy, compliance, security and data teams to identify dependencies, pinpoint potential points of failure and ensure that AI governance is proactive rather than reactive.


  • Robust Security Controls - implement data to AI controls. It allows the establishment of strict controls for the security and confidentiality of data that is both put into and generated from AI models. Such controls include data security and privacy controls mandated by security frameworks and privacy laws respectively. For example, redaction or anonymization techniques may be applied in order to remove identifiable values from datasets. It ensures the safe ingestion of data into AI models, aligning with enterprise data policies and user entitlements. If sensitive data finds its way into LLM models securing it becomes extremely difficult. Similarly, if enterprise data is converted into vector forms, securing it becomes more challenging. On the data generation and output side, safeguarding AI interactions requires caution against external attacks, malicious internal use and misconfigurations. To ensure secure conversations with AI assistants, bots and agents, LLM firewalls should be deployed to filter harmful prompts, retrievals and responses. These firewalls should be able to defend against various vulnerabilities highlighted in the OWASP Top 10 for LLMs and in the NIST AI RMF frameworks including prompt injection attacks and data exfiltration attacks.


  • Diligent Compliance with Regulatory Frameworks - comply with regulations. Businesses using AI systems must comply with AI-specific regulations and standards as well as data privacy obligations that relate to the use of AI. To streamline this demanding compliance process, businesses can leverage comprehensive compliance automation tailored to AI. Such a system offers a wide-ranging catalog of global AI regulations and frameworks, including the NIST AI RMF and the EU AI Act among others. It facilitates the creation of distinct AI projects within its framework enabling users to identify and apply the necessary controls for each project. This process includes both automated checks and assessments that require input from stakeholders providing a holistic approach to ensuring compliance.


Advantages of Implementing AI Governance

Enterprises that successfully implement AI governance will achieve

a) Full transparency into their sanctioned and unsanctioned AI systems

b) Clear visibility of the AI risks

c) Mapping of AI and data

d) Strong automated AI + Data controls

e) Compliance with global AI regulations.


Overall, we need to ensure the safe use of AI. While prioritizing safety may result in slightly lower business profits in the immediate short term, the medium and long-term benefits are substantial.