ThinkSpace

Using Generative AI Safely – December 13

A conference presenter recently told an audience, “Whatever you put on ChatGPT is out there. Gone for good. Out of your control.”

We hear that dire warning a lot, and it raises serious concerns about business use of public tools like ChatGPT or Bard. But the warning may also be more cautious than it needs to be, costing you more in lost opportunity than it buys in protection. Let’s see.

What Is Generative AI, And How Does It Work?

Most software we use is deterministic: it produces the same output given the same inputs and conditions. We rely on that predictability when we write emails and reports, or analyze sales and budget scenarios.

By contrast, GenAI is generative. It’s designed to produce diverse, even creative outcomes from the same or similar inputs. We want it to brainstorm with us, to summarize a report in its own words, or to change the tone of an email for us.

GenAI does this by using language patterns. It recognizes relationships among words, phrases, and sentences, then uses statistical probability to select the most likely sequence of words to return to you, based on your prompt.

When you hear talk of GenAI training, this is what’s meant – training it to recognize and use language patterns. As an example, ChatGPT was trained on roughly 300 billion words, scoring and weighting them based on how they were used in sentences. This “deep learning” is what makes generative AI useful.
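
To make “training on language patterns” concrete, here’s a toy sketch in Python. It’s nothing like ChatGPT’s actual architecture – real models use deep neural networks trained over billions of words, and the corpus and code below are purely illustrative – but it shows both ideas at work: “training” means learning which words tend to follow which, and generating means sampling a likely next word, which is also why outputs vary from run to run.

```python
import random
from collections import Counter, defaultdict

# Toy "training" corpus. Real models learn from billions of words;
# the idea is the same: learn which words tend to follow which.
corpus = (
    "simmer the tomato sauce on low heat . "
    "season the tomato sauce with basil . "
    "finish the sauce with a dash of soy sauce ."
).split()

# "Training": count how often each word follows each other word.
follows = defaultdict(Counter)
for current_word, next_word in zip(corpus, corpus[1:]):
    follows[current_word][next_word] += 1

def generate(word, length=6):
    """Generate text by repeatedly picking a probable next word."""
    output = [word]
    for _ in range(length):
        candidates = follows.get(word)
        if not candidates:
            break
        # Sample in proportion to observed frequency -- the
        # "statistical probability" step, and why runs differ.
        words, counts = zip(*candidates.items())
        word = random.choices(words, weights=counts)[0]
        output.append(word)
    return " ".join(output)

print(generate("the"))  # e.g. "the tomato sauce with basil ."
```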

What Has GenAI Training to Do with Safe Use?

The way GenAI works tends to limit what others can learn about your use. While it’s true that GenAI tools read your prompts and might store them for future training, GenAI’s focus on language patterns rather than whole entries helps control the risk, though it doesn’t eliminate it. Consider an example.

Say you cook and want to make a tomato sauce you’ve never made before. You search online, and search engines return entire recipes to you – all the ingredients, quantities, steps, and times for you to read, just as you would expect.

But what if you used GenAI?

Let’s say I had previously put my grandmother’s secret tomato sauce recipe – which includes a dash of soy sauce at the end – in a prompt asking a generative AI tool (a GPT) to make a shopping list for me. Let’s also say the GPT stored my prompt for future training. Would it return my grandmother’s recipe to you like search engines would?

Because a GPT analyzes language patterns and returns language patterns to you, it’s not likely to return her entire recipe the way a search engine would. But had you told it you wanted to try something unusual, it could very well inform you that “Some tomato sauce recipes use a dash of soy sauce at the end” because that detail is novel. It could offer that tip along with others, all based on novel ingredients drawn from thousands (tens of thousands?) of tomato sauce recipes.

It matters little whether a GPT returns my grandmother’s entire recipe to you; if her secret ingredient is identified, her secret is out either way. But had you asked a GPT for Indian tomato sauce recipes, or for different recipes with paprika, it might not have considered a dash of soy sauce at the end relevant. Remember, it’s all about what you ask and the relevance a GPT determines using language patterns and statistical probability.
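
The difference between retrieving a document and surfacing a pattern can be sketched in a few lines of Python. This is a hypothetical illustration, not how a GPT actually stores anything – real models encode patterns in learned weights, not ingredient counts – but it shows the point: no whole recipe needs to be retrievable for a rare ingredient to surface as a tip.

```python
from collections import Counter

# Hypothetical ingredient lists from many tomato sauce recipes.
# No single recipe is kept as a document; only aggregate patterns.
recipes = [
    ["tomatoes", "garlic", "basil", "olive oil"],
    ["tomatoes", "onion", "oregano", "olive oil"],
    ["tomatoes", "garlic", "paprika"],
    ["tomatoes", "garlic", "basil", "soy sauce"],  # grandma's secret
]

counts = Counter(item for recipe in recipes for item in recipe)
total = len(recipes)

# "Unusual" ingredients are the rare patterns a model might surface
# when you ask for something different: the tip leaks, not the recipe.
for ingredient, n in counts.items():
    if n / total <= 0.25:
        print(f"Some tomato sauce recipes use {ingredient} ({n}/{total})")
```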

So, is your proprietary or privileged business information at risk of being made public, through your use of GPTs trained on your prompts?

The answer isn’t a flat no – but then, is it ever? The answer is yes, it depends, and now you understand why. What, then, are safe uses of public GPTs?

A Word About Types of GPTs

AI terminology can be confusing. Glossaries contain dozens of terms, many of which sound like they say the same thing. Even seemingly simple terms like open, public, and proprietary don’t divide cleanly, such that one set always and only applies to ChatGPT or Bard, for example, while another always and only applies to, say, ACME Inc’s AI-assisted proposal tool. For the sake of easy reference, let’s divide products this way:

  • Public refers to ChatGPT, Bard, and others you can try for free by registering at the tool’s website
  • Private refers to dedicated, domain-specific tools you pay to use by user, per month, or by some other unit

We realize this might confuse architectures, fail to account for products with free and paid versions, ignore distinctions between publicly and privately held companies, and more. That’s okay because making those distinctions won’t change what we’re saying about safe use.

One safe-use advantage of a private tool is that you can build a separate document repository and use only that repository to train the tool. Your vendor’s tool might also have a data relationship to foundation models, however, which might expose your data to others through training. Vendors know how to firewall your data and let you opt out of model training. Read the vendor’s data use and privacy policies, understand the tool’s settings, and talk to the vendor if you have questions.

Can you also use a public tool safely? You can.

First, public tools might also permit you to prevent sessions from being used to train the GPT. Read their data use and privacy policies to understand how your data will be used, and to see if you can opt out of training.

Second, many valuable uses have nothing to do with proprietary or privileged data. A proposal manager might use a GPT to improve their understanding of technical issues, and thereby their conversations with technical SMEs. A team lead might role-play with a GPT to understand the perspectives of others on the team without ever using proprietary information. To keep the risk-reward scales tipped in your favor, clarify what you want to accomplish with a particular use, know what success looks like, and ask yourself what might go wrong. You’ll find many ways to prompt a GPT that don’t require business data or information.

So, What’s the Bottom Line?

Recall the presenter’s dire warning at the conference: “Whatever you put on ChatGPT is out there. Gone for good. Out of your control.”

It’s true that the content of your prompts can end up out there, depending on policies and settings. But it’s also true that you can prevent proprietary and privileged information from leaking.

And the way GenAI uses what’s out there reduces some of the risk for you. How safe that feels is a subjective judgment we’ll take up in the next article. For now, understanding how GenAI trains helps you understand how information you provide in prompts can show up for future users.

In the GenAI Discovery Project, DWPA is experimenting with public and private tools. For many public-tool uses, we know there’s zero chance we’ll give the competition any advantage – because there’s no advantage at stake. There’s no soy sauce in the prompts. For uses where there’s a chance we could give something away, we know it’s a small chance, we weigh the gain we want against the harm we don’t want, and we act accordingly.

DWPA has not yet used private tools beyond Discovery Project trials, so we can’t speak to practices with them. We do know private tools can have additional safeguards built in. If you use or are considering a private tool, talk to your vendor about how it’s trained and how your data might be included.

Whether you use a public or private tool, read the tool’s privacy policy or statement. They’re not generally written for easy reading, but gut it out so you know what’s happening to your data. You’ll probably find an option to exclude your content from tool training. DWPA has exercised that option.

Beyond understanding how GenAI tools train and work, safe use comes down to use cases and risk tolerance. We’ll look at that in the next article, but for now we’ll leave you with this thought: you probably already engage in a practice much like determining safe GenAI use – asking questions at an industry day, or in written Q&A during a solicitation process.

You can ask in ways that show your hand, or in ways that don’t. You weigh the odds of gaining information to your advantage against the odds of benefiting your competition and neutralizing your gain. You might have done this for years, and it’s a risk-reward decision similar to deciding how to use GenAI, especially public tools.

To learn more, contact Lou.Kerestesy@DWPAssociates.com.