Generative AI with Vertex AI: Prompt Design

Vaibhav Srivastava
7 min read · Sep 14, 2024


Generative AI has revolutionized the way we approach natural language processing and content creation. Google Cloud’s Vertex AI platform offers powerful tools for leveraging generative AI models, with a particular focus on effective prompt design. In this article, we’ll explore how to use Vertex AI for generative AI tasks and dive into best practices for prompt design.

Let's explore how to use Google Cloud’s Vertex AI platform to design and implement prompts for generative AI models.

Prerequisites

  • A Google Cloud account with Vertex AI enabled
  • Python 3.7 or later installed
  • Google Cloud SDK installed and configured

Step 1: Set up your environment

In your Google Cloud project, navigate to Vertex AI Workbench: enter Vertex AI Workbench in the search bar at the top of the Google Cloud console.

In my case, I am selecting User-managed notebooks as I already have a notebook instance; you might have to create a new instance with your choice of machine type. Once your instance is ready, you can move on to the next steps.

First, let’s set up our Python environment and install the necessary libraries.

%pip install --upgrade google-cloud-aiplatform

For Vertex AI Workbench, you can restart the terminal using the button at the top.
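If you are working in Colab, or simply prefer to restart programmatically, the snippet below is a common pattern from the Vertex AI sample notebooks (optional; the restart button works just as well):

# Restart the kernel so the newly installed packages are picked up
import IPython

app = IPython.Application.instance()
app.kernel.do_shutdown(True)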

Step 2: Authenticate your notebook environment

import sys

# If running in Colab, authenticate the user interactively
if "google.colab" in sys.modules:
    from google.colab import auth

    auth.authenticate_user()

If you are running this notebook in a local development environment instead:

  • Install the Google Cloud SDK.
  • Obtain authentication credentials: create local credentials by running the following command and completing the OAuth 2.0 flow (see the gcloud documentation for details).
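The usual command for creating these Application Default Credentials is shown below; run it from a terminal, or prefix it with ! in a notebook cell, and confirm the details against the current gcloud documentation.

gcloud auth application-default login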

Replace "your-project-id" with your actual Google Cloud project ID and "your-region" with the region, you have created the instance

import vertexai

PROJECT_ID = "[your-project-id]" # @param {type:"string"}
REGION = "your-region" # @param {type:"string"}

vertexai.init(project=PROJECT_ID, location=REGION)

from vertexai.language_models import TextGenerationModel
from vertexai.language_models import ChatModel

Let's load the model.

generation_model = TextGenerationModel.from_pretrained("text-bison")

You might see a warning message when loading the model.

Step 3: Design your prompt

Prompt engineering is all about designing your prompts so that the response is what you were actually hoping to see.

The idea of using “unfancy” prompts is to minimize the noise in your prompt to reduce the possibility of the LLM misinterpreting the intent of the prompt. Below are a few guidelines on how to engineer “unfancy” prompts.

  • Be concise
  • Be specific, and well-defined
  • Ask one task at a time
  • Improve response quality by including examples
  • Turn generative tasks into classification tasks to improve safety

3.1 Be concise

prompt = "What do you think could be a good name for a flower shop that specializes in selling bouquets of dried flowers more than fresh flowers? Thank you!"

print(generation_model.predict(prompt=prompt, max_output_tokens=256).text)
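For contrast, a trimmed-down version of the same request (my own rewrite, not part of the original example) keeps only the essentials and leaves the model less room to misread your intent:

prompt = "Suggest a name for a flower shop that sells bouquets of dried flowers"

print(generation_model.predict(prompt=prompt, max_output_tokens=256).text)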

3.2 Be specific, and well-defined

prompt = "Generate a list of ways that makes Earth unique compared to other planets"

print(generation_model.predict(prompt=prompt, max_output_tokens=256).text)
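To see why specificity matters, compare that with a deliberately vague version of the request (a hypothetical counterexample), which forces the model to guess what you actually want:

prompt = "Tell me about Earth"

print(generation_model.predict(prompt=prompt, max_output_tokens=256).text)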

3.3 Ask one task at a time

✅ Recommended. The prompt below asks one task at a time.

prompt = "What's the best method of boiling water?"

print(generation_model.predict(prompt=prompt, max_output_tokens=256).text)
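By contrast, a prompt that bundles several unrelated questions (a made-up example for illustration) splits the model's attention and usually weakens both answers:

prompt = "What's the best method of boiling water and why is the sky blue?"

print(generation_model.predict(prompt=prompt, max_output_tokens=256).text)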

3.4 Watch out for hallucinations

Although LLMs have been trained on a large amount of data, they can generate text containing statements not grounded in truth or reality; these responses from the LLM are often referred to as “hallucinations” due to their limited memorization capabilities. Note that simply prompting the LLM to provide a citation isn’t a fix to this problem, as there are instances of LLMs providing false or inaccurate citations. Dealing with hallucinations is a fundamental challenge of LLMs and an ongoing research area, so it is important to be cognizant that LLMs may seem to give you confident, correct-sounding statements that are in fact incorrect.

Note that if you intend to use LLMs for creative use cases, hallucinating could actually be quite useful.

prompt = "Who was the first elephant to visit the moon?"

print(generation_model.predict(prompt=prompt, max_output_tokens=256).text)
Sample output: No elephant has ever visited the moon.

In this run the model answered correctly, but LLMs can just as easily produce a confident, made-up answer to a question like this, and that would be a hallucination, since no elephant has ever flown to the moon. So how do we prevent these kinds of inappropriate questions and, more specifically, reduce hallucinations?

There is one possible method called the Determine Appropriate Response (DARE) prompt, which cleverly uses the LLM itself to decide whether it should answer a question based on what its mission is.

Let’s see how it works by creating a chatbot for a travel website with a slight twist.

chat_model = ChatModel.from_pretrained("chat-bison@002")

chat = chat_model.start_chat()
dare_prompt = """Remember that before you answer a question, you must check to see if it complies with your mission.
If not, you can say, Sorry I can't answer that question."""

print(
    chat.send_message(
        f"""
Hello! You are an AI chatbot for a travel web site.
Your mission is to provide helpful queries for travelers.

{dare_prompt}
"""
    )
)
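With the mission in place, you can test the DARE prompt by sending a question that falls outside it, for example reusing the elephant question from earlier (a hypothetical follow-up); the chatbot should now decline rather than invent an answer:

print(chat.send_message("Who was the first elephant to visit the moon?"))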

3.5 Improve response quality by including examples

Another way to improve response quality is to add examples in your prompt. The LLM learns in-context from the examples on how to respond. Typically, one to five examples (shots) are enough to improve the quality of responses. Including too many examples can cause the model to over-fit the data and reduce the quality of responses.

Similar to classical model training, the quality and distribution of the examples is very important. Pick examples that are representative of the scenarios that you need the model to learn, and keep the distribution of the examples (e.g. number of examples per class in the case of classification) aligned with your actual distribution.

3.5.1 Zero-shot prompt

Below is an example of zero-shot prompting, where you don’t provide any examples to the LLM within the prompt itself.

prompt = """Decide whether a Tweet's sentiment is positive, neutral, or negative.

Tweet: I loved the new YouTube video you made!
Sentiment:
"""

print(generation_model.predict(prompt=prompt, max_output_tokens=256).text)

3.5.2 One-shot prompt

Below is an example of one-shot prompting, where you provide one example to the LLM within the prompt to give some guidance on what type of response you want.

prompt = """Decide whether a Tweet's sentiment is positive, neutral, or negative.

Tweet: I loved the new YouTube video you made!
Sentiment: positive

Tweet: That was awful. Super boring 😠
Sentiment:
"""

print(generation_model.predict(prompt=prompt, max_output_tokens=256).text)

3.5.3 Few-shot prompt

Below is an example of few-shot prompting, where you provide a few examples to the LLM within the prompt to give some guidance on what type of response you want.

prompt = """Decide whether a Tweet's sentiment is positive, neutral, or negative.

Tweet: I loved the new YouTube video you made!
Sentiment: positive

Tweet: That was awful. Super boring 😠
Sentiment: negative

Tweet: Something surprised me about this video - it was actually original. It was not the same old recycled stuff that I always see. Watch it - you will not regret it.
Sentiment:
"""

print(generation_model.predict(prompt=prompt, max_output_tokens=256).text)

3.6 Turn generative tasks into classification tasks to reduce output variability

3.6.1 Generative tasks lead to higher output variability

The prompt below results in an open-ended response, which is useful for brainstorming, but the response is highly variable.

prompt = "I'm a high school student. Recommend me a programming activity to improve my skills."

print(generation_model.predict(prompt=prompt, max_output_tokens=256).text)

3.6.2 Classification tasks reduce output variability

The prompt below results in a choice and may be useful if you want the output to be easier to control.

prompt = """I'm a high school student. Which of these activities do you suggest and why:
a) learn Python
b) learn JavaScript
c) learn Fortran
"""

print(generation_model.predict(prompt=prompt, max_output_tokens=256).text)

Conclusion

In this hands-on article, we’ve explored how to use Vertex AI for generative AI tasks, focusing on prompt design: keeping prompts concise and specific, asking one task at a time, guarding against hallucinations, improving responses with examples, and turning generative tasks into classification tasks. These examples demonstrate the power of well-crafted prompts in guiding the model to produce the outputs you want.

Remember to experiment with different prompt designs and model parameters to achieve the best results for your specific use case. Happy generating!

And that’s a wrap!

I appreciate you and the time you took out of your day to read this! Please keep an eye out (follow & subscribe) for more. Cheers!

Written by Vaibhav Srivastava

Solutions Architect | AWS Azure GCP Certified | Hybrid & Multi-Cloud Exp. | Technophile
