Can anyone write on-brand copy with an AI agent?
Let's find out.

What I did: I built a working AI agent prototype from zero to one that could turn any messaging or idea into on-brand copy that sounds like Wellhub.

Why I did it: My main goal was to gain hands-on experience applying the principles of content design to create an intuitive agentic-AI experience. Given the company-wide push to become more familiar with AI tools, I was also curious to explore integrative AI solutions that could boost productivity and help break through creative blocks.

The challenge

Because an AI agent can essentially "do anything", defining a single purpose proved to be an early obstacle. Should it help my coworkers build a web page? Should it build a full message house after a brief intake? Should it just be a glorified on-demand editor? Then, once I had pinned down the use case for development, the intricate, grey-area issues began to emerge:

What is the ideal output format?
What distinguishes an AI experience from a tool?
How should the AI sound, behave, and interact with a user?
How can I ensure consistency and accuracy of content?
What shouldn't this AI do?

And the list of tough questions goes on. Ultimately, I chose rapid iterative testing as the best way to address these questions, applying my own sensibilities to drive the outcome.

The solution

Think big. Start small.

Testing revealed a huge insight: Gemini Gems are not yet ready for a complex set of tasks. A single task with a clear format is where you start to see the magic. With this in mind, I set the Gem on a path to be an on-brand copy helper, providing on-demand assistance to help anyone write better for Wellhub.

With a direction and a clear purpose in place, I aligned with my senior leaders to set a timeline for an internal launch and to recruit more participants from our brand team for testing.

My contributions and key milestones

Defining the Agent’s purpose
Clarified what the Gem is responsible for and what it is not, so it stayed focused and useful.
Creating all system messages to instruct the Gem
Wrote and iterated on system messages that shaped behavior, tone, and decision-making.
Documenting all Agent tests to make improvements
Tracked test scenarios and results to surface issues and guide ongoing improvements.
Sourcing documents and benchmarks for references
Selected and organized reference materials and benchmarks to ground outputs in trusted sources.
Defining agentic performance criteria
Defined clear criteria to evaluate quality, consistency, and alignment with intended use cases.
Ongoing fine-tuning for experience
Continuously refined prompts and logic based on testing, feedback, and real usage.
Syncing with Senior Brand and Creative Director for internal launch
Worked closely with brand leadership to align the Gem with voice, values, and creative standards.
Presenting the Gem prototype to the full brand team
Presented the prototype, walking through intent, capabilities, and next steps.
Creating and managing a live doc to track progress
Maintained a shared working document to keep teams aligned and progress visible.

Results and takeaways

Based on an evaluation of the Gem by 8 participants (including myself), using the current Gemini model, this was an exciting and informative step forward. However, the agent was not considered ready to be shared broadly across the company.

Key questions that remain:
- Is the content ready to be customer-facing?
- Is the Gem safe from hallucinations?
- Are the rules built into the system instructions too limiting or restrictive?

Participant feedback: For the overall rating options ("Needs work", "Ok", and "Good"), 6 out of 8 participants rated it "Good". Additionally, most participants from our brand team said this would be great for inspiration, while other participants who aren't writers said they felt comfortable using the content outputs as is.

Takeaways: Gems can be a powerful tool when given the right data, instructions, and development. However, an AI agent in this form requires significant effort to govern and to keep its outputs consistently high-quality, accurate, and appropriate.

Next steps

The most obvious next steps are to continue fine-tuning and testing based on the latest feedback. From a technical perspective, it would be vital to duplicate this Gem on other models or GPTs and evaluate which perform better.

Going back to the first challenge I faced, it would also be important to survey the marketing and product orgs to determine which types of agents are most requested, so that we could offer a collection of agents that serve a variety of tasks and use cases.

Still curious?

Learn about me

JESSE SELLS
