How to optimize product content for large language models
Brian Hennessy, Talkoot Co-Founder & CEO
Some call it Large Language Model Optimization (LLMO). Others are calling it Generative AI Optimization (GAIO). Still others are labeling it Answer Engine Optimization (AEO). The acronym will likely change as the practice matures, but one thing is for sure: optimizing your product content for ChatGPT, Bard and Bing is here to stay.
Like other new, market-moving sales channels that have come before—think SEO, content marketing and Facebook ads—online retailers that get an early start optimizing their product content for inclusion in conversational AI services like Bard, Bing and ChatGPT will gain a massive advantage over their competition. In truth, these large language chatbots are already driving a steady flow of purchase-ready shoppers straight to retailers’ product pages.
The good news is that it doesn’t require a massive investment in resources to pull it off. But it does require a shift in processes, systems and resources. We dig into that at the end of the article.
A good first step into the bold new world of LLMO is to understand how your product content finds its way into the responses of these chatbots in the first place.
How LLMs use product information to help shoppers find the right products
There are two ways product information can be included in conversational AI responses: 1) Information about your products can be included in the LLM’s pre-training data, or 2) it can be included in the chatbot’s Retrieval-Augmented Generation (RAG) data.
Pre-training data is the massive collection of text and code that is used to train large language models to understand and use language on a foundational level. This collection encompasses everything from books, articles and discussion boards to corporate websites, wikis and educational materials.
Google and OpenAI don’t share how they gather this data, so it’s difficult to know what specific content is included in an LLM’s pre-training data. And even if your product data is included, the sheer breadth of data used to inform these models makes it improbable that the product information from any single retailer could significantly influence the LLM’s knowledge base or outputs.
Retrieval-augmented generation (RAG) Data
Retrieval-augmented generation (RAG) is a technique that allows Bard and Bing to access live content on the web to improve the quality of their responses. This means that RAG-based LLMs are able to generate responses that are informed by the latest information, even if that information was not included in their pre-training data.
Product content on your website is more likely to appear in RAG data because RAG results come from a much wider range of sources like ecommerce websites and social media. If you have high quality, relevant product content on your site, it will likely be used. If, on the other hand, the LLM judges your content to be inaccurate, inconsistent or overly biased, it can exclude it from results.
Finally, RAG data updates faster than pre-training data, so changes to your content show up quicker, ensuring chatbots are providing shoppers with the most up-to-date info about your products.
Given all this, we recommend that retailers aim to get their product information included in RAG data rather than pre-training data.
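To make the retrieval step more concrete, here is a minimal sketch of the idea in Python. It is a toy, not how Bard or Bing actually work: the page snippets, the word-overlap scoring and the function names (`retrieve`, `build_prompt`) are all assumptions made for illustration. Real systems use far more sophisticated ranking, but the principle is the same: only content that matches the shopper's question gets handed to the model.

```python
import re
from collections import Counter

# Hypothetical product-page snippets standing in for live web content.
PAGES = {
    "trail-shoe": "Lightweight trail running shoe with a grippy outsole for muddy terrain.",
    "road-shoe": "Cushioned road running shoe built for daily training on pavement.",
    "rain-jacket": "Waterproof rain jacket with sealed seams and a packable hood.",
}


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def score(query, passage):
    """Count how many query words also appear in the passage (toy relevance score)."""
    query_words = Counter(tokenize(query))
    passage_words = set(tokenize(passage))
    return sum(n for word, n in query_words.items() if word in passage_words)


def retrieve(query, pages, k=2):
    """Return up to k (page_id, text) pairs, best match first, skipping zero-score pages."""
    ranked = sorted(pages.items(), key=lambda item: score(query, item[1]), reverse=True)
    return [(pid, text) for pid, text in ranked[:k] if score(query, text) > 0]


def build_prompt(query, pages, k=2):
    """Assemble the text a chatbot would be given: retrieved sources plus the question."""
    context = "\n".join(f"[{pid}] {text}" for pid, text in retrieve(query, pages, k))
    return f"Answer using only these sources:\n{context}\n\nQuestion: {query}"


if __name__ == "__main__":
    print(build_prompt("shoe for muddy trail running", PAGES))
```

Notice that the rain jacket page never reaches the prompt for this question. A product page that doesn't use the shopper's words about the shopper's problem is simply not part of the conversation.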
The less well known your products, the harder the product copy on your website needs to work
The more well-established your products, the more likely there is already plenty of content out there for these LLMs to feast upon, good and bad. If your product isn’t well-established and lacks third-party reviews, the product content on your own website will likely be the only source of information these LLMs have on your product. That makes it all the more important for less established brands to ensure the product stories across their sites are complete, accurate and elicit loads of positive reviews.
How retailers should structure their content to ensure their products appear prominently in LLM chat results
When structuring product information so that it can be easily understood and processed by the LLMs that power Bard or Bing, here are the six content types to focus on:
1. High Quality Product Description: A complete, clear, and accurate product description is crucial for inclusion in chatbot conversation results. Thankfully, optimizing your product copy for chatbots follows many of the same rules as optimizing product copy for shoppers. The big change for most retailers is that their product information should be much deeper and more specific.
- Use conversational language. When people chat with an LLM, they expect to have a conversation. First and foremost, that means writing at a comfortable reading level. Though most of us read at a much higher level in our work life, the average adult reads most comfortably at an 8th grade reading level. That means:
- Using simple 1- and 2-syllable words and avoiding jargon.
- Keeping sentences short and to the point.
- Using active voice instead of passive voice.
- Avoiding subjective language that could be seen as biased.
- Answer common questions. What questions might people ask about your products? Answer these questions in your product descriptions and other content. Consider adding a FAQ to the bottom of every product page. Though most people don’t read a FAQ, that content will be available for LLMs to access in case a shopper asks the question to a chatbot.
A quick and easy way to figure out what kind of questions shoppers have about your products is to ask the chatbots themselves.
- Be specific. Don’t just say that your product is “great.” Explain why it’s great. Be as specific as possible when describing the use case your product is designed for. This will help LLMs better understand which shoppers to recommend your product to. It will also help generate higher quality and more consistent customer reviews, which will, again, help LLMs better understand and more frequently recommend your product.
People don’t search for the “best product.” Best is different for everyone. Is yours the lightest-weight bike with the longest battery life? Say so on your PDP. Be specific about what makes your product best and for whom.
- Contextual Information: Include details on the product’s context of use or any other relevant information. For instance, if it’s a piece of sports equipment, provide information about which sports it’s used for, the level of play or position it’s suitable for (beginner, intermediate, expert, point guard, defender, etc.). Here is a list of the types of contextual information shoppers are interested in:
- Functionality and Purpose: Clearly describe what the product is designed for and how it’s used. This can help consumers understand if the product will meet their specific needs or solve their problem.
- Unique Selling Proposition (USP): Highlight what makes your product different or better than competing products. Be specific here, especially if you aren’t the market leader. People are looking for products that solve specific problems. Your products won’t show up in chatbot conversations if you speak in broad generalities.
- Materials and Construction: What materials is the product made from? How is it constructed? This can be particularly important for clothing, shoes, and sports equipment where the type and quality of materials can have a big impact on performance and comfort.
- Care Instructions: How should the product be cleaned or maintained? What should consumers avoid doing to ensure the product lasts as long as possible?
- Safety Information: Are there any potential safety concerns or precautions that users should be aware of?
- Brand Story: Consumers are often interested in the brand behind the product. What is the brand’s purpose? How are their products made? Do they prioritize sustainability or other ethical practices? Remember that most customers don’t come in through the homepage. Don’t be shy to talk about your brand story on every product page.
- Value: Explain why the product is priced as it is and the value consumers will get from it. This could involve discussing the quality of materials, craftsmanship, the longevity of the product, etc.
- Compatibility or Requirements: Does the product require, or work well with, any other products? This is often relevant for tech products but could also apply to things like sports equipment.
- Warranty Information: What kind of product warranty or guarantee does the company offer? This information can give consumers peace of mind about their purchase.
- Use keywords your consumers use. Different people use different terms to search for the same products. Make sure to include keywords in your product descriptions and other content that match the way your customers search. Fitness runners might search for “training shoes” whereas elite competitors might search for “racing flats.” Know who the customer is for each product and what language they use to search for that product.
When you describe your product, use the words your consumers use to research those products.
- Product specifications: Detailed product specifications should be provided. This includes any technical details such as dimensions, weight, materials used, color, size, etc. The specifications should be categorized and labeled properly. It’s easy for chatbots to quickly build comparison charts for multiple products. If your specs aren’t on your product page for the LLM to grab, your product won’t be part of the comparison the chatbot creates for the users.
Chatbots can’t create a chart with information they don’t have.
- Freshness: When determining the quality of content, LLMs not only consider relevance to the question and authority of the source, they also consider freshness of the content. Consumers also buy the same product for different reasons throughout the year. Update your product content at least twice a year to ensure it’s seasonally relevant.
- Make your content easily consumable. Identify which details you want LLMs to focus on. Organize product descriptions into bite-sized text blocks, each tackling an aspect of what makes your product unique. Begin each with a header introducing the topic. This makes it easier for shoppers and LLMs to absorb the information.
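One way to check product copy against the 8th grade reading level mentioned above is the Flesch-Kincaid grade formula, which combines average sentence length with average syllables per word. The sketch below is a rough approximation: the syllable counter is a crude vowel-group heuristic (an assumption for illustration, not a dictionary lookup), so treat the score as a directional signal rather than an exact grade.

```python
import re


def count_syllables(word):
    """Rough syllable estimate: count groups of consecutive vowels."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    # Crude heuristic: treat a trailing "e" as silent ("bike"), except in "-le" ("simple").
    if word.endswith("e") and not word.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def flesch_kincaid_grade(text):
    """Approximate U.S. school grade level needed to read the text comfortably."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    return 0.39 * len(words) / len(sentences) + 11.8 * syllables / len(words) - 15.59


if __name__ == "__main__":
    # Hypothetical product copy, written two ways.
    plain = "This bike is light. It rides for six hours."
    dense = ("Engineered utilizing proprietary aerospace-grade aluminum, this "
             "revolutionary bicycle delivers unparalleled performance characteristics.")
    print(round(flesch_kincaid_grade(plain), 1), round(flesch_kincaid_grade(dense), 1))
```

Short sentences and short words pull the score down, which is exactly what the bullets above ask for. A description that scores well above 8 is a good candidate for a rewrite.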
Though the consumer-facing text is the most valuable content when it comes to LLM optimization, there are other content types and requirements you should also consider.
2. Multimedia content: Include a variety of different types of media on the product page. This includes images, videos, 3D renderings and other multimedia content. The more diverse the media, the better the large language model will be able to understand the product and answer questions about it.
3. Metadata and Tags: Use metadata and tags to provide additional context about the product. This could include the product’s intended audience (age group, gender, etc.), secondary use cases not included in the consumer-facing description (outdoor, indoor, winter, summer, etc.), and product features (water-resistant, machine-washable, etc.).
4. Structured Format: Data should be formatted in a structured way that makes it easy for the LLMs to process. For instance, use JSON or XML formats to structure your data. Standardization across the data set is also important to reduce any potential ambiguity. As an example, just as in formatting content for rich snippets, breaking your content up into sections organized by common questions and using those questions as headers allows LLMs to better understand the content and match it to user inquiries.
Organize the product content on your product page into bite-sized snippets like Earth Breeze does on their PDP.
5. FAQs and Common Use-Cases: It’s also beneficial to include data on frequently asked questions or common use cases for the product. This could include care instructions, warranty information, and so on. Though most consumers might not read your FAQ, the LLMs can use this information to answer user questions about these topics.
6. Customer Reviews and Feedback: If available, customer reviews and feedback can provide valuable insights about the product’s use and performance in real-world scenarios.
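As a sketch of what structured product data, metadata and FAQs might look like in practice, the example below builds JSON-LD using the schema.org vocabulary, the same markup commonly used for rich snippets. Choosing schema.org is our assumption here; the article's advice applies to any consistent JSON or XML structure, and whether a given chatbot reads this markup is not guaranteed. All product details are hypothetical placeholders, and `to_script_tag` is an illustrative helper name.

```python
import json

# All product details below are hypothetical placeholders.
product = {
    "@context": "https://schema.org",
    "@type": "Product",
    "name": "Example Commuter E-Bike",
    "description": "A lightweight electric bike for daily city commutes.",
    "brand": {"@type": "Brand", "name": "Example Brand"},
    "material": "Aluminum",
    "color": "Matte black",
    # Labeled specs and feature tags go in as name/value pairs.
    "additionalProperty": [
        {"@type": "PropertyValue", "name": "Weight", "value": 18, "unitText": "kg"},
        {"@type": "PropertyValue", "name": "Battery life", "value": 6, "unitText": "hours"},
        {"@type": "PropertyValue", "name": "Water-resistant", "value": True},
    ],
}

faq = {
    "@context": "https://schema.org",
    "@type": "FAQPage",
    "mainEntity": [
        {
            "@type": "Question",
            "name": "How long does the battery take to charge?",
            "acceptedAnswer": {
                "@type": "Answer",
                "text": "A full charge takes about four hours.",
            },
        }
    ],
}


def to_script_tag(data):
    """Wrap a dict as a JSON-LD script tag ready to place in a product page's HTML."""
    return '<script type="application/ld+json">\n' + json.dumps(data, indent=2) + "\n</script>"


if __name__ == "__main__":
    print(to_script_tag(product))
    print(to_script_tag(faq))
```

Keeping specs as labeled name and value pairs, with units spelled out, is what makes them easy to drop into a comparison chart. Using the same property names across your whole catalog reduces ambiguity.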
The idea is to create a comprehensive experience where a language model can infer valuable information about the product that it can then communicate to its users. By providing detailed, structured, and consistent product information, you’ll be able to leverage these models effectively as a valuable new sales channel.
In the end, it’s about customers, not LLMs
This is a very long list of content and requirements. It likely represents a tripling or quadrupling of the level of product content most retailers are currently producing. We would understand if retailers decided to stick with their current levels of content production. But they would do so at their own peril. LLMs require this level of content because this is the level of content shoppers are demanding from the LLMs.
Today there is a very large gap between the amount of product content consumers need to buy and the amount of content most retailers are able to produce given current technology, teams, and timelines. Retailers that step up early will be rewarded with increased sales and loyalty.
How to get it all done?
Though daunting, this quantity of content is well within reach of most retailers without a significant increase in costs or manpower. But it will take new systems and ways of working, with teams leaning heavily into generative AI to lighten the burden.
Today, many of the world’s leading online retailers are already increasing traffic, conversion rates and sales with Talkoot. Talkoot is an AI-powered product content production platform that helps the world’s most loved brands create deeper, more valuable connections with customers everywhere they buy through more scalable and powerful product storytelling.
To see how Talkoot can do the same for your brand, get in touch with one of our consultants.