How to Build High-Performance E-Commerce Site Search at Enterprise Scale

Large language models can generate fluent answers, but without grounded context they often produce vague or outdated information. In real-world applications, especially in ecommerce and other domain-specific environments, AI systems must be grounded in accurate, structured data.
Marqo is an AI-native search and product discovery platform that trains a dedicated large language model on each customer’s data. By combining catalog-specific intelligence with real-time retrieval, Marqo enables more accurate and context-aware AI responses.
In this example, we demonstrate how Marqo can be used to ground a language model in recent news data to produce accurate summaries. While this demo uses news content, the same approach applies to ecommerce product data, knowledge bases, and other domain-specific content.

We can see the problem when we ask GPT-3 on its own, “What is happening in business today?” It has no access to current information and generates a generic response:
In fact, anyone following the financial markets knows that claims like “the economy is slowly recovering” and “businesses are starting to invest again” are completely wrong!
This example highlights why grounding matters. Without access to current or domain-specific information, language models default to generic patterns. Marqo solves this by retrieving relevant context from indexed data and grounding the generation process in structured, searchable content. In ecommerce use cases, this means grounding responses in real product attributes, availability, and catalog data.
To solve this, we first start the Marqo Docker container, which exposes an API we’ll interact with through the Marqo Python client during this demo:
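A minimal setup sketch: the image name and default port below follow Marqo’s quick-start documentation, but flags can vary between versions, so check the current docs for yours.

```shell
# Pull and run the Marqo container; the API listens on port 8882 by default
docker pull marqoai/marqo:latest
docker run --name marqo -p 8882:8882 marqoai/marqo:latest

# Install the Python client used throughout the rest of this demo
pip install marqo
```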
Next, let’s look at our example corpus of news documents, which contains BBC and Reuters news content from the 8th and 9th of November. We use “_id” as the Marqo document identifier, “date” for the date the article was written, “website” for the web domain, “Title” for the headline, and “Description” for the article body:
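The documents might look like the following. This is an illustrative sketch of the schema described above; the headlines and bodies here are placeholders, not the original BBC/Reuters text.

```python
# Each document carries an identifier, metadata for filtering, and the
# text fields (Title, Description) that will be searched and embedded.
documents = [
    {
        "_id": "article_1",
        "date": "2022-11-09",
        "website": "www.bbc.co.uk",
        "Title": "Example business headline",
        "Description": "Example article body describing the day's business news.",
    },
    {
        "_id": "article_2",
        "date": "2022-11-08",
        "website": "www.reuters.com",
        "Title": "Example markets headline",
        "Description": "Example article body describing market movements.",
    },
]
```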
We then index our news documents; Marqo manages both the lexical representations and the neural embeddings. By default, Marqo uses SBERT for neural text embedding and retains complete OpenSearch lexical and metadata functionality natively.
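An indexing sketch with the Marqo Python client, assuming the container from the previous step is running on localhost. The `tensor_fields` argument (which fields receive embeddings) is required in recent client versions but not in older ones, so check the documentation for your version; the index name here is our own choice.

```python
import marqo

# Connect to the locally running Marqo container
mq = marqo.Client(url="http://localhost:8882")

# Abbreviated stand-in for the full news corpus shown earlier
documents = [
    {
        "_id": "article_1",
        "date": "2022-11-09",
        "website": "www.bbc.co.uk",
        "Title": "Example business headline",
        "Description": "Example article body describing the day's business news.",
    },
]

# Index the documents; Title and Description get neural embeddings,
# while all fields remain available for lexical search and filtering
mq.index("news-index").add_documents(
    documents,
    tensor_fields=["Title", "Description"],
)
```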
Now that we have indexed our news documents, we can simply use the Marqo Python search API to return relevant context for our GPT-3 generation. For the query “q”, we use the question and match news context against the “Title” and “Description” text. We also filter our documents to “today”, which was “2022-11-09”.
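The search parameters can be sketched as below. `build_search_kwargs` is a hypothetical helper used here to lay out the arguments; the live call (commented out) needs a running Marqo container, and parameter names such as `searchable_attributes` and `filter_string` may differ across Marqo client versions.

```python
def build_search_kwargs(question: str, today: str) -> dict:
    """Assemble the arguments for a date-filtered Marqo search."""
    return {
        "q": question,                                      # the user's question
        "searchable_attributes": ["Title", "Description"],  # text fields to match
        "filter_string": f"date:{today}",                   # restrict to "today"
        "limit": 5,                                         # top hits used as context
    }

kwargs = build_search_kwargs("What is happening in business today?", "2022-11-09")
# results = marqo.Client(url="http://localhost:8882").index("news-index").search(**kwargs)
# hits = results["hits"]  # matching documents, ranked by relevance
```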
Next, we insert Marqo’s search results into the GPT-3 prompt as context and try generating an answer again:
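A sketch of assembling the retrieved context into a prompt. `build_prompt` is a hypothetical helper, `hits` mirrors the shape of Marqo search results, and the generation step at the end is illustrative (the legacy OpenAI Completion API), not a prescription.

```python
def build_prompt(question: str, hits: list) -> str:
    """Join retrieved articles into a background section ahead of the question."""
    context = "\n".join(
        f"- {hit['Title']}: {hit['Description']}" for hit in hits
    )
    return (
        "Background information:\n"
        f"{context}\n\n"
        f"Question: {question}\n"
        "Answer based only on the background information:"
    )

# Example hits in the shape Marqo returns (placeholder content)
hits = [
    {"Title": "Markets slide", "Description": "Stocks fell sharply on Wednesday."},
]
prompt = build_prompt("What is happening in business today?", hits)

# Hypothetical generation step (requires the openai package and an API key):
# completion = openai.Completion.create(
#     engine="text-davinci-002", prompt=prompt, max_tokens=256
# )
```

Because the model now answers from the retrieved articles rather than its training data, the summary reflects what actually happened on the day.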
Grounding language models in structured data is essential for building reliable AI applications. Whether summarizing news, answering product questions, or powering intelligent product discovery, context determines accuracy.
Marqo applies this grounding approach within an AI-native search and product discovery platform. By training a dedicated model on each retailer’s catalog and retrieving relevant context in real time, Marqo enables accurate, commerce-aware AI experiences that improve relevance, trust, and conversion.
This example demonstrates how retrieval and generation can work together, but the true power comes when that intelligence is applied to ecommerce product discovery and revenue-driven use cases.