February 18, 2026

Improving Ecommerce Catalog Quality and Brand Safety with AI

Retailers rely on clean, accurate, and brand safe product content to drive conversion. But at scale, product catalogs often contain low quality images, inconsistent assets, or content that does not align with brand standards. When content issues appear in search results or product discovery journeys, they create friction and reduce trust.

Marqo is an AI native ecommerce search and product discovery platform that trains a dedicated large language model on each retailer’s catalog. This catalog trained intelligence improves relevance and personalization, and it can also be used to audit and curate catalog data at scale.

In this example, we applied Marqo to a large AI generated ecommerce dataset containing approximately 250,000 product images paired with titles and descriptions. Since the images were generated programmatically, manual review was impractical. We used Marqo to identify low quality, strange, and NSFW content and remove it before publishing the dataset for an ecommerce search demonstration.

Note: All examples included in this article are appropriately censored and pixelated for reader comfort.

Leveraging Data Discovery for Quality Assurance

Here at Marqo, we recently compiled a dataset for an ecommerce search demonstration. The dataset is entirely AI-generated and contains approximately 250,000 images paired with product titles, text descriptions, and aesthetic scores.

Because the dataset was AI-generated, we had no explicit knowledge of what the images contained. Although we guided the generation process to stay within the ecommerce domain, the specific contents of each image remained unknown. Given the dataset's size, manual inspection was impractical.

To ensure the dataset's appropriateness before going public with our demo, we decided to use Marqo to score the data for any NSFW (Not Safe for Work) images. Unsurprisingly, we stumbled upon several unsuitable images, particularly within the 'stockings' category, and various NSFW lingerie and underwear photos.

Marqo proved instrumental as a discovery tool for identifying and removing these images. Thanks to its multimodal search and query composition, we could easily find and weed out unwanted content. Our exploration also surfaced a number of bizarre and uncanny images, which we decided to discard as well.

Our Findings: A Closer Look at the Data

Strange Images

Marqo understands product imagery using AI that connects visual attributes with natural language. This makes it possible to search a catalog by describing a style, pattern, or visual issue, even when the product metadata does not explicitly mention it. For example, queries describing unusual or distorted images can quickly surface content that should be reviewed or removed.

Similarly, the query "AI Generated, fake, bizarre" presents us with this.

We believe that many of these stem from the AI's attempt to include a woman or man in the product rather than generate a women's or men's version of it.

NSFW Images

Our exploration revealed that the majority of NSFW content was concentrated under the stockings category, initially brought to light with the query "lingerie".

The query "lingerie, nude" yielded the following top nine results:

Crafting Effective Queries with Marqo

A Useful Query for Strange Images

Detecting weird, AI-generated images is relatively straightforward. To refine our search, we can append 'deformed' to our existing query.
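As a minimal sketch of this refinement, assuming the existing query is the "AI Generated, fake, bizarre" text used earlier, the helper below appends extra terms to a comma-separated query. The name `extend_query` is hypothetical and is not part of Marqo.

```python
def extend_query(base_query: str, *extra_terms: str) -> str:
    """Append extra descriptive terms to a comma-separated text query."""
    # Split the existing query into its individual terms.
    terms = [t.strip() for t in base_query.split(",") if t.strip()]
    for term in extra_terms:
        term = term.strip()
        # Skip empty terms and terms already present (case-insensitive).
        if term and term.lower() not in (t.lower() for t in terms):
            terms.append(term)
    return ", ".join(terms)


strange_query = extend_query("AI Generated, fake, bizarre", "deformed")
print(strange_query)
```

The resulting string can be used as the text query in a search call.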

This query yields a more targeted set of results:

The people who are shirts that are also wearing shirts are a personal favourite.

An Effective Query for NSFW Images

Marqo facilitates the design of nuanced queries by allowing weighted components within the query.

Implementing this with Marqo is straightforward: the query is expressed as a mapping from each phrase to its weight.

In this case, we employed a blend of intuition and experimentation to formulate our query. Being able to utilise natural language and weighted components makes for an intuitive design process. The query attempts to match NSFW images by combining the embeddings of each query item, according to their corresponding weights. We applied negative weights to some work-appropriate clothing items that might be misidentified as NSFW content.
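The combination step described above can be sketched in plain Python. This is an illustration under stated assumptions, not Marqo's internal implementation: the terms and weights are hypothetical (not the values we used), the three-dimensional vectors are toy stand-ins for real model embeddings, and scaling the result to unit length is assumed so that it can be compared by cosine similarity.

```python
import math
from typing import Dict, List

# Hypothetical terms and weights, for illustration only.
# A negative weight pushes results away from that term.
weighted_query: Dict[str, float] = {
    "lingerie": 1.0,
    "nude": 1.0,
    "business suit": -0.5,
}

# Toy 3-dimensional stand-ins for the embeddings a real model would produce.
toy_embeddings: Dict[str, List[float]] = {
    "lingerie": [1.0, 0.0, 0.0],
    "nude": [0.0, 1.0, 0.0],
    "business suit": [0.0, 0.0, 1.0],
}


def combine_embeddings(
    weights: Dict[str, float], embeddings: Dict[str, List[float]]
) -> List[float]:
    """Return the weighted sum of per-term vectors, scaled to unit length."""
    dim = len(next(iter(embeddings.values())))
    combined = [0.0] * dim
    for term, weight in weights.items():
        for i, value in enumerate(embeddings[term]):
            combined[i] += weight * value
    norm = math.sqrt(sum(v * v for v in combined))
    if norm == 0.0:
        raise ValueError("weighted terms cancel out; cannot normalise")
    return [v / norm for v in combined]


query_vector = combine_embeddings(weighted_query, toy_embeddings)
print(query_vector)
```

In this toy example the negatively weighted term contributes a negative component, so documents close to "business suit" score lower against the combined vector.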

Optimising Our Search to Eliminate Unwanted Data

To further improve the relevance of our search results, we employed an additional technique. We manually inspected the top 10 results from the previous step's query and observed that all of them were NSFW. To sharpen our search precision, we took the embeddings from these top 10 results and fed them back into the search. This introduced embeddings specifically representative of the data we aimed to eliminate.
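A minimal sketch of how the reviewed results' embeddings might be packaged for a follow-up search is shown below. It assumes each document carries its vectors under `_tensor_facets`, with every facet holding an `_embedding` list, and that the search accepts a payload of weighted vectors. Both shapes, the helper name `build_context`, and the weight of 0.5 are assumptions for illustration and should be checked against the Marqo documentation for your version.

```python
from typing import Any, Dict, List


def build_context(documents: List[Dict[str, Any]], weight: float) -> Dict[str, Any]:
    """Collect stored embeddings into a weighted context payload.

    Assumes each document exposes "_tensor_facets", a list of facets that
    each hold an "_embedding" vector.
    """
    tensors = []
    for doc in documents:
        for facet in doc.get("_tensor_facets", []):
            tensors.append({"vector": facet["_embedding"], "weight": weight})
    return {"tensor": tensors}


# Toy documents standing in for the manually reviewed top results.
reviewed_hits = [
    {"_id": "img-1", "_tensor_facets": [{"_embedding": [0.1, 0.9]}]},
    {"_id": "img-2", "_tensor_facets": [{"_embedding": [0.2, 0.8]}]},
]

context = build_context(reviewed_hits, weight=0.5)
print(context)
```

A payload like this would be supplied alongside the weighted text query, so that the search vector is pulled towards the examples already confirmed as unwanted.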

After experimenting with various result limits, we noticed that NSFW images became scarce at a similarity score of roughly 0.79.

Subsequently, we conducted the search and deleted all images surpassing this threshold.
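The filtering step can be sketched as follows, assuming search hits are dictionaries with `_id` and `_score` keys and that the index object exposes a `delete_documents(ids=...)` method, as the Marqo Python client's index object does. The helper names and the toy hits are hypothetical.

```python
from typing import Any, Dict, List

SCORE_THRESHOLD = 0.79  # approximate cut-off found by experimentation


def ids_above_threshold(
    hits: List[Dict[str, Any]], threshold: float = SCORE_THRESHOLD
) -> List[str]:
    """Return the IDs of hits whose similarity score exceeds the threshold."""
    return [hit["_id"] for hit in hits if hit["_score"] > threshold]


def delete_flagged(
    index: Any, hits: List[Dict[str, Any]], threshold: float = SCORE_THRESHOLD
) -> List[str]:
    """Delete every hit above the threshold and return the deleted IDs.

    `index` is assumed to provide delete_documents(ids=...).
    """
    flagged = ids_above_threshold(hits, threshold)
    if flagged:
        index.delete_documents(ids=flagged)
    return flagged


# Toy hits standing in for a real search response.
toy_hits = [
    {"_id": "img-1", "_score": 0.91},
    {"_id": "img-2", "_score": 0.80},
    {"_id": "img-3", "_score": 0.74},
]

print(ids_above_threshold(toy_hits))
```

Keeping the selection separate from the deletion makes it easy to review the flagged IDs before anything is removed.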

The same process was applied to our bizarre AI generated images.

We can clearly see that the density of strange images has been reduced. There are still some oddities; however, these should be some of the strangest left in the entire dataset.

Results of the Process

Through this process, we removed approximately 1,500 problematic images and significantly improved the quality and brand safety of the dataset. This ensured that the ecommerce search experience was not only relevant, but also trustworthy and aligned with real retailer standards.

While this example focused on images, the same approach can be applied to broader catalog quality workflows, including detecting mislabeled products, inconsistent attributes, duplicate listings, and content that damages shopper trust.

Marqo is an AI native ecommerce search and product discovery platform built to improve conversion and revenue. By training a dedicated model on each retailer’s catalog, Marqo helps ecommerce teams improve both discovery performance and the quality of the underlying product data.
