What is Normalized Discounted Cumulative Gain (NDCG)?

NDCG stands for Normalized Discounted Cumulative Gain. It’s a metric that evaluates the relevance of results returned by a search engine or recommendation algorithm while accounting for the order in which they appear: a relevant result counts for more the nearer it is to the top of the list. This makes the metric particularly useful when the goal is to place the most relevant items as close to the top as possible, so that users reach the most useful information quickly.

NDCG is given by the following formula:

$$ NDCG@K = \frac{DCG@K}{IDCG@K} $$

This formula contains Discounted Cumulative Gain (DCG) and Ideal Discounted Cumulative Gain (IDCG). Let’s discuss these two components.

What is Discounted Cumulative Gain (DCG)?

DCG is given by:

$$ DCG@K = \sum^{K}_{i=1}\frac{\text{relevance score of the item at }i}{\log_2(i+1)} $$

This formula contains two key components:

  • Relevance Score: Each result in the ranked list is assigned a relevance score, typically based on how closely it matches the user’s query.
  • Discount Factor: The relevance score of the result at position \(i\) is divided by \(\log_2(i+1)\). This logarithmic discount means that results nearer the top of the list contribute more to the DCG score than those further down.
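To make the discount concrete, here is a minimal Python sketch that prints the weight \(1/\log_2(i+1)\) for the first five positions. The function name `discount` is just an illustrative choice, not part of any library.

```python
import math


def discount(position: int) -> float:
    """Weight applied to the relevance score at a 1-based rank position."""
    return 1.0 / math.log2(position + 1)


# Print the weight for positions 1 through 5.
for position in range(1, 6):
    print(position, round(discount(position), 3))
# 1 1.0
# 2 0.631
# 3 0.5
# 4 0.431
# 5 0.387
```

The first position keeps its full relevance score, while the fifth keeps a little under 40% of it.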

Example Calculation

If you have a list of relevance scores \([3, 2, 3, 0, 1]\) and want to calculate DCG@5:

$$ DCG@5 = \frac{3}{\log_2(1+1)} + \frac{2}{\log_2(2+1)} + \frac{3}{\log_2(3+1)} + \frac{0}{\log_2(4+1)} + \frac{1}{\log_2(5+1)} $$

Evaluating each term gives approximately \(3 + 1.262 + 1.5 + 0 + 0.387 \approx 6.149\), a cumulative score that reflects both relevance and position for the top 5 results.
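The same calculation can be sketched in a few lines of Python. The helper name `dcg_at_k` is an assumption made for illustration; it takes the relevance scores in ranked order and applies the DCG formula given earlier.

```python
import math
from typing import Sequence


def dcg_at_k(relevances: Sequence[float], k: int) -> float:
    """Sum relevance / log2(position + 1) over the top k ranked items."""
    return sum(
        rel / math.log2(position + 1)
        for position, rel in enumerate(relevances[:k], start=1)
    )


print(round(dcg_at_k([3, 2, 3, 0, 1], 5), 3))  # 6.149
```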

What is Ideal Discounted Cumulative Gain (IDCG)?

IDCG is the theoretical maximum DCG that you can achieve for a specific list of results. Essentially, IDCG is calculated by ordering all items in an ideal sequence—where the most relevant results are at the top of the list—and then applying the DCG calculation.

The formula for this can be written as:

$$ IDCG@K = \sum^{K}_{i=1}\frac{\text{relevance score of the ideal item at }i}{\log_2(i+1)} $$

Computing IDCG involves two steps:

  • Ideal Relevance Ordering: First, arrange all items in the ideal order—i.e., with the highest relevance scores at the top, which represents the "perfect" ranking for that query.
  • Apply Discounting by Position: For each position i up to K, divide the relevance score by the logarithmic function \(\log_{2}(i+1)\), which discounts scores as you move down the list.

Example Calculation

If you’re calculating IDCG@5 with relevance scores sorted in ideal order as \([3, 3, 2, 1, 0]\) then:

$$ IDCG@5 = \frac{3}{\log_2(1+1)} + \frac{3}{\log_2(2+1)} + \frac{2}{\log_2(3+1)} + \frac{1}{\log_2(4+1)} + \frac{0}{\log_2(5+1)} $$

Evaluating the sum gives IDCG@5 \(\approx 6.323\). This ideal DCG score provides a benchmark for normalizing DCG to calculate NDCG, allowing you to measure how close a ranked list is to the ideal.
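As a rough sketch, IDCG can be computed by sorting the relevance scores in descending order and reusing the DCG sum. The names `dcg_at_k` and `idcg_at_k` are illustrative, and the sketch assumes the list passed in holds the relevance scores of every judged item for the query.

```python
import math
from typing import Sequence


def dcg_at_k(relevances: Sequence[float], k: int) -> float:
    """Sum relevance / log2(position + 1) over the top k ranked items."""
    return sum(
        rel / math.log2(position + 1)
        for position, rel in enumerate(relevances[:k], start=1)
    )


def idcg_at_k(relevances: Sequence[float], k: int) -> float:
    """DCG of the same scores after sorting them into the ideal order."""
    ideal_order = sorted(relevances, reverse=True)  # highest relevance first
    return dcg_at_k(ideal_order, k)


print(round(idcg_at_k([3, 2, 3, 0, 1], 5), 3))  # 6.323
```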

Example Calculation of NDCG

Now that we've established the two key components of NDCG, let's take a look at an example. Imagine a search query where you return a ranked list of items with the following relevance scores: \([3, 2, 3, 0, 1]\). Let’s say we’re evaluating NDCG@5. We follow these steps:

  1. Calculate DCG@5: Sum the discounted relevance scores of each position up to the 5th item as we did previously.
  2. Calculate IDCG@5: Arrange the relevance scores in ideal order (i.e., \([3, 3, 2, 1, 0]\)) and sum the discounted relevance scores as we did previously.
  3. Compute NDCG@5: Divide DCG@5 by IDCG@5 to normalize the score.

For this example, DCG@5 \(\approx 6.149\) and IDCG@5 \(\approx 6.323\), so NDCG@5 \(\approx 0.972\). The score indicates how close the system came to the ideal ordering within the first five items, with a score of 1 meaning the ranking matches the ideal order. This process can be repeated for different positions (e.g., NDCG@10 or NDCG@100) to gain a broader view of ranking quality at varying list depths.
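The three steps can be put together in a short Python sketch. The function names are illustrative, and returning 0.0 when IDCG is zero (no relevant items at all) is a common convention assumed here rather than something the formula itself defines.

```python
import math
from typing import Sequence


def dcg_at_k(relevances: Sequence[float], k: int) -> float:
    """Sum relevance / log2(position + 1) over the top k ranked items."""
    return sum(
        rel / math.log2(position + 1)
        for position, rel in enumerate(relevances[:k], start=1)
    )


def ndcg_at_k(relevances: Sequence[float], k: int) -> float:
    """DCG of the ranking divided by the DCG of its ideal reordering."""
    dcg = dcg_at_k(relevances, k)                          # step 1
    idcg = dcg_at_k(sorted(relevances, reverse=True), k)   # step 2
    if idcg == 0:
        return 0.0  # assumed convention: nothing relevant to rank
    return dcg / idcg                                      # step 3


print(round(ndcg_at_k([3, 2, 3, 0, 1], 5), 3))  # 0.972
```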

What is NDCG@10?

When working with NDCG, you'll often see it specified with an “@” symbol followed by a number (e.g., NDCG@10, NDCG@100). These indicate the depth of the result list being evaluated. For instance:

  • NDCG@10 evaluates only the top 10 results. This is particularly beneficial for search engines, ecommerce and recommendation engines, like Marqo.
  • NDCG@100 evaluates the top 100 results. This is particularly useful for catalog searches, information retrieval and specialized searches.

These metrics give insight into how well the system ranks relevant results within specific ranges of the results list.
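To see how the cutoff changes the picture, the sketch below scores one ranked list at several depths. The relevance scores are made up for illustration, and `ndcg_at_k` is a hypothetical helper that sorts the full list of scores to obtain the ideal order.

```python
import math
from typing import Sequence


def dcg_at_k(relevances: Sequence[float], k: int) -> float:
    """Sum relevance / log2(position + 1) over the top k ranked items."""
    return sum(
        rel / math.log2(position + 1)
        for position, rel in enumerate(relevances[:k], start=1)
    )


def ndcg_at_k(relevances: Sequence[float], k: int) -> float:
    """NDCG@k, with the ideal order taken over all scores in the list."""
    idcg = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / idcg if idcg > 0 else 0.0


# Hypothetical ranking: the most relevant items only appear from position 3 on.
ranking = [0, 0, 3, 2, 0, 1, 0, 0, 3, 1]

for k in (1, 3, 5, 10):
    print(f"NDCG@{k} = {ndcg_at_k(ranking, k):.3f}")
```

A shallow cutoff penalizes this ranking heavily because nothing relevant appears in the first two positions, while deeper cutoffs give credit for the relevant items found further down.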

Conclusion

NDCG is a powerful metric for understanding the relevance and quality of ordered results, making it a staple for evaluating search engines and recommendation systems. By focusing on both the relevance of items and their positions, NDCG provides a nuanced picture of how well a system meets user needs.

To understand how Marqo can help improve the relevance and quality of your results, book a demo with our team today.

Ellie Sleightholm
Head of Developer Relations at Marqo