NDCG Full Coverage¶
NDCG Full Coverage returns NDCG only when both baseline and comparison periods are covered. Otherwise it returns 0.0.
Formula¶
\[
NDCGFullCoverage = NDCG@K if coverage == 1.0 else 0.0
\]
Inputs¶
query,retrieved_ids,gold_ids, andretrieved_docsbaseline_anchorandcomparison_anchortemporal_focusk(cutoff)
Output¶
- Range: [0, 1], higher is better when full coverage is achieved.
Prompt (LLM mode)¶
You are grading retrieved documents for a temporal trend/change query.
Query: "{query}"
Temporal Focus: {temporal_focus}
Baseline anchor period: {baseline_anchor}
Comparison/current anchor period: {comparison_anchor}
Document:
{document}
Decide:
1) covers_baseline: true if the document contains evidence about the BASELINE anchor period.
2) covers_comparison: true if the document contains evidence about the COMPARISON/current anchor period.
Return ONLY valid JSON:
{"covers_baseline": true/false, "covers_comparison": true/false}
Examples¶
from tempoeval.metrics import NDCGFullCoverage
metric = NDCGFullCoverage()
metric.llm = llm
score = metric.compute(
query="How has X changed since 2015?",
retrieved_ids=["doc1", "doc2"],
gold_ids=["doc1"],
retrieved_docs=["..."],
baseline_anchor="2015",
comparison_anchor="2024",
k=5
)
Synchronous usage
Use compute(...) for sync calls and acompute(...) for async.