Focus Time
Focus Time is the core innovation behind TempoEval. It represents the specific time period(s) that a piece of text is semantically about—not when it was written, but what time it discusses.
The Problem with Traditional Metrics
Traditional text evaluation relies on surface-level text matching or on an LLM judge, and both fall short for temporal content:
Text Overlap (ROUGE, BLEU)
Problem: "2020" and "2021" are only one character apart, but temporally distinct.
LLM-as-a-Judge
- 💸 Expensive: $0.01-$0.10 per evaluation
- 🐌 Slow: 1-5 seconds per query
- 🎲 Non-deterministic: Different results on reruns
- 🔍 Opaque: Hard to debug why something scored low
The Solution: Focus Time
Focus Time converts text into sets of years (e.g., {2020, 2021}), enabling:
- ✅ Fast evaluation with set operations (intersection, union)
- ✅ Cheap computation (no API calls)
- ✅ Deterministic results (same input = same output)
- ✅ Interpretable scores (you can see which years match/mismatch)
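To make the contrast with text overlap concrete, here is a minimal sketch comparing two sentences that differ only in the year. It uses Python's standard `difflib.SequenceMatcher` as a stand-in for a character-level overlap score (it is not ROUGE or BLEU), and the year sets are written by hand rather than extracted by TempoEval.

```python
from difflib import SequenceMatcher

a = "The policy took effect in 2020."
b = "The policy took effect in 2021."

# Character-level similarity: a stand-in for surface overlap, not ROUGE/BLEU.
char_similarity = SequenceMatcher(None, a, b).ratio()

# Hand-written focus times for the two sentences (assumed, not extracted).
years_a = {2020}
years_b = {2021}
shared_years = years_a & years_b

print(f"character similarity: {char_similarity:.2f}")
print(f"shared years: {shared_years}")
```

The surface score is close to 1 even though the two year sets share nothing, which is the gap Focus Time is meant to close.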
Three Types of Focus Time
Extraction Methods
TempoEval supports three extraction methods with different trade-offs:
| Method | Speed | Cost | Accuracy | Best For |
|---|---|---|---|---|
| REGEX | ⚡ Instant | 💰 Free | ✓ Good | Explicit years ("in 2020") |
| TEMPORAL TAGGER | 🔵 Fast | 💰 Free | ✓✓ Better | Complex expressions ("last decade") |
| LLM | 🔴 Slow | 💸 Paid | ✓✓✓ Best | Implicit references ("during COVID") |
Method 1: REGEX (Default)
Fast pattern matching for explicit years + rule-based mappings.
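TempoEval's actual patterns and mapping rules are not shown here, so the following is only a minimal sketch of the idea. The names `YEAR_RE`, `DECADE_RE`, and `extract_years` are hypothetical, the 1000-2099 year range is an assumption, and the decade rule is one assumed example of a rule-based mapping.

```python
import re

# Explicit four-digit years from 1000 to 2099 (an assumed range for this sketch).
YEAR_RE = re.compile(r"\b(1\d{3}|20\d{2})\b")
# Decades written as "1990s" or "2010s" (one assumed rule-based mapping).
DECADE_RE = re.compile(r"\b(1\d{2}0|20\d0)s\b")


def extract_years(text: str) -> set[int]:
    """Return the set of years mentioned explicitly or via a decade expression."""
    years = {int(match) for match in YEAR_RE.findall(text)}
    for match in DECADE_RE.findall(text):
        start = int(match)
        # Expand "1990s" into 1990..1999.
        years.update(range(start, start + 10))
    return years


print(extract_years("Sales rose in 2020 and 2021."))
print(sorted(extract_years("music of the 1990s")))
```

Because the whole step is pattern matching, it needs no network calls and returns the same set for the same input.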
Method 2: TEMPORAL TAGGER
Delegates extraction to an external temporal tagging library. Several taggers are available:
| Tagger | Speed | Install | Description |
|---|---|---|---|
| DATEPARSER ⭐ | Fast | pip install dateparser | RECOMMENDED: pure Python, multi-language |
| PARSEDATETIME | Fast | pip install parsedatetime | English natural language |
| FAST_PARSE_TIME | Ultra-fast | pip install fast-parse-time | Sub-millisecond |
| HEIDELTIME | Very slow | pip install py_heideltime | Java-based (requires Java JDK) |
from tempoeval.core.focus_time import QueryFocusTime, TemporalTagger

extractor = QueryFocusTime()

# Use dateparser (default, recommended - fastest pure Python)
qft = extractor.extract(
    "What happened last decade?",
    use_regex=False,
    use_temporal_tagger=True,
    tagger=TemporalTagger.DATEPARSER
)
print(qft.years)  # Extracted years from dateparser

# Use HeidelTime (slower, requires Java)
qft = extractor.extract(
    "What happened last decade?",
    use_temporal_tagger=True,
    tagger=TemporalTagger.HEIDELTIME
)
Handles:
- Relative expressions: "last year", "next month"
- Fuzzy periods: "early 2000s", "mid-1990s"
- Complex syntax: "between January 2020 and March 2021"
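Relative expressions such as those listed above only map to years once they are anchored to a reference date. The sketch below illustrates that anchoring step using the standard library only. It is not how dateparser or HeidelTime work internally, `resolve_relative` is a hypothetical helper, and reading "last decade" as the previous calendar decade is an assumption.

```python
from datetime import date


def resolve_relative(text, reference=None):
    """Map a few relative expressions to years, anchored at a reference date."""
    ref_year = (reference or date.today()).year
    lowered = text.lower()
    years = set()
    if "last year" in lowered:
        years.add(ref_year - 1)
    if "this year" in lowered:
        years.add(ref_year)
    if "next year" in lowered:
        years.add(ref_year + 1)
    if "last decade" in lowered:
        # Assumed reading: the previous calendar decade (e.g. 2010-2019 from 2024).
        start = (ref_year // 10) * 10 - 10
        years.update(range(start, start + 10))
    return years


anchor = date(2024, 6, 1)
print(resolve_relative("What happened last year?", anchor))
print(sorted(resolve_relative("What happened last decade?", anchor)))
```

Passing the reference date explicitly keeps the result reproducible; falling back to today's date would make the same query resolve differently over time.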
Method 3: LLM
Uses language models to infer implicit temporal references.
from tempoeval.core.focus_time import QueryFocusTime
from tempoeval.llm import OpenAIProvider

llm = OpenAIProvider(model="gpt-4o")
extractor = QueryFocusTime(llm=llm)

qft = extractor.extract(
    "During the Great Depression",
    use_regex=True,
    use_llm=True
)
print(qft.years)  # {1929, 1930, ..., 1939}
Handles:
- Historical events: "during WWII", "Victorian era"
- Cultural references: "the dot-com bubble", "the pandemic"
- Contextual inference: "when Obama was president"
Combining Methods
You can enable several methods at once; the years they extract are merged into a single set:
# Combine REGEX + temporal tagger + LLM for maximum coverage
qft = extractor.extract(
    query,
    use_regex=True,            # Fast explicit extraction
    use_temporal_tagger=True,  # Parse complex expressions
    use_llm=True               # Handle implicit references
)
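Merging the per-method results amounts to a set union. The sketch below shows that step with plain Python sets; `merge_years` is a hypothetical helper and the three input sets are made-up values, not real extractor output.

```python
def merge_years(*year_sets):
    """Union any number of year sets into one set without duplicates."""
    return set().union(*year_sets)


# Made-up per-method results for a single query (illustrative only).
regex_years = {2020}
tagger_years = {2019, 2020}
llm_years = {2020, 2021}

merged = merge_years(regex_years, tagger_years, llm_years)
print(sorted(merged))
```

A year found by more than one method appears once in the merged set, so enabling extra methods can only add years, never remove them.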
Set Operations on Focus Time
Once extracted, Focus Times support standard set operations:
Intersection (Overlap)
qft = FocusTime.from_years({2020, 2021})
dft = FocusTime.from_years({2021, 2022})
overlap = qft & dft # {2021}
print(len(overlap)) # 1 year overlap
Jaccard Similarity
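Jaccard similarity is the size of the intersection divided by the size of the union. The sketch below computes it on plain Python sets rather than on TempoEval's `FocusTime` objects, whose API for this is not shown here; `jaccard` is a hypothetical helper, and returning 0.0 for two empty sets is a convention chosen for this sketch.

```python
def jaccard(a: set, b: set) -> float:
    """Jaccard similarity |a ∩ b| / |a ∪ b| between two sets of years."""
    union = a | b
    if not union:
        return 0.0  # Convention chosen here for two empty sets.
    return len(a & b) / len(union)


qft_years = {2020, 2021}
dft_years = {2021, 2022}
score = jaccard(qft_years, dft_years)
print(f"{score:.3f}")  # 0.333 (1 shared year out of 3 distinct years)
```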
Coverage Check
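One way to read a coverage check is to ask how much of the query's time span a document accounts for. The sketch below uses plain Python sets under that assumed definition; `coverage` and `fully_covers` are hypothetical helpers, not TempoEval functions.

```python
def coverage(query_years: set, doc_years: set) -> float:
    """Fraction of the query's years that also appear in the document."""
    if not query_years:
        return 0.0  # Convention chosen here for a query with no years.
    return len(query_years & doc_years) / len(query_years)


def fully_covers(query_years: set, doc_years: set) -> bool:
    """True when every query year appears in the document (subset test)."""
    return query_years <= doc_years


query_years = {2020, 2021}
doc_years = {2021, 2022}
print(coverage(query_years, doc_years))      # 0.5
print(fully_covers(query_years, doc_years))  # False
```

Unlike Jaccard similarity, this measure is asymmetric: extra years in the document do not lower the score.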
Practical Example
Let's evaluate a retrieval system:
from tempoeval.core import extract_qft, extract_dft
from tempoeval.metrics import TemporalPrecision
# Query
query = "What caused the 2008 financial crisis?"
qft = extract_qft(query) # {2008}
# Retrieved Documents
docs = [
    "The collapse of Lehman Brothers in 2008...",  # Relevant ✅
    "The 1929 Wall Street Crash was...",           # Irrelevant ❌
    "The 2020 pandemic caused...",                 # Irrelevant ❌
]
dfts = [extract_dft(doc) for doc in docs]
# dfts = [{2008}, {1929}, {2020}]
# Evaluate
metric = TemporalPrecision(use_focus_time=True)
score = metric.compute(qft=qft, dfts=dfts, k=3)
print(f"Temporal Precision@3: {score}") # 0.333 (1/3 relevant)
Analysis:
- Doc 1: DFT ∩ QFT = {2008} ∩ {2008} = {2008} ✅ (the only match)
- Doc 2: {1929} ∩ {2008} = ∅ ❌
- Doc 3: {2020} ∩ {2008} = ∅ ❌
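The arithmetic in this analysis can be reproduced with plain sets. The sketch below assumes that a document counts as temporally relevant when its years intersect the query's years, and that the score is hits divided by k; `temporal_precision_at_k` is a hypothetical stand-in, not the library's `TemporalPrecision` implementation.

```python
def temporal_precision_at_k(qft, dfts, k):
    """Fraction of the top-k documents whose years overlap the query's years."""
    if k <= 0:
        return 0.0
    # A document is a hit when its year set shares at least one year with the query.
    hits = sum(1 for dft in dfts[:k] if qft & dft)
    return hits / k


qft = {2008}
dfts = [{2008}, {1929}, {2020}]
score = temporal_precision_at_k(qft, dfts, k=3)
print(f"Temporal Precision@3: {score:.3f}")  # 0.333
```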
Advantages of Focus Time
- 10,000x Faster: set operations vs. LLM API calls (milliseconds vs. seconds)
- 100% Free: no API costs for REGEX and Temporal Tagger methods
- Deterministic: same input always produces same output
- Interpretable: see exactly which years match or mismatch
- Complements LLMs: use Focus Time for speed, LLM for complex edge cases
When to Use Which Method?
| Scenario | Recommended Method | Rationale |
|---|---|---|
| Production at scale (1000s of queries) | REGEX | Instant, free, good enough |
| Benchmarking on datasets | REGEX or TEMPORAL TAGGER | Reproducible, no API keys |
| Complex relative expressions | TEMPORAL TAGGER (dateparser) | Pure Python, fast, handles "last decade" |
| Java environment available | TEMPORAL TAGGER (heideltime) | Most accurate for complex expressions |
| Implicit historical references | LLM | Only method that handles "during COVID" |
| Maximum coverage | REGEX + TAGGER + LLM | All methods combined, years merged |
What's Next?
- Computation Modes - How to use Focus Time, LLM-judge, or Gold labels
- Metrics Overview - See how metrics use Focus Time
- Temporal Precision - Example metric using Focus Time