The cdn module provides a multi-source Content Delivery Network (CDN) management system with automatic fallback, health tracking, a circuit breaker, and priority-based source selection. It keeps data fetching working when individual CDN sources fail, go offline, or slow down.
Overview
The CDN system is part of tif1’s data fetching infrastructure, designed to maximize availability, reliability, and performance when retrieving Formula 1 data from remote sources. It combines circuit breaker logic, automatic failover, health monitoring, and priority-based routing so that your application keeps working under adverse network conditions.
Why CDN Management Matters
Formula 1 data fetching presents unique challenges:
- Data availability: Historical and live session data must be reliably accessible across multiple seasons (2018-2026+)
- Geographic distribution: Users worldwide need fast access regardless of location
- Network resilience: CDN providers can experience outages, rate limiting, or degraded performance
- Bandwidth optimization: Session data can be large (telemetry, lap times, weather data)—minification reduces transfer sizes by 20-40%
- Cost efficiency: Proper CDN selection and fallback minimize redundant requests and bandwidth waste
The tif1 CDN system addresses these challenges through intelligent multi-source management, ensuring your application remains fast and reliable.
Core Capabilities
1. Multi-Source Fallback with Automatic Failover
The CDN manager maintains a prioritized list of CDN sources and automatically tries alternative sources when the primary source fails. This ensures high availability even when individual CDN providers experience issues.
How it works:
- Sources are tried in priority order (lowest priority number first)
- If a source fails, the next available source is tried immediately
- Successful requests reset the failure counter for that source
- Failed sources are temporarily disabled after reaching the failure threshold
Benefits:
- Zero-downtime failover: Automatic switching to backup sources without user intervention
- Transparent recovery: Sources automatically re-enable after successful requests
- Configurable priorities: Define preferred CDN providers based on performance, cost, or geographic location
2. Health Tracking and Circuit Breaker Pattern
Each CDN source is continuously monitored for reliability. The circuit breaker pattern prevents wasting time on consistently failing sources by temporarily disabling them after repeated failures.
Circuit breaker states:
- Closed (healthy): 0-2 failures; the source is available and used normally
- Open (disabled): 3+ failures; the source is excluded from get_sources() until reset
- Half-open (recovering): After a successful request, failure count resets to 0
Failure threshold:
- Default: 3 consecutive failures before disabling
- Configurable via the _max_failures attribute
- Failures are tracked per source in the _failure_counts dictionary
Benefits:
- Reduced latency: Skip known-bad sources instead of waiting for timeouts
- Prevent cascading failures: Isolate failing sources to protect overall system health
- Automatic recovery: Sources re-enable after successful requests (self-healing)
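The counting rules above can be sketched in a few lines. This is a minimal illustration rather than tif1's actual implementation: the FailureTracker class and its is_open() helper are hypothetical names, and only the threshold of 3 and the reset-on-success rule are taken from the description above.

```python
from collections import defaultdict


class FailureTracker:
    """Hypothetical per-source counter of consecutive failures."""

    def __init__(self, max_failures: int = 3) -> None:
        self.max_failures = max_failures
        self.counts: dict[str, int] = defaultdict(int)

    def mark_failure(self, name: str) -> None:
        # Each failed request adds one to the source's counter.
        self.counts[name] += 1

    def mark_success(self, name: str) -> None:
        # Any success clears the counter for that source.
        self.counts[name] = 0

    def is_open(self, name: str) -> bool:
        # "Open" means the source should be skipped.
        return self.counts[name] >= self.max_failures


tracker = FailureTracker()
tracker.mark_failure("jsDelivr")
tracker.mark_failure("jsDelivr")
still_closed = not tracker.is_open("jsDelivr")  # 2 failures: still usable
tracker.mark_failure("jsDelivr")
now_open = tracker.is_open("jsDelivr")          # 3 failures: skipped
tracker.mark_success("jsDelivr")
recovered = not tracker.is_open("jsDelivr")     # counter cleared
```

Keeping the counter per source means one failing provider never affects how the others are treated.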
3. Priority-Based Routing
Sources are assigned priority levels (integer values, lower = higher priority) that determine the order in which they’re tried. This allows you to:
- Optimize for performance: Set fastest CDN as priority 1
- Optimize for cost: Set free/unlimited CDN as priority 1, paid CDN as backup
- Optimize for geography: Set regional CDN as priority 1 for users in that region
- Implement tiered fallback: Primary (priority 1) → Regional backup (priority 2) → Global backup (priority 3)
Priority behavior:
- Sources are sorted by priority (ascending order: 1, 2, 3, …)
- Sources with the same priority maintain their insertion order
- Priority can be set when creating CDNSource objects
- Sources are automatically re-sorted when added via add_source()
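The ordering rules above match what a stable sort on the priority field produces. The sketch below uses a hypothetical Source stand-in rather than the real CDNSource, and assumes re-sorting is done with Python's built-in stable sort.

```python
from dataclasses import dataclass


@dataclass
class Source:
    """Hypothetical stand-in for CDNSource with only the fields needed here."""
    name: str
    priority: int


sources = [Source("default", 1), Source("backup", 3)]

# Add a source that shares priority 1 with the existing default.
sources.append(Source("regional", 1))

# list.sort is stable: equal priorities keep their insertion order.
sources.sort(key=lambda s: s.priority)

order = [s.name for s in sources]
```

Because equal priorities keep insertion order, a newly added source only moves ahead of an existing one if its priority number is strictly lower.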
4. JSON Minification Support
Optional JSON minification reduces file sizes by 20-40% by removing whitespace and formatting. This significantly improves download speeds and reduces bandwidth costs, especially for large telemetry datasets.
How it works:
- When use_minification=True, URLs are transformed: file.json → file.min.json
- jsDelivr CDN automatically serves minified versions when available
- Minification is transparent—parsed data is identical to non-minified versions
- Can be enabled globally via config or per-source
Performance impact:
- Telemetry data: ~35-40% size reduction (highly structured, repetitive data)
- Lap data: ~25-30% size reduction
- Session metadata: ~20-25% size reduction
- Network transfer time: Roughly proportional to payload size (a 40% smaller file takes about 40% less time to transfer; connection setup and latency are unaffected)
When to enable:
- Bandwidth-constrained environments: Mobile networks, metered connections
- High-volume applications: Fetching data for multiple sessions/drivers
- Performance-critical applications: Minimizing cold-start latency
- Cost optimization: Reducing CDN bandwidth costs
5. Configurable Custom Sources
Add custom CDN endpoints to support:
- Private mirrors: Host your own copy of F1 data for guaranteed availability
- Regional CDNs: Optimize performance for specific geographic regions
- Corporate proxies: Route requests through internal infrastructure
- Development/testing: Point to local servers or staging environments
- Backup sources: Add redundant sources for mission-critical applications
Configuration methods:
- Programmatic: Use the add_source() method to add sources at runtime
- Config file: Define sources in the ~/.tif1rc JSON configuration
- Environment variable: Set TIF1_CDNS to a comma-separated list
6. Singleton Pattern for Global State
The CDN manager uses a singleton pattern to ensure consistent state across your entire application:
- Single source of truth: All data fetching operations use the same CDN manager instance
- Shared health tracking: Failure counts and circuit breaker state are global
- Consistent configuration: Configuration changes affect all subsequent requests
- Thread-safe reads: Multiple threads can safely read CDN sources
- Memory efficient: Only one CDN manager instance exists per process
7. Transparent Integration
The CDN system works seamlessly behind the scenes—you typically don’t need to interact with it directly:
- Automatic initialization: CDN manager is created on first use
- Zero-configuration default: Works out-of-the-box with jsDelivr CDN
- Session API integration: Session.load() automatically uses the CDN system
- Error handling: Network errors are caught and trigger automatic fallback
- Logging: Debug-level logs show which CDN source is being used
Architecture
The CDN system consists of three main components working together:
1. CDNSource (Data Class)
Represents a single CDN endpoint with configuration:
@dataclass
class CDNSource:
    name: str  # Human-readable identifier
    base_url: str  # HTTPS base URL (e.g., "https://cdn.jsdelivr.net/gh/TracingInsights")
    priority: int = 0  # Priority level (lower = higher priority)
    enabled: bool = True  # Whether source is currently enabled
    use_minification: bool = False  # Enable JSON minification (.json → .min.json)
Responsibilities:
- Store CDN configuration (URL, priority, settings)
- Format complete URLs for specific resources via format_url()
- Support minification by transforming file paths
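As a rough illustration of the URL formatting and minification responsibilities, here is a standalone sketch. The function name and the path layout (base/year@main/gp/session/file) are assumptions inferred from the example URL shown in the get_sources() section of this reference; the real format_url() may differ.

```python
def format_url_sketch(
    base_url: str,
    year: int,
    gp: str,
    session: str,
    path: str,
    use_minification: bool = False,
) -> str:
    """Build a resource URL; layout is an assumption, not tif1's confirmed scheme."""
    # Rewrite file.json -> file.min.json only when minification is requested
    # and the path is not already minified.
    if use_minification and path.endswith(".json") and not path.endswith(".min.json"):
        path = path[: -len(".json")] + ".min.json"
    return f"{base_url.rstrip('/')}/{year}@main/{gp}/{session}/{path}"


plain = format_url_sketch(
    "https://cdn.jsdelivr.net/gh/TracingInsights",
    2021, "Belgian%20Grand%20Prix", "Race", "drivers.json",
)
minified = format_url_sketch(
    "https://cdn.jsdelivr.net/gh/TracingInsights",
    2021, "Belgian%20Grand%20Prix", "Race", "drivers.json",
    use_minification=True,
)
```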
2. CDNManager (Orchestrator)
Manages multiple CDN sources and implements fallback logic:
class CDNManager:
    sources: list[CDNSource]  # All configured sources (sorted by priority)
    _failure_counts: dict[str, int]  # Failure tracking per source
    _max_failures: int = 3  # Circuit breaker threshold
Responsibilities:
- Initialize sources from configuration (file, env vars, defaults)
- Validate source URLs (HTTPS only, block raw.githubusercontent.com)
- Track health via failure counters (circuit breaker pattern)
- Provide available sources via get_sources() (filters disabled/failed sources)
- Implement fallback logic via try_sources() (tries sources in priority order)
- Manage source lifecycle (add, enable, disable, reset)
3. Global Singleton
A single CDNManager instance shared across the application:
_cdn_manager = CDNManager()  # Module-level singleton

def get_cdn_manager() -> CDNManager:
    """Get global CDN manager instance."""
    return _cdn_manager
Responsibilities:
- Ensure consistent CDN state across all data fetching operations
- Provide global access point via the get_cdn_manager() function
- Initialize once on first import, reuse for all subsequent calls
Data Flow
Here’s how a typical data fetch flows through the CDN system:
1. Application calls Session.load()
        ↓
2. Session requests data (e.g., laps.json) from HTTP layer
        ↓
3. HTTP layer calls cdn.try_sources(year, gp, session, path, fetch_func)
        ↓
4. CDN manager gets available sources via get_sources()
   - Filters out disabled sources (enabled=False)
   - Filters out failed sources (failure_count >= 3)
   - Returns sources sorted by priority
        ↓
5. For each source (in priority order):
   a. Format URL via source.format_url() (applies minification if enabled)
   b. Call fetch_func(url) to attempt HTTP request
   c. If successful:
      - Call mark_success(source_name) to reset failure count
      - Return fetched data (done!)
   d. If DataNotFoundError (404):
      - Re-raise immediately (data doesn't exist, no point trying other sources)
   e. If other error (network, timeout, 5xx):
      - Call mark_failure(source_name) to increment failure count
      - Log warning and try next source
        ↓
6. If all sources fail:
   - Raise NetworkError with details
        ↓
7. If any source succeeds:
   - Return data to Session
   - Session parses and caches data
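Steps 5 and 6 of the flow above can be sketched as a plain loop. This is a simplified illustration under stated assumptions, not the library's code: the two exception classes are local stand-ins for the DataNotFoundError and NetworkError named in the flow, try_sources_sketch takes pre-formatted URLs instead of CDNSource objects, and flaky_fetch is a made-up fetch function.

```python
from typing import Callable


class DataNotFoundError(Exception):
    """Local stand-in for a 404 'data does not exist' error."""


class NetworkError(Exception):
    """Local stand-in for the 'all sources failed' error."""


def try_sources_sketch(
    urls: dict[str, str],
    fetch: Callable[[str], bytes],
    failures: dict[str, int],
) -> bytes:
    """Try each (name -> url) entry in order and return the first success."""
    errors = []
    for name, url in urls.items():
        try:
            data = fetch(url)
        except DataNotFoundError:
            # Missing data is the same on every mirror, so stop immediately.
            raise
        except Exception as exc:
            failures[name] = failures.get(name, 0) + 1  # mark_failure
            errors.append(f"{name}: {exc}")
            continue
        failures[name] = 0  # mark_success
        return data
    raise NetworkError("all sources failed: " + "; ".join(errors))


def flaky_fetch(url: str) -> bytes:
    """Hypothetical fetcher: the primary host always times out."""
    if "primary" in url:
        raise TimeoutError("timed out")
    return b"{}"


failures: dict[str, int] = {}
result = try_sources_sketch(
    {
        "primary": "https://primary.example.com/laps.json",
        "backup": "https://backup.example.com/laps.json",
    },
    flaky_fetch,
    failures,
)
```

The primary source records one failure, and the backup returns the payload without the caller seeing an error.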
When to Use This API
Most users never need to interact with the CDN API directly—it works transparently behind the scenes. However, you should use this API when:
Direct Interaction Scenarios
- Adding custom CDN sources: You have a private mirror, regional CDN, or backup source

    from tif1.cdn import get_cdn_manager, CDNSource
    cdn = get_cdn_manager()
    cdn.add_source(CDNSource(name="Private Mirror", base_url="https://mirror.example.com/f1", priority=1))

- Monitoring CDN health: You want to track which sources are failing or build health dashboards

    cdn = get_cdn_manager()
    for source in cdn.sources:
        failures = cdn._failure_counts[source.name]
        print(f"{source.name}: {failures}/3 failures")

- Debugging network issues: You need to understand which CDN is being used or why requests are failing

    import logging
    logging.getLogger("tif1.cdn").setLevel(logging.DEBUG)  # Enable debug logs
    # Logs will show: "Trying CDN: jsDelivr - https://cdn.jsdelivr.net/..."

- Optimizing bandwidth: You want to enable minification for faster downloads

    import tif1
    config = tif1.get_config()
    config.set("cdn_use_minification", True)

- Testing fallback behavior: You want to verify your application handles CDN failures gracefully

    cdn = get_cdn_manager()
    # Simulate failures to test fallback
    for _ in range(3):
        cdn.mark_failure("jsDelivr")
    # Verify backup source is used
    assert len(cdn.get_sources()) > 0

- Implementing custom retry logic: You’re building a custom data fetching layer

    cdn = get_cdn_manager()
    data = cdn.try_sources(2021, "Belgian%20Grand%20Prix", "Race", "laps.json", my_fetch_func)

- Recovering from widespread failures: Network was down, now you want to reset all sources

    cdn = get_cdn_manager()
    cdn.reset()  # Reset all failure counts to 0
Configuration Scenarios
- Setting up regional CDNs: Optimize performance for specific geographic regions
- Implementing cost optimization: Use free CDN as primary, paid CDN as backup
- Corporate environments: Route through internal proxies or mirrors
- Development/testing: Point to local servers or staging environments
- High-availability requirements: Add multiple backup sources for mission-critical applications
Never use raw.githubusercontent.com as a CDN source. GitHub’s raw content URLs have strict rate limits (60 requests/hour for unauthenticated requests) and are explicitly blocked by tif1 during initialization. Use jsDelivr or another proper CDN service instead.
Why it’s blocked:
- Rate limits: 60 requests/hour (unauthenticated) or 5,000/hour (authenticated)
- No caching: Every request hits GitHub’s servers directly
- No global CDN: Slower performance for international users
- Terms of service: Not intended for CDN usage
Recommended alternatives:
- jsDelivr (default): Unlimited bandwidth, global CDN, automatic caching
- Cloudflare CDN: Fast global network, generous free tier
- Custom mirror: Host your own copy for guaranteed availability
The default CDN source is jsDelivr (https://cdn.jsdelivr.net/gh/TracingInsights), which provides:
- Excellent global performance: 800+ CDN locations worldwide
- Unlimited bandwidth: No rate limits or bandwidth caps
- Automatic caching: Intelligent cache invalidation and purging
- HTTP/2 and HTTP/3 support: Modern protocols for faster transfers
- Minification support: Automatic .min.json serving
- 99.9% uptime SLA: Enterprise-grade reliability
For most users, the default configuration is optimal and requires no changes.
Quick Start
For most use cases, you don’t need to interact with the CDN API directly—it works transparently behind the scenes. However, here are common operations for advanced use cases:
Basic Usage (Transparent)
import tif1
# CDN system works automatically - no configuration needed
session = tif1.get_session(2021, "Belgian Grand Prix", "Race")
session.load() # CDN automatically handles data fetching with fallback
# That's it! The CDN manager:
# 1. Tries jsDelivr (default, priority 1)
# 2. Falls back to other sources if jsDelivr fails
# 3. Tracks health and disables failing sources
# 4. Automatically recovers when sources become healthy
Inspecting CDN Status
from tif1.cdn import get_cdn_manager
# Get the global CDN manager singleton
cdn = get_cdn_manager()
# Check which sources are available
print(f"Available sources: {len(cdn.get_sources())}/{len(cdn.sources)}")
for source in cdn.get_sources():
    failures = cdn._failure_counts[source.name]
    print(f" {source.name}:")
    print(f" Priority: {source.priority}")
    print(f" Health: {failures}/3 failures")
    print(f" Minification: {'enabled' if source.use_minification else 'disabled'}")
    print(f" URL: {source.base_url}")
Adding Custom CDN Sources
from tif1.cdn import get_cdn_manager, CDNSource
cdn = get_cdn_manager()
# Add a regional CDN for better performance
regional_cdn = CDNSource(
    name="Europe CDN",
    base_url="https://eu.cdn.example.com/f1data",
    priority=0,  # Sorts ahead of the default (priority 1); an equal priority would keep insertion order and be tried after it
    enabled=True,
    use_minification=True
)
cdn.add_source(regional_cdn)
# Add a backup CDN with lower priority
backup_cdn = CDNSource(
    name="Backup CDN",
    base_url="https://backup.example.com/f1data",
    priority=3,  # Lower priority (tries last)
    enabled=True,
    use_minification=False
)
cdn.add_source(backup_cdn)
# Verify sources were added
print(f"Total sources: {len(cdn.sources)}")
for source in cdn.get_sources():
    print(f" Priority {source.priority}: {source.name}")
Enabling Minification
import tif1
# Enable minification globally for all sources
config = tif1.get_config()
config.set("cdn_use_minification", True)
config.save() # Persist to ~/.tif1rc
# Or enable per-source when adding custom CDNs
from tif1.cdn import get_cdn_manager, CDNSource
cdn = get_cdn_manager()
cdn.add_source(CDNSource(
    name="Fast CDN",
    base_url="https://fast.cdn.example.com/f1",
    priority=1,
    use_minification=True  # Only this source uses minification
))
Monitoring and Recovery
from tif1.cdn import get_cdn_manager
cdn = get_cdn_manager()
# Check health status
def check_cdn_health():
    available = len(cdn.get_sources())
    total = len(cdn.sources)
    health_pct = (available / total * 100) if total > 0 else 0
    print(f"CDN Health: {health_pct:.1f}% ({available}/{total} sources available)")
    for source in cdn.sources:
        failures = cdn._failure_counts[source.name]
        status = "🟢 Healthy" if failures == 0 else f"🟡 {failures} failures"
        if failures >= 3:
            status = "🔴 Disabled"
        print(f" {source.name}: {status}")

check_cdn_health()
# Reset all sources after network recovery
cdn.reset()
print("All CDN sources reset to healthy state")
Configuration via File
Instead of programmatic configuration, you can define CDN sources in ~/.tif1rc:
{
"cdns": [
"https://cdn.jsdelivr.net/gh/TracingInsights",
"https://eu.cdn.example.com/f1data",
"https://backup.cdn.example.com/f1data"
],
"cdn_use_minification": true
}
Configuration via Environment Variables
# Set CDN sources (comma-separated)
export TIF1_CDNS="https://cdn.jsdelivr.net/gh/TracingInsights,https://backup.cdn.example.com/f1"
# Enable minification
export TIF1_CDN_USE_MINIFICATION=true
# Run your application
python my_f1_app.py
API Reference
Module-Level Functions
get_cdn_manager()
def get_cdn_manager() -> CDNManager
Returns the global CDN manager singleton instance. This function always returns the same CDNManager object, ensuring consistent CDN state across your entire application.
Why Singleton Pattern?
The singleton pattern is used because:
- Shared health tracking: CDN health monitoring must be consistent across all data fetching operations
- Global configuration: Configuration changes should affect all subsequent requests
- Consistent failure counts: Circuit breaker state must be global to prevent redundant retries
- Memory efficiency: Only one CDN manager instance exists per process
- Thread safety: Single instance simplifies concurrent access patterns
Returns:
CDNManager: The global singleton instance
Thread Safety:
The CDN manager is thread-safe for read operations (getting sources, checking health). However, modifying sources or marking failures should be done from a single thread or with appropriate synchronization:
- Thread-safe operations: get_sources(), reading sources list, reading _failure_counts
- Not thread-safe: add_source(), mark_failure(), mark_success(), reset()
For multi-threaded applications, consider:
- Configure all sources during initialization (single-threaded)
- Only read sources during concurrent execution
- Use locks if you need to modify sources from multiple threads
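If mutation from several threads cannot be avoided, one option is to guard every write with a lock. The sketch below is generic: LockedFailureCounter is a hypothetical wrapper, not part of tif1, and it only demonstrates the locking pattern around a shared counter.

```python
import threading


class LockedFailureCounter:
    """Hypothetical counter whose reads and writes share one lock."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._counts: dict[str, int] = {}

    def mark_failure(self, name: str) -> None:
        # The read-modify-write happens atomically under the lock.
        with self._lock:
            self._counts[name] = self._counts.get(name, 0) + 1

    def get(self, name: str) -> int:
        with self._lock:
            return self._counts.get(name, 0)


counter = LockedFailureCounter()


def worker() -> None:
    for _ in range(1000):
        counter.mark_failure("jsDelivr")


threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

total = counter.get("jsDelivr")  # 4 threads x 1000 increments
```

The same pattern applies to calls on the shared manager: take one application-level lock before calling add_source(), mark_failure(), mark_success(), or reset().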
Example: Basic Usage
from tif1.cdn import get_cdn_manager
# Get the singleton instance
cdn = get_cdn_manager()
# All calls return the same instance
cdn2 = get_cdn_manager()
assert cdn is cdn2 # True - same object reference
# Check active sources
active_sources = cdn.get_sources()
print(f"Active CDN sources: {len(active_sources)}")
# Inspect each source
for source in active_sources:
    print(f"\n{source.name}:")
    print(f" URL: {source.base_url}")
    print(f" Priority: {source.priority}")
    print(f" Enabled: {source.enabled}")
    print(f" Minification: {source.use_minification}")
Example: Health Dashboard
from tif1.cdn import get_cdn_manager
def create_cdn_dashboard():
    """Create a comprehensive CDN health dashboard."""
    cdn = get_cdn_manager()
    print("=" * 70)
    print("CDN HEALTH DASHBOARD")
    print("=" * 70)
    # Overall statistics
    total_sources = len(cdn.sources)
    available_sources = len(cdn.get_sources())
    disabled_sources = total_sources - available_sources
    # Guard against division by zero when no sources are configured
    available_pct = (available_sources / total_sources * 100) if total_sources else 0.0
    print(f"\nOverall Status:")
    print(f" Total Sources: {total_sources}")
    print(f" Available: {available_sources} ({available_pct:.1f}%)")
    print(f" Disabled: {disabled_sources}")
    # Per-source details
    print(f"\nSource Details:")
    print(f"{'Name':<20} {'Priority':<10} {'Failures':<10} {'Status':<15} {'Minification':<15}")
    print("-" * 70)
    for source in cdn.sources:
        failures = cdn._failure_counts[source.name]
        # Determine status
        if not source.enabled:
            status = "🔴 Disabled"
        elif failures >= cdn._max_failures:
            status = "🔴 Circuit Open"
        elif failures == 0:
            status = "🟢 Healthy"
        elif failures == 1:
            status = "🟡 Degraded"
        else:
            status = "🟠 Critical"
        minification = "✓ Enabled" if source.use_minification else "✗ Disabled"
        print(f"{source.name:<20} {source.priority:<10} {failures}/{cdn._max_failures:<7} {status:<15} {minification:<15}")
    print("=" * 70)

# Run dashboard
create_cdn_dashboard()
Example: Singleton Verification
from tif1.cdn import get_cdn_manager, CDNSource
# Verify singleton behavior
cdn1 = get_cdn_manager()
cdn2 = get_cdn_manager()
# Same object reference
assert cdn1 is cdn2
print(f"Same instance: {cdn1 is cdn2}") # True
# Modifications affect all references
cdn1.add_source(CDNSource(name="Test", base_url="https://test.com", priority=10))
# Both references see the change
print(f"cdn1 sources: {len(cdn1.sources)}")
print(f"cdn2 sources: {len(cdn2.sources)}") # Same count
# Failure tracking is shared
cdn1.mark_failure("jsDelivr")
print(f"cdn1 failures: {cdn1._failure_counts['jsDelivr']}")
print(f"cdn2 failures: {cdn2._failure_counts['jsDelivr']}") # Same count
CDNManager Class
The CDNManager class is the core orchestrator of the multi-source CDN system. It maintains a list of CDN sources, tracks their health, implements fallback logic with circuit breaker patterns, and provides methods for managing sources.
Initialization
The CDN manager is automatically initialized when you call get_cdn_manager() for the first time. During initialization, the following steps occur:
Initialization Sequence:
- Load configuration: Reads CDN sources from config file (~/.tif1rc) or environment variables (TIF1_CDNS)
- Parse source URLs: Splits comma-separated CDN URLs into individual sources
- Validate sources: Ensures all URLs are HTTPS and not blacklisted (e.g., raw.githubusercontent.com)
- Create CDNSource objects: Wraps each URL in a CDNSource with priority and settings
- Assign priorities: Sources are assigned priorities based on their order (1, 2, 3, …)
- Initialize health tracking: Sets up _failure_counts dictionary with all sources at 0 failures
- Sort by priority: Orders sources so highest-priority (lowest number) is tried first
- Fallback to defaults: If no valid sources found, uses jsDelivr as default
Configuration Sources (in order of precedence):
- Environment variable: TIF1_CDNS (comma-separated list of HTTPS URLs)
- Config file: ~/.tif1rc in home directory or current directory (if TIF1_TRUST_CWD_CONFIG=true)
- Default: https://cdn.jsdelivr.net/gh/TracingInsights (jsDelivr CDN)
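The precedence order above can be pictured as a short lookup chain. This is a sketch under stated assumptions, not the library's loader: resolve_cdn_urls is a hypothetical helper, the environment is passed in as a dict for testability, and the "cdns" key follows the config file example in this reference.

```python
import json
from pathlib import Path

DEFAULT_CDN = "https://cdn.jsdelivr.net/gh/TracingInsights"


def resolve_cdn_urls(env: dict[str, str], config_path: Path) -> list[str]:
    """Return CDN URLs from env var, then config file, then the default."""
    # 1. Environment variable wins when it is set and non-empty.
    raw = env.get("TIF1_CDNS", "").strip()
    if raw:
        return [u.strip() for u in raw.split(",") if u.strip()]
    # 2. Otherwise fall back to the "cdns" list in the JSON config file.
    if config_path.is_file():
        cdns = json.loads(config_path.read_text()).get("cdns", [])
        if cdns:
            return list(cdns)
    # 3. Finally use the built-in default.
    return [DEFAULT_CDN]


from_env = resolve_cdn_urls(
    {"TIF1_CDNS": "https://a.example.com, https://b.example.com"},
    Path("/nonexistent/.tif1rc"),
)
from_default = resolve_cdn_urls({}, Path("/nonexistent/.tif1rc"))
```

In real use the first argument would be os.environ and the second Path.home() / ".tif1rc".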
Validation Rules:
- URLs must start with https:// (HTTP is rejected for security)
- URLs containing raw.githubusercontent.com are rejected (rate limits)
- Invalid URLs are logged as warnings and skipped
- If all URLs are invalid, falls back to default jsDelivr CDN
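A validator that follows these rules might look like the sketch below. The is_valid_cdn_url helper and BLOCKED_HOSTS tuple are hypothetical names; the checks themselves (HTTPS scheme, blocked host substring, fallback to the default) mirror the rules listed above.

```python
from urllib.parse import urlparse

DEFAULT_CDN = "https://cdn.jsdelivr.net/gh/TracingInsights"
BLOCKED_HOSTS = ("raw.githubusercontent.com",)


def is_valid_cdn_url(url: str) -> bool:
    """Accept only HTTPS URLs with a host that is not on the block list."""
    parsed = urlparse(url)
    if parsed.scheme != "https" or not parsed.netloc:
        return False
    return not any(host in url for host in BLOCKED_HOSTS)


candidates = [
    "https://cdn.jsdelivr.net/gh/TracingInsights",  # valid
    "http://insecure.cdn.com/f1",                   # rejected: not HTTPS
    "https://raw.githubusercontent.com/user/repo",  # rejected: blocked host
    "https://backup.cdn.com/f1",                    # valid
    "not-a-url",                                    # rejected: no scheme or host
]

valid = [u for u in candidates if is_valid_cdn_url(u)]

# If nothing survives validation, fall back to the default source.
effective = valid or [DEFAULT_CDN]
```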
Example: Initialization via Environment Variable
import os
# Set CDN sources before importing tif1
os.environ["TIF1_CDNS"] = "https://cdn.jsdelivr.net/gh/TracingInsights,https://backup.cdn.com/f1"
os.environ["TIF1_CDN_USE_MINIFICATION"] = "true"
# Now import and use
from tif1.cdn import get_cdn_manager
cdn = get_cdn_manager()
print(f"Initialized with {len(cdn.sources)} sources")
for source in cdn.sources:
    print(f" {source.name} (priority {source.priority})")
Example: Initialization via Config File
Create ~/.tif1rc:
{
"cdns": [
"https://cdn.jsdelivr.net/gh/TracingInsights",
"https://eu.cdn.example.com/f1data",
"https://us.cdn.example.com/f1data"
],
"cdn_use_minification": true
}
Then use in Python:
from tif1.cdn import get_cdn_manager
# Automatically loads from ~/.tif1rc
cdn = get_cdn_manager()
print(f"Loaded {len(cdn.sources)} sources from config:")
for source in cdn.sources:
    print(f" Priority {source.priority}: {source.name}")
    print(f" URL: {source.base_url}")
    print(f" Minification: {source.use_minification}")
Example: Validation Behavior
import os
# Mix of valid and invalid URLs (set before importing tif1.cdn so the manager reads them)
os.environ["TIF1_CDNS"] = ",".join([
    "https://cdn.jsdelivr.net/gh/TracingInsights",  # Valid
    "http://insecure.cdn.com/f1",  # Invalid (HTTP)
    "https://raw.githubusercontent.com/user/repo",  # Invalid (blocked)
    "https://backup.cdn.com/f1",  # Valid
    "not-a-url",  # Invalid (malformed)
])
from tif1.cdn import get_cdn_manager
cdn = get_cdn_manager()
# Only valid sources are loaded
print(f"Valid sources: {len(cdn.sources)}")  # 2 (jsDelivr and backup)
for source in cdn.sources:
    print(f" {source.name}: {source.base_url}")
# Check logs for warnings about invalid URLs
# WARNING: Skipping invalid CDN URL: http://insecure.cdn.com/f1
# WARNING: Skipping unsupported CDN URL: https://raw.githubusercontent.com/user/repo
Attributes
sources
List of all configured CDN sources, sorted by priority (lowest priority number first). This includes both enabled and disabled sources.
Characteristics:
- Always sorted: Maintained in priority order (1, 2, 3, …)
- Includes all sources: Both enabled=True and enabled=False sources
- Includes failed sources: Sources that have reached the failure threshold
- Mutable: Can be modified directly, but prefer using add_source() for automatic sorting
Use Cases:
- Iterate over all sources (including disabled) for health reporting
- Count total configured sources
- Inspect source configuration (URLs, priorities, settings)
- Debug CDN setup
Example: Inspecting All Sources
from tif1.cdn import get_cdn_manager
cdn = get_cdn_manager()
print(f"Total configured sources: {len(cdn.sources)}")
print(f"Available sources: {len(cdn.get_sources())}")
print()
for i, source in enumerate(cdn.sources, 1):
    status = "enabled" if source.enabled else "disabled"
    failures = cdn._failure_counts[source.name]
    circuit_status = "open (disabled)" if failures >= cdn._max_failures else "closed (active)"
    print(f"{i}. {source.name} ({status})")
    print(f" URL: {source.base_url}")
    print(f" Priority: {source.priority}")
    print(f" Minification: {source.use_minification}")
    print(f" Failures: {failures}/{cdn._max_failures}")
    print(f" Circuit breaker: {circuit_status}")
    print()
Example: Filtering Sources
from tif1.cdn import get_cdn_manager
cdn = get_cdn_manager()
# Get only enabled sources
enabled_sources = [s for s in cdn.sources if s.enabled]
print(f"Enabled sources: {len(enabled_sources)}")
# Get only sources with minification
minified_sources = [s for s in cdn.sources if s.use_minification]
print(f"Sources with minification: {len(minified_sources)}")
# Get sources by priority
high_priority = [s for s in cdn.sources if s.priority == 1]
print(f"High-priority sources: {len(high_priority)}")
# Get healthy sources (no failures)
healthy_sources = [s for s in cdn.sources if cdn._failure_counts[s.name] == 0]
print(f"Healthy sources: {len(healthy_sources)}")
_failure_counts
_failure_counts: dict[str, int]
Internal dictionary tracking consecutive failure counts for each CDN source. When a source’s failure count reaches _max_failures (default: 3), it’s automatically excluded from get_sources() until reset.
Structure:
{
"jsDelivr": 0, # Healthy
"Backup CDN": 2, # Degraded (2 failures)
"Regional CDN": 3, # Disabled (circuit breaker open)
}
Behavior:
- Initialized to 0: All sources start with 0 failures (healthy state)
- Incremented on failure: mark_failure() increments the count
- Reset on success: mark_success() resets to 0
- Circuit breaker threshold: Sources with count >= _max_failures are excluded
- Persistent across requests: Counts persist for the lifetime of the CDN manager instance
Note: This is an internal attribute. Use the following methods instead of modifying directly:
- mark_failure(source_name) to increment
- mark_success(source_name) to reset
- reset() to reset all sources
Example: Monitoring Failure Counts
from tif1.cdn import get_cdn_manager
cdn = get_cdn_manager()
def monitor_failures():
    """Monitor failure counts for all sources."""
    print("CDN Failure Monitoring:")
    print(f"{'Source':<20} {'Failures':<10} {'Status':<15}")
    print("-" * 45)
    for source in cdn.sources:
        failures = cdn._failure_counts[source.name]
        if failures == 0:
            status = "🟢 Healthy"
        elif failures < cdn._max_failures:
            status = f"🟡 {failures} failures"
        else:
            status = "🔴 Disabled"
        print(f"{source.name:<20} {failures}/{cdn._max_failures:<7} {status:<15}")

# Initial state
monitor_failures()
# Simulate some failures
cdn.mark_failure("jsDelivr")
cdn.mark_failure("jsDelivr")
print("\nAfter 2 failures:")
monitor_failures()
# Successful request resets
cdn.mark_success("jsDelivr")
print("\nAfter successful request:")
monitor_failures()
Example: Failure Count Persistence
from tif1.cdn import get_cdn_manager
cdn = get_cdn_manager()
# Simulate failures
for i in range(2):
    cdn.mark_failure("jsDelivr")
    print(f"Failure {i+1}: count = {cdn._failure_counts['jsDelivr']}")

# Failure count persists across function calls
def check_health():
    cdn = get_cdn_manager()  # Same singleton instance
    return cdn._failure_counts["jsDelivr"]

print(f"Persistent count: {check_health()}") # Still 2
# Reset to clear
cdn.reset()
print(f"After reset: {cdn._failure_counts['jsDelivr']}") # 0
_max_failures
Maximum number of consecutive failures before a CDN source is automatically disabled (circuit breaker threshold). This implements a circuit breaker pattern to avoid wasting time on consistently failing sources.
Default Value: 3 consecutive failures
Circuit Breaker States:
- Closed (healthy): failure_count < _max_failures → Source is available
- Open (disabled): failure_count >= _max_failures → Source is excluded from get_sources()
- Half-open (recovering): After mark_success(), count resets to 0 → Source becomes available again
Why 3 Failures?
The default threshold of 3 provides a good balance:
- Not too sensitive: Tolerates transient network issues (1-2 temporary failures)
- Not too lenient: Quickly disables consistently failing sources (3 failures = clear pattern)
- Fast recovery: Single successful request resets the counter
Customization:
You can modify this value if needed:
from tif1.cdn import get_cdn_manager
cdn = get_cdn_manager()
# More lenient (tolerate more failures)
cdn._max_failures = 5
# More aggressive (disable faster)
cdn._max_failures = 2
# Disable circuit breaker (never auto-disable)
cdn._max_failures = float('inf') # Not recommended
Example: Circuit Breaker Behavior
from tif1.cdn import get_cdn_manager
cdn = get_cdn_manager()
print(f"Circuit breaker threshold: {cdn._max_failures} failures\n")
# Simulate failures until circuit opens
for i in range(1, 5):
    cdn.mark_failure("jsDelivr")
    failures = cdn._failure_counts["jsDelivr"]
    available = any(s.name == "jsDelivr" for s in cdn.get_sources())
    print(f"Failure {i}:")
    print(f" Count: {failures}/{cdn._max_failures}")
    print(f" Available: {available}")
    if failures >= cdn._max_failures:
        print(f" ⚠️ Circuit breaker OPEN - source disabled")
    print()
Example: Custom Threshold
from tif1.cdn import get_cdn_manager
cdn = get_cdn_manager()
# Set more lenient threshold for unreliable networks
cdn._max_failures = 5
print(f"Custom threshold: {cdn._max_failures} failures")
# Now sources tolerate more failures before disabling
for i in range(6):
    cdn.mark_failure("jsDelivr")
    failures = cdn._failure_counts["jsDelivr"]
    available = any(s.name == "jsDelivr" for s in cdn.get_sources())
    if i == 4:
        print(f"After {i+1} failures: still available = {available}")  # False (threshold of 5 reached)
    elif i == 3:
        print(f"After {i+1} failures: still available = {available}")  # True
Methods
get_sources()
def get_sources() -> list[CDNSource]
Returns a list of currently available CDN sources, sorted by priority (lowest priority number first). This method filters out sources that should not be used for data fetching.
Filtering Logic:
The method excludes:
- Disabled sources: Sources with enabled=False
- Failed sources: Sources where _failure_counts[name] >= _max_failures (circuit breaker open)
Returns:
list[CDNSource]: Available sources sorted by priority (ascending)
- Empty list if all sources are disabled or have exceeded failure threshold
Sorting Behavior:
- Sources are sorted by priority: 1, 2, 3, … (lower number = higher priority)
- Sources with the same priority maintain their insertion order (stable sort)
- The first source in the list is tried first by try_sources()
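The filtering and sorting rules can be combined into one small function. This is an illustrative sketch only: Source is a hypothetical stand-in for CDNSource, and available_sources takes the failure counts and threshold as explicit arguments instead of reading them from a manager.

```python
from dataclasses import dataclass


@dataclass
class Source:
    """Hypothetical stand-in for CDNSource."""
    name: str
    priority: int
    enabled: bool = True


def available_sources(
    sources: list[Source],
    failure_counts: dict[str, int],
    max_failures: int = 3,
) -> list[Source]:
    """Drop disabled or tripped sources, then order by ascending priority."""
    usable = [
        s for s in sources
        if s.enabled and failure_counts.get(s.name, 0) < max_failures
    ]
    # sorted() is stable, so equal priorities keep their original order.
    return sorted(usable, key=lambda s: s.priority)


all_sources = [
    Source("backup", 3),
    Source("default", 1),
    Source("regional", 2, enabled=False),  # manually disabled
    Source("mirror", 1),                   # will have 3 failures
]
counts = {"default": 2, "mirror": 3}

names = [s.name for s in available_sources(all_sources, counts)]
```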
Use Cases:
- Data fetching: Primary method used by try_sources() to determine which sources to attempt
- Health monitoring: Check how many sources are currently available
- Debugging: Understand which sources will be used for the next request
- Load balancing: Implement custom logic based on available sources
Example: Basic Usage
from tif1.cdn import get_cdn_manager
cdn = get_cdn_manager()
# Get available sources
sources = cdn.get_sources()
if not sources:
print("⚠️ Warning: No CDN sources available!")
print("All sources are either disabled or have exceeded failure threshold.")
else:
print(f"✓ Available CDN sources: {len(sources)}/{len(cdn.sources)}")
for i, source in enumerate(sources, 1):
failures = cdn._failure_counts.get(source.name, 0)
print(f"\n{i}. {source.name}")
print(f" Base URL: {source.base_url}")
print(f" Priority: {source.priority}")
print(f" Minification: {'✓ enabled' if source.use_minification else '✗ disabled'}")
print(f" Health: {failures}/{cdn._max_failures} failures")
Example: Using Primary Source
from tif1.cdn import get_cdn_manager
cdn = get_cdn_manager()
sources = cdn.get_sources()
if sources:
# First source is highest priority
primary = sources[0]
# Format URL for Belgian GP 2021 Race drivers
url = primary.format_url(
year=2021,
gp="Belgian%20Grand%20Prix",
session="Race",
path="drivers.json"
)
print(f"Primary CDN: {primary.name}")
print(f"URL: {url}")
# If minification is enabled, URL will be:
# https://cdn.jsdelivr.net/gh/TracingInsights/2021@main/Belgian%20Grand%20Prix/Race/drivers.min.json
else:
print("No CDN sources available - cannot fetch data")
Example: Comprehensive Health Monitoring
from tif1.cdn import get_cdn_manager
from datetime import datetime
def monitor_cdn_health():
"""Monitor CDN source health with detailed reporting."""
cdn = get_cdn_manager()
total_sources = len(cdn.sources)
available_sources = cdn.get_sources()
available_count = len(available_sources)
print("=" * 70)
print(f"CDN HEALTH REPORT - {datetime.now().strftime('%Y-%m-%d %H:%M:%S')}")
print("=" * 70)
# Overall health
health_pct = (available_count / total_sources * 100) if total_sources > 0 else 0
if health_pct == 100:
health_status = "🟢 EXCELLENT"
elif health_pct >= 75:
health_status = "🟡 GOOD"
elif health_pct >= 50:
health_status = "🟠 DEGRADED"
else:
health_status = "🔴 CRITICAL"
print(f"\nOverall Health: {health_status} ({health_pct:.1f}%)")
print(f"Available Sources: {available_count}/{total_sources}")
# Detailed source status
print(f"\n{'Source':<20} {'Priority':<10} {'Failures':<12} {'Status':<15}")
print("-" * 70)
for source in cdn.sources:
failures = cdn._failure_counts.get(source.name, 0)
is_available = source in available_sources
# Determine status
if not source.enabled:
status = "🔴 Disabled"
elif failures >= cdn._max_failures:
status = "🔴 Circuit Open"
elif failures == 0:
status = "🟢 Healthy"
elif failures == 1:
status = "🟡 Degraded"
else:
status = "🟠 Critical"
failure_str = f"{failures}/{cdn._max_failures}"
print(f"{source.name:<20} {source.priority:<10} {failure_str:<12} {status:<15}")
print("=" * 70)
# Recommendations
if health_pct < 100:
print("\n⚠️ RECOMMENDATIONS:")
if available_count == 0:
print(" • All sources are unavailable - call cdn.reset() to recover")
print(" • Check network connectivity")
print(" • Verify CDN URLs are accessible")
elif health_pct < 50:
print(" • Consider adding more backup CDN sources")
print(" • Investigate why sources are failing")
print(" • Call cdn.reset() if network issues have been resolved")
# Run monitoring
monitor_cdn_health()
Example: Filtering and Analysis
from tif1.cdn import get_cdn_manager
cdn = get_cdn_manager()
# Get available sources
available = cdn.get_sources()
# Analyze by priority
priority_groups = {}
for source in available:
priority_groups.setdefault(source.priority, []).append(source)
print("Sources grouped by priority:")
for priority in sorted(priority_groups.keys()):
sources = priority_groups[priority]
print(f"\nPriority {priority}: {len(sources)} source(s)")
for source in sources:
print(f" • {source.name}")
# Check minification support
minified = [s for s in available if s.use_minification]
print(f"\nSources with minification: {len(minified)}/{len(available)}")
# Identify healthy sources (0 failures)
healthy = [s for s in available if cdn._failure_counts[s.name] == 0]
print(f"Perfectly healthy sources: {len(healthy)}/{len(available)}")
Example: Handling No Available Sources
from tif1.cdn import get_cdn_manager
from tif1.exceptions import NetworkError
def safe_fetch_data(year, gp, session, path):
"""Safely fetch data with CDN availability check."""
cdn = get_cdn_manager()
sources = cdn.get_sources()
if not sources:
# No sources available - try recovery
print("⚠️ No CDN sources available. Attempting recovery...")
# Reset all sources to give them another chance
cdn.reset()
sources = cdn.get_sources()
if not sources:
# Still no sources - this is a configuration problem
raise NetworkError(
url=f"{year}/{gp}/{session}/{path}",
status_code=None,
message="No CDN sources configured or all sources are disabled"
)
# Proceed with fetch using available sources
print(f"✓ {len(sources)} CDN source(s) available")
# ... fetch logic here ...
# Usage
try:
safe_fetch_data(2021, "Belgian%20Grand%20Prix", "Race", "drivers.json")
except NetworkError as e:
print(f"Failed to fetch data: {e}")
add_source()
def add_source(source: CDNSource) -> None
Adds a new CDN source to the manager. The source is automatically inserted into the sources list in priority order, and its failure count is initialized to 0.
This method is useful for:
- Adding private CDN mirrors
- Adding regional CDN endpoints for better performance
- Adding backup sources for increased reliability
- Testing custom CDN configurations
Parameters:
source (CDNSource): The CDN source object to add
Behavior:
- Sources are automatically sorted by priority after insertion
- If a source with the same name already exists, both are kept (consider using unique names)
- The new source is immediately available for use via get_sources()
- Failure count is initialized to 0 (healthy state)
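As a rough sketch of this behavior (assuming a simple list-plus-dict model; `SketchManager` and `SketchSource` are hypothetical names, not part of tif1), adding a source can be modeled as an append followed by a stable sort:

```python
from dataclasses import dataclass, field


@dataclass
class SketchSource:
    """Hypothetical stand-in for CDNSource (illustration only)."""
    name: str
    priority: int


@dataclass
class SketchManager:
    """Hypothetical model of a manager that keeps sources in priority order."""
    sources: list = field(default_factory=list)
    failure_counts: dict = field(default_factory=dict)

    def add_source(self, source):
        self.sources.append(source)
        # Stable sort: equal priorities stay in the order they were added.
        self.sources.sort(key=lambda s: s.priority)
        # A newly added source starts in the healthy state.
        self.failure_counts[source.name] = 0


manager = SketchManager()
manager.add_source(SketchSource("Default", priority=2))
manager.add_source(SketchSource("Backup", priority=3))
manager.add_source(SketchSource("Regional", priority=1))
order = [s.name for s in manager.sources]
print(order)
```

If failure counts are keyed by source name, as the `_failure_counts[name]` lookups in the examples here suggest, two sources sharing a name would also share one counter, which is one more reason to keep names unique.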
Example:
from tif1.cdn import get_cdn_manager, CDNSource
cdn = get_cdn_manager()
# Add a regional CDN for better performance in Asia
asia_cdn = CDNSource(
name="Asia CDN",
base_url="https://asia.cdn.example.com/f1data",
    priority=1,  # Lower number = tried before higher-numbered sources
enabled=True,
use_minification=True
)
cdn.add_source(asia_cdn)
# Add a backup CDN with lower priority
backup_cdn = CDNSource(
name="Backup CDN",
base_url="https://backup.cdn.example.com/f1data",
priority=3, # Lower priority - only used if others fail
enabled=True,
use_minification=False
)
cdn.add_source(backup_cdn)
# Verify sources were added
print(f"Total sources: {len(cdn.sources)}")
for source in cdn.get_sources():
print(f" Priority {source.priority}: {source.name}")
Advanced Example: Dynamic CDN Selection
from tif1.cdn import get_cdn_manager, CDNSource
import socket
def add_regional_cdn():
"""Add region-specific CDN based on location."""
cdn = get_cdn_manager()
# Detect region (simplified example)
hostname = socket.gethostname()
if "eu" in hostname or "europe" in hostname:
regional_cdn = CDNSource(
name="Europe CDN",
base_url="https://eu.cdn.example.com/f1data",
priority=1,
enabled=True,
use_minification=True
)
cdn.add_source(regional_cdn)
print("Added European CDN with priority 1")
elif "us" in hostname or "america" in hostname:
regional_cdn = CDNSource(
name="US CDN",
base_url="https://us.cdn.example.com/f1data",
priority=1,
enabled=True,
use_minification=True
)
cdn.add_source(regional_cdn)
print("Added US CDN with priority 1")
# Configure regional CDN before loading data
add_regional_cdn()
mark_failure()
def mark_failure(source_name: str) -> None
Marks a CDN source as having failed a request. This increments the failure counter for the source. When a source reaches the failure threshold (default: 3 consecutive failures), it’s automatically excluded from get_sources() until reset.
This implements a circuit breaker pattern to prevent wasting time on consistently failing sources. The circuit breaker helps:
- Reduce latency by skipping known-bad sources
- Prevent cascading failures
- Allow sources to recover (via reset() or mark_success())
Parameters:
source_name (str): Name of the CDN source that failed
Behavior:
- Increments failure count by 1
- If count reaches _max_failures (3), logs a warning
- Source is automatically excluded from get_sources() after threshold
- Does not disable the source permanently; it can be recovered via reset() or mark_success()
When This Is Called:
This method is automatically called by try_sources() when a CDN request fails. You typically don’t need to call it manually unless implementing custom fetching logic.
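The counter-based circuit breaker described here can be illustrated with a small self-contained model. This is a sketch under assumptions, not the tif1 source: `SketchBreaker` and `is_open` are hypothetical names, and the threshold of 3 mirrors the documented default.

```python
class SketchBreaker:
    """Hypothetical failure-count circuit breaker (illustration only)."""

    def __init__(self, max_failures=3):
        self.max_failures = max_failures
        self.failure_counts = {}

    def mark_failure(self, name):
        self.failure_counts[name] = self.failure_counts.get(name, 0) + 1

    def mark_success(self, name):
        # One success clears the count and closes the breaker for this source.
        self.failure_counts[name] = 0

    def reset(self):
        # Clear every counter without adding or removing keys.
        for name in self.failure_counts:
            self.failure_counts[name] = 0

    def is_open(self, name):
        return self.failure_counts.get(name, 0) >= self.max_failures


breaker = SketchBreaker()
history = []
for _ in range(3):
    breaker.mark_failure("jsDelivr")
    history.append(breaker.is_open("jsDelivr"))
print(history)

breaker.mark_success("jsDelivr")
recovered = not breaker.is_open("jsDelivr")
print(recovered)
```

The breaker only opens on the failure that reaches the threshold, and a single recorded success is enough to close it again.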
Example:
from tif1.cdn import get_cdn_manager
cdn = get_cdn_manager()
# Simulate failures (normally done automatically by try_sources)
print("Simulating CDN failures...")
for i in range(1, 4):
cdn.mark_failure("jsDelivr")
failures = cdn._failure_counts["jsDelivr"]
print(f"Failure {i}: jsDelivr has {failures} failures")
# Check if still available
available = any(s.name == "jsDelivr" for s in cdn.get_sources())
print(f" Still available: {available}")
# After 3 failures, source is excluded
print(f"\nAvailable sources: {[s.name for s in cdn.get_sources()]}")
Advanced Example: Custom Failure Handling
from tif1.cdn import get_cdn_manager
import requests

cdn = get_cdn_manager()

def fetch_with_custom_error_handling(url: str, source_name: str) -> dict:
    """Fetch data with custom error classification."""
    try:
        response = requests.get(url, timeout=30)
        response.raise_for_status()
        return response.json()
    except requests.exceptions.Timeout:
        # Timeout might be temporary - don't mark as failure
        print(f"Timeout on {url} - not marking as failure")
        raise
    except requests.exceptions.HTTPError as e:
        if e.response.status_code == 404:
            # 404 means data doesn't exist - not a CDN failure
            print(f"404 on {url} - data not found")
        elif e.response.status_code >= 500:
            # Server error - count it against the source that served this URL
            # (try_sources() does this automatically for its own requests)
            print(f"Server error on {url} - marking as failure")
            cdn.mark_failure(source_name)
        raise
mark_success()
def mark_success(source_name: str) -> None
Resets the failure count for a CDN source to 0, indicating a successful data fetch. This allows a previously failing source to recover and be used again.
This method is crucial for the self-healing behavior of the CDN system. When a source that was experiencing issues successfully serves a request, its failure count is reset, allowing it to be used normally again.
Parameters:
source_name (str): Name of the CDN source that succeeded
Behavior:
- Sets failure count to 0 for the specified source
- Source becomes immediately available via get_sources() if it was disabled
- Called automatically by try_sources() on successful fetch
When This Is Called:
This method is automatically called by try_sources() when a CDN request succeeds. You typically don’t need to call it manually.
Example:
from tif1.cdn import get_cdn_manager
cdn = get_cdn_manager()
# Simulate failure and recovery
print("Simulating CDN failure and recovery...")
# Fail the source 3 times
for i in range(3):
cdn.mark_failure("jsDelivr")
print(f"After failures: {cdn._failure_counts['jsDelivr']} failures")
print(f"Available: {any(s.name == 'jsDelivr' for s in cdn.get_sources())}")
# Successful request resets the counter
cdn.mark_success("jsDelivr")
print(f"After success: {cdn._failure_counts['jsDelivr']} failures")
print(f"Available: {any(s.name == 'jsDelivr' for s in cdn.get_sources())}")
Advanced Example: Monitoring Recovery
from tif1.cdn import get_cdn_manager
import time
cdn = get_cdn_manager()
def monitor_source_recovery(source_name: str):
"""Monitor a CDN source's recovery over time."""
print(f"Monitoring {source_name} recovery...")
# Initial state
initial_failures = cdn._failure_counts.get(source_name, 0)
print(f"Initial failures: {initial_failures}")
# Simulate some failures
for i in range(2):
cdn.mark_failure(source_name)
time.sleep(0.1)
print(f"After failures: {cdn._failure_counts[source_name]}")
# Simulate successful recovery
cdn.mark_success(source_name)
print(f"After recovery: {cdn._failure_counts[source_name]}")
# Verify source is available
available = any(s.name == source_name for s in cdn.get_sources())
print(f"Source available: {available}")
monitor_source_recovery("jsDelivr")
reset()
Resets all failure counts for all CDN sources to 0, effectively re-enabling all previously disabled sources. This is useful for recovering from widespread network issues or when you want to give all sources a fresh start.
Use Cases:
- After network outage: When your network connection was down and all sources failed
- After CDN maintenance: When you know CDN providers have resolved their issues
- Testing: When you want to test fallback behavior from a clean state
- Manual recovery: When you want to force retry of all sources
Behavior:
- Resets failure count to 0 for every source in cdn.sources
- All sources become immediately available via get_sources() (if enabled=True)
- Does not modify source configuration (priority, URLs, minification settings)
Example:
from tif1.cdn import get_cdn_manager
cdn = get_cdn_manager()
# Simulate multiple source failures
print("Simulating widespread CDN failures...")
for source in cdn.sources:
for _ in range(3):
cdn.mark_failure(source.name)
print(f"Available sources after failures: {len(cdn.get_sources())}")
# Reset all sources
print("\nResetting all CDN sources...")
cdn.reset()
print(f"Available sources after reset: {len(cdn.get_sources())}")
# Verify all sources are healthy
for source in cdn.sources:
failures = cdn._failure_counts[source.name]
print(f" {source.name}: {failures} failures")
Advanced Example: Automatic Recovery Strategy
from tif1.cdn import get_cdn_manager
import time
from datetime import datetime, timedelta
class CDNHealthManager:
"""Manages CDN health with automatic recovery."""
def __init__(self):
self.cdn = get_cdn_manager()
self.last_reset = datetime.now()
self.reset_interval = timedelta(hours=1)
def check_and_reset_if_needed(self):
"""Reset CDN sources if enough time has passed."""
now = datetime.now()
if now - self.last_reset >= self.reset_interval:
available = len(self.cdn.get_sources())
total = len(self.cdn.sources)
if available < total:
print(f"Auto-resetting CDN sources ({available}/{total} available)")
self.cdn.reset()
self.last_reset = now
return True
return False
def get_health_status(self) -> dict:
"""Get detailed health status of all CDN sources."""
total = len(self.cdn.sources)
available = len(self.cdn.get_sources())
return {
"total_sources": total,
"available_sources": available,
"health_percentage": (available / total * 100) if total > 0 else 0,
"last_reset": self.last_reset.isoformat(),
"sources": [
{
"name": source.name,
"failures": self.cdn._failure_counts.get(source.name, 0),
"available": source.name in [s.name for s in self.cdn.get_sources()]
}
for source in self.cdn.sources
]
}
# Usage
health_manager = CDNHealthManager()
# Check health before data fetching
health_manager.check_and_reset_if_needed()
# Get detailed status
status = health_manager.get_health_status()
print(f"CDN Health: {status['health_percentage']:.1f}%")
print(f"Available: {status['available_sources']}/{status['total_sources']}")
try_sources()
def try_sources(
year: int,
gp: str,
session: str,
path: str,
fetch_func: Callable[[str], Any]
) -> Any
Try fetching data from CDN sources with automatic fallback. This is the core method that implements the multi-source fallback logic with circuit breaker patterns. It orchestrates the entire CDN failover process, trying sources in priority order until one succeeds or all fail.
How It Works:
- Get available sources: Calls get_sources() to get enabled, healthy sources sorted by priority
- Check availability: If no sources are available, raises NetworkError immediately
- Try each source (in priority order):
  - Format URL via source.format_url(year, gp, session, path)
  - Log debug message: "Trying CDN: {source.name} - {url}"
  - Call fetch_func(url) to attempt the HTTP request
  - On success: Call mark_success(source.name) and return the data
  - On DataNotFoundError (404): Re-raise immediately (the data doesn’t exist, so there is no point trying other sources)
  - On other exceptions: Log a warning, call mark_failure(source.name), and try the next source
- All sources failed: Raise NetworkError with details from the last exception
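The steps above can be sketched as a short loop. This is an illustrative model only, not the tif1 implementation: the exception classes, `try_sources_sketch`, and `fake_fetch` are hypothetical stand-ins, and sources are reduced to pre-formatted (name, URL) pairs already in priority order.

```python
class SketchNotFound(Exception):
    """Hypothetical stand-in for DataNotFoundError."""


class SketchNetworkError(Exception):
    """Hypothetical stand-in for NetworkError."""


def try_sources_sketch(urls_by_source, fetch_func, failure_counts, max_failures=3):
    """Try each (name, url) pair in order and return the first successful result."""
    candidates = [
        (name, url) for name, url in urls_by_source
        if failure_counts.get(name, 0) < max_failures
    ]
    if not candidates:
        raise SketchNetworkError("no CDN sources available")
    last_error = None
    for name, url in candidates:
        try:
            data = fetch_func(url)
        except SketchNotFound:
            raise  # missing data: another source would not help, nothing is counted
        except Exception as exc:
            failure_counts[name] = failure_counts.get(name, 0) + 1
            last_error = exc
            continue
        failure_counts[name] = 0  # success clears the counter
        return data
    raise SketchNetworkError(f"all CDN sources failed: {last_error}")


def fake_fetch(url):
    """Simulated fetch: the primary host times out, any other host answers."""
    if "primary" in url:
        raise TimeoutError("primary timed out")
    return {"url": url}


counts = {}
result = try_sources_sketch(
    [("Primary", "https://primary.example.com/drivers.json"),
     ("Backup", "https://backup.example.com/drivers.json")],
    fake_fetch,
    counts,
)
print(result, counts)
```

In this run the primary source records one failure and the backup serves the request, so the caller never sees the timeout.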
Parameters:
year (int): Season year (e.g., 2021, 2022, 2023)
gp (str): Grand Prix name, URL-encoded (e.g., “Belgian%20Grand%20Prix”, “Monaco%20Grand%20Prix”)
- Important: Must be URL-encoded (spaces as %20, not +)
- Use urllib.parse.quote() if encoding manually
session (str): Session name (e.g., “Race”, “Qualifying”, “Practice 1”, “Sprint”)
path (str): Resource path relative to session directory (e.g., “drivers.json”, “laps.json”, “telemetry.json”)
fetch_func (Callable[[str], Any]): Function that takes a URL string and returns fetched data
- Should raise DataNotFoundError for 404 responses
- Should raise other exceptions for network/server errors
- Return type can be any (dict, list, bytes, etc.)
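For the gp parameter, the difference between the two standard-library encoders is easy to check. The snippet below uses only `urllib.parse`; the Grand Prix name is just sample input.

```python
from urllib.parse import quote, quote_plus

gp_name = "Belgian Grand Prix"

encoded = quote(gp_name)            # spaces become %20, the form expected here
plus_encoded = quote_plus(gp_name)  # spaces become +, which is not the expected form

print(encoded)
print(plus_encoded)
```

`quote()` produces the `%20` form used throughout the examples, so it is the one to reach for when building the gp argument by hand.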
Returns:
Any: Fetched data from the first successful CDN source (return type depends on fetch_func)
Raises:
DataNotFoundError: If resource doesn’t exist (404 from any source)
- This is re-raised immediately without trying other sources
- Indicates the requested data genuinely doesn’t exist
NetworkError: If all CDN sources fail
- Includes URL path and status code from last exception
- Indicates all sources are unavailable or experiencing issues
Error Handling Strategy:
The method distinguishes between two types of errors:
- Data not found (404): A permanent error; the data doesn’t exist
  - Re-raised immediately as DataNotFoundError
  - No point trying other CDN sources (they’ll all return 404)
  - Does NOT increment failure counter (not a CDN failure)
- Network/server errors: A temporary error; the CDN source is failing
  - Logged as warning
  - Failure counter incremented via mark_failure()
  - Next source is tried
  - If all sources fail, raises NetworkError
Example: Basic Usage
from tif1.cdn import get_cdn_manager
from tif1.exceptions import DataNotFoundError, NetworkError
import requests

cdn = get_cdn_manager()

def fetch_json(url: str) -> dict:
    """Fetch JSON data from URL."""
    response = requests.get(url, timeout=30)
    if response.status_code == 404:
        # Report missing data explicitly so it is not counted as a CDN failure
        raise DataNotFoundError(message=f"Resource not found: {url}")
    response.raise_for_status()
    return response.json()

# Fetch 2021 Belgian Grand Prix Race drivers
try:
    data = cdn.try_sources(
        year=2021,
        gp="Belgian%20Grand%20Prix",
        session="Race",
        path="drivers.json",
        fetch_func=fetch_json
    )
    print(f"✓ Successfully fetched {len(data.get('drivers', []))} drivers")
except DataNotFoundError:
    print("✗ Drivers data not found for this session")
except NetworkError as e:
    print(f"✗ All CDN sources failed: {e}")
Example: With Proper Error Handling
from tif1.cdn import get_cdn_manager
from tif1.exceptions import DataNotFoundError, NetworkError
import requests
import logging
# Enable debug logging to see which CDN is being tried
logging.basicConfig(level=logging.DEBUG)
cdn = get_cdn_manager()
def fetch_with_error_handling(url: str) -> dict:
"""Fetch JSON with proper error classification."""
try:
response = requests.get(url, timeout=30)
if response.status_code == 404:
# Data doesn't exist - raise DataNotFoundError
raise DataNotFoundError(message=f"Resource not found: {url}")
response.raise_for_status()
return response.json()
except requests.exceptions.Timeout:
# Timeout - let CDN manager try next source
raise NetworkError(url=url, message="Request timeout")
except requests.exceptions.ConnectionError:
# Connection failed - let CDN manager try next source
raise NetworkError(url=url, message="Connection failed")
# Fetch data with comprehensive error handling
try:
data = cdn.try_sources(
year=2021,
gp="Belgian%20Grand%20Prix",
session="Race",
path="laps.json",
fetch_func=fetch_with_error_handling
)
print(f"✓ Fetched {len(data.get('laps', []))} laps")
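    # get_sources()[0] is the highest-priority available source,
    # which is not necessarily the one that served this request
    print(f"✓ Highest-priority available CDN: {cdn.get_sources()[0].name}")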
except DataNotFoundError as e:
print(f"✗ Data not found: {e}")
print(" This session may not have lap data available")
except NetworkError as e:
print(f"✗ Network error: {e}")
print(" All CDN sources failed - check network connectivity")
# Show which sources failed
for source in cdn.sources:
failures = cdn._failure_counts[source.name]
print(f" {source.name}: {failures} failures")
Example: Custom Fetch Function with Retries
from tif1.cdn import get_cdn_manager
from tif1.exceptions import DataNotFoundError, NetworkError
import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry
cdn = get_cdn_manager()
def create_fetch_func_with_retries():
"""Create a fetch function with built-in retries."""
# Configure session with retries
session = requests.Session()
retries = Retry(
total=3,
backoff_factor=0.5,
status_forcelist=[500, 502, 503, 504],
allowed_methods=["GET"]
)
adapter = HTTPAdapter(max_retries=retries)
session.mount("https://", adapter)
def fetch(url: str) -> dict:
response = session.get(url, timeout=30)
if response.status_code == 404:
raise DataNotFoundError(message=f"Not found: {url}")
response.raise_for_status()
return response.json()
return fetch
# Use custom fetch function
fetch_func = create_fetch_func_with_retries()
data = cdn.try_sources(
year=2021,
gp="Belgian%20Grand%20Prix",
session="Race",
path="weather.json",
fetch_func=fetch_func
)
print(f"✓ Fetched weather data: {len(data.get('weather', []))} entries")
Example: Monitoring Fallback Behavior
from tif1.cdn import get_cdn_manager
from tif1.exceptions import DataNotFoundError, NetworkError
import requests
import logging
# Enable debug logging
logging.basicConfig(
level=logging.DEBUG,
format='%(asctime)s - %(name)s - %(levelname)s - %(message)s'
)
cdn = get_cdn_manager()
def fetch_json(url: str) -> dict:
response = requests.get(url, timeout=30)
if response.status_code == 404:
raise DataNotFoundError(message="Not found")
response.raise_for_status()
return response.json()
# Add multiple sources to see fallback
from tif1.cdn import CDNSource
cdn.add_source(CDNSource(
name="Backup CDN",
base_url="https://backup.example.com/f1",
priority=2
))
print("Attempting fetch with fallback monitoring...\n")
try:
data = cdn.try_sources(
year=2021,
gp="Belgian%20Grand%20Prix",
session="Race",
path="drivers.json",
fetch_func=fetch_json
)
print(f"\n✓ Success! Fetched {len(data.get('drivers', []))} drivers")
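    # Show per-source failure counts after the fetch
    for source in cdn.sources:
        failures = cdn._failure_counts[source.name]
        if failures == 0:
            # Zero failures does not prove the source served this request;
            # lower-priority sources that were never tried also report 0
            print(f"✓ {source.name}: Healthy (no recorded failures)")
        else:
            print(f"✗ {source.name}: {failures} failures")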
except NetworkError as e:
print(f"\n✗ All sources failed: {e}")
# Detailed failure report
print("\nFailure details:")
for source in cdn.sources:
failures = cdn._failure_counts[source.name]
print(f" {source.name}: {failures}/{cdn._max_failures} failures")
Example: Async Fetch Function
from tif1.cdn import get_cdn_manager
from tif1.exceptions import DataNotFoundError
import asyncio
import aiohttp

cdn = get_cdn_manager()

async def async_fetch(url: str) -> dict:
    """Async fetch function for use with try_sources."""
    # aiohttp expects a ClientTimeout object rather than a bare number
    timeout = aiohttp.ClientTimeout(total=30)
    async with aiohttp.ClientSession(timeout=timeout) as session:
        async with session.get(url) as response:
            if response.status == 404:
                raise DataNotFoundError(message="Not found")
            response.raise_for_status()
            return await response.json()

# Wrap async function for use with try_sources
def sync_fetch(url: str) -> dict:
    """Synchronous wrapper for async fetch."""
    # asyncio.run() starts a new event loop, so call this from synchronous code only
    return asyncio.run(async_fetch(url))

# Use with try_sources
data = cdn.try_sources(
    year=2021,
    gp="Belgian%20Grand%20Prix",
    session="Race",
    path="telemetry.json",
    fetch_func=sync_fetch
)
print("✓ Fetched telemetry data")
Example: Fetch with Progress Tracking
from tif1.cdn import get_cdn_manager
from tif1.exceptions import DataNotFoundError
import requests

cdn = get_cdn_manager()

class FetchProgress:
    """Track fetch progress across CDN sources."""

    def __init__(self):
        self.attempts = []

    def create_fetch_func(self):
        """Create fetch function that tracks progress."""
        def fetch(url: str) -> dict:
            attempt = {
                "url": url,
                "source": self._extract_source_name(url),
                "success": False,
                "error": None
            }
            # Record each attempt exactly once, whatever the outcome
            self.attempts.append(attempt)
            try:
                response = requests.get(url, timeout=30)
                if response.status_code == 404:
                    raise DataNotFoundError(message="Not found")
                response.raise_for_status()
                data = response.json()
            except DataNotFoundError:
                attempt["error"] = "Not found (404)"
                raise
            except Exception as e:
                attempt["error"] = str(e)
                raise
            attempt["success"] = True
            return data
        return fetch

    def _extract_source_name(self, url: str) -> str:
        """Extract CDN source name from URL."""
        if "jsdelivr" in url:
            return "jsDelivr"
        elif "backup" in url:
            return "Backup CDN"
        return "Unknown"

    def print_report(self):
        """Print fetch attempt report."""
        print("\nFetch Attempt Report:")
        print("=" * 60)
        for i, attempt in enumerate(self.attempts, 1):
            status = "✓ SUCCESS" if attempt["success"] else f"✗ FAILED: {attempt['error']}"
            print(f"\nAttempt {i}: {attempt['source']}")
            print(f" Status: {status}")
            print(f" URL: {attempt['url'][:80]}...")

# Usage
progress = FetchProgress()
try:
    data = cdn.try_sources(
        year=2021,
        gp="Belgian%20Grand%20Prix",
        session="Race",
        path="drivers.json",
        fetch_func=progress.create_fetch_func()
    )
    print("✓ Successfully fetched data")
finally:
    progress.print_report()
Performance Considerations:
- Latency: Each failed source adds latency (timeout + retry time). Use reasonable timeouts in fetch_func.
- Circuit breaker: After 3 failures, sources are automatically disabled, reducing latency for subsequent requests.
- Minification: Enable use_minification=True to reduce download time by 20-40%.
- Connection pooling: Reuse HTTP sessions in fetch_func for better performance.
- Parallel fetching: try_sources() is sequential. For parallel fetching of multiple resources, call it multiple times concurrently.
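One way to fetch several resources concurrently is a thread pool, sketched below. To keep the snippet runnable on its own, `fetch_resource` is a hypothetical placeholder for a single `cdn.try_sources(..., path=path, ...)` call; the rest uses only `concurrent.futures` from the standard library.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch_resource(path):
    """Hypothetical placeholder for one cdn.try_sources(..., path=path, ...) call."""
    return {"path": path}


paths = ["drivers.json", "laps.json", "weather.json"]

with ThreadPoolExecutor(max_workers=3) as pool:
    # Executor.map() yields results in input order, so zip() pairs them correctly.
    results = dict(zip(paths, pool.map(fetch_resource, paths)))

print(sorted(results))
```

Each worker still walks the source list sequentially for its own resource; only the resources themselves are fetched in parallel.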
Thread Safety:
- try_sources() is thread-safe for concurrent calls
- Failure counters are updated atomically (though not with locks)
- In high-concurrency scenarios, consider external synchronization for mark_failure() / mark_success() calls
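External synchronization could look roughly like the following. This is a sketch of the general technique with a `threading.Lock`, not something tif1 provides: `LockedCounters` is a hypothetical class that stands in for whatever wrapper you place around your own mark_failure() / mark_success() calls.

```python
import threading


class LockedCounters:
    """Hypothetical lock-protected failure counters (illustration only)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._counts = {}

    def mark_failure(self, name):
        # The read-modify-write happens under the lock, so no update is lost.
        with self._lock:
            self._counts[name] = self._counts.get(name, 0) + 1

    def mark_success(self, name):
        with self._lock:
            self._counts[name] = 0

    def get(self, name):
        with self._lock:
            return self._counts.get(name, 0)


def record_failures(counters, name, times):
    for _ in range(times):
        counters.mark_failure(name)


counters = LockedCounters()
threads = [
    threading.Thread(target=record_failures, args=(counters, "jsDelivr", 1000))
    for _ in range(8)
]
for t in threads:
    t.start()
for t in threads:
    t.join()

total = counters.get("jsDelivr")
print(total)
```

With the lock in place, eight threads recording 1000 failures each always end at exactly 8000, regardless of how the threads interleave.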
CDNSource
Dataclass representing a CDN source configuration.
Constructor
@dataclass
class CDNSource:
name: str
base_url: str
priority: int = 0
enabled: bool = True
use_minification: bool = False
Attributes:
name: Human-readable source name
base_url: Base URL for the CDN (must be HTTPS)
priority: Priority level (lower number = higher priority, default: 0)
enabled: Whether source is currently enabled (default: True)
use_minification: Enable JSON minification (appends .min before .json, default: False)
Example:
from tif1.cdn import CDNSource
# Primary source with minification
primary = CDNSource(
name="jsDelivr",
base_url="https://cdn.jsdelivr.net/gh/TracingInsights",
priority=1,
enabled=True,
use_minification=True
)
Methods
format_url()
def format_url(year: int, gp: str, session: str, path: str) -> str
Format a complete CDN URL for a specific resource with optional minification support.
Parameters:
year: Season year (e.g., 2021)
gp: Grand Prix name, URL-encoded (e.g., “Belgian%20Grand%20Prix”)
session: Session name (e.g., “Race”, “Qualifying”)
path: Resource path (e.g., “drivers.json”)
Returns:
str: The complete URL for the requested resource
URL Format:
{base_url}/{year}@main/{gp}/{session}/{path}
If use_minification=True and path ends with .json, the path is transformed from file.json to file.min.json.
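The URL template and the minification rule can be expressed as a small function. This is a sketch of the documented format only: `format_url_sketch` is a hypothetical free function, not the actual `CDNSource.format_url` method.

```python
def format_url_sketch(base_url, year, gp, session, path, use_minification=False):
    """Build {base_url}/{year}@main/{gp}/{session}/{path}, optionally minified."""
    if use_minification and path.endswith(".json"):
        # file.json -> file.min.json
        path = path[: -len(".json")] + ".min.json"
    return f"{base_url}/{year}@main/{gp}/{session}/{path}"


base = "https://cdn.jsdelivr.net/gh/TracingInsights"
plain_url = format_url_sketch(base, 2021, "Belgian%20Grand%20Prix", "Race", "drivers.json")
min_url = format_url_sketch(
    base, 2021, "Belgian%20Grand%20Prix", "Race", "drivers.json", use_minification=True
)
print(plain_url)
print(min_url)
```

Paths that do not end in `.json` pass through unchanged in this sketch, matching the rule that only JSON paths are rewritten.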
Example:
from tif1.cdn import CDNSource
# Without minification
source = CDNSource(
name="jsDelivr",
base_url="https://cdn.jsdelivr.net/gh/TracingInsights",
use_minification=False
)
url = source.format_url(
year=2021,
gp="Belgian%20Grand%20Prix",
session="Race",
path="drivers.json"
)
# Result: https://cdn.jsdelivr.net/gh/TracingInsights/2021@main/Belgian%20Grand%20Prix/Race/drivers.json
# With minification
minified_source = CDNSource(
name="jsDelivr",
base_url="https://cdn.jsdelivr.net/gh/TracingInsights",
use_minification=True
)
minified_url = minified_source.format_url(
year=2021,
gp="Belgian%20Grand%20Prix",
session="Race",
path="drivers.json"
)
# Result: https://cdn.jsdelivr.net/gh/TracingInsights/2021@main/Belgian%20Grand%20Prix/Race/drivers.min.json
Configuration
CDN behavior can be configured via the global config:
import tif1
config = tif1.get_config()
# Set custom CDN sources
config.set("cdns", [
"https://cdn.jsdelivr.net/gh/TracingInsights",
"https://backup.cdn.com/f1data"
])
# Enable minification for bandwidth savings
config.set("cdn_use_minification", True)
# Save configuration
config.save()
Practical Example: 2021 Belgian Grand Prix
Here’s a complete example showing how the CDN system works when fetching data for the 2021 Belgian Grand Prix Race:
from tif1.cdn import get_cdn_manager, CDNSource
# Get the global CDN manager
cdn = get_cdn_manager()
# Check available sources
print("Available CDN sources:")
for source in cdn.get_sources():
print(f" {source.name} (priority {source.priority})")
# Add a backup CDN source
backup = CDNSource(
name="Backup CDN",
base_url="https://backup.example.com/f1data",
priority=2,
enabled=True,
use_minification=False
)
cdn.add_source(backup)
# Format URL for Belgian GP 2021 Race drivers data
primary_source = cdn.get_sources()[0]
url = primary_source.format_url(
year=2021,
gp="Belgian%20Grand%20Prix",
session="Race",
path="drivers.json"
)
print(f"\nFetching from: {url}")
# The CDN manager automatically handles fallback
# If jsDelivr fails, it tries the backup source
def fetch_data(url: str) -> dict:
import requests
response = requests.get(url, timeout=30)
response.raise_for_status()
return response.json()
try:
data = cdn.try_sources(
year=2021,
gp="Belgian%20Grand%20Prix",
session="Race",
path="drivers.json",
fetch_func=fetch_data
)
print(f"Successfully fetched data with {len(data.get('drivers', []))} drivers")
except Exception as e:
print(f"All CDN sources failed: {e}")
# Check source health after fetch
print("\nCDN source health:")
for source in cdn.sources:
failures = cdn._failure_counts.get(source.name, 0)
    # Compare against the configured threshold rather than a hard-coded 3
    status = "healthy" if failures < cdn._max_failures else "disabled"
print(f" {source.name}: {failures} failures ({status})")
Best Practices
Configuration
- Use HTTPS only: HTTP CDN sources are rejected for security
- Enable minification: Reduces bandwidth by 20-40% for large datasets
- Configure via file: Use ~/.tif1rc for persistent configuration
- Set priorities wisely: Lower number = higher priority (1 before 2)
# Good configuration
{
"cdns": [
"https://cdn.jsdelivr.net/gh/TracingInsights", # Priority 1
"https://backup.cdn.example.com/f1" # Priority 2
],
"cdn_use_minification": true
}
Error Handling
- Distinguish error types: Handle DataNotFoundError (404) vs NetworkError (all sources failed)
- Let CDN manager handle fallback: Don’t implement your own retry logic
- Log failures: Enable debug logging to see which CDN is being used
from tif1.exceptions import DataNotFoundError, NetworkError
import logging
logging.basicConfig()  # attach a handler so debug records are actually displayed
logging.getLogger("tif1.cdn").setLevel(logging.DEBUG)
try:
data = cdn.try_sources(...)
except DataNotFoundError:
# Data doesn't exist - don't retry
pass
except NetworkError:
# All sources failed - check network
pass
Performance
- Enable minification: Especially for telemetry data (35-40% size reduction)
- Use connection pooling: Reuse HTTP sessions in fetch functions
- Set reasonable timeouts: Balance between patience and responsiveness
- Monitor circuit breaker: Check failure counts to identify problematic sources
Reliability
- Add backup sources: At least 2-3 sources for production applications
- Use regional CDNs: Optimize for your user’s geographic location
- Implement auto-recovery: Periodically reset sources after network issues
- Monitor health: Track availability and failure rates
Security
- Never use raw.githubusercontent.com: Rate limited and blocked by tif1
- Validate custom CDN URLs: Ensure they’re trusted sources
- Use HTTPS everywhere: HTTP sources are automatically rejected
- Audit CDN sources: Regularly review configured sources
Troubleshooting
All CDN Sources Failing
Symptoms: NetworkError: All CDN sources failed
Solutions:
- Check network connectivity: ping cdn.jsdelivr.net
- Verify CDN URLs are accessible in browser
- Reset all sources: cdn.reset()
- Check firewall/proxy settings
- Enable debug logging to see detailed errors
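import logging
from tif1.cdn import get_cdn_manager
logging.basicConfig()  # attach a handler so debug records are actually displayed
logging.getLogger("tif1.cdn").setLevel(logging.DEBUG)
cdn = get_cdn_manager()
cdn.reset()  # Give all sources a fresh start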
Data Not Found (404)
Symptoms: DataNotFoundError: Data not found
Cause: Requested data doesn’t exist (not a CDN failure)
Solutions:
- Verify session exists: Check year, GP name, session name
- Check data availability: Some sessions may not have all data types
- Use correct path: Ensure path parameter is correct (e.g., “drivers.json”)
Slow Performance
Symptoms: Data fetching is slow
Solutions:
- Enable minification: config.set("cdn_use_minification", True)
- Add regional CDN: Closer to your geographic location
- Check network speed: Run speed test
- Use connection pooling: Reuse HTTP sessions
- Increase timeout: May be timing out prematurely
Circuit Breaker Stuck Open
Symptoms: Sources remain disabled after network recovery
Solution: Reset failure counts
from tif1.cdn import get_cdn_manager
cdn = get_cdn_manager()
cdn.reset()  # Reset all sources to healthy state
Last modified on May 8, 2026