API Overview
The tif1 API is a high-performance Python library for Formula 1 data analysis, telemetry processing, and motorsport analytics. Built with performance as the primary focus, tif1 provides a clean interface that stays compatible with the popular fastf1 library while delivering significant speed improvements (often 5-10x) through async operations, multi-layer caching, parallel HTTP fetching, and optional Polars backend support.
Whether you’re a data scientist analyzing race strategies, a motorsport engineer studying telemetry patterns, a developer building F1 applications, or an enthusiast exploring racing data, tif1 provides the tools and performance you need to work efficiently with Formula 1 data at scale.
What Makes tif1 Different
tif1 stands apart from other F1 data libraries through several key innovations:
- Uncompromising Performance: Every single component, from HTTP fetching to DataFrame construction, has been optimized for maximum speed. Async operations run in parallel, caching happens at multiple layers (memory + SQLite), and the optional Polars backend provides memory-efficient processing for large datasets.
- Production-Ready Architecture: Unlike research-focused libraries, tif1 is built for production use with circuit breakers, retry logic, connection pooling, DNS-over-HTTPS support, comprehensive error handling, and graceful degradation when CDN sources fail.
- Developer Experience: Comprehensive type hints throughout the codebase enable excellent IDE autocomplete and type checking. Clear error messages with structured context help you debug issues quickly. Consistent naming conventions make the API predictable and easy to learn.
- Flexible Data Access: Load exactly what you need—skip telemetry for faster lap time analysis, or load everything for comprehensive session exploration. The API adapts to your use case rather than forcing a one-size-fits-all approach.
- Modern Python Practices: Built for Python 3.10+, leveraging modern language features like structural pattern matching, improved type hints, and async/await patterns. The codebase follows strict linting rules (Ruff) and maintains high test coverage.
Design Philosophy
The tif1 API is built on design principles that guide every architectural decision and implementation detail:
Performance First
Performance isn’t just a feature; it’s the core reason tif1 exists. Every component has been profiled, optimized, and benchmarked:
- Async Operations: All network I/O uses async/await patterns with niquests (a modern fork of requests) to fetch multiple data sources in parallel. Loading a full race session with 20 drivers can fetch all data in parallel rather than sequentially.
- Intelligent Caching: Multi-layer caching strategy with in-memory LRU cache for hot data and SQLite-backed persistent cache for all fetched data. Cache hits return data in microseconds rather than seconds. The cache is content-addressed and validates data integrity.
- Parallel Fetching: When loading session data, tif1 fetches lap data, telemetry, weather, and race control messages in parallel using asyncio task groups. This reduces total load time from ~10-15 seconds to ~2-3 seconds for uncached sessions.
- Optional Polars Backend: For users working with large datasets (multiple seasons, comparative analysis), the Polars backend provides 2-5x better memory efficiency and faster DataFrame operations compared to pandas. The backend is lazy-loaded and can be switched at runtime.
- Optimized JSON Parsing: Uses orjson (written in Rust) instead of Python’s stdlib json for 2-3x faster JSON parsing. This matters when processing megabytes of telemetry data.
- Connection Pooling: HTTP sessions use connection pooling to reuse TCP connections across requests, reducing connection overhead by ~50-100ms per request.
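The effect of parallel fetching can be sketched with the standard library alone. This is a minimal illustration, not tif1's implementation: `fetch_source`, the `SOURCES` table, and the simulated delays are hypothetical stand-ins for HTTP requests, and `asyncio.gather` is used in place of task groups so the sketch also runs on Python 3.10.

```python
import asyncio
import time

# Hypothetical data sources with simulated network latency in seconds.
SOURCES = {"laps": 0.05, "telemetry": 0.10, "weather": 0.05, "messages": 0.05}


async def fetch_source(name: str, delay: float) -> tuple[str, str]:
    """Stand-in for one HTTP request; sleeps instead of doing network I/O."""
    await asyncio.sleep(delay)
    return name, f"{name}-payload"


async def fetch_all_parallel() -> dict[str, str]:
    """Start every request at once and wait for all of them together."""
    pairs = await asyncio.gather(*(fetch_source(n, d) for n, d in SOURCES.items()))
    return dict(pairs)


async def fetch_all_sequential() -> dict[str, str]:
    """Await each request before starting the next one."""
    return dict([await fetch_source(n, d) for n, d in SOURCES.items()])


def timed(coro_factory) -> tuple[dict[str, str], float]:
    """Run a coroutine to completion and report the elapsed wall time."""
    start = time.perf_counter()
    result = asyncio.run(coro_factory())
    return result, time.perf_counter() - start


parallel_result, parallel_s = timed(fetch_all_parallel)
sequential_result, sequential_s = timed(fetch_all_sequential)
```

In the parallel version the total time is roughly that of the slowest source, while the sequential version pays the sum of all delays.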
Simplicity and Ergonomics
Complex operations should feel simple. The API hides complexity behind intuitive interfaces:
- Minimal Entry Points: Most users only need get_session() to start working with F1 data. Everything else is discoverable through the returned Session object via IDE autocomplete.
- Sensible Defaults: All optional parameters have sensible defaults. get_session() loads all available data by default, but you can selectively disable data sources for faster loading when you don’t need them.
- Fuzzy Matching: Event names support fuzzy matching: “spa”, “belgium”, “Belgian Grand Prix”, and “Spa-Francorchamps” all work. Session types accept both full names (“Qualifying”) and short codes (“Q”). This reduces friction and makes the API more forgiving.
- Progressive Disclosure: Basic usage is simple, but advanced features are available when needed. Start with session.laps for quick analysis, then dive into driver.get_lap(n).get_telemetry() for detailed telemetry work.
- Method Chaining: Where appropriate, methods return objects that support further operations, enabling natural workflows like session.get_driver("VER").get_fastest_lap().get_telemetry().
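Forgiving event lookup of this kind can be sketched as follows. This is a minimal illustration under stated assumptions: it uses `difflib.SequenceMatcher` from the standard library as the similarity measure, which is a substitute and not necessarily what tif1 uses, and the `EVENTS` table, `similarity`, and `match_event` are hypothetical names.

```python
from difflib import SequenceMatcher

# Hypothetical schedule excerpt: official name -> extra aliases (circuit, location).
EVENTS = {
    "Belgian Grand Prix": ["Spa-Francorchamps", "Belgium", "Spa"],
    "Monaco Grand Prix": ["Monte Carlo", "Monaco"],
    "British Grand Prix": ["Silverstone", "Great Britain"],
}


def similarity(a: str, b: str) -> float:
    """Case-insensitive similarity ratio in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def match_event(query: str, threshold: float = 0.7) -> str:
    """Return the official event name that best matches the query."""
    best_name, best_score = None, 0.0
    for name, aliases in EVENTS.items():
        for candidate in [name, *aliases]:
            # A substring hit counts as a full match so "silver" finds Silverstone.
            if query.lower() in candidate.lower():
                score = 1.0
            else:
                score = similarity(query, candidate)
            if score > best_score:
                best_name, best_score = name, score
    if best_name is None or best_score < threshold:
        raise LookupError(f"No event matches {query!r}")
    return best_name
```

Matching against aliases as well as official names is what lets a circuit name, a country, and a partial string all resolve to the same event.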
Predictability and Consistency
The API should be easy to learn and remember:
- Consistent Naming: All methods follow clear patterns: get_* for retrieval operations, load_* for data loading, clear_* for cache operations. Attributes use snake_case, classes use PascalCase.
- Clear Data Hierarchies: The object model mirrors F1’s structure: Session contains Driver objects, which contain Lap objects, which contain Telemetry data. This mental model matches how you think about F1 data.
- Comprehensive Type Hints: Every public function and method includes complete type hints. Your IDE can show you exactly what parameters are expected and what will be returned, reducing the need to consult documentation.
- Structured Errors: Exceptions include structured context (not just error messages) so you can programmatically handle errors. A DataNotFoundError includes the year, event, and session that weren’t found.
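The value of structured errors is easiest to see in a handler that reads fields instead of parsing a message. The class below is a hypothetical sketch that only borrows the DataNotFoundError name; the attribute names `year`, `event`, and `session`, and the `load` and `load_with_fallback` helpers, are assumptions made for illustration.

```python
class DataNotFoundError(Exception):
    """Hypothetical sketch: an error that carries its lookup context as attributes."""

    def __init__(self, year: int, event: str, session: str) -> None:
        self.year = year
        self.event = event
        self.session = session
        super().__init__(f"No data for {year} {event} ({session})")


def load(year: int, event: str, session: str) -> dict:
    """Stand-in loader that only knows one session."""
    known = {(2024, "Monaco Grand Prix", "Race")}
    if (year, event, session) not in known:
        raise DataNotFoundError(year, event, session)
    return {"year": year, "event": event, "session": session}


def load_with_fallback(year: int, event: str, session: str) -> dict:
    """Retry with the race session when the requested session is missing."""
    try:
        return load(year, event, session)
    except DataNotFoundError as err:
        # The handler reads structured fields instead of parsing the message.
        if err.session != "Race":
            return load(err.year, err.event, "Race")
        raise
```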
Compatibility
Existing fastf1 users should be able to migrate with minimal friction:
- Drop-in Replacement: The core API (get_session(), Session.laps, Driver objects) matches fastf1’s interface. Most fastf1 code works with tif1 by just changing the import.
- Compatibility Layer: The tif1.fastf1_compat module provides shims for fastf1-specific functions like set_log_level() and Cache.enable_cache().
- DataFrame Compatibility: DataFrames returned by tif1 have the same column names and structure as fastf1, ensuring your analysis code doesn’t need changes.
- Migration Path: You can use both libraries side-by-side during migration, gradually moving code to tif1 as you validate behavior.
Extensibility and Customization
Advanced users should be able to customize behavior:
- Configuration System: Global configuration via get_config() allows you to tune performance parameters (max workers, cache TTL, validation), switch backends (pandas/polars), and control behavior (ultra cold start mode, DNS-over-HTTPS).
- Modular Architecture: The library is split into focused modules (http_session, async_fetch, cache, cdn) that can be used independently or replaced with custom implementations.
- Backend Abstraction: The DataFrame backend is abstracted behind a common interface, making it possible to add new backends (DuckDB, Arrow) without changing user-facing code.
- Cache Management: Full control over cache behavior—clear specific sessions, clear by date range, inspect cache size, vacuum the database, or disable caching entirely for testing.
- Retry and Circuit Breaker: Configurable retry logic with exponential backoff and circuit breaker patterns to handle transient network failures gracefully.
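Retry with exponential backoff can be sketched in a few lines. This is a generic illustration and not tif1's retry module: `retry_with_backoff` and its parameters are hypothetical, only `ConnectionError` is treated as transient, and circuit breaking is not shown.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry_with_backoff(
    func: Callable[[], T],
    max_attempts: int = 4,
    base_delay: float = 0.01,
    max_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call func until it succeeds, doubling the wait after each failure.

    max_attempts must be at least 1.
    """
    for attempt in range(max_attempts):
        try:
            return func()
        except ConnectionError:
            if attempt == max_attempts - 1:
                raise  # Out of attempts: surface the last error.
            delay = min(max_delay, base_delay * 2**attempt)
            # Jitter spreads retries out so clients do not retry in lockstep.
            sleep(delay * random.uniform(0.5, 1.0))
    raise RuntimeError("max_attempts must be at least 1")
```

Passing `sleep` as a parameter keeps the function testable without real waiting.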
Quick Start
Get started with tif1 in seconds. The library handles data fetching, parsing, caching, and DataFrame construction behind a simple, intuitive interface.
What Happens Behind the Scenes
When you call get_session(), tif1 runs the following sequence of operations to deliver data quickly and reliably:
- Input Validation and Normalization
  - Validates the year is within supported range (2018-2026)
  - Performs fuzzy matching on event name against the schedule database
  - Normalizes session type (accepts “Race”, “R”, “race”, “RACE”, etc.)
  - Raises DataNotFoundError with helpful context if validation fails
- Schedule Lookup
  - Queries the embedded schedule database (JSON files in src/tif1/data/schedules/)
  - Retrieves event metadata: official name, location, date, session times
  - Determines available sessions for the event (standard vs sprint weekend format)
  - Validates the requested session exists for this event
- Cache Check (SQLite)
  - Computes content-addressed cache key from (year, event, session, data types)
  - Queries SQLite cache database (~/.tif1/cache.db by default)
  - Checks cache TTL (time-to-live) to determine if cached data is still fresh
  - If cache hit and data is fresh, deserializes DataFrames and returns immediately (milliseconds)
  - If cache miss or stale data, proceeds to fetch from CDN
- CDN URL Construction
  - Builds URLs for all requested data sources (laps, telemetry, weather, messages)
  - Uses jsdelivr CDN pointing to TracingInsights GitHub data repositories
  - Constructs fallback URLs in case primary CDN fails
  - Includes cache-busting parameters when needed
- Parallel Async Fetching
  - Creates async tasks for each data source using asyncio.TaskGroup
  - Fetches all data sources in parallel (not sequential) using niquests
  - Uses connection pooling to reuse TCP connections
  - Implements retry logic with exponential backoff for transient failures
  - Falls back to alternative CDN sources if primary fails
  - Typical fetch time: 2-3 seconds for all data sources in parallel
- JSON Parsing
  - Parses JSON responses using orjson (Rust-based, 2-3x faster than stdlib)
  - Validates JSON structure against expected schema
  - Raises InvalidDataError if data is corrupted or malformed
  - Extracts nested data structures (lap arrays, telemetry points, etc.)
- DataFrame Construction
  - Converts parsed JSON into pandas or Polars DataFrames based on config
  - Applies column renaming to match fastf1 conventions
  - Sets appropriate data types (float64 for times, int64 for lap numbers, category for compounds)
  - Reorders columns to standard layout
  - Adds computed columns (IsPersonalBest, TyreLife, etc.)
  - Handles missing data gracefully (NaN for missing telemetry, empty DataFrames for missing sessions)
- Data Validation (Optional)
  - If validate_data config is enabled, runs Pydantic validation on data structures
  - Checks for logical consistency (lap times > 0, sector times sum to lap time, etc.)
  - Validates driver codes against known driver list
  - Can be disabled for 10-15% performance improvement in production
- Cache Storage
  - Serializes DataFrames to efficient binary format (pickle or parquet)
  - Stores in SQLite database with metadata (timestamp, data types, size)
  - Compresses data to reduce storage (typical compression ratio: 3-5x)
  - Updates cache statistics (hit rate, total size, entry count)
- Object Construction
  - Creates Session object with all loaded data
  - Initializes Driver objects for each driver in the session
  - Sets up lazy-loading for telemetry data (loaded on first access)
  - Establishes relationships between objects (Session → Driver → Lap → Telemetry)
  - Returns fully-initialized Session object ready for analysis
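The cache check in the steps above can be sketched with the standard library. This is an illustrative sketch and not tif1's cache module: `cache_key` and `SqliteCache` are hypothetical names, the key is a SHA-256 hash of the request parameters, the table layout is an assumption, and an in-memory database stands in for the on-disk file.

```python
import hashlib
import json
import sqlite3
import time
from typing import Optional


def cache_key(year: int, event: str, session: str, data_types: list[str]) -> str:
    """Derive a stable key; sorting makes it independent of data type order."""
    payload = json.dumps([year, event, session, sorted(data_types)])
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


class SqliteCache:
    """Tiny key-value cache with a time-to-live, backed by SQLite."""

    def __init__(self, path: str = ":memory:", ttl_seconds: float = 3600.0) -> None:
        self.ttl_seconds = ttl_seconds
        self.conn = sqlite3.connect(path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS cache "
            "(key TEXT PRIMARY KEY, value BLOB, stored_at REAL)"
        )

    def put(self, key: str, value: bytes, now: Optional[float] = None) -> None:
        """Insert or overwrite an entry, stamping it with the storage time."""
        stored_at = time.time() if now is None else now
        self.conn.execute(
            "INSERT OR REPLACE INTO cache VALUES (?, ?, ?)", (key, value, stored_at)
        )
        self.conn.commit()

    def get(self, key: str, now: Optional[float] = None) -> Optional[bytes]:
        """Return the cached bytes, or None on a miss or a stale entry."""
        row = self.conn.execute(
            "SELECT value, stored_at FROM cache WHERE key = ?", (key,)
        ).fetchone()
        if row is None:
            return None
        value, stored_at = row
        current = time.time() if now is None else now
        return value if current - stored_at <= self.ttl_seconds else None
```

A caller would treat a `None` result as the signal to fetch from the network and then call `put` with the fresh payload.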
Performance Characteristics
Understanding the performance profile helps you optimize your workflows:
- Cache Hit (Warm): 1-5 milliseconds
  - Data loaded from SQLite cache
  - DataFrame deserialization from binary format
  - No network I/O
- Cache Miss (Cold): 2-5 seconds
  - Parallel async fetching of all data sources
  - JSON parsing and DataFrame construction
  - Cache storage for future use
  - Dominated by network latency, not CPU
- Partial Cache Hit: 500ms - 2 seconds
  - Some data sources cached, others fetched
  - Only missing data sources are fetched
  - Faster than full cold start
- Ultra Cold Start Mode: 100-300 milliseconds
  - Optimized for single-query scenarios
  - Skips some cache checks and optimizations
  - Trades repeated-query performance for first-query speed
  - Enable with config.set("ultra_cold_start", True)
- Polars Backend: 30-50% faster for large datasets
  - Better memory efficiency (2-5x less RAM)
  - Faster filtering and aggregation operations
  - Lazy evaluation for complex queries
  - Enable with config.set("lib", "polars")
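The partial cache hit case, where only missing data sources are fetched, can be sketched as a split between hits and misses. The helpers `plan_fetch` and `load_sources` are hypothetical, and a plain dictionary stands in for the cache.

```python
from typing import Callable


def plan_fetch(
    requested: list[str], cached: dict[str, bytes]
) -> tuple[dict[str, bytes], list[str]]:
    """Split requested sources into cache hits and sources still to fetch."""
    hits = {name: cached[name] for name in requested if name in cached}
    missing = [name for name in requested if name not in cached]
    return hits, missing


def load_sources(
    requested: list[str],
    cached: dict[str, bytes],
    fetch: Callable[[str], bytes],
) -> tuple[dict[str, bytes], list[str]]:
    """Serve hits from the cache, fetch the rest, and remember what was fetched."""
    data, missing = plan_fetch(requested, cached)
    for name in missing:
        cached[name] = fetch(name)  # Store for the next call.
        data[name] = cached[name]
    return data, missing
```

The second return value lists what had to be fetched, so a repeated call with the same request reports an empty list.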
Entry Points
The tif1 API exposes five primary entry points that serve as the foundation for all data access, configuration, and cache management. These functions are designed to be the only imports you need for most use cases:
| Function | Purpose | Return Type | Typical Use Case | Documentation |
|---|---|---|---|---|
| get_session() | Load data for a specific GP session (Practice, Qualifying, Sprint, Race) | Session | Primary data access for analysis | Core API |
| get_events() | List all available Grand Prix events for a given year with metadata | pd.DataFrame | Event discovery and schedule planning | Events API |
| get_sessions() | List all available sessions for a specific event (FP1, FP2, FP3, Q, S, R) | list[str] | Session discovery for event | Events API |
| get_config() | Access and modify global configuration (backend, cache, performance) | Config | Performance tuning and customization | Config API |
| get_cache() | Access cache management for clearing, inspecting, and optimizing storage | Cache | Cache maintenance and debugging | Cache API |
Detailed Entry Point Usage
get_session(year, event, session_type, **kwargs)
The primary and most important entry point for loading Formula 1 session data. This function is designed to be flexible, forgiving, and powerful—accepting multiple input formats while providing fine-grained control over what data gets loaded.
Parameters:
- year (int): Championship year from 2018 to 2026
  - 2018-2024: Complete historical data
  - 2025-2026: Partial data (as events occur)
  - Earlier years: Not supported (data format changed)
  - Future years: Will be supported as data becomes available
- event (str): Event name with flexible matching
  - Official names: “Monaco Grand Prix”, “British Grand Prix”
  - Circuit names: “Silverstone”, “Spa-Francorchamps”
  - Location names: “Monaco”, “Belgium”, “Great Britain”
  - Partial matches: “monaco”, “silver”, “spa”
  - Case insensitive: “MONACO”, “Monaco”, “monaco” all work
  - Fuzzy matching uses Levenshtein distance with a threshold of 0.7
- session_type (str): Session type with multiple formats
  - Full names: “Practice 1”, “Practice 2”, “Practice 3”, “Qualifying”, “Sprint”, “Race”
  - Short codes: “FP1”, “FP2”, “FP3”, “Q”, “S”, “R”
  - Sprint weekends: “Sprint Qualifying” or “SQ”
  - Case insensitive: “race”, “RACE”, “Race” all work
  - Normalized internally to canonical form
- laps (bool): Load lap timing data
  - Includes: lap times, sector times, tire compounds, tire life, positions, pit stops
  - Size: ~50-200 KB per session (compressed)
  - Load time: ~200-500ms (cold), <5ms (warm)
  - Required for: almost all analysis workflows
- telemetry (bool): Load high-frequency telemetry
  - Includes: speed, RPM, throttle, brake, gear, DRS, position (X/Y/Z)
  - Sampling rate: ~10-20 Hz (10-20 samples per second)
  - Size: ~5-20 MB per session (compressed)
  - Load time: ~1-3 seconds (cold), ~10-50ms (warm)
  - Required for: detailed lap analysis, corner analysis, driving style comparison
  - Optional for: lap time analysis, strategy analysis
- weather (bool): Load weather conditions
  - Includes: air temp, track temp, humidity, pressure, wind speed/direction, rainfall
  - Sampling rate: ~1 sample per minute
  - Size: ~5-10 KB per session
  - Load time: ~100-200ms (cold), <5ms (warm)
  - Required for: understanding tire performance, strategy decisions
  - Optional for: pure lap time analysis
- messages (bool): Load race control messages
  - Includes: flags (yellow, red, green), safety cars, penalties, DRS status
  - Size: ~10-50 KB per session
  - Load time: ~100-200ms (cold), <5ms (warm)
  - Required for: understanding race incidents, strategy impacts
  - Optional for: qualifying analysis, practice analysis
- backend (Literal["pandas", "polars"] | None): DataFrame backend
  - None: Use global config setting (default)
  - "pandas": Use pandas DataFrames (more compatible, more features)
  - "polars": Use Polars DataFrames (faster, more memory efficient)
  - Can be changed per-session without affecting global config
  - Polars requires the polars package to be installed
- force_reload (bool): Bypass cache
  - False: Use cache if available (default, recommended)
  - True: Always fetch from CDN, ignore cache
  - Useful for: debugging, getting latest data, cache corruption
  - Slower: adds 2-5 seconds to load time
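Normalizing the session type to a canonical form can be sketched with a lookup table. The full names and short codes come from the parameter list above; the function name `normalize_session_type`, the choice of the full name as the canonical form, and the use of `ValueError` are assumptions made for illustration.

```python
# Canonical names and short codes taken from the parameter list above.
_CANONICAL = {
    "FP1": "Practice 1",
    "FP2": "Practice 2",
    "FP3": "Practice 3",
    "Q": "Qualifying",
    "SQ": "Sprint Qualifying",
    "S": "Sprint",
    "R": "Race",
}

# Lookup table keyed by lowercase text, accepting both codes and full names.
_LOOKUP = {code.lower(): name for code, name in _CANONICAL.items()}
_LOOKUP.update({name.lower(): name for name in _CANONICAL.values()})


def normalize_session_type(value: str) -> str:
    """Map any accepted spelling to its canonical full name."""
    key = " ".join(value.split()).lower()  # Trim and collapse whitespace.
    try:
        return _LOOKUP[key]
    except KeyError:
        raise ValueError(f"Unknown session type: {value!r}") from None
```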
Returns a Session object; its key attributes and methods are described under Core Objects.
Tips for faster loading:
- Load only what you need: Disable unused data sources
- Use cache effectively: Don’t use force_reload unless necessary
- Consider Polars for large datasets: 2-5x better memory efficiency
- Batch load sessions: Use async methods for parallel loading
get_events(year, **kwargs)
Retrieve the complete event schedule for a championship year, including all Grand Prix events with their metadata, dates, locations, and session structures.
Returns a DataFrame with the following columns:
- RoundNumber (int): Sequential round number in championship (1-24)
- EventName (str): Short event name (e.g., “Monaco Grand Prix”)
- OfficialEventName (str): Full official name (e.g., “Formula 1 Grand Prix de Monaco 2024”)
- Location (str): Circuit location/city (e.g., “Monte Carlo”)
- Country (str): ISO country code (e.g., “MC” for Monaco)
- EventDate (str/datetime): Date of main race
- EventFormat (str): “standard” or “sprint”
- Session1-5 (str): Session names (“Practice 1”, “Qualifying”, “Race”, etc.)
- Session1-5Date (str): Local date/time for each session
- Session1-5DateUtc (str): UTC date/time for each session
- F1ApiSupport (bool): Whether F1 official API supports this event
Performance:
- Schedule data is embedded in the package (no network I/O)
- Load time: <1ms (reading from JSON files)
- Data size: ~10-20 KB per year
- No caching needed (always instant)
get_sessions(year, event)
List all available sessions for a specific event, accounting for standard vs sprint weekend formats.
Returns a list of session names (list[str]). Typical uses:
- Validate session existence before loading
- Build UI dropdowns for session selection
- Iterate through all sessions in an event
- Handle standard vs sprint weekend differences
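Validating a session against the weekend format before loading can be sketched as below. The session orders in `SESSIONS_BY_FORMAT` are assumptions for illustration (real schedules vary by season), and `available_sessions` and `require_session` are hypothetical helpers, not tif1 functions.

```python
# Assumed session orders per weekend format; actual schedules vary by season.
SESSIONS_BY_FORMAT = {
    "standard": ["Practice 1", "Practice 2", "Practice 3", "Qualifying", "Race"],
    "sprint": ["Practice 1", "Sprint Qualifying", "Sprint", "Qualifying", "Race"],
}


def available_sessions(event_format: str) -> list[str]:
    """Return a copy of the session names for a weekend format."""
    try:
        return list(SESSIONS_BY_FORMAT[event_format])
    except KeyError:
        raise ValueError(f"Unknown event format: {event_format!r}") from None


def require_session(event_format: str, session: str) -> str:
    """Return the session name if the weekend includes it, else raise."""
    sessions = available_sessions(event_format)
    if session not in sessions:
        raise ValueError(
            f"{session!r} is not held on a {event_format} weekend; "
            f"choose from {sessions}"
        )
    return session
```

Checking membership first gives a clear error that lists the valid choices, which also suits a session dropdown in a UI.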
get_config()
Returns the global Config singleton, used to tune performance parameters, switch backends, and control behavior.
get_cache()
Returns the Cache object for clearing, inspecting, and optimizing the SQLite-backed cache.
Core Objects
The tif1 API is built around a clear object hierarchy that mirrors the structure of Formula 1 data. Understanding this hierarchy is key to effective use of the library.
Primary Objects
These are the main objects you’ll interact with when using tif1:
Session
The Session object represents an entire Formula 1 session (Practice, Qualifying, Sprint, or Race) and serves as the primary container for all session data.
Key Attributes:
- session.laps - All lap timing data from all drivers as a DataFrame
- session.weather - Weather conditions throughout the session
- session.race_control_messages - Official race control messages and flags
- session.results - Final classification and session results
- session.circuit_info - Circuit metadata (length, corners, location)
- session.drivers - List of driver codes (e.g., ['VER', 'HAM', 'LEC'])
- session.session_info - Session metadata (date, time, type)
Key Methods:
- get_driver(code) - Get a Driver object for a specific driver
- get_fastest_laps(by_driver=False) - Get fastest lap(s) from the session
- get_laps_by_driver(code) - Get all laps for a specific driver
- load() - Explicitly load data (usually called automatically)
Driver
The Driver object represents a specific driver within a session and provides convenient access to driver-specific data and operations.
Key Attributes:
- driver.code - Three-letter driver code (e.g., 'VER')
- driver.name - Full driver name (e.g., 'Max Verstappen')
- driver.team - Team name (e.g., 'Red Bull Racing')
- driver.number - Racing number (e.g., 33)
- driver.laps - All laps completed by this driver as a DataFrame
- driver.telemetry - Telemetry data for all laps (if loaded)
Key Methods:
- get_lap(lap_number) - Get a specific lap by number
- get_fastest_lap() - Get the driver’s fastest lap
- get_fastest_lap_tel() - Get telemetry for the fastest lap
- get_lap_telemetry(lap_number) - Get telemetry for a specific lap
- get_stint_data() - Analyze tire stint performance
Lap
The Lap object represents a single lap by a driver and provides access to high-frequency telemetry data for that specific lap.
Key Attributes:
- lap.lap_number - Lap number in the session
- lap.lap_time - Total lap time in seconds
- lap.sector_times - Individual sector times
- lap.compound - Tire compound used
- lap.telemetry - High-frequency telemetry DataFrame
Key Methods:
- get_telemetry() - Load telemetry data for this lap
- get_speed_trace() - Get speed data along the lap
- get_throttle_trace() - Get throttle application data
- compare_to(other_lap) - Compare telemetry with another lap
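The Session → Driver → Lap hierarchy with telemetry loaded on first access can be sketched with dataclasses. This is a simplified illustration, not the library's classes: the fields are a small subset, `driver_map` and `_load_telemetry` are hypothetical, telemetry is a list of dicts instead of a DataFrame, and `functools.cached_property` is one possible way to get lazy loading.

```python
from dataclasses import dataclass, field
from functools import cached_property

LOAD_CALLS: list[int] = []  # Records which laps triggered a telemetry load.


def _load_telemetry(lap_number: int) -> list[dict]:
    """Stand-in for an expensive fetch; returns two fake samples."""
    LOAD_CALLS.append(lap_number)
    return [{"Distance": 0.0, "Speed": 280}, {"Distance": 50.0, "Speed": 295}]


@dataclass
class Lap:
    lap_number: int
    lap_time: float  # Seconds.

    @cached_property
    def telemetry(self) -> list[dict]:
        """Loaded on first access, then stored on the instance."""
        return _load_telemetry(self.lap_number)


@dataclass
class Driver:
    code: str
    laps: list[Lap] = field(default_factory=list)

    def get_lap(self, lap_number: int) -> Lap:
        """Return the lap with the given number, or raise LookupError."""
        for lap in self.laps:
            if lap.lap_number == lap_number:
                return lap
        raise LookupError(f"{self.code} has no lap {lap_number}")

    def get_fastest_lap(self) -> Lap:
        """Return the lap with the smallest lap time."""
        return min(self.laps, key=lambda lap: lap.lap_time)


@dataclass
class Session:
    driver_map: dict[str, Driver] = field(default_factory=dict)

    def get_driver(self, code: str) -> Driver:
        """Look up a driver by three-letter code."""
        return self.driver_map[code]
```

Because each method returns the next object down the hierarchy, calls chain naturally, and no telemetry is loaded until a lap's `telemetry` attribute is read.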
Data Models
These objects represent structured data returned by the API:
Laps
A DataFrame containing lap timing data with rich filtering and analysis methods.
Columns:
- Driver - Driver code
- LapNumber - Sequential lap number
- LapTime - Total lap time (seconds)
- Sector1Time, Sector2Time, Sector3Time - Sector times
- Compound - Tire compound (SOFT, MEDIUM, HARD, INTERMEDIATE, WET)
- TyreLife - Age of tires in laps
- IsPersonalBest - Boolean flag for personal best lap
- Position - Track position at lap completion
- PitInTime, PitOutTime - Pit stop timing
- TrackStatus - Track condition flags
Methods:
- pick_driver(code) - Filter to specific driver
- pick_fastest() - Get fastest lap
- pick_compound(compound) - Filter by tire compound
- pick_track_status(status) - Filter by track status
Telemetry
A DataFrame containing high-frequency telemetry data sampled at ~10-20 Hz.
Columns:
- Distance - Distance along track (meters)
- Speed - Speed (km/h)
- RPM - Engine RPM
- Throttle - Throttle position (0-100%)
- Brake - Brake pressure (boolean or 0-100%)
- DRS - DRS status (0=closed, 1=open)
- Gear - Current gear (1-8)
- X, Y, Z - 3D position coordinates
SessionResults
Final classification and results for the session.
Attributes:
- position - Final position
- driver_code - Driver code
- driver_name - Full name
- team - Team name
- points - Championship points awarded
- status - Finish status (Finished, DNF, DNS, etc.)
- time - Total race time or gap to leader
- fastest_lap - Fastest lap time
- fastest_lap_number - Lap number of fastest lap
CircuitInfo
Circuit metadata and track information.
Attributes:
- name - Circuit name
- location - City/region
- country - Country
- length_km - Track length in kilometers
- corners - Number of corners
- drs_zones - Number of DRS zones
- lap_record - Lap record time and holder
API Categories
Core data access
Essential APIs for loading and working with F1 data:
- Core API - Session, Driver, Lap classes and data loading
- Models - Data model classes and structures
- Events - Event and session discovery
- Schedule - Schedule validation and schema
Data Pipeline
Internal data transformation and fetching:
- I/O Pipeline - DataFrame construction and transformation
- Async Fetch - Parallel HTTP fetching with niquests
- HTTP - HTTP session and networking
- HTTP Session - Connection pooling and DoH support
- CDN - Multi-source CDN management with fallback
Configuration & Cache
Performance and reliability management:
- Config - Global configuration management
- Cache - SQLite-backed caching system
- Retry - Circuit breaker and retry logic
Visualization & Tools
User-facing utilities:
- Plotting - F1-themed visualization utilities
- CLI - Command-line interface
- Jupyter - Notebook integration and rich displays
Utilities & Helpers
Supporting functionality:
- Utils - General utility functions
- Utilities - Helper functions for configuration and logging
- Core Utils - Internal utilities for DataFrame operations
- Types - Type definitions and hints
- Validation - Pydantic models and data validation
- Fuzzy - Fuzzy string matching for event/session names
- Lap Operations - Utilities for working with lap data
Compatibility & Errors
Integration and error handling:
- FastF1 Compat - FastF1 compatibility layer
- Exceptions - Exception hierarchy and error handling
Common Workflows
Load and explore session
Analyze driver performance
Parallel data loading
Configuration and Optimization
Error Handling
tif1 uses a hierarchy of specific exceptions to help you handle common issues:
Exception Hierarchy
Common Exceptions
- DataNotFoundError: The requested GP, year, or session doesn’t exist
- NetworkError: All CDN sources failed or timed out
- InvalidDataError: The data fetched from the CDN was corrupted
- DriverNotFoundError: The driver code provided is not in this session
- LapNotFoundError: The requested lap number does not exist for this driver
- CacheError: Cache operation failed
- SessionNotLoadedError: Attempted to access data before loading session
Error handling example
Performance Tips
- Use async methods for parallel loading: 5-10x faster than sequential
- Enable ultra-cold start for single queries: Minimal latency
- Use polars lib for large datasets: Better memory efficiency
- Disable validation in production: 10-15% performance boost
- Increase concurrency for bulk operations: Faster parallel fetching
Type Hints
All public APIs include comprehensive type hints for better IDE support.
Next Steps
- Core API - Start with Session, Driver, and Lap classes
- Tutorials - Learn through practical examples
- Best Practices - Optimize your code
- FAQ - Common questions and answers
Related Pages
- Core API - Session, Driver, Lap
- Models - Data models
- Events - Event discovery
- Examples - Code examples