Data Validation
The validation module provides a comprehensive, Pydantic-based validation system that ensures data integrity and catches malformed responses from the CDN before they reach your application code.
Overview
tif1’s validation system acts as a quality gate between raw JSON data from the CDN and your application. It performs deep structural validation, type checking, and anomaly detection to ensure you’re working with clean, consistent data.
What Validation Catches
The validation system detects and handles:
- Missing Required Fields: Ensures all mandatory data fields are present
- Incorrect Data Types: Validates that numeric fields contain numbers, booleans are booleans, etc.
- Inconsistent Array Lengths: Verifies all parallel arrays have matching lengths (critical for DataFrame construction)
- Invalid Enum Values: Checks tire compounds, session types, and other categorical data against known valid values
- Null-like String Values: Automatically converts "", "none", "null", "nan" to proper None values
- Out-of-Range Values: Validates numeric constraints (e.g., RPM < 20,000, gear <= 8, stint >= 1)
- Data Anomalies: Detects missing laps, duplicate lap numbers, and statistical outliers
- Field Aliases: Handles both verbose and abbreviated field names from different CDN formats
Validation is disabled by default for optimal performance. The library is designed for speed, and validation adds 10-15% overhead. Enable validation during development, debugging, or when working with untrusted data sources.
When to Enable Validation:
- Development and testing environments
- First-time data exploration for new seasons/events
- Debugging data quality issues
- Working with experimental or beta CDN endpoints
- Building data quality monitoring pipelines
When to Keep Validation Disabled:
- Production environments with trusted data
- Performance-critical applications
- Batch processing large datasets
- Repeated analysis of the same sessions
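One way to follow the guidance above is to make validation opt-in per environment. The sketch below assumes a hypothetical environment variable named TIF1_VALIDATE; tif1 does not define it, and the helper name is illustrative only.

```python
import os

# Hypothetical environment variable name; tif1 itself does not define it.
ENV_FLAG = "TIF1_VALIDATE"
TRUTHY = {"1", "true", "yes", "on"}


def should_validate(env=None) -> bool:
    """Return True when the (assumed) opt-in flag is set to a truthy value."""
    env = os.environ if env is None else env
    return env.get(ENV_FLAG, "").strip().lower() in TRUTHY


# In an application, the result would be passed to the validate_data
# setting, e.g. config.set("validate_data", should_validate()).
print(should_validate({"TIF1_VALIDATE": "true"}))  # True
print(should_validate({}))                         # False
```

Keeping the decision in one small function lets development machines and CI turn validation on without touching production code paths.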
Core Validation Functions
validate_laps
Validates lap timing data structure with comprehensive field checking and length consistency validation.
def validate_laps(data: dict) -> LapData
Purpose:
Validates raw lap timing JSON from the CDN, ensuring all required fields are present, arrays have consistent lengths, and values meet domain constraints (e.g., stint >= 1, tire life >= 0).
Parameters:
data (dict): Raw JSON dictionary from CDN containing lap timing arrays
Returns:
LapData: Validated Pydantic model with all fields type-checked and normalized
Raises:
pydantic.ValidationError: If validation fails (missing fields, type mismatches, inconsistent lengths)
Validation Rules:
- All required fields must be present: time, lap, s1, s2, s3, compound, stint, life, pos, status, pb
- All non-empty arrays must have identical length
- Stint numbers must be >= 1
- Tire life must be >= 0
- Null-like strings ("", "none", "null", "nan") are automatically converted to None
- Optional fields (session_time, pit times, speed traps, weather) are validated if present
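The null normalization and minimum-value rules above can be pictured with a few lines of plain Python. This is a minimal sketch, not the library's implementation: the helper names are hypothetical, and treating the null-like strings case-insensitively is an assumption.

```python
NULL_LIKE = {"", "none", "null", "nan"}


def normalize_nulls(values):
    """Replace null-like strings with None, leaving other values untouched."""
    # Case-insensitive matching is an assumption made for this sketch.
    return [
        None if isinstance(v, str) and v.strip().lower() in NULL_LIKE else v
        for v in values
    ]


def check_minimum(name, values, minimum):
    """Raise ValueError if any non-None value is below the minimum."""
    bad = [v for v in values if v is not None and v < minimum]
    if bad:
        raise ValueError(f"{name} must be >= {minimum}, got {bad}")


stint = normalize_nulls([1, 1, "null", 2])
life = normalize_nulls([0, 1, "NaN", 3])
check_minimum("stint", stint, 1)  # stint >= 1
check_minimum("life", life, 0)    # tire life >= 0
print(stint)  # [1, 1, None, 2]
print(life)   # [0, 1, None, 3]
```

Normalizing first matters: a string such as "null" cannot be compared with a number, so the range check only ever sees numbers or None.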
Example:
from tif1.validation import validate_laps
# Raw lap data from CDN
raw_data = {
"time": [90.123, 89.456, 88.789],
"lap": [1.0, 2.0, 3.0],
"s1": [30.1, 29.8, 29.5],
"s2": [30.0, 29.7, 29.4],
"s3": [30.0, 29.9, 29.9],
"compound": ["SOFT", "SOFT", "SOFT"],
"stint": [1, 1, 1],
"life": [1, 2, 3],
"pos": [1, 1, 1],
"status": ["1", "1", "1"],
"pb": [False, True, True],
# Optional fields
"sesT": [100.0, 189.5, 278.3], # Session time (aliased)
"drv": ["VER", "VER", "VER"], # Driver code (aliased)
"dNum": ["33", "33", "33"], # Driver number (aliased)
}
try:
    validated = validate_laps(raw_data)
    print(f"✓ Validated {len(validated.lap)} laps")
    # Count the fields that actually carry data (required and optional)
    print(f" Populated fields: {sum(1 for name in type(validated).model_fields if getattr(validated, name))}")
    # Access validated data
    print(f" Fastest lap: {min(validated.time)} seconds")
    print(f" Compounds used: {set(validated.compound)}")
except Exception as e:
    print(f"✗ Validation failed: {e}")
Common Validation Errors:
# Error: Inconsistent array lengths
bad_data = {
"time": [90.1, 89.5], # 2 elements
"lap": [1.0, 2.0, 3.0], # 3 elements - MISMATCH!
# ... other fields
}
# Raises: ValueError: Inconsistent lap data lengths
# Error: Invalid stint number
bad_data = {
"time": [90.1],
"lap": [1.0],
"stint": [0], # Must be >= 1
# ... other fields
}
# Raises: ValueError: Stint numbers must be >= 1
# Error: Negative tire life
bad_data = {
"time": [90.1],
"lap": [1.0],
"life": [-1], # Must be >= 0
# ... other fields
}
# Raises: ValueError: Tire life must be >= 0
validate_telemetry
Validates high-frequency telemetry data in batch mode (significantly faster than point-by-point validation).
def validate_telemetry(data: dict) -> TelemetryData
Purpose:
Validates raw telemetry JSON containing arrays of sensor readings (speed, RPM, throttle, brake, etc.). Handles nested tel objects and performs boolean coercion for brake/DRS fields.
Parameters:
data (dict): Raw JSON dictionary from CDN containing telemetry arrays
Returns:
TelemetryData: Validated Pydantic model with normalized field names and types
Raises:
pydantic.ValidationError: If validation fails
Validation Rules:
- Required fields: time, speed (minimum 1 element each)
- All non-empty arrays must have identical length
- Automatically unwraps nested tel objects if present
- Boolean coercion for brake and drs fields (handles numeric 0/1 values)
- Null-like strings converted to None
- Supports both aliased (DriverAhead, DistanceToDriverAhead, dataKey) and standard field names
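To make the unwrapping and boolean coercion rules concrete, here is a minimal sketch in plain Python. The function names are hypothetical and the logic is a simplification: it keeps only the contents of the nested tel object, which may differ from how the library treats sibling keys.

```python
def unwrap_tel(data: dict) -> dict:
    """Return the nested "tel" mapping if present, else a copy of the data."""
    tel = data.get("tel")
    return dict(tel) if isinstance(tel, dict) else dict(data)


def coerce_flags(values):
    """Map 0/1 (or bools) to booleans; None passes through unchanged."""
    return [None if v is None else bool(v) for v in values]


payload = {
    "tel": {"time": [0.0, 0.1], "speed": [250.0, 251.0], "brake": [0, 1]},
    "metadata": {"driver": "VER"},
}
flat = unwrap_tel(payload)
flat["brake"] = coerce_flags(flat["brake"])
print(sorted(flat))   # ['brake', 'speed', 'time']
print(flat["brake"])  # [False, True]
```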
Special Handling:
The validator handles two common CDN formats:
# Format 1: Flat structure
{
"time": [0.0, 0.1, 0.2],
"speed": [250.0, 251.0, 252.0],
"rpm": [12000, 12100, 12200]
}
# Format 2: Nested tel object (automatically unwrapped)
{
"tel": {
"time": [0.0, 0.1, 0.2],
"speed": [250.0, 251.0, 252.0],
"rpm": [12000, 12100, 12200]
}
}
Example:
from tif1.validation import validate_telemetry
# Raw telemetry data from CDN
raw_data = {
"time": [0.0, 0.1, 0.2, 0.3],
"speed": [250.5, 251.2, 252.0, 253.1],
"rpm": [12000, 12100, 12200, 12300],
"gear": [7, 7, 7, 7],
"throttle": [100.0, 100.0, 99.5, 98.0],
"brake": [0, 0, 0, 1], # Numeric values (will be coerced to bool)
"drs": [1, 1, 1, 0], # Numeric values (will be coerced to bool)
"distance": [0.0, 25.0, 50.0, 75.0],
"x": [100.5, 101.2, 102.0, 102.8],
"y": [50.0, 50.1, 50.2, 50.3],
"z": [0.0, 0.0, 0.0, 0.0],
}
try:
    validated = validate_telemetry(raw_data)
    print(f"✓ Validated {len(validated.time)} telemetry samples")
    print(f" Time range: {validated.time[0]:.3f}s - {validated.time[-1]:.3f}s")
    print(f" Speed range: {min(validated.speed):.1f} - {max(validated.speed):.1f} km/h")
    print(f" Max RPM: {max(validated.rpm)}")
    # Boolean fields are properly typed, so sum() counts the True samples
    print(f" Braking samples: {sum(validated.brake)}")
    print(f" DRS active samples: {sum(validated.drs)}")
except Exception as e:
    print(f"✗ Validation failed: {e}")
Working with Nested Tel Objects:
# CDN sometimes returns nested structure
nested_data = {
"tel": {
"time": [0.0, 0.1],
"speed": [250.0, 251.0],
"rpm": [12000, 12100]
},
"metadata": {"driver": "VER", "lap": 5}
}
# Validator automatically unwraps the tel object
validated = validate_telemetry(nested_data)
print(f"Time: {validated.time}") # [0.0, 0.1]
print(f"Speed: {validated.speed}") # [250.0, 251.0]
Performance Note:
Batch validation is ~50x faster than point-by-point validation. Always use validate_telemetry() for array data rather than validating individual telemetry points.
validate_drivers
Validates driver information data structure.
def validate_drivers(data: dict) -> DriversData
Purpose:
Validates driver roster data from the CDN, ensuring all driver codes follow the 3-letter format and required metadata fields are present.
Parameters:
data (dict): Raw JSON dictionary from CDN with drivers array
Returns:
DriversData: Validated Pydantic model containing list of DriverInfo objects
Raises:
pydantic.ValidationError: If validation fails
Validation Rules:
driver: Must be exactly 3 uppercase letters (e.g., “VER”, “HAM”, “LEC”)
team: Team name, 1-100 characters
dn: Driver number (string)
fn: First name (required)
ln: Last name (required)
tc: Team color hex code (required)
url: Headshot photo URL (required)
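The rules above amount to a presence check plus two format checks. The following sketch expresses them with the standard library only; the function name and error strings are illustrative and are not the messages Pydantic would produce.

```python
import re

DRIVER_CODE = re.compile(r"^[A-Z]{3}$")
REQUIRED = ("driver", "team", "dn", "fn", "ln", "tc", "url")


def driver_errors(entry: dict) -> list[str]:
    """Collect rule violations for one driver entry (empty list means valid)."""
    errors = [f"missing field: {key}" for key in REQUIRED if key not in entry]
    code = entry.get("driver")
    if isinstance(code, str) and not DRIVER_CODE.fullmatch(code):
        errors.append("driver must be exactly 3 uppercase letters")
    team = entry.get("team")
    if isinstance(team, str) and not 1 <= len(team) <= 100:
        errors.append("team must be 1-100 characters")
    return errors


# Lowercase code and several missing fields: every problem is reported.
print(driver_errors({"driver": "ver", "team": "Red Bull Racing"}))
```

Collecting all violations before reporting mirrors how a validation error typically lists every failing field rather than stopping at the first one.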
Example:
from tif1.validation import validate_drivers
# Raw driver data from CDN
raw_data = {
"drivers": [
{
"driver": "VER",
"team": "Red Bull Racing",
"dn": "33",
"fn": "Max",
"ln": "Verstappen",
"tc": "3671C6",
"url": "https://example.com/verstappen.png"
},
{
"driver": "HAM",
"team": "Mercedes",
"dn": "44",
"fn": "Lewis",
"ln": "Hamilton",
"tc": "27F4D2",
"url": "https://example.com/hamilton.png"
}
]
}
try:
    validated = validate_drivers(raw_data)
    print(f"✓ Validated {len(validated.drivers)} drivers")
    for driver in validated.drivers:
        print(f" {driver.driver} - {driver.fn} {driver.ln} ({driver.team})")
        print(f" Number: #{driver.dn}, Color: #{driver.tc}")
except Exception as e:
    print(f"✗ Validation failed: {e}")
Common Validation Errors:
# Error: Invalid driver code format
bad_data = {
"drivers": [{
"driver": "VERS", # 4 letters - must be exactly 3!
"team": "Red Bull Racing",
# ... other fields
}]
}
# Raises: ValidationError: driver must match pattern ^[A-Z]{3}$
# Error: Driver code not uppercase
bad_data = {
"drivers": [{
"driver": "ver", # Lowercase - must be uppercase!
"team": "Red Bull Racing",
# ... other fields
}]
}
# Raises: ValidationError: driver must match pattern ^[A-Z]{3}$
# Error: Team name too long
bad_data = {
"drivers": [{
"driver": "VER",
"team": "A" * 101, # 101 characters - max is 100!
# ... other fields
}]
}
# Raises: ValidationError: team must be at most 100 characters
Integration with Session:
import tif1
# Enable validation
config = tif1.get_config()
config.set("validate_data", True)
# Driver data is automatically validated during session load
session = tif1.get_session(2025, "Monaco", "Race")
# If drivers.json is malformed, validation error is raised here
validate_weather
Validates weather data structure with automatic key normalization.
def validate_weather(data: dict) -> WeatherData
Purpose:
Validates weather sensor data from the CDN, handling both PascalCase and aliased field names. Ensures consistent array lengths for time-series weather data.
Parameters:
data (dict): Raw JSON dictionary from CDN containing weather arrays
Returns:
WeatherData: Validated Pydantic model with normalized field names
Raises:
pydantic.ValidationError: If validation fails
Validation Rules:
- Required field: time (or alias wT) - timestamp in seconds
- All non-empty arrays must have identical length
- Automatically normalizes PascalCase keys (Time, AirTemp) to snake_case
- Accepts both verbose (air_temp) and aliased (wAT) field names
- Null-like strings converted to None
Supported Field Formats:
The validator accepts three naming conventions:
# Format 1: Aliased (compact CDN format)
{
"wT": [0.0, 60.0, 120.0],
"wAT": [25.5, 25.7, 25.9],
"wTT": [35.2, 35.5, 35.8]
}
# Format 2: PascalCase (legacy CDN format)
{
"Time": [0.0, 60.0, 120.0],
"AirTemp": [25.5, 25.7, 25.9],
"TrackTemp": [35.2, 35.5, 35.8]
}
# Format 3: snake_case (normalized format)
{
"time": [0.0, 60.0, 120.0],
"air_temp": [25.5, 25.7, 25.9],
"track_temp": [35.2, 35.5, 35.8]
}
Example:
from tif1.validation import validate_weather
# Raw weather data from CDN (aliased format)
raw_data = {
"wT": [0.0, 60.0, 120.0, 180.0], # Time (seconds)
"wAT": [25.5, 25.7, 25.9, 26.1], # Air temperature (°C)
"wTT": [35.2, 35.5, 35.8, 36.0], # Track temperature (°C)
"wH": [60.0, 61.0, 62.0, 63.0], # Humidity (%)
"wP": [1013.0, 1013.2, 1013.5, 1013.7], # Pressure (mbar)
"wR": [False, False, False, True], # Rainfall
"wWD": [180.0, 185.0, 190.0, 195.0], # Wind direction (degrees)
"wWS": [5.0, 5.5, 6.0, 6.5], # Wind speed (km/h)
}
try:
    validated = validate_weather(raw_data)
    print(f"✓ Validated {len(validated.time)} weather samples")
    print(f" Time range: {validated.time[0]:.0f}s - {validated.time[-1]:.0f}s")
    print(f" Air temp range: {min(validated.air_temp):.1f}°C - {max(validated.air_temp):.1f}°C")
    print(f" Track temp range: {min(validated.track_temp):.1f}°C - {max(validated.track_temp):.1f}°C")
    print(f" Rainfall detected: {any(validated.rainfall)}")
except Exception as e:
    print(f"✗ Validation failed: {e}")
Field Mapping Reference:
| Alias | PascalCase | snake_case | Description |
|---|---|---|---|
| wT | Time | time | Timestamp (seconds) |
| wAT | AirTemp | air_temp | Air temperature (°C) |
| wTT | TrackTemp | track_temp | Track temperature (°C) |
| wH | Humidity | humidity | Relative humidity (%) |
| wP | Pressure | pressure | Atmospheric pressure (mbar) |
| wR | Rainfall | rainfall | Rainfall indicator (bool) |
| wWD | WindDirection | wind_direction | Wind direction (degrees) |
| wWS | WindSpeed | wind_speed | Wind speed (km/h) |
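The mapping in this table can be sketched as an alias lookup followed by a generic PascalCase conversion. The code below is an illustration under that assumption; the function names are hypothetical and the library's own pre-validator may work differently.

```python
import re

# Alias -> snake_case pairs copied from the field mapping table.
ALIASES = {
    "wT": "time", "wAT": "air_temp", "wTT": "track_temp", "wH": "humidity",
    "wP": "pressure", "wR": "rainfall", "wWD": "wind_direction",
    "wWS": "wind_speed",
}


def to_snake_case(key: str) -> str:
    """Convert PascalCase such as "AirTemp" to "air_temp"."""
    return re.sub(r"(?<!^)(?=[A-Z])", "_", key).lower()


def normalize_weather_keys(data: dict) -> dict:
    """Resolve aliases first, then convert any remaining keys to snake_case."""
    return {ALIASES.get(k) or to_snake_case(k): v for k, v in data.items()}


mixed = {"Time": [0.0, 60.0], "wAT": [25.5, 25.7], "track_temp": [35.2, 35.5]}
print(normalize_weather_keys(mixed))
# {'time': [0.0, 60.0], 'air_temp': [25.5, 25.7], 'track_temp': [35.2, 35.5]}
```

Aliases are resolved before the case conversion because a compact key such as wAT would otherwise be split into meaningless fragments.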
Handling Mixed Formats:
# CDN might return mixed naming conventions
mixed_data = {
"Time": [0.0, 60.0], # PascalCase
"wAT": [25.5, 25.7], # Aliased
"track_temp": [35.2, 35.5] # snake_case
}
# Validator normalizes all to snake_case internally
validated = validate_weather(mixed_data)
print(validated.time) # [0.0, 60.0]
print(validated.air_temp) # [25.5, 25.7]
print(validated.track_temp) # [35.2, 35.5]
validate_race_control
Validates race control messages (flags, safety car, penalties, etc.).
def validate_race_control(data: dict) -> RaceControlData
Purpose:
Validates race control message data from the CDN, ensuring message timestamps, categories, and metadata are properly structured.
Parameters:
data (dict): Raw JSON dictionary from CDN containing race control message arrays
Returns:
RaceControlData: Validated Pydantic model with normalized message data
Raises:
pydantic.ValidationError: If validation fails
Validation Rules:
- Required field: time - message timestamp in seconds
- All non-empty arrays must have identical length
- Supports aliased field names (cat, msg, dNum)
- Null-like strings converted to None
- Sector field accepts both int and string values
Example:
from tif1.validation import validate_race_control
# Raw race control data from CDN
raw_data = {
"time": [0.0, 300.0, 450.0, 600.0],
"cat": ["Flag", "SafetyCar", "Flag", "Penalty"],
"msg": [
"GREEN FLAG",
"SAFETY CAR DEPLOYED",
"YELLOW FLAG - SECTOR 2",
"5 SECOND TIME PENALTY - CAR 33"
],
"status": ["1", "4", "4", "4"],
"flag": ["GREEN", "YELLOW", "YELLOW", "YELLOW"],
"scope": ["Track", "Track", "Sector", "Driver"],
"sector": [None, None, 2, None],
"dNum": [None, None, None, "33"],
"lap": [None, None, None, 15]
}
try:
    validated = validate_race_control(raw_data)
    print(f"✓ Validated {len(validated.time)} race control messages")
    # Analyze message types
    categories = [cat for cat in validated.category if cat]
    print(f" Message categories: {set(categories)}")
    # Find safety car periods
    safety_car_msgs = [
        (validated.time[i], validated.message[i])
        for i in range(len(validated.time))
        if validated.category[i] == "SafetyCar"
    ]
    print(f" Safety car deployments: {len(safety_car_msgs)}")
    # Find penalties
    penalties = [
        (validated.time[i], validated.message[i], validated.racing_number[i])
        for i in range(len(validated.time))
        if validated.category[i] == "Penalty"
    ]
    print(f" Penalties issued: {len(penalties)}")
    for time, msg, driver in penalties:
        # time is the message timestamp in seconds, not a lap number
        print(f" At {time:.0f}s: {msg} (Driver #{driver})")
except Exception as e:
    print(f"✗ Validation failed: {e}")
Message Categories:
Common race control message categories:
Flag: Track flag changes (GREEN, YELLOW, RED, BLUE, etc.)
SafetyCar: Safety car deployment/withdrawal
VirtualSafetyCar: VSC deployment/withdrawal
Penalty: Driver penalties (time penalties, drive-through, etc.)
DRS: DRS enabled/disabled
Other: Miscellaneous messages
Field Reference:
| Field | Alias | Type | Description |
|---|---|---|---|
| time | - | float | Message timestamp (seconds from session start) |
| category | cat | str | Message category (Flag, SafetyCar, Penalty, etc.) |
| message | msg | str | Full message text |
| status | - | str | Track status code ("1"=green, "2"=yellow, "4"=SC, etc.) |
| flag | - | str | Flag type (GREEN, YELLOW, RED, BLUE, etc.) |
| scope | - | str | Message scope (Track, Sector, Driver) |
| sector | - | int \| str | Affected sector number (1, 2, or 3) |
| racing_number | dNum | str | Affected driver number (for driver-specific messages) |
| lap | - | int | Lap number when message was issued |
Filtering Messages by Category:
# Load and validate race control data
validated = validate_race_control(raw_data)
# Extract only flag changes
flag_changes = [
(validated.time[i], validated.flag[i])
for i in range(len(validated.time))
if validated.category[i] == "Flag"
]
# Extract only penalties
penalties = [
{
"time": validated.time[i],
"message": validated.message[i],
"driver": validated.racing_number[i],
"lap": validated.lap[i]
}
for i in range(len(validated.time))
if validated.category[i] == "Penalty"
]
# Extract safety car periods
sc_events = [
(validated.time[i], validated.message[i])
for i in range(len(validated.time))
if validated.category[i] in ["SafetyCar", "VirtualSafetyCar"]
]
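Index-based filtering works, but the parallel arrays can also be transposed into one record per message, which makes filters easier to read. The sketch below uses a plain dict as a stand-in for validated output; the helper name to_rows is hypothetical, and the key names follow the verbose field names in the field reference.

```python
def to_rows(columns: dict) -> list[dict]:
    """Turn {"field": [v0, v1, ...]} into [{"field": v0}, {"field": v1}, ...]."""
    names = list(columns)
    return [dict(zip(names, values)) for values in zip(*columns.values())]


# Plain dict standing in for validated output (illustrative data only).
columns = {
    "time": [0.0, 300.0, 600.0],
    "category": ["Flag", "SafetyCar", "Penalty"],
    "message": ["GREEN FLAG", "SAFETY CAR DEPLOYED", "5 SECOND TIME PENALTY"],
}
rows = to_rows(columns)
penalties = [row for row in rows if row["category"] == "Penalty"]
print(penalties[0]["time"])  # 600.0
```

This relies on all arrays having the same length, which is exactly what the length consistency rule guarantees after validation; zip would otherwise silently truncate to the shortest array.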
Pydantic Models
LapData
Comprehensive Pydantic model for lap timing data with automatic length consistency validation.
Purpose:
Represents validated lap timing data with all required and optional fields. Ensures all parallel arrays have consistent lengths, which is critical for DataFrame construction.
Architecture:
Inherits from ConsistentLengthsMixin which provides automatic array length validation across all fields.
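The idea behind such a mixin can be shown with standard-library dataclasses. This is an analogue written for illustration, not the library's ConsistentLengthsMixin: the class names are hypothetical and Pydantic is deliberately left out so the snippet has no dependencies.

```python
from dataclasses import dataclass, field, fields


class ConsistentLengthsSketch:
    """Dataclass mixin: every non-empty list field must share one length."""

    def __post_init__(self):
        lengths = {
            f.name: len(getattr(self, f.name))
            for f in fields(self)
            if isinstance(getattr(self, f.name), list) and getattr(self, f.name)
        }
        if len(set(lengths.values())) > 1:
            raise ValueError(f"Inconsistent lengths: {lengths}")


@dataclass
class MiniLaps(ConsistentLengthsSketch):
    time: list
    lap: list
    compound: list = field(default_factory=list)  # optional, may stay empty


MiniLaps(time=[90.1, 89.5], lap=[1.0, 2.0])  # passes: empty list is skipped
try:
    MiniLaps(time=[90.1, 89.5], lap=[1.0, 2.0, 3.0])
except ValueError as exc:
    print(exc)
```

Putting the check in a shared base class means each model only declares its fields, and the length rule applies to all of them automatically.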
Required Fields:
| Field | Type | Description | Constraints |
|---|---|---|---|
| time | list[float \| None] | Lap time in seconds | min_length=1 |
| lap | list[float \| None] | Lap number | min_length=1 |
| s1 | list[float \| None] | Sector 1 time (seconds) | min_length=1 |
| s2 | list[float \| None] | Sector 2 time (seconds) | min_length=1 |
| s3 | list[float \| None] | Sector 3 time (seconds) | min_length=1 |
| compound | list[str \| None] | Tire compound | min_length=1 |
| stint | list[int \| None] | Stint number | min_length=1, >= 1 |
| life | list[int \| None] | Tire age (laps) | min_length=1, >= 0 |
| pos | list[int \| None] | Track position | min_length=1 |
| status | list[str \| None] | Track status code | min_length=1 |
| pb | list[bool \| None] | Personal best indicator | min_length=1 |
Optional Fields (with aliases):
| Field | Alias | Type | Description |
|---|---|---|---|
| qualifying_session | qs | list[str \| None] | Qualifying session identifier (Q1/Q2/Q3) |
| session_time | sesT | list[float \| None] | Session time when lap completed |
| source_driver | drv | list[str \| None] | Driver 3-letter code |
| driver_number | dNum | list[str \| None] | Driver number |
| pit_out_time | pout | list[float \| None] | Pit exit time |
| pit_in_time | pin | list[float \| None] | Pit entry time |
| sector1_session_time | s1T | list[float \| None] | Session time at S1 completion |
| sector2_session_time | s2T | list[float \| None] | Session time at S2 completion |
| sector3_session_time | s3T | list[float \| None] | Session time at S3 completion |
| speed_i1 | vi1 | list[float \| None] | Speed trap intermediate 1 (km/h) |
| speed_i2 | vi2 | list[float \| None] | Speed trap intermediate 2 (km/h) |
| speed_fl | vfl | list[float \| None] | Speed trap finish line (km/h) |
| speed_st | vst | list[float \| None] | Speed trap straight (km/h) |
| fresh_tyre | fresh | list[bool \| None] | Fresh tire indicator |
| source_team | team | list[str \| None] | Team name |
| lap_start_time | lST | list[float \| None] | Lap start time |
| lap_start_date | lSD | list[str \| None] | Lap start date/time |
| deleted | del | list[bool \| None] | Deleted lap indicator |
| deleted_reason | delR | list[str \| None] | Reason for lap deletion |
| fastf1_generated | ff1G | list[bool \| None] | FastF1 compatibility flag |
| is_accurate | iacc | list[bool \| None] | Data accuracy indicator |
Weather Fields (per-lap):
| Field | Alias | Type | Description |
|---|---|---|---|
| weather_time | wT | list[float \| None] | Weather sample timestamp |
| air_temp | wAT | list[float \| None] | Air temperature (°C) |
| humidity | wH | list[float \| None] | Relative humidity (%) |
| pressure | wP | list[float \| None] | Atmospheric pressure (mbar) |
| rainfall | wR | list[bool \| None] | Rainfall indicator |
| track_temp | wTT | list[float \| None] | Track temperature (°C) |
| wind_direction | wWD | list[float \| None] | Wind direction (degrees) |
| wind_speed | wWS | list[float \| None] | Wind speed (km/h) |
Validation Behavior:
- Length Consistency: All non-empty lists must have the same length
- Stint Validation: All stint values must be >= 1
- Tire Life Validation: All tire life values must be >= 0
- Null Normalization: Null-like strings ("", "none", "null", "nan") converted to None
- Alias Support: Accepts both verbose and aliased field names
Example:
from tif1.validation import LapData
# Create validated lap data
lap_data = LapData(
time=[90.1, 89.5, 88.9],
lap=[1.0, 2.0, 3.0],
s1=[30.0, 29.8, 29.5],
s2=[30.1, 29.9, 29.6],
s3=[30.0, 29.8, 29.8],
compound=["SOFT", "SOFT", "SOFT"],
stint=[1, 1, 1],
life=[1, 2, 3],
pos=[1, 1, 1],
status=["1", "1", "1"],
pb=[False, True, True],
# Optional fields with aliases
sesT=[100.0, 189.5, 278.4], # session_time
drv=["VER", "VER", "VER"], # source_driver
dNum=["33", "33", "33"], # driver_number
)
print(f"Valid lap data with {len(lap_data.lap)} laps")
print(f"Fastest lap: {min(lap_data.time):.3f}s")
print(f"Driver: {lap_data.source_driver[0]}")
# Export to dict for DataFrame creation
lap_dict = lap_data.model_dump()
Working with Aliases:
# Input data can use either verbose or aliased names
data_with_aliases = {
"time": [90.1],
"lap": [1.0],
"s1": [30.0],
"s2": [30.1],
"s3": [30.0],
"compound": ["SOFT"],
"stint": [1],
"life": [1],
"pos": [1],
"status": ["1"],
"pb": [False],
"sesT": [100.0], # Alias for session_time
"drv": ["VER"], # Alias for source_driver
}
lap_data = LapData.model_validate(data_with_aliases)
# Access via verbose names
print(lap_data.session_time) # [100.0]
print(lap_data.source_driver) # ["VER"]
TelemetryData
Comprehensive Pydantic model for high-frequency telemetry data with automatic unwrapping of nested structures.
Purpose:
Represents validated telemetry sensor data with support for multiple CDN formats. Handles nested tel objects and performs boolean coercion for brake/DRS fields.
Architecture:
- Inherits from ConsistentLengthsMixin for automatic array length validation
- Implements _unwrap_tel pre-validator to handle nested structures
- Supports both aliased and standard field names
Required Fields:
| Field | Type | Description | Constraints |
|---|---|---|---|
| time | list[float \| None] | Time from lap start (seconds) | min_length=1 |
| speed | list[float \| None] | Speed (km/h) | min_length=1 |
Optional Fields:
| Field | Alias | Type | Description | Typical Range |
|---|---|---|---|---|
| rpm | - | list[float \| None] | Engine RPM | 0-20,000 |
| gear | - | list[int \| None] | Gear number | 0-8 (0=neutral) |
| throttle | - | list[float \| None] | Throttle position (%) | 0-100 |
| brake | - | list[bool \| None] | Brake status | True/False |
| drs | - | list[bool \| None] | DRS status | True/False |
| distance | - | list[float \| None] | Distance from lap start (m) | 0-circuit length |
| rel_distance | - | list[float \| None] | Relative distance | 0-1 |
| driver_ahead | DriverAhead | list[str \| None] | Driver code ahead | 3-letter code |
| distance_to_driver_ahead | DistanceToDriverAhead | list[float \| None] | Gap to car ahead (m) | >= 0 |
| x | - | list[float \| None] | X coordinate | Circuit-specific |
| y | - | list[float \| None] | Y coordinate | Circuit-specific |
| z | - | list[float \| None] | Z coordinate (elevation) | Circuit-specific |
| acc_x | - | list[float \| None] | X acceleration (m/s²) | -500 to 500 |
| acc_y | - | list[float \| None] | Y acceleration (m/s²) | -500 to 500 |
| acc_z | - | list[float \| None] | Z acceleration (m/s²) | -500 to 500 |
| data_key | dataKey | list[str \| None] | Data source identifier | - |
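The typical ranges in this table are useful for spotting suspicious samples. The sketch below flags values outside three of those ranges; treating the ranges as hard limits is an assumption made for illustration, and the function name is hypothetical.

```python
# Bounds copied from the "Typical Range" column; treating them as hard
# limits is an assumption made for this sketch.
TYPICAL_RANGES = {"rpm": (0, 20_000), "gear": (0, 8), "throttle": (0, 100)}


def out_of_range(samples: dict) -> dict:
    """Return {field: [indices]} for values outside their typical range."""
    report = {}
    for name, (low, high) in TYPICAL_RANGES.items():
        bad = [
            i for i, v in enumerate(samples.get(name, []))
            if v is not None and not low <= v <= high
        ]
        if bad:
            report[name] = bad
    return report


print(out_of_range({"rpm": [12000, 25000], "gear": [7, 9], "throttle": [100.0, None]}))
# {'rpm': [1], 'gear': [1]}
```

Reporting indices rather than raising lets the caller decide whether an out-of-range sample is a sensor glitch to drop or a reason to reject the whole lap.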
Special Handling:
- Nested Tel Objects: Automatically unwraps tel nested structures
- Boolean Coercion: Converts numeric 0/1 to False/True for brake and drs
- Null Normalization: Converts null-like strings to None
- Empty Arrays: Optional fields can be empty arrays
Validation Behavior:
# All non-empty arrays must have the same length
{
"time": [0.0, 0.1, 0.2], # 3 elements
"speed": [250, 251, 252], # 3 elements ✓
"rpm": [12000, 12100, 12200], # 3 elements ✓
"gear": [7, 7], # 2 elements ✗ INVALID!
}
Example:
from tif1.validation import TelemetryData
# Standard flat structure
tel_data = TelemetryData(
time=[0.0, 0.1, 0.2, 0.3],
speed=[250.0, 251.0, 252.0, 253.0],
rpm=[12000, 12100, 12200, 12300],
gear=[7, 7, 7, 7],
throttle=[100.0, 100.0, 99.5, 98.0],
brake=[False, False, False, True],
drs=[True, True, True, False],
distance=[0.0, 25.0, 50.0, 75.0],
x=[100.5, 101.2, 102.0, 102.8],
y=[50.0, 50.1, 50.2, 50.3],
z=[0.0, 0.0, 0.0, 0.0],
)
print(f"Valid telemetry with {len(tel_data.time)} samples")
print(f"Speed range: {min(tel_data.speed):.1f} - {max(tel_data.speed):.1f} km/h")
print(f"Max RPM: {max(tel_data.rpm)}")
# Export to dict
tel_dict = tel_data.model_dump()
Handling Nested Structures:
# CDN format with nested tel object
nested_data = {
"tel": {
"time": [0.0, 0.1, 0.2],
"speed": [250.0, 251.0, 252.0],
"rpm": [12000, 12100, 12200]
},
"metadata": {"driver": "VER", "lap": 5}
}
# Automatically unwrapped during validation
tel_data = TelemetryData.model_validate(nested_data)
print(tel_data.time) # [0.0, 0.1, 0.2]
print(tel_data.speed) # [250.0, 251.0, 252.0]
print(tel_data.tel) # Original nested dict preserved
Boolean Coercion:
# CDN sometimes returns numeric 0/1 instead of boolean
data_with_numeric_bools = {
"time": [0.0, 0.1, 0.2],
"speed": [250.0, 251.0, 252.0],
"brake": [0, 0, 1], # Numeric values
"drs": [1, 1, 0], # Numeric values
}
# Automatically coerced to boolean
tel_data = TelemetryData.model_validate(data_with_numeric_bools)
print(tel_data.brake) # [False, False, True]
print(tel_data.drs) # [True, True, False]
WeatherData
Pydantic model for session weather data with automatic key normalization.
Purpose:
Represents validated weather sensor data with support for multiple naming conventions (PascalCase, aliased, snake_case).
Architecture:
- Inherits from ConsistentLengthsMixin for array length validation
- Implements _normalize_pascalcase_keys pre-validator for key normalization
- Supports three naming conventions simultaneously
Required Field:
| Field | Alias | Type | Description |
|---|---|---|---|
| time | wT | list[float \| None] | Timestamp (seconds from session start) |
Optional Fields:
| Field | Alias | PascalCase | Type | Description | Unit |
|---|---|---|---|---|---|
| air_temp | wAT | AirTemp | list[float \| None] | Air temperature | °C |
| track_temp | wTT | TrackTemp | list[float \| None] | Track surface temperature | °C |
| humidity | wH | Humidity | list[float \| None] | Relative humidity | % |
| pressure | wP | Pressure | list[float \| None] | Atmospheric pressure | mbar |
| rainfall | wR | Rainfall | list[bool \| None] | Rainfall indicator | boolean |
| wind_direction | wWD | WindDirection | list[float \| None] | Wind direction | degrees (0-360) |
| wind_speed | wWS | WindSpeed | list[float \| None] | Wind speed | km/h |
Validation Behavior:
- Key Normalization: Automatically converts PascalCase to snake_case
- Alias Support: Accepts compact aliased names (wT, wAT, etc.)
- Length Consistency: All non-empty arrays must have same length
- Null Normalization: Converts null-like strings to None
Example:
from tif1.validation import WeatherData
# Using aliased format (compact CDN format)
weather_data = WeatherData(
wT=[0.0, 60.0, 120.0, 180.0],
wAT=[25.5, 25.7, 25.9, 26.1],
wTT=[35.2, 35.5, 35.8, 36.0],
wH=[60.0, 61.0, 62.0, 63.0],
wP=[1013.0, 1013.2, 1013.5, 1013.7],
wR=[False, False, False, True],
wWD=[180.0, 185.0, 190.0, 195.0],
wWS=[5.0, 5.5, 6.0, 6.5],
)
print(f"Valid weather data with {len(weather_data.time)} samples")
print(f"Air temp range: {min(weather_data.air_temp):.1f}°C - {max(weather_data.air_temp):.1f}°C")
print(f"Rainfall detected: {any(weather_data.rainfall)}")
# Export to dict
weather_dict = weather_data.model_dump()
RaceControlData
Pydantic model for race control messages with flexible field types.
Purpose:
Represents validated race control message data including flags, safety car deployments, and penalties.
Architecture:
- Inherits from ConsistentLengthsMixin for array length validation
- Supports aliased field names
- Flexible sector field (accepts both int and string)
Required Field:
| Field | Type | Description |
|---|---|---|
| time | list[float \| None] | Message timestamp (seconds from session start) |
Optional Fields:
| Field | Alias | Type | Description |
|---|---|---|---|
| category | cat | list[str \| None] | Message category (Flag, SafetyCar, Penalty, etc.) |
| message | msg | list[str \| None] | Full message text |
| status | - | list[str \| None] | Track status code |
| flag | - | list[str \| None] | Flag type (GREEN, YELLOW, RED, etc.) |
| scope | - | list[str \| None] | Message scope (Track, Sector, Driver) |
| sector | - | list[int \| str \| None] | Affected sector number |
| racing_number | dNum | list[str \| None] | Affected driver number |
| lap | - | list[int \| None] | Lap number |
Example:
from tif1.validation import RaceControlData
rcm_data = RaceControlData(
time=[0.0, 300.0, 450.0],
cat=["Flag", "SafetyCar", "Penalty"],
msg=["GREEN FLAG", "SAFETY CAR DEPLOYED", "5 SEC PENALTY - CAR 33"],
status=["1", "4", "4"],
flag=["GREEN", "YELLOW", "YELLOW"],
scope=["Track", "Track", "Driver"],
dNum=[None, None, "33"],
)
print(f"Valid race control data with {len(rcm_data.time)} messages")
DriversData
Container model for driver information.
Fields:
drivers: List of DriverInfo objects
Example:
from tif1.validation import DriversData, DriverInfo
drivers_data = DriversData(
drivers=[
DriverInfo(
driver="VER",
team="Red Bull Racing",
dn="33",
fn="Max",
ln="Verstappen",
tc="3671C6",
url="https://example.com/verstappen.png"
)
]
)
DriverInfo
Pydantic model for individual driver information with strict validation.
Fields:
| Field | Type | Description | Constraints |
|---|---|---|---|
| driver | str | 3-letter driver code | Exactly 3 uppercase letters (^[A-Z]{3}$) |
| team | str | Team name | 1-100 characters |
| dn | str | Driver number | Required |
| fn | str | First name | Required |
| ln | str | Last name | Required |
| tc | str | Team color (hex) | Required |
| url | str | Headshot photo URL | Required |
Validation:
- Driver code must be exactly 3 uppercase letters
- Team name must be 1-100 characters
- All fields are required (no None values)
Example:
from tif1.validation import DriverInfo
driver = DriverInfo(
driver="VER",
team="Red Bull Racing",
dn="33",
fn="Max",
ln="Verstappen",
tc="3671C6",
url="https://example.com/verstappen.png"
)
print(f"{driver.fn} {driver.ln} - #{driver.dn}")
print(f"Team: {driver.team} (#{driver.tc})")
Enums
TireCompound
Enumeration of valid tire compound values used in F1 sessions.
class TireCompound(str, Enum):
    SOFT = "SOFT"
    MEDIUM = "MEDIUM"
    HARD = "HARD"
    INTERMEDIATE = "INTERMEDIATE"
    WET = "WET"
    UNKNOWN = "UNKNOWN"
    TEST_UNKNOWN = "TEST-UNKNOWN"
Usage:
from tif1.validation import TireCompound
# Validate compound value
compound = "SOFT"
if compound in [c.value for c in TireCompound]:
    print(f"✓ Valid compound: {compound}")
# Use in validation
assert TireCompound.SOFT == "SOFT"
assert TireCompound.INTERMEDIATE == "INTERMEDIATE"
Compound Types:
- SOFT: Softest compound, fastest but least durable
- MEDIUM: Middle compound, balanced performance
- HARD: Hardest compound, slowest but most durable
- INTERMEDIATE: For damp conditions
- WET: For wet conditions
- UNKNOWN: Compound not identified
- TEST-UNKNOWN: Used in testing sessions
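Because the enum is value-based, calling it with a raw string is an alternative to the membership test. The sketch below redefines the enum so it runs on its own; the parse_compound helper and its fallback to UNKNOWN are assumptions for illustration, not library behavior.

```python
from enum import Enum


class TireCompound(str, Enum):  # mirrors the definition shown above
    SOFT = "SOFT"
    MEDIUM = "MEDIUM"
    HARD = "HARD"
    INTERMEDIATE = "INTERMEDIATE"
    WET = "WET"
    UNKNOWN = "UNKNOWN"
    TEST_UNKNOWN = "TEST-UNKNOWN"


def parse_compound(raw):
    """Look up a compound by value, falling back to UNKNOWN (an assumption)."""
    try:
        return TireCompound(raw)
    except ValueError:
        return TireCompound.UNKNOWN


print(parse_compound("TEST-UNKNOWN").name)  # TEST_UNKNOWN
print(parse_compound("SUPERSOFT").name)     # UNKNOWN
```

Note that the member name TEST_UNKNOWN differs from its value "TEST-UNKNOWN", so lookups must go through the value, as the constructor call does here.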
SessionType
Enumeration of valid F1 session types.
class SessionType(str, Enum):
    PRACTICE_1 = "Practice 1"
    PRACTICE_2 = "Practice 2"
    PRACTICE_3 = "Practice 3"
    QUALIFYING = "Qualifying"
    SPRINT = "Sprint"
    SPRINT_QUALIFYING = "Sprint Qualifying"
    SPRINT_SHOOTOUT = "Sprint Shootout"
    RACE = "Race"
Usage:
from tif1.validation import SessionType
# Validate session type
session = "Qualifying"
if session in [s.value for s in SessionType]:
    print(f"✓ Valid session: {session}")
# Check whether this session is one of the sprint-format sessions
is_sprint_session = session in [
    SessionType.SPRINT.value,
    SessionType.SPRINT_QUALIFYING.value,
    SessionType.SPRINT_SHOOTOUT.value,
]
Session Types:
- Practice 1/2/3: Free practice sessions
- Qualifying: Standard qualifying format
- Sprint: Sprint race (100km race on Saturday)
- Sprint Qualifying: Qualifying for the sprint race (2024+ format)
- Sprint Shootout: Short qualifying for the sprint race (2023 format)
- Race: Main Grand Prix race
LapStatus
Enumeration of valid lap status values.
class LapStatus(str, Enum):
    VALID = "VALID"
    INVALID = "INVALID"
    OUTLAP = "OUTLAP"
    INLAP = "INLAP"
Usage:
from tif1.validation import LapStatus
# Filter valid laps
valid_laps = [lap for lap in laps if lap["status"] == LapStatus.VALID.value]
# Identify pit laps
pit_laps = [
    lap for lap in laps
    if lap["status"] in [LapStatus.OUTLAP.value, LapStatus.INLAP.value]
]
Status Types:
- VALID: Clean lap with no track limits violations
- INVALID: Lap deleted due to track limits or other violations
- OUTLAP: Lap exiting pit lane
- INLAP: Lap entering pit lane
AnomalyType
Enumeration of data anomaly types detected by the validation system.
class AnomalyType(str, Enum):
    MISSING_LAPS = "missing_laps"
    DUPLICATE_LAPS = "duplicate_laps"
    OUTLIER_TIMES = "outlier_times"
Usage:
from tif1.validation import detect_lap_anomalies, AnomalyType
anomalies = detect_lap_anomalies(laps)
# Filter by type
missing = [a for a in anomalies if a.type == AnomalyType.MISSING_LAPS]
duplicates = [a for a in anomalies if a.type == AnomalyType.DUPLICATE_LAPS]
outliers = [a for a in anomalies if a.type == AnomalyType.OUTLIER_TIMES]
Anomaly Types:
- MISSING_LAPS: Gaps in lap number sequence (e.g., laps 1, 2, 4, 5 - missing lap 3)
- DUPLICATE_LAPS: Same lap number appears multiple times
- OUTLIER_TIMES: Lap times significantly different from average (>3x mean)
Anomaly Detection
detect_lap_anomalies
Detects data quality issues in lap data with structured, actionable results.
def detect_lap_anomalies(laps: list[dict]) -> list[Anomaly]
Purpose:
Analyzes lap data to identify missing laps, duplicate lap numbers, and statistical outliers. Returns structured anomaly objects with severity levels and detailed context.
Parameters:
laps (list[dict]): List of lap dictionaries with at least lap and/or time fields
Returns:
list[Anomaly]: List of detected anomalies with type, severity, description, and details
Detection Logic:
- Missing Laps: Checks for gaps in the lap number sequence
  - Severity: medium
  - Details: List of missing lap numbers
- Duplicate Laps: Identifies lap numbers that appear multiple times
  - Severity: high
  - Details: List of duplicate lap numbers
- Outlier Times: Finds lap times greater than 3x the average (a lenient threshold)
  - Severity: low
  - Details: Count of outliers and average lap time
  - Requires: At least 3 laps for meaningful statistics
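The three checks described above can be sketched in plain Python. This is an independent illustration of the documented rules, not tif1's implementation. The function name `sketch_detect_anomalies` and the plain-dict return shape are assumptions made for this example, as is the choice to compute the mean over every timed lap.

```python
from collections import Counter
from statistics import mean


def sketch_detect_anomalies(laps: list[dict]) -> dict:
    """Illustrative version of the three checks; returns findings keyed by type."""
    lap_numbers = [lap["lap"] for lap in laps if lap.get("lap") is not None]
    times = [lap["time"] for lap in laps if lap.get("time") is not None]
    findings = {}

    # Missing laps: numbers absent between the lowest and highest lap seen.
    if lap_numbers:
        expected = set(range(min(lap_numbers), max(lap_numbers) + 1))
        missing = sorted(expected - set(lap_numbers))
        if missing:
            findings["missing_laps"] = missing

    # Duplicate laps: any lap number that occurs more than once.
    duplicates = sorted(n for n, count in Counter(lap_numbers).items() if count > 1)
    if duplicates:
        findings["duplicate_laps"] = duplicates

    # Outliers: times above 3x the mean, evaluated only with at least 3 timed laps.
    if len(times) >= 3:
        average = mean(times)
        outliers = [t for t in times if t > 3 * average]
        if outliers:
            findings["outlier_times"] = {
                "outlier_count": len(outliers),
                "average_time": average,
            }
    return findings


sample = [
    {"lap": 1, "time": 90.0},
    {"lap": 2, "time": 89.5},
    {"lap": 4, "time": 89.2},
    {"lap": 5, "time": 600.0},
    {"lap": 5, "time": 88.8},
]
result = sketch_detect_anomalies(sample)
print(result["missing_laps"])    # [3]
print(result["duplicate_laps"])  # [5]
```

Because the slow lap itself raises the mean, a 3x-mean threshold is hard to exceed in short samples. With positive times and exactly three laps, no single lap can be more than three times the mean, since three times the mean equals the sum of all three laps.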
Example:
from tif1.validation import detect_lap_anomalies, AnomalyType
# Sample lap data with issues
laps = [
    {"lap": 1, "time": 90.0},
    {"lap": 2, "time": 89.5},
    # Missing lap 3
    {"lap": 4, "time": 89.2},
    {"lap": 5, "time": 600.0},  # Outlier (e.g., a lap spanning a long stoppage)
    {"lap": 5, "time": 88.8},  # Duplicate lap 5
]
# Detect anomalies
anomalies = detect_lap_anomalies(laps)
# Process results
for anomaly in anomalies:
    print(f"[{anomaly.severity.upper()}] {anomaly.type}")
    print(f" {anomaly.description}")
    if anomaly.details:
        print(f" Details: {anomaly.details}")
    print()
# Output:
# [MEDIUM] missing_laps
# Missing 1 lap(s)
# Details: {'missing_laps': [3]}
#
# [HIGH] duplicate_laps
# Duplicate lap numbers detected
# Details: {'duplicate_laps': [5]}
#
# [LOW] outlier_times
# 1 outlier lap time(s) detected
# (the mean of the five times is 191.5 s, so the 3x threshold is 574.5 s)
# Details: {'outlier_count': 1, 'average_time': 191.5}
Filtering by Severity:
anomalies = detect_lap_anomalies(laps)
# Critical issues only
critical = [a for a in anomalies if a.severity == "high"]
# All issues except low severity
important = [a for a in anomalies if a.severity in ["medium", "high"]]
# Count by severity
severity_counts = {}
for anomaly in anomalies:
    severity_counts[anomaly.severity] = severity_counts.get(anomaly.severity, 0) + 1
print(f"Severity distribution: {severity_counts}")
Integration with Session Data:
import tif1
from tif1.validation import detect_lap_anomalies
# Load session
session = tif1.get_session(2025, "Monaco", "Race")
# Get laps as list of dicts
laps_df = session.laps
laps_list = laps_df.to_dict('records')
# Detect anomalies
anomalies = detect_lap_anomalies(laps_list)
if anomalies:
    print(f"⚠ Found {len(anomalies)} data quality issues:")
    for anomaly in anomalies:
        print(f" [{anomaly.severity}] {anomaly.description}")
else:
    print("✓ No data quality issues detected")
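If the lap records cover more than one driver, lap numbers repeat across drivers, and a check for duplicate lap numbers run over the whole list would flag them. One way to avoid that is to group records by driver first. The sketch below is a hedged illustration: `detect_per_driver` is a hypothetical helper, the `"driver"` key is an assumed field name, and the detector is passed in as a callable so the example runs without tif1.

```python
from collections import defaultdict
from typing import Callable


def detect_per_driver(
    laps: list[dict],
    detector: Callable[[list[dict]], list],
    driver_key: str = "driver",  # assumed field name; adjust to your data
) -> dict:
    """Run `detector` on each driver's laps and keep only non-empty results."""
    grouped = defaultdict(list)
    for lap in laps:
        grouped[lap.get(driver_key)].append(lap)

    results = {}
    for driver, rows in grouped.items():
        found = detector(rows)
        if found:
            results[driver] = found
    return results


def toy_detector(rows: list[dict]) -> list[str]:
    # Stand-in detector: reports when a lap number repeats within the rows.
    numbers = [row["lap"] for row in rows]
    return ["duplicate_laps"] if len(set(numbers)) < len(numbers) else []


records = [
    {"driver": "VER", "lap": 1},
    {"driver": "HAM", "lap": 1},
    {"driver": "VER", "lap": 2},
]
print(toy_detector(records))                     # ['duplicate_laps']
print(detect_per_driver(records, toy_detector))  # {}
```

In real use, `detect_lap_anomalies` would take the place of `toy_detector`.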
Anomaly
Structured model for detected data anomalies.
Fields:
| Field | Type | Description |
|---|---|---|
| type | AnomalyType | Anomaly category (missing_laps, duplicate_laps, outlier_times) |
| severity | str | Severity level: "low", "medium", or "high" |
| description | str | Human-readable description of the issue |
| details | dict[str, Any] | Additional context (missing lap numbers, outlier counts, etc.) |
Example:
from tif1.validation import Anomaly, AnomalyType
# Create anomaly manually
anomaly = Anomaly(
type=AnomalyType.MISSING_LAPS,
severity="medium",
description="Missing 2 lap(s) in sequence",
details={"missing_laps": [5, 12]}
)
# Access fields
print(f"Type: {anomaly.type}")
print(f"Severity: {anomaly.severity}")
print(f"Description: {anomaly.description}")
print(f"Missing laps: {anomaly.details['missing_laps']}")
# Serialize to dict
anomaly_dict = anomaly.model_dump()
Severity Levels:
- low: Minor issues that don’t affect analysis (e.g., statistical outliers from pit stops)
- medium: Moderate issues that may affect completeness (e.g., missing laps)
- high: Critical issues that indicate data corruption (e.g., duplicate lap numbers)
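Because severity is a plain string, comparing levels needs an explicit ordering. The sketch below shows one way to rank the three documented levels and turn them into a pass/fail decision, for example in a data quality job. `SEVERITY_RANK`, `worst_severity`, and `should_fail` are hypothetical names, and `SimpleNamespace` objects stand in for `Anomaly` instances.

```python
from types import SimpleNamespace

# Ordering of the three documented severity levels.
SEVERITY_RANK = {"low": 0, "medium": 1, "high": 2}


def worst_severity(anomalies):
    """Return the highest severity present, or None when there are no anomalies."""
    if not anomalies:
        return None
    return max((a.severity for a in anomalies), key=SEVERITY_RANK.__getitem__)


def should_fail(anomalies, threshold: str = "high") -> bool:
    """True when at least one anomaly is at or above the threshold severity."""
    worst = worst_severity(anomalies)
    return worst is not None and SEVERITY_RANK[worst] >= SEVERITY_RANK[threshold]


found = [
    SimpleNamespace(severity="low"),
    SimpleNamespace(severity="high"),
    SimpleNamespace(severity="medium"),
]
print(worst_severity(found))            # high
print(should_fail(found))               # True
print(should_fail(found[:1], "medium")) # False
```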
Configuration
Validation in tif1 is controlled through the validate_data configuration option and related settings. The system is designed for maximum performance by default, with validation disabled to minimize overhead.
Global Configuration
Enable Validation Globally
import tif1
# Get configuration instance
config = tif1.get_config()
# Enable validation for all data types
config.set("validate_data", True)
# Now all CDN data will be validated
session = tif1.get_session(2025, "Monaco", "Race")
# drivers.json, weather.json, and rcm.json are automatically validated
Disable Validation (Default)
import tif1
config = tif1.get_config()
config.set("validate_data", False) # Default setting
# Validation skipped for ~10-15% performance improvement
session = tif1.get_session(2025, "Monaco", "Race")
Environment Variables
Configure validation via environment variables for deployment environments:
# Enable validation
export TIF1_VALIDATE_DATA=true
# Disable validation (default)
export TIF1_VALIDATE_DATA=false
# Configuration is automatically loaded from environment
import tif1
session = tif1.get_session(2025, "Monaco", "Race")
# Respects TIF1_VALIDATE_DATA environment variable
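Environment variables are always strings, so a boolean setting has to be parsed from text. The sketch below shows one common way to do that. It is not tif1's parser: `env_flag` is a hypothetical helper, and the accepted spellings other than `true` and `false` are assumptions.

```python
import os

# Assumption: these spellings are illustrative; only "true"/"false" appear in the docs.
TRUE_VALUES = {"1", "true", "yes", "on"}
FALSE_VALUES = {"0", "false", "no", "off"}


def env_flag(name: str, default: bool = False, environ=None) -> bool:
    """Read a boolean flag from the environment, with a default when it is unset."""
    env = os.environ if environ is None else environ
    raw = env.get(name)
    if raw is None:
        return default
    value = raw.strip().lower()
    if value in TRUE_VALUES:
        return True
    if value in FALSE_VALUES:
        return False
    raise ValueError(f"{name}={raw!r} is not a recognized boolean value")


print(env_flag("TIF1_VALIDATE_DATA", environ={"TIF1_VALIDATE_DATA": "true"}))  # True
print(env_flag("TIF1_VALIDATE_DATA", environ={}))                              # False
```

Raising on unrecognized text, rather than silently treating it as false, surfaces typos such as `TIF1_VALIDATE_DATA=ture` at startup.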
Configuration File
Set validation in .tif1rc configuration file:
{
"validate_data": true,
"validate_lap_times": false,
"validate_telemetry": false
}
File Locations (checked in order):
1. Path given by the TIF1_CONFIG_FILE environment variable
2. ./.tif1rc (only if TIF1_TRUST_CWD_CONFIG=true)
3. ~/.tif1rc (user home directory)
import tif1
# Configuration automatically loaded from .tif1rc
config = tif1.get_config()
print(f"Validation enabled: {config.get('validate_data')}")
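The lookup order for the configuration file can be pictured as a first-match search. The sketch below is an illustration under stated assumptions, not tif1's loader. `resolve_config_path` is a hypothetical function, the working-directory file is assumed to be named `.tif1rc` like the one in the home directory, and a path that does not exist is assumed to fall through to the next candidate.

```python
from pathlib import Path
from typing import Mapping, Optional


def resolve_config_path(environ: Mapping[str, str], cwd: Path, home: Path) -> Optional[Path]:
    """Return the first existing config file in the documented lookup order."""
    candidates = []

    # 1. Explicit path from the environment.
    explicit = environ.get("TIF1_CONFIG_FILE")
    if explicit:
        candidates.append(Path(explicit))

    # 2. Working directory, only when explicitly trusted.
    if environ.get("TIF1_TRUST_CWD_CONFIG", "").strip().lower() == "true":
        candidates.append(cwd / ".tif1rc")

    # 3. User home directory.
    candidates.append(home / ".tif1rc")

    for candidate in candidates:
        if candidate.is_file():
            return candidate
    return None
```

Requiring an opt-in before reading a file from the working directory keeps a stray config in a cloned repository or download folder from changing behavior.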
Validation Behavior
What Gets Validated
When validate_data=True, the following CDN payloads are validated:
| File | Validation Function | Data Type |
|---|---|---|
| drivers.json | validate_drivers() | Driver roster information |
| weather.json | validate_weather_data() | Weather sensor data |
| rcm.json | validate_race_control_data() | Race control messages |
Note: Lap and telemetry data validation is controlled separately (see below).
Validation Modes
Non-Strict Mode (Default):
- Validation errors are logged but don’t raise exceptions
- Original data is returned if validation fails
- Suitable for production environments
from tif1.validation import validate_lap_data
# Non-strict: returns original data on failure
validated = validate_lap_data(raw_data, strict=False)
Strict Mode:
- Validation errors raise InvalidDataError exceptions
- Suitable for development and testing
- Ensures data quality is enforced
from tif1.exceptions import InvalidDataError
from tif1.validation import validate_lap_data
# Strict: raises InvalidDataError on failure
try:
    validated = validate_lap_data(raw_data, strict=True)
except InvalidDataError as e:
    print(f"Validation failed: {e}")
    # Handle error appropriately
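The difference between the two modes comes down to what happens when a validator fails. The sketch below shows that control flow with a generic wrapper. It is a pattern illustration only: `run_validator` and `require_time` are hypothetical, and `ValueError` stands in for the library's `InvalidDataError`.

```python
import logging
from typing import Any, Callable

logger = logging.getLogger("validation_sketch")


def run_validator(validator: Callable[[Any], Any], data: Any, strict: bool = False) -> Any:
    """Apply `validator`; re-raise on failure in strict mode, else return the input."""
    try:
        return validator(data)
    except ValueError as exc:  # stand-in for InvalidDataError
        if strict:
            raise
        logger.warning("Validation failed, returning original data: %s", exc)
        return data


def require_time(record: dict) -> dict:
    # Toy validator: a record must carry a "time" field.
    if "time" not in record:
        raise ValueError("missing required field 'time'")
    return {**record, "checked": True}


print(run_validator(require_time, {"lap": 1, "time": 90.0}))  # {'lap': 1, 'time': 90.0, 'checked': True}
print(run_validator(require_time, {"lap": 1}))                # {'lap': 1}
```

In non-strict mode the caller cannot tell from the return value alone whether validation passed, so the log output is the only signal.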
Advanced Configuration
Selective Validation
Enable validation for specific data types only:
import tif1
from tif1.validation import validate_drivers, validate_weather
config = tif1.get_config()
# Turn off automatic validation so that only the manual calls below run
config.set("validate_data", False)
# Validate drivers
drivers_data = validate_drivers(raw_drivers)
# Validate weather
weather_data = validate_weather(raw_weather)
# Skip race control validation
# (process raw_rcm directly)
Ultra Cold Start Mode
Validation is automatically disabled in ultra-cold start mode for maximum performance:
import tif1
config = tif1.get_config()
config.set("ultra_cold_start", True) # Enables ultra-fast loading
config.set("validate_data", True) # Ignored in ultra-cold mode
session = tif1.get_session(2025, "Monaco", "Race")
# Validation is skipped regardless of validate_data setting
Environment Profiles
import tif1
config = tif1.get_config()
# Development: Enable all validation
config.set("validate_data", True)
config.set("validate_lap_times", True)
config.set("validate_telemetry", True)
# Production: Disable all validation for maximum speed
config.set("validate_data", False)
config.set("validate_lap_times", False)
config.set("validate_telemetry", False)
# Hybrid: Validate critical data only
config.set("validate_data", True) # Drivers, weather, RCM
config.set("validate_lap_times", False) # Skip lap validation
config.set("validate_telemetry", False) # Skip telemetry validation
Configuration Persistence
Save Configuration
import tif1
from pathlib import Path
config = tif1.get_config()
config.set("validate_data", True)
# Save to default location (~/.tif1rc)
config.save()
# Save to custom location
config.save(Path("./my-config.json"))
Load Configuration
import os
import tif1
# Load from custom file
os.environ["TIF1_CONFIG_FILE"] = "/path/to/config.json"
config = tif1.get_config()
# Configuration loaded from custom file
Validation Integration Points
Automatic Validation (Session Loading)
import tif1
# Enable validation
config = tif1.get_config()
config.set("validate_data", True)
# Validation happens automatically during session load
session = tif1.get_session(2025, "Monaco", "Race")
# drivers.json, weather.json, rcm.json validated automatically
# Access validated data
print(f"Drivers: {len(session.results)}")
print(f"Weather samples: {len(session.weather)}")
print(f"Race control messages: {len(session.race_control_messages)}")
Manual Validation (Custom Workflows)
from tif1.validation import (
validate_drivers,
validate_weather,
validate_race_control,
validate_laps,
validate_telemetry
)
# Load raw data from CDN (load_from_cdn is a placeholder for your own fetch helper)
raw_drivers = load_from_cdn("drivers.json")
raw_weather = load_from_cdn("weather.json")
raw_rcm = load_from_cdn("rcm.json")
# Validate manually
try:
    drivers = validate_drivers(raw_drivers)
    weather = validate_weather(raw_weather)
    rcm = validate_race_control(raw_rcm)
    print("✓ All data validated successfully")
except Exception as e:
    print(f"✗ Validation failed: {e}")
Configuration Best Practices
- Development: Enable all validation to catch data issues early
config.set("validate_data", True)
config.set("validate_lap_times", True)
config.set("validate_telemetry", True)
- Testing: Use strict mode to enforce data quality
validated = validate_lap_data(data, strict=True)
- Production: Disable validation for maximum performance
config.set("validate_data", False)
- CI/CD: Use environment variables for configuration
export TIF1_VALIDATE_DATA=true
export TIF1_ULTRA_COLD_START=false
- Monitoring: Enable validation periodically to check data quality
# Run validation check once per day
# (should_run_validation_check is a placeholder for your own scheduling logic)
if should_run_validation_check():
    config.set("validate_data", True)
    session = tif1.get_session(2025, "Monaco", "Race")
    # Check for validation errors in logs
Complete Examples
Custom Validation
from tif1.validation import validate_laps, LapData
def validate_and_clean_laps(raw_data: dict) -> LapData:
    """Validate laps and clean invalid data."""
    try:
        validated = validate_laps(raw_data)
        return validated
    except Exception as e:
        print(f"Validation error: {e}")
        # Clean data
        cleaned = clean_lap_data(raw_data)
        # Retry validation
        return validate_laps(cleaned)

def clean_lap_data(data: dict) -> dict:
    """Remove invalid entries from lap data."""
    # Remove laps with missing times
    valid_indices = [
        i for i, time in enumerate(data.get("time", []))
        if time is not None and time > 0
    ]
    # Filter all fields
    cleaned = {}
    for key, values in data.items():
        if isinstance(values, list):
            cleaned[key] = [values[i] for i in valid_indices]
        else:
            cleaned[key] = values
    return cleaned
# Usage
raw_data = load_raw_lap_data()
validated = validate_and_clean_laps(raw_data)
Anomaly Detection Workflow
from tif1.validation import detect_lap_anomalies, AnomalyType
import tif1
def analyze_session_quality(year, gp, session_name):
    """Analyze data quality for a session."""
    session = tif1.get_session(year, gp, session_name)
    # Get all laps
    laps = session.laps
    # Convert to list of dicts
    lap_dicts = laps.to_dict('records')
    # Detect anomalies
    anomalies = detect_lap_anomalies(lap_dicts)
    # Categorize anomalies
    missing = [a for a in anomalies if a.type == AnomalyType.MISSING_LAPS]
    duplicates = [a for a in anomalies if a.type == AnomalyType.DUPLICATE_LAPS]
    outliers = [a for a in anomalies if a.type == AnomalyType.OUTLIER_TIMES]
    print("Data Quality Report:")
    print(f" Total laps: {len(lap_dicts)}")
    print(f" Missing laps: {len(missing)}")
    print(f" Duplicate laps: {len(duplicates)}")
    print(f" Outlier times: {len(outliers)}")
    # Show details
    for anomaly in anomalies:
        print(f" [{anomaly.severity}] {anomaly.description}")
        if anomaly.details:
            print(f" {anomaly.details}")
    return anomalies
# Usage with 2021 Belgian Grand Prix Race
anomalies = analyze_session_quality(2021, "Belgian Grand Prix", "Race")
Validation with Logging
from tif1.validation import validate_laps, validate_telemetry
import logging
logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)
def validate_with_logging(data: dict, data_type: str):
    """Validate data with detailed logging."""
    logger.info("Validating %s data...", data_type)
    try:
        if data_type == "laps":
            validated = validate_laps(data)
        elif data_type == "telemetry":
            validated = validate_telemetry(data)
        else:
            raise ValueError(f"Unknown data type: {data_type}")
        logger.info("Validation successful")
        return validated
    except Exception as e:
        logger.error("Validation failed: %s", e)
        raise
# Usage
validated = validate_with_logging(raw_data, "laps")
Best Practices
- Use strict mode during development: Catches data issues early.
validated = validate_lap_data(data, strict=True)
- Handle validation errors gracefully: Don’t crash on bad data.
from tif1.exceptions import InvalidDataError
try:
    validated = validate_laps(data)
except InvalidDataError as e:
    # Log and use fallback
    logger.warning(f"Validation failed: {e}")
    validated = None
- Run anomaly detection periodically: Monitor data quality over time.
- Clean data before validation: Remove obvious errors first using normalization functions.
- Leverage null-like string conversion: The validation module automatically converts "", "none", "null", and "nan" to None.
Troubleshooting
Validation Errors
from tif1.exceptions import InvalidDataError
from tif1.validation import validate_laps
try:
    validated = validate_laps(data)
except InvalidDataError as e:
    print(f"Error: {e}")
    # Check specific fields by inspecting the error message
    message = str(e).lower()
    if "stint" in message:
        print("Issue with stint numbers")
    elif "life" in message:
        print("Issue with tire life values")
Inconsistent Lengths
# Check array lengths before validation
lengths = {key: len(val) for key, val in data.items() if isinstance(val, list)}
print(f"Array lengths: {lengths}")
# All non-empty arrays should be equal
non_empty_lengths = [n for n in lengths.values() if n > 0]
if len(set(non_empty_lengths)) > 1:
    print("Inconsistent array lengths detected")
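Knowing that lengths differ is less useful than knowing which fields are off. The sketch below reports the fields whose length deviates from the most common one. `find_length_mismatches` is a hypothetical helper, and treating the majority length as the expected one is an assumption for illustration.

```python
from collections import Counter


def find_length_mismatches(data: dict) -> dict[str, int]:
    """Return {field: length} for non-empty lists that differ from the majority length."""
    lengths = {
        key: len(values)
        for key, values in data.items()
        if isinstance(values, list) and values
    }
    if not lengths:
        return {}
    # Assumption: the most common length is the intended one.
    expected, _ = Counter(lengths.values()).most_common(1)[0]
    return {key: n for key, n in lengths.items() if n != expected}


payload = {
    "lap": [1, 2, 3],
    "time": [90.0, 89.5, 89.2],
    "compound": ["SOFT", "SOFT"],
    "stint": [],
}
print(find_length_mismatches(payload))  # {'compound': 2}
```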
Performance Issues
# Validation is already optimized and disabled by default
# For manual validation, use non-strict mode
validated = validate_laps(data, strict=False)
# Or skip validation entirely by not calling validation functions
# The library handles this automatically based on performance settings