# Calibration Overview

The **Butlr Calibration System** corrects drift and improves data quality for occupancy metrics. Sensor-based occupancy tracking drifts naturally over time; the system counters this by applying known occupancy counts, automated corrections, and filtering algorithms to the raw data.

## Overview

The calibration system processes raw sensor data through an 8-step pipeline that corrects sensor drift, eliminates impossible negative values, and incorporates known occupancy counts. This ensures that occupancy metrics accurately reflect actual space usage while preserving important usage patterns over time.

{% hint style="info" %}
**Dashboard Default**: The Butlr Dashboard displays calibrated data by default to provide the most accurate occupancy metrics for analysis and reporting.
{% endhint %}

{% hint style="warning" %}
**Measurement Compatibility**: Calibration (`"calibrated": "true"`) can only be used with traffic-based measurements: `traffic_floor_occupancy` and `traffic_room_occupancy`. It is not available for other measurement types.
{% endhint %}

{% hint style="success" %}
**Key Features**

1. **Drift Correction:** Automatically corrects accumulated sensor errors using known occupancy counts
2. **Real Count Integration:** Incorporates actual occupancy counts at specific timestamps
3. **Automated Zero Detection:** Uses motion sensors to identify when spaces are empty
4. **Multi-Stage Processing:** Applies multiple validation and correction algorithms
5. **Global Time Support:** Handles different timezones and business day configurations

**Important Note:** Calibration processing operates on raw sensor data and may have different performance characteristics than standard aggregated queries. The system processes data day-by-day for optimal accuracy.
{% endhint %}

***

## The `calibrated: true` Flag

### Purpose

The `calibrated` parameter in API requests controls which processing path the system uses:

* **`"true"`**: Enables full calibration processing with drift correction and filters
* **`"raw"`**: Uses the calibration processor but skips the filter pipeline (aggregated data only)
* **`"false"` or omitted**: Uses standard pre-aggregated data processing

### Example Query

The following example demonstrates how to retrieve **calibrated occupancy data** using the Reporting API.

**POST** `https://api.butlr.io/api/v3/reporting`

```json
{
  "window": {
    "every": "5m",
    "function": "median"
  },
  "filter": {
    "start": "2025-01-01T00:00:00Z",
    "stop": "2025-01-08T00:00:00Z",
    "measurements": ["traffic_room_occupancy"],
    "calibrated": "true",
    "rooms": {
      "eq": ["room_id"]
    }
  },
  "group_by": {
    "order": ["time"]
  },
  "calibration_points": [
    {
      "timestamp": "2025-01-01T18:00:00Z",
      "occupancy": 12,
      "type": "manual_count"
    }
  ]
}
```
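As a rough sketch, the same request can be assembled from Python using only the standard library. The `build_calibrated_request` helper and its arguments are hypothetical names for illustration, and the bearer-token `Authorization` header is an assumption: this page does not describe authentication, so check the scheme your account uses.

```python
import json
import urllib.request

REPORTING_URL = "https://api.butlr.io/api/v3/reporting"  # endpoint from the example above


def build_calibrated_request(room_id, start, stop, token, calibration_points=None):
    """Build (but do not send) a POST request for calibrated room occupancy."""
    payload = {
        "window": {"every": "5m", "function": "median"},
        "filter": {
            "start": start,
            "stop": stop,
            "measurements": ["traffic_room_occupancy"],
            "calibrated": "true",  # passed as a string, as in the example above
            "rooms": {"eq": [room_id]},
        },
        "group_by": {"order": ["time"]},
    }
    if calibration_points:
        payload["calibration_points"] = calibration_points
    return urllib.request.Request(
        REPORTING_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Assumption: bearer-token auth; not confirmed by this page.
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` sends the query. Building the request separately from sending it makes the payload easy to inspect before any network call is made.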

***

## Calibration Processing Pipeline

### 8-Step Processing Architecture

The calibration system processes data through a structured pipeline that operates day-by-day:

1. **Retrieve Raw Sensor Data**: Collects raw traffic events from sensors
2. **Aggregate Occupancy**: Converts events to 1-minute occupancy values
3. **Apply Calibration Filters**: Drift correction and quantization
4. **Negative Value Correction**: Ensures no negative occupancy values
5. **Interpolation**: Fills missing time intervals
6. **Time Range Filtering**: Applies time constraints
7. **Boundary Filtering**: Filters to requested time range
8. **Window Aggregation**: Applies user-specified window function
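
To make steps 2 and 8 concrete, here is a minimal sketch of both. It assumes traffic events arrive as `(minute_index, delta)` pairs with `+1` for an entry and `-1` for an exit; that event shape and the function names are illustrative assumptions, not the system's internal format.

```python
from statistics import median


def aggregate_occupancy(events, n_minutes):
    """Step 2 sketch: turn (minute_index, delta) events into per-minute occupancy."""
    net = [0] * n_minutes
    for minute, delta in events:
        net[minute] += delta  # net change within each minute
    occupancy, running = [], 0
    for change in net:
        running += change  # occupancy is the running sum of net changes
        occupancy.append(running)
    return occupancy


def window_aggregate(values, every, function=median):
    """Step 8 sketch: reduce 1-minute values to one value per `every`-minute window."""
    return [function(values[i:i + every]) for i in range(0, len(values), every)]
```

With `every=5` and the default `median`, `window_aggregate` mirrors the `"every": "5m", "function": "median"` window used in the example query.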

### Day-by-Day Processing

The system processes data in operational days based on timezone configuration:

* **Historical days**: Automatically adds start-of-day and end-of-day calibration points, assuming zero occupancy at both
* **Current day**: Adds only a beginning-of-day calibration point
* **Custom timezones**: Supports configurable business day boundaries
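
The boundary rules above can be sketched as follows. The `boundary_points` function, the `day_start` parameter, and the `"day_boundary"` type label are hypothetical; the sketch also assumes that an operational day ends exactly where the next one begins.

```python
from datetime import datetime, time, timedelta, timezone


def boundary_points(day, tz, today, day_start=time(0, 0)):
    """Automatic zero-occupancy points for one operational day.

    day, today: local calendar dates; tz: a tzinfo object for the site.
    """
    start_local = datetime.combine(day, day_start, tzinfo=tz)
    points = [start_local]
    if day < today:  # historical day: also anchor the end of the day at zero
        points.append(start_local + timedelta(days=1))
    return [
        {
            "timestamp": p.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
            "occupancy": 0,
            "type": "day_boundary",  # hypothetical label
        }
        for p in points
    ]
```

Any `tzinfo` works for `tz`, including a `zoneinfo.ZoneInfo` for a named timezone. The boundaries are computed in local time first and converted to UTC last, so a site in UTC-5 gets its start-of-day point at `05:00:00Z`.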

***

## Data Quality Processing

### Drift Correction

**What it does**: Adjusts occupancy counts throughout the day to align with known accurate counts at specific timestamps.

**Why it matters**: Over time, sensors can gradually drift from their true readings. By anchoring the data to known accurate counts (like manual counts or automated empty-room detection), your occupancy data stays reliable even over extended periods.

**Result**: More accurate occupancy trends that reflect actual space usage patterns.
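
One simple way to picture drift correction is to measure the error at each calibration point and spread it linearly across the gap between points. The sketch below does exactly that. Linear interpolation is an assumption chosen for illustration; this page does not specify the correction the system actually applies, and `correct_drift` is a hypothetical name.

```python
def correct_drift(occupancy, calibration):
    """Spread the error seen at each calibration point linearly across the gaps.

    occupancy: per-minute values; calibration: {minute_index: known_count}.
    """
    anchors = sorted(calibration.items())
    corrected = list(occupancy)
    if not anchors:
        return corrected
    # Error at each anchor = measured value minus known value.
    errors = [(i, occupancy[i] - known) for i, known in anchors]
    for (i0, e0), (i1, e1) in zip(errors, errors[1:]):
        for i in range(i0, i1 + 1):
            frac = (i - i0) / (i1 - i0)
            corrected[i] = occupancy[i] - (e0 + frac * (e1 - e0))
    # Outside the anchored range, carry the nearest anchor's error.
    first_i, first_e = errors[0]
    last_i, last_e = errors[-1]
    for i in range(0, first_i + 1):
        corrected[i] = occupancy[i] - first_e
    for i in range(last_i, len(occupancy)):
        corrected[i] = occupancy[i] - last_e
    return corrected
```

For a series that climbs `[0, 1, 2, 3, 4]` while the known counts are 0 at the start and 2 at the end, the correction yields `[0, 0.5, 1.0, 1.5, 2.0]`. The fractional values are what a later rounding step would turn into whole person counts.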

### Smart Rounding

**What it does**: Converts fractional occupancy values to realistic whole person counts.

**Why it matters**: People can't be divided into fractions. This processing ensures your data represents actual people counts rather than statistical averages that don't make physical sense.

**Result**: Clean, interpretable occupancy numbers that correspond to real people.

### Zero Floor Protection

**What it does**: Intelligently adjusts data processing to ensure occupancy values never go below zero while preserving the natural shape and patterns of your occupancy data.

**Why it matters**: During processing, mathematical calculations can sometimes produce negative values. Rather than simply clipping these values, the system contextually re-processes the data to maintain accurate occupancy trends and patterns.

**Result**: Clean data that preserves authentic usage patterns while ensuring all values represent realistic occupancy scenarios.
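
To show why clipping alone distorts the data, the sketch below contrasts it with one alternative: flooring the running count itself, so that later entries still build on a realistic baseline. This is only an illustration of the general idea. The system's actual re-processing is not described on this page, and both function names are hypothetical.

```python
def clip_at_zero(deltas):
    """Naive approach: accumulate net changes, then clip negative outputs."""
    out, running = [], 0
    for d in deltas:
        running += d
        out.append(max(0, running))  # the hidden running count can stay negative
    return out


def floor_running_count(deltas):
    """Alternative: never let the running count itself drop below zero."""
    out, running = [], 0
    for d in deltas:
        running = max(0, running + d)  # reset the baseline when it would go negative
        out.append(running)
    return out
```

For the net changes `[1, -2, 3, -1]`, clipping returns `[1, 0, 2, 1]` while the floored running count returns `[1, 0, 3, 2]`. After the impossible dip, the second version counts all three subsequent entries instead of silently absorbing one of them.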

***

## Calibration Points

### Types of Calibration Points

#### 1. Automatic Operational Day Boundaries

* **Historical days**: Beginning-of-day and end-of-day calibration points, both assuming zero occupancy
* **Current day**: Beginning-of-day calibration point only
* **Custom timezones**: Configurable business day start/end times

#### 2. User-Provided Points

* Provided explicitly in the API request's `calibration_points` array
* Must fall within the query time range
* Timestamps must be in UTC, RFC3339 format
* Override an automatic point that has the same timestamp
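
The rules for user-provided points can be sketched as a client-side check before a request is sent. The helper names are hypothetical, the parser accepts only whole-second timestamps ending in `Z`, and treating both ends of the query range as inclusive is an assumption.

```python
from datetime import datetime, timezone


def parse_utc(ts):
    """Parse a whole-second RFC3339 UTC timestamp such as 2025-01-01T18:00:00Z."""
    return datetime.strptime(ts, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)


def merge_points(automatic, user, start, stop):
    """Validate user points, then let them override automatic points at equal timestamps."""
    lo, hi = parse_utc(start), parse_utc(stop)
    merged = {parse_utc(p["timestamp"]): p for p in automatic}
    for p in user:
        when = parse_utc(p["timestamp"])
        if not lo <= when <= hi:  # assumption: both range ends are inclusive
            raise ValueError(f"{p['timestamp']} is outside the query range")
        if not isinstance(p["occupancy"], int) or p["occupancy"] < 0:
            raise ValueError("occupancy must be a non-negative integer")
        merged[when] = p  # a user point replaces an automatic one at the same time
    return [merged[k] for k in sorted(merged)]
```

Keying the merged points by parsed timestamp is what makes the override rule fall out naturally: a user point and an automatic point at the same instant occupy the same slot, and the user point is written last.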

#### 3. PIR Zero Points

* Generated from PIR sensor absence detection
* Created when all eligible sensors show no motion for a configured duration
* Includes false positive detection and correction

### Calibration Point Structure

```json
{
  "timestamp": "2024-01-01T18:00:00Z",
  "occupancy": 12,
  "type": "manual_count"
}
```

* `timestamp`: UTC, RFC3339 format
* `occupancy`: Non-negative integer
* `type`: Source identifier

***

## PIR-Based Calibration

The system can automatically generate calibration points using PIR motion sensors to identify when rooms are unoccupied.

**How it works**: When PIR sensors detect no activity across an entire room for a configured period, the system automatically creates a calibration point marking that time as zero occupancy.

**Smart validation**: The system includes validation logic to ensure these automatically generated points are accurate and don't interfere with legitimate occupancy patterns.
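
The core detection idea can be sketched in a few lines, assuming each PIR sensor reports one motion flag per minute. The data shape, the `pir_zero_points` name, and the `quiet_minutes` parameter are illustrative assumptions, and the validation and false-positive handling mentioned above are not modeled.

```python
def pir_zero_points(motion, quiet_minutes):
    """Return minute indices where every sensor has been quiet for `quiet_minutes`.

    motion: one list of per-minute booleans per PIR sensor (True = motion seen).
    """
    n = len(motion[0])
    points, quiet_run = [], 0
    for minute in range(n):
        if any(sensor[minute] for sensor in motion):
            quiet_run = 0  # any motion in the room restarts the quiet period
        else:
            quiet_run += 1
            if quiet_run == quiet_minutes:
                points.append(minute)  # emit one zero point per quiet stretch
    return points
```

Each returned index marks a candidate zero-occupancy calibration point. Emitting the point only when the quiet run first reaches the threshold avoids flooding a long empty period with redundant points.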

***

## Objectives of the Calibration System

### Core Problem

Occupancy sensors track people by counting entries and exits throughout the day. This approach can accumulate small errors over time due to:

* Multiple people entering or exiting simultaneously
* Environmental factors that may affect sensor readings
* Natural sensor calibration drift over extended periods

### System Objectives

1. **Drift Correction**: Anchor occupancy to known values at calibration points
2. **Physical Reality**: Correct for negative occupancy values
3. **Ground Truth Integration**: Incorporate known occupancy measurements
4. **Automated Correction**: Use PIR sensors for hands-off calibration
5. **Temporal Accuracy**: Maintain accurate occupancy patterns over time

***

## Advanced Usage

### When to Use Calibrated Data

Calibrated data is ideal for:

* **Accuracy-critical applications** where precise occupancy counts are essential
* **Long-term trend analysis** where sensor drift could affect results
* **Compliance reporting** requiring validated occupancy metrics
* **Space optimization** decisions based on actual usage patterns

### Performance Considerations

**Processing Time**: Calibrated queries may take longer than standard queries due to additional processing steps.

**Data Accuracy**: Higher accuracy comes with increased computational overhead.

**Scalable Processing**: The system handles large time ranges efficiently through day-by-day processing.

{% hint style="info" %}
For real-time occupancy data without calibration, refer to the [Real-time Occupancy](https://docs.butlr.io/real-time-occupancy/webhooks-overview) documentation.
{% endhint %}
