Compare commits

...

34 Commits

Author SHA1 Message Date
56fc58b843 fix(tools): reject CSV metacharacters in flash_device.py inputs
device-id, location-id, wifi-ssid, and wifi-password were interpolated
directly into the NVS partition CSV. A value containing comma, double
quote, CR, or LF would split the field/row and silently provision the
wrong NVS keys — easiest concrete failure: a Wi-Fi password containing
a comma. Validate operator-supplied strings before generating the CSV.

Add an empty tools/__init__.py so the regression tests can import the
helper as 'tools.flash_device' (matches the existing 'server.*' test
pattern).

Found via adversarial review (run 2026-05-01-192928, gpt-5.5 reviewer).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-01 15:44:57 -07:00
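
The check described in 56fc58b843 can be sketched as below. This is a minimal illustration, not the code from `flash_device.py`: the helper name `validate_csv_field` and the error wording are assumptions; only the rejected characters (comma, double quote, CR, LF) come from the commit message.

```python
# Hypothetical helper; the real flash_device.py may be structured differently.
CSV_METACHARACTERS = (",", '"', "\r", "\n")


def validate_csv_field(name: str, value: str) -> str:
    """Return value unchanged, or raise ValueError if it could split a CSV field or row."""
    for ch in CSV_METACHARACTERS:
        if ch in value:
            raise ValueError(f"{name} contains unsupported character {ch!r}")
    return value
```

Rejecting the value outright, rather than quoting or escaping it, keeps the generated NVS CSV trivially parseable at the cost of disallowing a few otherwise legal Wi-Fi passwords.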
641ab29277 fix(server): reject inverted period_start/period_end in CameraRecord
A misbehaving or clock-broken device could submit period_end <=
period_start, polluting the camera_records table with zero-length or
inverted windows that corrupt downstream hourly analytics. Add a
Pydantic model_validator so the request is rejected at the API
boundary instead of silently persisting bad ranges.

Found via adversarial review (run 2026-05-01-191359, both reviewers).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-01 15:44:57 -07:00
8342904488 fix(firmware/lib): wrap-safe millis() comparison in net_guard reconnect timer
net_guard_tick() compared absolute uint32_t millis() values:
  if (millis() < s_next_retry_ms) return;
This is broken across the ~49.7-day millis() wrap: depending on which
side of the wrap each value lands, retries either tight-loop or stall
indefinitely. The device is designed for multi-month uptime, so this
is a real production case, not a theoretical one.

Replace with the standard wrap-safe pattern using a signed difference.

Found via adversarial review (run 2026-05-01-202910, gpt-5.5 reviewer).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-01 15:36:06 -07:00
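
The signed-difference idea from 8342904488 can be simulated in Python. The sketch below emulates 32-bit arithmetic with masks, since Python integers do not wrap; the function names are illustrative, and the firmware's actual C++ expression is not shown in the commit message. The usual C form is a cast of `now - deadline` to `int32_t`.

```python
UINT32_MASK = 0xFFFFFFFF


def to_int32(x: int) -> int:
    """Reinterpret the low 32 bits of x as a signed 32-bit integer."""
    x &= UINT32_MASK
    return x - 0x100000000 if x & 0x80000000 else x


def deadline_reached(now_ms: int, deadline_ms: int) -> bool:
    """Wrap-safe check: emulates (int32_t)(now - deadline) >= 0 on uint32 values."""
    return to_int32(now_ms - deadline_ms) >= 0
```

The comparison stays correct across the wrap as long as the scheduled delay is shorter than 2^31 ms (about 24.8 days), which a reconnect timer with a 60 s cap easily satisfies.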
ef00afb14e fix(firmware/lib): validate HMAC secret length and hex format before signing
hmac_sign() previously trusted whatever secret_hex came out of NVS:
- Lengths >128 chars overflowed the fixed 64-byte stack buffer in
  hex_to_bytes (out_len was unbounded).
- Non-hex characters were silently decoded to 0 via strtol with no
  end-pointer check, producing signatures under a corrupted key.
- Empty secrets fell through to mbedtls_md_hmac_starts with len=0.

flash_device.py now rejects malformed --hmac-secret at provision time,
but hmac_sign should also refuse to sign under a malformed key regardless
of how it ended up in NVS (legacy provisioning, partial flash, etc.).

Add length, hex-charset, and even-length validation; make hex_to_bytes
return bool and have hmac_sign return empty HString on any failure
(callers already treat empty as failure via post_json_once).

Found via adversarial review (run 2026-05-01-202910, both reviewers).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-01 15:36:06 -07:00
96ede7c999 chore: gitignore secrets, pycache, and adversarial-review artifacts
Add patterns for *secret* files (e.g. operator-saved HMAC secrets at
repo root), __pycache__/ directories, and .adversarial-review/ run
artifacts so they don't get accidentally committed via 'git add -A'.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-01 13:21:15 -07:00
e2dbe6a2d5 fix(server): COALESCE diagnostic columns so v1.0 heartbeats don't clear v1.1 data
store_heartbeat_diagnostics() unconditionally SET each diagnostic column
to its parameter, so a v1.0.0 heartbeat (which omits the five v1.1.0
fields and leaves them as None after Pydantic parsing) erased previously
stored diagnostics for that device. Wrap each parameter in
COALESCE(?, column_name) so omitted fields preserve the existing value.

Found via adversarial review (gpt-5.5 reviewer, run 2026-05-01-191359).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-01 13:19:23 -07:00
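
A small self-contained demonstration of the `COALESCE(?, column_name)` pattern from e2dbe6a2d5, using Python's built-in `sqlite3`. The table name, the two column names, and the function signature are assumptions made for the example; the real `store_heartbeat_diagnostics()` covers five diagnostic fields.

```python
import sqlite3


def store_diagnostics(conn, device_id, free_heap=None, reset_reason=None):
    """Update diagnostics, keeping the stored value wherever the parameter is None."""
    conn.execute(
        "UPDATE devices SET "
        "free_heap = COALESCE(?, free_heap), "
        "reset_reason = COALESCE(?, reset_reason) "
        "WHERE device_id = ?",
        (free_heap, reset_reason, device_id),
    )


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE devices (device_id TEXT PRIMARY KEY, free_heap INTEGER, reset_reason TEXT)"
)
conn.execute("INSERT INTO devices (device_id) VALUES ('dc-0001')")

# v1.1-style heartbeat carries diagnostics; v1.0-style heartbeat omits them.
store_diagnostics(conn, "dc-0001", free_heap=120000, reset_reason="POWERON")
store_diagnostics(conn, "dc-0001")

row = conn.execute(
    "SELECT free_heap, reset_reason FROM devices WHERE device_id = 'dc-0001'"
).fetchone()
```

One consequence of this pattern is that a client can no longer clear a column by sending `None`; that trade-off matches the intent stated in the commit message.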
2226c1b4ca fix(tools): validate flash_device.py HMAC secret format before flashing
--hmac-secret accepted any string and passed it through to NVS, silently
producing a device that cannot authenticate to the server. Reject anything
that isn't exactly 64 hex characters (32 bytes) before generating the NVS
image. Auto-generated secrets are validated too as a defensive check.

Found via adversarial review (both reviewers, run 2026-05-01-192928).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-01 13:19:16 -07:00
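
The "exactly 64 hex characters" rule from 2226c1b4ca could be expressed as follows. This is a sketch under the assumption that a simple regular-expression check is sufficient; the function name is hypothetical and the real script may report errors differently.

```python
import re

# 32 bytes encoded as hex: exactly 64 characters from [0-9a-fA-F].
_HEX64 = re.compile(r"[0-9a-fA-F]{64}")


def is_valid_hmac_secret(secret: str) -> bool:
    """True only if secret is exactly 64 hex characters, with no surrounding whitespace."""
    return _HEX64.fullmatch(secret) is not None
```

`fullmatch` is used instead of `match` with a `$` anchor because `$` would also accept a trailing newline, which is easy to pick up when reading a secret from a file.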
a0eee0e6d4 fix(firmware): preserve buffered records appended during flush POST
reporter_flush() snapshotted the buffers under lock, released the lock
to POST, then unconditionally cleared the entire buffer on success.
Records appended by reporter_submit_*() during the in-flight POST were
silently erased. Replace clear() with erase() of just the snapshotted
prefix so concurrent appends survive.

Found via adversarial review (gpt-5.5 reviewer, run 2026-05-01-190903).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-01 13:19:11 -07:00
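
The snapshot-then-erase-prefix fix from a0eee0e6d4 can be modelled in a few lines of Python. The class and method names below are illustrative and do not mirror the firmware's C++ API; the sketch also assumes records are only ever appended at the tail, which is what makes "erase the first N" safe.

```python
import threading


class ReportBuffer:
    """Toy model of a buffer flushed over a slow network call."""

    def __init__(self):
        self._lock = threading.Lock()
        self._records = []

    def submit(self, record):
        with self._lock:
            self._records.append(record)

    def flush(self, post):
        """Send a snapshot via post(batch); on success drop only what was sent."""
        with self._lock:
            snapshot = list(self._records)
        if not snapshot:
            return True
        if not post(snapshot):  # lock is NOT held during the slow POST
            return False
        with self._lock:
            # Erase just the snapshotted prefix; later appends survive.
            del self._records[: len(snapshot)]
        return True

    def pending(self):
        with self._lock:
            return list(self._records)
```

Calling `clear()` in place of the prefix deletion would reproduce the original bug: anything submitted while `post` was running would be lost.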
a585a56cff fix(firmware): upgrade NimBLE to 2.x + DNS fallback for unreliable resolvers
NimBLE-Arduino 1.4.2 had an init/fire race in its FreeRTOS callout porting
layer where os_callout_timer_cb dispatched a queued TimerHandle expiry
against a not-yet-initialized event (NULL fn pointer), causing PC=0
InstrFetchProhibited within ~1s of boot when the camera task starved the
timer service. Confirmed by ets_printf instrumentation. Upgrading to
^2.0.0 rewrites the porting layer and eliminates the race; verified clean
on the customer network for 1+ hour.

Also rolls in DNS-resilience work that surfaced the BLE crash during
provisioning: pin lwIP/esp-netif resolvers to 1.1.1.1/8.8.8.8 across DHCP
renewals, add three-tier resolver fallback in reporter with a hardcoded
IP of last resort, and switch to raw WiFiClient with manual Host header
to bypass HTTPClient's brittle DNS path.

Migration touches for NimBLE 2.x:
- NimBLEAdvertisedDeviceCallbacks -> NimBLEScanCallbacks
- onResult signature now takes const NimBLEAdvertisedDevice*
- setAdvertisedDeviceCallbacks -> setScanCallbacks
- start(0, nullptr, false) -> start(0, false, false)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-01 11:34:17 -07:00
461ed7d888 docs(readme): add HMAC secret generation command to operator setup
Step 2 now shows openssl rand -hex 32 (with python and /dev/urandom
fallbacks) and writes to .agent/dc-<id>-secret with chmod 600, so the
flash_device.py example can read $(cat ...) the same way the known-good
dc-0002 command does.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-27 14:45:08 -07:00
259256a550 docs: retailer packet — setup guide (.docx) + repo QR code
Adds the printed materials shipped with each device:
- retailer-setup-guide.docx — non-technical 1-2 page setup guide
- retailer-setup-guide.py — generator script for the .docx
- doorcounter-repo-qr.png — QR code linking to the public Gitea repo

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-27 14:38:22 -07:00
be44299d3e docs(readme): add quick-start, hardware sources, power draw + latency notes
Adds a sourced parts table (M5 TimerCamera-F, USB cable, 5V adapter), the
~750 mW measured power draw, the 3-5s detection latency caveat, and a
six-step Quick Start aimed at semi-technical operators deploying their
own device.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-27 14:26:45 -07:00
268b595340 Merge branch 'feat/network-resilience'
Network resilience hardening: NVS event-log ring buffer, event-driven
WiFi reconnect with backoff, HTTP timeouts + retry, task watchdog,
software heartbeat-miss watchdog (6h), EVT_BOOT/EVT_REBOOT logging,
heartbeat v1.1.0 diagnostic payload, server stub + migration, docs.
2026-04-23 14:12:40 -07:00
a795cfa0ad fix(firmware): reboot on FATAL failures + emit NTP_SYNC + server-coord warning
- Config-load and camera-init FATAL branches now reboot (3s LED signal
  before restart) instead of hanging forever. Matches the enum name
  REBOOT_FATAL_* and makes camera-init failures diagnosable via the
  next boot's heartbeat recent_events. Config failures produce a
  visible reboot loop rather than a silent hang.
- Emit EVT_NTP_SYNC(seconds_since_boot) on the first NTP-synced
  reporter iteration so slow / failed NTP sync is a visible signal in
  the heartbeat's recent_events window.
- README "Deploying firmware 1.1" now opens with a "Before you flash"
  warning directing the operator to land server-side heartbeat
  schema changes first (migration 005 + stub integration) to avoid a
  strict-schema 4xx reboot loop after deployment.
2026-04-23 14:10:32 -07:00
d943b3df5a feat(firmware): log reason before FATAL hang loops
Two FATAL while(true) hangs in main.cpp (config load fail, camera init
fail) previously relied on the hardware watchdog to reboot the device,
leaving the cause invisible beyond a generic TWDT reset reason. Now
each path logs EVT_REBOOT with REBOOT_FATAL_CONFIG or REBOOT_FATAL_CAMERA
before hanging, so the next heartbeat's recent_events surfaces which
branch hung. Server-side decoder updated for the two new enum values.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-23 14:03:57 -07:00
2d95069bd1 docs: network-resilience firmware 1.1 deployment + field diagnostic guide
Flash command, expected first-boot behavior, per-feature summary of the
1.1 release, 24-hour field-check playbook, and a reference table for
decoding the heartbeat's recent_events array.
2026-04-23 14:02:09 -07:00
867e90b1f6 feat(server): heartbeat-diagnostics stub + migration for real server import
The real server lives in a separate repo; this repo carries reference
stubs for each endpoint (see camera_endpoint.py precedent). Adds the
Pydantic extension, persistence helper, migration 005, and tests that
the real server can copy when adding diagnostic-field support.

Matches the firmware v1.1.0 heartbeat payload shape. Old-shape
payloads (firmware v1.0.0) continue to parse cleanly with the new
fields defaulting to None.
2026-04-23 13:59:31 -07:00
5c9f5df0ce feat(firmware): include diagnostics in heartbeat payload
Heartbeat v1.1.0 now carries heap stats (free + min_free since boot),
esp_reset_reason(), last WiFi disconnect code, and the last 8
persisted event-log entries. Makes field failures diagnosable
server-side without retrieving the device: the post-reboot heartbeat
will include EVT_BOOT with reset reason and whatever EVT_WIFI_DOWN
or EVT_HTTP_FAIL entries preceded it.
2026-04-23 13:54:55 -07:00
f08f70a8fb feat(firmware): software heartbeat-miss watchdog reboots after 6h offline
Reporter task counts consecutive heartbeat failures from the bool
returned by reporter_heartbeat (Task 5). After 6 consecutive misses
(~6 hours at the hourly cadence) the device logs EVT_HEARTBEAT_MISS
then EVT_REBOOT(REBOOT_HEARTBEAT_MISS) and restarts, giving the whole
network stack a clean reinitialization. The 200ms delay before the
restart lets NVS commit the REBOOT entry so the next boot can report
it via EVT_BOOT + esp_reset_reason().
2026-04-23 13:52:07 -07:00
7b546d0ed7 feat(firmware): enable task watchdog on camera/reporter/loop tasks
30s TWDT subscribes all three long-running tasks and panics on hang.
The reporter task's retry loop explicitly feeds between attempts so
the 3-try sequence (worst case 52s) does not itself trip the dog.
Reset reason on next boot is visible via esp_reset_reason() which
EVT_BOOT already logs.
2026-04-23 13:49:05 -07:00
8f8ad0b1b0 fix(firmware): add HTTP timeouts + 3-try retry, report heartbeat status
Unbounded TLS/HTTP POSTs were blocking the reporter task indefinitely
on weak WiFi. Now: 5s connect timeout, 10s response timeout, 3 attempts
with 0/2s/5s backoff. Every attempt logs HTTP_OK or HTTP_FAIL to the
event log. reporter_heartbeat now returns bool so the caller can count
consecutive misses.
2026-04-23 13:44:17 -07:00
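
One reading of the timeouts in 8f8ad0b1b0 that reproduces the 52 s worst case quoted in the task-watchdog commit (7b546d0ed7) is sketched below. It assumes each attempt can consume the full connect and response timeouts and that the 0/2s/5s delays are applied before attempts 1, 2, and 3; the function names are hypothetical.

```python
import time

CONNECT_TIMEOUT_S = 5
RESPONSE_TIMEOUT_S = 10
BACKOFF_BEFORE_ATTEMPT_S = (0, 2, 5)  # delay applied before each of the 3 attempts


def worst_case_seconds():
    """Upper bound if every attempt uses both timeouts in full."""
    per_attempt = CONNECT_TIMEOUT_S + RESPONSE_TIMEOUT_S
    return sum(delay + per_attempt for delay in BACKOFF_BEFORE_ATTEMPT_S)


def post_with_retry(post_once, sleep=time.sleep):
    """Call post_once() up to three times with backoff; True on first success."""
    for delay in BACKOFF_BEFORE_ATTEMPT_S:
        if delay:
            sleep(delay)
        if post_once():
            return True
    return False
```

Under these assumptions the bound is (0 + 15) + (2 + 15) + (5 + 15) = 52 seconds, which exceeds a 30 s watchdog window and is why the retry loop has to feed the watchdog between attempts.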
57129ba078 fix(firmware): net_guard silent-wifi-death fallback + header hygiene
- net_guard_tick now detects status-vs-event divergence. If s_up is
  true but WiFi.status() says otherwise (rare: driver wedge, silent
  RF failure), force DOWN state and schedule reconnect. Uses 0xFF
  disconnect reason so the event log distinguishes this path.
- Forward-declare DeviceConfig in net_guard.h so consumers that don't
  call net_guard_start don't transitively pull config.h.
2026-04-23 13:41:53 -07:00
af3067d481 refactor(firmware): drive WiFi reconnect from net_guard events
loop() no longer blocks for 5s after a disconnect; reconnect is
scheduled from the WiFi event handler with exponential backoff.
Buffered reports flush on every clean UP transition.
2026-04-23 13:36:29 -07:00
cfa0d2563f fix(firmware): event_log bounded mutex wait, skip on contention
Mutex take in event_log_write and event_log_read_recent switched
from portMAX_DELAY to pdMS_TO_TICKS(50) with skip-on-timeout. Prevents
the high-priority WiFi event task from stalling on NVS writes; diag
loss under contention is preferable to dropped WiFi events.
2026-04-23 13:31:54 -07:00
84d9ba349b fix(firmware): net_guard boot-state seed + no spurious disconnect
- Seed s_up from WiFi.status() in net_guard_start so the first
  STA_GOT_IP (fired during setup's busy-wait, before onEvent was
  registered) is not missed — prevents a reconnect flap on every boot.
- Drop WiFi.disconnect() from net_guard_tick; WiFi.begin() alone
  re-associates cleanly and avoids a spurious STA_DISCONNECTED that
  was double-logging EVT_WIFI_DOWN on every retry.
- Re-check s_up after the millis() timing gate to close the
  GOT_IP-vs-tick race.
- Document the volatile-only shared-state contract.
2026-04-23 13:31:47 -07:00
9f293b4639 feat(firmware): event-driven WiFi reconnect with exponential backoff
net_guard registers WiFi.onEvent() so disconnects are handled
immediately instead of polled every 1s. Backoff 1s->2s->4s->...->60s cap.
Every up/down transition is logged to the event log with the disconnect
reason code, so field failures are diagnosable.
2026-04-23 13:26:10 -07:00
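
The 1s->2s->4s->...->60s schedule from 9f293b4639 corresponds to a doubling delay with a cap. The sketch below assumes plain doubling with no jitter (the commit message does not mention any); the function name and parameters are illustrative.

```python
def backoff_schedule(retries, initial_ms=1_000, cap_ms=60_000):
    """Return the delay before each retry: doubling from initial_ms, capped at cap_ms."""
    delays = []
    delay = initial_ms
    for _ in range(retries):
        delays.append(delay)
        delay = min(delay * 2, cap_ms)
    return delays
```

For example, `backoff_schedule(8)` yields delays of 1, 2, 4, 8, 16, 32, 60, and 60 seconds (expressed in milliseconds).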
95724bf3ff feat(firmware): log boot and reboot reason to event log
Every boot logs EVT_BOOT with esp_reset_reason(); every deliberate
ESP.restart() is preceded by EVT_REBOOT with a reason code. This
gives us a persistent answer to 'why did the device just reboot?'.
2026-04-23 13:21:23 -07:00
9eb1e19651 test(firmware): event_log boot recovery — partial fill and post-wrap
Exercises the slot-scan logic in event_log_init(): after a simulated
reboot (RAM state cleared, NVS slots preserved) the module must
resume with the correct head/cnt so newest-first read order is
unchanged and subsequent writes continue the seq monotonically.

Adds native-only event_log_test_simulate_reboot() helper. Lifts the
slot-scan loop out of the #ifdef ARDUINO guard so the native stub
exercises the same recovery path as production; the platform-specific
NVS setup remains guarded.
2026-04-23 13:18:08 -07:00
95f91d3656 fix(firmware): event_log thread safety and NVS wear
- Remove monotonic counter writes to NVS (stop burning flash on every
  event). Derive head and cnt by scanning slots on boot.
- Widen seq to uint32 so slot scan works across multi-year lifetimes.
- Add FreeRTOS mutex around write/read so WiFi event handlers can
  safely call event_log_write from another task.
- Check Preferences.begin() return; disable logging if NVS unavailable.
- Extract NTP_SYNC_THRESHOLD constant; drop misleading native uptime.
- Add tests for empty read, max_entries truncation, real-path hash.
2026-04-23 13:13:21 -07:00
9232766e60 feat(firmware): add NVS-backed event log ring buffer
Persistent 32-slot ring buffer of tagged diagnostic events (boot, wifi
up/down, http ok/fail, heartbeat miss, reboot). Used to diagnose field
failures post-hoc via the heartbeat payload, without needing serial
access. A native stub lets policy be unit-tested.
2026-04-23 13:06:38 -07:00
a37207b6ff feat: event-based walker detector tuned to real 7' overhead mount
Replace per-track line-crossing counter with a single event state machine
gated by foreground pixel count (ENTER=250, EXIT=150) and finalized by
quiet-exit or timeout. Direction inferred from centroid excursion
(up_score vs down_score) on quiet-exit fires, and from net displacement
(last_c vs first_c) on timeout fires.

Tuning reflects bench data at the intended 7' overhead mount: walkers
produce smaller centroid excursions than originally modelled, so
EXTENT gates, MIN_TRAJ, MAX_FRAMES and REFRACTORY were all relaxed from
their initial guesses. Constants and rationale live in firmware/lib/cv/cv.h.

Bench results (8 isolated walks, 4 entries + 4 exits):
  * Event detection: 8/8 (100%)
  * Aggregate entries+exits split: 4+4 (matches)
  * Per-walk direction labelling: 4/8 (~50%)

Document explicitly that per-walk direction is unreliable at this mount
and that downstream analytics should trust only gross traffic
(entries + exits). Recovering direction would require a physical mount
change or a richer signal; both are out of scope for v1.

Tooling:
  * tools/replay_logs.py — replay event state machine against captured
    [F] diagnostic lines, for offline tuning without flash-test loops.
  * firmware/src/main_capture.cpp + tools/capture_frames.py +
    tools/replay_frames.py — raw-frame capture firmware and Python port
    of the detector, kept in tree for future iteration even though the
    TimerCamera-F serial driver stripped specific byte ranges in testing
    and log-based replay became the working path.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-17 16:03:36 -07:00
3b471992f2 feat(cv): directional once-per-track counting + detection LED blinks
A single person walking under the overhead camera was generating both an
entry and an exit within a few seconds — the line-crossing logic treated
a blob's traversal into one side of the frame and out the other as two
separate events whenever the track spawned near the line, oscillated
against shadows, or churned at creation.

Replaced line-crossing semantics with directional traversal:
- Each track records spawn_y at creation and a counted flag.
- An event fires only if the track is not yet counted, spawned firm on
  one side of the line (|spawn_y - line_y| > CV_TRAVERSAL_MARGIN_PX),
  and is now firm on the opposite side. Direction of travel determines
  entry vs exit. The track is then flagged counted — one trip, one count.
- Cooldown remains as a secondary safety net.

main.cpp: single/double LED pulse on entry/exit detections. Saves and
restores the current LED state so upload (yellow-on) and no-WiFi
indicators aren't clobbered.

Tests updated to walk blobs beyond the margin and register two new
cases: wobble-at-line doesn't count, and a reversed full traversal
doesn't double-count on the same track.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-17 09:46:59 -07:00
24aaae6ff2 docs: add Troubleshooting section + serial_monitor.py diagnostic tool
- README: note NVS may be cleared by firmware uploads (requires re-running
  flash_device.py); new Troubleshooting table covering the fast-blink fatal
  state, captive-portal fallback, and no-counts cases.
- tools/serial_monitor.py: ESP32 RTS/DTR reset + serial capture with
  per-line elapsed-time prefix. Used to distinguish "unprovisioned" vs
  "WiFi failed" boot states (fast-blink LED alone is ambiguous).
- README project-tree updated to include lib/cv, docs/server-prompt-…,
  and the new tool.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-17 07:05:21 -07:00
62931e26ff fix(cv): add per-direction crossing cooldown to suppress track-churn double-counts
When a blob briefly drops below CV_MIN_BLOB_PX, its track is killed and respawns,
causing the same person to generate multiple counts per visit (~50/min observed
in field). Add a per-direction cooldown (default 5 frames ≈ 0.8s @ 5 fps) that
drops subsequent entries (or exits) within the window of the last counted one.
Entry and exit cooldowns are tracked independently.

Fixed at compile time for now; exposing as a server-push tunable is deferred
until the server-push-config branch lands. See docs/server-prompt-crossing-
cooldown.md for the server-side coordination notes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-17 06:33:11 -07:00
38 changed files with 4188 additions and 254 deletions

8
.gitignore vendored

@@ -1 +1,9 @@
.worktrees/
.agent/
.claude/
.adversarial-review/
graphify-out/
firmware/.pio/
*.log
*secret*
__pycache__/

327
README.md

@@ -1,12 +1,96 @@
# DoorCounter
Retail door traffic counter using M5Stack TimerCamera-F (ESP32 + OV3660). Counts entries/exits via overhead camera CV, passively scans BLE foot traffic, and reports hourly to `logs.research.bike`.
Retail door traffic counter using M5Stack TimerCamera-F (ESP32 + OV3660). Counts walker traversals via overhead camera CV, passively scans BLE foot traffic, and reports hourly to `logs.research.bike`.
> **Known limitations.**
> - **Directional accuracy.** Counts are reported as `{entries, exits}` for API compatibility, but **per-walk direction labelling is not reliable at the current mount (7' overhead, straight down).** Bench testing: event detection 100% (8/8), per-walk direction ~50% (coin flip). **Trust gross traffic: `entries + exits` ≈ total walkers.** See [Directional counting](#directional-counting).
> - **Detection latency.** A walker takes **3-5 seconds** from entering the FOV to being registered as a count: the state machine waits for the walker to clear the frame (or a 5s timeout) before finalizing. Counts are not instantaneous; hourly aggregation is the intended consumption mode.
## Hardware
- **Device**: M5Stack TimerCamera-F (ESP32-S, OV3660, PSRAM, WiFi/BLE)
- **Mount**: Overhead, camera pointing straight down, centered above doorway
- **Power**: USB (any phone charger)
| Component | Source | Notes |
|-----------|--------|-------|
| **Camera** | [M5Stack TimerCamera-F (OV3660 fisheye, PSRAM)](https://shop.m5stack.com/products/esp32-psram-timer-camera-fisheye-ov3660) | ESP32 + WiFi/BLE on board |
| **USB cable** | [USB-A → USB-C, right-angle](https://www.amazon.com/dp/B0DWMPVP4F) | Right-angle plug helps with overhead mounts |
| **Power supply** | [5V USB wall adapter](https://www.amazon.com/dp/B0B2WLSY9D) | Any 5V/1A+ USB charger works |
- **Mount**: Overhead, camera pointing straight down, centered above doorway (~7' / 2.1m height)
- **Power draw**: **~750 mW measured at the wall** (camera + WiFi + BLE all active). Runs cool — fanless, can be sealed in a small enclosure. Annual energy cost at US residential rates is well under $1.
## Quick Start (semi-technical)
The fastest path from "box arrived" to "counts in the dashboard." Comfortable with a terminal but not necessarily an embedded developer? Start here.
**You will need**: the camera + cable + power supply listed above, a Linux/macOS computer with USB, and ~20 minutes.
### 1. Install the toolchain (one-time)
```bash
# Python 3.10+ and pip
pip install --user platformio esptool esp-idf-nvs-partition-gen
```
PlatformIO installs the ESP32 compiler on first build — expect a few minutes the first time.
### 2. Clone this repo
```bash
git clone https://github.com/<your-org>/DoorCounter.git
cd DoorCounter
```
### 3. Plug the camera in
Connect the USB-C cable to the TimerCamera and the other end to your computer. On Linux it appears as `/dev/ttyUSB0`; on macOS as `/dev/tty.usbserial-*`. If you don't see it, install [CP210x USB drivers](https://www.silabs.com/developer-tools/usb-to-uart-bridge-vcp-drivers).
### 4. Flash the firmware
```bash
cd firmware
pio run -t upload --upload-port /dev/ttyUSB0
```
### 5. Provision the device with its credentials
Pick a unique device ID (e.g. `dc-0001`), a location ID, and generate a 32-byte HMAC secret. The server admin must record this same secret — counts won't be accepted without it.
```bash
# Generate a fresh secret
openssl rand -hex 32 > my-device-secret.txt
# Provision
python tools/flash_device.py \
--port /dev/ttyUSB0 \
--device-id dc-0001 \
--location-id my-store \
--hmac-secret "$(cat my-device-secret.txt)" \
--wifi-ssid "MyStoreWiFi" \
--wifi-password "wifi-password-here"
```
> If you skip `--wifi-ssid`/`--wifi-password`, the device opens a `DoorCounter-Setup` WiFi access point on boot. Connect a phone to it and enter the credentials in the captive portal.
### 6. Mount the device
1. Position above the doorway, camera lens pointing straight down (~7' / 2.1m up).
2. Plug into the wall adapter — that's it. The LED turns red while joining WiFi, then off once it's counting.
3. First heartbeat lands at the server within ~60 seconds; first hourly count batch arrives at the top of the next hour.
### What "working" looks like
- LED behavior: **off** = counting normally · **red** = no WiFi · **yellow** = uploading · **brief flash** when a walker is registered (1 flash = entry, 2 flashes = exit).
- A walker takes 3-5 seconds from entering the FOV to triggering the LED flash; this is normal.
- Hourly uploads to `logs.research.bike` (or your configured server) include the entry/exit counts since the last report.
### If something is off
| Symptom | Try |
|---------|-----|
| Red LED stays on | Wrong WiFi password — re-run step 5, or use the `DoorCounter-Setup` captive portal. |
| LED blinks ~1 Hz forever (or device reboots in a loop) | NVS got wiped — re-run step 5 with the same credentials. |
| No counts appearing on server | Run `python tools/serial_monitor.py --port /dev/ttyUSB0 --reset --timestamp --seconds 30` and watch for `[CV] entry/exit` lines as you walk under it. |
For deeper troubleshooting see [Troubleshooting](#troubleshooting) and [Operator Setup](#operator-setup).
## Firmware
@@ -21,7 +105,8 @@ pio run -t upload --upload-port /dev/ttyUSB0
| Module | Behavior |
|--------|----------|
| CV pipeline | 5 fps, 96×96 grayscale, blob tracking, line-crossing count |
| CV pipeline | 5 fps, 96×96 grayscale, event-based walker detector (foreground-count state machine; centroid-trajectory direction heuristic) with post-fire refractory period |
| Detection LED | Single blink on entry, double blink on exit (preserves upload/no-WiFi status LED) |
| BLE scanner | Continuous passive scan; deinits during hourly upload to free heap |
| Reporter | Hourly HMAC-signed POST; 60s boot report for fast connectivity check |
| Provisioning | Captive portal AP on first boot for WiFi setup |
@@ -32,6 +117,71 @@ pio run -t upload --upload-port /dev/ttyUSB0
- **First report**: 60 seconds after NTP sync (connectivity check)
- **Subsequent reports**: every 3600 seconds
### Counting model — event-based walker detector
The CV pipeline is a **single event state machine** (no per-blob tracking
for counting). Per-frame foreground pixel count gates event start and end;
centroid trajectory within the active event decides direction.
**Event lifecycle:**
1. **Idle → Active**: `fg_count ≥ CV_EVENT_ENTER_THRESH` (250 px) fires event start.
Background updates freeze while the event is active so the walker does
not get absorbed into the baseline.
2. **Active accumulation**: every frame updates `first_c` (once), `min_c`,
`max_c`, `last_c`, `min_y_seen`, `max_y_seen`, and the frame count.
3. **Active → End** (either):
- **Quiet exit**: `fg_count < CV_EVENT_EXIT_THRESH` (150 px) for
`CV_EVENT_QUIET_FRAMES` (3) consecutive frames — walker has left.
- **Timeout**: `event_frame_count > CV_EVENT_MAX_FRAMES` (25 frames ≈ 5s).
4. On end, the event is finalized: gated by minimum duration, vertical
extent (must span a large fraction of the frame), and minimum centroid
trajectory magnitude. Background snaps to the current frame.
5. A **refractory period** (`CV_EVENT_REFRACTORY_FRAMES` = 10 ≈ 2s) after
a fire blocks a new event from starting — absorbs residual lingering
motion that would otherwise double-count.
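
The lifecycle above can be sketched in Python as a gate driven only by the per-frame foreground count. This is a simplified model, not the code in `cv.cpp`: it omits the finalize gates from step 4 (duration, vertical extent, trajectory) and therefore starts the refractory period after every event end, which is an assumption. The class name is hypothetical; the threshold values are the ones listed above.

```python
ENTER_THRESH = 250       # CV_EVENT_ENTER_THRESH
EXIT_THRESH = 150        # CV_EVENT_EXIT_THRESH
QUIET_FRAMES = 3         # CV_EVENT_QUIET_FRAMES
MAX_FRAMES = 25          # CV_EVENT_MAX_FRAMES
REFRACTORY_FRAMES = 10   # CV_EVENT_REFRACTORY_FRAMES


class EventGate:
    def __init__(self):
        self.active = False
        self.frames = 0
        self.quiet = 0
        self.refractory = 0

    def step(self, fg_count):
        """Feed one frame; return 'quiet_exit', 'timeout', or None."""
        if not self.active:
            if self.refractory > 0:
                self.refractory -= 1  # new events are blocked while this counts down
                return None
            if fg_count >= ENTER_THRESH:
                self.active, self.frames, self.quiet = True, 1, 0
            return None
        self.frames += 1
        # Quiet frames must be consecutive, so any busy frame resets the run.
        self.quiet = self.quiet + 1 if fg_count < EXIT_THRESH else 0
        if self.quiet >= QUIET_FRAMES:
            return self._end("quiet_exit")
        if self.frames > MAX_FRAMES:
            return self._end("timeout")
        return None

    def _end(self, reason):
        self.active = False
        self.refractory = REFRACTORY_FRAMES
        return reason
```

The gap between the enter and exit thresholds gives hysteresis: a walker whose foreground count hovers between 150 and 250 neither starts a new event nor ends an active one.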
**Direction heuristic** (applied only if the event passes all gates):
- `up_score = first_c - min_c` (how far centroid excursed upward)
- `down_score = max_c - first_c` (how far it excursed downward)
- Quiet-exit events: `is_entry = (up_score ≥ down_score)`
- Timeout events: `is_entry = (last_c < first_c)` — net displacement is
more reliable than excursion when the walker is still in frame at timeout.
Per-mount convention: centroid moving **up through the frame** (y decreasing)
= **entry** into the store.
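
The direction heuristic can be written as a small pure function. This sketch assumes the centroid values are y coordinates with y increasing downward, as implied by the per-mount convention; the function name and the string return values are illustrative, not taken from `cv.cpp`.

```python
def classify_direction(first_c, min_c, max_c, last_c, timed_out):
    """Return 'entry' or 'exit' from centroid y statistics of one event."""
    if timed_out:
        # Walker still in frame at timeout: use net displacement.
        return "entry" if last_c < first_c else "exit"
    up_score = first_c - min_c    # excursion toward the top of the frame
    down_score = max_c - first_c  # excursion toward the bottom of the frame
    return "entry" if up_score >= down_score else "exit"
```

For instance, `classify_direction(48, 20, 55, 50, timed_out=False)` compares an upward excursion of 28 against a downward excursion of 7 and labels the event an entry.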
### Directional counting — known limitation
**Per-walk direction labelling is unreliable at the current mount.** In
bench testing (8 alternating entry/exit walks at 4s intervals, 7' overhead
mount pointing straight down):
- **Event detection**: 8/8 (100%) — every walk produced exactly one event.
- **Aggregate split**: 4 entries + 4 exits — matches the 4+4 ground truth.
- **Per-walk direction**: 4/8 (50%) — essentially a coin flip.
At this mount, entries and exits produce nearly identical centroid
trajectories: both begin near mid-frame (walker is already large when
`fg_count` crosses 250), both reach a peak excursion toward the top, and
both end near mid-frame (walker's tail is still visible when `fg_count`
drops below 150). No heuristic over the recorded centroid statistics
separates them with better than ~50% accuracy on alternating walks.
**What we ship, and what the server should trust:**
- **Gross traffic (`entries + exits`) is accurate.** This is the number
downstream analytics should use as "people through the door this hour."
- **Directional split is reported but unreliable.** Treat individual
`entries` and `exits` values as a best-effort labelling. Do not infer
net flow or dwell from them.
To actually recover per-walk direction would require either a physical
change (raise or tilt the camera so walkers enter/leave through the frame
edges) or a richer signal than centroid statistics (e.g. time-resolved
optical flow, or a second sensor). That work is out of scope for v1.
See `firmware/lib/cv/cv.h` for tuning constants and `cv.cpp` for the
finalize logic.
## Operator Setup
### 1. Flash firmware
@@ -43,18 +193,60 @@ pio run -t upload --upload-port /dev/ttyUSB0
### 2. Provision device identity
Generate a fresh 32-byte HMAC secret (64 hex chars) and stash it where you
won't lose it — the server must store the same value or counts will be
rejected:
```bash
# Generate and save (one device per file; never commit these)
mkdir -p .agent
openssl rand -hex 32 > .agent/dc-0042-secret
chmod 600 .agent/dc-0042-secret
```
> No `openssl`? Equivalents:
> - `python3 -c 'import secrets; print(secrets.token_hex(32))'`
> - `head -c 32 /dev/urandom | xxd -p -c 64`
Then provision:
```bash
python tools/flash_device.py \
--port /dev/ttyUSB0 \
--device-id dc-0042 \
--location-id retailer-123 \
--hmac-secret <32-byte-hex> \
--hmac-secret "$(cat .agent/dc-0042-secret)" \
--wifi-ssid "StoreWiFi" \
--wifi-password "secret"
```
WiFi credentials are optional — if omitted, device starts captive portal on boot.
**Known-good command for dc-0002** (dev device at research.bike):
```bash
python tools/flash_device.py \
--port /dev/ttyUSB0 \
--device-id dc-0002 \
--location-id retailer-123 \
--hmac-secret "$(cat .agent/dc-0002-secret)" \
--wifi-ssid Elly-Fi \
--wifi-password <ask> \
--line-offset 50
```
Secret is stored in `.agent/dc-0002-secret` (gitignored). Server must already
know this secret — do not rotate without updating the server side.
> **Re-provision after firmware uploads.** Flashing firmware via
> `pio run -t upload` may clear the NVS partition on this board.
> - **FW 1.0**: device boots into a ~1 Hz LED blink (hang in "not provisioned" fatal).
> - **FW 1.1+**: device reboot-loops with `FATAL: device_id/location_id/hmac_secret not provisioned`
> followed by `rst:0xc (SW_CPU_RESET)` (FATAL paths now reboot instead of hang).
>
> Either way, re-run `flash_device.py` with the same credentials. See
> [Troubleshooting](#troubleshooting).
### 3. OTA updates
```bash
@@ -70,7 +262,7 @@ python tools/ota_push.py \
3. Connect phone to `DoorCounter-Setup` WiFi
4. Browser opens automatically → enter store WiFi password → done
**LED indicators**: Red = no WiFi · Blue = counting · Yellow = uploading
**LED indicators**: Red = no WiFi · Blue = counting · Yellow = uploading · Brief flash (×1) on entry · Brief flash (×2) on exit
## API
@@ -90,7 +282,9 @@ All requests are HMAC-SHA256 signed. See [design spec](docs/superpowers/specs/20
DoorCounter/
├── firmware/
│ ├── platformio.ini
│ ├── lib/hmac/ — HMAC-SHA256 signing library
│ ├── lib/
│ │ ├── cv/ — CV pipeline (event state machine, centroid-trajectory direction)
│ │ └── hmac/ — HMAC-SHA256 signing library
│ └── src/
│ ├── main.cpp — FreeRTOS tasks, boot sequence
│ ├── config.* — NVS read/write
@@ -100,8 +294,119 @@ DoorCounter/
│ └── reporter.* — hourly batch POST + local buffer
├── tools/
│ ├── flash_device.py — NVS provisioning script
│ └── ota_push.py — OTA push script
├── docs/superpowers/specs/
│ └── 2026-04-13-door-counter-design.md
│ ├── ota_push.py — OTA push script
│ └── serial_monitor.py — reset + read serial with timestamps (diagnostic)
├── docs/
│ ├── server-prompt-crossing-cooldown.md — server-side coordination notes
│ └── superpowers/specs/2026-04-13-door-counter-design.md
└── server/ — API server (separate deployment)
```
## Troubleshooting
| Symptom | Likely cause | Remedy |
|---------|--------------|--------|
| ~1 Hz LED blink after boot (FW 1.0), OR reboot loop with `FATAL: device_id/location_id/hmac_secret not provisioned` followed by `rst:0xc (SW_CPU_RESET)` (FW 1.1+) | NVS missing `device_id` / `location_id` / `hmac_secret`. Commonly triggered by a firmware upload wiping NVS. FW 1.1+ reboots on FATAL instead of hanging. | Re-run `flash_device.py` with the device's known credentials (see section 2 for dc-0002). |
| Device stays on `DoorCounter-Setup` AP instead of joining customer WiFi | SSID/password in NVS wrong, or network out of range. | Connect phone to `DoorCounter-Setup` → captive portal → re-enter WiFi. Or reflash NVS with correct `--wifi-ssid` / `--wifi-password`. |
| No entries/exits counted at a doorway with known foot traffic | WiFi captive portal still up (camera task starts only after connect); or camera blocked/unfocused. | Check LED: solid on = booting/uploading, off = counting. Run `serial_monitor.py` to see `[CV] entry/exit` log lines. |
Capture a boot log with timestamps:
```bash
python tools/serial_monitor.py --port /dev/ttyUSB0 --reset --timestamp --seconds 30
```
## Deploying firmware 1.1 (network resilience)
### Before you flash
Firmware 1.1 adds five new fields to the `POST /api/v1/heartbeat` payload
(`reset_reason`, `heap_free`, `heap_min_free`, `last_disconnect_code`,
`recent_events`). **The real server must accept these optional fields before
you deploy firmware 1.1**, or strict-schema validation will 4xx every
heartbeat; after 6 consecutive misses (~6h) the heartbeat-miss watchdog
will reboot the device, producing a reboot loop.
Reference migration and handler code for the real server are in this repo:
- `server/heartbeat_diagnostics_stub.py` — Pydantic model extensions,
`store_heartbeat_diagnostics()` helper, and `EVENT_TAG_DECODER` /
`REBOOT_REASON_DECODER` reference tables.
- `server/migrations/005_heartbeat_diagnostics.sql` — adds five nullable
columns to the `heartbeats` table (adjust table name to match the real
server's schema).
Copy the stub additions into the production server repo, run the
migration, and confirm a v1.1.0-shape heartbeat returns 200 before you
flash any device.
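The key property is that all five fields are optional, so FW 1.0 and FW 1.1 heartbeats are both accepted. The sketch below illustrates that idea with standard-library dataclasses only; the real stub uses Pydantic, and the class name, helper name, and field types here are assumptions rather than the stub's actual definitions.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class HeartbeatDiagnostics:
    # Every field defaults to None so a FW 1.0 payload still parses.
    reset_reason: Optional[int] = None
    heap_free: Optional[int] = None
    heap_min_free: Optional[int] = None
    last_disconnect_code: Optional[int] = None
    recent_events: Optional[list] = None


def extract_diagnostics(payload: dict) -> HeartbeatDiagnostics:
    """Pick the diagnostic fields out of a heartbeat payload, ignoring the rest."""
    known = {f.name for f in fields(HeartbeatDiagnostics)}
    return HeartbeatDiagnostics(**{k: v for k, v in payload.items() if k in known})
```

A payload without any of the new keys yields an object whose attributes are all `None`, which maps naturally onto the five nullable columns added by the migration.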
### Flash command
```bash
cd firmware && pio run -e timercam -t upload
```
> **If the device reboot-loops after flashing** with `FATAL:
> device_id/location_id/hmac_secret not provisioned`, NVS was wiped. Re-run
> `flash_device.py` (see [section 2](#2-provision-device-identity)). FW 1.1
> turned the old FW 1.0 LED-blink hang into an explicit reboot loop; same
> root cause, same fix.
### Expected first boot
On the serial log (115200 baud), the device prints the boot banner, then
initializes `event_log`, then records the reset reason via `EVT_BOOT`.
The first heartbeat fires roughly 60-70s after power-on (15s WiFi
busy-wait + NTP sync + 60s `BOOT_REPORT_DELAY_S`). Monitor with
`pio device monitor` or:
```bash
python tools/serial_monitor.py --port /dev/ttyUSB0 --reset --timestamp --seconds 90
```
### What's new in 1.1
- Event-driven WiFi reconnect with 1s→60s exponential backoff (`net_guard` module); disconnect reasons logged.
- HTTP timeouts (5s connect / 10s response) + 3-try retry on every POST.
- ESP-IDF Task Watchdog (30s) on camera, reporter, and loop tasks; panic → reboot → reason surfaces in the next heartbeat.
- Software heartbeat-miss watchdog: 6 consecutive missed heartbeats (~6 h) triggers a clean reboot.
- Persistent NVS event-log ring buffer (32 entries) surfaced in the heartbeat's `recent_events` field.
- New heartbeat fields: `reset_reason`, `heap_free`, `heap_min_free`, `last_disconnect_code`, `recent_events`.
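To make the 1s→60s reconnect backoff concrete, here is a minimal sketch of the delay schedule. It assumes the delay doubles on each failed attempt and is capped at 60s; the doubling factor and the function name are assumptions, since the list above only states the 1s and 60s endpoints.

```python
from itertools import islice


def backoff_delays(base_s=1.0, cap_s=60.0, factor=2.0):
    """Yield successive retry delays in seconds, capped at cap_s."""
    delay = base_s
    while True:
        yield min(delay, cap_s)
        delay = min(delay * factor, cap_s)


# First eight delays under the doubling assumption.
schedule = list(islice(backoff_delays(), 8))
```

Under these assumptions the schedule reaches the 60s cap on the seventh attempt and stays there.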
### 24-hour field checks
After deploying a device, run through this checklist against the server's
heartbeat records at the 24-hour mark:
- **Heartbeat count ≥ 22** — ≥ 92% uptime across 24 h at the hourly cadence.
- **No sustained `t=6` (EVT_HEARTBEAT_MISS) entries in `recent_events`** — transient singletons are expected; repeated misses indicate a sticky network problem worth investigating.
- **`heap_min_free` stable day over day** — a downward drift indicates a leak. Alert threshold: min-free drops by more than 20% vs baseline.
- **`last_disconnect_code` matches known AP behavior** — reason 8 (assoc lost) and reason 15 (4-way handshake timeout) are common on busy APs; recurring reason 200+ indicates a firmware bug.
- **`reset_reason` has no unexpected values** — see table below.
| `reset_reason` | Meaning | Expected? |
|----------------|---------|-----------|
| 1 | Power-on | Normal immediately after a deployment. |
| 4 | Software reset (our `ESP.restart()`) | Correlate with `EVT_REBOOT` in `recent_events`. |
| 6 | Task watchdog | Investigate — a task hung for 30s. |
| 7 | Brownout | Investigate power supply / USB cable. |
| 8 | SDIO reset | Unusual — investigate. |
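Two of the checks above are simple arithmetic and could be automated against the stored heartbeat rows. The following is a rough sketch: the function name is hypothetical, and it assumes each heartbeat is available as a dict with an optional `heap_min_free` key.

```python
def field_check(heartbeats, baseline_heap_min_free):
    """Return a list of problems found in a 24-hour window of heartbeats."""
    problems = []

    # 22 of 24 hourly heartbeats is roughly 92% (22 / 24 = 0.917).
    if len(heartbeats) < 22:
        problems.append(f"heartbeat count {len(heartbeats)} < 22")

    # Alert when the lowest heap_min_free is more than 20% below baseline.
    heap_values = [hb["heap_min_free"] for hb in heartbeats
                   if hb.get("heap_min_free") is not None]
    if heap_values and baseline_heap_min_free > 0:
        drop = 1.0 - min(heap_values) / baseline_heap_min_free
        if drop > 0.20:
            problems.append(f"heap_min_free dropped {drop:.0%} vs baseline")

    return problems
```

An empty list means both checks passed. The remaining checklist items (`recent_events`, `last_disconnect_code`, `reset_reason`) still need a human to read them against the site's known behavior.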
### Decoding recent_events
The `recent_events` array is a ring buffer of `{t, d0, d1, ts}` entries.
Tag definitions live in `firmware/lib/event_log/event_log.h`:
| `t` | Event | `d0` | `d1` |
|-----|-------|------|------|
| 1 | `EVT_BOOT` | `esp_reset_reason()` | — |
| 2 | `EVT_WIFI_UP` | RSSI | — |
| 3 | `EVT_WIFI_DOWN` | disconnect reason code; `0xFF` = silent-death fallback | — |
| 4 | `EVT_HTTP_OK` | fnv1a-16 path hash | elapsed ms (capped at 65535) |
| 5 | `EVT_HTTP_FAIL` | path hash | HTTP status or negative errno cast to `uint16` |
| 6 | `EVT_HEARTBEAT_MISS` | consecutive miss count | — |
| 7 | `EVT_NTP_SYNC` | reserved | — |
| 8 | `EVT_REBOOT` | `RebootReason`: 1=HEARTBEAT_MISS, 2=FACTORY_RESET, 3=OTA, 4=WIFI_REPROV | — |
Server-side decoder tables (`EVENT_TAG_DECODER`, `REBOOT_REASON_DECODER`)
live in `server/heartbeat_diagnostics_stub.py`.
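As an illustration of how the table above might be applied, the sketch below turns one `recent_events` entry into a readable string. The dictionary and function names are hypothetical and are not the stub's `EVENT_TAG_DECODER`. It also assumes that `d1` values of `0x8000` and above on `EVT_HTTP_FAIL` are negative errno values, which holds as long as HTTP statuses stay below that.

```python
# Transcribed from the tag table above; the authoritative definitions are in
# the firmware header and the server stub.
EVENT_NAMES = {
    1: "EVT_BOOT", 2: "EVT_WIFI_UP", 3: "EVT_WIFI_DOWN", 4: "EVT_HTTP_OK",
    5: "EVT_HTTP_FAIL", 6: "EVT_HEARTBEAT_MISS", 7: "EVT_NTP_SYNC", 8: "EVT_REBOOT",
}
REBOOT_REASONS = {1: "HEARTBEAT_MISS", 2: "FACTORY_RESET", 3: "OTA", 4: "WIFI_REPROV"}


def describe_event(entry: dict) -> str:
    """Render one {t, d0, d1, ts} entry as a short human-readable string."""
    t = entry["t"]
    d0 = entry.get("d0", 0)
    d1 = entry.get("d1", 0)
    name = EVENT_NAMES.get(t, f"UNKNOWN({t})")
    if t == 3:
        reason = "silent-death fallback" if d0 == 0xFF else str(d0)
        return f"{name} reason={reason}"
    if t == 4:
        return f"{name} path_hash=0x{d0:04x} elapsed_ms={d1}"
    if t == 5:
        # Undo the uint16 cast for negative errno values (assumption above).
        status = d1 - 0x10000 if d1 >= 0x8000 else d1
        return f"{name} path_hash=0x{d0:04x} status={status}"
    if t == 6:
        return f"{name} consecutive={d0}"
    if t == 8:
        return f"{name} reason={REBOOT_REASONS.get(d0, d0)}"
    return f"{name} d0={d0}"
```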


@@ -0,0 +1,133 @@
from docx import Document
from docx.shared import Pt, Inches, RGBColor
from docx.enum.text import WD_ALIGN_PARAGRAPH
doc = Document()
for section in doc.sections:
    section.top_margin = Inches(0.6)
    section.bottom_margin = Inches(0.6)
    section.left_margin = Inches(0.8)
    section.right_margin = Inches(0.8)

style = doc.styles['Normal']
style.font.name = 'Calibri'
style.font.size = Pt(11)


def heading(text, size=18, color=(0x1F, 0x3A, 0x5F), space_before=6, space_after=4):
    p = doc.add_paragraph()
    p.paragraph_format.space_before = Pt(space_before)
    p.paragraph_format.space_after = Pt(space_after)
    run = p.add_run(text)
    run.bold = True
    run.font.size = Pt(size)
    run.font.color.rgb = RGBColor(*color)
    return p


def subheading(text):
    return heading(text, size=13, color=(0x1F, 0x3A, 0x5F), space_before=8, space_after=2)


def body(text, bold_lead=None):
    p = doc.add_paragraph()
    p.paragraph_format.space_after = Pt(4)
    if bold_lead:
        r = p.add_run(bold_lead)
        r.bold = True
        p.add_run(text)
    else:
        p.add_run(text)
    return p


def bullet(text, bold_lead=None):
    p = doc.add_paragraph(style='List Bullet')
    p.paragraph_format.space_after = Pt(2)
    if bold_lead:
        r = p.add_run(bold_lead)
        r.bold = True
        p.add_run(text)
    else:
        p.add_run(text)
    return p

# ---------- Title ----------
title = doc.add_paragraph()
title.alignment = WD_ALIGN_PARAGRAPH.CENTER
tr = title.add_run('DoorCounter')
tr.bold = True
tr.font.size = Pt(28)
tr.font.color.rgb = RGBColor(0x1F, 0x3A, 0x5F)
sub = doc.add_paragraph()
sub.alignment = WD_ALIGN_PARAGRAPH.CENTER
sr = sub.add_run('A simple, private way to count visitors to your store')
sr.italic = True
sr.font.size = Pt(13)
sr.font.color.rgb = RGBColor(0x55, 0x55, 0x55)
sub.paragraph_format.space_after = Pt(10)
# ---------- What it is ----------
heading('What is in the box?', size=14)
bullet('A small camera (about the size of a matchbox)', bold_lead='Camera — ')
bullet('A USB cable to power it', bold_lead='Cable — ')
bullet('A small wall plug', bold_lead='Power adapter — ')
body('That\'s it. There is nothing to install on your computer or phone, no software to log into, and no monthly fee.')
# ---------- What it does ----------
heading('What does it do?', size=14)
body('The camera mounts above your front door, pointing straight down at the floor. Whenever someone walks underneath, it counts them. Once an hour, it sends the count to us so we can share visitor traffic reports with you.')
p = doc.add_paragraph()
p.paragraph_format.space_after = Pt(4)
r = p.add_run('Your privacy is protected. ')
r.bold = True
p.add_run('The camera looks straight down at the top of people\'s heads — it cannot see faces. No video or photos are ever saved or sent anywhere. Only the count of how many people walked through.')
# ---------- Setup ----------
heading('How do I set it up?', size=14)
body('The whole process takes about 5 minutes. You will need a stepladder and your store\'s WiFi password.')
subheading('Step 1 — Mount the camera above your door')
body('Use the included double-sided tape (or a screw, if you prefer) to stick the camera to the ceiling, directly above where people walk through your front door. The lens should point straight down at the floor. Aim for roughly 7 feet (about 2 meters) above the floor — most ceilings work fine.')
subheading('Step 2 — Plug it in')
body('Connect the USB cable to the camera and to the wall plug. Plug the wall plug into any standard outlet. The camera will turn on automatically — you will see a small red light.')
subheading('Step 3 — Connect it to your WiFi')
body('Take out your phone and open its WiFi settings. You will see a new network called "DoorCounter-Setup". Connect to it. Your phone will automatically open a setup page — enter your store\'s WiFi name and password, then tap Save.')
body('After about 30 seconds, the red light on the camera will turn off. That means it is connected and counting. You are done!', bold_lead='')
# ---------- Day to day ----------
heading('What do I do day-to-day?', size=14)
body('Nothing. The camera works on its own, 24 hours a day. It uses about as much electricity as a nightlight (less than $1 per year), runs cool, and never needs to be touched.')
p = doc.add_paragraph()
p.paragraph_format.space_after = Pt(4)
r = p.add_run('A small light blinks each time someone walks through. ')
r.bold = True
p.add_run('You may notice the count happens 35 seconds after the person passes — that is normal.')
# ---------- Troubleshooting ----------
heading('If something seems wrong', size=14)
bullet('your WiFi password is probably wrong, or the WiFi network is out of range. Reconnect your phone to "DoorCounter-Setup" and re-enter the password.', bold_lead='Red light stays on — ')
bullet('unplug it for 10 seconds and plug it back in.', bold_lead='No light at all — ')
bullet('please contact us using the information below.', bold_lead='Anything else — ')
# ---------- Contact ----------
heading('Questions?', size=14)
body('We are happy to help. Reach out anytime:')
bullet('peter@research.bike', bold_lead='Email: ')
bullet('https://git.research.bike/Bicycle_Market_Research/DoorCounter', bold_lead='Project page: ')
footer = doc.add_paragraph()
footer.alignment = WD_ALIGN_PARAGRAPH.CENTER
fr = footer.add_run('Thank you for participating in our retail traffic study.')
fr.italic = True
fr.font.size = Pt(10)
fr.font.color.rgb = RGBColor(0x77, 0x77, 0x77)
footer.paragraph_format.space_before = Pt(12)
import sys
out = sys.argv[1] if len(sys.argv) > 1 else 'retailer-setup-guide.docx'
doc.save(out)
print(f"wrote {out}")


@@ -0,0 +1,78 @@
# Server-Side Prompt — Crossing Cooldown Awareness
> Give this to your server-side agent after the firmware commit that introduces
> `CV_CROSSING_COOLDOWN_FRAMES` in `firmware/lib/cv/cv.h` has been flashed to devices.
## Context
The DoorCounter firmware now enforces a **per-direction crossing cooldown** in
its CV pipeline. After a counted entry, subsequent entries within 5 frames
(~1 second at 5 fps) are silently dropped on-device. Exits follow the same
rule independently. This is a device-side fix for the known track-churn bug
(single person producing 5+ counts per visit when their blob briefly drops
below the minimum-blob-pixel threshold).
Constants live in `firmware/lib/cv/cv.h`:
```c++
static const uint32_t CV_CROSSING_COOLDOWN_FRAMES = 5; // per-direction
```
Cooldown is **fixed at compile time**. It is **not** currently a server-
pushable tuning parameter.
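The per-direction rule can be modeled in a few lines, which may help when reasoning about expected count drops. This is a simplified sketch and not the firmware's `cv_process` logic. It assumes "within 5 frames" means a frame gap of 5 or fewer, and that the window is measured from the last *counted* crossing in that direction. Both are assumptions; the function name is hypothetical.

```python
COOLDOWN_FRAMES = 5  # mirrors CV_CROSSING_COOLDOWN_FRAMES


def apply_cooldown(crossings, cooldown=COOLDOWN_FRAMES):
    """Filter (frame_index, direction) pairs, dropping rapid repeats per direction."""
    last_counted = {}  # direction -> frame index of the last counted crossing
    counted = []
    for frame, direction in crossings:
        prev = last_counted.get(direction)
        if prev is not None and frame - prev <= cooldown:
            continue  # suppressed: too soon after the last counted crossing
        last_counted[direction] = frame
        counted.append((frame, direction))
    return counted
```

Because entries and exits keep separate timers, an exit immediately after an entry is still counted, while a burst of entries from track churn collapses to one.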
## What the server should do
1. **Expect lower per-hour entry/exit counts** from devices running this
firmware compared to their historical baseline. This reflects suppression
of false positives, not a device regression. Do not alert on the drop.
2. **If you have a per-device tuning-config push mechanism** (the one planned
in `.agent/plan.md` — server-push CV config via heartbeat response), add
`cooldown_frames` to the sendable tuning set with:
- Default: `5`
- Valid range: `0..60` (0 disables cooldown; 60 ≈ 12s at 5 fps)
- Semantics: per-direction, applied to both entries and exits
- Persist alongside other CV tuning keys in NVS under a new key `cv_cool`.
- Bump the `cfg_version` scheme accordingly.
**Do not ship this server-side change yet** — the firmware change in this
commit keeps the cooldown as a compile-time constant. A future firmware
commit on `feature/server-push-config` will expose it as a runtime
tunable and bump `cfg_version`. Coordinate the rollout: firmware first,
then server.
3. **Dashboard**: if you render device CV parameters in a per-device settings
view, add a read-only row "Crossing cooldown (frames): 5" sourced from
the firmware's compiled default. Mark it editable only once the firmware
exposes it as a tunable.
4. **Telemetry (optional, low priority)**: consider adding a field
`suppressed_crossings_delta` to the heartbeat or camera-events payload
so operators can see how often cooldown is kicking in. This would require
a firmware change; flag it as future work only if churn continues.
## What NOT to do
- Do not attempt to push `cooldown_frames` via the existing config channel
today — the firmware will ignore unknown fields, which is fine, but
shipping server changes that assume the firmware-side plumbing exists
will break the integration contract.
- Do not "correct" the lower counts via server-side multiplication. The
cooldown is the correct behavior; old counts were inflated by the churn
bug.
## Verification checklist
- [ ] Historical counts chart annotated with "firmware v{N} deployed"
marker on the rollout date.
- [ ] Per-device tuning view renders cooldown row (read-only for now).
- [ ] No alert fires on the per-device count drop post-rollout.
## Reference
- Firmware change: `firmware/lib/cv/cv.h` (`CV_CROSSING_COOLDOWN_FRAMES`),
`firmware/lib/cv/cv.cpp` (suppression logic in `cv_process`).
- Design spec: `docs/superpowers/specs/2026-04-13-door-counter-design.md`
§ 3.1 "Counting logic".
- Unit test: `firmware/test/test_cv/test_cv.cpp::test_cooldown_suppresses_rapid_re_entry`.


@@ -0,0 +1,189 @@
# BLE / NimBLE Timer-Callout Crash — Handoff
**Date opened:** 2026-05-01
**Status:** Resolved 2026-05-01 by upgrading `h2zero/NimBLE-Arduino` from `^1.4.2` to `^2.0.0` (`firmware/platformio.ini:24`). BLE scanning re-enabled via `BLE_SCANNING_ENABLED 1` (`firmware/src/main.cpp:30`). Verified clean on customer network for 1+ hour with no panics.
**Goal:** Re-enable BLE scanning without the device crashing within ~1s of boot.
**Confirmed root cause:** Instrumented `os_callout_timer_cb` with `ets_printf` and observed the very first callout fire on the direct-call path with both `evq=NULL` and `fn=NULL`, while the same `co` address later (post-init) showed valid `evq` and `fn`. Same callout struct reused — classic NimBLE 1.x callout init/fire race where the FreeRTOS `TimerHandle_t` had a queued expiry against a not-yet-initialized event. NimBLE 2.x rewrote the porting layer; the race is gone.
**Migration touches (NimBLE 1.x → 2.x):**
- `NimBLEAdvertisedDeviceCallbacks` → `NimBLEScanCallbacks`
- `onResult(NimBLEAdvertisedDevice*)` → `onResult(const NimBLEAdvertisedDevice*)`
- `setAdvertisedDeviceCallbacks(cb, true)` → `setScanCallbacks(cb, true)`
- `start(0, nullptr, false)` → `start(0, false, false)` (signature: `duration, isContinue, restart`)
BLE was working before today's customer-site provisioning trip. The crash is reliably reproducible on the current build at the customer location whenever `BLE_SCANNING_ENABLED` is set back to `1`. It may or may not reproduce on a quieter network — the camera task's CPU-starvation pattern is shared, but the crash window's exact trigger is unconfirmed.
---
## Symptom
Within ~1s of boot, after several `cam_hal: EV-VSYNC-OVF` lines from the camera driver:
```
Guru Meditation Error: Core 0 panic'ed (InstrFetchProhibited). Exception was unhandled.
Core 0 register dump:
PC : 0x00000000 PS : 0x00060630 A0 : 0x8009a9af A1 : 0x3ffbd6e0
A2 : 0x3fff1ef8 A3 : 0x00000001 ...
A8 : 0x800f2ebc ...
EXCCAUSE: 0x00000014 EXCVADDR: 0x00000000
Backtrace: 0xfffffffd:0x3ffbd6e0 0x4009a9ac:0x3ffbd700
```
Decoded with `~/.platformio/packages/toolchain-xtensa-esp32/bin/xtensa-esp32-elf-addr2line -e .pio/build/timercam/firmware.elf -pfiC 0x4009a9ac 0x400f2ebc`:
```
prvProcessReceivedCommands at freertos/timers.c:852
(inlined by) prvTimerTask at freertos/timers.c:600
os_callout_timer_cb at NimBLE-Arduino/.../npl_os_freertos.c:1742
```
`PC=0` + `EXCCAUSE=0x14` (InstrFetchProhibited) = jump-to-NULL. The FreeRTOS timer-service task is dispatching a NimBLE callout whose callback function pointer is NULL.
The relevant NimBLE source:
```c
// firmware/.pio/libdeps/timercam/NimBLE-Arduino/src/nimble/porting/npl/freertos/src/npl_os_freertos.c:1729-1742
static void
os_callout_timer_cb(TimerHandle_t timer)
{
    struct ble_npl_callout *co;

    co = pvTimerGetTimerID(timer);
    assert(co);

    if (co->evq) {
        ble_npl_eventq_put(co->evq, &co->ev);
    } else {
        co->ev.fn(&co->ev); // <-- co->ev.fn is NULL
    }
}
```
Either `co->ev.fn` is genuinely NULL on the direct-call path, OR — given the addr2line frame is a few lines off and the callsite is ambiguous — the FreeRTOS timer's own callback pointer (`pxTimer->pxCallbackFunction`) is NULL inside `prvProcessReceivedCommands`. Both indicate a callout/timer being freed or zeroed while the FreeRTOS timer service still has a command queued for it.
---
## Environment
- Board: M5Stack TimerCam-F (ESP32-D0WDQ6-V3, dual-core 240 MHz, 4MB flash).
- BLE library: `h2zero/NimBLE-Arduino@^1.4.2` (`firmware/platformio.ini`). 1.4.2 is end-of-life on the 1.x branch; 2.x exists with breaking API changes.
- Camera: OV3660 via `esp32-camera` driver, 96×96 grayscale @ 5 FPS.
- BLE scan: passive, low-overhead, hash-collected by `firmware/src/ble_scanner.cpp`.
- Tasks: `task_camera` (core 1, prio 2, 8KB stack), `task_reporter` (core 0, prio 1, 8KB stack), Arduino loop (default).
- The camera task triggers `cam_hal: EV-VSYNC-OVF` whenever frame capture overlaps another long operation — this consistently precedes the crash in logs.
---
## What's been ruled out
1. **DNS / network code** — entirely unrelated. DNS path works in production via the new fallback-IP machinery (`firmware/src/reporter.cpp` `resolve_api_ip` and `firmware/src/reporter.h` `REPORTER_API_FALLBACK_IP`). Do not regress this; it shipped with reports working at the customer site.
2. **Our BLE app code** — the backtrace stays inside the FreeRTOS timer service and NimBLE's own porting layer; nothing in `ble_scanner.cpp` is on the call stack. The bug is in vendored NimBLE.
3. **Memory corruption from our side**: `A2 = 0x3fff1ef8` is a normal heap address, no obvious overrun pattern. Heap is healthy at the time (we'd see a different fault otherwise).
4. **Stack overflow** — A1 = 0x3ffbd6e0 is well within the FreeRTOS timer-service task's stack range; no canary smash log.
---
## What changed today
| File | Change | Keep? |
|---|---|---|
| `firmware/src/main.cpp` | Added `BLE_SCANNING_ENABLED 0` gate; all `ble_scanner_*` callsites compile out; `BLEHourlyRecord` zero-stubbed when off | Keep until crash fixed; flip to `1` to reproduce |
| `firmware/src/main.cpp` | Removed verbose `[F]`/`[CV] spawn` per-frame logging; kept entry/exit + heartbeat | Keep |
| `firmware/src/ble_scanner.cpp` | Removed `[BLE] new device:` per-discovery log | Keep |
| `firmware/src/reporter.{h,cpp}` | DNS resolution with fallback IP, raw `WiFiClient` HTTP, manual `Host:` header | Keep — production fix |
| `firmware/lib/net_guard/net_guard.{h,cpp}` | DNS pin to 1.1.1.1/8.8.8.8 at lwIP + esp-netif layers; `net_guard_dump_dns` diagnostic | Keep |
---
## Reproduction
1. `cd firmware && pio run -e timercam`.
2. Edit `firmware/src/main.cpp`, set `#define BLE_SCANNING_ENABLED 1`. Rebuild.
3. Flash a TimerCam: `python tools/flash_device.py --port /dev/ttyUSB0 --device-id dc-XXXX --location-id <loc> --hmac-secret <secret> --wifi-ssid "<ssid>" --wifi-password "<pw>"`.
4. `pio device monitor --port /dev/ttyUSB0 --baud 115200`.
5. Wait ≤30s. Expect the `Guru Meditation Error: Core 0 panic'ed (InstrFetchProhibited)` traceback above.
Crash is **deterministic** on the customer's network (Elly-Fi). Worth retesting on a quiet desk network — if it doesn't repro there, the trigger is camera-task starvation interacting with NimBLE timers, not a pure NimBLE bug.
To decode any future crash backtrace:
```sh
~/.platformio/packages/toolchain-xtensa-esp32/bin/xtensa-esp32-elf-addr2line \
-e firmware/.pio/build/timercam/firmware.elf -pfiC <addr1> <addr2> ...
```
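When decoding many crash logs, pulling the addresses out by hand gets tedious. The sketch below extracts the program-counter half of each `PC:SP` pair from a `Backtrace:` line, and converts a return address taken from a register dump into a code address by replacing its top two bits (on the Xtensa windowed ABI those bits hold the call-window size, which is how `A8 = 0x800f2ebc` became the `0x400f2ebc` passed to addr2line earlier). The function names are hypothetical.

```python
import re

_PAIR = re.compile(r"(0x[0-9a-fA-F]+):(0x[0-9a-fA-F]+)")


def backtrace_pcs(line: str) -> list:
    """Return the PC values (as strings) from an ESP32 'Backtrace:' line."""
    return [pc for pc, _sp in _PAIR.findall(line)]


def register_to_code_address(value: int) -> int:
    """Map a windowed-ABI return address (0x8xxxxxxx) to its 0x4xxxxxxx code address."""
    if value & 0x80000000:
        return (value & 0x3FFFFFFF) | 0x40000000
    return value
```

The resulting addresses can be appended to the `addr2line` command above. Frames such as `0xfffffffd` are not valid code addresses and will simply decode to `??`.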
---
## Investigation paths, in order of effort/confidence
### 1. Confirm the failing call site (cheap, do this first)
The addr2line line numbers can be off by ±3 due to inlining. Add a temporary `Serial.printf` patch to `npl_os_freertos.c` `os_callout_timer_cb` to log `co`, `co->evq`, `co->ev.fn` on entry. Reproduce. Then we know with certainty whether `co->ev.fn` is NULL on the direct-call path or whether this is a FreeRTOS-level issue (queued command for a deleted timer).
If `evq != NULL` and we still crash, the NULL is in the queued event dispatcher (a different code path; pivot the investigation).
### 2. Try upgrading NimBLE-Arduino to 2.x (medium effort, likely-fix)
`platformio.ini` has `h2zero/NimBLE-Arduino@^1.4.2`. 2.x rewrote the porting layer significantly. Breaking API changes — `NimBLEAdvertisedDeviceCallbacks` was renamed/restructured. Touch points: `firmware/src/ble_scanner.cpp` (the only file that uses NimBLE).
Try: pin `^2.0.0`, fix the API breakage in `ble_scanner.cpp` (it's <100 lines). If 2.x crashes too, the issue is independent of NimBLE version → pivot to (3) or (4).
### 3. Reduce camera-task starvation (cheap, may be sufficient)
The `EV-VSYNC-OVF` lines are the canary. The camera task pins core 1 at priority 2 doing CV processing every 200ms. NimBLE host task runs on core 0 by default but the FreeRTOS timer service task is core-agnostic and may be starved during long CV passes that hold a mutex.
Things to try in `firmware/src/main.cpp`:
- Lower `CAM_FPS` from 5 to 3, see if VSYNC-OVF still appears.
- Move CV processing off the capture path (capture into a queue, process at lower priority).
- Raise FreeRTOS timer-service task priority via `configTIMER_TASK_PRIORITY` (sdkconfig).
- Confirm NimBLE host task pinning — `CONFIG_BT_NIMBLE_PINNED_TO_CORE` should be 0 or 1 (not unpinned).
### 4. Local NULL-guard patch (last resort, masks the bug)
If upgrade is blocked and starvation reduction isn't enough, patch the vendored source:
```c
// npl_os_freertos.c:1740
} else {
    if (co->ev.fn) co->ev.fn(&co->ev);
}
```
This silences the crash but discards the affected event. The discarded events are likely scan-result deliveries; we'd undercount BLE devices but not crash. Acceptable as a stopgap with a `// TODO: remove when NimBLE upgraded` and a note in this doc.
Caveat: vendored library files in `.pio/libdeps/` get blown away by clean builds. Either copy NimBLE into `firmware/lib/` and pin it (vendored), or use `lib_archive` + a post-install script. Don't ship a build that depends on an unpinned hand-edit.
### 5. Replace BLE stack (high effort)
If 2.x also crashes and starvation reduction doesn't help, switch to the IDF-native bluedroid stack via the Arduino-ESP32 `BLEDevice` API. Larger memory footprint (~30KB more heap) but a different lifecycle model — won't share NimBLE's bug.
---
## Constraints / things not to break
- `firmware/src/reporter.cpp` DNS path with `REPORTER_API_FALLBACK_IP` — production fix, must keep working. Do not regress to `HTTPClient`.
- `BLE_SCANNING_ENABLED 0` is the **shipping default** until this is resolved. Devices in the field rely on this; flip to `1` only in dev builds.
- `firmware/lib/net_guard/net_guard.cpp` `net_guard_pin_dns()` is called both at boot and on every WiFi reconnect; if reorganizing net_guard, preserve both call sites.
- The `ble_scanner` module supports `ble_scanner_pause`/`resume` for OTA — verify it still works after any NimBLE upgrade (`ArduinoOTA.onStart` hook in `main.cpp:248`).
---
## Open questions
- Does the crash repro on a quiet network with no `EV-VSYNC-OVF`? (Determines whether starvation is necessary vs sufficient.)
- Was BLE working in a previous build, and on which NimBLE version? The earliest BLE-related commit traced so far is well before today; a binary search across firmware commits with BLE enabled would identify the regression boundary if it's our code.
- Does the customer site have an unusual RF environment (very dense BLE) that increases the callout-churn rate, making the race more likely? Worth a `nimble_scan_event` count log during a 60s capture window.
---
## Quick verification once you think it's fixed
1. Set `BLE_SCANNING_ENABLED 1`, rebuild, flash.
2. Run for at least 10 minutes on the customer network — the original crash hit within ~1s, so 10 min with no panic is strong evidence.
3. Confirm a successful hourly cycle: `[CV] entry/exit`, then `[HTTP] POST .../events/batch ... -> 200`, BLE record with non-zero `unique_devices`.
4. Run a second device side-by-side; confirm no cross-device interference.
When done, set `BLE_SCANNING_ENABLED 1` as the default and remove the gate (keep the comment block as institutional memory of the bug).


@@ -13,7 +13,7 @@
[TimerCamera-F Device]
├── Provisioning module — captive portal AP on first boot
├── Config store — NVS: device_id, location_id, HMAC secret, WiFi creds, line_offset
├── Camera + CV module — captures frames, runs line-crossing counter
├── Camera + CV module — captures frames, runs event-based walker detector
├── BLE scanner — continuous passive scan (WiFi coexistence mode)
├── Report buffer — accumulates counts in RAM, flushes hourly
└── HTTP client — HMAC-signed POSTs to logs.research.bike
@@ -89,15 +89,38 @@ Capture → Grayscale → Downscale 96×96 → Frame diff → Threshold → Blob
| Downscale | Bilinear to 96×96 (~11× compute reduction) |
| Frame diff | Absolute difference against rolling background (updated every ~2s when no motion) |
| Threshold | Pixels > 30 intensity delta = foreground |
| Blob detect | Connected components; blobs < 8×8 px discarded as noise |
| Centroid track | Nearest-centroid matching frame-to-frame (max 15px), tracks persist up to 10 missed frames |
| Line crossing | Virtual horizontal line at configurable vertical position (default: 50% of frame height) |
| Event state machine | Single global state machine (not per-blob). Per-frame `fg_count` (total foreground pixels) gates event start and end. |
| Event start | `fg_count ≥ CV_EVENT_ENTER_THRESH` (250 px) → event becomes active. Background updates freeze for the event's duration so the walker does not blend into the baseline. |
| Event accumulation | Each frame records `first_c` (centroid_y at start), running `min_c` / `max_c` / `last_c`, vertical extents (`min_y_seen`, `max_y_seen`), and frame count. |
| Event end | Either **quiet exit** (`fg_count < CV_EVENT_EXIT_THRESH` (150 px) for `CV_EVENT_QUIET_FRAMES` (3) consecutive frames) or **timeout** (`event_frame_count > CV_EVENT_MAX_FRAMES` (25)). On end, background snaps to the current frame. |
| Fire gates | Duration ≥ `CV_EVENT_MIN_FRAMES` (5), `min_y_seen ≤ CV_EVENT_EXTENT_TOP` (25) AND `max_y_seen ≥ CV_EVENT_EXTENT_BOT` (50) — event must span a large fraction of the frame — AND `max(up_score, down_score) ≥ CV_EVENT_MIN_TRAJ` (5) |
| Refractory | `CV_EVENT_REFRACTORY_FRAMES` (10 ≈ 2s) after a fire, the machine refuses to start a new event — absorbs lingering motion of the just-counted walker. |
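The start, end, and refractory rows can be read as a small gate driven only by the per-frame `fg_count`. The sketch below models just that gating, using the thresholds quoted in the table. It is not the firmware implementation: the class name, the convention that the start frame counts as frame 1, and the separate `mark_fired` call (used because only events that pass the fire gates start the refractory period) are all assumptions.

```python
ENTER_THRESH = 250       # CV_EVENT_ENTER_THRESH
EXIT_THRESH = 150        # CV_EVENT_EXIT_THRESH
QUIET_FRAMES = 3         # CV_EVENT_QUIET_FRAMES
MAX_FRAMES = 25          # CV_EVENT_MAX_FRAMES
REFRACTORY_FRAMES = 10   # CV_EVENT_REFRACTORY_FRAMES


class EventGate:
    def __init__(self):
        self.active = False
        self.frame_count = 0
        self.quiet_count = 0
        self.last_fire_frame = None

    def mark_fired(self, frame_idx):
        """Record that the event ending at frame_idx passed the fire gates."""
        self.last_fire_frame = frame_idx

    def step(self, frame_idx, fg_count):
        """Advance one frame; return 'quiet_exit' or 'timeout' when an event ends."""
        if not self.active:
            in_refractory = (self.last_fire_frame is not None and
                             frame_idx - self.last_fire_frame < REFRACTORY_FRAMES)
            if fg_count >= ENTER_THRESH and not in_refractory:
                self.active = True
                self.frame_count = 1
                self.quiet_count = 0
            return None
        self.frame_count += 1
        # Quiet frames must be consecutive, so any busy frame resets the run.
        self.quiet_count = self.quiet_count + 1 if fg_count < EXIT_THRESH else 0
        if self.quiet_count >= QUIET_FRAMES:
            self.active = False
            return "quiet_exit"
        if self.frame_count > MAX_FRAMES:
            self.active = False
            return "timeout"
        return None
```

The gap between the enter threshold (250) and the exit threshold (150) gives hysteresis, so a walker whose foreground area hovers around one threshold does not repeatedly start and stop events.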
**Counting logic:**
- Centroid crosses line top→bottom = **entry**
- Centroid crosses line bottom→top = **exit**
**Direction heuristic (applied after fire gates pass):**
- `up_score = first_c - min_c` (peak upward centroid excursion)
- `down_score = max_c - first_c` (peak downward centroid excursion)
- **Quiet-exit fires**: `is_entry = (up_score ≥ down_score)`
- **Timeout fires**: `is_entry = (last_c < first_c)` — walker is still in frame at timeout, so net displacement is a better signal than excursion.
Counts accumulate as `{entries, exits}` in RAM and reset each hour on report.
Per-mount convention: centroid moving **up through the frame** (y decreasing) = **entry** into the store.
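The fire gates and the direction heuristic combine into a single decision per finished event. The sketch below restates them in Python using the constants quoted in this section. It is a reading aid under those stated rules, not the firmware code, and the function name and argument order are hypothetical.

```python
MIN_FRAMES = 5    # CV_EVENT_MIN_FRAMES
EXTENT_TOP = 25   # CV_EVENT_EXTENT_TOP
EXTENT_BOT = 50   # CV_EVENT_EXTENT_BOT
MIN_TRAJ = 5      # CV_EVENT_MIN_TRAJ


def classify_event(first_c, min_c, max_c, last_c,
                   min_y_seen, max_y_seen, frame_count, timed_out):
    """Return 'entry', 'exit', or None if the event fails the fire gates."""
    up_score = first_c - min_c      # peak upward excursion (y decreasing)
    down_score = max_c - first_c    # peak downward excursion (y increasing)

    passes_gates = (frame_count >= MIN_FRAMES
                    and min_y_seen <= EXTENT_TOP
                    and max_y_seen >= EXTENT_BOT
                    and max(up_score, down_score) >= MIN_TRAJ)
    if not passes_gates:
        return None

    if timed_out:
        # Walker still in frame: use net displacement instead of excursion.
        is_entry = last_c < first_c
    else:
        is_entry = up_score >= down_score
    return "entry" if is_entry else "exit"
```

Note that a tie (`up_score == down_score`) resolves to an entry on the quiet-exit path, so the heuristic is slightly biased toward entries when the trajectory is ambiguous.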
**Counting surface**: `{entries, exits}` accumulate in RAM and reset each hour on report.
**Directional accuracy is best-effort, not guaranteed.** In bench testing at the intended 7' overhead straight-down mount:
| Metric | Result |
|--------|--------|
| Event detection | 8/8 walks (100%) |
| Aggregate entry/exit split | 4+4 vs ground-truth 4+4 (matches) |
| Per-walk direction labelling | 4/8 (50%) — no better than chance |
At this mount, entries and exits produce nearly identical centroid trajectories: the walker is already large when `fg_count` crosses 250 (so `first_c` is always near mid-frame), their tail is still visible when `fg_count` drops below 150 (so `last_c` is always near mid-frame), and the excursion in between peaks upward for both directions. No statistic computable from (`first_c`, `min_c`, `max_c`, `last_c`, duration) separates them reliably.
**Contract with downstream consumers (API and analytics):**
- **`entries + exits` is the trustworthy number** — it is the count of walkers through the doorway in the hour. Use this as "foot traffic."
- **Individual `entries` and `exits` are reported for API shape compatibility, but should not be relied on for net flow, dwell, or any per-direction analysis.**
Recovering true direction requires either a physical change (tilt or raise the camera so walkers pass fully through the frame edges) or a richer signal (time-resolved centroid trajectory, optical flow, secondary sensor). Both are out of scope for v1.
---


@@ -5,8 +5,21 @@
#include <algorithm>
#include <vector>
static void event_reset(CVState& s) {
s.event_active = false;
s.event_start_frame = 0;
s.event_frame_count = 0;
s.event_peak_n = 0;
s.event_first_c = -1.0f;
s.event_last_c = -1.0f;
s.event_min_c = (float)CV_H;
s.event_max_c = -1.0f;
s.event_min_y_seen = CV_H;
s.event_max_y_seen = -1;
s.event_quiet_count = 0;
}
void cv_init(CVState& state) {
// Initialize members directly — avoid CVState{} temporary which puts 9KB on stack
memset(state.background, 0, sizeof(state.background));
state.bg_valid = false;
state.last_motion_frame = 0;
@@ -15,6 +28,8 @@ void cv_init(CVState& state) {
state.tracks.clear();
state.entries = 0;
state.exits = 0;
state.last_fire_frame = 0;
event_reset(state);
}
void cv_reset_counts(CVState& state) {
@@ -24,9 +39,6 @@ void cv_reset_counts(CVState& state) {
struct Point { int x, y; };
// Note: queue may grow to CV_PIXELS entries (~72KB) on large blobs.
// Requires PSRAM (enabled via -DBOARD_HAS_PSRAM in platformio.ini).
// BFS flood fill. Marks visited pixels (sets fg to 0). Returns {-1,-1} if blob < CV_MIN_BLOB_PX.
static std::pair<float,float> extract_blob(uint8_t* fg, int start_x, int start_y) {
std::vector<Point> queue;
queue.reserve(512);
@@ -58,7 +70,7 @@ static std::pair<float,float> extract_blob(uint8_t* fg, int start_x, int start_y
static std::vector<std::pair<float,float>> find_centroids(const uint8_t* fg) {
std::vector<std::pair<float,float>> result;
static uint8_t fg_copy[CV_PIXELS]; // static to avoid 9KB stack allocation
static uint8_t fg_copy[CV_PIXELS];
memcpy(fg_copy, fg, CV_PIXELS);
for (int y = 0; y < CV_H; y++) {
@@ -80,8 +92,62 @@ static void frame_diff(const uint8_t* frame, const uint8_t* bg,
}
}
CVResult cv_process(CVState& state, const uint8_t* frame, uint8_t line_pct) {
CVResult result = {0, 0};
// Decide whether the just-ended event should fire and in which direction.
// Up-through-frame (centroid excursion from high y toward low y) maps to
// ENTRY per mount convention.
static void finalize_event(CVState& s, CVResult& result) {
if (s.event_frame_count < CV_EVENT_MIN_FRAMES) return;
// Note: no MAX_FRAMES rejection here. An event that runs the full duration
// may still be a valid walker whose fg_count stayed above EXIT_THRESH due
// to a stale bg or an AEC-driven lighting shift. Extent + MIN_TRAJ gates
// below already reject stationary-person / wobble events.
if (s.event_min_y_seen > CV_EVENT_EXTENT_TOP) return;
if (s.event_max_y_seen < CV_EVENT_EXTENT_BOT) return;
// Direction from centroid excursion relative to event start.
// up_score: how far centroid excursed upward (smaller y) from first_c.
// down_score: how far it excursed downward (larger y) from first_c.
float up_score = s.event_first_c - s.event_min_c;
float down_score = s.event_max_c - s.event_first_c;
float winning = (up_score >= down_score) ? up_score : down_score;
if (winning < CV_EVENT_MIN_TRAJ) return;
// Timeout-aware direction. Quiet-exit events (fg fell below EXIT_THRESH)
// have the walker fully out of frame, so the min/max excursion brackets the
// true traversal and the up/down scores are reliable. Timeout events (event
// hit MAX_FRAMES while fg was still elevated) can capture both an approach
// and a departure within the window, so excursion measures the walker's
// *range in frame* rather than direction: a walker who paused near the
// top, then drifted back toward the middle before timeout, would be called
// an entry by up_score even though net motion is mixed. For those, the net
// first-to-last centroid displacement is a better direction signal (it's
// where the walker ended up, not just where they peaked).
bool timed_out = (s.event_frame_count > CV_EVENT_MAX_FRAMES);
bool is_entry;
if (timed_out) {
is_entry = (s.event_last_c < s.event_first_c);
} else {
is_entry = (up_score >= down_score);
}
if (is_entry) {
s.entries++;
result.entries_delta++;
} else {
s.exits++;
result.exits_delta++;
}
s.last_fire_frame = s.frame_index;
result.fire_first_c = s.event_first_c;
result.fire_min_c = s.event_min_c;
result.fire_max_c = s.event_max_c;
result.fire_last_c = s.event_last_c;
result.fire_duration = s.event_frame_count;
}
CVResult cv_process(CVState& state, const uint8_t* frame, uint8_t /*line_pct*/) {
CVResult result = {0, 0, 0, -1, -1, -1.0f, 0.0f, 0.0f, 0.0f, 0.0f, 0};
state.frame_index++;
if (!state.bg_valid) {
@@ -90,85 +156,147 @@ CVResult cv_process(CVState& state, const uint8_t* frame, uint8_t line_pct) {
return result;
}
static uint8_t fg[CV_PIXELS]; // static: avoids 9KB on task stack
static uint8_t fg[CV_PIXELS];
frame_diff(frame, state.background, fg, CV_PIXELS);
int fg_count = 0;
for (int i = 0; i < CV_PIXELS; i++) fg_count += fg[i];
bool motion = fg_count > CV_MIN_BLOB_PX;
if (!motion) {
if (state.frame_index - state.last_motion_frame > 10) {
memcpy(state.background, frame, CV_PIXELS);
// Running-average background blend: bg = (31*bg + frame)/32. Adapts to
// slow scene drift during idle periods. Frozen during an active event so
// the walker's signature is never absorbed — otherwise bg retains a
// "ghost" of the walker for ~30 frames after they leave, keeping fg_count
// elevated and preventing subsequent walkers from producing a clean
// trajectory.
if (!state.event_active) {
for (int i = 0; i < CV_PIXELS; i++) {
state.background[i] = (uint8_t)(((uint16_t)state.background[i] * 31 + frame[i]) >> 5);
}
}
int fg_count = 0;
int min_y = CV_H, max_y = -1;
long sum_y = 0;
for (int y = 0; y < CV_H; y++) {
const uint8_t* row = &fg[y * CV_W];
int row_count = 0;
for (int x = 0; x < CV_W; x++) row_count += row[x];
if (row_count > 0) {
if (y < min_y) min_y = y;
if (y > max_y) max_y = y;
sum_y += (long)row_count * y;
fg_count += row_count;
}
}
result.fg_count = fg_count;
result.fg_min_y = (fg_count > 0) ? min_y : -1;
result.fg_max_y = (fg_count > 0) ? max_y : -1;
result.fg_centroid_y = (fg_count > 0) ? ((float)sum_y / fg_count) : -1.0f;
// Hard self-heal: if more than half the frame is fg, bg is catastrophically
// wrong. Snap and skip the event machine this frame.
if (fg_count > CV_PIXELS / 2) {
memcpy(state.background, frame, CV_PIXELS);
state.last_motion_frame = state.frame_index;
if (state.event_active) event_reset(state);
return result;
}
// Diagnostic track management (no effect on counting).
bool motion = fg_count > CV_MIN_BLOB_PX;
if (motion) {
state.last_motion_frame = state.frame_index;
auto centroids = find_centroids(fg);
std::vector<bool> centroid_matched(centroids.size(), false);
for (auto& track : state.tracks) {
float best_dist = CV_MAX_MOVE * CV_MAX_MOVE;
int best_idx = -1;
for (int i = 0; i < (int)centroids.size(); i++) {
if (centroid_matched[i]) continue;
float dx = centroids[i].first - track.x;
float dy = centroids[i].second - track.y;
float d2 = dx*dx + dy*dy;
if (d2 < best_dist) { best_dist = d2; best_idx = i; }
}
if (best_idx >= 0) {
centroid_matched[best_idx] = true;
track.x = centroids[best_idx].first;
track.y = centroids[best_idx].second;
track.missed = 0;
} else {
track.missed++;
}
}
state.tracks.erase(
std::remove_if(state.tracks.begin(), state.tracks.end(),
[](const CVTrack& t){ return t.missed > CV_MAX_MISSED; }),
state.tracks.end());
for (int i = 0; i < (int)centroids.size(); i++) {
if (centroid_matched[i]) continue;
CVTrack t;
t.id = state.next_id++;
t.x = centroids[i].first;
t.y = centroids[i].second;
t.spawn_y = t.y;
t.missed = 0;
state.tracks.push_back(t);
}
} else {
for (auto& t : state.tracks) t.missed++;
state.tracks.erase(
std::remove_if(state.tracks.begin(), state.tracks.end(),
[](const CVTrack& t){ return t.missed > CV_MAX_MISSED; }),
state.tracks.end());
return result;
}
state.last_motion_frame = state.frame_index;
// Event state machine. Refractory period after a fire blocks new events
// for CV_EVENT_REFRACTORY_FRAMES frames — absorbs lingering-walker motion
// that would otherwise re-trigger a second count.
bool in_refractory = state.last_fire_frame != 0 &&
(state.frame_index - state.last_fire_frame) < CV_EVENT_REFRACTORY_FRAMES;
auto centroids = find_centroids(fg);
std::vector<bool> centroid_matched(centroids.size(), false);
for (auto& track : state.tracks) {
float best_dist = CV_MAX_MOVE * CV_MAX_MOVE;
int best_idx = -1;
for (int i = 0; i < (int)centroids.size(); i++) {
if (centroid_matched[i]) continue;
float dx = centroids[i].first - track.x;
float dy = centroids[i].second - track.y;
float d2 = dx*dx + dy*dy;
if (d2 < best_dist) { best_dist = d2; best_idx = i; }
if (!state.event_active) {
if (!in_refractory && fg_count >= CV_EVENT_ENTER_THRESH) {
state.event_active = true;
state.event_start_frame = state.frame_index;
state.event_frame_count = 1;
state.event_peak_n = fg_count;
state.event_first_c = result.fg_centroid_y;
state.event_last_c = result.fg_centroid_y;
state.event_min_c = result.fg_centroid_y;
state.event_max_c = result.fg_centroid_y;
state.event_min_y_seen = min_y;
state.event_max_y_seen = max_y;
state.event_quiet_count = 0;
}
if (best_idx >= 0) {
centroid_matched[best_idx] = true;
track.x = centroids[best_idx].first;
track.y = centroids[best_idx].second;
track.missed = 0;
} else {
state.event_frame_count++;
if (fg_count > state.event_peak_n) state.event_peak_n = fg_count;
if (fg_count > 0) {
state.event_last_c = result.fg_centroid_y;
if (result.fg_centroid_y < state.event_min_c) state.event_min_c = result.fg_centroid_y;
if (result.fg_centroid_y > state.event_max_c) state.event_max_c = result.fg_centroid_y;
if (min_y < state.event_min_y_seen) state.event_min_y_seen = min_y;
if (max_y > state.event_max_y_seen) state.event_max_y_seen = max_y;
}
if (fg_count < CV_EVENT_EXIT_THRESH) {
state.event_quiet_count++;
if (state.event_quiet_count >= CV_EVENT_QUIET_FRAMES) {
finalize_event(state, result);
event_reset(state);
memcpy(state.background, frame, CV_PIXELS);
}
} else {
track.missed++;
}
}
state.tracks.erase(
std::remove_if(state.tracks.begin(), state.tracks.end(),
[](const CVTrack& t){ return t.missed > CV_MAX_MISSED; }),
state.tracks.end());
float line_y = (line_pct / 100.0f) * CV_H;
for (int i = 0; i < (int)centroids.size(); i++) {
if (centroid_matched[i]) continue;
CVTrack t;
t.id = state.next_id++;
t.x = centroids[i].first;
t.y = centroids[i].second;
t.above_line = (t.y < line_y);
t.missed = 0;
state.tracks.push_back(t);
}
// Line crossing check
for (auto& track : state.tracks) {
if (track.missed > 0) continue; // only check tracks matched this frame
bool now_above = (track.y < line_y);
if (now_above != track.above_line) {
if (!now_above) {
// was above, now below → entry
state.entries++;
result.entries_delta++;
} else {
// was below, now above → exit
state.exits++;
result.exits_delta++;
state.event_quiet_count = 0;
if (state.event_frame_count > CV_EVENT_MAX_FRAMES) {
// Timeout end: fg still elevated. Snap bg anyway — in practice
// a stuck-high event means bg is stale (walker has merged
// with stale bg, or AEC shifted). Leaving bg stale permanently
// poisons subsequent events. If a walker truly is mid-frame
// they'll get absorbed into bg, but that's a rare corner
// beaten by the common case of stale bg chaining events.
finalize_event(state, result);
event_reset(state);
memcpy(state.background, frame, CV_PIXELS);
}
}
track.above_line = now_above;
}
return result;
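The running-average background blend used in `cv_process` above, `bg = (31*bg + frame)/32`, can be tried in isolation. The sketch below assumes a hypothetical free function `blend_background`; the arithmetic is the same integer expression as in the hunk.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// One blend step per pixel: bg = (31 * bg + frame) / 32, using integer math.
// The intermediate fits in 16 bits: 255 * 31 + 255 = 8160.
void blend_background(uint8_t* bg, const uint8_t* frame, size_t n) {
    for (size_t i = 0; i < n; i++) {
        const uint16_t mixed = static_cast<uint16_t>(bg[i] * 31 + frame[i]);
        bg[i] = static_cast<uint8_t>(mixed >> 5);  // divide by 32, truncating
    }
}
```

Because the shift truncates, the blend is not symmetric: a pixel that is brighter than the background by fewer than 32 levels leaves the background unchanged, while a pixel that is darker by any amount pulls it down by at least one level per frame.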


@@ -12,10 +12,63 @@ static const int CV_MIN_BLOB_PX = 64;
static const float CV_MAX_MOVE = 15.0f;
static const int CV_MAX_MISSED = 10;
// Event-based walker detector. Per-frame zone-flip approaches were direction-
// blind at realistic mounts: a walker traversing top-to-bottom and a walker
// traversing bottom-to-top produced identical zone-dominance sequences
// (geometric artifact of asymmetric zones + body spanning the line). The
// event approach buffers a whole walker event, then decides direction from
// the centroid trajectory: sign(first_centroid_y - peak_centroid_y) > 0 means
// the centroid moved upward through the frame during the event.
//
// Per-mount convention: UP through frame == ENTRY into store. Flip the camera
// mount or invert the mapping in cv_process if the physical install differs.
// fg_count thresholds that gate event start/end. Tuned against a real
// 8-walk isolated test (see .agent/walk_isolated_8walks.log). Lower than
// initial guesses because the 7' overhead mount produces smaller centroid
// excursions than we originally modelled.
static const int CV_EVENT_ENTER_THRESH = 250;
static const int CV_EVENT_EXIT_THRESH = 150;
// Number of consecutive sub-EXIT frames required to end an event.
static const int CV_EVENT_QUIET_FRAMES = 3;
// Min/max event duration in frames. Below min = too brief to be a walker
// (noise burst). Above max = stationary object or stuck detection.
static const int CV_EVENT_MIN_FRAMES = 5;
// MAX bounds the event duration. Too low (15) cut events off while walker
// was still physically in frame — every fire hit dur=MAX+1 and bg snapped
// with a walker-ghost baked in, corrupting the next walk. Too high (40)
// merged multiple walkers. 25 frames (5s) lets a single walker reach the
// quiet-exit path (fg drops below EXIT_THRESH) before timeout, so bg snaps
// on a clean empty frame.
static const int CV_EVENT_MAX_FRAMES = 25;
// Required vertical extent: during the event, fg must have reached near the
// top of the frame (min_y <= TOP) AND near the bottom (max_y >= BOT). At a
// 7' overhead mount real walkers span fg y≈0..70, not 0..95 — the original
// 10/85 gates rejected most real walks. Relaxed to catch them while still
// filtering small local motion that doesn't span the doorway.
static const int CV_EVENT_EXTENT_TOP = 25;
static const int CV_EVENT_EXTENT_BOT = 50;
// Minimum centroid excursion (max of up_score/down_score) for a valid
// trajectory. At overhead mount walker centroid traverses ~15-40 pixels;
// 15 was too aggressive and dropped clean walks. 5 still filters wobble.
static const float CV_EVENT_MIN_TRAJ = 5.0f;
// Refractory period after a fire. Shorter than originally chosen — at 5 fps
// a second walker can arrive within 2s of the first, especially at busy
// doorways. 10 frames = 2s of back-pressure, tuned to match the gap between
// consecutive isolated walks in the test log.
static const uint32_t CV_EVENT_REFRACTORY_FRAMES = 10;
// Diagnostic only: tracks are kept for spawn logging. Counting does NOT
// depend on tracks.
struct CVTrack {
int id;
float x, y;
bool above_line;
float spawn_y;
int missed;
};
@@ -28,11 +81,36 @@ struct CVState {
std::vector<CVTrack> tracks;
int entries;
int exits;
// Event state machine.
bool event_active;
uint32_t event_start_frame;
int event_frame_count;
int event_peak_n;
float event_first_c;
float event_last_c;
float event_min_c; // min centroid_y observed during event
float event_max_c; // max centroid_y observed during event
int event_min_y_seen;
int event_max_y_seen;
int event_quiet_count;
uint32_t last_fire_frame; // 0 = never; frame of last counted fire
};
struct CVResult {
int entries_delta;
int exits_delta;
// Per-frame foreground diagnostics (populated every call).
int fg_count;
int fg_min_y;
int fg_max_y;
float fg_centroid_y;
// Populated only on a fire frame; zeroed otherwise.
float fire_first_c;
float fire_min_c;
float fire_max_c;
float fire_last_c;
int fire_duration;
};
void cv_init(CVState& state);
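The enter/exit thresholds and the quiet-frame count in this header form a hysteresis gate. The following sketch is a simplified model of that gate only (no refractory period, timeout, extent or trajectory checks). `count_quiet_exits` is a hypothetical helper, and the constants are copied from the header values above.

```cpp
#include <cassert>
#include <vector>

constexpr int kEnterThresh = 250;  // CV_EVENT_ENTER_THRESH
constexpr int kExitThresh = 150;   // CV_EVENT_EXIT_THRESH
constexpr int kQuietFrames = 3;    // CV_EVENT_QUIET_FRAMES

// Counts how many events end via the quiet-exit path for a sequence of
// per-frame foreground pixel counts.
int count_quiet_exits(const std::vector<int>& fg_counts) {
    bool active = false;
    int quiet = 0;
    int events = 0;
    for (int fg : fg_counts) {
        if (!active) {
            if (fg >= kEnterThresh) {  // start an event
                active = true;
                quiet = 0;
            }
            continue;
        }
        if (fg < kExitThresh) {
            quiet++;
            if (quiet >= kQuietFrames) {  // enough consecutive quiet frames
                events++;
                active = false;
            }
        } else {
            quiet = 0;  // activity resumed; restart the quiet run
        }
    }
    return events;
}
```

A frame count between the two thresholds neither starts an event nor contributes to ending one, which is what keeps a walker whose foreground hovers near a single threshold from toggling the event on and off.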


@@ -0,0 +1,156 @@
// firmware/lib/event_log/event_log.cpp
#include "event_log.h"
#include <string.h>
#include <stdio.h>
#ifdef ARDUINO
#include <Arduino.h>
#include <Preferences.h>
#include <time.h>
#include <freertos/FreeRTOS.h>
#include <freertos/semphr.h>
static Preferences s_prefs;
static const char* NVS_NS = "evlog";
static bool s_ok = false;
static SemaphoreHandle_t s_mutex = nullptr;
static uint32_t g_head = 0; // next write slot (0..31), RAM-only
static uint32_t g_cnt = 0; // total writes since boot scan, RAM-only
static constexpr time_t NTP_SYNC_THRESHOLD = 1700000000; // 2023-11-14
#else
// Native build: in-memory stub
#include <cstdint>
static uint8_t g_slots[32 * 32];
static uint32_t g_head = 0;
static uint32_t g_cnt = 0;
extern "C" void event_log_test_reset() {
memset(g_slots, 0, sizeof(g_slots));
g_head = 0;
g_cnt = 0;
}
extern "C" void event_log_test_simulate_reboot() {
// Simulate device reboot: clear in-RAM state, keep persistent slots.
g_head = 0;
g_cnt = 0;
}
#endif
static const size_t SLOTS = 32;
static const size_t SLOT_SIZE = sizeof(EventLogEntry);
uint16_t event_log_path_hash(const char* path) {
// fnv1a-16 (fold 32-bit fnv1a down to 16 bits)
uint32_t h = 0x811c9dc5u;
while (*path) { h ^= (uint8_t)*path++; h *= 0x01000193u; }
return (uint16_t)((h >> 16) ^ (h & 0xFFFF));
}
static void slot_write(size_t idx, const EventLogEntry& e) {
#ifdef ARDUINO
char key[8]; snprintf(key, sizeof(key), "s%u", (unsigned)idx);
s_prefs.putBytes(key, &e, SLOT_SIZE);
#else
memcpy(&g_slots[idx * SLOT_SIZE], &e, SLOT_SIZE);
#endif
}
static bool slot_read(size_t idx, EventLogEntry& e) {
#ifdef ARDUINO
char key[8]; snprintf(key, sizeof(key), "s%u", (unsigned)idx);
size_t n = s_prefs.getBytes(key, &e, SLOT_SIZE);
return n == SLOT_SIZE;
#else
memcpy(&e, &g_slots[idx * SLOT_SIZE], SLOT_SIZE);
return true;
#endif
}
void event_log_init() {
#ifdef ARDUINO
if (s_mutex == nullptr) {
s_mutex = xSemaphoreCreateMutex();
}
s_ok = s_prefs.begin(NVS_NS, /*readOnly=*/false);
if (!s_ok) {
Serial.println("[evlog] NVS begin failed");
return;
}
#endif
// Scan all 32 slots; locate the one with the largest seq.
// Empty log: every slot tag == 0 (not a valid EventLogTag, which starts at 1).
uint32_t max_seq = 0;
int max_idx = -1;
bool any_valid = false;
for (size_t i = 0; i < SLOTS; i++) {
EventLogEntry e = {};
if (!slot_read(i, e)) continue;
if (e.tag == 0) continue;
any_valid = true;
if (max_idx < 0 || e.seq >= max_seq) {
max_seq = e.seq;
max_idx = (int)i;
}
}
if (any_valid) {
g_head = (uint32_t)((max_idx + 1) % SLOTS);
g_cnt = max_seq + 1;
} else {
g_head = 0;
g_cnt = 0;
}
}
void event_log_write(EventLogTag tag, uint16_t data0, uint16_t data1) {
#ifdef ARDUINO
if (!s_ok) return;
// Bounded wait: skip on contention rather than stall the calling task.
// This matters because event_log_write runs from the WiFi event task
// (priority 23); blocking it on a 10-100ms NVS write can overflow the
// event queue. Diagnostic loss is preferable to dropped WiFi events.
if (s_mutex && xSemaphoreTake(s_mutex, pdMS_TO_TICKS(50)) != pdTRUE) return;
EventLogEntry e = {};
time_t now = time(nullptr);
e.ts_unix = (now > NTP_SYNC_THRESHOLD) ? (uint32_t)now : 0;
e.uptime_s = (uint32_t)(millis() / 1000);
e.tag = (uint8_t)tag;
e.data0 = data0;
e.data1 = data1;
e.seq = g_cnt;
slot_write(g_head % SLOTS, e);
g_head = (g_head + 1) % SLOTS;
g_cnt = g_cnt + 1;
if (s_mutex) xSemaphoreGive(s_mutex);
#else
EventLogEntry e = {};
e.ts_unix = 0;
e.uptime_s = 0;
e.tag = (uint8_t)tag;
e.data0 = data0;
e.data1 = data1;
e.seq = g_cnt;
slot_write(g_head % SLOTS, e);
g_head = (g_head + 1) % SLOTS;
g_cnt = g_cnt + 1;
#endif
}
size_t event_log_read_recent(EventLogEntry* out, size_t max_entries) {
#ifdef ARDUINO
if (!s_ok) return 0;
// Bounded wait to match event_log_write. Reads are slower (32 NVS gets),
// but returning 0 entries under contention beats blocking the caller.
if (s_mutex && xSemaphoreTake(s_mutex, pdMS_TO_TICKS(50)) != pdTRUE) return 0;
#endif
uint32_t head = g_head;
uint32_t cnt = g_cnt;
size_t available = (cnt < SLOTS) ? (size_t)cnt : SLOTS;
size_t n = (max_entries < available) ? max_entries : available;
for (size_t i = 0; i < n; i++) {
// newest is at (head - 1), then (head - 2), ... modulo SLOTS
size_t idx = (head + SLOTS - 1 - i) % SLOTS;
slot_read(idx, out[i]);
}
#ifdef ARDUINO
if (s_mutex) xSemaphoreGive(s_mutex);
#endif
return n;
}
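The boot-time scan in `event_log_init` and the newest-first indexing in `event_log_read_recent` can be modelled without NVS. This is a sketch under simplifying assumptions: `Slot`, `RingState`, `recover` and `recent_index` are hypothetical names, and the slot holds only the two fields the scan inspects.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

struct Slot { uint8_t tag; uint32_t seq; };       // tag == 0 means "never written"
struct RingState { uint32_t head; uint32_t count; };

// Rebuild the RAM-only head/count from persistent slots: the slot with the
// largest seq is the newest entry, so the next write goes just after it.
RingState recover(const Slot* slots, size_t n) {
    assert(n > 0);
    bool any_valid = false;
    uint32_t max_seq = 0;
    size_t max_idx = 0;
    for (size_t i = 0; i < n; i++) {
        if (slots[i].tag == 0) continue;
        if (!any_valid || slots[i].seq >= max_seq) {
            max_seq = slots[i].seq;
            max_idx = i;
            any_valid = true;
        }
    }
    if (!any_valid) return {0, 0};
    return {static_cast<uint32_t>((max_idx + 1) % n), max_seq + 1};
}

// Slot index of the i-th most recent entry (i = 0 is the newest).
size_t recent_index(uint32_t head, size_t n, size_t i) {
    return (head + n - 1 - i) % n;
}
```

For a four-slot ring holding seq values `4, 5, 2, 3`, the scan picks index 1 as newest, so the next write lands at index 2 and reads walk back through indices 1, 0, 3.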


@@ -0,0 +1,48 @@
// firmware/lib/event_log/event_log.h
#pragma once
#include <stdint.h>
#include <stddef.h>
enum EventLogTag : uint8_t {
EVT_BOOT = 1, // data0 = esp_reset_reason() value
EVT_WIFI_UP = 2, // data0 = rssi (signed, cast)
EVT_WIFI_DOWN = 3, // data0 = disconnect reason code
EVT_HTTP_OK = 4, // data0 = path hash (fnv1a16), data1 = elapsed_ms
EVT_HTTP_FAIL = 5, // data0 = path hash, data1 = (http_code or negative errno)
EVT_HEARTBEAT_MISS = 6, // data0 = consecutive miss count
EVT_NTP_SYNC = 7, // data0 = seconds since boot
EVT_REBOOT = 8, // data0 = reason enum (defined below)
};
enum RebootReason : uint8_t {
REBOOT_HEARTBEAT_MISS = 1,
REBOOT_FACTORY_RESET = 2,
REBOOT_OTA = 3,
REBOOT_WIFI_REPROV = 4,
REBOOT_FATAL_CONFIG = 5,
REBOOT_FATAL_CAMERA = 6,
};
struct EventLogEntry {
uint32_t ts_unix; // 0 if NTP not synced yet; fall back to millis/1000
uint32_t uptime_s; // millis()/1000 at log time
uint16_t data0;
uint16_t data1;
uint8_t tag; // EventLogTag
uint32_t seq; // widened; survives multi-year event rates
uint8_t _pad[15]; // pad to 32 bytes for fixed slot size
} __attribute__((packed));
static_assert(sizeof(EventLogEntry) == 32, "EventLogEntry must be 32 bytes");
// NVS-backed 32-slot ring buffer. Safe to call before NTP sync.
// Call exactly once from application setup, before any task writes events.
void event_log_init();
// Safe to call from any FreeRTOS task after event_log_init().
// Bounded mutex wait (~50ms) — will silently skip on contention rather than
// block the calling task. Acceptable for diagnostic logging.
void event_log_write(EventLogTag tag, uint16_t data0 = 0, uint16_t data1 = 0);
// Same bounded-wait contract as event_log_write: returns 0 on mutex timeout.
size_t event_log_read_recent(EventLogEntry* out, size_t max_entries);
uint16_t event_log_path_hash(const char* path); // fnv1a16 — exposed for tests


@@ -14,12 +14,21 @@ static HString bytes_to_hex(const uint8_t* bytes, size_t len) {
return out;
}
static void hex_to_bytes(const HString& hex, uint8_t* out, size_t out_len) {
if (hex.length() % 2 != 0) return; // malformed — odd-length hex
for (size_t i = 0; i < out_len && (i * 2 + 1) < hex.length(); i++) {
char byte_str[3] = {hex[i*2], hex[i*2+1], 0};
static bool is_hex_char(char c) {
return (c >= '0' && c <= '9') ||
(c >= 'a' && c <= 'f') ||
(c >= 'A' && c <= 'F');
}
static bool hex_to_bytes(const HString& hex, uint8_t* out, size_t out_len) {
if (hex.length() != out_len * 2) return false;
for (size_t i = 0; i < out_len; i++) {
char a = hex[i*2], b = hex[i*2+1];
if (!is_hex_char(a) || !is_hex_char(b)) return false;
char byte_str[3] = {a, b, 0};
out[i] = (uint8_t)strtol(byte_str, nullptr, 16);
}
return true;
}
static bool sha256(const uint8_t* data, size_t len, uint8_t out[32]) {
@@ -52,10 +61,20 @@ HString hmac_sign(const HString& secret_hex,
snprintf(ts_buf, sizeof(ts_buf), "%u", (unsigned)timestamp);
HString message = method + "\n" + path + "\n" + ts_buf + "\n" + body_hash_hex;
// 3. Decode secret from hex
// 3. Decode secret from hex. Reject empty / odd-length / oversized /
// non-hex inputs — flash_device.py validates at provision time, but
// hmac_sign refuses to sign under a malformed key regardless of how it
// ended up in NVS (legacy provisioning, NVS corruption, etc.).
if (secret_hex.length() == 0 ||
secret_hex.length() > 128 ||
secret_hex.length() % 2 != 0) {
return HString{};
}
size_t secret_len = secret_hex.length() / 2;
uint8_t secret[64] = {};
hex_to_bytes(secret_hex, secret, secret_len);
if (!hex_to_bytes(secret_hex, secret, secret_len)) {
return HString{};
}
// 4. HMAC-SHA256(secret, message)
uint8_t hmac_result[32];
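The stricter hex decoding introduced in this hunk (exact length, hex characters only, boolean result) can be exercised off-device. The sketch below uses `std::string` in place of the project's `HString` and decodes nibbles directly instead of calling `strtol`. `hex_nibble` and `decode_hex_exact` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Returns the value of one hex digit, or -1 if c is not a hex digit.
int hex_nibble(char c) {
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

// Decodes exactly out_len bytes; fails on any length mismatch or non-hex char.
// As in the hunk above, bytes before the first bad character may already
// have been written to out when this returns false.
bool decode_hex_exact(const std::string& hex, uint8_t* out, size_t out_len) {
    assert(out != nullptr);
    if (hex.size() != out_len * 2) return false;
    for (size_t i = 0; i < out_len; i++) {
        const int hi = hex_nibble(hex[2 * i]);
        const int lo = hex_nibble(hex[2 * i + 1]);
        if (hi < 0 || lo < 0) return false;
        out[i] = static_cast<uint8_t>((hi << 4) | lo);
    }
    return true;
}
```

Callers should treat the output buffer as undefined on failure, which is why `hmac_sign` returns an empty string rather than signing with a partially decoded key.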


@@ -0,0 +1,6 @@
{
"name": "net_guard",
"build": {
"flags": ["-I$PROJECT_SRC_DIR"]
}
}


@@ -0,0 +1,142 @@
// firmware/lib/net_guard/net_guard.cpp
#include "net_guard.h"
uint32_t net_guard_next_backoff_ms(uint32_t attempt) {
if (attempt >= 6) return 60000;
return 1000u * (1u << attempt);
}
#ifdef ARDUINO
#include "config.h"
#include <WiFi.h>
#include <Arduino.h>
#include <lwip/dns.h>
#include <esp_netif.h>
#include "event_log.h"
// Both lwIP's ip_addr_t and esp-netif's esp_ip_addr_t share the same in-memory
// layout for IPv4, but the C++ types differ. Take the raw u32 to sidestep it.
static String fmt_v4(uint32_t addr_be) {
if (addr_be == 0) return String("0.0.0.0");
char b[16];
snprintf(b, sizeof(b), "%u.%u.%u.%u",
(unsigned)((addr_be >> 0) & 0xFF),
(unsigned)((addr_be >> 8) & 0xFF),
(unsigned)((addr_be >> 16) & 0xFF),
(unsigned)((addr_be >> 24) & 0xFF));
return String(b);
}
void net_guard_dump_dns(const char* tag) {
const ip_addr_t* d0 = dns_getserver(0);
const ip_addr_t* d1 = dns_getserver(1);
Serial.printf("[DNS] %s lwip: %s , %s\n", tag,
fmt_v4(d0 ? ip_2_ip4(d0)->addr : 0).c_str(),
fmt_v4(d1 ? ip_2_ip4(d1)->addr : 0).c_str());
esp_netif_t* sta = esp_netif_get_handle_from_ifkey("WIFI_STA_DEF");
if (sta) {
esp_netif_dns_info_t main_dns{}, backup_dns{};
esp_netif_get_dns_info(sta, ESP_NETIF_DNS_MAIN, &main_dns);
esp_netif_get_dns_info(sta, ESP_NETIF_DNS_BACKUP, &backup_dns);
Serial.printf("[DNS] %s netif: %s , %s\n", tag,
fmt_v4(main_dns.ip.u_addr.ip4.addr).c_str(),
fmt_v4(backup_dns.ip.u_addr.ip4.addr).c_str());
} else {
Serial.printf("[DNS] %s netif: <no STA handle>\n", tag);
}
}
void net_guard_pin_dns() {
ip_addr_t d1, d2;
IP_ADDR4(&d1, 1, 1, 1, 1);
IP_ADDR4(&d2, 8, 8, 8, 8);
dns_setserver(0, &d1);
dns_setserver(1, &d2);
// Also push through the esp_netif layer. dns_setserver() writes the
// global lwIP table directly; esp_netif_set_dns_info() is what the
// DHCP client itself calls, so writing here prevents the next DHCP
// event from silently overwriting our pin.
esp_netif_t* sta = esp_netif_get_handle_from_ifkey("WIFI_STA_DEF");
if (sta) {
esp_netif_dns_info_t info{};
IP_ADDR4(&info.ip, 1, 1, 1, 1);
esp_netif_set_dns_info(sta, ESP_NETIF_DNS_MAIN, &info);
IP_ADDR4(&info.ip, 8, 8, 8, 8);
esp_netif_set_dns_info(sta, ESP_NETIF_DNS_BACKUP, &info);
}
net_guard_dump_dns("pinned");
}
// Shared with the WiFi event task. 32-bit aligned loads/stores are atomic on
// Xtensa; volatile suffices. Tick re-evaluates every loop iteration, so stale
// reads self-correct within ~200ms.
static const DeviceConfig* s_cfg = nullptr;
static volatile uint8_t s_last_disconnect = 0;
static volatile bool s_up = false;
static volatile uint32_t s_attempts = 0;
static volatile uint32_t s_next_retry_ms = 0;
static void on_wifi_event(WiFiEvent_t event, WiFiEventInfo_t info) {
switch (event) {
case ARDUINO_EVENT_WIFI_STA_GOT_IP:
// Override DHCP-supplied DNS. Some routers return TC=1 for short
// answers (forcing TCP fallback that lwIP can't follow), or hand
// out an unreachable resolver. Pin to public resolvers so
// hostByName() never depends on the local network's DNS quality.
net_guard_pin_dns();
s_up = true;
s_attempts = 0;
s_next_retry_ms = 0;
event_log_write(EVT_WIFI_UP, (uint16_t)(int16_t)WiFi.RSSI(), 0);
break;
case ARDUINO_EVENT_WIFI_STA_DISCONNECTED:
s_up = false;
s_last_disconnect = (uint8_t)info.wifi_sta_disconnected.reason;
event_log_write(EVT_WIFI_DOWN, s_last_disconnect, 0);
s_next_retry_ms = millis() + net_guard_next_backoff_ms(s_attempts);
break;
default: break;
}
}
void net_guard_start(const DeviceConfig& cfg) {
s_cfg = &cfg;
// Seed s_up from the current WiFi state. setup()'s busy-wait on
// WiFi.begin() can produce a STA_GOT_IP before onEvent() is registered;
// without this seed, the first tick would force a spurious reconnect.
if (WiFi.status() == WL_CONNECTED) s_up = true;
WiFi.onEvent(on_wifi_event);
WiFi.setAutoReconnect(false); // we drive reconnect ourselves
}
bool net_guard_is_up() { return s_up; }
uint8_t net_guard_last_disconnect_reason() { return s_last_disconnect; }
extern "C" void net_guard_tick() {
// Watchdog against silent WiFi death: if we think we're up but the radio
// disagrees, force the DOWN state so reconnect scheduling kicks in.
if (s_up && WiFi.status() != WL_CONNECTED) {
s_up = false;
s_last_disconnect = 0xFF; // 0xFF = "silent death, no event"
event_log_write(EVT_WIFI_DOWN, s_last_disconnect, 0);
s_next_retry_ms = millis() + net_guard_next_backoff_ms(s_attempts);
}
if (s_up || s_cfg == nullptr) return;
// Wrap-safe: signed difference handles the ~49.7-day millis() wrap. The
// device is meant to run for months between reboots, so absolute compare
// (millis() < s_next_retry_ms) would either tight-loop retries across the
// wrap or stall them until millis() climbed back past an old high mark.
if ((int32_t)(millis() - s_next_retry_ms) < 0) return;
if (s_up) return; // re-check after the timing gate — closes GOT_IP-vs-tick race
s_attempts++;
// WiFi.begin() alone re-associates cleanly; a prior WiFi.disconnect() call
// synchronously emits STA_DISCONNECTED on the event task, which would
// double-log EVT_WIFI_DOWN (reason=ASSOC_LEAVE) on every retry.
WiFi.begin(s_cfg->wifi_ssid.c_str(), s_cfg->wifi_pass.c_str());
s_next_retry_ms = millis() + net_guard_next_backoff_ms(s_attempts);
}
#endif
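The wrap-safe comparison in `net_guard_tick` can be checked on a host with plain integers. This is a minimal sketch; `deadline_reached` is a hypothetical name, and the timestamps stand in for `millis()` values.

```cpp
#include <cassert>
#include <cstdint>

// True once now_ms has reached or passed deadline_ms, even if the 32-bit
// millisecond counter wrapped in between. The unsigned subtraction wraps
// modulo 2^32, and reinterpreting the result as signed gives the distance
// with the correct sign as long as the two values are less than 2^31 ms
// (about 24.8 days) apart.
bool deadline_reached(uint32_t now_ms, uint32_t deadline_ms) {
    return static_cast<int32_t>(now_ms - deadline_ms) >= 0;
}
```

With `now = 0xFFFFFF00` and a deadline one second later, the deadline value wraps to `0x000002E8`. A plain `now < deadline` test is false there, so the old code would treat the retry as already due, while the signed difference is negative and correctly keeps waiting. The backoff cap of 60 s is far inside the 2^31 ms validity window.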


@@ -0,0 +1,33 @@
// firmware/lib/net_guard/net_guard.h
#pragma once
#include <stdint.h>
// Exponential backoff: 1s, 2s, 4s, 8s, 16s, 32s, 60s, 60s, ...
// attempt 0 -> 1000ms, clamped at 60000ms.
uint32_t net_guard_next_backoff_ms(uint32_t attempt);
#ifdef ARDUINO
struct DeviceConfig; // forward-decl; only net_guard_start needs the full type
// Registers the WiFi.onEvent() handler and disables the WiFi library's
// built-in auto-reconnect; reconnects are then driven by net_guard_tick().
// Must be called once after WiFi.begin() succeeds.
void net_guard_start(const DeviceConfig& cfg);
// True iff WiFi is currently associated with IP.
bool net_guard_is_up();
// Last disconnect reason code from WIFI_EVENT_STA_DISCONNECTED (0 = none).
uint8_t net_guard_last_disconnect_reason();
// Non-blocking tick called from loop(); kicks reconnect if due.
extern "C" void net_guard_tick();
// Override DHCP-supplied DNS with public resolvers (1.1.1.1, 8.8.8.8).
// Idempotent; safe to call repeatedly. net_guard re-applies on every GOT_IP,
// but main.cpp must call it once for the boot association (which completes
// before net_guard_start() registers its event handler).
void net_guard_pin_dns();
// Diagnostic: print current DNS table state from both lwIP and esp_netif.
void net_guard_dump_dns(const char* tag);
#endif
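To see what the backoff schedule documented above means in wall-clock terms, the sketch below sums the waits. `backoff_ms` repeats the arithmetic of `net_guard_next_backoff_ms`; `total_wait_ms` is a hypothetical helper and ignores the time each `WiFi.begin()` attempt itself takes.

```cpp
#include <cassert>
#include <cstdint>

// Same schedule as net_guard_next_backoff_ms: 1 s doubling, capped at 60 s.
uint32_t backoff_ms(uint32_t attempt) {
    if (attempt >= 6) return 60000u;
    return 1000u * (1u << attempt);
}

// Sum of the waits for attempts 0 .. retries-1.
uint64_t total_wait_ms(uint32_t retries) {
    uint64_t total = 0;
    for (uint32_t a = 0; a < retries; a++) {
        total += backoff_ms(a);
    }
    return total;
}
```

The six doubling steps add up to 63 s (1 + 2 + 4 + 8 + 16 + 32); after that every further retry adds a flat 60 s, so eight retries take about 183 s of waiting.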


@@ -7,6 +7,7 @@ platform = espressif32@6.6.0
board = m5stack-timer-cam
framework = arduino
board_build.partitions = partitions_4mb_ota.csv
build_src_filter = +<*> -<main_capture.cpp>
build_flags =
-DBOARD_HAS_PSRAM
-mfix-esp32-psram-cache-issue
@@ -20,7 +21,26 @@ upload_flags = --no-stub
lib_deps =
tzapu/WiFiManager@^2.0.17
bblanchon/ArduinoJson@^7.0.0
h2zero/NimBLE-Arduino@^1.4.2
h2zero/NimBLE-Arduino@^2.0.0
espressif/esp32-camera
; Frame-capture build. Strips WiFi/BLE/CV/reporter; streams raw 96x96 frames
; over serial at 921600 baud for offline algorithm iteration.
[env:timercam-capture]
platform = espressif32@6.6.0
board = m5stack-timer-cam
framework = arduino
board_build.partitions = partitions_4mb_ota.csv
build_flags =
-DBOARD_HAS_PSRAM
-mfix-esp32-psram-cache-issue
-DCORE_DEBUG_LEVEL=0
-DCONFIG_SPIRAM_USE_MALLOC=1
build_src_filter = -<*> +<main_capture.cpp> +<camera.cpp>
monitor_speed = 460800
upload_speed = 115200
upload_flags = --no-stub
lib_deps =
espressif/esp32-camera
[env:native]


@@ -42,8 +42,8 @@ static String sha256_prefix(const String& input) {
return hex;
}
class ScanCallback : public NimBLEAdvertisedDeviceCallbacks {
void onResult(NimBLEAdvertisedDevice* dev) override {
class ScanCallback : public NimBLEScanCallbacks {
void onResult(const NimBLEAdvertisedDevice* dev) override {
String mac = String(dev->getAddress().toString().c_str());
String hash = sha256_prefix(mac);
int rssi = dev->getRSSI();
@@ -51,7 +51,6 @@ class ScanCallback : public NimBLEAdvertisedDeviceCallbacks {
std::lock_guard<std::mutex> lock(s_mutex);
auto it = s_seen.find(hash);
if (it == s_seen.end()) {
Serial.printf("[BLE] new device: %s (rssi %d)\n", hash.c_str(), rssi);
s_seen[hash] = {rssi, 1};
} else {
it->second.rssi_sum += rssi;
@@ -68,16 +67,16 @@ static NimBLEScan* s_scan = nullptr;
void ble_scanner_start() {
NimBLEDevice::init("");
s_scan = NimBLEDevice::getScan();
s_scan->setAdvertisedDeviceCallbacks(&s_callback, true); // true = allow duplicates
s_scan->setScanCallbacks(&s_callback, true); // true = allow duplicates
s_scan->setActiveScan(false); // passive
s_scan->setInterval(100);
s_scan->setWindow(99);
s_scan->setMaxResults(0); // don't store results — callback-only
s_scan->start(0, nullptr, false); // 0 = continuous
s_scan->start(0, false, false); // duration=0 (forever), isContinue=false, restart=false
}
void ble_scanner_pause() { if (s_scan) s_scan->stop(); }
void ble_scanner_resume() { if (s_scan) s_scan->start(0, nullptr, false); }
void ble_scanner_resume() { if (s_scan) s_scan->start(0, false, false); }
void ble_scanner_deinit() {
if (s_scan) s_scan->stop();


@@ -8,6 +8,10 @@
#include "cv.h"
#include "ble_scanner.h"
#include "reporter.h"
#include "event_log.h"
#include "net_guard.h"
#include <esp_system.h>
#include <esp_task_wdt.h>
// LED on GPIO2 (TimerCamera-F built-in LED) — verify against board schematic
// Factory reset: hold GPIO37 (BOOT button) for 5 seconds
@@ -15,6 +19,15 @@
#define BUTTON_PIN 37
#define FACTORY_RESET_HOLD_MS 5000
// BLE scanning disabled in production until the NimBLE-Arduino 1.4.2 timer
// race is resolved. Symptom: FreeRTOS timer task dispatches an
// os_callout_timer_cb whose callback fn is NULL, causing PC=0 fetch and
// Historical note: NimBLE-Arduino 1.4.2 had an init/fire race in its FreeRTOS
// callout porting layer that caused a NULL-fn dispatch (PC=0,
// InstrFetchProhibited) within ~1s of boot when the camera task starved the
// timer service. Fixed by upgrading to 2.x (see platformio.ini).
#define BLE_SCANNING_ENABLED 1
#define CAM_FPS 5
#define CAM_INTERVAL_MS (1000 / CAM_FPS)
#define REPORT_INTERVAL_S 3600
@@ -24,49 +37,77 @@ static DeviceConfig g_cfg;
static CVState g_cv;
static SemaphoreHandle_t s_cv_mutex = nullptr;
// LED: simple on/off — blink patterns can be added later
static void led_set(bool on) { digitalWrite(LED_PIN, on ? HIGH : LOW); }
// Detection blink. Blocks the calling task (vTaskDelay yields to other tasks)
// for ~60ms per pulse plus an 80ms gap between pulses. Saves and restores the
// current LED state so it doesn't clobber upload/no-wifi indicators.
static void led_blink_pattern(int pulses) {
bool prev = digitalRead(LED_PIN);
for (int i = 0; i < pulses; i++) {
led_set(true);
vTaskDelay(pdMS_TO_TICKS(60));
led_set(false);
if (i < pulses - 1) vTaskDelay(pdMS_TO_TICKS(80));
}
led_set(prev);
}
static void check_factory_reset() {
if (digitalRead(BUTTON_PIN) != LOW) return;
uint32_t held = millis();
while (digitalRead(BUTTON_PIN) == LOW) {
if (millis() - held >= FACTORY_RESET_HOLD_MS) {
event_log_write(EVT_REBOOT, REBOOT_FACTORY_RESET, 0);
config_clear_wifi();
ESP.restart();
}
delay(50);
esp_task_wdt_reset();
}
}
// Camera + CV task — runs on core 1 at 5 fps
static void task_camera(void*) {
static uint8_t frame[CV_PIXELS]; // static: avoids 9KB on task stack
int last_logged_track_id = 0; // diagnostic: log each new track once
esp_task_wdt_add(nullptr);
while (true) {
if (camera_capture_96(frame)) {
if (xSemaphoreTake(s_cv_mutex, pdMS_TO_TICKS(100)) == pdTRUE) {
CVResult r = cv_process(g_cv, frame, g_cfg.line_offset);
if (r.entries_delta) Serial.printf("[CV] entry +%d (total %d)\n", r.entries_delta, g_cv.entries);
if (r.exits_delta) Serial.printf("[CV] exit +%d (total %d)\n", r.exits_delta, g_cv.exits);
(void)last_logged_track_id;
if (r.entries_delta) Serial.printf("[CV] entry +%d (total %d) first=%.1f min=%.1f max=%.1f last=%.1f dur=%d\n",
r.entries_delta, g_cv.entries,
r.fire_first_c, r.fire_min_c, r.fire_max_c, r.fire_last_c, r.fire_duration);
if (r.exits_delta) Serial.printf("[CV] exit +%d (total %d) first=%.1f min=%.1f max=%.1f last=%.1f dur=%d\n",
r.exits_delta, g_cv.exits,
r.fire_first_c, r.fire_min_c, r.fire_max_c, r.fire_last_c, r.fire_duration);
xSemaphoreGive(s_cv_mutex);
if (r.entries_delta) led_blink_pattern(1);
if (r.exits_delta) led_blink_pattern(2);
}
}
vTaskDelay(pdMS_TO_TICKS(CAM_INTERVAL_MS));
esp_task_wdt_reset();
}
}
// Hourly reporter task — runs on core 0
static void task_reporter(void*) {
uint32_t last_report_ts = 0; // 0 = not initialized yet
esp_task_wdt_add(nullptr);
while (true) {
vTaskDelay(pdMS_TO_TICKS(10000)); // check every 10s
esp_task_wdt_reset();
uint32_t now = (uint32_t)(time(nullptr));
if (now < 1700000000UL) continue; // NTP not synced
// First valid timestamp — schedule boot report 60s from now
if (last_report_ts == 0) {
event_log_write(EVT_NTP_SYNC, (uint16_t)(millis() / 1000), 0);
last_report_ts = now - (REPORT_INTERVAL_S - BOOT_REPORT_DELAY_S);
continue;
}
@@ -78,7 +119,9 @@ static void task_reporter(void*) {
last_report_ts = now;
// Deinit BLE to free ~25KB heap for SSL handshakes
#if BLE_SCANNING_ENABLED
ble_scanner_deinit();
#endif
led_set(true); // on = uploading
CameraHourlyRecord cam_rec;
@@ -88,19 +131,41 @@ static void task_reporter(void*) {
xSemaphoreGive(s_cv_mutex);
} else {
// Failed to acquire — skip this cycle, will report next hour
#if BLE_SCANNING_ENABLED
ble_scanner_reinit();
#endif
led_set(false);
continue;
}
#if !BLE_SCANNING_ENABLED
BLEHourlyRecord ble_rec = {period_start, period_end, 0, 0};
#else
BLEHourlyRecord ble_rec = ble_scanner_collect(period_start, period_end);
#endif
reporter_submit_camera(g_cfg, cam_rec);
reporter_submit_ble(g_cfg, ble_rec);
reporter_heartbeat(g_cfg, millis() / 1000, WiFi.RSSI());
bool hb_ok = reporter_heartbeat(g_cfg, millis() / 1000, WiFi.RSSI());
#if BLE_SCANNING_ENABLED
ble_scanner_reinit();
#endif
led_set(false);
static uint8_t consecutive_misses = 0;
if (hb_ok) {
consecutive_misses = 0;
} else {
consecutive_misses++;
event_log_write(EVT_HEARTBEAT_MISS, consecutive_misses, 0);
Serial.printf("[WDG] heartbeat miss %u/6\n", consecutive_misses);
if (consecutive_misses >= 6) {
event_log_write(EVT_REBOOT, REBOOT_HEARTBEAT_MISS, 0);
delay(200); // let NVS commit before reboot
ESP.restart();
}
}
}
}
@@ -110,14 +175,26 @@ void setup() {
pinMode(BUTTON_PIN, INPUT_PULLUP);
led_set(true); // on = booting
event_log_init();
event_log_write(EVT_BOOT, (uint16_t)esp_reset_reason(), 0);
if (!config_load(g_cfg)) {
Serial.println("FATAL: device_id/location_id/hmac_secret not provisioned");
while (true) { delay(500); led_set(!digitalRead(LED_PIN)); } // fast blink
event_log_write(EVT_REBOOT, REBOOT_FATAL_CONFIG, 0);
// Blink fast for 3s so a physically present operator can see it, then
// reboot. Without config the device cannot heartbeat, so the failure
// never reaches the server; the visible signal is the repeating
// fast-blink-then-reboot cycle on the LED.
uint32_t t0 = millis();
while (millis() - t0 < 3000) { led_set(!digitalRead(LED_PIN)); delay(100); }
ESP.restart();
}
// Connect to WiFi
if (!config_has_wifi()) {
provisioning_run();
event_log_write(EVT_REBOOT, REBOOT_WIFI_REPROV, 0);
ESP.restart();
}
@@ -131,9 +208,16 @@ void setup() {
if (WiFi.status() != WL_CONNECTED) {
// Saved creds failed — re-provision
provisioning_run();
event_log_write(EVT_REBOOT, REBOOT_WIFI_REPROV, 0);
ESP.restart();
}
// Boot connect happens before net_guard registers its WiFi event handler,
// so the GOT_IP-driven DNS override there won't fire for this association.
// Pin DNS now; net_guard re-applies it on every subsequent reconnect.
net_guard_pin_dns();
net_guard_start(g_cfg);
led_set(false); // off = connected
// NTP sync (UTC)
@@ -143,38 +227,62 @@ void setup() {
if (!camera_init()) {
Serial.println("FATAL: camera init failed");
while (true) delay(1000);
event_log_write(EVT_REBOOT, REBOOT_FATAL_CAMERA, 0);
uint32_t t0 = millis();
while (millis() - t0 < 3000) { led_set(!digitalRead(LED_PIN)); delay(100); }
ESP.restart();
}
reporter_init();
#if BLE_SCANNING_ENABLED
ble_scanner_start();
#endif
// OTA update support
ArduinoOTA.setHostname(g_cfg.device_id.c_str());
#if !BLE_SCANNING_ENABLED
ArduinoOTA.onStart([]() { });
#else
ArduinoOTA.onStart([]() { ble_scanner_pause(); });
ArduinoOTA.onEnd([]() { ble_scanner_resume(); ESP.restart(); });
#endif
ArduinoOTA.onEnd([]() {
#if BLE_SCANNING_ENABLED
ble_scanner_resume();
#endif
event_log_write(EVT_REBOOT, REBOOT_OTA, 0);
ESP.restart();
});
#if !BLE_SCANNING_ENABLED
ArduinoOTA.onError([](ota_error_t e) { });
#else
ArduinoOTA.onError([](ota_error_t e) { ble_scanner_resume(); });
#endif
ArduinoOTA.begin();
s_cv_mutex = xSemaphoreCreateMutex();
// Task watchdog: 30s timeout, panic on trigger so we reboot and log
// via esp_reset_reason() in EVT_BOOT on the next boot.
esp_task_wdt_init(30, /*panic=*/true);
esp_task_wdt_add(nullptr); // subscribe the Arduino loopTask
xTaskCreatePinnedToCore(task_camera, "cam", 8192, nullptr, 2, nullptr, 1);
xTaskCreatePinnedToCore(task_reporter, "rep", 8192, nullptr, 1, nullptr, 0);
}
void loop() {
esp_task_wdt_reset();
ArduinoOTA.handle();
check_factory_reset();
net_guard_tick();
if (WiFi.status() != WL_CONNECTED) {
led_set(true); // on = no WiFi
WiFi.reconnect();
delay(5000);
if (WiFi.status() == WL_CONNECTED) {
led_set(false);
reporter_flush(g_cfg);
}
static bool s_was_up = true;
bool up = net_guard_is_up();
if (up != s_was_up) {
led_set(!up); // LED on when NOT up
if (up) reporter_flush(g_cfg);
s_was_up = up;
}
delay(1000);
delay(200);
}
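
Review note: the heartbeat watchdog in task_reporter reduces to a small piece of state, a count of consecutive misses that resets on any success and triggers a reboot at a threshold. The sketch below isolates that logic so it can be checked off-device. `MissCounter` and its members are hypothetical names for illustration, not part of the firmware; the saturating increment is an added precaution, not something the diff does.

```cpp
#include <cstdint>

// Hypothetical helper (not in the firmware): tracks consecutive heartbeat
// failures and tells the caller when the reboot threshold is reached.
struct MissCounter {
    uint8_t misses = 0;
    uint8_t limit;

    explicit MissCounter(uint8_t limit_) : limit(limit_) {}

    // Record one heartbeat result. Returns true when the caller should reboot.
    bool record(bool ok) {
        if (ok) {
            misses = 0;          // any success clears the streak
            return false;
        }
        if (misses < UINT8_MAX) misses++;   // saturate instead of wrapping
        return misses >= limit;
    }
};
```

With `limit` set to 6 and one heartbeat per report cycle (REPORT_INTERVAL_S is 3600), the reboot fires on the sixth consecutive failed cycle.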


@@ -0,0 +1,64 @@
// firmware/src/main_capture.cpp
//
// Frame-dump firmware. Replaces main.cpp when building env:timercam-capture.
// Streams raw 96x96 grayscale frames at 5 fps over serial (460800 baud) for
// offline algorithm iteration.
//
// Wire format per frame (little-endian):
// magic uint32 0x314D5246 ('FRM1' on the wire)
// frame_ix uint32 monotonic counter
// millis uint32 ms since boot
// pixels byte[9216] raw grayscale 96x96, row-major
//
// No WiFi, no BLE, no CV. Just camera → serial.
#include <Arduino.h>
#include "camera.h"
#include "cv.h" // for CV_PIXELS
#define LED_PIN 2
#define CAM_FPS 5
#define CAM_INTERVAL_MS (1000 / CAM_FPS)
// Magic chosen from bytes that commonly survive; 'FRM1' ascii.
// Avoid high bytes 0xA0-AF / 0xD0-DF — observed missing from the CH9102 stream.
static const uint32_t FRAME_MAGIC = 0x314D5246; // 'FRM1' little-endian on wire
void setup() {
Serial.begin(460800);
pinMode(LED_PIN, OUTPUT);
digitalWrite(LED_PIN, HIGH);
delay(500);
Serial.println("# capture-mode: 460800 baud, 96x96 gray @ 5fps");
Serial.flush();
if (!camera_init()) {
Serial.println("# FATAL: camera init failed");
while (true) {
digitalWrite(LED_PIN, !digitalRead(LED_PIN));
delay(200);
}
}
digitalWrite(LED_PIN, LOW);
}
void loop() {
static uint8_t frame[CV_PIXELS];
static uint32_t frame_ix = 0;
uint32_t t0 = millis();
if (camera_capture_96(frame)) {
uint32_t ms = millis();
Serial.write((uint8_t*)&FRAME_MAGIC, 4);
Serial.write((uint8_t*)&frame_ix, 4);
Serial.write((uint8_t*)&ms, 4);
Serial.write(frame, CV_PIXELS);
frame_ix++;
digitalWrite(LED_PIN, frame_ix & 1);
}
uint32_t elapsed = millis() - t0;
if (elapsed < CAM_INTERVAL_MS) delay(CAM_INTERVAL_MS - elapsed);
}
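
Review note: a host-side reader for this stream has to skip the `#` banner text, find the magic, and then decode the 12-byte header and 9216 pixel bytes. The following is a minimal sketch of that, assuming the little-endian layout and the `FRAME_MAGIC` value used in this file. `Frame`, `read_u32_le`, and `next_frame` are hypothetical names, and the caller is assumed to have already read the serial bytes into a buffer.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr uint32_t kFrameMagic  = 0x314D5246;  // bytes 'F','R','M','1' on the wire
constexpr size_t   kHeaderBytes = 12;          // magic + frame_ix + millis
constexpr size_t   kPixelBytes  = 96 * 96;     // raw grayscale, row-major

struct Frame {
    uint32_t index;         // monotonic frame counter
    uint32_t millis;        // device ms since boot
    const uint8_t* pixels;  // points into the caller's buffer
};

static uint32_t read_u32_le(const uint8_t* p) {
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

// Scans buf from pos for the next complete frame. On success fills out,
// advances pos past the frame, and returns true. Returns false when no
// complete frame remains in the buffer.
bool next_frame(const std::vector<uint8_t>& buf, size_t& pos, Frame& out) {
    const size_t total = kHeaderBytes + kPixelBytes;
    while (pos + total <= buf.size()) {
        if (read_u32_le(&buf[pos]) == kFrameMagic) {
            out.index  = read_u32_le(&buf[pos + 4]);
            out.millis = read_u32_le(&buf[pos + 8]);
            out.pixels = &buf[pos + kHeaderBytes];
            pos += total;
            return true;
        }
        pos++;  // not a frame start (banner text or dropped bytes): resync
    }
    return false;
}
```

Because a successful parse jumps over the whole frame, pixel data is only searched for the magic while resyncing. A gap in `index` between consecutive frames is a cheap way to detect dropped frames.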


@@ -1,12 +1,18 @@
// firmware/src/reporter.cpp
#include "reporter.h"
#include "hmac.h"
#include "event_log.h"
#include "net_guard.h"
#include <HTTPClient.h>
#include <ArduinoJson.h>
#include <WiFi.h>
#include <algorithm>
#include <vector>
#include <time.h>
#include <freertos/semphr.h>
#include <esp_task_wdt.h>
#include <esp_system.h>
#include <esp_heap_caps.h>
static std::vector<CameraHourlyRecord> s_cam_buf;
static std::vector<BLEHourlyRecord> s_ble_buf;
@@ -21,25 +27,127 @@ static uint32_t now_ts() {
return (uint32_t)time(nullptr);
}
static bool post_json(const DeviceConfig& cfg, const char* path, const String& body) {
// Last successfully resolved IP — used as a warm fallback if a subsequent
// resolution fails. Never takes precedence over a fresh successful resolve.
static IPAddress s_cached_api_ip;
// Resolve the API host. Tries hostByName first; on failure falls back to the
// last good resolution, then to the hardcoded fallback IP. Returns the IP via
// out-param and a label describing where it came from for logging.
static bool resolve_api_ip(IPAddress& out, const char*& source) {
IPAddress ip;
uint32_t r0 = millis();
bool ok = WiFi.hostByName(REPORTER_API_HOST_NAME, ip);
uint32_t elapsed = millis() - r0;
if (ok) {
s_cached_api_ip = ip;
out = ip;
source = "dns";
Serial.printf("[DNS] %s -> %s (%u ms)\n",
REPORTER_API_HOST_NAME, ip.toString().c_str(), (unsigned)elapsed);
return true;
}
Serial.printf("[DNS] %s -> FAIL (%u ms)\n",
REPORTER_API_HOST_NAME, (unsigned)elapsed);
net_guard_dump_dns("on-fail");
net_guard_pin_dns(); // re-assert in case something overwrote the table
if ((uint32_t)s_cached_api_ip != 0) {
out = s_cached_api_ip;
source = "cache";
return true;
}
if (out.fromString(REPORTER_API_FALLBACK_IP)) {
source = "fallback";
return true;
}
return false;
}
// Drains and parses the HTTP response status line. Returns the numeric status
// code, or -1 on read timeout / malformed response.
static int read_http_status(WiFiClient& client, uint32_t timeout_ms) {
uint32_t start = millis();
// Elapsed-time comparison stays correct across the millis() wrap.
while (!client.available() && millis() - start < timeout_ms) vTaskDelay(pdMS_TO_TICKS(10));
if (!client.available()) return -1;
String line = client.readStringUntil('\n');
line.trim();
// Format: "HTTP/1.1 200 OK"
int sp1 = line.indexOf(' ');
if (sp1 < 0) return -1;
int sp2 = line.indexOf(' ', sp1 + 1);
String code_str = (sp2 > 0) ? line.substring(sp1 + 1, sp2) : line.substring(sp1 + 1);
int code = code_str.toInt(); // toInt() yields 0 when the text is not numeric
return (code > 0) ? code : -1;
}
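
Review note: any timeout measured against millis(), such as the response wait in read_http_status, has to survive the 32-bit counter wrapping at about 49.7 days. A minimal sketch of the two ways to write the check follows. Both function names are hypothetical, and the timestamps are passed in as parameters so the logic can be exercised without hardware.

```cpp
#include <cstdint>

// Wrap-safe: unsigned subtraction gives the true elapsed time even when
// now_ms has wrapped past zero since start_ms (valid for spans under 2^32 ms).
bool timed_out(uint32_t now_ms, uint32_t start_ms, uint32_t timeout_ms) {
    return (uint32_t)(now_ms - start_ms) >= timeout_ms;
}

// Naive form, shown only for contrast: the deadline itself can wrap to a
// small number, which makes the check report a timeout immediately.
bool timed_out_naive(uint32_t now_ms, uint32_t start_ms, uint32_t timeout_ms) {
    uint32_t deadline = start_ms + timeout_ms;
    return now_ms >= deadline;
}
```

For a wait that starts 16 ms before the wrap with a 100 ms timeout, the naive form reports a timeout on the first check, while the subtraction form waits the full 100 ms.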
static bool post_json_once(const DeviceConfig& cfg, const char* path, const String& body) {
uint32_t ts = now_ts();
// Reject if NTP hasn't synced yet (timestamp would be near epoch 0)
if (ts < 1700000000UL) return false; // pre-2023 → clock not valid
if (ts < 1700000000UL) return false;
String sig = hmac_sign(cfg.hmac_secret, "POST", path, ts, body);
if (sig.isEmpty()) return false; // HMAC failed
if (sig.isEmpty()) return false;
HTTPClient http;
String url = String(REPORTER_API_HOST) + path;
http.begin(url);
http.addHeader("Content-Type", "application/json");
http.addHeader("X-Device-Id", cfg.device_id);
http.addHeader("X-Timestamp", String(ts));
http.addHeader("X-Signature", sig);
IPAddress ip;
const char* ip_source = "?";
if (!resolve_api_ip(ip, ip_source)) {
Serial.printf("[HTTP] POST %s -> resolve-fail\n", path);
event_log_write(EVT_HTTP_FAIL, event_log_path_hash(path), (uint16_t)-1);
return false;
}
int code = http.POST(body);
http.end();
Serial.printf("[HTTP] POST %s → %d\n", url.c_str(), code);
return (code == 200);
uint32_t t0 = millis();
WiFiClient client;
client.setTimeout(10); // seconds — read timeout
if (!client.connect(ip, REPORTER_API_PORT, 5000 /*ms connect timeout*/)) {
uint32_t elapsed = millis() - t0;
Serial.printf("[HTTP] connect %s:%u (%s) -> failed (%u ms)\n",
ip.toString().c_str(), REPORTER_API_PORT, ip_source, (unsigned)elapsed);
event_log_write(EVT_HTTP_FAIL, event_log_path_hash(path), (uint16_t)-1);
return false;
}
// Manual HTTP/1.1 — gives us full control over the Host header so the
// server's vhost routing works even when we connect by IP.
client.printf("POST %s HTTP/1.1\r\n", path);
client.printf("Host: %s\r\n", REPORTER_API_HOST_NAME);
client.print ("Connection: close\r\n");
client.print ("Content-Type: application/json\r\n");
client.printf("Content-Length: %u\r\n", (unsigned)body.length());
client.printf("X-Device-Id: %s\r\n", cfg.device_id.c_str());
client.printf("X-Timestamp: %u\r\n", (unsigned)ts);
client.printf("X-Signature: %s\r\n", sig.c_str());
client.print ("\r\n");
client.print(body);
int code = read_http_status(client, 10000);
// Drain so the server can close cleanly.
while (client.connected() && client.available()) client.read();
client.stop();
uint32_t elapsed = millis() - t0;
uint16_t phash = event_log_path_hash(path);
Serial.printf("[HTTP] POST %s%s (%s %s) -> %d (%u ms)\n",
REPORTER_API_HOST_NAME, path, ip_source, ip.toString().c_str(),
code, (unsigned)elapsed);
if (code == 200) {
event_log_write(EVT_HTTP_OK, phash, (uint16_t)((elapsed > 65535) ? 65535 : elapsed));
return true;
}
event_log_write(EVT_HTTP_FAIL, phash, (uint16_t)code);
return false;
}
static bool post_json(const DeviceConfig& cfg, const char* path, const String& body) {
// 3 attempts. Worst case per call: 3 × (5s connect + 10s response) + 0 + 2 + 5 = 52s.
// TWDT is fed before the backoff delay and before each attempt so the 30s
// timeout doesn't fire mid-sequence.
static const uint16_t DELAYS_MS[] = { 0, 2000, 5000 };
for (int i = 0; i < 3; i++) {
esp_task_wdt_reset();
if (DELAYS_MS[i]) vTaskDelay(pdMS_TO_TICKS(DELAYS_MS[i]));
esp_task_wdt_reset();
if (post_json_once(cfg, path, body)) return true;
}
return false;
}
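
Review note: the 52s figure and the watchdog claim in the comment can be checked at compile time. This sketch takes the timeout and backoff values from post_json_once and post_json; the constant and function names are hypothetical. It does not count time spent in resolve_api_ip or in writing the request, so it is a lower bound on the real worst case.

```cpp
// Timeouts and backoff delays as used in post_json_once / post_json.
constexpr unsigned kConnectTimeoutMs  = 5000;
constexpr unsigned kResponseTimeoutMs = 10000;
constexpr unsigned kBackoffMs[]       = {0, 2000, 5000};

// Every backoff delay plus a full connect + response wait for each attempt.
constexpr unsigned worst_case_ms() {
    unsigned total = 0;
    for (unsigned d : kBackoffMs) total += d + kConnectTimeoutMs + kResponseTimeoutMs;
    return total;
}

// Longest stretch between two watchdog feeds: either one backoff delay or
// one full attempt, whichever is larger.
constexpr unsigned longest_unfed_ms() {
    unsigned longest = kConnectTimeoutMs + kResponseTimeoutMs;
    for (unsigned d : kBackoffMs) {
        if (d > longest) longest = d;
    }
    return longest;
}

static_assert(worst_case_ms() == 52000, "matches the 52s quoted in the comment");
static_assert(longest_unfed_ms() < 30000, "stays under the 30s TWDT timeout");
```

The second assertion is the one that matters for the watchdog: the total can exceed 30s safely as long as no single unfed stretch does.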
static String build_camera_batch(const DeviceConfig& cfg,
@@ -147,16 +255,36 @@ void reporter_submit_ble(const DeviceConfig& cfg, const BLEHourlyRecord& rec) {
}
}
void reporter_heartbeat(const DeviceConfig& cfg, uint32_t uptime_s, int wifi_rssi) {
bool reporter_heartbeat(const DeviceConfig& cfg, uint32_t uptime_s, int wifi_rssi) {
JsonDocument doc;
doc["device_id"] = cfg.device_id;
doc["firmware_version"] = "1.0.0";
doc["firmware_version"] = "1.1.0";
doc["free_storage_pct"] = 100;
doc["wifi_rssi"] = wifi_rssi;
doc["pending_records"] = (int)(s_cam_buf.size() + s_ble_buf.size());
doc["uptime_seconds"] = uptime_s;
// Diagnostics (new in 1.1.0)
doc["reset_reason"] = (int)esp_reset_reason();
doc["heap_free"] = (int)esp_get_free_heap_size();
doc["heap_min_free"] = (int)esp_get_minimum_free_heap_size();
doc["last_disconnect_code"] = (int)net_guard_last_disconnect_reason();
// Last 8 event-log entries, newest first
EventLogEntry recent[8];
size_t n = event_log_read_recent(recent, 8);
JsonArray evs = doc["recent_events"].to<JsonArray>();
for (size_t i = 0; i < n; i++) {
JsonObject e = evs.add<JsonObject>();
e["t"] = recent[i].tag;
e["d0"] = recent[i].data0;
e["d1"] = recent[i].data1;
e["ts"] = recent[i].ts_unix;
e["up"] = recent[i].uptime_s;
}
String body; serializeJson(doc, body);
post_json(cfg, "/api/v1/heartbeat", body);
return post_json(cfg, "/api/v1/heartbeat", body);
}
void reporter_flush(const DeviceConfig& cfg) {
@@ -169,7 +297,10 @@ void reporter_flush(const DeviceConfig& cfg) {
String body = build_camera_batch(cfg, cam_snap);
if (post_json(cfg, "/api/v1/camera/events/batch", body)) {
xSemaphoreTake(s_buf_mutex, portMAX_DELAY);
s_cam_buf.clear();
// Erase only the prefix we snapshotted; FIFO append from
// submit_camera during the in-flight POST stays buffered.
size_t n = std::min(cam_snap.size(), s_cam_buf.size());
s_cam_buf.erase(s_cam_buf.begin(), s_cam_buf.begin() + n);
xSemaphoreGive(s_buf_mutex);
}
}
@@ -177,7 +308,8 @@ void reporter_flush(const DeviceConfig& cfg) {
String body = build_ble_batch(cfg, ble_snap);
if (post_json(cfg, "/api/v1/events/batch", body)) {
xSemaphoreTake(s_buf_mutex, portMAX_DELAY);
s_ble_buf.clear();
size_t n = std::min(ble_snap.size(), s_ble_buf.size());
s_ble_buf.erase(s_ble_buf.begin(), s_ble_buf.begin() + n);
xSemaphoreGive(s_buf_mutex);
}
}
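
Review note: the change from clear() to a prefix erase matters when a record is appended while the POST is in flight. A minimal sketch of the idea on a plain vector follows. `drop_sent_prefix` is a hypothetical helper, and the sketch assumes the buffer only grows by appending between the snapshot and the erase. If something else trims the front in that window, the prefix no longer lines up with what was sent.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Removes the first `sent` elements from buf and leaves anything appended
// after the snapshot in place. Clamps to the current size so a shorter
// buffer is simply emptied.
template <typename T>
void drop_sent_prefix(std::vector<T>& buf, std::size_t sent) {
    const std::size_t n = std::min(sent, buf.size());
    buf.erase(buf.begin(), buf.begin() + static_cast<std::ptrdiff_t>(n));
}
```

Usage mirrors reporter_flush: snapshot the buffer under the mutex, post the snapshot without holding it, then retake the mutex and call `drop_sent_prefix(buf, snapshot.size())` only if the POST succeeded.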


@@ -11,11 +11,16 @@ struct CameraHourlyRecord {
int exits;
};
static const int REPORTER_MAX_BUFFER = 24;
static const char* REPORTER_API_HOST = "http://logs.research.bike";
static const int REPORTER_MAX_BUFFER = 24;
static const char* REPORTER_API_HOST_NAME = "logs.research.bike";
static const uint16_t REPORTER_API_PORT = 80;
// Hardcoded fallback used when DNS fails (some customer networks intercept
// :53 with a transparent proxy that mangles responses). Update if the
// server's IP changes — but a successful hostByName() always wins over this.
static const char* REPORTER_API_FALLBACK_IP = "5.78.114.131";
void reporter_init();
void reporter_submit_camera(const DeviceConfig& cfg, const CameraHourlyRecord& rec);
void reporter_submit_ble(const DeviceConfig& cfg, const BLEHourlyRecord& rec);
void reporter_heartbeat(const DeviceConfig& cfg, uint32_t uptime_s, int wifi_rssi);
bool reporter_heartbeat(const DeviceConfig& cfg, uint32_t uptime_s, int wifi_rssi);
void reporter_flush(const DeviceConfig& cfg);
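
Review note: the lookup order behind REPORTER_API_FALLBACK_IP is fresh DNS result first, then the last good result, then the hardcoded address. The sketch below expresses that as a pure function. It uses plain `uint32_t` addresses in place of `IPAddress`, treats 0 as "not available", and all names are hypothetical.

```cpp
#include <cstdint>

enum class IpSource { Dns, Cache, Fallback, None };

struct Resolution {
    IpSource source;
    uint32_t ip;  // 0 when source is None
};

// Picks the address to connect to. A successful DNS lookup always wins and
// refreshes the cache; the cache and the hardcoded fallback are consulted
// only when the lookup failed.
Resolution pick_api_ip(bool dns_ok, uint32_t dns_ip,
                       uint32_t& cached_ip, uint32_t fallback_ip) {
    if (dns_ok) {
        cached_ip = dns_ip;
        return {IpSource::Dns, dns_ip};
    }
    if (cached_ip != 0) return {IpSource::Cache, cached_ip};
    if (fallback_ip != 0) return {IpSource::Fallback, fallback_ip};
    return {IpSource::None, 0};
}
```

Returning the source alongside the address is what lets the log line show whether a request went out via DNS, cache, or fallback.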


@@ -7,160 +7,290 @@ static void fill_frame(uint8_t* f, uint8_t val) {
memset(f, val, CV_PIXELS);
}
// Draw a rectangular walker-blob spanning rows [y0, y1], columns [cx-hw, cx+hw].
// Pixel value 200 over background 100 -> frame_diff threshold (30) is cleared.
static void draw_walker(uint8_t* f, int y0, int y1, int cx, int hw) {
fill_frame(f, 100);
for (int y = y0; y <= y1; y++) {
if (y < 0 || y >= CV_H) continue;
for (int x = cx - hw; x <= cx + hw; x++) {
if (x < 0 || x >= CV_W) continue;
f[y * CV_W + x] = 200;
}
}
}
static void prime_bg(CVState& state) {
uint8_t bg[CV_PIXELS];
fill_frame(bg, 100);
cv_process(state, bg, 50);
}
// Let the event state machine see QUIET_FRAMES+1 empty frames so any active
// event finalizes before the next test assertion.
static void quiesce(CVState& state) {
uint8_t bg[CV_PIXELS]; fill_frame(bg, 100);
for (int i = 0; i < CV_EVENT_QUIET_FRAMES + 1; i++) cv_process(state, bg, 50);
}
void setUp(void) {}
void tearDown(void) {}
void test_frame_diff_no_change_gives_no_fg() {
CVState state;
cv_init(state);
uint8_t frame[CV_PIXELS];
fill_frame(frame, 128);
void test_no_change_no_event() {
CVState state; cv_init(state);
uint8_t frame[CV_PIXELS]; fill_frame(frame, 128);
CVResult r1 = cv_process(state, frame, 50);
TEST_ASSERT_EQUAL_INT(0, r1.entries_delta);
CVResult r2 = cv_process(state, frame, 50);
TEST_ASSERT_EQUAL_INT(0, r2.entries_delta);
TEST_ASSERT_EQUAL_INT(0, r2.exits_delta);
}
void test_frame_diff_large_change_detected_no_crash() {
CVState state;
cv_init(state);
uint8_t bg[CV_PIXELS], fg_frame[CV_PIXELS];
fill_frame(bg, 100);
fill_frame(fg_frame, 200);
cv_process(state, bg, 50);
CVResult r = cv_process(state, fg_frame, 50);
// Tracking not yet implemented — just verify no crash and result is zero
TEST_ASSERT_EQUAL_INT(0, r.entries_delta);
TEST_ASSERT_EQUAL_INT(0, r.exits_delta);
}
void test_cv_init_clears_state() {
CVState state;
state.entries = 99; state.exits = 88;
state.entries = 99; state.exits = 88; state.event_active = true;
cv_init(state);
TEST_ASSERT_EQUAL_INT(0, state.entries);
TEST_ASSERT_EQUAL_INT(0, state.exits);
TEST_ASSERT_FALSE(state.bg_valid);
TEST_ASSERT_FALSE(state.event_active);
}
void test_cv_reset_counts() {
CVState state;
cv_init(state);
state.entries = 5;
state.exits = 3;
CVState state; cv_init(state);
state.entries = 5; state.exits = 3;
cv_reset_counts(state);
TEST_ASSERT_EQUAL_INT(0, state.entries);
TEST_ASSERT_EQUAL_INT(0, state.exits);
}
void test_tracking_spawns_track_for_new_blob() {
CVState state;
cv_init(state);
void test_walker_up_through_frame_is_entry() {
// Simulate a walker traversing from bottom to top of frame.
// Per-frame fg_count and centroid (11-wide column, height H -> n=11*H):
// t0 y=60..95 n=396 c=77 <- event starts (n >= ENTER=300)
// t1 y=30..95 n=726 c=62
// t2 y=0..95 n=1056 c=47
// t3 y=0..60 n=671 c=30
// t4 y=0..25 n=286 c=12 (below EXIT=200, quiet=1)
// t5 y=0..10 n=121 c=5 (below EXIT, quiet=2)
// t6 empty quiet=3 -> finalize
CVState state; cv_init(state);
prime_bg(state);
uint8_t bg[CV_PIXELS];
fill_frame(bg, 100);
cv_process(state, bg, 50); // init background
// Frame with a bright 30x30 blob in top-left quadrant
uint8_t blob_frame[CV_PIXELS];
fill_frame(blob_frame, 100);
for (int y = 5; y < 35; y++)
for (int x = 5; x < 35; x++)
blob_frame[y * CV_W + x] = 200;
cv_process(state, blob_frame, 50);
TEST_ASSERT_EQUAL_INT(1, (int)state.tracks.size());
TEST_ASSERT_FLOAT_WITHIN(5.0f, 20.0f, state.tracks[0].x);
TEST_ASSERT_FLOAT_WITHIN(5.0f, 20.0f, state.tracks[0].y);
}
static void make_blob_frame(uint8_t* f, int cx, int cy) {
fill_frame(f, 100);
for (int y = cy - 12; y <= cy + 12; y++)
for (int x = cx - 12; x <= cx + 12; x++)
if (y >= 0 && y < CV_H && x >= 0 && x < CV_W)
f[y * CV_W + x] = 200;
}
void test_blob_crossing_line_top_to_bottom_is_entry() {
CVState state;
cv_init(state);
// Line at 50% = y=48; step ≤14px per frame to stay within CV_MAX_MOVE
uint8_t bg[CV_PIXELS];
fill_frame(bg, 100);
cv_process(state, bg, 50); // init background
// Walk blob from y=20 toward line; crossing occurs at y=48 (above→below)
// Stop at crossing frame and assert its result
int setup[] = {20, 34};
for (int i = 0; i < 2; i++) {
uint8_t f[CV_PIXELS]; make_blob_frame(f, 48, setup[i]);
int rows[][2] = {{60,95},{30,95},{0,95},{0,60},{0,25},{0,10}};
for (int i = 0; i < 6; i++) {
uint8_t f[CV_PIXELS]; draw_walker(f, rows[i][0], rows[i][1], 48, 5);
cv_process(state, f, 50);
}
uint8_t fcross[CV_PIXELS]; make_blob_frame(fcross, 48, 48);
CVResult r = cv_process(state, fcross, 50);
quiesce(state);
TEST_ASSERT_EQUAL_INT(1, r.entries_delta);
TEST_ASSERT_EQUAL_INT(0, r.exits_delta);
TEST_ASSERT_EQUAL_INT(1, state.entries);
TEST_ASSERT_EQUAL_INT(0, state.exits);
}
void test_walker_down_through_frame_is_exit() {
CVState state; cv_init(state);
prime_bg(state);
int rows[][2] = {{0,35},{0,65},{0,95},{35,95},{70,95},{85,95}};
for (int i = 0; i < 6; i++) {
uint8_t f[CV_PIXELS]; draw_walker(f, rows[i][0], rows[i][1], 48, 5);
cv_process(state, f, 50);
}
quiesce(state);
TEST_ASSERT_EQUAL_INT(0, state.entries);
TEST_ASSERT_EQUAL_INT(1, state.exits);
}
void test_approach_retreat_without_full_extent_does_not_fire() {
// Walker approaches from bottom, reaches y=30, retreats, never reaches top.
// Extent gate requires min_y_seen <= 10; this event tops out at y=30 so
// extent never clears and no fire occurs regardless of trajectory score.
CVState state; cv_init(state);
prime_bg(state);
int rows[][2] = {{60,95},{40,95},{30,95},{40,95},{60,95},{80,95}};
for (int i = 0; i < 6; i++) {
uint8_t f[CV_PIXELS]; draw_walker(f, rows[i][0], rows[i][1], 48, 5);
cv_process(state, f, 50);
}
quiesce(state);
TEST_ASSERT_EQUAL_INT(0, state.entries);
TEST_ASSERT_EQUAL_INT(0, state.exits);
}
void test_brief_burst_below_min_duration_does_not_fire() {
// One frame of large fg, then gone. Event starts, immediately quiesces,
// duration ends up below CV_EVENT_MIN_FRAMES.
CVState state; cv_init(state);
prime_bg(state);
uint8_t f[CV_PIXELS]; draw_walker(f, 0, 95, 48, 5);
cv_process(state, f, 50);
quiesce(state);
TEST_ASSERT_EQUAL_INT(0, state.entries);
TEST_ASSERT_EQUAL_INT(0, state.exits);
}
void test_stationary_large_blob_does_not_fire() {
// Static large blob in frame for many frames, then removed. Centroid
// never moves -> MIN_TRAJ gate blocks fire.
CVState state; cv_init(state);
prime_bg(state);
for (int i = 0; i < 10; i++) {
uint8_t f[CV_PIXELS]; draw_walker(f, 0, 95, 48, 5);
cv_process(state, f, 50);
}
quiesce(state);
TEST_ASSERT_EQUAL_INT(0, state.entries);
TEST_ASSERT_EQUAL_INT(0, state.exits);
}
// Wait out the refractory period with bg-only frames so the next walker
// event is accepted.
static void wait_refractory(CVState& state) {
uint8_t bg[CV_PIXELS]; fill_frame(bg, 100);
for (uint32_t i = 0; i < CV_EVENT_REFRACTORY_FRAMES + 2; i++) {
cv_process(state, bg, 50);
}
}
void test_two_sequential_walkers_count_twice() {
CVState state; cv_init(state);
prime_bg(state);
int rows[][2] = {{60,95},{30,95},{0,95},{0,60},{0,25},{0,10}};
for (int i = 0; i < 6; i++) {
uint8_t f[CV_PIXELS]; draw_walker(f, rows[i][0], rows[i][1], 48, 5);
cv_process(state, f, 50);
}
quiesce(state);
wait_refractory(state);
for (int i = 0; i < 6; i++) {
uint8_t f[CV_PIXELS]; draw_walker(f, rows[i][0], rows[i][1], 48, 5);
cv_process(state, f, 50);
}
quiesce(state);
TEST_ASSERT_EQUAL_INT(2, state.entries);
TEST_ASSERT_EQUAL_INT(0, state.exits);
}
void test_full_reversal_counts_entry_then_exit() {
CVState state; cv_init(state);
prime_bg(state);
int up_rows[][2] = {{60,95},{30,95},{0,95},{0,60},{0,25},{0,10}};
int down_rows[][2] = {{0,35},{0,65},{0,95},{35,95},{70,95},{85,95}};
for (int i = 0; i < 6; i++) {
uint8_t f[CV_PIXELS]; draw_walker(f, up_rows[i][0], up_rows[i][1], 48, 5);
cv_process(state, f, 50);
}
quiesce(state);
wait_refractory(state);
for (int i = 0; i < 6; i++) {
uint8_t f[CV_PIXELS]; draw_walker(f, down_rows[i][0], down_rows[i][1], 48, 5);
cv_process(state, f, 50);
}
quiesce(state);
TEST_ASSERT_EQUAL_INT(1, state.entries);
TEST_ASSERT_EQUAL_INT(1, state.exits);
}
void test_refractory_suppresses_back_to_back_fire() {
// After a fire, a second event attempted within CV_EVENT_REFRACTORY_FRAMES
// is suppressed. Simulates walker lingering / ghost re-triggering.
CVState state; cv_init(state);
prime_bg(state);
int rows[][2] = {{60,95},{30,95},{0,95},{0,60},{0,25},{0,10}};
for (int i = 0; i < 6; i++) {
uint8_t f[CV_PIXELS]; draw_walker(f, rows[i][0], rows[i][1], 48, 5);
cv_process(state, f, 50);
}
quiesce(state);
TEST_ASSERT_EQUAL_INT(1, state.entries);
// Immediate second walker within refractory window — should NOT count.
for (int i = 0; i < 6; i++) {
uint8_t f[CV_PIXELS]; draw_walker(f, rows[i][0], rows[i][1], 48, 5);
cv_process(state, f, 50);
}
quiesce(state);
TEST_ASSERT_EQUAL_INT(1, state.entries);
}
void test_blob_crossing_line_bottom_to_top_is_exit() {
CVState state;
cv_init(state);
void test_event_counts_after_refractory_expires() {
CVState state; cv_init(state);
prime_bg(state);
uint8_t bg[CV_PIXELS]; fill_frame(bg, 100);
cv_process(state, bg, 50);
// Walk blob from y=76 toward line; crossing occurs at y=34 (below→above)
// Stop at crossing frame and assert its result
int setup[] = {76, 62, 48};
for (int i = 0; i < 3; i++) {
uint8_t f[CV_PIXELS]; make_blob_frame(f, 48, setup[i]);
int rows[][2] = {{60,95},{30,95},{0,95},{0,60},{0,25},{0,10}};
for (int i = 0; i < 6; i++) {
uint8_t f[CV_PIXELS]; draw_walker(f, rows[i][0], rows[i][1], 48, 5);
cv_process(state, f, 50);
}
uint8_t fcross[CV_PIXELS]; make_blob_frame(fcross, 48, 34);
CVResult r = cv_process(state, fcross, 50);
quiesce(state);
TEST_ASSERT_EQUAL_INT(1, state.entries);
TEST_ASSERT_EQUAL_INT(0, r.entries_delta);
TEST_ASSERT_EQUAL_INT(1, r.exits_delta);
// Wait out the refractory period.
uint8_t bg[CV_PIXELS]; fill_frame(bg, 100);
for (uint32_t i = 0; i < CV_EVENT_REFRACTORY_FRAMES + 2; i++) {
cv_process(state, bg, 50);
}
// Second walker — should now count.
for (int i = 0; i < 6; i++) {
uint8_t f[CV_PIXELS]; draw_walker(f, rows[i][0], rows[i][1], 48, 5);
cv_process(state, f, 50);
}
quiesce(state);
TEST_ASSERT_EQUAL_INT(2, state.entries);
}
void test_no_crossing_same_side_no_count() {
CVState state;
cv_init(state);
void test_noise_below_enter_thresh_does_not_start_event() {
// Tiny 5x5 blob (25 px) never crosses ENTER=300, event never starts.
CVState state; cv_init(state);
prime_bg(state);
uint8_t bg[CV_PIXELS]; fill_frame(bg, 100);
cv_process(state, bg, 50);
auto small = [](uint8_t* f, int cy) {
fill_frame(f, 100);
for (int y = cy-2; y <= cy+2; y++)
for (int x = 46; x <= 50; x++)
if (y>=0 && y<CV_H && x>=0 && x<CV_W) f[y*CV_W+x] = 200;
};
for (int cy = 10; cy <= 90; cy += 8) {
uint8_t f[CV_PIXELS]; small(f, cy);
cv_process(state, f, 50);
}
quiesce(state);
uint8_t f1[CV_PIXELS]; make_blob_frame(f1, 48, 20); // above line
cv_process(state, f1, 50);
uint8_t f2[CV_PIXELS]; make_blob_frame(f2, 48, 30); // still above line, moved closer
CVResult r = cv_process(state, f2, 50);
TEST_ASSERT_EQUAL_INT(0, r.entries_delta);
TEST_ASSERT_EQUAL_INT(0, r.exits_delta);
TEST_ASSERT_EQUAL_INT(0, state.entries);
TEST_ASSERT_EQUAL_INT(0, state.exits);
}
int main() {
UNITY_BEGIN();
RUN_TEST(test_frame_diff_no_change_gives_no_fg);
RUN_TEST(test_frame_diff_large_change_detected_no_crash);
RUN_TEST(test_no_change_no_event);
RUN_TEST(test_cv_init_clears_state);
RUN_TEST(test_cv_reset_counts);
RUN_TEST(test_tracking_spawns_track_for_new_blob);
RUN_TEST(test_blob_crossing_line_top_to_bottom_is_entry);
RUN_TEST(test_blob_crossing_line_bottom_to_top_is_exit);
RUN_TEST(test_no_crossing_same_side_no_count);
RUN_TEST(test_walker_up_through_frame_is_entry);
RUN_TEST(test_walker_down_through_frame_is_exit);
RUN_TEST(test_approach_retreat_without_full_extent_does_not_fire);
RUN_TEST(test_brief_burst_below_min_duration_does_not_fire);
RUN_TEST(test_stationary_large_blob_does_not_fire);
RUN_TEST(test_two_sequential_walkers_count_twice);
RUN_TEST(test_full_reversal_counts_entry_then_exit);
RUN_TEST(test_refractory_suppresses_back_to_back_fire);
RUN_TEST(test_event_counts_after_refractory_expires);
RUN_TEST(test_noise_below_enter_thresh_does_not_start_event);
return UNITY_END();
}
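
Review note: the n and c columns in the walker test comment follow directly from the rectangle geometry. A minimal sketch that reproduces them is below. `WalkerStats` and `walker_stats` are hypothetical names, and the figures are purely geometric: what cv_process actually counts as foreground depends on its background model, which is not shown in this diff.

```cpp
// Geometric pixel count and vertical centroid of a draw_walker rectangle.
struct WalkerStats {
    int fg_count;    // pixels covered by the rectangle
    int centroid_y;  // mean row, truncated to an integer
};

// Rectangle spans rows [y0, y1] and columns [cx-hw, cx+hw], and is assumed
// to lie fully inside the frame (no clipping).
WalkerStats walker_stats(int y0, int y1, int hw) {
    const int width  = 2 * hw + 1;   // hw=5 gives the 11-wide column
    const int height = y1 - y0 + 1;  // both end rows are inclusive
    return { width * height, (y0 + y1) / 2 };
}
```

For example, `walker_stats(60, 95, 5)` gives a count of 396 and a centroid of 77, matching the t0 row of the table.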


@@ -0,0 +1,141 @@
// firmware/test/test_native/test_event_log.cpp
#include <unity.h>
#include <string.h>
#include "event_log.h"
// --- Native NVS stub (declared in event_log.cpp for native builds) ---
extern "C" void event_log_test_reset();
void setUp() { event_log_test_reset(); }
void tearDown() {}
void test_entry_is_32_bytes() {
TEST_ASSERT_EQUAL(32, sizeof(EventLogEntry));
}
void test_path_hash_is_stable_and_differs() {
uint16_t a = event_log_path_hash("/api/v1/heartbeat");
uint16_t b = event_log_path_hash("/api/v1/heartbeat");
uint16_t c = event_log_path_hash("/api/v1/camera/events/batch");
TEST_ASSERT_EQUAL(a, b);
TEST_ASSERT_NOT_EQUAL(a, c);
}
void test_write_then_read_recent_returns_newest_first() {
event_log_init();
event_log_write(EVT_BOOT, 1, 0);
event_log_write(EVT_WIFI_UP, 2, 0);
event_log_write(EVT_HTTP_FAIL, 3, 500);
EventLogEntry buf[8];
size_t n = event_log_read_recent(buf, 8);
TEST_ASSERT_EQUAL(3, n);
TEST_ASSERT_EQUAL(EVT_HTTP_FAIL, buf[0].tag);
TEST_ASSERT_EQUAL(500, buf[0].data1);
TEST_ASSERT_EQUAL(EVT_WIFI_UP, buf[1].tag);
TEST_ASSERT_EQUAL(EVT_BOOT, buf[2].tag);
}
void test_ring_buffer_wraps_after_32_entries() {
event_log_init();
for (int i = 0; i < 40; i++) event_log_write(EVT_HTTP_OK, (uint16_t)i, 0);
EventLogEntry buf[32];
size_t n = event_log_read_recent(buf, 32);
TEST_ASSERT_EQUAL(32, n);
// Newest first: data0 should be 39, 38, 37, ... down to 8
TEST_ASSERT_EQUAL(39, buf[0].data0);
TEST_ASSERT_EQUAL(8, buf[31].data0);
}
void test_empty_log_read_returns_zero() {
event_log_init();
EventLogEntry buf[8];
size_t n = event_log_read_recent(buf, 8);
TEST_ASSERT_EQUAL(0, n);
}
void test_read_recent_truncates_to_max_entries() {
event_log_init();
for (int i = 0; i < 10; i++) event_log_write(EVT_HTTP_OK, (uint16_t)i, 0);
EventLogEntry buf[3];
size_t n = event_log_read_recent(buf, 3);
TEST_ASSERT_EQUAL(3, n);
// Newest 3: data0 == 9, 8, 7
TEST_ASSERT_EQUAL(9, buf[0].data0);
TEST_ASSERT_EQUAL(8, buf[1].data0);
TEST_ASSERT_EQUAL(7, buf[2].data0);
}
void test_path_hash_distinguishes_real_api_paths() {
uint16_t h1 = event_log_path_hash("/api/v1/heartbeat");
uint16_t h2 = event_log_path_hash("/api/v1/camera/events/batch");
uint16_t h3 = event_log_path_hash("/api/v1/events/batch");
TEST_ASSERT_NOT_EQUAL(h1, h2);
TEST_ASSERT_NOT_EQUAL(h1, h3);
TEST_ASSERT_NOT_EQUAL(h2, h3);
}
extern "C" void event_log_test_simulate_reboot();
void test_boot_recovery_after_partial_fill() {
// Phase 1: write 5 entries before "reboot"
event_log_init();
for (uint16_t i = 0; i < 5; i++) event_log_write(EVT_HTTP_OK, i, 0);
// Phase 2: simulate reboot (clear RAM state, keep slots), re-init, verify
event_log_test_simulate_reboot();
event_log_init();
// All 5 original entries should still be readable, newest first
EventLogEntry buf[8];
size_t n = event_log_read_recent(buf, 8);
TEST_ASSERT_EQUAL(5, n);
TEST_ASSERT_EQUAL(4, buf[0].data0); // newest
TEST_ASSERT_EQUAL(0, buf[4].data0); // oldest
// Phase 3: write one more — seq must continue (not restart at 0),
// so the new entry is the newest and slot index 5 holds it
event_log_write(EVT_HTTP_OK, 99, 0);
n = event_log_read_recent(buf, 8);
TEST_ASSERT_EQUAL(6, n);
TEST_ASSERT_EQUAL(99, buf[0].data0);
TEST_ASSERT_EQUAL(4, buf[1].data0);
}
void test_boot_recovery_after_wrap() {
// Phase 1: write 40 entries (wraps the 32-slot ring once; oldest 8 dropped)
event_log_init();
for (uint16_t i = 0; i < 40; i++) event_log_write(EVT_HTTP_OK, i, 0);
// Phase 2: simulate reboot, re-init
event_log_test_simulate_reboot();
event_log_init();
// Still 32 entries visible, newest=39, oldest=8
EventLogEntry buf[32];
size_t n = event_log_read_recent(buf, 32);
TEST_ASSERT_EQUAL(32, n);
TEST_ASSERT_EQUAL(39, buf[0].data0);
TEST_ASSERT_EQUAL(8, buf[31].data0);
// Phase 3: one more write — newest becomes 100, head advances past
// wherever the max-seq slot was, oldest drops to data0=9
event_log_write(EVT_HTTP_OK, 100, 0);
n = event_log_read_recent(buf, 32);
TEST_ASSERT_EQUAL(32, n);
TEST_ASSERT_EQUAL(100, buf[0].data0);
TEST_ASSERT_EQUAL(9, buf[31].data0);
}
int main() {
UNITY_BEGIN();
RUN_TEST(test_entry_is_32_bytes);
RUN_TEST(test_path_hash_is_stable_and_differs);
RUN_TEST(test_write_then_read_recent_returns_newest_first);
RUN_TEST(test_ring_buffer_wraps_after_32_entries);
RUN_TEST(test_empty_log_read_returns_zero);
RUN_TEST(test_read_recent_truncates_to_max_entries);
RUN_TEST(test_path_hash_distinguishes_real_api_paths);
RUN_TEST(test_boot_recovery_after_partial_fill);
RUN_TEST(test_boot_recovery_after_wrap);
return UNITY_END();
}
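
The recovery tests above imply that each slot carries a sequence number and that init rebuilds the head from the highest one it finds. A minimal Python model of that idea follows. It is a sketch under assumptions, not the firmware code: the class name `RingLog`, the `(seq, data0)` slot layout, and the method names are all hypothetical, and only the 32-slot size and the expected read order come from the tests.

```python
RING_SLOTS = 32  # same ring size the tests above assume


class RingLog:
    """Toy model: `slots` stands in for persistent storage, the rest is RAM."""

    def __init__(self):
        self.slots = [None] * RING_SLOTS  # survives a simulated reboot
        self.init()

    def init(self):
        # Rebuild head and next sequence number from whatever the slots hold.
        used = [(entry[0], i) for i, entry in enumerate(self.slots) if entry is not None]
        if used:
            max_seq, idx = max(used)
            self.next_seq = max_seq + 1          # seq continues, never restarts at 0
            self.head = (idx + 1) % RING_SLOTS   # write just past the newest slot
        else:
            self.next_seq = 0
            self.head = 0

    def write(self, data0):
        self.slots[self.head] = (self.next_seq, data0)
        self.head = (self.head + 1) % RING_SLOTS
        self.next_seq += 1

    def read_recent(self, max_entries):
        # Newest first: sort live entries by sequence number, descending.
        live = sorted((e for e in self.slots if e is not None), reverse=True)
        return [data0 for _seq, data0 in live[:max_entries]]

    def simulate_reboot(self):
        # Drop RAM state only; the slots are kept.
        self.head = None
        self.next_seq = None


log = RingLog()
for i in range(40):        # wraps the ring once, dropping the oldest 8
    log.write(i)
log.simulate_reboot()
log.init()
log.write(100)
recent = log.read_recent(32)   # recent[0] == 100, recent[31] == 9
```

Deriving the head from the maximum sequence number means no separate head pointer has to be persisted, which is one plausible reason the wrap test can treat the max-seq slot position as an implementation detail.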


@@ -0,0 +1,32 @@
// firmware/test/test_net_guard/test_net_guard.cpp
#include <unity.h>
#include "net_guard.h"
void setUp() {}
void tearDown() {}
void test_backoff_starts_at_one_second() {
TEST_ASSERT_EQUAL(1000, net_guard_next_backoff_ms(0));
}
void test_backoff_doubles_each_attempt() {
TEST_ASSERT_EQUAL(2000, net_guard_next_backoff_ms(1));
TEST_ASSERT_EQUAL(4000, net_guard_next_backoff_ms(2));
TEST_ASSERT_EQUAL(8000, net_guard_next_backoff_ms(3));
TEST_ASSERT_EQUAL(16000, net_guard_next_backoff_ms(4));
TEST_ASSERT_EQUAL(32000, net_guard_next_backoff_ms(5));
}
void test_backoff_clamps_at_60s() {
TEST_ASSERT_EQUAL(60000, net_guard_next_backoff_ms(6));
TEST_ASSERT_EQUAL(60000, net_guard_next_backoff_ms(7));
TEST_ASSERT_EQUAL(60000, net_guard_next_backoff_ms(100));
}
int main() {
UNITY_BEGIN();
RUN_TEST(test_backoff_starts_at_one_second);
RUN_TEST(test_backoff_doubles_each_attempt);
RUN_TEST(test_backoff_clamps_at_60s);
return UNITY_END();
}
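
The three tests above pin down a doubling schedule clamped at 60 s. The sketch below reproduces that schedule in Python and adds the signed-difference deadline check that the net_guard commit message describes. Both function names, the default parameters, and the exponent clamp are assumptions for illustration; the real `net_guard_next_backoff_ms` may be written differently.

```python
def next_backoff_ms(attempt, base_ms=1000, cap_ms=60000):
    """Exponential backoff: base * 2**attempt, clamped to cap_ms."""
    # Clamp the exponent first so very large attempt counts stay cheap.
    return min(base_ms << min(attempt, 16), cap_ms)


def deadline_reached(now_ms, deadline_ms):
    """Wrap-safe deadline check for a 32-bit millisecond counter.

    Treats (now - deadline) mod 2**32 as a signed 32-bit value:
    a non-negative result means the deadline has passed.
    """
    diff = (now_ms - deadline_ms) & 0xFFFFFFFF
    return diff < 0x80000000


schedule = [next_backoff_ms(n) for n in range(8)]
# schedule == [1000, 2000, 4000, 8000, 16000, 32000, 60000, 60000]
```

The deadline check stays correct as long as the deadline is less than about 24.8 days (2**31 ms) away from the current counter value, which holds comfortably for a 60 s backoff cap.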


@@ -11,7 +11,7 @@ import sqlite3
from typing import List
from fastapi import Depends
-from pydantic import BaseModel, Field
+from pydantic import BaseModel, Field, model_validator
class CameraRecord(BaseModel):
@@ -20,6 +20,12 @@ class CameraRecord(BaseModel):
entries: int = Field(ge=0)
exits: int = Field(ge=0)
@model_validator(mode="after")
def _period_order(self):
if self.period_end <= self.period_start:
raise ValueError("period_end must be strictly greater than period_start")
return self
class CameraEventsRequest(BaseModel):
location_id: str


@@ -0,0 +1,129 @@
# server/heartbeat_diagnostics_stub.py
# Add these models and the persistence helper to the server's main.py alongside
# the existing heartbeat endpoint (POST /api/v1/heartbeat).
# Requires: diagnostic columns on the heartbeats table (see migrations/005_heartbeat_diagnostics.sql)
#
# Firmware v1.1.0 extends the heartbeat payload with five optional diagnostic
# fields. v1.0.0-shape payloads (without these fields) must continue to parse
# cleanly — every new field is Optional and defaults to None.
#
# IMPORTANT: Adjust the table name in store_heartbeat_diagnostics to match the
# real server's schema if it differs from "heartbeats".
import json
import sqlite3
from typing import List, Optional
from pydantic import BaseModel
class RecentEvent(BaseModel):
t: int # EventLogTag (see EVENT_TAG_DECODER)
d0: int # tag-specific datum 0
d1: int # tag-specific datum 1
ts: int # unix timestamp (seconds)
up: int # seconds since boot when event was logged
# Extend the existing HeartbeatRequest model in main.py by adding these five
# optional fields. The rest of the heartbeat model (device_id, uptime, etc.)
# stays as-is. Shown here as a standalone model for reference/testing.
class HeartbeatDiagnosticsFields(BaseModel):
reset_reason: Optional[int] = None
heap_free: Optional[int] = None
heap_min_free: Optional[int] = None
last_disconnect_code: Optional[int] = None
recent_events: Optional[List[RecentEvent]] = None
# Example of the fully-extended heartbeat request model (merge into the
# existing HeartbeatRequest in main.py rather than introducing a second class):
class HeartbeatRequestWithDiagnostics(BaseModel):
device_id: str
uptime: int
# ... existing fields from the v1.0.0 heartbeat model go here ...
# New v1.1.0 diagnostic fields:
reset_reason: Optional[int] = None
heap_free: Optional[int] = None
heap_min_free: Optional[int] = None
last_disconnect_code: Optional[int] = None
recent_events: Optional[List[RecentEvent]] = None
# Call this inside the existing receive_heartbeat handler after the base
# heartbeat row has been inserted/updated. It persists the diagnostic fields
# on the same row keyed by device_id.
def store_heartbeat_diagnostics(
db: sqlite3.Connection,
device_id: str,
hb: HeartbeatRequestWithDiagnostics,
) -> None:
"""Persist the v1.1.0 diagnostic fields onto the heartbeats row for device_id.
recent_events is JSON-serialized into a TEXT column for flexibility;
the other four fields are stored as INTEGERs. All fields are nullable
and left untouched when the payload omits them (v1.0.0 compatibility).
"""
recent_events_json = (
json.dumps([ev.model_dump() for ev in hb.recent_events])
if hb.recent_events is not None
else None
)
cursor = db.cursor()
# COALESCE preserves existing column values when the v1.0.0 payload omits
# diagnostic fields (Pydantic resolves them to None).
cursor.execute(
"""UPDATE heartbeats
SET reset_reason = COALESCE(?, reset_reason),
heap_free = COALESCE(?, heap_free),
heap_min_free = COALESCE(?, heap_min_free),
last_disconnect_code = COALESCE(?, last_disconnect_code),
recent_events = COALESCE(?, recent_events)
WHERE device_id = ?""",
(
hb.reset_reason,
hb.heap_free,
hb.heap_min_free,
hb.last_disconnect_code,
recent_events_json,
device_id,
),
)
db.commit()
# ---------------------------------------------------------------------------
# Decoders — use these in dashboards / alerting to label the integer tags the
# firmware emits. Keep in sync with firmware/include/event_log.h.
# ---------------------------------------------------------------------------
# EventLogTag values (RecentEvent.t) -> human name.
# Per-tag interpretation of d0/d1:
#   EVT_BOOT            d0=esp_reset_reason()
#   EVT_WIFI_UP         d0=RSSI (int16 cast to uint16)
#   EVT_WIFI_DOWN       d0=disconnect reason (0xFF = silent-death)
#   EVT_HTTP_OK         d0=path_hash, d1=elapsed_ms
#   EVT_HTTP_FAIL       d0=path_hash, d1=http_status_or_errno
#   EVT_HEARTBEAT_MISS  d0=consecutive_count
#   EVT_NTP_SYNC        d0=seconds_since_boot (reserved, not emitted)
#   EVT_REBOOT          d0=RebootReason (see REBOOT_REASON_DECODER)
EVENT_TAG_DECODER = {
1: "EVT_BOOT",
2: "EVT_WIFI_UP",
3: "EVT_WIFI_DOWN",
4: "EVT_HTTP_OK",
5: "EVT_HTTP_FAIL",
6: "EVT_HEARTBEAT_MISS",
7: "EVT_NTP_SYNC",
8: "EVT_REBOOT",
}
# EVT_REBOOT.d0 values -> human name. Firmware-initiated reboot reasons.
REBOOT_REASON_DECODER = {
1: "HEARTBEAT_MISS",
2: "FACTORY_RESET",
3: "OTA",
4: "WIFI_REPROV",
5: "FATAL_CONFIG",
6: "FATAL_CAMERA",
}
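
A dashboard would typically turn one `recent_events` entry into a readable label using the two tables above plus the per-tag notes. The sketch below shows one way to do that. The function `describe_event` and its output format are hypothetical, and the tables are copied locally only so the block runs on its own; the RSSI branch undoes the int16-to-uint16 cast mentioned in the comment.

```python
EVENT_TAGS = {
    1: "EVT_BOOT", 2: "EVT_WIFI_UP", 3: "EVT_WIFI_DOWN", 4: "EVT_HTTP_OK",
    5: "EVT_HTTP_FAIL", 6: "EVT_HEARTBEAT_MISS", 7: "EVT_NTP_SYNC", 8: "EVT_REBOOT",
}
REBOOT_REASONS = {
    1: "HEARTBEAT_MISS", 2: "FACTORY_RESET", 3: "OTA",
    4: "WIFI_REPROV", 5: "FATAL_CONFIG", 6: "FATAL_CAMERA",
}


def describe_event(ev):
    """Render one {t, d0, d1, ...} dict as a short human-readable label."""
    tag = EVENT_TAGS.get(ev["t"], f"UNKNOWN_TAG_{ev['t']}")
    d0 = ev["d0"]
    if tag == "EVT_WIFI_UP":
        # d0 carries a signed RSSI stored in an unsigned 16-bit field.
        rssi = d0 - 0x10000 if d0 >= 0x8000 else d0
        detail = f"rssi={rssi}"
    elif tag == "EVT_WIFI_DOWN":
        detail = "silent-death" if d0 == 0xFF else f"reason={d0}"
    elif tag == "EVT_REBOOT":
        detail = REBOOT_REASONS.get(d0, f"unknown({d0})")
    else:
        detail = f"d0={d0} d1={ev.get('d1', 0)}"
    return f"{tag} {detail}"


labels = [
    describe_event({"t": 2, "d0": 65476, "d1": 0}),
    describe_event({"t": 3, "d0": 255, "d1": 0}),
    describe_event({"t": 8, "d0": 3, "d1": 0}),
]
```

Unknown tags and reboot reasons fall back to a labelled placeholder rather than raising, so a newer firmware emitting an extra tag would not break the decoder.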


@@ -0,0 +1,14 @@
-- migrations/005_heartbeat_diagnostics.sql
-- Add v1.1.0 diagnostic columns to the existing heartbeats table.
-- Adjust the table name ("heartbeats") to match the real server's schema.
-- Apply: sqlite3 <db_file> < migrations/005_heartbeat_diagnostics.sql
--
-- sqlite's ALTER TABLE ADD COLUMN only takes one column per statement, so
-- each field is added separately. All columns are nullable, so firmware
-- v1.0.0 payloads (which omit these fields) remain accepted unchanged.
ALTER TABLE heartbeats ADD COLUMN reset_reason INTEGER;
ALTER TABLE heartbeats ADD COLUMN heap_free INTEGER;
ALTER TABLE heartbeats ADD COLUMN heap_min_free INTEGER;
ALTER TABLE heartbeats ADD COLUMN last_disconnect_code INTEGER;
ALTER TABLE heartbeats ADD COLUMN recent_events TEXT; -- JSON-serialized list of {t,d0,d1,ts,up}
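
Because `ALTER TABLE ... ADD COLUMN` fails if the column already exists, running the migration file twice raises an error. One possible way to make the step re-runnable from Python is to check `PRAGMA table_info` first, as sketched below. The helper name `apply_migration` and the `NEW_COLUMNS` mapping are hypothetical; the column names and types are the ones in the migration above.

```python
import sqlite3

NEW_COLUMNS = {
    "reset_reason": "INTEGER",
    "heap_free": "INTEGER",
    "heap_min_free": "INTEGER",
    "last_disconnect_code": "INTEGER",
    "recent_events": "TEXT",
}


def apply_migration(db, table="heartbeats"):
    """Add any missing diagnostic columns; return the names actually added."""
    # PRAGMA table_info yields one row per column; index 1 is the column name.
    existing = {row[1] for row in db.execute(f"PRAGMA table_info({table})")}
    added = []
    for name, sql_type in NEW_COLUMNS.items():
        if name not in existing:
            db.execute(f"ALTER TABLE {table} ADD COLUMN {name} {sql_type}")
            added.append(name)
    db.commit()
    return added


db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE heartbeats (device_id TEXT PRIMARY KEY, uptime INTEGER)")
first_run = apply_migration(db)    # adds all five columns
second_run = apply_migration(db)   # nothing left to add
```

The table name is interpolated into the SQL because SQLite does not accept identifiers as bound parameters, so it should only ever come from trusted code, never from request data.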


@@ -98,3 +98,15 @@ def test_negative_counts_rejected():
with pytest.raises(ValidationError):
CameraRecord(period_start=1712000000, period_end=1712003600,
entries=-1, exits=0)
def test_inverted_period_rejected():
"""Pydantic should reject period_end <= period_start."""
from pydantic import ValidationError
from server.camera_endpoint import CameraRecord
with pytest.raises(ValidationError):
CameraRecord(period_start=1712003600, period_end=1712003600,
entries=0, exits=0)
with pytest.raises(ValidationError):
CameraRecord(period_start=1712003600, period_end=1712000000,
entries=0, exits=0)


@@ -0,0 +1,156 @@
# server/test_heartbeat_diagnostics_stub.py
# Template tests for the heartbeat diagnostic-fields extension.
# Adapt imports and fixtures to match the actual server's test structure.
#
# To run against the actual server (once integrated):
#   pytest server/test_heartbeat_diagnostics_stub.py -v
import json
import sqlite3
def _make_db() -> sqlite3.Connection:
"""In-memory sqlite fixture matching migrations/005_heartbeat_diagnostics.sql
applied on top of a minimal heartbeats table."""
db = sqlite3.connect(":memory:")
db.execute("""
CREATE TABLE heartbeats (
device_id TEXT PRIMARY KEY,
uptime INTEGER,
reset_reason INTEGER,
heap_free INTEGER,
heap_min_free INTEGER,
last_disconnect_code INTEGER,
recent_events TEXT
)
""")
db.commit()
return db
def _v10_payload() -> dict:
"""Firmware v1.0.0-shape heartbeat: no diagnostic fields."""
return {"device_id": "dc-test-01", "uptime": 12345}
def _v11_payload() -> dict:
"""Firmware v1.1.0-shape heartbeat: includes all five diagnostic fields."""
return {
"device_id": "dc-test-01",
"uptime": 12345,
"reset_reason": 1,
"heap_free": 123456,
"heap_min_free": 100000,
"last_disconnect_code": 201,
"recent_events": [
{"t": 1, "d0": 1, "d1": 0, "ts": 1712000000, "up": 0},
{"t": 3, "d0": 255, "d1": 0, "ts": 1712000050, "up": 50},
],
}
def test_v10_shape_parses_with_new_fields_none():
"""A v1.0.0 heartbeat (no diagnostic fields) must parse cleanly; all new
fields default to None."""
from server.heartbeat_diagnostics_stub import HeartbeatRequestWithDiagnostics
hb = HeartbeatRequestWithDiagnostics(**_v10_payload())
assert hb.device_id == "dc-test-01"
assert hb.uptime == 12345
assert hb.reset_reason is None
assert hb.heap_free is None
assert hb.heap_min_free is None
assert hb.last_disconnect_code is None
assert hb.recent_events is None
def test_v11_shape_populates_new_fields():
"""A v1.1.0 heartbeat populates each diagnostic field and the event list."""
from server.heartbeat_diagnostics_stub import HeartbeatRequestWithDiagnostics
hb = HeartbeatRequestWithDiagnostics(**_v11_payload())
assert hb.reset_reason == 1
assert hb.heap_free == 123456
assert hb.heap_min_free == 100000
assert hb.last_disconnect_code == 201
assert hb.recent_events is not None
assert len(hb.recent_events) == 2
assert hb.recent_events[0].t == 1
assert hb.recent_events[1].t == 3
assert hb.recent_events[1].d0 == 255 # 0xFF silent-death marker
assert hb.recent_events[1].ts == 1712000050
def test_store_heartbeat_diagnostics_writes_fields_and_json():
"""store_heartbeat_diagnostics must JSON-serialize recent_events and write
each integer field as submitted."""
from server.heartbeat_diagnostics_stub import (
HeartbeatRequestWithDiagnostics,
store_heartbeat_diagnostics,
)
db = _make_db()
# Seed the heartbeats row the base handler would have inserted first.
db.execute(
"INSERT INTO heartbeats (device_id, uptime) VALUES (?, ?)",
("dc-test-01", 12345),
)
db.commit()
hb = HeartbeatRequestWithDiagnostics(**_v11_payload())
store_heartbeat_diagnostics(db, "dc-test-01", hb)
row = db.execute(
"""SELECT reset_reason, heap_free, heap_min_free,
last_disconnect_code, recent_events
FROM heartbeats
WHERE device_id = ?""",
("dc-test-01",),
).fetchone()
assert row[0] == 1
assert row[1] == 123456
assert row[2] == 100000
assert row[3] == 201
events = json.loads(row[4])
assert isinstance(events, list)
assert len(events) == 2
assert events[0] == {"t": 1, "d0": 1, "d1": 0, "ts": 1712000000, "up": 0}
assert events[1]["d0"] == 255
def test_store_heartbeat_diagnostics_v10_leaves_fields_null():
"""v1.0.0 payload: all diagnostic columns should remain NULL after store."""
from server.heartbeat_diagnostics_stub import (
HeartbeatRequestWithDiagnostics,
store_heartbeat_diagnostics,
)
db = _make_db()
db.execute(
"INSERT INTO heartbeats (device_id, uptime) VALUES (?, ?)",
("dc-test-01", 12345),
)
db.commit()
hb = HeartbeatRequestWithDiagnostics(**_v10_payload())
store_heartbeat_diagnostics(db, "dc-test-01", hb)
row = db.execute(
"""SELECT reset_reason, heap_free, heap_min_free,
last_disconnect_code, recent_events
FROM heartbeats
WHERE device_id = ?""",
("dc-test-01",),
).fetchone()
assert row == (None, None, None, None, None)
def test_event_tag_decoder_labels():
"""Sanity check: decoder maps firmware tag values to the expected names."""
from server.heartbeat_diagnostics_stub import EVENT_TAG_DECODER, REBOOT_REASON_DECODER
assert EVENT_TAG_DECODER[1] == "EVT_BOOT"
assert EVENT_TAG_DECODER[3] == "EVT_WIFI_DOWN"
assert EVENT_TAG_DECODER[8] == "EVT_REBOOT"
assert REBOOT_REASON_DECODER[1] == "HEARTBEAT_MISS"
assert REBOOT_REASON_DECODER[4] == "WIFI_REPROV"
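
The tests above cover a v1.0.0 payload against a fresh row. A related case worth keeping in mind is a v1.0.0-shape payload arriving after a v1.1.0 one: with the `COALESCE(?, column)` pattern the earlier diagnostic values stay in place. The sketch below demonstrates that with a one-column stand-in table. The helpers `store` and `current` are hypothetical and exist only for this illustration.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE heartbeats (device_id TEXT PRIMARY KEY, heap_free INTEGER)")
db.execute("INSERT INTO heartbeats (device_id) VALUES ('dc-test-01')")


def store(heap_free):
    # COALESCE(?, heap_free) keeps the stored value when the parameter is NULL.
    db.execute(
        "UPDATE heartbeats SET heap_free = COALESCE(?, heap_free) WHERE device_id = ?",
        (heap_free, "dc-test-01"),
    )


def current():
    return db.execute("SELECT heap_free FROM heartbeats").fetchone()[0]


store(None)        # v1.0.0-shape payload on a fresh row
first = current()
store(123456)      # v1.1.0-shape payload
second = current()
store(None)        # v1.0.0-shape payload again
third = current()  # still the value from the v1.1.0 payload
```

A consequence of this design is that a payload cannot clear a diagnostic column back to NULL, so a reader of the table cannot tell from the row alone whether the values came from the latest heartbeat.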

0
tools/__init__.py Normal file

105
tools/capture_frames.py Normal file

@@ -0,0 +1,105 @@
#!/usr/bin/env python3
# tools/capture_frames.py
#
# Read framed 96x96 grayscale frames from the capture-mode firmware over serial
# and write them to a .bin file for offline replay.
#
# Wire format per frame (little-endian):
#   magic     u32   0x314D5246 ('FRM1' as ASCII bytes on the wire; see MAGIC below)
#   frame_ix  u32
#   millis    u32
#   pixels    9216 bytes
#
# Output file is the raw concatenation of frames (same layout as the wire),
# so replay_frames.py can stream it with identical parsing.
#
# Usage: python tools/capture_frames.py --port /dev/ttyUSB0 --out walk.bin --duration 60
import argparse
import serial
import struct
import sys
import time
MAGIC = 0x314D5246 # 'FRM1' — ascii bytes that survive the CH9102 stream
FRAME_PIXELS = 96 * 96
HEADER_LEN = 12
FRAME_LEN = HEADER_LEN + FRAME_PIXELS
def read_exact(ser, n):
buf = bytearray()
while len(buf) < n:
chunk = ser.read(n - len(buf))
if not chunk:
return None
buf.extend(chunk)
return bytes(buf)
def find_magic(ser):
"""Scan serial byte-by-byte until we see the 4-byte MAGIC."""
window = bytearray()
magic_bytes = struct.pack('<I', MAGIC)
while True:
b = ser.read(1)
if not b:
return False
window.extend(b)
if len(window) > 4:
del window[0]
if bytes(window) == magic_bytes:
return True
def main():
ap = argparse.ArgumentParser()
ap.add_argument('--port', required=True)
ap.add_argument('--baud', type=int, default=460800)
ap.add_argument('--out', required=True)
ap.add_argument('--duration', type=float, default=60.0,
help='Seconds to capture (default 60)')
args = ap.parse_args()
ser = serial.Serial(args.port, args.baud, timeout=1.0)
print(f'# listening on {args.port} @ {args.baud} for {args.duration}s...',
file=sys.stderr)
# Drain boot banner lines.
deadline_banner = time.time() + 2.0
while time.time() < deadline_banner:
line = ser.readline()
if line.startswith(b'#'):
print(line.decode(errors='replace').rstrip(), file=sys.stderr)
if b'capture-mode' in line:
break
deadline = time.time() + args.duration
frames = 0
last_ix = None
dropped = 0
with open(args.out, 'wb') as f:
while time.time() < deadline:
if not find_magic(ser):
continue
body = read_exact(ser, 8 + FRAME_PIXELS)
if body is None:
break
frame_ix, ms = struct.unpack('<II', body[:8])
if last_ix is not None and frame_ix != last_ix + 1:
dropped += frame_ix - last_ix - 1
last_ix = frame_ix
f.write(struct.pack('<I', MAGIC))
f.write(body)
frames += 1
if frames % 25 == 0:
print(f'# {frames} frames, last ix={frame_ix} ms={ms} '
f'dropped={dropped}', file=sys.stderr)
print(f'# done: {frames} frames written to {args.out} '
f'({dropped} dropped)', file=sys.stderr)
if __name__ == '__main__':
main()
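
The wire format described at the top of this file can be exercised without a device. The sketch below packs two frames with `struct`, prepends a fake banner line, and recovers the frames by searching for the magic bytes, which is the in-memory equivalent of the byte-by-byte scan in `find_magic`. The helper names `pack_frame` and `iter_frames_resync` are hypothetical.

```python
import struct

MAGIC = 0x314D5246               # packs little-endian to b'FRM1'
FRAME_PIXELS = 96 * 96
HEADER = struct.Struct('<III')   # magic, frame_ix, millis


def pack_frame(frame_ix, millis, pixels):
    """Build one wire frame: 12-byte header followed by 9216 pixel bytes."""
    if len(pixels) != FRAME_PIXELS:
        raise ValueError('expected 9216 pixel bytes')
    return HEADER.pack(MAGIC, frame_ix, millis) + bytes(pixels)


def iter_frames_resync(buf):
    """Yield (frame_ix, millis, pixels) for each complete frame in buf,
    skipping any bytes that precede a magic marker."""
    magic = struct.pack('<I', MAGIC)
    pos = 0
    while True:
        pos = buf.find(magic, pos)
        if pos < 0 or pos + HEADER.size + FRAME_PIXELS > len(buf):
            return  # no further marker, or a truncated trailing frame
        _magic, frame_ix, millis = HEADER.unpack_from(buf, pos)
        start = pos + HEADER.size
        yield frame_ix, millis, buf[start:start + FRAME_PIXELS]
        pos = start + FRAME_PIXELS   # jump past the pixels before searching again


stream = (b'# banner line\n'
          + pack_frame(7, 1000, bytes(FRAME_PIXELS))
          + pack_frame(8, 1040, bytes([1]) * FRAME_PIXELS))
frames = list(iter_frames_resync(stream))
```

Jumping past the pixel payload after each frame avoids a false sync on pixel data that happens to contain the marker. A scan that starts in the middle of a frame can still lock onto such bytes, which is a limitation this sketch shares with any marker-only framing.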


@@ -15,16 +15,38 @@ Usage:
"""
import argparse
import os
import re
import secrets
import subprocess
import sys
import tempfile
HMAC_SECRET_RE = re.compile(r"^[0-9a-fA-F]{64}$")
NVS_NAMESPACE = "doorcounter"
NVS_PARTITION_OFFSET = "0x9000"
NVS_PARTITION_SIZE = "0x5000" # matches firmware partition table (20KB)
# Characters that would change the field/row structure of the NVS-CSV format
# (key,type,encoding,value). A value containing any of these would either
# split into more fields or add rows, silently provisioning the wrong keys.
_CSV_FORBIDDEN = (",", '"', "\n", "\r")
def _reject_csv_metacharacters(name, value):
"""Exit with an error if value contains a character that would corrupt
the NVS CSV. Used for operator-supplied strings (device id, location id,
WiFi credentials)."""
for c in _CSV_FORBIDDEN:
if c in value:
print(
f"Error: --{name} contains forbidden character {c!r}; "
f"this would corrupt the NVS partition CSV.",
file=sys.stderr,
)
sys.exit(1)
def build_nvs_csv(device_id, location_id, hmac_secret,
wifi_ssid=None, wifi_pass=None, line_offset=50):
@@ -63,6 +85,10 @@ def main():
args = parser.parse_args()
hmac_secret = args.hmac_secret or secrets.token_hex(32)
if not HMAC_SECRET_RE.match(hmac_secret):
print("Error: --hmac-secret must be exactly 64 hex characters (32 bytes)",
file=sys.stderr)
sys.exit(1)
if args.hmac_secret is None:
print(f"Generated HMAC secret: {hmac_secret}")
print(" *** SAVE THIS — you need it to register the device on the server ***")
@@ -71,6 +97,13 @@ def main():
print("Error: --line-offset must be 0-100", file=sys.stderr)
sys.exit(1)
_reject_csv_metacharacters("device-id", args.device_id)
_reject_csv_metacharacters("location-id", args.location_id)
if args.wifi_ssid is not None:
_reject_csv_metacharacters("wifi-ssid", args.wifi_ssid)
if args.wifi_password is not None:
_reject_csv_metacharacters("wifi-password", args.wifi_password)
with tempfile.TemporaryDirectory() as tmp:
csv_path = os.path.join(tmp, "nvs.csv")
bin_path = os.path.join(tmp, "nvs.bin")
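
To see why the validation above matters, the sketch below builds a row in the `key,type,encoding,value` layout by plain string interpolation and parses it back with Python's `csv` module. This is only an illustration: `nvs_row` and `parse_rows` are hypothetical helpers, the key names are placeholders, and the `csv` module stands in for whatever parser the NVS partition generator actually uses.

```python
import csv
import io


def nvs_row(key, value):
    """Naive interpolation, with no quoting or escaping of value."""
    return f"{key},data,string,{value}"


def parse_rows(text):
    """Parse CSV text into a list of rows, each a list of fields."""
    return list(csv.reader(io.StringIO(text)))


clean = parse_rows(nvs_row("wifi_pass", "hunter2"))
with_comma = parse_rows(nvs_row("wifi_pass", "pa,ss"))
with_newline = parse_rows(nvs_row("wifi_ssid", "ssid\nfoo"))
# clean has one row of 4 fields; with_comma has one row of 5 fields;
# with_newline has 2 rows, the second being a stray ['foo'].
```

Rejecting the input outright, as the patch does, sidesteps the question of how the downstream parser treats quoting. The trade-off is that a legitimate Wi-Fi password containing a comma or a double quote cannot be provisioned through this tool.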

211
tools/replay_frames.py Normal file

@@ -0,0 +1,211 @@
#!/usr/bin/env python3
# tools/replay_frames.py
#
# Offline Python port of the event-based CV detector (firmware/lib/cv/cv.cpp).
# Reads a .bin file produced by capture_frames.py and prints events.
#
# Purpose: iterate algorithm changes in seconds instead of minutes. All
# constants match cv.h so baseline behavior matches firmware.
#
# Usage:
#   python tools/replay_frames.py walk.bin
#   python tools/replay_frames.py walk.bin --enter 250 --exit 150 --max 25
#
# Output: one line per frame with fg diagnostics, plus [ENTRY]/[EXIT] lines
# when the detector fires.
import argparse
import struct
import sys
import numpy as np
MAGIC = 0x314D5246 # 'FRM1'
W = H = 96
PIXELS = W * H
HEADER = 12
FRAME_LEN = HEADER + PIXELS
class Detector:
"""Mirror of firmware CV state machine. Single walker events, centroid
trajectory direction. Only per-frame fg_count + min/max y + centroid y
feed the decision — per-blob tracks are diagnostic in firmware, dropped
here."""
def __init__(self, args):
self.a = args
self.bg = None
self.ev_active = False
self.ev_frames = 0
self.ev_first_c = -1.0
self.ev_last_c = -1.0
self.ev_min_c = float(H)
self.ev_max_c = -1.0
self.ev_min_y = H
self.ev_max_y = -1
self.ev_quiet = 0
self.last_fire = 0
self.frame_ix = 0
self.entries = 0
self.exits = 0
def _reset_event(self):
self.ev_active = False
self.ev_frames = 0
self.ev_first_c = self.ev_last_c = -1.0
self.ev_min_c = float(H)
self.ev_max_c = -1.0
self.ev_min_y = H
self.ev_max_y = -1
self.ev_quiet = 0
def _finalize(self):
a = self.a
if self.ev_frames < a.min_frames: return None
if self.ev_min_y > a.extent_top: return None
if self.ev_max_y < a.extent_bot: return None
up = self.ev_first_c - self.ev_min_c
down = self.ev_max_c - self.ev_first_c
winning = max(up, down)
if winning < a.min_traj: return None
is_entry = up >= down
self.last_fire = self.frame_ix
info = dict(
kind='ENTRY' if is_entry else 'EXIT',
first=self.ev_first_c, min=self.ev_min_c,
max=self.ev_max_c, last=self.ev_last_c,
dur=self.ev_frames,
)
if is_entry: self.entries += 1
else: self.exits += 1
return info
def step(self, frame):
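"""frame: uint8 array of shape (H, W). Returns a bare (empty) fires list on the bg-init and self-heal frames, otherwise a (fires, fg_count, min_y, max_y, centroid_y) tuple."""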
self.frame_ix += 1
fires = []
if self.bg is None:
self.bg = frame.astype(np.int16)
return fires
bg = self.bg.astype(np.int16)
diff = np.abs(frame.astype(np.int16) - bg)
fg = (diff > self.a.diff_thresh).astype(np.uint8)
# Running-avg bg blend, frozen during active event.
if not self.ev_active:
self.bg = ((self.bg * 31 + frame.astype(np.int16)) >> 5)
fg_count = int(fg.sum())
if fg_count > 0:
row_counts = fg.sum(axis=1)
ys = np.where(row_counts > 0)[0]
min_y = int(ys.min())
max_y = int(ys.max())
centroid_y = float((row_counts * np.arange(H)).sum() / fg_count)
else:
min_y, max_y, centroid_y = -1, -1, -1.0
# Self-heal on catastrophic bg mismatch.
if fg_count > PIXELS // 2:
self.bg = frame.astype(np.int16)
if self.ev_active: self._reset_event()
return fires
a = self.a
in_refractory = (self.last_fire != 0 and
(self.frame_ix - self.last_fire) < a.refractory)
if not self.ev_active:
if not in_refractory and fg_count >= a.enter_thresh:
self.ev_active = True
self.ev_frames = 1
self.ev_first_c = centroid_y
self.ev_last_c = centroid_y
self.ev_min_c = centroid_y
self.ev_max_c = centroid_y
self.ev_min_y = min_y
self.ev_max_y = max_y
self.ev_quiet = 0
else:
self.ev_frames += 1
if fg_count > 0:
self.ev_last_c = centroid_y
if centroid_y < self.ev_min_c: self.ev_min_c = centroid_y
if centroid_y > self.ev_max_c: self.ev_max_c = centroid_y
if min_y < self.ev_min_y: self.ev_min_y = min_y
if max_y > self.ev_max_y: self.ev_max_y = max_y
ended = False
if fg_count < a.exit_thresh:
self.ev_quiet += 1
if self.ev_quiet >= a.quiet_frames:
ended = True
else:
self.ev_quiet = 0
if self.ev_frames > a.max_frames:
ended = True
if ended:
fire = self._finalize()
if fire: fires.append(fire)
self._reset_event()
self.bg = frame.astype(np.int16)
return fires, fg_count, min_y, max_y, centroid_y
def iter_frames(path):
with open(path, 'rb') as f:
data = f.read()
n = len(data) // FRAME_LEN
for i in range(n):
off = i * FRAME_LEN
magic, ix, ms = struct.unpack('<III', data[off:off + HEADER])
if magic != MAGIC:
raise RuntimeError(f'bad magic at frame {i}: 0x{magic:08x}')
frame = np.frombuffer(data, dtype=np.uint8,
count=PIXELS, offset=off + HEADER).reshape(H, W)
yield ix, ms, frame
def main():
ap = argparse.ArgumentParser()
ap.add_argument('path')
ap.add_argument('--diff-thresh', dest='diff_thresh', type=int, default=30)
ap.add_argument('--enter', dest='enter_thresh', type=int, default=300)
ap.add_argument('--exit', dest='exit_thresh', type=int, default=200)
ap.add_argument('--quiet', dest='quiet_frames', type=int, default=3)
ap.add_argument('--min', dest='min_frames', type=int, default=5)
ap.add_argument('--max', dest='max_frames', type=int, default=25)
ap.add_argument('--extent-top', dest='extent_top', type=int, default=10)
ap.add_argument('--extent-bot', dest='extent_bot', type=int, default=85)
ap.add_argument('--min-traj', dest='min_traj', type=float, default=15.0)
ap.add_argument('--refractory', dest='refractory', type=int, default=15)
ap.add_argument('--quiet-log', action='store_true',
help='Suppress per-frame fg lines')
args = ap.parse_args()
det = Detector(args)
total = 0
for ix, ms, frame in iter_frames(args.path):
total += 1
out = det.step(frame)
if out == []:
if not args.quiet_log:
print(f'[{ix:4d}] bg init/reset')
continue
fires, fg, miny, maxy, cy = out
if not args.quiet_log and fg > 0:
print(f'[{ix:4d}] n={fg:4d} y={miny:2d}..{maxy:2d} c={cy:5.1f}')
for fire in fires:
print(f' >>> {fire["kind"]} first={fire["first"]:.1f} '
f'min={fire["min"]:.1f} max={fire["max"]:.1f} '
f'last={fire["last"]:.1f} dur={fire["dur"]}')
print(f'\n# {total} frames entries={det.entries} exits={det.exits}')
if __name__ == '__main__':
main()
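
The background update in `step` is an integer running average, `(bg * 31 + frame) >> 5`, applied per pixel. The plain-Python sketch below applies the same arithmetic to a single pixel to show how it behaves over many frames. The helper names `blend` and `settle` are hypothetical, and whether the firmware rounds the same way is not confirmed by this diff.

```python
def blend(bg, px):
    """One running-average step with weight 1/32 on the new pixel."""
    return (bg * 31 + px) >> 5


def settle(bg, px, max_steps=10_000):
    """Repeat blend() against a constant pixel until the value stops changing."""
    for _ in range(max_steps):
        nxt = blend(bg, px)
        if nxt == bg:
            return bg
        bg = nxt
    return bg


first_step = blend(0, 255)     # 7
rising = settle(0, 255)        # 224: truncation stops the climb short of 255
falling = settle(255, 0)       # 0: the downward direction converges fully
```

Because the shift truncates, a background adapting upward to a brighter constant scene stalls up to 31 levels below it, while adapting downward reaches the target exactly. With the default `--diff-thresh` of 30, a residual gap of 31 would still register as foreground under this arithmetic, which may be worth checking against real captures.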

186
tools/replay_logs.py Normal file

@@ -0,0 +1,186 @@
#!/usr/bin/env python3
# tools/replay_logs.py
#
# Replay the event state machine against text serial logs captured from the
# production firmware. Input lines of the form:
#   [F] n=<fg_count> y=<min_y>..<max_y> c=<centroid_y>
#
# Those four values are exactly what the firmware's event state machine
# consumes — so we can iterate event-level params (thresholds, max_frames,
# extent gates, trajectory cutoffs, refractory) offline without needing raw
# frames or the device.
#
# Usage:
#   python tools/replay_logs.py walk.log
#   python tools/replay_logs.py walk.log --enter 250 --exit 100 --max 30 --min-traj 10
#   cat walk.log | python tools/replay_logs.py - --ground-truth 12
import argparse
import re
import sys
LINE_RE = re.compile(
r'\[F\]\s+n=(?P<n>\d+)\s+y=(?P<miny>-?\d+)\.\.(?P<maxy>-?\d+)\s+c=(?P<c>-?\d+\.\d+)'
)
def parse_frames(text):
"""Yield (fg_count, min_y, max_y, centroid_y) per [F] line, in order."""
for line in text.splitlines():
m = LINE_RE.search(line)
if not m:
continue
yield int(m['n']), int(m['miny']), int(m['maxy']), float(m['c'])
class Detector:
"""Mirror of firmware event state machine. Only uses per-frame diagnostic
values — the same inputs the firmware feeds it."""
def __init__(self, a):
self.a = a
self.ev = False
self.ev_n = 0
self.ev_first = self.ev_last = -1.0
self.ev_min = 1e9
self.ev_max = -1.0
self.ev_miny = 1e9
self.ev_maxy = -1
self.ev_quiet = 0
self.last_fire = -10**9
self.ix = 0
self.entries = 0
self.exits = 0
self.fires = []
def _reset(self):
self.ev = False
self.ev_n = 0
self.ev_first = self.ev_last = -1.0
self.ev_min = 1e9; self.ev_max = -1.0
self.ev_miny = 1e9; self.ev_maxy = -1
self.ev_quiet = 0
def _finalize(self):
a = self.a
if self.ev_n < a.min_frames:
return ('reject_short', None)
if self.ev_miny > a.extent_top:
return ('reject_extent_top', None)
if self.ev_maxy < a.extent_bot:
return ('reject_extent_bot', None)
up = self.ev_first - self.ev_min
down = self.ev_max - self.ev_first
winning = max(up, down)
if winning < a.min_traj:
return ('reject_traj', None)
timed_out = self.ev_n > a.max_frames
if timed_out:
is_entry = self.ev_last < self.ev_first
else:
is_entry = up >= down
kind = 'ENTRY' if is_entry else 'EXIT'
self.last_fire = self.ix
info = dict(kind=kind, first=self.ev_first, min=self.ev_min,
max=self.ev_max, last=self.ev_last, dur=self.ev_n,
up=up, down=down, ix=self.ix)
if is_entry: self.entries += 1
else: self.exits += 1
self.fires.append(info)
return ('fire', info)
def step(self, n, miny, maxy, c):
self.ix += 1
a = self.a
refractory = (self.ix - self.last_fire) < a.refractory
if not self.ev:
if not refractory and n >= a.enter_thresh:
self.ev = True
self.ev_n = 1
self.ev_first = self.ev_last = c
self.ev_min = c; self.ev_max = c
self.ev_miny = miny; self.ev_maxy = maxy
self.ev_quiet = 0
return None
self.ev_n += 1
if n > 0:
self.ev_last = c
if c < self.ev_min: self.ev_min = c
if c > self.ev_max: self.ev_max = c
if miny < self.ev_miny: self.ev_miny = miny
if maxy > self.ev_maxy: self.ev_maxy = maxy
ended = False
if n < a.exit_thresh:
self.ev_quiet += 1
if self.ev_quiet >= a.quiet_frames:
ended = True
reason = 'quiet'
else:
self.ev_quiet = 0
if self.ev_n > a.max_frames:
ended = True
reason = 'timeout'
if ended:
result = self._finalize()
self._reset()
return (reason, result)
return None
def main():
ap = argparse.ArgumentParser()
ap.add_argument('path', help='log file, or - for stdin')
ap.add_argument('--enter', dest='enter_thresh', type=int, default=300)
ap.add_argument('--exit', dest='exit_thresh', type=int, default=200)
ap.add_argument('--quiet', dest='quiet_frames', type=int, default=3)
ap.add_argument('--min', dest='min_frames', type=int, default=5)
ap.add_argument('--max', dest='max_frames', type=int, default=25)
ap.add_argument('--extent-top', dest='extent_top', type=int, default=10)
ap.add_argument('--extent-bot', dest='extent_bot', type=int, default=85)
ap.add_argument('--min-traj', dest='min_traj', type=float, default=15.0)
ap.add_argument('--refractory', dest='refractory', type=int, default=15)
ap.add_argument('--ground-truth', type=int, default=0,
help='Total expected walks for accuracy calc')
ap.add_argument('-v', '--verbose', action='store_true',
help='Print every event end, including rejections')
args = ap.parse_args()
text = sys.stdin.read() if args.path == '-' else open(args.path).read()
det = Detector(args)
rejects = {}
for n, miny, maxy, c in parse_frames(text):
out = det.step(n, miny, maxy, c)
if out is None:
continue
reason, result = out
if result is None:
continue
kind, info = result
if kind == 'fire':
print(f' {info["kind"]:5} first={info["first"]:5.1f} '
f'min={info["min"]:5.1f} max={info["max"]:5.1f} '
f'last={info["last"]:5.1f} dur={info["dur"]:2d} '
f'exit={reason}')
else:
rejects[kind] = rejects.get(kind, 0) + 1
if args.verbose:
print(f' [drop {kind}]')
total = det.entries + det.exits
print(f'\n=== entries={det.entries} exits={det.exits} total={total} ===')
print(f'rejected events: {rejects}')
if args.ground_truth:
gt = args.ground_truth
acc = min(total, gt) / gt * 100
over = max(0, total - gt)
print(f'accuracy vs gt={gt}: {acc:.0f}% (over={over})')
if __name__ == '__main__':
main()
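
The `[F]` line format accepted by `LINE_RE` can be checked in isolation. The sketch below copies the regex from this file and wraps it in a hypothetical single-line helper, `parse_line`, to show which inputs match. The sample log lines are made up for illustration.

```python
import re

LINE_RE = re.compile(
    r'\[F\]\s+n=(?P<n>\d+)\s+y=(?P<miny>-?\d+)\.\.(?P<maxy>-?\d+)\s+c=(?P<c>-?\d+\.\d+)'
)


def parse_line(line):
    """Return (fg_count, min_y, max_y, centroid_y), or None if the line has no [F] record."""
    m = LINE_RE.search(line)
    if not m:
        return None
    return int(m['n']), int(m['miny']), int(m['maxy']), float(m['c'])


with_prefix = parse_line('[ 12.3s] [F] n=312 y=8..90 c=47.5')   # timestamp prefix is skipped
empty_frame = parse_line('[F] n=0 y=-1..-1 c=-1.0')             # negative placeholders match
no_decimal = parse_line('[F] n=5 y=1..2 c=47')                  # None: c needs a decimal point
```

Since the pattern uses `search`, any prefix such as the one `serial_monitor.py --timestamp` adds is tolerated. A centroid printed without a fractional part is silently skipped, so the firmware log format must always include the decimal point for the replay to see every frame.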

tools/serial_monitor.py Executable file

@@ -0,0 +1,56 @@
#!/usr/bin/env python3
# Serial monitor for ESP32. Optionally pulses RTS/DTR to reset the device
# so we capture boot output. Prefixes each line with elapsed seconds.
import serial, sys, time, argparse


def main():
    ap = argparse.ArgumentParser()
    ap.add_argument("--port", default="/dev/ttyUSB0")
    ap.add_argument("--baud", type=int, default=115200)
    ap.add_argument("--seconds", type=int, default=20)
    ap.add_argument("--reset", action="store_true",
                    help="Pulse RTS/DTR to reset the ESP32 before reading")
    ap.add_argument("--timestamp", action="store_true",
                    help="Prefix each line with elapsed seconds since boot")
    args = ap.parse_args()
    try:
        s = serial.Serial(args.port, args.baud, timeout=0.2,
                          rtscts=False, dsrdtr=False)
    except Exception as e:
        print(f"[open-fail] {e}", flush=True)
        sys.exit(2)
    if args.reset:
        s.setDTR(False)
        s.setRTS(True)
        time.sleep(0.1)
        s.setRTS(False)
        s.reset_input_buffer()
    t0 = time.time()
    end = t0 + args.seconds
    buf = b""
    while time.time() < end:
        chunk = s.read(512)
        if chunk:
            buf += chunk
            while b"\n" in buf:
                line, buf = buf.split(b"\n", 1)
                text = line.decode("utf-8", errors="replace").rstrip("\r")
                if args.timestamp:
                    sys.stdout.write(f"[{time.time()-t0:5.1f}s] {text}\n")
                else:
                    sys.stdout.write(text + "\n")
            sys.stdout.flush()
    if buf:
        text = buf.decode("utf-8", errors="replace")
        if args.timestamp:
            sys.stdout.write(f"[{time.time()-t0:5.1f}s] {text}\n")
        else:
            sys.stdout.write(text)
        sys.stdout.flush()
    s.close()


if __name__ == "__main__":
    main()
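
Note on the read loop in serial_monitor.py: bytes are accumulated across reads and only complete lines are decoded, so a line split over two reads is still printed whole. The sketch below isolates that buffering as a generator that can be exercised without a serial port. The name `split_lines` and the list-of-chunks input are assumptions made for illustration, not part of the diff.

```python
def split_lines(chunks):
    """Yield decoded text lines from an iterable of byte chunks.

    Bytes are buffered until a newline arrives, so a line that spans
    several chunks is emitted once, intact. Decoding happens per full
    line, which also keeps a multi-byte UTF-8 character from being
    decoded in halves when it straddles a chunk boundary.
    (Hypothetical helper, for illustration only.)
    """
    buf = b""
    for chunk in chunks:
        buf += chunk
        while b"\n" in buf:
            line, buf = buf.split(b"\n", 1)
            # Strip the CR of a CRLF line ending, as the monitor does.
            yield line.decode("utf-8", errors="replace").rstrip("\r")
    if buf:
        # Trailing bytes with no final newline are still reported.
        yield buf.decode("utf-8", errors="replace")


if __name__ == "__main__":
    for text in split_lines([b"boot:", b" ok\r\nwi", b"fi up\n", b"tail"]):
        print(text)
```

Keeping the splitting separate from the port handling would make this part of the tool unit-testable, in the same spirit as the flash_device.py regression tests.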


@@ -0,0 +1,17 @@
import pytest

from tools.flash_device import _reject_csv_metacharacters


def test_clean_value_accepted():
    """A value with no metacharacters should pass without exiting."""
    _reject_csv_metacharacters("device-id", "dc-0042")
    _reject_csv_metacharacters("location-id", "retailer-123")
    _reject_csv_metacharacters("wifi-ssid", "StoreWiFi-2.4GHz")
    _reject_csv_metacharacters("wifi-password", "p@ssw0rd!~#$%^&*()_+-=:;<>?/")


@pytest.mark.parametrize("bad", ["Home,Network", 'pa"ss', "ssid\nfoo", "name\rbar"])
def test_metacharacter_rejected(bad):
    with pytest.raises(SystemExit):
        _reject_csv_metacharacters("wifi-ssid", bad)
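
Note on what these tests pin down: the helper under test is not shown in this part of the diff, so the following is only a sketch of a validator that would satisfy them. It assumes the four rejected characters listed in the commit message (comma, double quote, CR, LF) and exit-on-error behaviour. The function name, the message wording, and the `--` flag prefix are all hypothetical.

```python
import sys

# Characters that would split a field or row in the generated NVS CSV,
# per the commit message: comma, double quote, CR, LF.
CSV_METACHARACTERS = (",", '"', "\r", "\n")


def reject_csv_metacharacters_sketch(flag_name, value):
    """Exit with an error if value contains a CSV metacharacter.

    Hypothetical stand-in for tools.flash_device._reject_csv_metacharacters;
    the real implementation may differ in message and structure.
    """
    for ch in CSV_METACHARACTERS:
        if ch in value:
            # sys.exit with a string prints it to stderr and raises
            # SystemExit with exit status 1.
            sys.exit(f"--{flag_name} must not contain {ch!r}")


if __name__ == "__main__":
    reject_csv_metacharacters_sketch("wifi-ssid", "StoreWiFi-2.4GHz")  # accepted
    reject_csv_metacharacters_sketch("wifi-password", "hunter2,oops")  # exits
```

Rejecting the value outright, rather than quoting or escaping it, avoids depending on how the downstream NVS partition tooling parses quoted CSV fields.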