AI-Powered Industrial Surveillance Platform
Unified Technical Blueprint — Part A: Sections 1-10
| Document Property | Value |
|---|---|
| Version | 1.0.0 |
| Classification | Technical Blueprint — Production Design |
| Target DVR | CP PLUS ORANGE CP-UVR-0801E1-CV2 |
| Channels | 8 active (scalable to 64+) |
| Resolution | 960 x 1080 per channel |
| DVR Network | 192.168.29.200/24, RTSP port 554 |
| Date | 2025 |
Cross-Reference Guide: This unified blueprint synthesizes six specialist design documents. For detailed specifications on any subsystem, refer to:
- architecture.md — Full architecture, scaling, failover, cost estimation
- video_ingestion.md — RTSP configuration, FFmpeg commands, edge gateway specs
- ai_vision.md — Model configurations, inference code, benchmarks
- database_schema.md — Complete DDL, triggers, views, RLS policies
- suspicious_activity.md — Detection algorithms, scoring engine pseudocode
- training_system.md — Training pipelines, quality gates, versioning logic
Table of Contents
- Section 1: Executive Summary
- Section 2: Kimi Swarm Team and Agent Responsibilities
- Section 3: Assumptions
- Section 4: Full Architecture
- Section 5: Data Flow from DVR to Cloud to Dashboard
- Section 6: Recommended Tech Stack
- Section 7: Database Schema
- Section 8: AI Model and Training Strategy
- Section 9: Suspicious Activity Night-Mode Design
- Section 10: Live Video Streaming Design
Section 1: Executive Summary
1.1 Project Objective
This blueprint defines the complete technical design for an AI-powered industrial surveillance platform that transforms a legacy CP PLUS 8-channel DVR system into a modern, intelligent security operations center. The platform processes real-time video from 8 camera channels, applies state-of-the-art computer vision and face recognition AI, detects suspicious activity during night hours, and provides a unified dashboard for security operators — all while maintaining the highest standards of reliability, security, and data privacy.
The system is designed around a cloud+edge hybrid architecture where all compute-intensive AI inference runs in the cloud (AWS Mumbai), while a local edge gateway handles stream ingestion, buffering, and site-local concerns. A WireGuard VPN tunnel protects all communication between edge and cloud, ensuring the DVR has zero public internet exposure.
1.2 Key Capabilities
| Capability | Description | Technology |
|---|---|---|
| Human Detection | Real-time person detection across all 8 channels at 15-20 FPS | YOLO11m + TensorRT FP16, 640x640 |
| Face Detection | Accurate face localization with 5-point landmarks for alignment | SCRFD-500M-BNKPS, 640x640 |
| Face Recognition | 512-D embedding extraction with 99.83% LFW accuracy | ArcFace R100 (IR-SE, MS1MV3) |
| Person Tracking | Persistent identity tracking across frames with occlusion recovery | ByteTrack (Kalman + IoU), 80.3% MOTA |
| Unknown Clustering | Automatic grouping of unknown faces for operator review | HDBSCAN + DBSCAN fallback, 89.5% purity |
| Night Mode Surveillance | 10-detection-module suspicious activity analysis (22:00-06:00) | Composite scoring engine with time-decay |
| AI Vibe Controls | Three intuitive presets (Relaxed/Balanced/Strict) mapping to 4 confidence levels | Dynamic threshold adjustment |
| Safe Self-Learning | Three-mode training system with conflict detection and approval workflows | MLflow + Airflow + Manual Review |
| 24/7 Reliability | Graceful degradation: video never stops, AI catch-up on recovery | Tiered storage + circuit breakers + replay |
| Real-Time Alerts | 6-level escalation (NONE to EMERGENCY) with multi-channel notifications | Telegram, WhatsApp, Email, Webhook |
| Live Dashboard | Multi-camera grid with HLS streaming and single-camera low-latency WebRTC | Next.js 14 + HLS.js + WebRTC |
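The AI Vibe Controls row above can be pictured as a small lookup table mapping the operator-facing preset to internal thresholds. A minimal sketch, with illustrative numbers only (the tuned production values live in ai_vision.md): a stricter vibe lowers the detection confidence floor (more sensitive) and raises the face-match similarity required to accept a person as known.

```python
# Illustrative preset table -- the numeric thresholds are assumptions,
# not the tuned values from ai_vision.md.
VIBE_PRESETS = {
    "relaxed":  {"detection_conf": 0.55, "face_match_cosine": 0.35},
    "balanced": {"detection_conf": 0.40, "face_match_cosine": 0.45},
    "strict":   {"detection_conf": 0.30, "face_match_cosine": 0.55},
}

def thresholds_for(vibe: str) -> dict:
    """Resolve an operator-facing vibe into internal thresholds,
    falling back to 'balanced' for unknown values."""
    return VIBE_PRESETS.get(vibe.lower(), VIBE_PRESETS["balanced"])
```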
1.3 Architecture Approach
The platform follows a cloud+edge+VPN hybrid pattern with five network security zones:
Cameras (8ch) --> DVR (local) --> Edge Gateway (local) --> WireGuard VPN --> AWS Cloud (EKS)
                                  |                        |
                                  | 2TB NVMe buffer        | Encrypted tunnel
                                  | 7-day ring buffer      | UDP 51820
                                  | FFmpeg ingestion       | ChaCha20-Poly1305
Key architectural decisions:
| Decision | Choice | Rationale |
|---|---|---|
| Cloud Provider | AWS ap-south-1 (Mumbai) | Lowest latency to India, mature managed services |
| Container Orchestration | Amazon EKS + K3s edge | Managed control plane, GPU node support, lightweight edge |
| VPN | WireGuard | ~60% faster than OpenVPN, modern crypto, simple setup |
| Message Queue | Apache Kafka (MSK) | Durable ordered log, replay capability, proven at scale |
| AI Inference | NVIDIA Triton + TensorRT | GPU-optimized, dynamic batching, model ensemble |
| Database | PostgreSQL 16 + pgvector | ACID compliance, native 512-D vector support |
| Object Storage | MinIO (edge+cloud) + S3 (archive) | S3-compatible API, tiered cost optimization |
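As a concrete illustration of the WireGuard decision, the edge-side peer configuration might look like the following sketch. The keys and the cloud endpoint hostname are placeholders; the tunnel addresses follow the 10.200.0.1/10.200.0.2 peering and MTU 1400 assumption described elsewhere in this document.

```ini
# /etc/wireguard/wg0.conf -- edge gateway peer (illustrative sketch;
# keys and the cloud endpoint hostname are placeholders)
[Interface]
Address = 10.200.0.2/32
PrivateKey = <edge-private-key>
MTU = 1400

[Peer]
# Cloud VPN endpoint
PublicKey = <cloud-public-key>
Endpoint = <cloud-vpn-host>:51820
AllowedIPs = 10.200.0.1/32, 10.100.0.0/16
PersistentKeepalive = 25
```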
1.4 Target Environment
The platform targets a CP PLUS ORANGE CP-UVR-0801E1-CV2 DVR with the following characteristics:
| Property | Value | Impact on Design |
|---|---|---|
| Brand/Model | CP PLUS ORANGE CP-UVR-0801E1-CV2 | Dahua-compatible RTSP URL scheme |
| Channels | 8 active | Initial deployment scope |
| Resolution | 960 x 1080 per channel | AI input: letterbox to 640x640 |
| LAN IP | 192.168.29.200/24 | Edge gateway on same subnet |
| RTSP Port | 554 | TCP interleaved mandatory |
| ONVIF | V2.6.1.867657 (Server V19.06) | Auto-discovery supported |
| DVR Disk | FULL (0 bytes free) | All archival is edge-managed; no DVR recording |
| VPN Access | WireGuard-secured | No public exposure; all traffic encrypted |
Critical Design Impact: The DVR disk being full means the system cannot rely on DVR-side recording or playback features. All archival storage is managed by the edge gateway's 2TB NVMe buffer and cloud tiering.
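Because nothing can be written to the DVR, the edge gateway must enforce the 7-day ring buffer itself. A minimal pruning sketch, assuming segments are plain files under a recording root (the path layout and a file-mtime retention check are assumptions, not the edge agent's actual implementation):

```python
import os
import time

RETENTION_SECS = 7 * 24 * 3600  # 7-day ring buffer

def prune_ring_buffer(root, now=None, retention=RETENTION_SECS):
    """Delete recorded segments older than the retention window,
    based on file modification time. Returns the removed paths."""
    now = now or time.time()
    removed = []
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            if now - os.path.getmtime(path) > retention:
                os.remove(path)
                removed.append(path)
    return removed
```

In practice this would run on a timer (or react to disk-pressure thresholds) so the 2TB buffer never fills.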
1.5 Key Differentiators
1. AI Vibe Controls Instead of exposing complex threshold parameters to operators, the system provides three intuitive "vibe" presets — Relaxed, Balanced, and Strict — that internally map to optimized configurations for detection sensitivity and face match strictness. This innovation makes the system accessible to non-technical security staff while maintaining AI precision.
2. Safe Self-Learning Training System The platform captures operator corrections (confirmations, corrections, merges, rejections) and feeds them back into model improvement through a carefully designed three-mode learning pipeline: Manual Only, Suggested Learning (recommended), and Approved Auto-Update. A synchronous conflict detector blocks five types of label conflicts before they reach the training dataset, ensuring model integrity.
3. 24/7 Reliability with Graceful Degradation The system is architected around a single priority: video recording never stops. If the AI inference service fails, recording continues locally with queued catch-up processing on recovery. If the VPN tunnel fails, the edge gateway maintains 7 days of local buffer. If the cloud database fails, alerts accumulate in Kafka's durable log. Every failure mode has a defined degradation strategy.
4. 10-Module Night Surveillance The suspicious activity detection system goes beyond simple motion detection to provide comprehensive behavioral analysis through 10 specialized detection modules — from intrusion and loitering to abandoned objects and repeated re-entry patterns — all combined through a composite scoring engine with exponential time-decay.
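The composite scoring engine's exponential time-decay can be sketched in a few lines. This is a simplified reference, not the production formula from suspicious_activity.md; the half-life value and the (timestamp, score) event shape are assumptions for illustration.

```python
import math
import time

def composite_score(events, now=None, half_life_s=300.0):
    """Combine per-module suspicion scores with exponential time-decay:
    each event's contribution halves every `half_life_s` seconds.
    `events` is an iterable of (timestamp, module_score) pairs."""
    now = now or time.time()
    return sum(
        score * math.exp(-math.log(2) * (now - ts) / half_life_s)
        for ts, score in events
    )
```

A loitering event from five minutes ago thus contributes half its original score, so sustained activity outranks stale one-off detections.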
1.6 Production Readiness Assessment
| Dimension | Status | Notes |
|---|---|---|
| Architecture Completeness | Production-Ready | All 12 services fully specified with resource allocations |
| AI Model Selection | Production-Ready | Industry-standard models with published benchmarks |
| Database Design | Production-Ready | 29 tables, 4 views, 8 triggers, partitioning, RLS |
| Security Architecture | Production-Ready | 7-layer defense in depth, encrypted credentials, VPN-only |
| Scaling Path | Defined | 8 -> 16 -> 32 -> 64+ cameras with concrete resource allocations |
| Failover Design | Production-Ready | Graceful degradation matrix for all failure modes |
| Estimated Timeline | 14 weeks | 4 implementation phases defined |
| Estimated Monthly Cost | ~$2,140 USD | 8-camera deployment at steady state |
Section 2: Kimi Swarm Team and Agent Responsibilities
The unified blueprint was synthesized from the outputs of 11 specialist agents, each responsible for a specific domain of the platform design.
2.1 Agent Responsibility Matrix
| # | Agent | Responsibility | Key Deliverables |
|---|---|---|---|
| 1 | Requirements Analyst | Elicited and structured all functional/non-functional requirements | Requirements traceability matrix, user stories, acceptance criteria |
| 2 | System Architect | Designed overall cloud+edge+VPN topology and service interactions | Deployment topology, 5 security zones, scaling roadmap, failover matrix |
| 3 | Video Ingestion Engineer | Specified RTSP configuration, edge gateway, and stream processing | RTSP URL patterns, FFmpeg commands, auto-reconnect logic, HLS generation |
| 4 | AI Vision Scientist | Selected and configured all CV/AI models for the inference pipeline | Model selection table, inference pipeline architecture, confidence handling |
| 5 | Database Architect | Designed complete data model with partitioning, indexing, and security | 29 tables + 4 views + 8 triggers, pgvector HNSW index, RLS policies |
| 6 | Suspicious Activity Designer | Designed 10 detection modules and composite scoring engine | Detection algorithms, scoring formula, YAML configuration schema |
| 7 | Training System Engineer | Designed self-learning pipeline with safety controls | 3 learning modes, conflict detection, quality gates, versioning |
| 8 | Frontend Developer | Designed Next.js dashboard with real-time video and alerts | Component architecture, HLS.js integration, WebSocket alerts |
| 9 | DevOps Engineer | Specified CI/CD, monitoring, and infrastructure-as-code | GitHub Actions + ArgoCD, Prometheus/Grafana, alerting rules |
| 10 | Security Architect | Designed defense-in-depth security across all layers | 7 security layers, secret management, encryption standards |
| 11 | Technical Writer (this document) | Synthesized all specialist outputs into unified blueprint | 10-section unified document with cross-references |
2.2 Agent Interaction Flow
+-----------------------------------------------------------------------------+
| KIMI SWARM TEAM ORCHESTRATION |
+-----------------------------------------------------------------------------+
| |
| Requirements Analyst |
| | |
| v |
| +---------+ +---------+ +---------+ +---------+ |
| | System |<-->| Video |<-->| AI |<-->| Database| |
| |Architect| |Ingestion| |Vision | |Architect| |
| +---------+ +---------+ +---------+ +---------+ |
| ^ | |
| | +---------+ +---------+ | |
| +---------->|Suspicious|<-->|Training |<-------+ |
| |Activity | |System | |
| |Designer | |Engineer | |
| +---------+ +---------+ |
| | |
| v |
| +---------+ +---------+ +---------+ |
| |Frontend | |DevOps | |Security | |
| |Developer| |Engineer | |Architect| |
| +---------+ +---------+ +---------+ |
| | |
| v |
| +---------------------+ |
| | Technical Writer | |
| | (Unified Blueprint) | |
| +---------------------+ |
| |
+-----------------------------------------------------------------------------+
2.3 Cross-Agent Design Consistency
The following cross-cutting concerns were harmonized across all agent outputs during synthesis:
| Concern | Resolution | Agents Coordinated |
|---|---|---|
| Video latency budget | < 100ms end-to-end (AI); ~35-65s (HLS live) | Video Ingestion, AI Vision, Frontend |
| Face embedding storage | 512-D float32, pgvector HNSW index, cosine similarity | Database, AI Vision, Training |
| Event data retention | 90 days hot (MinIO), 1 year cold (Glacier), 7 days edge | Database, Architecture, Video Ingestion |
| Alert escalation | 6 levels: NONE -> LOW -> MEDIUM -> HIGH -> CRITICAL -> EMERGENCY | Suspicious Activity, Database, Frontend |
| Model versioning | Semantic MAJOR.MINOR.PATCH with MLflow registry | Training, AI Vision, Architecture |
| Graceful degradation | Video never stops; AI catch-up on recovery | Architecture, Video Ingestion, AI Vision |
| Security zones | 5 zones: Internet -> ALB -> Application -> Data -> Edge | Architecture, Security, Video Ingestion |
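The face-embedding row above hinges on cosine similarity over 512-D vectors; in SQL this would go through pgvector's cosine-distance operator, but the metric itself is worth pinning down. A minimal pure-Python reference:

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two embedding vectors (e.g. 512-D
    ArcFace embeddings); 1.0 = same direction, 0.0 = orthogonal."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)
```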
Section 3: Assumptions
All assumptions made across the specialist designs are consolidated below. These should be validated before implementation begins.
3.1 Network and Hardware Assumptions
| ID | Assumption | Validation Method | Risk if Invalid |
|---|---|---|---|
| NW-01 | Edge gateway has dual Ethernet: one for local DVR subnet (192.168.29.0/24), one for internet/VPN | Physical site survey | Cannot bridge DVR to VPN |
| NW-02 | Site internet bandwidth >= 16 Mbps sustained upload for 8 channels | ISP speed test | Video drops, AI delays |
| NW-03 | WireGuard UDP port 51820 is not blocked by site firewall | Firewall rule check | VPN cannot establish |
| NW-04 | DVR RTSP server supports TCP interleaved transport (rtsp_transport tcp) | FFmpeg test probe | UDP fallback has packet loss |
| NW-05 | DVR supports 16+ concurrent RTSP sessions (8 channels x 2 streams) | Session stress test | Stream contention |
| NW-06 | MTU 1400 is viable through site NAT/firewall for WireGuard tunnel | Ping with DF bit test | Fragmentation issues |
| HW-01 | Intel NUC 13 Pro (i5-1340P, 16GB RAM, 2TB NVMe) is available for edge gateway | Hardware procurement | May need Jetson Orin alternative |
| HW-02 | Edge gateway has UPS backup for graceful shutdown on power loss | Electrical survey | Data corruption on hard power-off |
| HW-03 | AWS g4dn.xlarge (T4 GPU) instances are available in ap-south-1 | AWS EC2 capacity check | Need alternative GPU instance |
3.2 DVR Capabilities Assumptions
| ID | Assumption | Validation Method | Risk if Invalid |
|---|---|---|---|
| DVR-01 | DVR RTSP streams are accessible at rtsp://admin:{PASS}@192.168.29.200:554/cam/realmonitor?channel=N&subtype=M | FFmpeg connectivity test | Need alternative URL format |
| DVR-02 | DVR continues serving RTSP streams even with disk full (0 bytes free) | 24-hour stream stability test | Streams may stall |
| DVR-03 | DVR sub-stream (subtype=1) provides sufficient quality for AI inference (typically 352x288 to 704x576) | Frame quality inspection | May need main stream for AI |
| DVR-04 | DVR ONVIF server supports device discovery and stream URI retrieval | ONVIF Device Manager test | Manual camera configuration needed |
| DVR-05 | DVR channel numbering is 1-indexed (1-8) | ONVIF profile enumeration | Off-by-one errors in configuration |
| DVR-06 | DVR Digest authentication works with the provided credentials | RTSP DESCRIBE request test | May need Basic auth or different scheme |
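Assumptions DVR-01 and DVR-05 can be folded into a small helper that builds the Dahua-compatible RTSP URL for a given channel and stream. A sketch under those assumptions (the password stays the {PASS} placeholder from DVR-01):

```python
def rtsp_url(channel: int, subtype: int = 1,
             host: str = "192.168.29.200", port: int = 554,
             user: str = "admin", password: str = "{PASS}") -> str:
    """Build the Dahua-compatible RTSP URL assumed in DVR-01.
    subtype 0 = main stream, 1 = sub stream; channels are 1-indexed
    per DVR-05."""
    if not 1 <= channel <= 8:
        raise ValueError("channel must be 1-8 for this 8-channel DVR")
    return (f"rtsp://{user}:{password}@{host}:{port}"
            f"/cam/realmonitor?channel={channel}&subtype={subtype}")
```

Validating these URLs against the live DVR (e.g. with an FFmpeg probe) is exactly the DVR-01 validation step.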
3.3 Environmental Assumptions
| ID | Assumption | Impact if Invalid |
|---|---|---|
| ENV-01 | Cameras provide adequate lighting for face recognition during night hours (minimum 10 lux at face distance) | Face recognition accuracy degrades; may need IR illumination |
| ENV-02 | Camera angles allow frontal face capture at entry/exit points (yaw < 45 degrees) | Face recognition miss rate increases |
| ENV-03 | Indoor industrial environment with minimal weather interference | False positive rate from rain/shadows is low |
| ENV-04 | Maximum person-to-camera distance is within 10 meters for face recognition | Faces may be too small (< 20px) for reliable detection |
| ENV-05 | Camera positions are stable (no PTZ movement during normal operation) | Zone calibration remains valid |
3.4 Operational Assumptions
| ID | Assumption | Impact if Invalid |
|---|---|---|
| OPS-01 | Security operators will review unknown face clusters and provide identity labels daily | Unknown person database grows without enrichment |
| OPS-02 | Admin will review training suggestions at least weekly in "Suggested Learning" mode | Training queue backlog accumulates |
| OPS-03 | Site has authorized personnel who can access edge gateway for maintenance (SSH, physical) | Remote troubleshooting limited |
| OPS-04 | Alert fatigue is a genuine concern — false positive rate > 20% leads to ignored alerts | AI vibe controls and suppression tuned accordingly |
| OPS-05 | Incident video review requires 10-second pre-event and 30-second post-event clips | Clip configuration fixed |
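OPS-05 fixes the incident clip window, which reduces to a trivial but easily-misread calculation. A sketch, assuming event timestamps are plain epoch seconds:

```python
PRE_EVENT_S, POST_EVENT_S = 10, 30  # per OPS-05

def clip_window(event_ts: float):
    """Return (start, end) timestamps for the incident review clip:
    10 s before the event through 30 s after it."""
    return event_ts - PRE_EVENT_S, event_ts + POST_EVENT_S
```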
3.5 Security Assumptions
| ID | Assumption | Impact if Invalid |
|---|---|---|
| SEC-01 | WireGuard encryption (ChaCha20-Poly1305) meets organizational security requirements | May need additional encryption layer |
| SEC-02 | AWS VPC with private subnets satisfies data residency requirements for India | Compliance review needed |
| SEC-03 | Face embeddings (512-D vectors) do not constitute PII under applicable regulations | Legal review needed for biometric data handling |
| SEC-04 | Edge gateway physical security is equivalent to server room security | Tampering risk if edge is physically accessible |
| SEC-05 | DVR credentials can be stored encrypted (AES-256) in cloud database | Key management infrastructure required |
3.6 AI Performance Assumptions
| ID | Assumption | Impact if Invalid |
|---|---|---|
| AI-01 | YOLO11m TensorRT FP16 achieves > 75% person AP@50 on surveillance footage | May need fine-tuning on site-specific data |
| AI-02 | ArcFace R100 achieves > 98% Rank-1 accuracy on enrolled persons with 5+ reference images | Enrollment quality gates ensure minimum samples |
| AI-03 | HDBSCAN achieves > 89% cluster purity on 512-D face embeddings from this camera setup | Fallback to DBSCAN if density varies too much |
| AI-04 | ByteTrack maintains < 2 ID switches per 100 frames in industrial environment with occlusion | May need BoT-SORT upgrade for complex scenes |
| AI-05 | GPU (T4) can sustain 15-20 FPS processing per stream across 8 streams with batching | CPU fallback at 5-8 FPS if GPU unavailable |
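The detection models above consume square 640x640 inputs while each channel delivers 960x1080 frames, so frames are letterboxed (scaled to fit, then padded). A sketch of the geometry, independent of any specific image library:

```python
def letterbox_params(src_w: int, src_h: int, dst: int = 640):
    """Compute the scale and symmetric padding needed to fit a frame
    into a square model input while preserving aspect ratio."""
    scale = min(dst / src_w, dst / src_h)
    new_w, new_h = round(src_w * scale), round(src_h * scale)
    pad_x = (dst - new_w) // 2
    pad_y = (dst - new_h) // 2
    return scale, new_w, new_h, pad_x, pad_y

# A 960x1080 channel frame fills the input vertically and is
# padded horizontally:
scale, w, h, px, py = letterbox_params(960, 1080)
```

The same parameters are needed in reverse to map detection boxes back to frame coordinates.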
Section 4: Full Architecture
4.1 High-Level System Architecture
The platform employs a cloud+edge hybrid architecture with five network security zones. Video streams are ingested at the edge, processed by AI in the cloud, and presented through a web-based dashboard. A WireGuard VPN tunnel provides encrypted, zero-exposure connectivity between edge and cloud.
+=============================================================================+
| CLOUD+EDGE+VPN ARCHITECTURE |
+=============================================================================+
| |
| ZONE 0: INTERNET (UNTRUSTED) |
| +---------------------+ |
| | Users / Browsers | |
| | HTTPS :443 | |
| +----------+----------+ |
| | |
| v |
| ZONE 1: AWS VPC EDGE (DEMILITARIZED) |
| +--------------------------------------------------------------+ |
| | AWS ALB (:443) + WAF v2 + Rate Limit + Geo-Restriction | |
| | | | |
| | v | |
| | Traefik Ingress Controller (:8443) | |
| | - Route: /api/* -> Backend Service | |
| | - Route: /ws/* -> WebSocket Handler | |
| | - Route: / -> Next.js Web App | |
| | - TLS: Let's Encrypt auto certificates | |
| +--------------------------------------------------------------+ |
| | |
| v |
| ZONE 2: AWS VPC APPLICATION (TRUSTED) |
| +--------------------------------------------------------------+ |
| | +-------------+ +-------------+ +---------------------+ | |
| | | Stream | | AI Inference| | Suspicious Activity | | |
| | | Ingestion | | Service | | Service (Night Mode)| | |
| | | (Go/FFmpeg) | | (Triton) | | (Go/Python) | | |
| | | :8081 | | :8001 gRPC | | :8083 | | |
| | +-------------+ +-------------+ +---------------------+ | |
| | +-------------+ +-------------+ +---------------------+ | |
| | | Backend API | | Training | | Notification | | |
| | | (Go/Gin) | | Service | | Service | | |
| | | :8080 | | (PyTorch) | | (Go) | | |
| | +-------------+ +-------------+ +---------------------+ | |
| | +--------------------+ | |
| | | Web Frontend | HLS Playback Service | |
| | | (Next.js 14 :3000) | (Go :8085) | |
| | +--------------------+ | |
| +--------------------------------------------------------------+ |
| | |
| v |
| ZONE 3: AWS VPC DATA (HIGHLY RESTRICTED) |
| +--------------------------------------------------------------+ |
| | +-------------+ +-------------+ +-------------+ | |
| | | PostgreSQL | | Redis | | Kafka | | |
| | | 16 (RDS) | | 7 Cluster | | (MSK) | | |
| | | :5432 | | :6379 | | :9092 | | |
| | | pgvector | | Pub/Sub | | 3 brokers | | |
| | | HNSW index | | Streams | | 3 AZs | | |
| | +-------------+ +-------------+ +-------------+ | |
| | +-------------+ +-----------------------------------+ | |
| | | MinIO | | S3 (Cold Archive) | | |
| | | (S3-compat) | | - Standard (30d) | | |
| | | :9000 | | - IA (31-90d) | | |
| | | 10 TB | | - Glacier Deep Archive (90d+) | | |
| | +-------------+ +-----------------------------------+ | |
| +--------------------------------------------------------------+ |
| | |
| | WireGuard VPN Tunnel (UDP 51820) |
| | ChaCha20-Poly1305 encryption |
| | Cloud peer: 10.200.0.1/32 <-> Edge peer: 10.200.0.2/32 |
| v |
| ZONE 4: EDGE NETWORK (PHYSICALLY ISOLATED) |
| +--------------------------------------------------------------+ |
| | +--------------------------------------------------------+ | |
| | | EDGE GATEWAY (Intel NUC) | | |
| | | Ubuntu 22.04 LTS | K3s v1.28+ | 2TB NVMe | | |
| | | | | |
| | | +-----------------+ +-----------------+ | | |
| | | | Stream Manager | | HLS Segmenter | | | |
| | | | (Python/asyncio)| | (FFmpeg/nginx) | | | |
| | | | 8x RTSP feeds | | 2s segments | | | |
| | | +-----------------+ +-----------------+ | | |
| | | +-----------------+ +-----------------+ | | |
| | | | Frame Extractor | | Buffer Manager | | | |
| | | | (AI decimation) | | (20GB ring buf) | | | |
| | | +-----------------+ +-----------------+ | | |
| | | +--------------------------------------------------+ | | |
| | | | VPN Client (WireGuard) | Health Monitor | | | |
| | | +--------------------------------------------------+ | | |
| | +--------------------------------------------------------+ | |
| | | |
| | Local Network (192.168.29.0/24) |
| | +------------------+ +------------------+ |
| | | CP PLUS DVR | | Local Monitor | |
| | | 192.168.29.200 | | 192.168.29.10 | |
| | | 8ch | RTSP :554 | | (optional) | |
| | +------------------+ +------------------+ |
| | CH1 CH2 CH3 CH4 CH5 CH6 CH7 CH8 |
| +--------------------------------------------------------------+ |
| |
+=============================================================================+
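The edge HLS Segmenter shown in Zone 4 produces 2-second segments. A hedged sketch of the FFmpeg invocation it might use, built as an argument list; the output layout and playlist window size are assumptions, while the flags themselves are standard FFmpeg HLS muxer options:

```python
def hls_segment_args(rtsp_url: str, out_dir: str, channel: int):
    """Build an FFmpeg command that copies an RTSP feed into 2-second
    HLS segments with a rolling playlist (no re-encode)."""
    return [
        "ffmpeg",
        "-rtsp_transport", "tcp",      # TCP interleaved, per NW-04
        "-i", rtsp_url,
        "-c", "copy",                  # passthrough, no transcode
        "-f", "hls",
        "-hls_time", "2",              # 2 s segments
        "-hls_list_size", "6",         # rolling playlist window
        "-hls_flags", "delete_segments",
        f"{out_dir}/ch{channel}/index.m3u8",
    ]
```

Copy mode keeps edge CPU use minimal; transcoding, if ever needed, belongs on the cloud side.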
4.2 Service Interaction Diagram
+-----------------------------------------------------------------------------+
| SERVICE INTERACTIONS |
+-----------------------------------------------------------------------------+
| |
| INTERNET USERS |
| | |
| | HTTPS :443 |
| v |
| +---------+ +----------+ +----------+ |
| | AWS ALB |----->| Traefik |----->| Next.js | Web Frontend |
| | +WAF | | Ingress | | (SSR) | Dashboard |
| +---------+ +----------+ +----+-----+ |
| | |
| +--------------------+--------------------+ |
| | | | |
| v v v |
| +---------+ +------------+ +----------+ |
| |Backend | | WebSocket | | HLS | |
| |API (Go) | | Handler | | Playback | |
| |:8080 | | /ws/alerts | | Service | |
| +----+----+ +------------+ +----+-----+ |
| | |
| | gRPC :50051 |
| v |
| +---------+ +------------+ +----------+ +----------+ |
| | Stream | | AI | |Suspicious| |Training | |
| |Ingestion|<-->| Inference |<-->| Activity | |Service | |
| |(Go) | |(Triton) | |(Night) | |(PyTorch) | |
| +----+----+ +------+-----+ +----+-----+ +----+-----+ |
| | | | | |
| v v v v |
| +---------------------------------------------------------------+ |
| | KAFKA (MSK) | |
| | streams.raw (8 parts) ai.detections (16 parts) | |
| | alerts.critical (4 parts) training.data (30-day ret.) | |
| | notifications.* system.metrics (7-day ret.) | |
| +---------------------------------------------------------------+ |
| | | | | |
| v v v v |
| +---------+ +------------+ +----------+ +----------+ |
| |PostgreSQL| | Redis | | MinIO | | MLflow | |
| |16 +pgvec | |7 Cluster | |S3-compat | | Model | |
| |:5432 | |:6379 | |:9000 | | Registry | |
| +---------+ +------------+ +----------+ +----------+ |
| |
| Edge Gateway: WireGuard peer at 10.200.0.2/32 |
| Stream Ingestion pulls frames via VPN -> sends to Kafka |
| |
+-----------------------------------------------------------------------------+
4.3 Network Security Zones
Five security zones provide defense in depth, from the public internet to the physically isolated edge network.
+=============================================================================+
| NETWORK SECURITY ZONES |
+=============================================================================+
| |
| +---------------------------------------------------------------------+ |
| | ZONE 0: INTERNET (UNTRUSTED) | |
| | - Public users, any source IP | |
| | - AWS Shield Standard DDoS protection | |
| | - Geo-restriction: allow specific countries only | |
| +---------------------------+-----------------------------------------+ |
| | |
| | HTTPS :443 |
| v |
| +---------------------------------------------------------------------+ |
| | ZONE 1: AWS VPC EDGE (DEMILITARIZED) | |
| | - ALB + WAF v2 (SQL injection, XSS, rate limiting rules) | |
| | - Traefik Ingress (:8443) | |
| | - Auth: JWT + RBAC, API keys for edge gateway | |
| | - Public API endpoints ONLY | |
| | SG: alb-public-sg: 443 from 0.0.0.0/0 | |
| | SG: traefik-sg: 8443 from alb-sg ONLY | |
| +---------------------------+-----------------------------------------+ |
| | |
| | Internal :8080-8090 |
| v |
| +---------------------------------------------------------------------+ |
| | ZONE 2: AWS VPC APPLICATION (TRUSTED, ISOLATED) | |
| | - Stream Ingestion, AI Inference, Suspicious Activity | |
| | - Training, Backend API, Notification Services | |
| | - Pod Security: No root, read-only FS, no privilege escalation | |
| | - Network Policies: Ingress only from API GW namespace | |
| | SG: app-sg: 8080-8090 from traefik-sg ONLY | |
| +---------------------------+-----------------------------------------+ |
| | |
| | Data Layer :5432, :6379, :9092, :9000 |
| v |
| +---------------------------------------------------------------------+ |
| | ZONE 3: AWS VPC DATA (HIGHLY RESTRICTED) | |
| | - PostgreSQL (RDS), Redis (ElastiCache), Kafka (MSK) | |
| | - MinIO object storage, S3 cold archive | |
| | - Security Groups: ONLY from app-sg | |
| | - RDS: Encrypted at rest (AWS KMS), no public access | |
| | - S3: Bucket policy deny all except VPC endpoint | |
| +---------------------------+-----------------------------------------+ |
| | |
| | WireGuard VPN (UDP 51820) |
| | ChaCha20-Poly1305 |
| v |
| +---------------------------------------------------------------------+ |
| | ZONE 4: EDGE NETWORK (PHYSICALLY ISOLATED) | |
| | - Edge Gateway (Intel NUC), K3s node | |
| | - WireGuard peer, stream ingestion, local buffer | |
| | - DVR (192.168.29.200): NO internet access, local ONLY | |
| | - Edge Firewall: ALLOW 192.168.29.0/24 -> DVR :554,:80 | |
| | ALLOW OUT 51820/udp -> Cloud VPN endpoint | |
| | DENY ALL other incoming | |
| +---------------------------------------------------------------------+ |
| |
+=============================================================================+
4.4 Service Descriptions
| # | Service | Purpose | Technology | Port | Replicas |
|---|---|---|---|---|---|
| 1 | Edge Gateway Agent | RTSP stream pull, local recording, VPN endpoint, heartbeat | Go 1.21, systemd + K3s | 8080, 51820 | 1 (per site) |
| 2 | Stream Ingestion | Receive frames from edge, decode, produce to Kafka, store segments | Go 1.21, FFmpeg | 8081 | 3-20 (HPA) |
| 3 | AI Inference | GPU-accelerated detection, face recognition, embedding | Triton 2.40, TensorRT | 8000, 8001, 8002 | 1-4 (GPU HPA) |
| 4 | Suspicious Activity | Night-mode analysis, 10 detection modules, scoring engine | Python 3.11, OpenCV | 8083 | 2-8 (HPA) |
| 5 | Training Service | Model retraining, fine-tuning, A/B validation | PyTorch 2.1, CUDA 12.1 | 8084 | 0-1 (GPU spot) |
| 6 | Backend API | REST API, authentication, business logic | Go 1.21, Gin | 8080 | 3-10 (HPA) |
| 7 | Web Frontend | Dashboard, live view, timeline, analytics | Next.js 14, React 18 | 3000 | 3 (CDN) |
| 8 | Notification | Multi-channel alert dispatch (Telegram, WhatsApp, Email) | Go 1.21 | 8086 | 2-5 (HPA) |
| 9 | HLS Playback | HLS segment serving for dashboard live view | Go 1.21 | 8085 | 2-4 (HPA) |
| 10 | PostgreSQL | Primary database with pgvector for embeddings | PostgreSQL 16 (RDS) | 5432 | 1 (Multi-AZ) |
| 11 | Redis | Session store, cache, pub/sub, stream tracking | Redis 7 (ElastiCache) | 6379 | 2 shards x 2 replicas |
| 12 | Kafka | Event bus, durable log, stream replay | Apache Kafka (MSK) | 9092 | 3 brokers x 3 AZs |
| 13 | MinIO | Object storage for video, snapshots, model artifacts | MinIO (S3-compatible) | 9000, 9001 | Edge: 1, Cloud: 4 |
4.5 Physical Edge Gateway Specification
| Component | Specification |
|---|---|
| Hardware | Intel NUC 13 Pro, Core i5-1340P (12 cores, 16 threads) |
| Alternative | NVIDIA Jetson Orin NX 16GB (for on-edge AI inference) |
| RAM | 16GB DDR4-3200 (32GB recommended for 16+ channels) |
| Storage | 2TB NVMe SSD (7-day circular buffer for all 8 streams) |
| LAN | Intel i226-V 2.5GbE (local DVR subnet) |
| WAN | Second Ethernet or WiFi (internet for VPN) |
| OS | Ubuntu 22.04.4 LTS Server (no GUI) |
| Container Runtime | Docker CE 25.x + Docker Compose 2.x |
| K8s Distribution | K3s v1.28+ (lightweight, single-node or 2-node HA) |
| Power | UPS-backed, auto-restart on power loss (BIOS setting) |
| Network | Dual interface: eth0 for local DVR, eth1 for internet/VPN |
4.6 Cloud Infrastructure Specification
| Component | Specification |
|---|---|
| Region | Primary: ap-south-1 (Mumbai), DR: ap-southeast-1 (Singapore) |
| VPC | 10.100.0.0/16, 3 AZs, private subnets only for workloads |
| EKS | Managed node groups: on-demand for API, spot for batch/GPU |
| GPU Nodes | g4dn.xlarge (NVIDIA T4) for Triton inference, 1-4 auto-scaled |
| ALB | Internet-facing, WAF v2 attached, Shield Advanced optional |
| RDS | PostgreSQL 16, db.r6g.xlarge, Multi-AZ, encrypted at rest |
| ElastiCache | Redis 7, cluster mode enabled, 2 shards x 2 replicas |
| MSK (Kafka) | 3 broker nodes, kafka.m5.large, 3 AZs |
| S3 | Standard (hot 30d), IA (31-90d), Glacier Deep Archive (90d+) |
4.7 Scaling Approach
The system scales from the initial 8-camera deployment to 64+ cameras through well-defined phases:
+-----------------------------------------------------------------------------+
| CAMERA SCALING ROADMAP |
+-----------------------------------------------------------------------------+
| |
| CURRENT: 8 cameras (1 DVR) |
| +-- Edge: Intel NUC i5-1340P, 16GB RAM |
| +-- Bandwidth: ~16 Mbps upstream (2 Mbps per H.264 stream) |
| +-- Cloud AI: 1x T4 GPU (8 streams @ 1 fps, batch=8) |
| +-- Kafka: 8 partitions (streams.raw) |
| +-- PostgreSQL: db.r6g.xlarge |
| +-- Monthly cost: ~$2,140 |
| |
| PHASE 1: 16 cameras (2 DVRs / 2 sites) |
| +-- Edge: 2x Intel NUC (one per site) |
| +-- Bandwidth: ~32 Mbps |
| +-- Cloud AI: 1x T4 GPU (batch=16, still sufficient) |
| +-- Kafka: 16 partitions |
| +-- Monthly cost: ~$3,200 |
| |
| PHASE 2: 32 cameras (4 DVRs / 4 sites) |
| +-- Edge: 4x Intel NUC |
| +-- VPN: Hub-spoke model (4 edge peers -> 1 cloud endpoint) |
| +-- Bandwidth: ~64 Mbps |
| +-- Cloud AI: 2x T4 GPUs (HPA: 2-6 replicas) |
| +-- Kafka: 32 partitions |
| +-- PostgreSQL: db.r6g.2xlarge |
| +-- Monthly cost: ~$5,500 |
| |
| PHASE 3: 64 cameras (8 DVRs / 8 sites) |
| +-- Edge: 8x Intel NUC (or Jetson Orin for edge AI pre-filter) |
| +-- Bandwidth: ~128 Mbps (dedicated circuit recommended) |
| +-- Cloud AI: 4x T4 GPUs or 2x A10G (g5.2xlarge) |
| +-- Kafka: 64 partitions, consider MSK multi-cluster |
| +-- PostgreSQL: db.r6g.4xlarge + read replica |
| +-- Monthly cost: ~$9,800 |
| |
+-----------------------------------------------------------------------------+
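The per-phase figures in the roadmap follow simple linear rules: roughly 2 Mbps of upstream per H.264 stream, one streams.raw partition per camera, and one DVR/site per 8 channels. A sketch of that arithmetic (the 2 Mbps figure is the roadmap's own planning assumption, not a measurement):

```python
def scaling_estimate(cameras: int, mbps_per_stream: float = 2.0):
    """Linear capacity rules implied by the scaling roadmap:
    ~2 Mbps upstream per stream, one Kafka partition per camera,
    one 8-channel DVR per site."""
    return {
        "bandwidth_mbps": cameras * mbps_per_stream,
        "kafka_partitions": cameras,
        "sites": max(1, cameras // 8),
    }
```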
4.8 Failover and Reliability Design
The graceful degradation matrix defines behavior for every failure mode:
+=============================================================================+
| GRACEFUL DEGRADATION MATRIX |
+=============================================================================+
| |
| Failure Mode | Degradation Strategy |
| ------------------------- | ----------------------------------------------- |
| AI Inference Service DOWN | Continue recording ALL video locally |
| (GPU failure, model crash)| Events stored as "unprocessed" |
| | No real-time alerts |
| | Queue frames for later batch processing |
| | Dashboard shows "AI OFFLINE" banner |
| |
| Kafka DOWN (MSK outage) | Edge Gateway buffers locally (20GB ring buffer) |
| | Backpressure: reduce to key frames only (0.2fps)|
| | Auto-reconnect with 2x exponential backoff |
| | Replay from local buffer when Kafka recovers |
| |
| VPN Tunnel DOWN | Full local operation mode |
| (internet outage) | All recording continues locally (7-day buffer) |
| | Local alert buzzer/relay (configurable) |
| | No cloud dashboard access |
| | Auto-sync when VPN recovers |
| |
| PostgreSQL DOWN (RDS) | Alert queue builds in Kafka (durable log) |
| | Events not lost (Kafka 7-day retention) |
| | Read-only dashboard mode (Redis cache) |
| | Alert on-call engineer |
| |
| Notification Service DOWN | Alerts accumulate in DB |
| | Retry with exponential backoff |
| | Dead letter after 24 hours |
| | Dashboard shows pending count |
| |
| Edge Gateway DOWN (power) | Cloud dashboard shows "SITE OFFLINE" |
| | Last known recordings in cloud |
| | Alert sent immediately |
| | UPS: graceful shutdown, preserve data |
| |
+=============================================================================+
Priority Order (highest first):
- Video recording NEVER STOPS (local edge priority)
- Critical alerts ALWAYS FIRE (local buzzer + queued cloud alerts)
- AI inference gracefully degrades to batch catch-up on recovery
- Dashboard operates in read-only/cache mode during DB outage
- Cloud sync resumes automatically when connectivity restored
Reliability Mechanisms:
| Mechanism | Implementation | Target |
|---|---|---|
| Stream Reconnect | Exponential backoff: 1s -> 2s -> 4s -> 8s -> max 30s | < 60s recovery |
| Circuit Breaker | 5 failures -> OPEN (60s) -> HALF_OPEN (3 test calls) -> CLOSED | Prevent cascade failures |
| VPN Watchdog | Ping every 30s, restart WireGuard on 3 consecutive failures | < 90s VPN recovery |
| Kafka Producer | acks=all, retries=10, enable.idempotence=true, LZ4 compression | Zero message loss |
| Kafka Consumer | Manual offset commit AFTER DB write success | Exactly-once processing |
| Health Checks | 5-layer: K8s probes -> Service metrics -> Dependency checks -> E2E synthetic -> Edge heartbeat | < 2 min detection |
| Auto-scaling | GPU util > 80% for 2 min -> scale out; Kafka lag > 1000 for 5 min -> scale out | Proactive capacity |
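The circuit-breaker row above can be made concrete as a small state machine: 5 consecutive failures trip the breaker OPEN, calls are rejected for 60 seconds, then 3 successful probe calls in HALF_OPEN close it again. A minimal sketch of that policy (the class name and injected clock are illustrative, not from the production services):

```python
import time

class CircuitBreaker:
    """Sketch of the 5-failure / 60 s / 3-probe breaker from the table above."""

    def __init__(self, failure_threshold=5, open_seconds=60.0, probe_calls=3,
                 clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.open_seconds = open_seconds
        self.probe_calls = probe_calls
        self.clock = clock
        self.state = "CLOSED"
        self.failures = 0
        self.opened_at = 0.0
        self.probes_ok = 0

    def allow(self) -> bool:
        """Return True if a call may proceed; OPEN rejects until timeout."""
        if self.state == "OPEN":
            if self.clock() - self.opened_at >= self.open_seconds:
                self.state, self.probes_ok = "HALF_OPEN", 0
            else:
                return False
        return True  # CLOSED and HALF_OPEN both admit calls

    def record(self, success: bool) -> None:
        """Report the outcome of a call and update breaker state."""
        if success:
            if self.state == "HALF_OPEN":
                self.probes_ok += 1
                if self.probes_ok >= self.probe_calls:
                    self.state, self.failures = "CLOSED", 0
            else:
                self.failures = 0
        elif self.state == "HALF_OPEN":
            self._trip()  # any failure during probing reopens immediately
        else:
            self.failures += 1
            if self.failures >= self.failure_threshold:
                self._trip()

    def _trip(self) -> None:
        self.state = "OPEN"
        self.opened_at = self.clock()
        self.failures = 0
```

Each downstream dependency (Triton, PostgreSQL, notification endpoints) would get its own breaker instance so one failing dependency cannot cascade.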
Section 5: Data Flow from DVR to Cloud to Dashboard
This section traces the complete data journey from camera capture through AI processing to user presentation.
5.1 Overview: Seven Data Flows
+=============================================================================+
| SEVEN DATA FLOW PATHWAYS |
+=============================================================================+
| |
| Flow 1: Camera --> DVR --> Edge Gateway |
| [Analog/Digital] -> [H.264 Encode] -> [RTSP Server] |
| |
| Flow 2: Edge Gateway --> VPN --> Cloud Kafka |
| [FFmpeg ingest] -> [Frame extract] -> [Kafka Producer] |
| |
| Flow 3: Stream Ingestion --> AI Inference |
| [Kafka Consumer] -> [GPU Batch] -> [Detection + Face Recog.] |
| |
| Flow 4: AI Inference --> Events --> Database |
| [Detection results] -> [Event enrich] -> [PostgreSQL] |
| |
| Flow 5: Events --> Alerts --> Notifications |
| [Scoring engine] -> [Alert create] -> [Multi-channel send] |
| |
| Flow 6: Live Streams --> Browser Dashboard |
| [HLS segmenter] -> [Nginx relay] -> [HLS.js player] |
| |
| Flow 7: Training Feedback Loop |
| [Operator review] -> [Conflict detect] -> [Model update] |
| |
+=============================================================================+
5.2 Flow 1: Camera to DVR to Edge Gateway
Path: Analog/Digital Camera -> DVR internal encoder -> DVR RTSP server -> Edge Gateway FFmpeg client
Protocol Stack:
| Layer | Technology | Details |
|---|---|---|
| Camera Interface | Analog BNC / CVBS / AHD | CP PLUS DVR supports multiple analog standards |
| DVR Encoding | H.264 High Profile | Hardware encoder, real-time, low latency |
| DVR Storage | Internal HDD (currently FULL) | 0 bytes free — no local recording possible |
| Network Transport | RTSP over TCP (interleaved) | Mandatory for reliable NAT/VPN traversal |
| URL Pattern | rtsp://admin:{PASS}@192.168.29.200:554/cam/realmonitor?channel=N&subtype=M | N=1-8, M=0(main)/1(sub) |
| Client | FFmpeg 6.0+ | -rtsp_transport tcp -stimeout 5000000 |
| Frame Rate | 25 FPS (PAL) or 30 FPS (NTSC) | Configurable per channel |
| Resolution (main) | 960 x 1080 (per channel) | Full resolution |
| Resolution (sub) | 352 x 288 to 704 x 576 | Lower bandwidth for AI |
FFmpeg RTSP Connection Command:
ffmpeg -hide_banner -loglevel warning \
-rtsp_transport tcp \
-stimeout 5000000 \
-fflags +genpts+discardcorrupt+igndts+ignidx \
-reorder_queue_size 64 \
-i "rtsp://admin:password@192.168.29.200:554/cam/realmonitor?channel=1&subtype=0" \
-c copy -f segment -segment_time 60 -reset_timestamps 1 \
-strftime 1 "/data/buffer/ch1/%Y%m%d_%H%M%S.mkv"
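Left unattended, this command exits whenever the DVR drops the TCP session, so in practice it runs under a supervisor that applies the reconnect backoff from the Section 4.8 reliability table (1 s doubling to a 30 s cap). A sketch, with the 60-second "stable run resets backoff" rule as an assumption:

```python
import subprocess
import time

def backoff_delays(base: float = 1.0, cap: float = 30.0):
    """Yield 1, 2, 4, 8, ... seconds, capped at 30 s (Sec. 4.8 reconnect policy)."""
    delay = base
    while True:
        yield min(delay, cap)
        delay *= 2

def supervise(cmd: list) -> None:
    """Restart the FFmpeg segmenter whenever it exits.  `cmd` is the ffmpeg
    argv shown above; a run longer than 60 s (assumption) resets the backoff."""
    delays = backoff_delays()
    while True:
        started = time.monotonic()
        subprocess.run(cmd)                 # blocks until ffmpeg exits
        if time.monotonic() - started > 60:
            delays = backoff_delays()       # healthy run: start backoff over
        time.sleep(next(delays))
```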
Latency Budget:
| Stage | Latency |
|---|---|
| Camera -> DVR (analog) | ~1-5 ms |
| DVR encoding | ~50-100 ms |
| RTSP over LAN | ~1-2 ms |
| Total (camera to edge gateway) | ~52-107 ms |
5.3 Flow 2: Edge Gateway to VPN Tunnel to Cloud
Path: Edge Gateway FFmpeg -> Frame extraction -> JPEG encoding -> Kafka Producer -> WireGuard VPN -> Cloud MSK
Frame Processing Pipeline:
+------------+ +-------------+ +---------------+ +-------------+ +-----------+
| Raw RTSP | -> | FFmpeg | -> | Frame | -> | JPEG | -> | Kafka |
| H.264 | | Demux/Decode| | Decimation | | Encoder | | Producer |
| 25 FPS | | | | (1 fps) | | Quality 85 | | (LZ4) |
| 960x1080 | | | | 640x640 crop | | | | |
+------------+ +-------------+ +---------------+ +-------------+ +-----------+
FFmpeg Frame Extraction for AI:
ffmpeg -hide_banner -loglevel warning \
-rtsp_transport tcp -stimeout 5000000 \
-fflags +genpts+discardcorrupt -reorder_queue_size 64 \
-i "rtsp://admin:password@192.168.29.200:554/cam/realmonitor?channel=1&subtype=0" \
-vf "fps=1,scale=640:640:force_original_aspect_ratio=decrease,pad=640:640:(ow-iw)/2:(oh-ih)/2:black" \
-q:v 5 -f image2pipe -vcodec mjpeg pipe:1
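The image2pipe output is a bare concatenation of JPEG frames on stdout, so the consumer must split it on SOI/EOI markers before producing individual frames to Kafka. A minimal splitter sketch (the chunk size is arbitrary; JPEG byte stuffing means 0xFFD9 appears only as the end-of-image marker in FFmpeg's mjpeg output, which carries no embedded thumbnails):

```python
def iter_jpeg_frames(stream, chunk_size: int = 65536):
    """Yield complete JPEG frames from a raw MJPEG byte stream, as produced
    by the image2pipe command above, by scanning SOI (FFD8) / EOI (FFD9)."""
    buf = b""
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        buf += chunk
        while True:
            soi = buf.find(b"\xff\xd8")
            if soi < 0:
                # keep a trailing 0xFF in case the SOI marker was split
                buf = buf[-1:] if buf.endswith(b"\xff") else b""
                break
            eoi = buf.find(b"\xff\xd9", soi + 2)
            if eoi < 0:
                buf = buf[soi:]          # incomplete frame, wait for more data
                break
            yield buf[soi:eoi + 2]
            buf = buf[eoi + 2:]
```

In the real pipeline `stream` would be the ffmpeg process's stdout pipe, and each yielded frame would become one Kafka message with camera and timestamp metadata attached.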
WireGuard VPN Tunnel Configuration:
| Parameter | Value |
|---|---|
| Protocol | UDP 51820 |
| Encryption | ChaCha20-Poly1305 |
| Key Exchange | Curve25519 (ECDH) |
| Preshared Key | Enabled per-peer |
| Keepalive | 25 seconds |
| MTU | 1400 (to account for WireGuard + IP headers) |
| Cloud Endpoint | 10.200.0.1/32 (EC2 bastion or ALB) |
| Edge Endpoint | 10.200.0.2/32 |
| Route | 10.200.0.0/16 (AWS VPC) accessible from edge |
VPN watchdog script runs every 30 seconds; restarts WireGuard on 3 consecutive ping failures.
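The watchdog's restart decision reduces to a consecutive-failure counter. A sketch with the probe and restart actions injected so the policy itself is testable (the restart command and the wg0 interface name are assumptions):

```python
import time

def watchdog(probe, restart, max_failures=3, interval=30.0, cycles=None):
    """Restart WireGuard after `max_failures` consecutive failed probes.
    probe() -> bool pings the cloud tunnel address (e.g. 10.200.0.1);
    restart() runs something like `systemctl restart wg-quick@wg0`
    (interface name assumed).  `cycles` bounds the loop for testing;
    None means run forever, probing every `interval` seconds."""
    failures, n = 0, 0
    while cycles is None or n < cycles:
        n += 1
        failures = 0 if probe() else failures + 1
        if failures >= max_failures:
            restart()
            failures = 0  # give the tunnel a fresh window to recover
        time.sleep(interval)
```

A single successful ping resets the counter, so transient packet loss does not trigger a restart; only three misses in a row (about 90 seconds) do.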
Latency Budget:
| Stage | Latency |
|---|---|
| Frame extraction (FFmpeg) | ~50-100 ms |
| JPEG encoding | ~5-10 ms |
| Kafka produce (local) | ~1-2 ms |
| WireGuard tunnel | ~5-15 ms (Mumbai -> India site) |
| MSK broker | ~1-2 ms |
| Total (edge to cloud Kafka) | ~62-129 ms |
5.4 Flow 3: Stream Ingestion to AI Inference
Path: Kafka streams.raw topic -> Stream Ingestion consumer -> Triton Inference Server -> Kafka ai.detections topic
Pipeline Architecture:
+------------+ +-------------------+ +------------------+ +-------------+
| streams.raw| -> | Stream Ingestion | -> | NVIDIA Triton | -> | ai.detections |
| (8 parts) | | (Go consumer) | | (GPU inference) | | (16 parts) |
| JPEG frames| | Batch aggregator | | gRPC :8001 | | Detection |
| + metadata | | (batch=8, timeout)| | Dynamic batching | | + embeddings |
+------------+ +-------------------+ +------------------+ +-------------+
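The batch aggregator in the diagram flushes on whichever comes first: a full batch (8 frames) or a timeout, so a lone camera's frames are never stranded waiting for a batch to fill. A Python sketch of the same policy (the 250 ms max_wait is illustrative; the production consumer is written in Go):

```python
import time

class BatchAggregator:
    """Collect frames into GPU batches: flush at batch_size, or flush a
    partial batch once the oldest queued frame exceeds max_wait seconds."""

    def __init__(self, batch_size=8, max_wait=0.25, clock=time.monotonic):
        self.batch_size, self.max_wait, self.clock = batch_size, max_wait, clock
        self.items, self.first_at = [], None

    def add(self, frame):
        """Queue one frame; return a full batch if this add triggered a flush."""
        if not self.items:
            self.first_at = self.clock()
        self.items.append(frame)
        return self.flush() if len(self.items) >= self.batch_size else None

    def poll(self):
        """Return a partial batch if the oldest frame has waited too long."""
        if self.items and self.clock() - self.first_at >= self.max_wait:
            return self.flush()
        return None

    def flush(self):
        batch, self.items = self.items, []
        return batch
```

The consumer loop would call `add()` per Kafka message and `poll()` between polls, sending any returned batch to Triton over gRPC.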
Triton Model Configuration:
| Model | Inputs | Outputs | GPU Memory | Latency (P50) |
|---|---|---|---|---|
| YOLO11m-det (TensorRT FP16) | 3x640x640 float16 | Bboxes, scores, labels | ~2.1 GB | 12 ms |
| SCRFD-500M (TensorRT FP16) | 3x640x640 float16 | Bboxes, landmarks, scores | ~1.8 GB | 8 ms |
| ArcFace R100 (TensorRT FP16) | 3x112x112 float16 | 512-D embedding | ~3.2 GB | 5 ms |
Total GPU memory: ~7.1 GB (fits in T4 16 GB with 8 streams)
Latency Budget:
| Stage | Latency |
|---|---|
| Kafka consume (batch) | ~10-50 ms |
| Preprocessing (resize, normalize) | ~5-15 ms |
| YOLO11m inference (GPU) | ~12 ms (P50) |
| SCRFD face detection (GPU) | ~8 ms (P50) |
| ArcFace embedding (GPU, per face) | ~5 ms (P50) |
| Post-processing (NMS, matching) | ~10-30 ms |
| Kafka produce (results) | ~1-2 ms |
| Total (Kafka to detection output) | ~51-132 ms |
5.5 Flow 4: AI Inference to Events to Database
Path: AI Detection results -> Event enricher -> PostgreSQL (multiple tables)
Data Transformation:
+------------+ +-------------------+ +---------------------+ +------------+
| Detection | -> | Event Enricher | -> | PostgreSQL Writer | -> | events |
| results | | - Add camera_id | | - UPSERT person | | persons |
| (raw) | | - Match person | | - INSERT event | | embeddings |
| | | - Check whitelist | | - INSERT embedding | | face_crops |
+------------+ +-------------------+ +---------------------+ +------------+
Database Write Operations per Detection:
| Operation | Table | Type | Notes |
|---|---|---|---|
| Insert event record | events | INSERT | With bounding box, confidence, timestamp |
| Upsert person | persons | INSERT/UPDATE | If new face, create person record |
| Insert face crop | face_crops | INSERT | S3 URL, bounding box, quality score |
| Upsert embedding | face_embeddings | INSERT/UPDATE | 512-D vector, pgvector HNSW index |
| Increment counters | camera_stats | UPDATE | Daily aggregation |
5.6 Flow 5: Events to Alerts to Notifications
Path: AI events -> Suspicious Activity scoring engine -> Alert creation -> Notification dispatch
Scoring and Escalation:
+------------+ +-------------------+ +------------------+ +-------------+
| AI events | -> | Suspicious Activity| -> | Alert Manager | -> | Notification |
| (persons, | | Scoring Engine | | - Deduplicate | | Service |
| faces) | | - 10 modules | | - Rate limit | | - Telegram |
| | | - Composite score | | - Suppress dup | | - WhatsApp |
| | | - Time decay | | - Escalation | | - Email |
+------------+ +-------------------+ +------------------+ +-------------+
Alert Escalation Matrix:
| Score | Level | Color | Notification | Action |
|---|---|---|---|---|
| 0.00 - 0.20 | NONE | Gray | None | Log only |
| 0.20 - 0.40 | LOW | Blue | Dashboard only | Log + indicator |
| 0.40 - 0.60 | MEDIUM | Yellow | Dashboard + App push | Alert dispatched |
| 0.60 - 0.80 | HIGH | Orange | All of above + Telegram | Immediate alert |
| 0.80 - 1.00 | CRITICAL | Red | All of above + WhatsApp + Email | Critical alert |
| > 1.00 | EMERGENCY | Purple + flashing | All channels + SMS | Emergency dispatch |
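The matrix maps directly to a threshold lookup. A sketch (the half-open band boundaries are an assumption, since the table's endpoints overlap; the channel lists abbreviate the Notification column):

```python
def escalation_level(score: float):
    """Map a composite suspicion score to (level, channels) per the matrix
    above.  Bands are treated as [lo, hi) with scores above 1.00 escalating
    to EMERGENCY; exact boundary handling is an assumption."""
    bands = [
        (0.20, "NONE", []),
        (0.40, "LOW", ["dashboard"]),
        (0.60, "MEDIUM", ["dashboard", "push"]),
        (0.80, "HIGH", ["dashboard", "push", "telegram"]),
    ]
    for upper, level, channels in bands:
        if score < upper:
            return level, channels
    if score <= 1.00:
        return "CRITICAL", ["dashboard", "push", "telegram", "whatsapp", "email"]
    return "EMERGENCY", ["dashboard", "push", "telegram", "whatsapp", "email", "sms"]
```

The Alert Manager would run this after deduplication and rate limiting, dispatching one notification task per returned channel.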
5.7 Flow 6: Live Streams to Browser Dashboard
Path: DVR RTSP -> Edge Gateway FFmpeg -> HLS segmenter -> Nginx -> CDN -> Browser HLS.js
+--------+ +---------------+ +---------------+ +---------+ +----------+
| DVR | -> | Edge Gateway | -> | HLS Segmenter | -> | Nginx | -> | Browser |
| RTSP | | FFmpeg | | (2s segments) | | (relay) | | HLS.js |
| 25 FPS | | -copyts | | H.264 + AAC | | HTTPS | | Video tag|
+--------+ +---------------+ +---------------+ +---------+ +----------+
HLS Configuration:
| Parameter | Value |
|---|---|
| Segment duration | 2 seconds |
| Segment list size | 5 segments (10-second sliding window) |
| Playlist type | Live (no #EXT-X-ENDLIST) |
| Codec | H.264 High Profile + AAC-LC |
| Adaptive bitrate | 3 variants: high (3 Mbps), mid (1 Mbps), low (500 Kbps) |
Latency:
| Stage | Latency |
|---|---|
| DVR encoding | ~50-100 ms |
| RTSP to edge | ~1-2 ms |
| FFmpeg demux/remux | ~20-50 ms |
| HLS segmenting (2s) | ~2000 ms |
| Nginx relay | ~1-5 ms |
| CDN propagation | ~10-50 ms |
| HLS.js buffer | ~1-2 segments (2-4s) |
| Browser decode | ~20-50 ms |
| Total (camera to eye) | ~4.1 - 6.3 seconds (sum of stages above; dominated by segmenting + player buffer) |
5.8 Flow 7: Training Feedback Loop
Path: Operator review actions -> Conflict detection -> Training dataset -> Model training -> Quality gates -> Deployment
+------------+ +------------------+ +----------------+ +-------------+ +-----------+
| Operator | -> | Conflict | -> | Training | -> | Quality | -> | Deployment |
| Review | | Detection | | Dataset | | Gates | | (A/B test) |
| (confirm, | | (5 types) | | - Curate | | - Precision | | |
| correct, | | - Block conflicts| | - Label | | >= 0.97 | | |
| merge, | | - Queue safe | | - Augment | | - Recall | | |
| reject) | | additions | | - Version | | >= 0.95 | | |
+------------+ +------------------+ +----------------+ +-------------+ +-----------+
Training Data Flow:
| Stage | Frequency | Trigger |
|---|---|---|
| Review action collection | Continuous | Operator clicks on dashboard |
| Conflict detection | Immediate (synchronous) | Every review action |
| Training dataset build | Weekly (or on-demand) | Queue threshold or manual |
| Model training | On dataset build | Airflow DAG trigger |
| Quality gate evaluation | After training | Automated pipeline |
| A/B deployment | After quality pass | Admin approval |
| Full production | After A/B success | Auto-promote at 48h |
Section 6: Recommended Tech Stack
6.1 Technology Selection Matrix
| Layer | Technology | Version | Purpose | Rationale |
|---|---|---|---|---|
| Cloud Platform | AWS | 2025 | Infrastructure (ap-south-1 Mumbai) | Best India region latency, mature managed services |
| Container Orchestration | Amazon EKS | v1.28+ | Managed Kubernetes control plane | GPU node support, Cluster Autoscaler |
| Edge K8s | K3s | v1.28+ | Lightweight Kubernetes at edge | Single binary, resource-efficient |
| VPN | WireGuard | v1.0+ | Encrypted tunnel between edge and cloud | ~60% faster than OpenVPN, modern crypto |
| Reverse Proxy | Traefik | v2.10+ | Kubernetes Ingress controller | Native K8s integration, automatic TLS |
| AI Inference | NVIDIA Triton | 2.40 | GPU model serving, dynamic batching | Multi-framework, TensorRT optimization |
| CV Framework | OpenCV | 4.8+ | Image processing, pre/post-processing | Industry standard, Python/Go bindings |
| AI/ML Framework | PyTorch | 2.1+ | Model training, custom inference | Ecosystem, CUDA 12 support |
| Deep Learning | TensorRT | 8.6+ | GPU-optimized inference for YOLO, SCRFD, ArcFace | FP16 support, 3-5x speedup |
| Language: AI | Python | 3.11 | AI inference, training, suspicious activity detection | Ecosystem, scientific computing |
| Language: Services | Go | 1.21 | Stream ingestion, backend API, notifications | Performance, concurrency, small binaries |
| Language: Frontend | TypeScript | 5.2 | Web dashboard | Type safety, React ecosystem |
| Web Framework | Next.js | 14 (App Router) | React SSR dashboard | Server components, streaming |
| UI Library | React | 18 | Component-based UI | Concurrent features, Suspense |
| Styling | Tailwind CSS | 3.4 | Utility-first CSS | Rapid development, consistent design |
| Video Player | HLS.js | 1.4 | Browser HLS playback | MSE-based, adaptive bitrate |
| Database | PostgreSQL | 16 | Primary database, vector storage | ACID, pgvector extension |
| Vector Search | pgvector | 0.5+ | HNSW index for 512-D face embeddings | Native PostgreSQL, ivfflat+hnsw |
| Cache/Session | Redis | 7 | Session store, pub/sub, rate limiting | Data structures, cluster mode |
| Message Queue | Apache Kafka | 3.6+ (MSK) | Durable event log, stream replay | Exactly-once, retention, partitions |
| Object Storage | MinIO | latest (RELEASE.2024) | S3-compatible hot storage | Edge + cloud, erasure coding |
| Cold Archive | Amazon S3 | Standard/IA/Glacier | Tiered archival (30d/90d/365d) | Cost optimization |
| Model Registry | MLflow | 2.8+ | Model versioning, experiment tracking | Open source, S3 artifact store |
| Orchestration | Apache Airflow | 2.7+ | Training pipeline DAGs | Backfill, retries, observability |
| Monitoring | Prometheus | 2.47+ | Metrics collection | Pull-based, K8s service discovery |
| Visualization | Grafana | 10.1+ | Dashboards, alerting | Panels, annotations, shared links |
| Log Aggregation | Grafana Loki | 2.9+ | Centralized logging | Label-based, cost-effective |
| CI/CD | GitHub Actions | v4 | Build, test, lint pipelines | Native GitHub integration |
| GitOps | ArgoCD | 2.9+ | Kubernetes continuous delivery | Declarative, drift detection |
| Infrastructure | Terraform | 1.6+ | IaC for AWS resources | State management, modules |
| Secrets | AWS Secrets Manager | - | Encrypted credential storage | Rotation, IAM integration |
6.2 Hardware Requirements
Edge Gateway (Per Site)
| Component | Minimum | Recommended | High Availability |
|---|---|---|---|
| CPU | Intel i5-1340P (12 cores) | Intel i7-1370P (14 cores) | 2x Intel i7 (HA cluster) |
| RAM | 16 GB DDR4-3200 | 32 GB DDR4-3200 | 32 GB per node |
| Storage | 1 TB NVMe SSD | 2 TB NVMe SSD | 2 TB per node + NAS sync |
| Network | 1 Gbps Ethernet | 2.5 Gbps Ethernet | Dual NIC + bonding |
| GPU (optional) | None | NVIDIA Jetson Orin NX 16GB | On-edge AI pre-filtering |
| Power | UPS 600VA | UPS 1000VA | Dual PSU + generator |
Cloud GPU Nodes (AI Inference)
| Cameras | GPU | VRAM | Streams | Cost/month (spot) |
|---|---|---|---|---|
| 1-8 | g4dn.xlarge (T4) | 16 GB | 8 | ~$200-350 |
| 8-16 | g4dn.xlarge (T4) | 16 GB | 16 | ~$350-500 |
| 16-32 | g4dn.2xlarge (T4) | 16 GB | 32 | ~$600-900 |
| 32-64 | g5.2xlarge (A10G) | 24 GB | 64 | ~$1200-1800 |
| 64+ | p4d.24xlarge (8x A100) | 8 x 40 GB | 128 | ~$5000-8000 |
6.3 Software Versions Summary
| Category | Software | Version |
|---|---|---|
| Operating System | Ubuntu Server LTS | 22.04.4 |
| Container Runtime | Docker CE | 25.x |
| Container Orchestration | Kubernetes (EKS/K3s) | 1.28+ |
| AI Serving | NVIDIA Triton Inference Server | 2.40 |
| GPU Runtime | CUDA | 12.1+ |
| GPU Driver | NVIDIA Driver | 535+ |
| Deep Learning Optimization | TensorRT | 8.6+ |
| AI Framework | PyTorch | 2.1+ |
| Computer Vision | OpenCV | 4.8+ |
| Video Processing | FFmpeg | 6.0+ |
| Service Language | Go | 1.21+ |
| AI/Training Language | Python | 3.11+ |
| Frontend Framework | Next.js | 14 |
| UI Library | React | 18 |
| Database | PostgreSQL | 16 |
| Message Queue | Apache Kafka | 3.6+ |
| Cache | Redis | 7 |
| Object Storage | MinIO | 2024+ |
| CI/CD | GitHub Actions | v4 |
| GitOps | ArgoCD | 2.9+ |
| Monitoring | Prometheus + Grafana | 2.47+ / 10.1+ |
| Logging | Grafana Loki | 2.9+ |
| VPN | WireGuard | 1.0+ |
| Model Registry | MLflow | 2.8+ |
| Orchestration | Apache Airflow | 2.7+ |
| Infrastructure | Terraform | 1.6+ |
6.4 Port Reference
| Service | Port | Protocol | Location | Notes |
|---|---|---|---|---|
| DVR RTSP | 554 | TCP | 192.168.29.200 | Local network only |
| DVR HTTP | 80 | TCP | 192.168.29.200 | Admin UI, local only |
| DVR HTTPS | 443 | TCP | 192.168.29.200 | Admin UI, local only |
| DVR TCP | 25001 | TCP | 192.168.29.200 | Proprietary protocol |
| DVR UDP | 25002 | UDP | 192.168.29.200 | Proprietary protocol |
| DVR NTP | 123 | UDP | 192.168.29.200 | Time sync |
| WireGuard | 51820 | UDP | Cloud + Edge | VPN tunnel |
| Edge Admin | 8080 | TCP | 192.168.29.5 | Local admin UI |
| Edge SSH | 22 | TCP | 192.168.29.5 | Admin access only |
| Traefik HTTP | 8000 | TCP | EKS | Internal HTTP entrypoint |
| Traefik HTTPS | 8443 | TCP | EKS | Internal HTTPS entrypoint |
| ALB HTTPS | 443 | TCP | AWS | Public-facing |
| Backend API | 8080 | TCP | EKS pods | Internal service port |
| Triton HTTP | 8000 | TCP | EKS GPU nodes | Model inference HTTP |
| Triton gRPC | 8001 | TCP | EKS GPU nodes | Model inference gRPC |
| Triton Metrics | 8002 | TCP | EKS GPU nodes | Prometheus metrics |
| PostgreSQL | 5432 | TCP | RDS | VPC-private |
| Redis | 6379 | TCP | ElastiCache | VPC-private |
| Kafka | 9092 | TCP | MSK | VPC-private |
| MinIO API | 9000 | TCP | EKS + Edge | S3-compatible API |
| MinIO Console | 9001 | TCP | EKS + Edge | Admin console |
| Prometheus | 9090 | TCP | EKS | Metrics collection |
| Grafana | 3000 | TCP | EKS | Dashboards |
Section 7: Database Schema
7.1 Schema Overview
The database is designed around a relational core (PostgreSQL 16) with pgvector extension for 512-dimensional face embedding storage and similarity search. The schema consists of 29 tables, 4 views, and 8 trigger functions, organized into 10 logical domains.
Schema Philosophy:
- Strict normalization for reference data (cameras, persons, rules) to ensure data integrity
- JSONB flexibility for event metadata and configuration to accommodate evolving AI outputs
- Partitioning on all high-volume time-series tables for query performance and lifecycle management
- pgvector HNSW indexing for sub-10ms face similarity search at scale
- Row-level security (RLS) for multi-tenant site isolation
- AES-256 encryption for all stored credentials (DVR passwords, API tokens)
7.2 Entity Relationship Overview
+=============================================================================+
| ENTITY RELATIONSHIP DIAGRAM |
+=============================================================================+
| |
| SITE (1) --------------------< (N) DVR |
| | | |
| | | (1) |
| | v |
| | CAMERA (N) <------------------< (N) ALERT_RULE|
| | | | |
| | | (N) | (1) |
| | v v |
| | +---------------------------------------------------------+ |
| | | EVENT (N) -->--(1) PERSON (1)--< (N) FACE_EMBEDDING |
| | | | | |
| | | | (N) | (N) |
| | | v v |
| | | FACE_CROP (N) PERSON_CLUSTER |
| | | | |
| | | | (N) +---------+|
| | | v | Training||
| | | MEDIA_FILE (1) ----------------------------------------->| Dataset ||
| | | |---------||
| | +--------------------------------------------------------->| Job ||
| | | Model ||
| | +---------+ | Version ||
| | | Review | +---------+|
| | | Action | |
| | +---------+ |
| | ^ |
| | | (N) |
| +------------------------------------+ |
| USER (N) -->--(N) ROLE_PERMISSION |
| | |
| | (1) |
| v |
| WATCHLIST (N) -->--(N) WATCHLIST_ENTRY |
| |
| +---------+ +---------+ +---------+ +---------+ |
| | Telegram| |WhatsApp | | Email | |Webhook | |
| | Config | | Config | | Config | | Config | |
| +---------+ +---------+ +---------+ +---------+ |
| ^ ^ ^ ^ |
| | | | | |
| +--------------+-------------+--------------+ |
| | |
| NOTIFICATION_CHANNEL |
| | |
| | (1) |
| v |
| NOTIFICATION_LOG |
| |
| +---------+ +---------+ +---------+ |
| | Audit | | System | | Device | |
| | Log | | Health | | Connect.| |
| |(partitioned) | Log | | Log | |
| +---------+ +---------+ +---------+ |
| |
+=============================================================================+
7.3 Core Tables Summary
7.3.1 Site and Infrastructure Tables
| Table | Purpose | Key Fields | Rows (est.) |
|---|---|---|---|
| sites | Physical locations (factories, warehouses) | id, name, location, timezone, settings | 1-10 |
| dvrs | DVR/NVR devices per site | id, site_id, ip_address, port, username, password_encrypted, model, channels, status | 1-10 |
| cameras | Individual camera channels | id, dvr_id, channel_number, name, rtsp_url, resolution, fps, status, zone_config, zone_description | 8-64 |
7.3.2 AI Detection and Identity Tables
| Table | Purpose | Key Fields | Rows (est.) |
|---|---|---|---|
| events | All AI detection events (partitioned monthly) | id, camera_id, event_type, timestamp, confidence, bounding_box, person_id, face_crop_id, track_id | 1M-10M/month |
| persons | Known and unknown individuals | id, name, status (known/unknown/blacklisted), role, company, notes, created_at | 100-10,000 |
| face_crops | Cropped face images metadata | id, event_id, person_id, storage_path, bounding_box, quality_score, blur_score, pose_yaw, pose_pitch | 500K-5M/month |
| face_embeddings | 512-D face embeddings (pgvector) | id, person_id, face_crop_id, embedding (vector(512)), model_version, is_primary | 500K-5M |
| person_clusters | Unknown person cluster groups | id, cluster_label, representative_embedding_id, sample_count, first_seen, last_seen, status | 10-1,000 |
7.3.3 Alert and Notification Tables
| Table | Purpose | Key Fields | Rows (est.) |
|---|---|---|---|
| alert_rules | Per-camera alert configuration | id, camera_id, rule_type, name, config_json, schedule, enabled | 50-500 |
| alerts | Generated alert records | id, camera_id, rule_id, person_id, alert_type, severity, status, message | 1K-50K/month |
| notification_channels | Alert destination endpoints | id, name, channel_type, config_json, is_active | 5-20 |
| telegram_configs | Telegram Bot API credentials | id, channel_id, bot_token_encrypted, chat_id | 1-5 |
| whatsapp_configs | WhatsApp Business API credentials | id, channel_id, api_key_encrypted, phone_number_id | 1-5 |
| notification_log | Delivery status per notification | id, alert_id, channel_id, status, sent_at, error_message | 1K-50K/month |
7.3.4 Watchlist and Access Control Tables
| Table | Purpose | Key Fields | Rows (est.) |
|---|---|---|---|
| users | Dashboard users and operators | id, username, email, password_hash, role, is_active | 5-50 |
| roles | Permission roles | id, name, permissions_json | 3-10 |
| watchlists | Named monitoring lists | id, name, watch_type (vip/blacklist/custom), is_active | 5-20 |
| watchlist_entries | Persons on watchlists | id, watchlist_id, person_id, added_by, added_at | 10-1,000 |
7.3.5 Training and ML Pipeline Tables
| Table | Purpose | Key Fields | Rows (est.) |
|---|---|---|---|
| training_datasets | Curated face datasets for training | id, name, description, person_ids_json, sample_count, version, status | 10-100 |
| training_jobs | Model training job tracking | id, dataset_id, model_version_from, model_version_to, status, metrics_json | 10-100 |
| model_versions | Registry of trained model versions | id, version_string, training_job_id, metrics_json, is_production, is_rollback_available | 10-50 |
| review_actions | Operator review decisions | id, event_id, reviewer_id, action, from_person_id, to_person_id, notes | 1K-100K |
7.3.6 Media and Storage Tables
| Table | Purpose | Key Fields | Rows (est.) |
|---|---|---|---|
| media_files | Registry of stored video/images | id, file_type, storage_path, size_bytes, checksum, camera_id, event_id, retention_until | 100K-1M |
| video_clips | Video clip metadata for incidents | id, media_file_id, start_time, end_time, camera_id, event_id, duration_seconds | 10K-100K |
7.3.7 Audit and Monitoring Tables (Partitioned)
| Table | Purpose | Partition | Retention |
|---|---|---|---|
| audit_logs | All user and system actions | Monthly by timestamp | 1 year (Glacier) |
| system_health_logs | Component health metrics | Monthly by timestamp | 90 days |
| device_connectivity_logs | Camera/DVR connectivity events | Monthly by timestamp | 90 days |
7.4 Indexing Strategy
7.4.1 pgvector HNSW Index (Critical Path)
-- HNSW index for sub-10ms face similarity search
-- m / ef_construction set build-time index quality; hnsw.ef_search (runtime)
-- controls the recall/speed tradeoff (higher = more accurate, slower)
CREATE INDEX idx_face_embeddings_hnsw
ON face_embeddings
USING hnsw (embedding vector_cosine_ops)
WITH (m = 16, ef_construction = 128);
-- Query: Find top-K similar faces
SELECT person_id, 1 - (embedding <=> query_vector) AS similarity
FROM face_embeddings
WHERE is_primary = true
ORDER BY embedding <=> query_vector
LIMIT 5;
| Parameter | Value | Rationale |
|---|---|---|
| m | 16 | Number of bi-directional links per node (higher = better recall, more memory) |
| ef_construction | 128 | Build-time exploration factor (higher = better index quality) |
| ef_search (runtime SET) | 64-256 | Search-time exploration factor (SET hnsw.ef_search = 128) |
| Distance metric | Cosine similarity (<=>) | Optimal for normalized face embeddings |
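pgvector's <=> operator is cosine distance, so the query's 1 - (embedding <=> query_vector) is ordinary cosine similarity; for L2-normalized ArcFace embeddings this reduces to a dot product. A small pure-Python check of the quantity the index ranks by (illustrative only, for validating match thresholds offline):

```python
import math

def cosine_similarity(a, b):
    """Python equivalent of 1 - (embedding <=> query_vector) under
    pgvector's cosine operator; for unit-norm vectors this equals dot(a, b)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)
```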
7.4.2 B-Tree Indexes (Standard Queries)
| Table | Index | Purpose |
|---|---|---|
events |
(camera_id, timestamp DESC) |
Time-range queries per camera |
events |
(event_type, timestamp DESC) |
Filter by event type |
events |
(person_id) WHERE person_id IS NOT NULL |
Person event lookup |
face_crops |
(person_id, quality_score DESC) |
Best quality face per person |
alerts |
(status, created_at DESC) |
Pending alerts by age |
alerts |
(severity, status) |
Critical alert dashboard |
persons |
(status, name) |
Person directory with status filter |
persons |
(created_at DESC) |
Recently added persons |
media_files |
(retention_until) WHERE retention_until < NOW() + 7 days |
Expiring media cleanup |
7.5 Partitioning Strategy
All high-volume time-series tables are partitioned monthly using pg_partman for automated partition management.
+-----------------------------------------------------------------------------+
| PARTITIONING ARCHITECTURE |
+-----------------------------------------------------------------------------+
| |
| events (parent, empty) |
| +-- events_y2024m01 (Jan 2024 data) |
| +-- events_y2024m02 (Feb 2024 data) |
| +-- events_y2024m03 (Mar 2024 data) |
| +-- events_y2024m04 (Apr 2024 data) |
| +-- events_y2024m05 (May 2024 data) <-- Hot (in memory) |
| +-- events_default (fallback) |
| |
| Partition pruning: WHERE timestamp >= '2024-05-01' |
| -> Only scans events_y2024m05 |
| -> ~30x faster for time-range queries |
| |
| Managed by: pg_partman extension |
| - Auto-create: 2 months ahead |
| - Auto-drop: After retention period (detach + archive) |
| |
+-----------------------------------------------------------------------------+
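The monthly names in the diagram follow a deterministic convention. In production pg_partman generates these automatically; this helper only mirrors the y/m suffix scheme shown above, e.g. for locating which partition a given timestamp lands in:

```python
from datetime import date

def partition_name(parent: str, day: date) -> str:
    """Name of the monthly partition holding rows for `day`, matching the
    events_yYYYYmMM convention in the diagram (zero-padded month)."""
    return f"{parent}_y{day.year}m{day.month:02d}"
```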
Partitioned Tables:
| Table | Partition Key | Partition Type | Retention |
|---|---|---|---|
| events | timestamp | Monthly RANGE | 90 days hot, 1 year archive |
| audit_logs | timestamp | Monthly RANGE | 1 year total |
| system_health_logs | timestamp | Monthly RANGE | 90 days |
| device_connectivity_logs | timestamp | Monthly RANGE | 90 days |
| face_crops | created_at | Monthly RANGE | 90 days hot, 1 year archive |
7.6 Retention Policies
| Data Tier | Storage | Duration | Lifecycle |
|---|---|---|---|
| Hot Tier | PostgreSQL + MinIO | 0-30 days | Fast query, indexed, in-memory cache |
| Warm Tier | S3 Standard | 30-90 days | Available on-demand, still indexed |
| Cold Tier | S3 Infrequent Access | 90-365 days | Retrieval within minutes |
| Archive Tier | Glacier Deep Archive | 1-7 years | Retrieval within 12-48 hours |
| Compliance | Glacier Vault Lock | 7+ years | Immutable, legal hold |
Automated Cleanup:
| Task | Frequency | Mechanism |
|---|---|---|
| Expire old event partitions | Daily (pg_partman) | DETACH PARTITION + S3 upload |
| Delete expired media files | Daily | Cron job: DELETE from media_files + MinIO removal |
| Purge old notification logs | Weekly | DELETE WHERE created_at < NOW() - INTERVAL '90 days' |
| Archive face crops to S3 | Daily | Lambda: copy to S3 IA, update storage_path |
| Compress audit logs | Monthly | pglz/zstd compression on detached partitions |
| Vacuum and analyze | Continuous (autovacuum) | PostgreSQL autovacuum daemon |
7.7 Security Considerations
7.7.1 Credential Encryption
All sensitive credentials stored with AES-256 encryption:
| Table | Encrypted Field | Encryption |
|---|---|---|
| dvrs | password_encrypted | AES-256-CBC, key from AWS Secrets Manager |
| telegram_configs | bot_token_encrypted | AES-256-CBC |
| whatsapp_configs | api_key_encrypted | AES-256-CBC |
7.7.2 Row-Level Security (RLS)
For multi-site deployments, RLS policies enforce that users only see data for sites they have access to:
-- Enable RLS on critical tables
ALTER TABLE events ENABLE ROW LEVEL SECURITY;
ALTER TABLE persons ENABLE ROW LEVEL SECURITY;
ALTER TABLE alerts ENABLE ROW LEVEL SECURITY;
-- Policy: Users see only data from their assigned sites
CREATE POLICY site_isolation_events ON events
USING (camera_id IN (
SELECT c.id FROM cameras c
JOIN dvrs d ON c.dvr_id = d.id
JOIN site_users su ON d.site_id = su.site_id
WHERE su.user_id = current_setting('app.current_user_id')::UUID
));
7.7.3 Access Control
| Role | Permissions |
|---|---|
| super_admin | Full access to all sites, all operations |
| site_admin | Full access to assigned sites, user management |
| operator | View dashboards, acknowledge alerts, review persons |
| viewer | Read-only access to dashboards and events |
7.7.4 Audit Trail
The audit_logs table (partitioned monthly) captures every significant action:
| Action | Captured Data |
|---|---|
| `login` | User, IP, timestamp, MFA status, success/failure |
| `person_create` | Creator, name, initial status, source event |
| `person_update` | Updater, changed fields, old/new values |
| `alert_acknowledge` | Acknowledger, alert ID, timestamp |
| `alert_resolve` | Resolver, resolution notes |
| `training_approve` | Approver, model version, dataset version |
| `model_deploy` | Deployer, version, A/B split percentage |
| `config_change` | Changer, changed parameters, old/new values |
7.7.5 Backup Strategy
| Component | Method | Frequency | Retention |
|---|---|---|---|
| PostgreSQL | RDS automated backups | Daily | 35 days |
| PostgreSQL | Manual snapshots | Before any schema change | 90 days |
| MinIO/S3 | Cross-region replication | Continuous | 90 days in DR region |
| Face embeddings | pg_dump + vector export | Weekly | 90 days |
| Model artifacts | MLflow artifact store | On training completion | Indefinite |
Reference: For complete DDL including all CREATE TABLE statements, triggers, views, and functions, see
database_schema.md — Sections 2 through 15 contain the full schema definition with comments and constraints.
Section 8: AI Model and Training Strategy
8.1 AI Model Selection
The inference pipeline is built around three core deep learning models for human detection, face detection, and face recognition, supported by tracking, clustering, and auxiliary detection components. All neural models are optimized with TensorRT for GPU inference and run on a single NVIDIA T4 GPU with dynamic batching.
| Component | Model | Framework | Input Size | FPS (T4) | Accuracy |
|---|---|---|---|---|---|
| Human Detection | YOLO11m (Ultralytics) | PyTorch -> ONNX -> TensorRT FP16 | 640 x 640 | 213 | mAP@50: 80.5% (COCO) |
| Face Detection | SCRFD-500M-BNKPS (InsightFace) | PyTorch -> ONNX -> TensorRT FP16 | 640 x 640 | ~400 | AP_medium: 87.2% (WIDERFace) |
| Face Recognition | ArcFace R100 (IR-SE100) | PyTorch -> ONNX -> TensorRT FP16 | 112 x 112 | ~800 | 99.83% (LFW), 98.35% (MegaFace) |
| Person Tracking | ByteTrack | Native Python + NumPy | N/A | N/A | 80.3% MOTA (MOT17) |
| Unknown Clustering | HDBSCAN + DBSCAN fallback | scikit-learn | 512-D vectors | N/A | 89.5% purity, 0.855 BCubed F |
| Fall Detection | YOLOv8n-pose | TensorRT FP16 | 640 x 640 | ~300 | Part of suspicious activity |
| Object Detection | YOLOv8s | TensorRT FP16 | 640 x 640 | ~450 | Abandoned object detection |
8.1.1 Human Detection: YOLO11m
| Property | Value |
|---|---|
| Architecture | CSPDarknet backbone + PANet neck + Decoupled head |
| Parameters | 19.6 M |
| FLOPs | 68.2 B (at 640x640) |
| TensorRT Optimization | FP16, dynamic batch (1-16), layer fusion |
| GPU Memory | ~2.1 GB at batch=8 |
| Person class priority | Highest NMS score weighting for person class |
| Preprocessing | Letterbox resize to 640x640, normalize [0,1] |
Export pipeline:
# PyTorch -> ONNX -> TensorRT Engine
yolo export model=yolo11m.pt format=onnx imgsz=640 half=True opset=17 simplify=True
trtexec --onnx=yolo11m.onnx --saveEngine=yolo11m.engine --fp16 \
--minShapes=images:1x3x640x640 --optShapes=images:8x3x640x640 --maxShapes=images:16x3x640x640
8.1.2 Face Detection: SCRFD-500M-BNKPS
| Property | Value |
|---|---|
| Architecture | Single-stage detector with FPN, BN+KPS head |
| Compute budget | ~500 MFLOPs ("500M" in the model name refers to FLOPs, not parameters; the network itself has well under 1 M parameters) |
| Detects | Face bounding box + 5 facial landmarks |
| Minimum face size | 20 x 20 pixels (configurable) |
| NMS threshold | 0.45 (IoU) |
| Confidence threshold | 0.5 (minimum detection score) |
| GPU Memory | ~1.8 GB at batch=32 |
8.1.3 Face Recognition: ArcFace R100 (IR-SE100)
| Property | Value |
|---|---|
| Backbone | IR-SE100 (Improved ResNet-100 with SE blocks) |
| Training data | MS1MV3 (5.8M images, 85K identities) |
| Loss function | ArcFace additive angular margin (m=0.5) |
| Embedding dimension | 512 (float32, L2-normalized) |
| Distance metric | Cosine similarity (1 - cosine_distance) |
| Matching threshold (strict) | 0.60 |
| Matching threshold (balanced) | 0.45 |
| Matching threshold (relaxed) | 0.30 |
| GPU Memory | ~3.2 GB at batch=64 |
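The matching rule above reduces to a dot product once embeddings are L2-normalized. A minimal NumPy sketch (toy 4-D vectors stand in for the 512-D ArcFace embeddings; in production the gallery search is done by pgvector, as described in Section 8.3.1):

```python
import numpy as np

def l2_normalize(v: np.ndarray) -> np.ndarray:
    """Scale a raw embedding to unit length so dot product == cosine similarity."""
    return v / np.linalg.norm(v)

def match_embedding(query: np.ndarray, gallery: np.ndarray, threshold: float = 0.45):
    """Return (best_index, similarity) when the best cosine similarity clears the
    balanced threshold (0.45), else (None, similarity). Gallery rows are assumed
    to be L2-normalized already."""
    sims = gallery @ l2_normalize(query)   # cosine similarity per known embedding
    best = int(np.argmax(sims))
    if sims[best] >= threshold:
        return best, float(sims[best])
    return None, float(sims[best])

# Toy example: two known identities, query close to the first one
gallery = np.stack([l2_normalize(np.array([1.0, 0.0, 0.0, 0.0])),
                    l2_normalize(np.array([0.0, 1.0, 0.0, 0.0]))])
idx, sim = match_embedding(np.array([0.9, 0.1, 0.0, 0.0]), gallery)
```

A query falling below the threshold would enter the unknown-person path of Section 8.3.1 instead of being assigned a known `person_id`.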
Published benchmarks on standard datasets:
| Dataset | Accuracy | Notes |
|---|---|---|
| LFW (Labeled Faces in the Wild) | 99.83% | Unconstrained face verification |
| CFP-FP (Frontal-Profile) | 99.17% | Cross-pose evaluation |
| AgeDB-30 | 98.28% | Age-invariant recognition |
| MegaFace (1M distractors) | 98.35% | Large-scale recognition |
| IJB-C | 96.18% (TAR@FAR=1e-4) | Template-based verification |
8.2 Inference Pipeline Architecture
+=============================================================================+
| REAL-TIME INFERENCE PIPELINE |
+=============================================================================+
| |
| INPUT: RTSP Frame (640x640, 1 fps per stream) |
| | |
| v |
| +-------------------+ +-------------------+ +-------------------+ |
| | Frame Preprocessor| -> | YOLO11m Detector | -> | Person Detection | |
| | - Resize | | (TensorRT FP16) | | Results: | |
| | - Normalize | | GPU: 12ms (P50) | | - bbox (x1,y1,x2, | |
| | - NCHW layout | | Batch: 1-16 | | y2) | |
| +-------------------+ +-------------------+ | - confidence | |
| | - class (person) | |
| +---------+---------+ |
| | |
| v |
| +-------------------+ +-------------------+ +-------------------+ |
| | Face Crop Extract | <- | SCRFD-500M | <- | Face Detection | |
| | (ROI from person | | (TensorRT FP16) | | Results: | |
| | bounding box) | | GPU: 8ms (P50) | | - face bbox | |
| | | | Batch: per-face | | - 5 landmarks | |
| +-------------------+ +-------------------+ | - confidence | |
| +---------+---------+ |
| | |
| v |
| +-------------------+ +-------------------+ +-------------------+ |
| | Face Alignment | <- | ArcFace R100 | <- | Embedding Vector | |
| | (5-point affine | | (TensorRT FP16) | | 512-D float32, | |
| | transform to | | GPU: 5ms (P50) | | L2-normalized | |
| | 112x112) | | Batch: 1-64 | | | |
| +-------------------+ +-------------------+ +---------+---------+ |
| | |
| v |
| +-------------------+ +-------------------+ +-------------------+ |
| | Face Matching | <- | Person Tracking | <- | Track-to-Person | |
| | (cosine similarity| | (ByteTrack) | | Association | |
| | vs. known DB) | | CPU: 2ms/frame | | - Match embedding | |
| +-------------------+ +-------------------+ | to known persons | |
| | | | | - Create/update | |
| | | | | track | |
| v v v +-------------------+ |
| +-------------------+ |
| | Confidence Scorer | |
| | (aggregate score | |
| | for all detect) | |
| +-------------------+ |
| | |
| v |
| OUTPUT: DetectionEvent (JSON) |
| { person_id, track_id, confidence, bbox, face_crop, |
| embedding, recognized_name?, quality_scores } |
| |
+=============================================================================+
End-to-end latency budget per frame:
| Stage | GPU | CPU Fallback |
|---|---|---|
| Frame preprocessing | 2-5 ms | 5-10 ms |
| YOLO11m detection | 12 ms (P50) | 35-56 ms (ONNX+OpenVINO) |
| SCRFD face detection | 8 ms (P50) | 15-25 ms |
| ArcFace embedding (per face) | 5 ms (P50) | 12-18 ms |
| ByteTrack tracking | 2 ms | 2-5 ms |
| Post-processing | 5-10 ms | 10-20 ms |
| Total (no face) | ~29 ms | ~67-116 ms |
| Total (1 face) | ~34 ms | ~79-134 ms |
| Total (5 faces) | ~54 ms | ~127-214 ms |
8.3 Face Recognition Matching Strategy
8.3.1 Known Person Matching
+-----------------------------------------------------------------------------+
| FACE RECOGNITION MATCHING FLOW |
+-----------------------------------------------------------------------------+
| |
| New Face Embedding (512-D) |
| | |
| v |
| +-------------------+ |
| | L2 Normalize | embedding = embedding / ||embedding||_2 |
| +-------------------+ |
| | |
| v |
| +-------------------+ +-------------------+ |
| | pgvector HNSW | -> | Top-5 Candidates | |
| | Similarity Search | | (cosine distance) | |
| | ef_search=128 | +-------------------+ |
| +-------------------+ | |
| v |
| +-------------------+ +-------------------+ |
| | Threshold Check | <- | Best Match Score | |
| | (per AI Vibe) | +-------------------+ |
| +-------------------+ | |
| | | |
| +------------+-------------+ |
| | |
| +----------+----------+ |
| | | |
| v v |
| Above threshold Below threshold |
| (Recognized) (Unknown) |
| | | |
| v v |
| +------------+ +------------------+ |
| | Assign to | | Check against | |
| | known | | recent unknown | |
| | person_id | | embeddings | |
| | (with | | (5-min window) | |
| | confidence)| +--------+---------+ |
| +------------+ | |
| | |
| +--------+--------+ |
| | | |
| v v |
| Similar unknown No similar unknown |
| (same person) (new unknown) |
| | | |
| v v |
| Reuse person_id Create new |
| Update centroid unknown person |
| record |
| |
+-----------------------------------------------------------------------------+
8.3.2 AI Vibe Threshold Mapping
The AI Vibe system maps three intuitive presets to internal confidence thresholds:
| Vibe | Face Match Threshold | Detection Confidence | Use Case |
|---|---|---|---|
| Relaxed | 0.30 cosine similarity | 0.40 minimum | Known persons re-identified more easily; more false positives acceptable |
| Balanced | 0.45 cosine similarity | 0.55 minimum | Default; good precision-recall tradeoff |
| Strict | 0.60 cosine similarity | 0.70 minimum | High-security scenarios; minimize false positives |
Per-stream Vibe Selection:
- Vibe can be set per camera via dashboard
- Night mode automatically applies Strict vibe
- Alert-triggered cameras automatically upgrade to Strict for 5 minutes
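The preset table and override rules can be expressed as a small lookup. This is an illustrative sketch (the dict name and function are hypothetical, not part of the documented API); the values mirror the table above:

```python
# Vibe presets mirroring the threshold table above (names are illustrative).
AI_VIBE_PRESETS = {
    "relaxed":  {"face_match_threshold": 0.30, "detection_confidence": 0.40},
    "balanced": {"face_match_threshold": 0.45, "detection_confidence": 0.55},
    "strict":   {"face_match_threshold": 0.60, "detection_confidence": 0.70},
}

def effective_vibe(camera_vibe: str, is_night: bool, alert_active: bool) -> str:
    """Night mode and alert-triggered cameras force the Strict preset."""
    return "strict" if (is_night or alert_active) else camera_vibe

# A camera configured as Balanced, evaluated during night hours -> Strict
thresholds = AI_VIBE_PRESETS[effective_vibe("balanced", is_night=True, alert_active=False)]
```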
8.4 Unknown Person Clustering Approach
Unknown persons (faces that don't match any known person above threshold) are automatically clustered to help operators identify recurring visitors.
8.4.1 Clustering Pipeline
+-----------------------------------------------------------------------------+
| UNKNOWN PERSON CLUSTERING PIPELINE |
+-----------------------------------------------------------------------------+
| |
| Unknown Face Embeddings (streaming) |
| | |
| v |
| +-------------------+ |
| | Sliding Window | Keep last N embeddings in memory (configurable) |
| | Buffer (500) | + persistent storage for long-term clustering |
| +-------------------+ |
| | |
| v |
| +-------------------+ +-------------------+ |
| | HDBSCAN Clustering| -> | Primary clusters | min_cluster_size=5 |
| | (density-based) | | formed | min_samples=2 |
| | metric=cosine | +-------------------+ eps=auto |
| +-------------------+ | |
| | (fallback) | |
| v v |
| +-------------------+ +-------------------+ |
| | DBSCAN Fallback | | Merge with | Check: temporal gap |
| | (if HDBSCAN fails | | existing clusters | < 30 days, cosine sim |
|  | to find structure)| | - centroid       | |
| +-------------------+ | distance | |
| +-------------------+ |
| | |
| v |
| +-------------------+ |
| | Operator Review | Dashboard shows clusters |
| | Queue | pending identification |
| +-------------------+ |
| |
+-----------------------------------------------------------------------------+
8.4.2 Clustering Parameters
| Parameter | Value | Description |
|---|---|---|
| Algorithm | HDBSCAN (primary), DBSCAN (fallback) | Density-based for irregular cluster shapes |
| Distance metric | Cosine similarity | Optimal for face embeddings |
| Minimum cluster size | 5 embeddings | Minimum to form a cluster |
| Minimum samples | 2 | Core point density threshold |
| Merge threshold | 0.85 cosine similarity | Merge clusters if centroids are close |
| Temporal window | 30 days | Maximum gap between cluster appearances |
| Review trigger | 10+ embeddings | Send to operator review queue |
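The merge step in the pipeline combines the two parameters above (centroid similarity > 0.85 and temporal gap < 30 days). A minimal NumPy sketch of that check, with hypothetical function and argument names:

```python
import numpy as np

def should_merge(centroid_a, centroid_b, last_seen_gap_days,
                 sim_threshold=0.85, max_gap_days=30):
    """Merge two unknown-person clusters when their centroids are close in
    cosine similarity AND the clusters appeared within the temporal window."""
    a = centroid_a / np.linalg.norm(centroid_a)
    b = centroid_b / np.linalg.norm(centroid_b)
    return float(a @ b) >= sim_threshold and last_seen_gap_days <= max_gap_days

# Near-identical centroids seen 3 days apart -> merge
merge = should_merge(np.array([0.6, 0.8, 0.0]), np.array([0.62, 0.78, 0.02]), 3)
# Same centroids but a 45-day gap -> keep clusters separate
no_merge = should_merge(np.array([0.6, 0.8, 0.0]), np.array([0.62, 0.78, 0.02]), 45)
```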
8.4.3 Clustering Quality Targets
| Metric | Target | Measurement |
|---|---|---|
| Cluster Purity | > 89% | % of embeddings in a cluster belonging to the same person |
| BCubed F-Measure | > 0.85 | Harmonic mean of precision and recall for clustering |
| Silhouette Score | > 0.3 | Separation quality between clusters |
| False Merge Rate | < 5% | Different persons incorrectly merged |
| Split Rate | < 15% | Same person split into multiple clusters |
8.5 Confidence Handling
8.5.1 Confidence Score Computation
Each detection event carries an aggregate confidence score computed from multiple signals:
confidence_aggregate = weighted_average(
detection_confidence: 0.35 * yolo_confidence,
face_detection_quality: 0.25 * scrfd_confidence,
face_recognition_score: 0.25 * (1 - cosine_distance_to_match),
face_quality_score: 0.15 * quality_composite
)
Where quality_composite = average(
1.0 - blur_score, # Sharpness (higher is better)
1.0 - abs(pose_yaw)/90, # Frontal preference
illumination_score, # Well-lit face
resolution_adequacy # Sufficient pixels for face
)
8.5.2 Confidence Levels
| Level | Score Range | Color | Action |
|---|---|---|---|
| High Confidence | 0.80 - 1.00 | Green | Auto-accept, no review needed |
| Medium Confidence | 0.60 - 0.79 | Yellow | Accepted, flagged for periodic review |
| Low Confidence | 0.40 - 0.59 | Orange | Requires operator review within 24h |
| Very Low Confidence | 0.00 - 0.39 | Red | Rejected, not used for training |
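Bucketing an aggregate score into the four levels above is a simple threshold cascade; a sketch (level names are illustrative labels for the table rows):

```python
def confidence_level(score: float) -> str:
    """Map an aggregate confidence score to the review tier in the table above."""
    if score >= 0.80:
        return "high"        # auto-accept, no review needed
    if score >= 0.60:
        return "medium"      # accepted, periodic review
    if score >= 0.40:
        return "low"         # operator review within 24h
    return "very_low"        # rejected, excluded from training
```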
8.6 Training Workflow Overview
The safe self-learning system captures operator feedback and converts it into model improvements through a carefully controlled pipeline.
8.6.1 Three Learning Modes
| Mode | Description | Use Case | Risk Level |
|---|---|---|---|
| Manual Only | Operator explicitly triggers training runs | Highly regulated environments | Lowest |
| Suggested Learning (Recommended) | System suggests training candidates; operator approves | Standard production deployment | Low |
| Approved Auto-Update | Auto-training triggers after admin approval threshold | Mature deployment with trusted operators | Medium |
8.6.2 Training Pipeline Architecture
+=============================================================================+
| SAFE SELF-LEARNING PIPELINE |
+=============================================================================+
| |
| STEP 1: COLLECTION |
| +-------------------+ |
| | Operator Review | confirm, correct_name, merge, reject |
| | Actions | + automatic high-confidence acceptances |
| +-------------------+ |
| | |
| v |
| STEP 2: CONFLICT DETECTION (Synchronous, blocks immediately) |
| +-------------------+ +-------------------+ +-------------------+ |
| | Label Conflict | -> | If conflict found | -> | Block from training | |
| | Detector | | (5 types) | | dataset, alert admin | |
| | - Same face, diff | +-------------------+ +-------------------+ |
| | names | |
| | - Diff faces, same| |
| | name | |
| | - Merge circular | |
| | reference | |
| | - Name to already-| |
| | deleted person | |
| | - Quality below | |
| | threshold | |
| +-------------------+ |
| | |
| v |
| STEP 3: DATASET CURATION |
| +-------------------+ |
| | Training Dataset | - Collect approved examples |
| | Builder | - Balance classes (min 5 per person) |
| | | - Augmentation (flip, rotate, brightness) |
| | | - Quality filter (blur, pose, illumination) |
| | | - Train/val split (80/20) |
| +-------------------+ |
| | |
| v |
| STEP 4: MODEL TRAINING |
| +-------------------+ |
| | Training Job | - ArcFace R100 backbone |
| | (Airflow DAG) | - Fine-tuning on curated dataset |
| | | - Cosine annealing LR schedule |
| | | - Early stopping (patience=10) |
| | | - Mixed precision (AMP) |
| | | - Typical duration: 2-8 hours on V100 |
| +-------------------+ |
| | |
| v |
| STEP 5: QUALITY GATES |
| +-------------------+ +-------------------+ +-------------------+ |
| | Gate 1: Hold-out | -> | Gate 2: Compare | -> | Gate 3: Identity | |
| | evaluation | | vs current | | accuracy | |
| | (precision, | | production | | (100% known) | |
| | recall, f1) | | (no >2% regress)| | | |
| +-------------------+ +-------------------+ +-------------------+ |
| | | | |
| +------------+-------------+--------------------------+ |
| | |
| +----------+----------+ |
| | | |
| v v |
| ALL PASSED ANY FAILED |
| | | |
| v v |
| +------------+ +------------------+ |
| | Proceed to | | REJECT | |
| | Deployment | | - Log failure | |
| +------------+ | - Alert admin | |
| | - Keep in staging| |
| +------------------+ |
| |
| STEP 6: DEPLOYMENT |
| +-------------------+ |
| | A/B Testing | - Shadow mode: 0% traffic (validation) |
| | (gradual rollout) | - Canary: 5% traffic for 24h |
| | | - Monitor: latency, error rate, FP rate |
| | | - Full rollout: 100% traffic |
| | | - Rollback: < 60 seconds to previous version |
| +-------------------+ |
| |
+=============================================================================+
8.7 Model Versioning and Rollback
8.7.1 Semantic Versioning
| Version Component | Increment When | Example |
|---|---|---|
| MAJOR (X.0.0) | Full retraining, architecture change, breaking embedding change | 1.0.0 -> 2.0.0 (new backbone) |
| MINOR (x.Y.0) | Fine-tuning, significant new data (>50 new identities) | 1.0.0 -> 1.1.0 (new employees) |
| PATCH (x.y.Z) | Incremental update, centroid update, hotfix | 1.0.0 -> 1.0.1 (new photos added) |
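The version bump rules above can be sketched as a small helper (hypothetical function; `change` categories map one-to-one to the table rows):

```python
def bump_version(version: str, change: str) -> str:
    """Bump a MAJOR.MINOR.PATCH string. `change` is one of:
    'major' (retraining/architecture/breaking embedding change),
    'minor' (fine-tuning, >50 new identities),
    'patch' (incremental update, centroid update, hotfix)."""
    major, minor, patch = (int(p) for p in version.split("."))
    if change == "major":
        return f"{major + 1}.0.0"
    if change == "minor":
        return f"{major}.{minor + 1}.0"
    return f"{major}.{minor}.{patch + 1}"
```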
8.7.2 Version States
| State | Description | Transition |
|---|---|---|
| `TRAINING` | Model is being trained | Auto -> STAGING on completion |
| `STAGING` | Awaiting quality gate evaluation | Auto -> AWAITING_APPROVAL on pass |
| `AWAITING_APPROVAL` | Pending admin approval | Manual -> CANARY on approve |
| `CANARY` | 5% traffic, monitoring | Auto -> PRODUCTION on success (24h) |
| `PRODUCTION` | 100% traffic, active serving | Manual -> ARCHIVED on new version deploy |
| `ARCHIVED` | Kept for rollback, no traffic | Auto -> ROLLBACK_AVAILABLE after 30 days |
| `ROLLBACK_AVAILABLE` | Can be rolled back to | Manual -> PRODUCTION on rollback trigger |
| `DEPRECATED` | Cannot be rolled back to | Final state |
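The state table above is a finite state machine; encoding the allowed transitions explicitly makes illegal moves impossible to trigger by accident. A sketch (the expiry path from ROLLBACK_AVAILABLE to DEPRECATED is an assumption; the table does not specify how a version enters DEPRECATED):

```python
# Allowed transitions from the version-state table (auto and manual combined).
TRANSITIONS = {
    "TRAINING": {"STAGING"},
    "STAGING": {"AWAITING_APPROVAL"},
    "AWAITING_APPROVAL": {"CANARY"},
    "CANARY": {"PRODUCTION"},
    "PRODUCTION": {"ARCHIVED"},
    "ARCHIVED": {"ROLLBACK_AVAILABLE"},
    "ROLLBACK_AVAILABLE": {"PRODUCTION", "DEPRECATED"},  # DEPRECATED entry assumed
    "DEPRECATED": set(),  # terminal state
}

def can_transition(current: str, target: str) -> bool:
    """Validate a requested state change against the transition table."""
    return target in TRANSITIONS.get(current, set())
```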
8.7.3 Rollback Procedure
+-----------------------------------------------------------------------------+
| EMERGENCY ROLLBACK PROCEDURE |
+-----------------------------------------------------------------------------+
| |
| Trigger: Admin initiates rollback or automatic rollback on failure |
| |
| Step 1: Validate target version exists and is in ROLLBACK_AVAILABLE state |
| Step 2: Load target model artifacts from S3/MinIO (pre-warm GPU) |
| Step 3: Atomic switch: update model reference in Triton config |
| Step 4: Triton SIGHUP reload (zero-downtime model swap) |
| Step 5: Validate: send test inference requests, check latency |
| Step 6: If validation fails -> auto-revert to previous production |
| Step 7: If validation passes -> update database model version records |
| Step 8: Log rollback event in audit_logs |
| |
| Maximum rollback time: < 60 seconds |
| Zero inference downtime during rollback |
| |
+-----------------------------------------------------------------------------+
8.8 Quality Gates
8.8.1 Gate Thresholds
| Gate | Metric | Minimum | Maximum | Critical |
|---|---|---|---|---|
| Hold-out Evaluation | Precision | 0.97 | — | Yes (cannot override) |
| Hold-out Evaluation | Recall | 0.95 | — | Yes |
| Hold-out Evaluation | F1 Score | 0.96 | — | Yes |
| No Regression | Metric regression vs production | — | 2% | No (admin can override) |
| Identity Accuracy | Known identity recall | 100% | — | Yes |
| Latency | P99 inference latency | — | 150 ms | Yes |
| Confusion Analysis | False positive rate | — | 5% | No |
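The gate thresholds above can be evaluated with a direct check per row; the sketch below (hypothetical function and metric key names) also tracks whether any critical gate failed, since critical failures cannot be overridden by an admin:

```python
def evaluate_gates(metrics: dict) -> dict:
    """Apply the gate thresholds from the table above.
    Each entry maps gate name -> (passed, is_critical)."""
    gates = {
        "precision":       (metrics["precision"] >= 0.97, True),
        "recall":          (metrics["recall"] >= 0.95, True),
        "f1":              (metrics["f1"] >= 0.96, True),
        "no_regression":   (metrics["max_regression_pct"] <= 2.0, False),
        "identity_recall": (metrics["known_identity_recall"] >= 1.0, True),
        "p99_latency":     (metrics["p99_latency_ms"] <= 150, True),
        "false_positive":  (metrics["fp_rate_pct"] <= 5.0, False),
    }
    critical_failure = any(not ok for ok, critical in gates.values() if critical)
    return {"gates": {name: ok for name, (ok, _) in gates.items()},
            "critical_failure": critical_failure,
            "overall": all(ok for ok, _ in gates.values())}

# Values taken from the example gate report in Section 8.8.2
report = evaluate_gates({"precision": 0.9842, "recall": 0.9678, "f1": 0.9759,
                         "max_regression_pct": 0.8, "known_identity_recall": 1.0,
                         "p99_latency_ms": 128, "fp_rate_pct": 2.1})
```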
8.8.2 Quality Gate Report Example
{
"gate_run_id": "550e8400-e29b-41d4-a716-446655440000",
"candidate_model_version": "1.2.0",
"baseline_model_version": "1.1.0",
"timestamp": "2024-01-25T10:30:00Z",
"overall_result": "PASSED",
"gates": [
{
"name": "holdout_performance",
"status": "PASSED",
"critical": true,
"metrics": {
"precision": 0.9842,
"recall": 0.9678,
"f1_score": 0.9759
}
},
{
"name": "no_regression",
"status": "PASSED",
"metrics": {
"max_regression_pct": 0.8,
"per_metric": {
"precision": 0.003,
"recall": -0.008,
"f1_score": -0.002
}
}
},
{
"name": "known_identity_accuracy",
"status": "PASSED",
"metrics": {
"known_identities_tested": 142,
"perfect_accuracy": 142,
"accuracy_below_threshold": 0
}
},
{
"name": "latency_requirement",
"status": "PASSED",
"metrics": {
"p50_latency_ms": 45,
"p99_latency_ms": 128,
"threshold_ms": 150
}
}
]
}
8.8.3 Embedding Update Strategies
After a model passes quality gates and is deployed, the face embedding database must be updated. Five strategies are available:
| Strategy | When to Use | Duration | Impact |
|---|---|---|---|
| Centroid Update | Few new examples (<10 per identity), same model | Seconds | Update running mean only |
| Incremental Add | Many new examples (10-100 per identity), same model | Minutes | Add new embeddings, keep existing |
| Full Reindex | Model version changed, or >10% of identities updated | Hours | Recompute all embeddings |
| Merge and Update | Identity merge operation | Seconds | Weighted centroid merge |
| Rollback Reindex | Model rollback | Minutes | Restore previous embeddings |
Decision Matrix:
+-----------------------------------------------------------------------------+
| EMBEDDING UPDATE STRATEGY SELECTION |
+-----------------------------------------------------------------------------+
| |
| Model changed? |
| | |
| +-- YES -> FULL_REINDEX (required, embeddings are model-dependent) |
| | |
| NO -> What changed? |
| | |
| +-- Identity merge -> MERGE_AND_UPDATE |
| | |
| +-- Rollback -> ROLLBACK_REINDEX |
| | |
| +-- New examples? |
| | |
| +-- < 10 per identity, < 10% total -> CENTROID_UPDATE |
| | |
| +-- Otherwise -> INCREMENTAL_ADD |
| |
+-----------------------------------------------------------------------------+
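The decision matrix above translates into a short ordered cascade; a sketch with hypothetical parameter names:

```python
def choose_update_strategy(model_changed: bool, identity_merge: bool,
                           is_rollback: bool, new_per_identity: int,
                           pct_identities_updated: float) -> str:
    """Direct translation of the embedding-update decision matrix above."""
    if model_changed:
        return "FULL_REINDEX"        # required: embeddings are model-dependent
    if identity_merge:
        return "MERGE_AND_UPDATE"    # weighted centroid merge
    if is_rollback:
        return "ROLLBACK_REINDEX"    # restore previous embeddings
    if new_per_identity < 10 and pct_identities_updated < 10.0:
        return "CENTROID_UPDATE"     # cheap running-mean update
    return "INCREMENTAL_ADD"         # add new embeddings, keep existing
```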
Reference: For complete model export commands, INT8 calibration scripts, performance benchmarks, and the full Python module structure, see
ai_vision.md — Sections 10-14. For the complete training pipeline code, Airflow DAG definitions, and quality gate implementations, see training_system.md — Sections 5-10.
Section 9: Suspicious Activity Night-Mode Design
9.1 Overview
The suspicious activity detection system provides comprehensive behavioral analysis during night hours (22:00-06:00 by default) through 10 specialized detection modules. Each module operates on the output of the AI inference pipeline (detected persons, tracked positions, and face identities) to identify anomalous behavior patterns.
The system features a composite scoring engine that combines signals from all modules with exponential time-decay, enabling unified threat assessment and intelligent escalation. Each camera can be independently configured with custom zones, thresholds, and schedules.
9.2 Ten Detection Modules Summary
| # | Module | Description | Severity | Key CV Model |
|---|---|---|---|---|
| 1 | Intrusion Detection | Detects persons entering restricted polygon zones | HIGH (default) | YOLO11m detections + zone polygon |
| 2 | Loitering Detection | Flags persons dwelling in an area longer than threshold | MEDIUM (default) | ByteTrack + timer per track |
| 3 | Running Detection | Identifies abnormally fast movement | MEDIUM (default) | YOLOv8n-pose + optical flow speed |
| 4 | Crowding Detection | Alerts when group density exceeds threshold | HIGH (default) | DBSCAN spatial clustering |
| 5 | Fall Detection | Detects persons falling or collapsing | CRITICAL | YOLOv8n-pose keypoint analysis |
| 6 | Abandoned Object | Identifies unattended objects left behind | HIGH (default) | YOLOv8s + MOG2 background subtraction |
| 7 | After-Hours Presence | Detects any person presence during night hours | MEDIUM (default) | YOLO11m person class only |
| 8 | Zone Breach | Triggers on crossing virtual boundary lines | MEDIUM (default) | ByteTrack + line crossing algorithm |
| 9 | Repeated Re-entry | Flags patterns of entering/exiting an area multiple times | MEDIUM (default) | ByteTrack + entry/exit state machine |
| 10 | Suspicious Dwell Time | Alerts on extended presence near sensitive areas | MEDIUM (configurable) | ByteTrack + per-zone timers |
9.3 Module Details
9.3.1 Module 1: Intrusion Detection
Detects when a person enters a user-defined restricted polygon zone.
| Parameter | Default | Range | Description |
|---|---|---|---|
| `confidence_threshold` | 0.55 | 0.3-0.9 | Minimum person detection confidence |
| `overlap_threshold` | 0.30 | 0.1-0.9 | Min IoU between person bbox and zone |
| `cooldown_seconds` | 60 | 0-3600 | Cooldown before re-alerting same zone |
| `zone_severity` | HIGH | LOW/MEDIUM/HIGH | Per-zone configurable |
Algorithm:
For each detected person:
For each restricted zone polygon:
Compute IoU(person_bbox, zone_polygon)
If IoU > overlap_threshold AND confidence > confidence_threshold:
If zone not in cooldown:
Trigger INTRUSION alert
Start cooldown timer
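The spec above uses the IoU between the person bbox and the zone polygon. A common lightweight variant tests whether the person's foot point (bottom-center of the bbox) falls inside the polygon; a pure-Python ray-casting sketch of that simplified check (not the full IoU computation):

```python
def point_in_polygon(x: float, y: float, polygon) -> bool:
    """Ray-casting test: count polygon edges crossed by a horizontal ray
    cast rightward from (x, y). `polygon` is a list of (x, y) vertices."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        if (y1 > y) != (y2 > y):  # edge straddles the ray's y-coordinate
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x_cross > x:
                inside = not inside
    return inside

zone = [(0, 0), (10, 0), (10, 10), (0, 10)]   # square restricted zone
foot_point = (5.0, 9.5)                        # bottom-center of a person bbox
breach = point_in_polygon(*foot_point, zone)
```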
9.3.2 Module 2: Loitering Detection
Flags persons who remain in an area longer than a threshold.
| Parameter | Default | Range | Description |
|---|---|---|---|
| `dwell_time_threshold_seconds` | 300 | 30-1800 | Time before triggering loitering alert |
| `movement_tolerance_pixels` | 50 | 10-200 | Max centroid movement to still count as "stationary" |
| `cooldown_seconds` | 300 | 0-3600 | Cooldown after alert |
Algorithm:
For each active track:
If track centroid moved < tolerance in last N seconds:
Increment dwell timer
If dwell_timer > threshold:
Trigger LOITERING alert
Reset timer (or hold until movement detected)
Else:
Reset dwell timer
9.3.3 Module 3: Running Detection
Identifies abnormally fast movement using pose keypoints and optical flow.
| Parameter | Default | Range | Description |
|---|---|---|---|
| `speed_threshold_pixels_per_second` | 150 | 50-500 | Pixel speed threshold |
| `speed_threshold_kmh` | 15.0 | 5-40 | Real-world speed (requires calibration) |
| `confirmation_frames` | 3 | 1-10 | Consecutive frames to confirm running |
Algorithm:
For each active track:
Compute torso keypoint displacement between frames
Convert pixel speed to km/h (if calibration available)
Apply Farneback optical flow for refinement
If speed > threshold for confirmation_frames:
Trigger RUNNING alert
9.3.4 Module 4: Crowding Detection
Alerts when person group density exceeds threshold.
| Parameter | Default | Range | Description |
|---|---|---|---|
| `count_threshold` | 5 | 2-50 | Minimum person count in cluster |
| `area_threshold` | 0.15 | 0.05-0.5 | Fraction of frame covered by group |
| `density_threshold` | 0.05 | 0.01-0.2 | Persons per square meter (calibrated) |
| `dbscan_eps` | 0.08 | 0.01-0.3 | DBSCAN neighborhood radius (normalized) |
Algorithm:
Collect all person centroids in current frame
Run DBSCAN(eps=0.08, min_samples=2) on centroids
For each cluster:
If cluster_size >= count_threshold OR cluster_area >= area_threshold:
Trigger CROWDING alert
9.3.5 Module 5: Fall Detection
Detects persons falling or collapsing using pose keypoint analysis.
| Parameter | Default | Range | Description |
|---|---|---|---|
| `fall_score_threshold` | 0.75 | 0.5-0.95 | Combined fall confidence score |
| `min_keypoint_confidence` | 0.30 | 0.1-0.5 | Minimum keypoint detection confidence |
| `torso_angle_threshold_deg` | 45 | 30-75 | Torso angle from vertical to trigger |
| `aspect_ratio_threshold` | 1.2 | 0.8-2.0 | Width/height ratio of person bbox |
| `temporal_confirmation_ms` | 1000 | 500-3000 | Duration to confirm fall (not just a bend) |
Algorithm:
For each detected person with pose keypoints:
Compute torso angle from vertical (using shoulder-hip line)
Compute bbox aspect ratio
Check if person is on ground (feet keypoint confidence drops)
Calculate fall_score = weighted_combination(angle, aspect_ratio, ground_contact)
If fall_score > threshold AND duration > confirmation_ms:
Trigger FALL alert (CRITICAL severity)
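The torso-angle term of the fall score is plain geometry on the shoulder and hip keypoint midpoints. A sketch (function name is illustrative; image coordinates assumed, with y growing downward):

```python
import math

def torso_angle_from_vertical(shoulder_mid, hip_mid) -> float:
    """Angle in degrees between the shoulder-to-hip line and the vertical
    image axis. An upright torso yields ~0 deg; a person lying down
    approaches 90 deg, crossing the 45 deg default threshold."""
    dx = hip_mid[0] - shoulder_mid[0]
    dy = hip_mid[1] - shoulder_mid[1]
    return math.degrees(math.atan2(abs(dx), abs(dy)))

upright = torso_angle_from_vertical((100, 50), (100, 150))   # vertical torso
fallen = torso_angle_from_vertical((100, 100), (190, 110))   # near-horizontal torso
```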
9.3.6 Module 6: Abandoned Object Detection
Identifies unattended objects using background subtraction and object detection.
| Parameter | Default | Range | Description |
|---|---|---|---|
| `unattended_time_threshold_seconds` | 60 | 10-600 | Time before object is considered abandoned |
| `proximity_threshold_pixels` | 100 | 20-300 | Max distance from owner before "unattended" |
| `watchlist_classes` | ["backpack", "suitcase", "box", "bag"] | — | Object classes to monitor |
| `bg_learning_rate` | 0.005 | 0.001-0.01 | MOG2 background model learning rate |
Algorithm:
Run YOLOv8s to detect objects in watchlist_classes
Run MOG2 background subtraction to identify static foreground
For each detected object:
Track owner proximity (nearest person)
If owner distance > threshold AND object stationary > time_threshold:
Trigger ABANDONED_OBJECT alert
9.3.7 Module 7: After-Hours Presence
Simple but effective: any person detected during night hours triggers an alert.
| Parameter | Default | Range | Description |
|---|---|---|---|
| `detection_confidence_threshold` | 0.50 | 0.3-0.9 | Minimum person detection confidence |
| `min_detection_frames` | 5 | 1-30 | Frames to confirm (avoid false positives) |
| `check_authorized_personnel` | false | true/false | If true, check against known persons whitelist |
9.3.8 Module 8: Zone Breach
Detects crossing of virtual boundary lines (directional or bidirectional).
| Parameter | Default | Range | Description |
|---|---|---|---|
| `boundary_lines` | [] (user-defined) | — | Array of {start, end, direction, severity} |
| `allowed_direction` | "both" | both/a_to_b/b_to_a | Which direction is allowed |
| `crossing_threshold_pixels` | 20 | 5-100 | Min distance past line to trigger |
| `cooldown_seconds` | 30 | 0-3600 | Cooldown per (track, line) pair |
Algorithm:
For each active track:
For each boundary line:
Check if track centroid crosses line in forbidden direction
Using line equation: ax + by + c = 0, check sign change
If crossed AND distance_past_line > threshold:
Trigger ZONE_BREACH alert
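The sign-change test in the algorithm above is the 2-D cross product, which is equivalent to evaluating the line equation ax + by + c = 0 at a point. A sketch (function names are illustrative; the distance-past-line threshold is omitted for brevity):

```python
def line_side(p, a, b) -> float:
    """Signed position of point p relative to the directed line a -> b:
    positive on one side, negative on the other, zero on the line."""
    return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])

def crossed(prev_pt, curr_pt, a, b) -> bool:
    """A track crosses the boundary when consecutive centroids lie on
    opposite sides of the line (sign change between frames)."""
    return line_side(prev_pt, a, b) * line_side(curr_pt, a, b) < 0

# Vertical boundary from (5, 0) to (5, 10); track centroid moves across it
did_cross = crossed((3, 5), (7, 5), (5, 0), (5, 10))
```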
9.3.9 Module 9: Repeated Re-entry Patterns
Detects suspicious patterns of entering and exiting an area multiple times.
| Parameter | Default | Range | Description |
|---|---|---|---|
| `reentry_zone` | Full frame | polygon | Area to monitor for entries/exits |
| `time_window_seconds` | 600 | 60-3600 | Time window for counting cycles |
| `reentry_threshold` | 3 | 2-10 | Min entry/exit cycles to trigger |
| `min_cycle_duration_seconds` | 30 | 5-300 | Min duration of one cycle |
State Machine:
For each track:
Track state: OUTSIDE -> ENTERING -> INSIDE -> EXITING -> OUTSIDE
Each complete cycle (entry + exit) increments counter
If cycle_count >= threshold within time_window:
Trigger REENTRY_PATTERN alert
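The state machine above can be sketched as a small per-track counter with a sliding time window (class name is hypothetical; the `min_cycle_duration_seconds` check is omitted for brevity):

```python
class ReentryCounter:
    """Counts complete entry/exit cycles for one track within a sliding
    time window, per the state machine above."""

    def __init__(self, threshold: int = 3, window_s: float = 600.0):
        self.inside = False
        self.cycles = []          # timestamps of completed cycles
        self.threshold = threshold
        self.window_s = window_s

    def update(self, now: float, in_zone: bool) -> bool:
        """Feed one observation; returns True when the alert should fire."""
        if in_zone and not self.inside:
            self.inside = True                 # OUTSIDE -> INSIDE
        elif not in_zone and self.inside:
            self.inside = False                # INSIDE -> OUTSIDE: one cycle done
            self.cycles.append(now)
        # Drop cycles that fell out of the sliding window
        self.cycles = [t for t in self.cycles if now - t <= self.window_s]
        return len(self.cycles) >= self.threshold

rc = ReentryCounter(threshold=2, window_s=600)
events = [(0, True), (40, False), (80, True), (120, False)]  # two full cycles
fired = any(rc.update(t, in_zone) for t, in_zone in events)
```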
9.3.10 Module 10: Suspicious Dwell Time
Extended presence near sensitive areas (different from general loitering).
| Parameter | Default | Range | Description |
|---|---|---|---|
| `sensitive_zones` | [] (user-defined) | — | Zones with custom dwell thresholds |
| `default_dwell_threshold_seconds` | 120 | 10-1800 | Default threshold |
| `max_gap_seconds` | 5.0 | 1.0-30.0 | Max disappearance gap before timer reset |
Predefined zone types with default thresholds:
| Zone Type | Default Threshold | Default Severity |
|---|---|---|
| `main_entrance` | 60s | MEDIUM |
| `emergency_exit` | 30s | HIGH |
| `equipment_room` | 45s | HIGH |
| `storage_area` | 120s | MEDIUM |
| `elevator_bank` | 90s | LOW |
| `parking_access` | 60s | MEDIUM |
9.4 Activity Scoring Engine
9.4.1 Composite Score Formula
All 10 modules feed into a unified scoring engine that produces a single suspicious activity score per camera:
S_total(t) = SUM_i( weight_i * signal_i(t) * decay(t - t_i) ) + bonus_cross_module
Where:
weight_i: module-specific weight (see table below)
signal_i(t): normalized signal value from module i [0, 1]
decay(delta_t): exponential time-decay function
bonus_cross_module: extra score when multiple modules fire simultaneously
t_i: timestamp of most recent event from module i
9.4.2 Module Weights
| Module | Weight | Signal Source | Signal Range |
|---|---|---|---|
| Intrusion Detection | 0.25 | overlap_ratio * confidence | 0.0 - 1.0 |
| Loitering Detection | 0.15 | dwell_ratio (dwell_time / threshold) | 0.0 - 1.0+ |
| Running Detection | 0.10 | speed_ratio normalized | 0.0 - 1.0+ |
| Crowding Detection | 0.12 | crowd_density_score | 0.0 - 1.0 |
| Fall Detection | 0.20 | fall_confidence_score | 0.0 - 1.0 |
| Abandoned Object | 0.18 | unattended_ratio (duration / threshold) | 0.0 - 1.0+ |
| After-Hours Presence | 0.05 | binary (1 if detected) * zone_severity_multiplier | 0.0 - 1.0 |
| Zone Breach | 0.12 | severity_mapped (LOW=0.3, MED=0.6, HIGH=1.0) | 0.0 - 1.0 |
| Re-entry Patterns | 0.10 | cycle_ratio (count / threshold) | 0.0 - 1.0+ |
| Suspicious Dwell | 0.13 | dwell_ratio (duration / zone_threshold) | 0.0 - 1.0+ |
Note: Weights sum to 1.40 — this is intentional to allow cross-module amplification when multiple modules fire simultaneously.
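Combining the formula, the weight table, and the decay function, a minimal scoring sketch might look like the following. The module keys and the shape of `events` are illustrative, not the production schema.

```python
import math

WEIGHTS = {  # from the module weights table above
    "intrusion": 0.25, "loitering": 0.15, "running": 0.10, "crowding": 0.12,
    "fall": 0.20, "abandoned_object": 0.18, "after_hours": 0.05,
    "zone_breach": 0.12, "reentry": 0.10, "suspicious_dwell": 0.13,
}

def time_decay(delta_t, half_life=300):
    """Exponential decay with a 5-minute half-life."""
    return math.exp(-0.693 * delta_t / half_life)

def composite_score(events, now, cross_module_bonus=0.0):
    """events: {module_name: (signal, event_timestamp)} — latest event per module."""
    score = sum(
        WEIGHTS[module] * signal * time_decay(now - ts)
        for module, (signal, ts) in events.items()
    )
    return score + cross_module_bonus
```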
9.4.3 Time-Decay Function
import math

def time_decay(delta_t_seconds, half_life=300):
    """Exponential decay with 5-minute half-life by default."""
    return math.exp(-0.693 * delta_t_seconds / half_life)

# Decay reference:
#   0 min -> 1.000 (full contribution)
#   1 min -> 0.871
#   5 min -> 0.500
#  10 min -> 0.250
#  20 min -> 0.063
#  30 min -> 0.016 (effectively zero)
9.4.4 Cross-Module Amplification Bonus
When multiple modules detect simultaneously for the same track or in close proximity:
def compute_cross_module_bonus(active_signals, n_same_track_signals=0,
                               n_same_zone_signals=0, proximity_weight=0.15):
    n_modules = len(active_signals)
    if n_modules <= 1:
        return 0.0
    # Base bonus: +15% per additional module
    base_bonus = proximity_weight * (n_modules - 1)
    # Track overlap: same person triggering multiple rules -> higher threat
    track_bonus = 0.10 * (n_same_track_signals - 1) if n_same_track_signals >= 2 else 0.0
    # Zone overlap: multiple signals in same zone -> higher threat
    zone_bonus = 0.08 * (n_same_zone_signals - 1) if n_same_zone_signals >= 2 else 0.0
    return min(base_bonus + track_bonus + zone_bonus, 0.50)  # Cap at +0.50
9.4.5 Escalation Thresholds
| Score Range | Threat Level | Color | Actions |
|---|---|---|---|
| 0.00 - 0.20 | NONE | Gray | Log only, no alert |
| 0.20 - 0.40 | LOW | Blue | Log + dashboard indicator |
| 0.40 - 0.60 | MEDIUM | Yellow | Log + non-urgent alert dispatch |
| 0.60 - 0.80 | HIGH | Orange | Log + immediate alert + highlight |
| 0.80 - 1.00 | CRITICAL | Red | Log + all channels + security dispatch recommendation |
| > 1.00 | EMERGENCY | Purple/Flashing | All channels + automatic escalation to security lead |
9.5 Night Mode Scheduler
9.5.1 Automatic Schedule
| Parameter | Default | Configurable |
|---|---|---|
| Start time | 22:00 (10 PM) | Yes, per camera |
| End time | 06:00 (6 AM) | Yes, per camera |
| Gradual transition | 15 minutes | Yes (0-60 min) |
| Timezone | Local site timezone | Yes |
| Override | Manual toggle available | Admin only |
9.5.2 Gradual Transition
During the 15-minute transition window, sensitivity ramps linearly:
Transition Start (21:45)      Night Full (22:00)      Transition End (22:15)
        |                             |                          |
        v                             v                          v
Sensitivity:  0% --- 25% --- 50% --- 75% --- 100% ------ 100% ------ 100%
             |______|______|______|______|
               Ramp up to full night sensitivity over 15 minutes
This prevents sudden spikes in alerts when night mode activates.
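The linear ramp reduces to a one-line expression. This sketch assumes the ramp runs from 21:45 to 22:00 as in the diagram above; the function name is ours.

```python
def night_sensitivity(now_minutes, start_minutes=21 * 60 + 45, ramp_minutes=15):
    """Linear ramp from 0% to 100% night sensitivity over the transition window.

    now_minutes: minutes since midnight (float).
    """
    elapsed = now_minutes - start_minutes
    if elapsed <= 0:
        return 0.0                          # day mode, before the transition
    return min(elapsed / ramp_minutes, 1.0)  # clamp at full night sensitivity
```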
9.5.3 Night Mode Behavior Changes
| Aspect | Day Mode | Night Mode |
|---|---|---|
| Detection modules | Intrusion, Crowding, Fall, Abandoned Object | All 10 modules active |
| AI Vibe preset | Per-camera setting | Automatically Strict |
| Confidence threshold | Per-camera setting | +0.10 (stricter) |
| Scoring engine weights | Standard weights | +25% intrusion, +20% fall |
| Alert suppression | 5-minute cooldown | 2-minute cooldown (faster alerts) |
| After-hours detection | Disabled | Enabled (primary night function) |
9.6 Per-Camera Configuration
Each camera has independent configuration for all detection modules:
# Example: Camera 1 - Main Entrance
cam_01:
  enabled: true
  location: "Main Entrance Lobby"
  night_mode:
    enabled: true
    custom_schedule: null          # Use system default (22:00-06:00)
    sensitivity_multiplier: 1.0    # Standard sensitivity
  intrusion_detection:
    enabled: true
    confidence_threshold: 0.65
    overlap_threshold: 0.30
    cooldown_seconds: 30
    restricted_zones:
      - zone_id: "server_room_door"
        polygon: [[0.65,0.20], [0.85,0.20], [0.85,0.60], [0.65,0.60]]
        severity: "HIGH"
  loitering_detection:
    enabled: true
    dwell_time_threshold_seconds: 300
    movement_tolerance_pixels: 50
  running_detection:
    enabled: true
    speed_threshold_pixels_per_second: 150
    confirmation_frames: 3
  fall_detection:
    enabled: true
    fall_score_threshold: 0.75
    temporal_confirmation_ms: 1000
  # ... (all 10 modules configured)
9.7 Alert Generation Logic
9.7.1 Alert Lifecycle
+------------+ +------------+ +------------+ +------------+
| DETECTED | -> | SUPPRESSED | -> | EVIDENCE | -> | DISPATCHED |
| (Rule fire)| | (Dedup) | | (Capture) | | (Send) |
+------------+ +------------+ +------------+ +------------+
|
v
+------------+
| ACKNOWLEDGE|
| or AUTO |
+------------+
9.7.2 Suppression Rules
| Condition | Action | Reason |
|---|---|---|
| Duplicate within suppression window | Log + increment counter | Prevent alert spam |
| Detection confidence < rule minimum | Log only | Insufficient evidence |
| Threat score < LOW threshold | Log only | Below alert threshold |
| Max alerts/hour for camera exceeded | Log + rate-limit flag | Prevent overflow |
| Composite score indicates low overall threat | Log + dashboard only | Reduce noise |
9.7.3 Suppression Configuration
| Parameter | Default | Range |
|---|---|---|
| Default suppression window | 5 minutes | 0-60 minutes |
| Max alerts per hour per camera | 20 | 5-100 |
| Max alerts per hour per rule | 10 | 5-50 |
| Evidence snapshot frames before | 5 frames | 1-30 |
| Evidence snapshot frames after | 10 frames | 1-30 |
| Evidence clip duration | 10 seconds | 5-60 |
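A minimal single-camera sketch of the dedup and rate-limit rules above; the production service tracks these per camera and per rule, and all names here are illustrative.

```python
from collections import deque

class AlertSuppressor:
    """Dedup (suppression window) + hourly rate limit, per the defaults above."""

    def __init__(self, window_seconds=300, max_per_hour=20):
        self.window = window_seconds
        self.max_per_hour = max_per_hour
        self.last_fired = {}   # (camera_id, rule) -> last dispatch timestamp
        self.recent = deque()  # dispatch timestamps for this camera

    def should_dispatch(self, camera_id, rule, now):
        key = (camera_id, rule)
        # Duplicate within suppression window -> log only, increment counter
        if key in self.last_fired and now - self.last_fired[key] < self.window:
            return False
        # Hourly rate limit: drop timestamps older than one hour, then check cap
        while self.recent and now - self.recent[0] > 3600:
            self.recent.popleft()
        if len(self.recent) >= self.max_per_hour:
            return False
        self.last_fired[key] = now
        self.recent.append(now)
        return True
```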
9.7.4 Severity Assignment
Final alert severity considers both the triggering module and the composite score context:
def assign_alert_severity(detection_event, composite_score):
    base_severity = detection_event['severity']  # From module config
    severity_levels = {'LOW': 1, 'MEDIUM': 2, 'HIGH': 3, 'CRITICAL': 4}
    base_level = severity_levels.get(base_severity, 2)
    # Escalation: high composite score bumps a LOW/MEDIUM severity up one level
    if composite_score >= 0.80 and base_level < 3:
        base_level = min(base_level + 1, 4)
    # Escalation: multiple concurrent detections for the same track
    if detection_event.get('concurrent_detections_count', 0) >= 2:
        base_level = min(base_level + 1, 4)
    # Zone-specific escalation override
    if detection_event.get('zone_severity_override'):
        zone_level = severity_levels.get(detection_event['zone_severity_override'], base_level)
        base_level = max(base_level, zone_level)
    reverse_levels = {v: k for k, v in severity_levels.items()}
    return reverse_levels.get(base_level, 'MEDIUM')
9.8 Integration with Main AI Pipeline
The suspicious activity service consumes detection events from the main AI pipeline:
+-----------------------------------------------------------------------------+
| SUSPICIOUS ACTIVITY INTEGRATION WITH MAIN PIPELINE |
+-----------------------------------------------------------------------------+
| |
| Main AI Pipeline Output: |
| { person_id, track_id, bbox, keypoints, face_embedding, timestamp, |
| camera_id, confidence, face_crop_path } |
| | |
| v |
| +-------------------+ +-------------------+ +-------------------+ |
| | Kafka Topic | -> | Suspicious Activity| -> | Scoring Engine | |
| | ai.detections | | Service | | (per camera) | |
| | (JSON events) | | - 10 modules | | - Composite score | |
| +-------------------+ | - Per-camera config| | - Time decay | |
| | - Zone polygons | | - Cross-module | |
| +-------------------+ | bonus | |
| +---------+---------+ |
| | |
| v |
| +-------------------+ +-------------------+ |
| | Alert Manager | <- | Scoring Output | |
| | - Deduplicate | | - Score [0, 1.5] | |
| | - Rate limit | | - Threat level | |
| | - Severity assign | | - Active signals | |
| +---------+---------+ +-------------------+ |
| | |
| v |
| +-------------------+ |
| | Alerts Table (DB) | |
| | Notification Svc | |
| +-------------------+ |
| |
+-----------------------------------------------------------------------------+
Key integration points:
- Suspicious Activity Service is a Kafka consumer on the `ai.detections` topic
- Processes events after face recognition (has access to person identity)
- Produces alert records to the `alerts.critical` topic for notification dispatch
- Updates the composite score in Redis (with TTL = 2 * half_life) for real-time dashboard display
- Stores all alert records in PostgreSQL for history and analytics
Reference: For complete detection algorithm pseudocode, zone configuration YAML schema, scoring engine implementation, and evidence capture logic, see `suspicious_activity.md` — Sections 2-6.
Section 10: Live Video Streaming Design
10.1 RTSP Stream Configuration for CP PLUS DVR
10.1.1 URL Format
The CP PLUS ORANGE DVR uses a Dahua-compatible RTSP URL scheme:
rtsp://admin:{password}@{dvr_ip}:554/cam/realmonitor?channel={N}&subtype={M}
Where:
- N = channel number (1-8)
- M = stream type (0 = main stream, 1 = sub stream)
Example URLs for all 8 channels:
| Channel | Main Stream | Sub Stream |
|---|---|---|
| CH1 | rtsp://admin:password@192.168.29.200:554/cam/realmonitor?channel=1&subtype=0 | rtsp://admin:password@192.168.29.200:554/cam/realmonitor?channel=1&subtype=1 |
| CH2 | ...channel=2&subtype=0 | ...channel=2&subtype=1 |
| CH3 | ...channel=3&subtype=0 | ...channel=3&subtype=1 |
| CH4 | ...channel=4&subtype=0 | ...channel=4&subtype=1 |
| CH5 | ...channel=5&subtype=0 | ...channel=5&subtype=1 |
| CH6 | ...channel=6&subtype=0 | ...channel=6&subtype=1 |
| CH7 | ...channel=7&subtype=0 | ...channel=7&subtype=1 |
| CH8 | ...channel=8&subtype=0 | ...channel=8&subtype=1 |
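The URL scheme can also be generated programmatically. A small helper sketch, with defaults taken from this document's DVR settings (the function name is ours):

```python
def rtsp_url(channel, substream=False, host="192.168.29.200",
             user="admin", password="password", port=554):
    """Build a CP PLUS / Dahua-style RTSP URL for the given channel."""
    if not 1 <= channel <= 8:
        raise ValueError("channel must be 1-8")
    subtype = 1 if substream else 0  # 0 = main stream, 1 = sub stream
    return (f"rtsp://{user}:{password}@{host}:{port}"
            f"/cam/realmonitor?channel={channel}&subtype={subtype}")
```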
10.1.2 Stream Properties
| Property | Main Stream (subtype=0) | Sub Stream (subtype=1) |
|---|---|---|
| Resolution | 960 x 1080 | 352 x 288 to 704 x 576 |
| Frame rate | 25 FPS (PAL) | 25 FPS |
| Video codec | H.264 High Profile | H.264 Baseline/Main |
| Bitrate | ~4 Mbps per channel | ~1 Mbps per channel |
| Audio | G.711/AAC (optional) | None |
| Use case | Fullscreen viewing, evidence clips | AI inference, multi-camera grid |
10.1.3 Stream Discovery
The edge gateway can auto-discover streams via ONVIF:
from onvif import ONVIFCamera

camera = ONVIFCamera('192.168.29.200', 80, 'admin', 'password')
media_service = camera.create_media_service()
profiles = media_service.GetProfiles()
for profile in profiles:
    stream_uri = media_service.GetStreamUri({
        'StreamSetup': {'Stream': 'RTP_unicast', 'Transport': 'RTSP'},
        'ProfileToken': profile.token
    })
    print(f"Channel: {profile.token}, URI: {stream_uri.Uri}")
10.2 Edge Gateway Stream Handling
10.2.1 FFmpeg Ingestion Pipeline
The edge gateway runs one FFmpeg process per camera stream:
# Main stream: HLS generation for live viewing
ffmpeg -hide_banner -loglevel warning \
-rtsp_transport tcp -stimeout 5000000 \
-fflags +genpts+discardcorrupt+igndts+ignidx \
-reorder_queue_size 64 -buffer_size 655360 \
-i "rtsp://admin:password@192.168.29.200:554/cam/realmonitor?channel=1&subtype=0" \
-c:v copy -c:a copy \
-f hls -hls_time 2 -hls_list_size 5 -hls_delete_threshold 2 \
-hls_flags delete_segments+omit_endlist+program_date_time \
-hls_segment_filename "/data/hls/ch1_%04d.ts" \
"/data/hls/ch1.m3u8" \
2>> /var/log/ffmpeg_ch1.log
10.2.2 Stream Health Monitoring
| Check | Frequency | Failure Action |
|---|---|---|
| FFmpeg process alive | Every 5s | Restart process |
| RTSP connection health | Every 10s | Reconnect with backoff |
| Frame rate validation | Every 30s | Alert if FPS < 20 |
| Bitrate validation | Every 30s | Alert if bitrate < 50% expected |
| Disk space check | Every 60s | Alert if < 10% free, emergency if < 5% |
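The table's failure conditions can be folded into one evaluation step. A minimal sketch; the alert codes and function name are illustrative:

```python
def evaluate_stream_health(fps, bitrate_bps, expected_bitrate_bps, free_disk_pct):
    """Map the monitoring checks above to a list of alert codes."""
    alerts = []
    if fps < 20:
        alerts.append("LOW_FPS")                 # frame rate validation
    if bitrate_bps < 0.5 * expected_bitrate_bps:
        alerts.append("LOW_BITRATE")             # bitrate < 50% of expected
    if free_disk_pct < 5:
        alerts.append("DISK_EMERGENCY")          # < 5% free
    elif free_disk_pct < 10:
        alerts.append("DISK_LOW")                # < 10% free
    return alerts
```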
10.2.3 Auto-Reconnect Logic
import random

class StreamReconnectManager:
    """Handles RTSP stream reconnection with exponential backoff."""
    INITIAL_BACKOFF = 1.0      # seconds
    MAX_BACKOFF = 60.0         # seconds
    BACKOFF_MULTIPLIER = 2.0
    JITTER = 0.1               # 10% random jitter

    def __init__(self):
        self.consecutive_failures = 0

    def on_disconnect(self):
        """Return the wait time before the next reconnect attempt."""
        self.consecutive_failures += 1
        # 1s, 2s, 4s, 8s, ... capped at MAX_BACKOFF
        wait_time = min(
            self.INITIAL_BACKOFF * (self.BACKOFF_MULTIPLIER ** (self.consecutive_failures - 1)),
            self.MAX_BACKOFF
        )
        # Add jitter to prevent thundering herd
        wait_time *= (1 + random.uniform(-self.JITTER, self.JITTER))
        return wait_time

    def on_success(self):
        self.consecutive_failures = 0

    def should_circuit_break(self):
        return self.consecutive_failures >= 5  # Open circuit after 5 failures
10.3 HLS Generation for Dashboard
10.3.1 HLS Segment Configuration
| Parameter | Value | Rationale |
|---|---|---|
| Segment duration (`-hls_time`) | 2 seconds | Balance between latency and segment count |
| Playlist size (`-hls_list_size`) | 5 segments | 10-second sliding window for live playback |
| Delete threshold | 2 segments beyond playlist size | Disk cleanup |
| Flags | `delete_segments+omit_endlist+program_date_time` | Live mode, no end list, accurate timing |
| Segment naming | `ch{N}_%04d.ts` | Sequential numbering for cache busting |
| Segment path | `/data/hls/` | Fast NVMe storage |
10.3.2 Multi-Bitrate HLS (Optional)
For adaptive bitrate streaming, three variants are generated per channel:
# High quality (main stream, copy codec)
ffmpeg -i "rtsp://...channel=1&subtype=0" -c:v copy -f hls -hls_time 2 \
  -hls_list_size 5 -hls_flags delete_segments \
  -hls_segment_filename "ch1_high_%04d.ts" "ch1_high.m3u8"
# Medium quality (transcoded)
ffmpeg -i "rtsp://...channel=1&subtype=0" -c:v libx264 -preset fast -crf 23 \
-vf "scale=640:480" -f hls -hls_time 2 \
-hls_segment_filename "ch1_mid_%04d.ts" "ch1_mid.m3u8"
# Low quality (sub stream)
ffmpeg -i "rtsp://...channel=1&subtype=1" -c:v copy -f hls -hls_time 2 \
-hls_segment_filename "ch1_low_%04d.ts" "ch1_low.m3u8"
Master playlist:
#EXTM3U
#EXT-X-STREAM-INF:BANDWIDTH=4000000,RESOLUTION=960x1080
ch1_high.m3u8
#EXT-X-STREAM-INF:BANDWIDTH=1500000,RESOLUTION=640x480
ch1_mid.m3u8
#EXT-X-STREAM-INF:BANDWIDTH=500000,RESOLUTION=352x288
ch1_low.m3u8
10.3.3 HLS Latency Budget
| Stage | Latency |
|---|---|
| DVR encoding | 50-100 ms |
| RTSP to edge | 1-2 ms |
| FFmpeg demux/remux | 20-50 ms |
| HLS segment duration | 2000 ms (2-second segments) |
| Nginx/CDN delivery | 10-50 ms |
| HLS.js buffer | 2000-4000 ms (1-2 segments) |
| Browser decode + render | 20-50 ms |
| Total (camera to eye) | ~4.1 - 6.3 seconds (dominated by segment duration and player buffer) |
10.4 WebRTC for Low-Latency Single Camera
For single-camera fullscreen viewing where low latency is critical, WebRTC provides sub-second delivery.
10.4.1 WebRTC Architecture
+------------+ +-------------------+ +-------------------+ +--------+
| Browser | | Edge Gateway | | FFmpeg | | DVR |
| (WebRTC |<-->| (WHIP/WHEP |<-->| (decode RTSP, |<-->| RTSP |
| client) | | bridge) | | encode VP8/H.264)| | Server |
+------------+ +-------------------+ +-------------------+ +--------+
10.4.2 WebRTC Configuration
| Parameter | Value |
|---|---|
| Signaling protocol | WHIP (ingress) / WHEP (egress) |
| Video codec | H.264 (hardware) or VP8 (software) |
| Latency target | < 500 ms end-to-end |
| ICE servers | STUN only (both peers behind NAT) |
| Max bitrate | 3 Mbps |
| Resolution | 960x1080 (main stream) |
10.4.3 WebRTC Latency Budget
| Stage | Latency |
|---|---|
| DVR encoding | 50-100 ms |
| RTSP to edge | 1-2 ms |
| FFmpeg decode + WebRTC encode | 30-80 ms |
| Network (edge to browser via VPN) | 100-200 ms |
| Browser decode | 20-50 ms |
| Total | ~200-430 ms |
10.5 Multi-Camera Grid Layout
10.5.1 Layout Configurations
| Layout | Cameras | Stream Used | Per-Camera Resolution | Total Bandwidth |
|---|---|---|---|---|
| 1x1 (fullscreen) | 1 | Main (subtype=0) | 960x1080 | ~4 Mbps |
| 2x2 grid | 4 | Sub (subtype=1) | 352x288 | ~4 Mbps total |
| 3x3 grid | 8+1 empty | Sub (subtype=1) | 352x288 | ~8 Mbps total |
| 4x2 grid | 8 | Sub (subtype=1) | 352x288 | ~8 Mbps total |
| Custom | User-defined | Mixed | Mixed | Sum of selected |
Smart stream selection: The dashboard automatically switches streams based on layout:
- Fullscreen single camera -> Main stream (high quality)
- Grid layout -> Sub stream (bandwidth-efficient)
- Camera clicked for fullscreen -> Dynamically switch to main stream
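The smart stream selection rule can be sketched as a simple mapping from layout to per-channel subtype; the function name and return shape are ours, not the dashboard's API.

```python
def select_stream(layout, fullscreen_channel=None):
    """Return {channel: subtype} for the dashboard's smart stream selection.

    Main stream (subtype=0) only for the fullscreen camera; sub stream
    (subtype=1) for every tile in a grid layout.
    """
    if layout == "1x1":
        return {fullscreen_channel: 0}          # main stream, high quality
    tiles = {"2x2": 4, "3x3": 8, "4x2": 8}[layout]  # 3x3 = 8 cameras + 1 empty
    return {ch: 1 for ch in range(1, tiles + 1)}    # sub streams, bandwidth-efficient
```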
10.5.2 Grid Rendering
+-----------------------------------------------------------------------------+
| DASHBOARD GRID LAYOUTS |
+-----------------------------------------------------------------------------+
| |
| 1x1 Layout: 2x2 Layout: |
| +------------------------+ +----------+----------+ |
| | | | CH1 | CH2 | |
| | Camera 1 | | (sub) | (sub) | |
| | Main stream | | | | |
| | 960x1080 | +----------+----------+ |
| | ~4 Mbps | | CH3 | CH4 | |
| +------------------------+ | (sub) | (sub) | |
| | | | |
| +----------+----------+ |
| |
| 3x3 Layout (8 cameras): |
| +----------+----------+----------+ |
| | CH1 | CH2 | CH3 | |
| | (sub) | (sub) | (sub) | |
| +----------+----------+----------+ |
| | CH4 | CH5 | CH6 | |
| | (sub) | (sub) | (sub) | |
| +----------+----------+----------+ |
| | CH7 | CH8 | [Empty] | |
| | (sub) | (sub) | | |
| +----------+----------+----------+ |
| |
| Bandwidth: ~8 Mbps total for 3x3 layout (8 x ~1 Mbps sub streams) |
| |
+-----------------------------------------------------------------------------+
10.6 Bandwidth Optimization
10.6.1 Total Bandwidth Budget
| Traffic Type | Direction | Bandwidth | Notes |
|---|---|---|---|
| 8x RTSP ingestion | Edge -> DVR (local) | ~32 Mbps receive | Local LAN only |
| 8x HLS upload to cloud | Edge -> Cloud (via VPN) | ~8-16 Mbps upload | Transcoded and compressed |
| AI frames to cloud | Edge -> Cloud (via VPN) | ~2-4 Mbps upload | 1 FPS, JPEG compressed |
| Dashboard HLS playback | Cloud -> Browser | ~8 Mbps per user | Cached at CDN |
| Control/management | Bidirectional | < 1 Mbps | WebSocket, API calls |
| Total edge upload | Edge -> Cloud (via VPN) | ~10-20 Mbps | Primary concern for site bandwidth |
10.6.2 Optimization Techniques
| Technique | Savings | Implementation |
|---|---|---|
| Sub-stream for grid view | 75% bandwidth reduction | Use subtype=1 (352x288) instead of subtype=0 (960x1080) |
| H.264 copy (no re-encode) for main stream | Zero CPU overhead | -c:v copy when no format change needed |
| JPEG quality tuning for AI frames | 50-70% size reduction | Quality 70-85 depending on scene complexity |
| Frame deduplication for AI | 10-30% frame reduction | Skip frames with < 2% pixel change |
| HLS segment caching at edge | Reduces cloud upload spikes | 5-segment buffer smooths burstiness |
| Gzip compression for API/WebSocket | 60-80% reduction | Content-Encoding: gzip |
10.7 Fallback Handling
10.7.1 Stream Failure Fallback Chain
Step 1: RTSP connection fails
+-> Retry with exponential backoff (3 attempts)
+-> Try UDP transport if TCP fails
+-> Circuit breaker opens after 5 consecutive failures
|
Step 2: Stream stall detected (no frames for 10s)
+-> Kill FFmpeg process
+-> Restart with fresh connection
|
Step 3: Camera marked OFFLINE
+-> Dashboard shows "Camera Offline" placeholder
+-> HLS playlist returns 404
+-> Last known frame displayed with timestamp overlay
+-> Alert sent to operations team
|
Step 4: Camera recovers
+-> Circuit breaker transitions to HALF_OPEN
+-> Test stream pulled for 10 seconds
+-> On success: circuit CLOSED, stream resumes
+-> Dashboard auto-refreshes
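The circuit-breaker transitions in Steps 1-4 can be sketched as follows. This is a simplified model: the 10-second test-stream probe itself is elided, and the class name is ours.

```python
class StreamCircuitBreaker:
    """CLOSED -> OPEN after 5 consecutive failures; HALF_OPEN probe on recovery."""
    FAILURE_THRESHOLD = 5

    def __init__(self):
        self.state = "CLOSED"
        self.failures = 0

    def record_failure(self):
        self.failures += 1
        if self.failures >= self.FAILURE_THRESHOLD:
            self.state = "OPEN"          # camera marked OFFLINE

    def begin_probe(self):
        """Camera appears to recover: pull a test stream before resuming."""
        if self.state == "OPEN":
            self.state = "HALF_OPEN"

    def record_probe_result(self, success):
        if self.state == "HALF_OPEN":
            if success:
                self.state = "CLOSED"    # stream resumes
                self.failures = 0
            else:
                self.state = "OPEN"      # back to offline
```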
10.7.2 Offline Placeholder
When a camera is offline, the HLS endpoint returns a static playlist:
#EXTM3U
#EXT-X-VERSION:3
#EXT-X-TARGETDURATION:2
#EXT-X-MEDIA-SEQUENCE:0
#EXT-X-ERROR: "Camera OFFLINE - Channel 1"
#EXTINF:2.000,
offline_placeholder.ts
The dashboard detects the #EXT-X-ERROR tag (a custom, non-standard extension; standard players ignore unrecognized tags) and displays a camera offline indicator with the last known timestamp.
10.7.3 Edge Buffer Management
The 2TB NVMe edge storage is partitioned for circular buffer operation:
| Directory | Max Size | Retention | Cleanup |
|---|---|---|---|
| `/data/hls/` | 20 GB | Rolling (5 segments) | Automatic via FFmpeg |
| `/data/buffer/ch1-ch8/` | 1.5 TB | 7 days circular | Age-based FIFO |
| `/data/buffer/ai_frames/` | 100 GB | 24 hours | Age-based |
| `/data/buffer/evidence/` | 200 GB | 30 days | Event-linked retention |
| `/data/logs/` | 10 GB | 30 days | Logrotate |
| `/data/tmp/` | 50 GB | On process exit | Cleanup on restart |
| Total reserved | ~1.88 TB | — | Fits in 2TB NVMe |
Buffer exhaustion handling:
- At 80% capacity: Alert admin, begin aggressive cleanup of old non-evidence data
- At 90% capacity: Stop non-critical buffering (AI frames), preserve HLS + evidence only
- At 95% capacity: Emergency mode — evidence-only recording, all other buffers purged
- Never delete evidence clips linked to unresolved alerts
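The capacity tiers above map naturally to a policy function. A minimal sketch; the tier names are illustrative:

```python
def buffer_policy(used_pct):
    """Map NVMe usage percentage to the cleanup tier described above."""
    if used_pct >= 95:
        return "EMERGENCY"   # evidence-only recording, purge other buffers
    if used_pct >= 90:
        return "CRITICAL"    # stop AI-frame buffering, keep HLS + evidence
    if used_pct >= 80:
        return "WARN"        # alert admin, aggressive cleanup of old data
    return "NORMAL"
```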
10.7.4 DVR Full Disk Mitigation
Since the DVR disk is full (0 bytes free), the system does not rely on DVR-side recording:
| Function | Traditional | Our Design |
|---|---|---|
| Continuous recording | DVR internal HDD | Edge gateway 2TB NVMe buffer |
| Event/alert clips | DVR playback export | Cloud MinIO + S3 archival |
| Long-term storage | DVR disk rotation | AWS S3 tiered lifecycle |
| Playback | DVR web UI | Cloud dashboard with timeline |
Reference: For complete FFmpeg commands including multi-output tee muxer, frame extraction for AI, WebRTC bridge code, and the ring buffer implementation, see `video_ingestion.md` — Sections 4-7.
End of Part A (Sections 1-10)
This unified technical blueprint synthesizes outputs from 11 specialist agents across 6 domain-specific design documents. For detailed implementation code, DDL, algorithms, and configuration, refer to the individual specialist documents listed in the cross-reference guide at the top of this document.
| Document | Path | Content |
|---|---|---|
| Architecture | `architecture.md` | Full deployment specs, scaling, cost, failover |
| Video Ingestion | `video_ingestion.md` | RTSP config, FFmpeg, edge gateway, HLS, WebRTC |
| AI Vision | `ai_vision.md` | Model configs, inference pipeline, benchmarks |
| Database Schema | `database_schema.md` | Complete DDL, triggers, views, RLS |
| Suspicious Activity | `suspicious_activity.md` | 10 detection modules, scoring engine |
| Training System | `training_system.md` | Learning pipeline, quality gates, versioning |