Sentinel AI Surveillance Platform — Unified Technical Blueprint (Part B)

Document Version: 1.0
Date: 2025-01-16
Classification: Confidential, Internal Use Only
Author: Technical Architecture Team

Part B Table of Contents

Section 11: Alerting Design (Notification System)
Section 12: Security Design
Section 13: UX / Website Structure
Section 14: Deployment Plan
Section 15: Testing Plan
Section 16: Self-Test Framework
Section 17: Sample Self-Test Report
Section 18: Risks and Mitigations
Section 19: Final Implementation Roadmap
Section 20: Final Production-Readiness Summary

Section 11: Alerting Design

11.1 Architecture Overview

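The notification system uses an event-driven architecture built on Redis Pub/Sub for real-time message distribution. All detection events, system alerts, and manual triggers flow through a unified pipeline that delivers over two channels: the Telegram Bot API and the WhatsApp Business API (Meta's official Cloud API). The pipeline is designed so that critical security alerts are not lost once the Notification Router has accepted them: failed sends are retried with exponential backoff, and messages that exhaust their retries are parked in a dead letter queue for admin review. Rate limiting and per-channel circuit breakers keep throughput within provider limits. Redis Pub/Sub itself does not persist messages for subscribers that are disconnected at publish time, so the delivery guarantees in this section apply from the router onward.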

┌──────────────────────────────────────────────────────────────────────────────┐
│                         ALERTING ARCHITECTURE                                 │
│                                                                              │
│   ┌─────────────────────────────────────────────────────────────────┐       │
│   │                         EVENT SOURCES                            │       │
│   │                                                                  │       │
│   │  Detection Pipeline ──▶ New person detected                     │       │
│   │  Face Recognition ────▶ Known/Unknown/Watchlist match           │       │
│   │  System Monitors ─────▶ Camera offline, Storage full, VPN down   │       │
│   │  Manual Triggers ─────▶ Operator-initiated alerts                │       │
│   │  AI Anomaly Engine ───▶ Suspicious activity detected             │       │
│   └──────────────────────────┬──────────────────────────────────────┘       │
│                              │                                               │
│                              ▼                                               │
│   ┌─────────────────────────────────────────────────────────────────┐       │
│   │                    REDIS PUB/SUB                                 │       │
│   │                                                                  │       │
│   │  Channel: alerts.critical  ─── High priority, immediate process  │       │
│   │  Channel: alerts.high      ─── Standard priority                 │       │
│   │  Channel: alerts.medium    ─── Batched processing                │       │
│   │  Channel: system.health    ─── System health events              │       │
│   └──────────────┬───────────────────────────────────────────────────┘       │
│                  │                                                            │
│   ┌──────────────┴───────────────────────────────────────────────────┐       │
│   │                    NOTIFICATION ROUTER                           │       │
│   │                     (Python/FastAPI)                             │       │
│   │                                                                  │       │
│   │  ┌──────────────┐  ┌──────────────┐  ┌──────────────────────┐  │       │
│   │  │ Event Parser │──▶ Rules Engine │──▶ Channel Selector    │  │       │
│   │  └──────────────┘  └──────────────┘  └──────────────────────┘  │       │
│   └──────────────────────────┬───────────────────────────────────────┘       │
│                              │                                               │
│          ┌───────────────────┼───────────────────┐                           │
│          ▼                   ▼                   ▼                           │
│   ┌─────────────┐   ┌───────────────┐   ┌──────────────────┐               │
│   │   TEMPLATE  │   │    RATE       │   │   ESCALATION     │               │
│   │   RENDERER  │   │   LIMITER     │   │    ENGINE        │               │
│   │             │   │               │   │                  │               │
│   │  HTML/TXT   │   │ Token Bucket  │   │ 3-level timeout  │               │
│   │  per channel│   │ Multi-tier    │   │ Auto-escalation  │               │
│   └──────┬──────┘   └───────┬───────┘   └────────┬─────────┘               │
│          │                  │                     │                         │
│          └──────────────────┼─────────────────────┘                         │
│                             ▼                                               │
│   ┌─────────────────────────────────────────────────────────────────┐      │
│   │                    CHANNEL ADAPTERS                              │      │
│   │                                                                  │      │
│   │  ┌──────────────────────────┐  ┌──────────────────────────┐     │      │
│   │  │    TELEGRAM BOT API      │  │  WHATSAPP BUSINESS API   │     │      │
│   │  │                          │  │                          │     │      │
│   │  │  - HTML formatting       │  │  - Template messages     │     │      │
│   │  │  - Inline keyboards      │  │  - Session messages      │     │      │
│   │  │  - Media groups          │  │  - Interactive messages  │     │      │
│   │  │  - Edit/Delete messages  │  │  - Media attachments     │     │      │
│   │  │  - Webhook receipts      │  │  - Message status API    │     │      │
│   │  └──────────┬───────────────┘  └──────────┬───────────────┘     │      │
│   └─────────────┼─────────────────────────────┼─────────────────────┘      │
│                 │                             │                              │
│                 ▼                             ▼                              │
│          ┌──────────────┐            ┌──────────────┐                       │
│          │  Telegram    │            │  WhatsApp    │                       │
│          │  Servers     │            │  Cloud API   │                       │
│          └──────────────┘            └──────────────┘                       │
│                                                                              │
│   ┌─────────────────────────────────────────────────────────────────┐       │
│   │                    SUPPORTING SERVICES                           │       │
│   │                                                                  │       │
│   │  ┌──────────────┐  ┌──────────────┐  ┌──────────────────────┐  │       │
│   │  │ RETRY MGR    │  │    DLQ       │  │  DELIVERY TRACKER    │  │       │
│   │  │ Exponential  │  │ Redis-backed │  │  Webhook callbacks   │  │       │
│   │  │ 5 max        │  │ Admin review │  │  Status dashboard    │  │       │
│   │  └──────────────┘  └──────────────┘  └──────────────────────┘  │       │
│   └─────────────────────────────────────────────────────────────────┘       │
└──────────────────────────────────────────────────────────────────────────────┘

Key Design Principles:

Principle	Implementation
Guaranteed delivery	At-least-once delivery via retry with exponential backoff; dead letter queue for permanent failures
Ordered processing	Events within a single camera stream processed in sequence; no alert reordering
Non-blocking	Alert generation does not block the detection pipeline; async processing via queues
Channel isolation	Failure in one channel (e.g., Telegram down) does not affect the other (WhatsApp continues)
Deduplication	Duplicate suppression within a 5 to 15 minute window depending on alert type (see 11.7.4); composite key based on camera + person + event type
Observability	Every notification tracked from creation through delivery with full audit trail

11.2 Telegram Integration

11.2.1 Bot API Configuration

Telegram integration uses the official Telegram Bot API for message delivery. The bot token is stored encrypted in HashiCorp Vault, and messages are sent with HTML formatting for rich alert presentation.

Parameter	Value	Notes
API Base URL	`https://api.telegram.org/bot<TOKEN>/`	Standard Bot API endpoint
API Version	Bot API 7.x or later	Minimum version targeted
Token Storage	HashiCorp Vault (AES-256-GCM encrypted)	Rotated every 180 days
Communication	HTTPS POST; updates via webhook, with long polling (`getUpdates`) as fallback	TLS 1.3 required for all calls
Message Format	HTML subset	`<b>`, `<i>`, `<code>`, `<pre>`, `<a href>` tags supported
Max Message Size	4096 characters per message	Longer messages auto-split into parts
Media Size Limit (Image)	10 MB per image	Processed via Pillow for compression
Media Size Limit (Video)	50 MB per video	Processed via FFmpeg for re-encoding
Media Group Limit	Up to 10 items per media group	Album delivery for multi-image alerts
Global Rate Limit	30 messages per second	Across all chats
Per-Chat Rate Limit	1 message per second	Per conversation throttling
Webhook Endpoint	`/webhooks/telegram`	Receives incoming updates (commands, callback queries, membership changes)
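
The auto-split behavior noted for the 4096-character limit can be sketched as follows. This is a minimal illustration, not the platform's implementation: the function name `split_message` is hypothetical, the limit is counted in Python characters, and the sketch does not try to keep HTML tags balanced across parts, which a production splitter would need to handle.

```python
TELEGRAM_MAX_CHARS = 4096  # per-message limit from the table above


def split_message(text: str, limit: int = TELEGRAM_MAX_CHARS) -> list[str]:
    """Split text into parts of at most `limit` characters.

    Prefers to break at the last newline inside the window so that alert
    lines stay intact; falls back to a hard cut when there is none.
    """
    parts = []
    remaining = text
    while len(remaining) > limit:
        cut = remaining.rfind("\n", 0, limit + 1)
        if cut <= 0:
            cut = limit  # no usable newline: hard cut
        parts.append(remaining[:cut])
        remaining = remaining[cut:].lstrip("\n")
    if remaining:
        parts.append(remaining)
    return parts
```

Each returned part would then be sent as its own message, subject to the per-chat rate limit of 1 message per second.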

11.2.2 Bot Features and Capabilities

Inline Keyboards: Every alert message includes contextual action buttons that allow operators to respond directly from Telegram without opening the web dashboard.

Keyboard Type	Buttons	Actions
Standard Alert	Acknowledge / View Live / Details	Confirm receipt, open stream, view full info
Watchlist Alert	Acknowledge / View Live / Escalate / Details	Includes escalation for watchlist matches
Blacklist Alert	ACKNOWLEDGE NOW / View Live / Dispatch Security / Escalate / Details	Highest priority actions for blacklist
Escalation Notice	Acknowledge / View Original Alert	Acknowledge escalated alert or view source
System Alert	Acknowledge / View Dashboard / Details	System-level alert actions

Media Groups: When an alert contains multiple evidence images (up to 10), they are sent as a Telegram media group (album). This presents all related images in a single scrollable gallery rather than individual messages, reducing chat clutter.

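Webhook Updates: Telegram pushes incoming updates to the webhook endpoint. The Bot API does not report delivery or read receipts for messages the bot sends, so the updates below cover inbound activity only:

Update Type	Trigger	Action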
`message`	Bot receives a command	Process command (e.g., /status, /acknowledge)
`callback_query`	User clicks inline button	Execute action, update message status
`edited_message`	Message edited externally	Log for audit trail
`my_chat_member`	Bot added/removed from chat	Update recipient group membership

Chat Commands:

Command	Description	Response
`/status`	Get system health status	Camera count, offline count, last alert time
`/acknowledge <alert_id>`	Acknowledge an alert	Confirmation or error message
`/cameras`	List all cameras and their status	Camera name, status, last seen
`/health`	Get edge gateway health	CPU, memory, disk, VPN status
`/help`	Show available commands	Command reference

11.2.3 Security Considerations

Telegram bot tokens are among the most sensitive credentials in the system. The following security measures are implemented:

Measure	Implementation
Encryption at rest	AES-256-GCM in Vault
Token rotation	Every 180 days or immediately on compromise suspicion
Rotation procedure	1) Generate new token via BotFather (BotFather invalidates the old token as soon as the new one is issued, so no overlap window is available), 2) Update Vault, 3) Notify services to hot-reload
IP allowlisting	Webhook endpoint accepts only Telegram IP ranges
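Webhook secret	Secret token registered via `setWebhook` (`secret_token`) and checked against the `X-Telegram-Bot-Api-Secret-Token` header on every incoming request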
No token logging	Tokens never appear in application logs
No token in code	Tokens injected via Vault at runtime
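
Telegram's `setWebhook` method accepts a `secret_token` that is echoed back in the `X-Telegram-Bot-Api-Secret-Token` header of each webhook request. A minimal sketch of checking that header is shown below; the function name `is_valid_telegram_webhook` and the plain-dict header representation are assumptions for illustration, and the expected secret would be read from Vault at runtime.

```python
import hmac

SECRET_HEADER = "X-Telegram-Bot-Api-Secret-Token"


def is_valid_telegram_webhook(headers: dict[str, str], expected_secret: str) -> bool:
    """Return True when the request carries the expected secret token."""
    # HTTP header names are case-insensitive, so normalise before lookup
    normalised = {k.lower(): v for k, v in headers.items()}
    received = normalised.get(SECRET_HEADER.lower(), "")
    # compare_digest avoids leaking the secret through timing differences
    return hmac.compare_digest(received.encode(), expected_secret.encode())
```

Requests that fail this check would be rejected before any update parsing takes place.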

11.3 WhatsApp Business API Integration

11.3.1 Meta Cloud API Configuration

WhatsApp integration uses Meta's official Cloud API (Business Platform), which provides a reliable, enterprise-grade messaging channel. This requires a verified Meta Business account and pre-approved message templates for proactive messaging.

Parameter	Value	Notes
API Base URL	`https://graph.facebook.com/v18.0/`	Meta Graph API v18.0 minimum
Authentication	System user access token (non-expiring)	Requires `whatsapp_business_messaging` and `whatsapp_business_management` permissions
Token Storage	HashiCorp Vault (AES-256-GCM encrypted)	Rotated every 180 days
Phone Number ID	Dedicated business phone number	Not shared with other WhatsApp uses
Business Account	Verified Meta Business Account	Required for template message approval
Message Types	Template messages + Session messages	Template for first contact; session for replies
Media Size Limits	5 MB per image; 16 MB per video or audio file; 100 MB per document	Stricter than Telegram for images and video; aggressive compression needed
Supported Media	JPEG, PNG, MP4 (H.264), PDF, Audio	Format validation before upload
Global Rate Limit	80 messages per second	Across all recipients
Per-Recipient Rate Limit	20 messages per minute	Per WhatsApp ID throttling
Webhook Endpoint	`/webhooks/whatsapp`	Receives message status updates

11.3.2 Message Types

Template Messages: Pre-approved message templates are required for any proactive (business-initiated) message. Templates must be created and submitted for approval in Meta Business Manager. Each template contains positional placeholders ({{1}}, {{2}}, ...) that are populated at send time; the parameter names listed below are the system's own labels for those positions.

Template Name	Purpose	Parameters	Approval Status
`person_detected_known`	Known person detected	name, role, camera, date, time, confidence, alert_id	Approved
`person_detected_unknown`	Unknown person alert	camera, date, time, confidence	Approved
`watchlist_match`	Person on watchlist detected	name, watchlist_type, camera, date, time	Approved
`blacklist_alert`	Blacklisted person detected	name, camera, date, time	Approved
`suspicious_activity`	Suspicious behavior detected	activity_type, camera, date, time, confidence	Approved
`system_alert`	System health alert	message, timestamp, severity	Approved
`escalation_notice`	Alert escalation notification	alert_id, level, summary, elapsed_minutes	Approved
`daily_digest`	Daily summary of activity	date, total_detections, total_alerts, top_cameras	Approved
`test_message`	System test	timestamp	Approved

Session Messages: Within a 24-hour window after a user sends a message to the business, free-form session messages can be sent without template restrictions. This is used for:

Acknowledgment confirmations
Escalation follow-ups
Interactive conversations initiated by the recipient
Quick reply responses

11.3.3 Webhook Event Handling

Webhook Field (Payload)	Trigger	System Action
`messages` (`statuses[].status`: `delivered`)	Message delivered to device	Update delivery status to `delivered`
`messages` (`statuses[].status`: `read`)	Recipient read the message	Update delivery status to `read`
`messages` (`statuses[].status`: `failed`)	Message delivery failed	Trigger retry or move to DLQ
`messages` (`messages[].type`: `reaction`)	Recipient reacted to message	Log for engagement metrics
`account_alerts`	Meta account issue	Alert admin, review account status
`message_template_status_update`	Template status change	Update template catalog

11.4 Alert Routing Rules Engine

11.4.1 Condition Types

The routing engine evaluates 9 distinct condition types to determine which recipients receive which alerts through which channels. Multiple conditions can be combined with AND/OR logic for precise targeting.

#	Condition Type	Description	Example Values	Operators
1	`camera`	Source camera identifier	"CAM-01", "CAM-02", "entrance-cam"	equals, in, not_in
2	`person`	Detected known person	"John Smith", "Jane Doe"	equals, in, not_in
3	`role`	Person role category	"employee", "visitor", "vendor", "contractor", "security"	equals, in
4	`event_type`	Type of detection event	"person_detected", "unknown_person", "suspicious_activity", "crowd_gathering", "camera_tamper"	equals, in
5	`zone`	Detection zone name	"entrance", "restricted_area", "parking", "lobby", "warehouse"	equals, in
6	`time`	Time of day range	"08:00-18:00", "22:00-06:00"	between, not_between
7	`day`	Day of week	"monday", "weekday", "weekend"	equals, in
8	`severity`	Alert severity level	"critical", "high", "medium", "low", "info"	equals, in, gte
9	`watchlist`	Watchlist membership	"vip", "blacklist", "authorized", "temporary_access"	equals, in

11.4.2 Rule Structure

Each routing rule consists of conditions, actions, and metadata:

rule:
  id: "rule-001"
  name: "Blacklist Immediate Alert"
  enabled: true
  priority: 100  # Higher number = evaluated first
  
  conditions:
    operator: "AND"
    conditions:
      - field: "watchlist"
        operator: "equals"
        value: "blacklist"
      - field: "severity"
        operator: "in"
        value: ["critical", "high"]
  
  actions:
    - channel: "telegram"
      recipients: ["security_team", "management"]
      template: "blacklist_alert"
      media: ["image", "video"]
      bypass_quiet_hours: true
      priority: "high"
    
    - channel: "whatsapp"
      recipients: ["security_manager"]
      template: "blacklist_alert"
      media: ["image"]
      bypass_quiet_hours: true
  
  metadata:
    created_by: "admin"
    created_at: "2025-01-01T00:00:00Z"
    last_modified: "2025-01-10T12:00:00Z"
    tags: ["critical", "blacklist"]
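
The condition block of a rule such as the one above can be evaluated recursively. The sketch below is one possible reading of the rule structure, not the engine's actual code: the function name `matches`, the `SEVERITY_ORDER` list, and the flat event dictionary are assumptions, and the time-based `between` / `not_between` operators are left out for brevity.

```python
SEVERITY_ORDER = ["info", "low", "medium", "high", "critical"]


def matches(node: dict, event: dict) -> bool:
    """Evaluate a condition tree (group or leaf) against an event dict."""
    if "conditions" in node:  # group node: combine children with AND/OR
        results = (matches(child, event) for child in node["conditions"])
        return all(results) if node.get("operator", "AND") == "AND" else any(results)

    actual = event.get(node["field"])
    op, expected = node["operator"], node["value"]
    if op == "equals":
        return actual == expected
    if op == "in":
        return actual in expected
    if op == "not_in":
        return actual not in expected
    if op == "gte":  # ordered comparison, used for severity
        if actual not in SEVERITY_ORDER:
            return False
        return SEVERITY_ORDER.index(actual) >= SEVERITY_ORDER.index(expected)
    raise ValueError(f"unsupported operator: {op}")


# Conditions from rule-001, expressed as the parsed YAML would look
rule_conditions = {
    "operator": "AND",
    "conditions": [
        {"field": "watchlist", "operator": "equals", "value": "blacklist"},
        {"field": "severity", "operator": "in", "value": ["critical", "high"]},
    ],
}
event = {"camera": "CAM-01", "watchlist": "blacklist", "severity": "critical"}
print(matches(rule_conditions, event))  # True
```

Rules would be tried in descending `priority` order, with the actions of each matching rule handed to the channel selector.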

11.4.3 Default Routing Rules

The system ships with a comprehensive set of default routing rules that cover common surveillance scenarios:

#	Scenario	Conditions	Severity	Recipients	Channels	Media	Quiet Hours
1	Known employee normal hours	role=employee, time=08:00-18:00, day=weekday	Info	None (log only)	—	—	N/A
2	Known employee after hours	role=employee, time=18:00-08:00	Low	Security team	Telegram	Image	Respected
3	Known visitor during hours	role=visitor, time=08:00-18:00	Low	Reception desk	Telegram	Image	Respected
4	Unknown person detected	event_type=unknown_person	Medium	Security team	Both	Image	Respected
5	Unknown person after hours	event_type=unknown_person, time=22:00-06:00	High	Security team + Manager	Both	Image + Video	Bypassed
6	Watchlist match	watchlist=watchlist	High	Security team	Both	Image + Video	Respected
7	Blacklist match	watchlist=blacklist	Critical	All groups	Both (bypass quiet)	Image + Video	Bypassed
8	VIP detected	watchlist=vip	Low	Reception desk	Telegram	Image	Respected
9	Camera offline	event_type=camera_offline	High	IT team + Security team	Telegram	None	Bypassed
10	Storage > 90%	event_type=storage_warning	High	IT team + Management	Both	None	Bypassed
11	Storage > 95%	event_type=storage_critical	Critical	All groups	Both (bypass quiet)	None	Bypassed
12	VPN tunnel down	event_type=vpn_down	Critical	IT team + Management	Both (bypass quiet)	None	Bypassed
13	Suspicious activity	event_type=suspicious_activity	High	Security team	Both	Image + Video	Respected
14	Crowd gathering	event_type=crowd_gathering	Medium	Security team	Telegram	Image	Respected

11.5 Recipient Groups and Quiet Hours

11.5.1 Recipient Group Management

Recipient groups are the primary mechanism for organizing alert destinations. Each group contains one or more contacts with specified channels.

Group Name	Members	Primary Channel	Backup Channel	Alert Preferences	Quiet Hours
Security Team	On-site security guards	Telegram	WhatsApp	All except info	Disabled
Security Manager	Shift supervisor	WhatsApp	Telegram	Medium and above	Disabled
IT Team	Infrastructure staff	Telegram	WhatsApp	System alerts only	Nights
Management	Facility managers	WhatsApp	Telegram	Critical only	Disabled
Reception	Front desk staff	Telegram	None	Visitor-related, VIP	Disabled
After-Hours	On-call personnel	WhatsApp	Telegram	High and Critical	Disabled

Group Configuration Interface:

Groups are managed through the web dashboard at /settings/notifications/groups. Each group can be configured with:

Setting	Description
Group name	Human-readable identifier
Description	Purpose of the group
Members	List of Telegram chat IDs and WhatsApp phone numbers
Default channel	Primary delivery channel
Alert severity filter	Minimum severity to deliver
Quiet hours override	Whether quiet hours apply to this group
Media preferences	Which media types to include
Max alerts per hour	Rate limit for this group

11.5.2 Quiet Hours Configuration

Quiet hours allow suppressing non-critical alerts during configured time windows. Critical alerts always bypass quiet hours — this is a non-configurable safety measure.

quiet_hours:
  enabled: false                    # DISABLED BY DEFAULT for security
  preset: "none"                    # none / nights / weekends / custom
  
  custom_schedule:
    - label: "Weekday Nights"
      days: ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday"]
      start_time: "22:00"
      end_time: "06:00"
      timezone: "Asia/Kolkata"
      
    - label: "Weekend All Day"
      days: ["Saturday", "Sunday"]
      start_time: "00:00"
      end_time: "23:59"
      timezone: "Asia/Kolkata"
  
  allowed_during_quiet:             # Which severities bypass quiet hours
    - "critical"                    # Always delivered (non-configurable)
    
  emergency_bypass:
    enabled: true
    triggers:
      - severity: "critical"
      - tag: "emergency"
      - rule_override: "bypass_quiet_hours"
    notification_method: "all_channels"
    
  suppression_behavior: "queue"     # queue / discard / digest
  # "queue": Hold until quiet hours end
  # "discard": Drop non-critical alerts entirely
  # "digest": Send summary when quiet hours end

Security Note: Quiet hours are disabled by default because the surveillance use case requires continuous awareness. Any decision to enable quiet hours must be documented with security team sign-off.
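
A schedule such as "Weekday Nights" (22:00 to 06:00) wraps past midnight, which is easy to get wrong. The sketch below shows one way to test whether a timestamp falls inside a window; the function names and the choice to attribute the early-morning hours to the day on which the window started are assumptions, and `local_now` is expected to be already converted to the schedule's timezone.

```python
from datetime import datetime, time, timedelta

DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]


def in_quiet_window(local_now: datetime, days: list[str],
                    start: time, end: time) -> bool:
    """Return True if `local_now` falls inside a quiet-hours window."""
    today = DAY_NAMES[local_now.weekday()]
    yesterday = DAY_NAMES[(local_now - timedelta(days=1)).weekday()]
    now_t = local_now.time()
    if start <= end:  # same-day window, e.g. 00:00-23:59
        return today in days and start <= now_t <= end
    if now_t >= start:  # evening part of a wrapping window
        return today in days
    if now_t < end:  # early-morning part belongs to yesterday's window
        return yesterday in days
    return False


def should_deliver(severity: str, quiet: bool) -> bool:
    """Critical alerts always bypass quiet hours."""
    return severity == "critical" or not quiet
```

With the "Weekday Nights" schedule, Saturday 02:00 is still treated as quiet because the window opened on Friday evening, while Monday 05:00 is not.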

11.5.3 Per-Recipient Quiet Hours

Individual recipients can configure personal quiet hours that override group settings:

Recipient	Personal Quiet Hours	Group Setting	Effect
Security Guard A	None	Security Team (Disabled)	Receives all alerts
IT Manager	23:00-07:00	IT Team (Nights)	Personal window replaces the group preset; non-critical IT alerts suppressed 23:00-07:00
Manager B	22:00-08:00	Management (Disabled)	Personal quiet hours applied; critical alerts still delivered

11.6 Message Templates

11.6.1 Telegram HTML Templates

All Telegram templates use a safe HTML subset for rich formatting with inline action keyboards.

Template: Person Detected (Known)

🔍 <b>Person Detected</b>

<b>{name}</b> ({role})
📍 Camera: {camera_name}
🕐 {date} at {time}
🎯 Confidence: {confidence}%

<a href="{dashboard_url}">View in Dashboard</a>

Template: Unknown Person Detected

❓ <b>Unknown Person Detected</b>

📍 Camera: {camera_name}
🕐 {date} at {time}
🎯 Confidence: {confidence}%

<i>This person is not in the database.</i>

<a href="{naming_url}">Name This Person</a>

Template: Watchlist Match

⚠️ <b>WATCHLIST ALERT</b>

<b>{name}</b>
📋 Watchlist: {watchlist_type}
📍 Camera: {camera_name}
🕐 {date} at {time}
🎯 Confidence: {confidence}%

<i>This person is on a watchlist and requires attention.</i>

Template: Blacklist Alert

🚨 <b>BLACKLIST ALERT</b> 🚨

⚠️ <b>{name}</b> has been detected!
📍 Camera: {camera_name}
🕐 {date} at {time}
🎯 Confidence: {confidence}%

<b>This person is BLACKLISTED. Immediate attention required.</b>

<a href="{dispatch_url}">🚨 Dispatch Security</a>

Template: Escalation Notice

⬆️ <b>Alert Escalated — Level {escalation_level}</b>

Alert #{alert_id} has been escalated.

Original: {alert_summary}
⏱️ Unacknowledged for {elapsed_minutes} minutes
Threshold: {threshold_minutes} minutes

<i>Please review immediately.</i>

Template: System Alert

⚙️ <b>System Alert</b>

{message}

🕐 {timestamp}
Severity: {severity}

<a href="{health_dashboard_url}">View System Health</a>

Template: Daily Digest

📊 <b>Daily Activity Digest — {date}</b>

👥 Persons Detected: {total_detections}
🔔 Alerts Generated: {total_alerts}
📹 Cameras Online: {cameras_online}/{cameras_total}

Top Cameras:
{camera_list}

<a href="{full_report_url}">View Full Report</a>

11.6.2 WhatsApp Template Format

WhatsApp templates use a different format — they are pre-registered with Meta and use numbered parameter substitution:

Template: person_detected_known

🔍 Person Detected

{{1}} ({{2}})
📍 Camera: {{3}}
🕐 {{4}} at {{5}}
🎯 Confidence: {{6}}%
Alert ID: {{7}}

Parameters: {{1}}=name, {{2}}=role, {{3}}=camera_name, {{4}}=date, {{5}}=time, {{6}}=confidence, {{7}}=alert_id
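
At send time the positional parameters are supplied in order inside the request body posted to the Cloud API messages endpoint for the business phone number. The sketch below builds such a body for `person_detected_known`; the helper name `build_template_payload`, the `en_US` language code, and the recipient number are assumptions for illustration.

```python
# Order must match {{1}}..{{7}} in the approved template
PERSON_DETECTED_KNOWN_ORDER = [
    "name", "role", "camera_name", "date", "time", "confidence", "alert_id",
]


def build_template_payload(to: str, template_name: str, ordered_keys: list[str],
                           values: dict, language: str = "en_US") -> dict:
    """Build the JSON body for a WhatsApp template message."""
    return {
        "messaging_product": "whatsapp",
        "to": to,
        "type": "template",
        "template": {
            "name": template_name,
            "language": {"code": language},
            "components": [{
                "type": "body",
                # One text parameter per placeholder, in positional order
                "parameters": [
                    {"type": "text", "text": str(values[key])}
                    for key in ordered_keys
                ],
            }],
        },
    }


values = {
    "name": "John Smith", "role": "Employee", "camera_name": "Main Entrance",
    "date": "2025-01-16", "time": "14:32:15", "confidence": 97.3,
    "alert_id": "ALT-20250116-001",
}
payload = build_template_payload(
    "15550001111", "person_detected_known", PERSON_DETECTED_KNOWN_ORDER, values
)
```

A missing key raises `KeyError` before anything is sent, which is preferable to submitting a template with the wrong number of parameters.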

11.6.3 Template Variable Reference

Variable	Description	Source	Example
`{name}`	Detected person's name	Person database	"John Smith"
`{role}`	Person's role	Person database	"Employee"
`{camera_name}`	Camera display name	Camera configuration	"Main Entrance"
`{date}`	Event date	Event timestamp	"2025-01-16"
`{time}`	Event time	Event timestamp	"14:32:15"
`{confidence}`	Detection confidence %	AI inference result	"97.3"
`{alert_id}`	Unique alert identifier	Alert database	"ALT-20250116-001"
`{watchlist_type}`	Watchlist category	Watchlist configuration	"Blacklist"
`{activity_type}`	Type of suspicious activity	AI classification	"Loitering"
`{severity}`	Alert severity	Rules engine	"Critical"
`{dashboard_url}`	Deep link to dashboard	System configuration	"https://..."
`{elapsed_minutes}`	Time since alert creation	System clock	"15"
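
Because Telegram templates are sent as HTML, variable values that originate from user-editable data (for example a person's name) should be escaped before substitution. A minimal sketch, assuming templates are stored as Python format strings; the function name `render_telegram` is hypothetical.

```python
import html


def render_telegram(template: str, variables: dict) -> str:
    """Fill a Telegram HTML template, escaping every variable value."""
    # Escape <, >, & and quotes so values cannot inject or break HTML tags
    safe = {key: html.escape(str(value), quote=True)
            for key, value in variables.items()}
    return template.format(**safe)


message = render_telegram(
    "<b>{name}</b> ({role})\n📍 Camera: {camera_name}",
    {"name": "A <B> & C", "role": "Employee", "camera_name": "Main Entrance"},
)
```

A variable that is referenced by the template but missing from the dictionary raises `KeyError`, which surfaces template/parameter mismatches early.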

11.7 Retry Logic and Rate Limiting

11.7.1 Retry Configuration

Failed notifications are retried using an exponential backoff strategy to avoid overwhelming downstream services.

Parameter	Value	Description
Maximum retries	5	After the initial attempt and 5 retries have all failed, move to DLQ
Base delay	2 seconds	Initial retry wait time
Exponential base	2	Delay multiplier (2^n)
Maximum delay	300 seconds (5 minutes)	Cap on retry delay
Jitter	Up to 1 second random	Prevents thundering herd

Retry Schedule:

Attempt	Delay	Cumulative Time
1 (initial)	Immediate	0s
2	2s + jitter	~2s
3	4s + jitter	~6s
4	8s + jitter	~14s
5	16s + jitter	~30s
6 (final)	32s + jitter	~62s
DLQ	—	After 62s total
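
The schedule above follows directly from the retry parameters. A minimal sketch of the delay calculation is given below; the function name `retry_delay` and the injectable `rng` argument (used so the jitter can be controlled in tests) are assumptions.

```python
import random


def retry_delay(retry_number: int, base_delay: float = 2.0,
                max_delay: float = 300.0, max_jitter: float = 1.0,
                rng=random.random) -> float:
    """Delay in seconds before retry `retry_number` (1 = first retry)."""
    # Exponential backoff: 2s, 4s, 8s, ... capped at max_delay
    backoff = min(base_delay * 2 ** (retry_number - 1), max_delay)
    # Add up to max_jitter seconds so that retries do not synchronise
    return backoff + rng() * max_jitter


# Without jitter the five retries wait 2, 4, 8, 16 and 32 seconds (62s total)
delays = [retry_delay(n, rng=lambda: 0.0) for n in range(1, 6)]
```

With the configured maximum of 5 retries the 300-second cap is never reached; it only takes effect if the retry count is raised.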

Error Classification:

Error Code	Description	Retry?
Timeout	Request timed out	Yes
429 Too Many Requests	Rate limited by provider	Yes (wait at least as long as the provider's retry-after hint, if present)
500 Internal Server Error	Provider error	Yes
502 Bad Gateway	Provider gateway error	Yes
503 Service Unavailable	Provider temporarily down	Yes
409 Conflict	Request conflict	Yes
401 Unauthorized	Authentication failed	No (credential issue)
403 Forbidden	Permission denied	No (configuration issue)
400 Bad Request	Invalid request	No (template/parameter issue)
Chat not found	Chat ID unknown to the bot (invalid ID, or the user never started the bot)	No

Non-Retryable Errors (Immediate DLQ):

Invalid bot token (401)
Bot blocked by user (403)
Chat not found
Malformed template (400)
Message too long (after split)
Unsupported media format

11.7.2 Circuit Breaker

Each channel adapter implements a circuit breaker to prevent cascading failures:

Parameter	Value
Failure threshold	10 consecutive failures
Open state duration	60 seconds
Half-open test calls	3 successful calls required
Monitoring window	5 minutes

Circuit States:

State	Behavior	Transition Trigger
`Closed`	Normal operation; all requests pass	Initial state, or after 3 successful half-open test calls
`Open`	Fast fail; no requests sent to provider	10 consecutive failures
`Half-Open`	Limited test requests allowed	After 60-second open timeout
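
The three states can be sketched as a small state machine. The class below is illustrative only: the name `CircuitBreaker`, the injectable `clock`, and the choice to reopen the circuit on any failure during half-open are assumptions, and the 5-minute monitoring window is not modelled.

```python
import time


class CircuitBreaker:
    def __init__(self, failure_threshold: int = 10, open_seconds: float = 60.0,
                 half_open_successes: int = 3, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.open_seconds = open_seconds
        self.half_open_successes = half_open_successes
        self.clock = clock
        self.state = "closed"
        self.failures = 0
        self.successes = 0
        self.opened_at = 0.0

    def allow_request(self) -> bool:
        """Return False while the circuit is open (fast fail)."""
        if self.state == "open":
            if self.clock() - self.opened_at < self.open_seconds:
                return False
            self.state = "half_open"  # open timeout elapsed: allow test calls
            self.successes = 0
        return True

    def record_success(self) -> None:
        if self.state == "half_open":
            self.successes += 1
            if self.successes >= self.half_open_successes:
                self.state = "closed"
                self.failures = 0
        else:
            self.failures = 0  # only consecutive failures count

    def record_failure(self) -> None:
        if self.state == "half_open":
            self._open()  # assumption: one failed test call reopens the circuit
            return
        self.failures += 1
        if self.failures >= self.failure_threshold:
            self._open()

    def _open(self) -> None:
        self.state = "open"
        self.opened_at = self.clock()
        self.failures = 0
```

Each channel adapter would hold its own instance, so an open Telegram circuit has no effect on WhatsApp delivery.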

11.7.3 Rate Limiting Tiers

The notification system implements multi-tier rate limiting to prevent abuse and ensure fair resource distribution:

Tier	Limit	Scope	Burst
Global (all channels)	200 messages/minute	Across all channels combined	20
Telegram Global	30 messages/second	All Telegram traffic	5
Telegram Per-Chat	1 message/second	Per conversation	1
WhatsApp Global	80 messages/second	All WhatsApp traffic	10
WhatsApp Per-Recipient	20 messages/minute	Per phone number	3
Per Camera Source	30 alerts/minute	Prevents camera spam	5
Per Severity (Critical)	No limit	Critical alerts bypass the system's internal rate limits; provider-imposed limits still apply	N/A

Token Bucket Algorithm: Each tier maintains a token bucket. A token is consumed per message. Tokens replenish at the configured rate. If no tokens are available, the message is queued or rejected based on priority.
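
The token bucket described above can be sketched as follows. This is a minimal single-process illustration; the class name `TokenBucket`, its parameters, and the injectable `clock` are assumptions for the example, and a multi-worker deployment would need shared state (for instance in Redis).

```python
import time


class TokenBucket:
    def __init__(self, rate_per_second: float, burst: int, clock=time.monotonic):
        self.rate = rate_per_second      # tokens added per second
        self.capacity = float(burst)     # maximum tokens the bucket can hold
        self.tokens = float(burst)       # start full so a burst is allowed
        self.clock = clock
        self.updated = clock()

    def try_consume(self, tokens: float = 1.0) -> bool:
        """Take `tokens` from the bucket; return False if not enough remain."""
        now = self.clock()
        # Refill in proportion to elapsed time, never above capacity
        elapsed = now - self.updated
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= tokens:
            self.tokens -= tokens
            return True
        return False
```

For the "Telegram Per-Chat" tier this would be `TokenBucket(rate_per_second=1, burst=1)` per chat ID; a `False` result means the message is queued or rejected according to its priority.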

11.7.4 Alert Deduplication

Alerts are deduplicated to prevent notification spam when the same event triggers repeatedly:

Deduplication Key	Components	Window	Action on Duplicate
Known person	`camera_id + person_id + event_type`	5 minutes	Suppress, append counter to original
Unknown person	`camera_id + event_type`	5 minutes	Suppress, append counter to original
System alert	`alert_type + source_id`	15 minutes	Suppress, update existing message
Watchlist match	`camera_id + person_id + watchlist_id`	10 minutes	Suppress, append counter

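When a duplicate is detected, the original message is updated with a counter (e.g., "+3 more detections"), avoiding a flood of similar messages. Editing an already-sent message is possible on Telegram; the WhatsApp Cloud API does not support editing sent messages, so on that channel duplicates are suppressed without updating the original.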
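
A minimal in-memory sketch of windowed deduplication is shown below. The class name `Deduplicator`, the fixed window measured from the first alert, and the in-process dictionary are assumptions; a shared store with key expiry would be needed when several router instances run in parallel.

```python
import time


class Deduplicator:
    def __init__(self, window_seconds: float = 300.0, clock=time.monotonic):
        self.window = window_seconds
        self.clock = clock
        self._seen = {}  # key -> (first_seen_time, duplicate_count)

    def check(self, key: tuple) -> tuple[bool, int]:
        """Return (is_duplicate, duplicates_so_far) for a composite key."""
        now = self.clock()
        entry = self._seen.get(key)
        if entry is None or now - entry[0] >= self.window:
            # First occurrence, or the previous window has expired
            self._seen[key] = (now, 0)
            return False, 0
        count = entry[1] + 1
        self._seen[key] = (entry[0], count)
        return True, count
```

For a known person the key would be `(camera_id, person_id, event_type)`; the returned count is what feeds the "+N more detections" suffix.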

11.8 Escalation Rules

11.8.1 Escalation Thresholds

When an alert goes unacknowledged, it escalates automatically through up to 3 levels, each with greater urgency and a broader recipient list. Thresholds are measured from alert creation.

Severity	Level 1 (Primary)	Level 2 (Secondary)	Level 3 (Final)
Critical	5 minutes	10 minutes	20 minutes
High	15 minutes	30 minutes	60 minutes
Medium	30 minutes	60 minutes	120 minutes
Low	60 minutes	120 minutes	240 minutes
Info	Never	Never	Never
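
The threshold table maps directly onto a lookup. The sketch below assumes the thresholds are cumulative minutes since alert creation; the names `ESCALATION_MINUTES` and `escalation_level` are hypothetical.

```python
# Minutes since alert creation at which levels 1, 2 and 3 are reached
ESCALATION_MINUTES = {
    "critical": (5, 10, 20),
    "high": (15, 30, 60),
    "medium": (30, 60, 120),
    "low": (60, 120, 240),
}


def escalation_level(severity: str, elapsed_minutes: float) -> int:
    """Return the escalation level (0-3) an unacknowledged alert has reached."""
    thresholds = ESCALATION_MINUTES.get(severity.lower())
    if thresholds is None:  # "info" (or unknown severity) never escalates
        return 0
    # Count how many thresholds have already been crossed
    return sum(1 for minutes in thresholds if elapsed_minutes >= minutes)
```

For example, a critical alert left unacknowledged for 12 minutes is at level 2, while an info alert stays at level 0 regardless of age.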

11.8.2 Escalation Actions per Level

Level	Name	Notification Action	Recipient Expansion	Severity Change
0	Original	Standard routing rules	Primary recipients only	Original severity
1	Primary	Re-notify with escalation prefix	Add management group	Increase by one level
2	Secondary	Force all channels, bypass quiet hours	Add all groups	Increase by one level
3	Final	All-hands notification, include audit trail	All configured recipients	Set to Critical

Escalation Cancellation: Acknowledgment cancels ALL pending escalation timers for an alert. Acknowledgment can occur via:

Telegram inline "Acknowledge" button click
WhatsApp quick reply "Ack"
Web dashboard "Acknowledge" button
REST API POST /api/v1/alerts/{id}/acknowledge
Chat command /acknowledge {alert_id}

11.8.3 Escalation Notification Template

⬆️ <b>ESCALATION — Level {level}</b>

Original Alert: {alert_summary}
Alert ID: {alert_id}
First Detected: {first_detected_time}
Current Time: {current_time}
Unacknowledged: {elapsed_minutes} minutes
Escalation Threshold: {threshold_minutes} minutes

This alert has been escalated because it has not been acknowledged.
Please review immediately.

<a href="{acknowledge_url}">✅ Acknowledge Now</a>
<a href="{view_alert_url}">👁 View Details</a>

11.9 Media Attachment Handling

11.9.1 Media Processing Pipeline

When an alert includes media (snapshot images or video clips), a multi-stage processing pipeline ensures the media meets channel-specific requirements:

Original Media (from detection)
       │
       ▼
┌──────────────────┐
│  1. Store Original│  ──▶ MinIO/S3 (full resolution archival)
│     in Storage    │
└────────┬─────────┘
         │
         ▼
┌──────────────────┐
│  2. Process for  │
│     Telegram     │
└────────┬─────────┘
         │
         ├──▶ Image: Resize 1280x720, JPEG quality 85, max 10 MB
         ├──▶ Video: H.264, 1280x720, max 50 MB, max 60 seconds
         └──▶ Media Group: Each image < 10 MB, max 10 items
         │
         ▼
┌──────────────────┐
│  3. Process for  │
│     WhatsApp     │
└────────┬─────────┘
         │
         ├──▶ Image: Resize 1600x900, JPEG quality 80, max 5 MB
         └──▶ Video: H.264, 1280x720, max 16 MB, max 60 seconds

11.9.2 Image Processing Details

Step	Operation	Parameters
1. Load	Open source image	Pillow (PIL)
2. Convert	Convert to RGB	Drop alpha channel if present
3. Resize	Scale to target dimensions	Lanczos resampling
4. Compress	JPEG encoding	Quality: 85 (Telegram), 80 (WhatsApp)
5. Check size	Verify file size under limit	If over limit, reduce quality iteratively
6. Fallback	Aggressive compression	If still over limit at the minimum quality (40), reduce dimensions by 25% and repeat

Iterative Quality Reduction:

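import io

from PIL import Image


def compress_image_to_limit(image, size_limit_mb, channel):
    """Return JPEG bytes for `image` no larger than `size_limit_mb`."""
    limit_bytes = int(size_limit_mb * 1024 * 1024)
    start_quality = 85 if channel == 'telegram' else 80
    min_quality = 40

    # JPEG has no alpha channel; convert palette/RGBA images first
    if image.mode != 'RGB':
        image = image.convert('RGB')

    while True:
        # Step quality down in increments of 5 until the output fits
        for quality in range(start_quality, min_quality - 1, -5):
            buffer = io.BytesIO()
            image.save(buffer, format='JPEG', quality=quality, optimize=True)
            if buffer.tell() <= limit_bytes:
                return buffer.getvalue()

        # Still over the limit at minimum quality: shrink by 25% and retry
        new_size = (int(image.width * 0.75), int(image.height * 0.75))
        if min(new_size) < 1:
            raise ValueError('image cannot be compressed under the size limit')
        image = image.resize(new_size, Image.LANCZOS)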

11.9.3 Video Processing Details

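Videos are encoded with FFmpeg at a target bitrate calculated from the size limit. The command below is a single-pass, rate-capped encode; a two-pass encode would run it twice with `-pass 1` and `-pass 2` to hit the target size more precisely:

# Target total bitrate: (size_limit_bytes * 8) / duration_seconds
# Example: 16 MB limit, 10-second clip = (16*1024*1024*8) / 10 = ~13.4 Mbps
# The 10M video target leaves headroom for the 128 kbps audio track and container overhead.
#
# Options: -b:v target video bitrate; -maxrate maximum bitrate; -bufsize rate-control buffer;
#          -c:a/-b:a audio encoding; -movflags +faststart web-optimized playback;
#          -preset encoding speed/quality tradeoff.

ffmpeg -i input.mp4 \
    -c:v libx264 \
    -b:v 10M \
    -maxrate 12M \
    -bufsize 20M \
    -vf "scale=1280:720:force_original_aspect_ratio=decrease" \
    -c:a aac -b:a 128k \
    -movflags +faststart \
    -preset fast \
    -y output.mp4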
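
The bitrate formula in the comment above can be sketched in Python. This is a minimal illustration: the function name, the 128 kbps audio default (taken from the command's `-b:a 128k`), and the 5% safety margin for container overhead are assumptions, not values fixed by this blueprint.

```python
def target_video_bitrate_bps(size_limit_mb: float, duration_s: float,
                             audio_bps: int = 128_000,
                             safety_margin: float = 0.05) -> int:
    """Return a video bitrate (bits/s) intended to keep the file under the limit."""
    if duration_s <= 0:
        raise ValueError("duration_s must be positive")
    # Total budget: (size_limit_bytes * 8) / duration_seconds
    total_bps = (size_limit_mb * 1024 * 1024 * 8) / duration_s
    # Reserve room for the audio track and an assumed container overhead.
    video_bps = total_bps * (1.0 - safety_margin) - audio_bps
    if video_bps <= 0:
        raise ValueError("clip is too long for this size limit")
    return int(video_bps)
```

The returned value would be passed to FFmpeg as `-b:v`, with `-maxrate` set slightly above it.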

11.10 Delivery Tracking

11.10.1 Delivery Status Lifecycle

Every notification progresses through a well-defined status lifecycle, tracked in the database for audit and troubleshooting:

Status	Description	Terminal?
`pending`	Queued, waiting to be sent	No
`processing`	Currently being sent to provider	No
`sent`	API request to provider succeeded	No
`delivered`	Provider confirmed delivery to device	No
`read`	Recipient opened/read the message	No
`engaged`	User interacted (button click, reaction)	Yes
`failed`	Permanently failed (non-retryable error)	Yes
`retrying`	Scheduled for retry attempt	No
`dead_letter`	Moved to DLQ after all retries exhausted	Yes
`suppressed`	Blocked by quiet hours or deduplication	Yes
`cancelled`	Cancelled (e.g., acknowledged before send)	Yes
`expired`	Message TTL expired before delivery	Yes
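
The terminal flags in the table above can be captured in a small enumeration. This is a minimal sketch; `DeliveryStatus`, `TERMINAL_STATUSES`, and `is_terminal` are illustrative names rather than identifiers from the platform's codebase.

```python
from enum import Enum


class DeliveryStatus(str, Enum):
    PENDING = "pending"
    PROCESSING = "processing"
    SENT = "sent"
    DELIVERED = "delivered"
    READ = "read"
    ENGAGED = "engaged"
    FAILED = "failed"
    RETRYING = "retrying"
    DEAD_LETTER = "dead_letter"
    SUPPRESSED = "suppressed"
    CANCELLED = "cancelled"
    EXPIRED = "expired"


# Statuses marked "Yes" in the Terminal? column.
TERMINAL_STATUSES = frozenset({
    DeliveryStatus.ENGAGED, DeliveryStatus.FAILED, DeliveryStatus.DEAD_LETTER,
    DeliveryStatus.SUPPRESSED, DeliveryStatus.CANCELLED, DeliveryStatus.EXPIRED,
})


def is_terminal(status: str) -> bool:
    """Return True when no further status change is expected."""
    return DeliveryStatus(status) in TERMINAL_STATUSES
```

A worker could use `is_terminal` to decide whether a notification row still needs polling or retry scheduling.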

Status Transitions:

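pending → processing → sent → delivered → read → engaged
   │          │
   │          ├──▶ failed        (non-retryable error)
   │          └──▶ retrying ──▶ processing (next attempt)
   │                   │
   │                   └──▶ dead_letter (retries exhausted)
   ├──▶ suppressed  (quiet hours or deduplication)
   ├──▶ cancelled   (acknowledged before send)
   └──▶ expired     (TTL elapsed before delivery)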

11.10.2 Dead Letter Queue (DLQ)

Failed notifications that exhaust all retry attempts are moved to a Redis-backed Dead Letter Queue. Admin users can review and manage DLQ entries through the web dashboard.

DLQ Feature	Description
Storage	Redis sorted set, ordered by failure timestamp
Retention	30 days
View	Filterable by channel, error type, date range
Actions	Retry individual, Retry all (batch), Discard, Export
Alert	Daily digest of DLQ count; alert if > 10 entries
Auto-retry	Optional: automatically retry DLQ entries every 6 hours

11.11 API Endpoints Summary

11.11.1 REST Endpoints (13 endpoints)

#	Method	Endpoint	Purpose	Auth
1	GET	`/api/v1/notifications/rules`	List all routing rules	Admin
2	POST	`/api/v1/notifications/rules`	Create new routing rule	Admin
3	GET	`/api/v1/notifications/rules/{id}`	Get specific rule	Admin
4	PUT	`/api/v1/notifications/rules/{id}`	Update routing rule	Admin
5	DELETE	`/api/v1/notifications/rules/{id}`	Delete routing rule	Admin
6	GET	`/api/v1/notifications/templates`	List message templates	Admin
7	POST	`/api/v1/notifications/templates`	Create/update template	Admin
8	GET	`/api/v1/notifications/delivery-status/{alert_id}`	Get delivery status for alert	Operator+
9	GET	`/api/v1/notifications/{id}/status`	Single notification status	Operator+
10	POST	`/api/v1/notifications/{id}/retry`	Manual retry of failed notification	Admin
11	GET	`/api/v1/notifications/dlq`	List dead letter queue	Admin
12	POST	`/api/v1/notifications/dlq/retry-all`	Retry all DLQ entries	Admin
13	POST	`/api/v1/notifications/dlq/clear`	Clear all DLQ entries	Admin

11.11.2 Alert Management Endpoints

#	Method	Endpoint	Purpose	Auth
1	GET	`/api/v1/alerts`	List alerts with filters	Operator+
2	GET	`/api/v1/alerts/{id}`	Get single alert details	Operator+
3	POST	`/api/v1/alerts/{id}/acknowledge`	Acknowledge alert	Operator+
4	POST	`/api/v1/alerts/{id}/resolve`	Resolve alert	Operator+
5	POST	`/api/v1/alerts/{id}/ignore`	Ignore alert	Operator+
6	POST	`/api/v1/alerts/{id}/false-positive`	Mark as false positive	Operator+
7	POST	`/api/v1/alerts/bulk/acknowledge`	Bulk acknowledge	Operator+
8	POST	`/api/v1/alerts/bulk/ignore`	Bulk ignore	Operator+

11.11.3 WebSocket Endpoints (2 endpoints)

Endpoint	Purpose	Authentication
`WS /api/v1/notifications/live`	Real-time notification stream for connected clients	JWT token in query parameter
`WS /api/v1/alerts/stream`	Live alert feed for operator dashboards	JWT token in query parameter

11.11.4 Webhook Endpoints (2 endpoints)

Endpoint	Source	Purpose
`POST /webhooks/telegram`	Telegram servers	Receive Bot API updates: callback queries, incoming messages, chat events (the Bot API does not report delivery or read receipts)
`POST /webhooks/whatsapp`	Meta servers	Receive message status updates, incoming messages

Webhook Security:

Measure	Implementation
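Telegram	Secret token check: the `secret_token` set via `setWebhook` is compared with the `X-Telegram-Bot-Api-Secret-Token` header
WhatsApp	HMAC-SHA256 signature verification of the raw body using the app secret (`X-Hub-Signature-256` header)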
IP allowlisting	Only accept requests from Telegram/Meta IP ranges
Replay protection	Reject messages with timestamps older than 5 minutes
Rate limiting	100 requests per minute per source IP
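
The webhook checks can be sketched with the standard library. Meta signs each webhook body with HMAC-SHA256 under the app secret and sends the result in the `X-Hub-Signature-256` header; Telegram's Bot API instead echoes the `secret_token` supplied to `setWebhook` in the `X-Telegram-Bot-Api-Secret-Token` header. The function names below are illustrative, and how the headers and raw body are read depends on the web framework.

```python
import hashlib
import hmac
from typing import Optional


def verify_meta_signature(raw_body: bytes, signature_header: Optional[str],
                          app_secret: str) -> bool:
    """Check the 'sha256=<hex>' value from X-Hub-Signature-256."""
    prefix = "sha256="
    if not signature_header or not signature_header.startswith(prefix):
        return False
    expected = hmac.new(app_secret.encode("utf-8"), raw_body,
                        hashlib.sha256).hexdigest()
    received = signature_header[len(prefix):]
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(expected.encode(), received.encode())


def verify_telegram_secret(header_value: Optional[str],
                           configured_secret: str) -> bool:
    """Compare X-Telegram-Bot-Api-Secret-Token with the configured secret."""
    if header_value is None:
        return False
    return hmac.compare_digest(header_value.encode(), configured_secret.encode())
```

The signature must be computed over the exact bytes received, before any JSON parsing or re-serialization.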

Section 12: Security Design

12.1 Security Architecture Overview

The Sentinel AI Surveillance Platform implements defense-in-depth security across seven distinct layers. Every component — from network perimeter to data storage — has been designed with security as a primary consideration, reflecting the sensitive nature of surveillance data, biometric information, and the critical safety function the system performs.

┌──────────────────────────────────────────────────────────────────────────────┐
│                      DEFENSE IN DEPTH ARCHITECTURE                            │
│                                                                              │
│   LAYER 1: PERIMETER                                                         │
│   ─────────────────                                                          │
│   AWS WAF v2 │ Geo-restriction │ DDoS protection │ Rate limiting             │
│                                                                              │
│   LAYER 2: TRANSPORT                                                         │
│   ─────────────────                                                          │
│   TLS 1.3 │ mTLS internal │ WireGuard ChaCha20-Poly1305 │ Certificate mgmt  │
│                                                                              │
│   LAYER 3: AUTHENTICATION & AUTHORIZATION                                    │
│   ─────────────────────────────────────────                                  │
│   Argon2id │ JWT ES256 │ TOTP MFA │ RBAC 4 roles │ API keys                 │
│                                                                              │
│   LAYER 4: APPLICATION SECURITY                                              │
│   ────────────────────────────                                               │
│   Input validation │ Parameterized queries │ CSP │ CSRF │ CORS │ File upload │
│                                                                              │
│   LAYER 5: DATA SECURITY                                                     │
│   ────────────────────                                                       │
│   AES-256-GCM at rest │ Field-level encryption │ Signed URLs │ Key rotation  │
│                                                                              │
│   LAYER 6: NETWORK SEGMENTATION                                              │
│   ───────────────────────────                                                │
│   VPC private subnets │ Security groups │ Network Policies │ Firewall rules  │
│                                                                              │
│   LAYER 7: AUDIT & MONITORING                                                │
│   ─────────────────────────                                                  │
│   Hash-chain audit log │ Real-time alerts │ CloudTrail │ Flow Logs           │
└──────────────────────────────────────────────────────────────────────────────┘

12.2 SSL/TLS Configuration

12.2.1 Protocol and Cipher Suite Requirements

All external-facing services enforce strong TLS configuration with modern cipher suites:

Setting	Value	Rationale
Minimum TLS Version	TLS 1.2	Fallback for older clients; TLS 1.3 preferred
Preferred TLS Version	TLS 1.3	Fastest, most secure handshake
Cipher Suites (TLS 1.2)	`ECDHE-ECDSA-AES256-GCM-SHA384`	Forward secrecy, AES-GCM authenticated encryption
Cipher Suites (TLS 1.2)	`ECDHE-RSA-AES256-GCM-SHA384`	Same with RSA certificates
Cipher Suites (TLS 1.2)	`ECDHE-ECDSA-CHACHA20-POLY1305`	Mobile-optimized cipher
Cipher Suites (TLS 1.2)	`ECDHE-RSA-CHACHA20-POLY1305`	Mobile-optimized with RSA
Cipher Suites (TLS 1.3)	`TLS_AES_256_GCM_SHA384`	Preferred TLS 1.3 cipher (AES-256-GCM)
Cipher Suites (TLS 1.3)	`TLS_CHACHA20_POLY1305_SHA256`	Alternative TLS 1.3 cipher
Disabled Ciphers	CBC mode, RC4, 3DES, DES, MD5, SHA1, RSA key exchange (no forward secrecy)	Known weaknesses
HSTS	`max-age=63072000; includeSubDomains; preload`	2-year HSTS with preload eligibility
OCSP Stapling	Enabled	Reduces certificate validation latency
Certificate Provider	Let's Encrypt (ACME v2)	Free, automated, trusted
Auto-renewal	30 days before expiry (day 60 of the 90-day certificate lifetime)	Leaves a 30-day buffer for renewal retries
Certificate Transparency	Required	All certificates publicly logged

12.2.2 mTLS for Internal Service Communication

All inter-service communication uses mutual TLS (mTLS) with client certificate verification. This means both the client and server must present valid certificates signed by the internal Certificate Authority.

Parameter	Value
Internal CA	Self-managed ECDSA P-256 CA
Certificate lifetime	90 days (auto-rotated)
Verification mode	Required (reject if no client cert)
Revocation	CRL + OCSP
Service identity	SPIFFE URI in certificate Subject Alternative Name

Benefits of mTLS:

Even if network boundaries are breached, unauthorized services cannot access internal APIs
Every service-to-service call is authenticated and encrypted
Certificates provide strong service identity (not just IP-based)
No shared secrets between services (except Vault tokens)

12.2.3 TLS Configuration Code Example

# FastAPI TLS configuration (served by uvicorn)
import ssl

import uvicorn
from fastapi import FastAPI

app = FastAPI()

# TLS settings passed to uvicorn.run()
ssl_config = {
    "ssl_keyfile": "/certs/server.key",
    "ssl_certfile": "/certs/server.crt",
    "ssl_ca_certs": "/certs/ca.crt",          # For mTLS
    "ssl_cert_reqs": ssl.CERT_REQUIRED,        # Require client cert
    # uvicorn.run() takes no minimum-version option; the AEAD-only cipher
    # list below excludes TLS 1.0/1.1, which define no GCM or ChaCha20 suites.
    "ssl_ciphers": "ECDHE-ECDSA-AES256-GCM-SHA384:"
                    "ECDHE-RSA-AES256-GCM-SHA384:"
                    "ECDHE-ECDSA-CHACHA20-POLY1305:"
                    "ECDHE-RSA-CHACHA20-POLY1305",
}

if __name__ == "__main__":
    # Host and port are illustrative values
    uvicorn.run(app, host="0.0.0.0", port=8443, **ssl_config)

12.3 Authentication

12.3.1 Password Policy

Requirement	Value	Enforcement
Minimum length	12 characters	Hard validation
Complexity	At least one uppercase, one lowercase, one digit, one special character	Regex validation
Password history	Last 12 passwords cannot be reused	Database check
Hashing algorithm	Argon2id (memory-hard, resistant to GPU cracking)	Passwords never stored in plaintext
Argon2id parameters	Time cost: 3, Memory: 64MB, Parallelism: 4	Tuned for 500ms hash time
HaveIBeenPwned check	Enabled for all new passwords	k-anonymity API (no full password sent)
Maximum age	90 days	Configurable; reminder at 75 days
Lockout after failures	5 failed attempts	30-minute lockout
Password change	Users cannot reuse current password	Immediate validation
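
The k-anonymity check against HaveIBeenPwned can be sketched as follows. Only the first five hex characters of the password's SHA-1 digest are sent to the range endpoint (`https://api.pwnedpasswords.com/range/{prefix}`); the response lists `SUFFIX:COUNT` lines, which are matched locally. The function names are illustrative, and the HTTP request itself is left out so the example stays self-contained.

```python
import hashlib
from typing import Tuple


def hibp_prefix_and_suffix(password: str) -> Tuple[str, str]:
    """Split the uppercase SHA-1 hex digest into (5-char prefix, 35-char suffix)."""
    digest = hashlib.sha1(password.encode("utf-8")).hexdigest().upper()
    return digest[:5], digest[5:]


def suffix_in_range_response(suffix: str, response_text: str) -> bool:
    """Return True if the suffix appears in a range-API response body."""
    for line in response_text.splitlines():
        candidate, _, _count = line.partition(":")
        if candidate.strip().upper() == suffix.upper():
            return True
    return False
```

Only the prefix leaves the server, so the remote service never learns the full hash, let alone the password.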

12.3.2 JWT Token Configuration

Parameter	Value	Notes
Signing algorithm	ES256 (ECDSA with P-256 curve)	Smaller signatures than RS256; comparable security (P-256 offers roughly 128-bit strength)
Access token lifetime	15 minutes	Short-lived for security
Refresh token lifetime	7 days	Long-lived but revocable
Key rotation	Every 180 days	Dual-key support for zero-downtime rotation
Key storage	HashiCorp Vault	Private key never exposed to application filesystem
Token binding	Session ID + browser fingerprint	Detects token theft/reuse
Claims	`sub`, `iss`, `aud`, `exp`, `iat`, `jti`, `role`, `permissions`, `mfa_verified`, `session_id`	Standard + custom claims
Issuer	`sentinel-ai`	Verified by all services
Audience	`sentinel-api`	Scope-limited

JWT Token Structure:

{
  "header": {
    "alg": "ES256",
    "typ": "JWT",
    "kid": "key-2025-01"
  },
  "payload": {
    "sub": "user-uuid-here",
    "iss": "sentinel-ai",
    "aud": "sentinel-api",
    "exp": 1705500000,
    "iat": 1705499100,
    "jti": "unique-token-id",
    "role": "operator",
    "permissions": ["alerts:view", "alerts:acknowledge", "cameras:view"],
    "mfa_verified": true,
    "session_id": "sess-uuid-here"
  }
}
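
Once a JWT library has verified the ES256 signature, the decoded payload still needs claim checks. The sketch below covers only that second step under stated assumptions: `claim_problems` is a hypothetical helper, the issuer and audience defaults come from the table above, and signature verification is deliberately out of scope.

```python
import time
from typing import List, Optional


def claim_problems(payload: dict, required_permission: str,
                   now: Optional[float] = None,
                   issuer: str = "sentinel-ai",
                   audience: str = "sentinel-api",
                   require_mfa: bool = False) -> List[str]:
    """Return a list of claim problems; an empty list means the claims pass."""
    now = time.time() if now is None else now
    problems = []
    if payload.get("iss") != issuer:
        problems.append("unexpected issuer")
    if payload.get("aud") != audience:
        problems.append("unexpected audience")
    exp = payload.get("exp")
    if not isinstance(exp, (int, float)) or now >= exp:
        problems.append("token expired or missing exp")
    if required_permission not in payload.get("permissions", []):
        problems.append("missing permission")
    if require_mfa and payload.get("mfa_verified") is not True:
        problems.append("MFA not verified")
    return problems
```

Returning every problem at once, rather than failing on the first, makes audit log entries for rejected tokens more informative.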

12.3.3 Multi-Factor Authentication (MFA)

Parameter	Value
Method	TOTP (Time-based One-Time Password) per RFC 6238
Issuer label	"Sentinel AI" (matches the `issuer` parameter in the QR code URI)
Algorithm	SHA-1 (for compatibility)
Digit length	6 digits
Time step	30 seconds
Valid window	Current step plus 1 step before and after (3 accepted steps, ±30 seconds)
Recovery codes	10 single-use codes generated at setup
Enforced for	Super Admin, Admin roles (mandatory)
Optional for	Operator, Viewer roles (recommended)
QR code format	otpauth://totp/Sentinel%20AI:{username}?secret={secret}&issuer=Sentinel%20AI

MFA Enforcement Matrix:

Role	MFA Required	Can Disable
Super Admin	Yes	No
Admin	Yes	No
Operator	No (Recommended)	Yes
Viewer	No	Yes
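
The TOTP parameters in this section (SHA-1, 6 digits, 30-second step, one step of tolerance either side) map directly onto RFC 6238. The sketch below uses only the standard library; the function names are illustrative, and the secret is taken as raw bytes (authenticator apps exchange it Base32-encoded).

```python
import hashlib
import hmac
import struct
import time
from typing import Optional


def hotp(secret: bytes, counter: int, digits: int = 6) -> str:
    """HOTP (RFC 4226) with HMAC-SHA-1 and dynamic truncation."""
    mac = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = mac[-1] & 0x0F
    code = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)


def verify_totp(secret: bytes, submitted: str, now: Optional[float] = None,
                step: int = 30, window: int = 1) -> bool:
    """Accept codes for the current time step and `window` steps either side."""
    counter = int((time.time() if now is None else now) // step)
    for delta in range(-window, window + 1):
        if counter + delta < 0:
            continue
        candidate = hotp(secret, counter + delta)
        if hmac.compare_digest(candidate.encode(), submitted.encode()):
            return True
    return False
```

A production verifier would also remember the last accepted counter per user so that a code cannot be replayed within its validity window.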

12.4 Role-Based Access Control (RBAC)

12.4.1 Role Definitions

Role	Level	Description	Typical Users	Count
Super Admin	L1	Full system access; can manage other admins	CISO, CTO, Platform Lead	1-2
Admin	L2	Administrative functions; day-to-day management	Security Manager, IT Manager	2-4
Operator	L3	Day-to-day surveillance operations	Security guards, SOC analysts	5-20
Viewer	L4	Read-only access for review and audit	Auditors, Management	2-10

12.4.2 Permission Matrix (30+ Permissions)

Permission	Super Admin	Admin	Operator	Viewer
`users:full_access`	Y	N	N	N
`users:manage` (create/edit/deactivate)	Y	Y	N	N
`users:view` (list, details)	Y	Y	Y	Y
`users:reset_password`	Y	Y	N	N
`users:reset_mfa`	Y	Y	N	N
`cameras:full_access`	Y	N	N	N
`cameras:manage` (add/edit/remove)	Y	Y	N	N
`cameras:view` (list, status)	Y	Y	Y	Y
`cameras:control` (PTZ, restart stream)	Y	Y	Y	N
`cameras:configure_zones`	Y	Y	N	N
`alerts:manage` (edit rules, bulk actions)	Y	Y	N	N
`alerts:view` (list, filter, search)	Y	Y	Y	Y
`alerts:acknowledge`	Y	Y	Y	N
`alerts:resolve`	Y	Y	Y	N
`alerts:mark_false_positive`	Y	Y	Y	N
`persons:full_access`	Y	N	N	N
`persons:manage` (create/edit/delete)	Y	Y	N	N
`persons:view` (gallery, profiles)	Y	Y	Y	Y
`persons:name_unknown`	Y	Y	Y	N
`persons:merge`	Y	Y	Y	N
`watchlists:manage` (create/edit/delete)	Y	Y	N	N
`watchlists:view` (list, members)	Y	Y	Y	Y
`watchlists:add_remove_members`	Y	Y	Y	N
`ai_settings:manage` (change defaults)	Y	Y	N	N
`ai_settings:view` (see current settings)	Y	Y	Y	Y
`ai_settings:adjust` (operator adjustments)	Y	Y	Y	N
`reports:full_access`	Y	N	N	N
`reports:view` (all reports)	Y	Y	Y	Y
`reports:export`	Y	Y	Y	N
`system:full_access`	Y	N	N	N
`system:manage` (config changes)	Y	Y	N	N
`system:view` (health, status)	Y	Y	Y	Y
`audit:view` (audit logs)	Y	Y	N	N
`notifications:manage` (routing rules)	Y	Y	N	N
`storage:manage` (retention policies)	Y	Y	N	N
`storage:view` (usage, reports)	Y	Y	Y	Y
`privacy:manage` (GDPR actions)	Y	Y	N	N
`privacy:view` (consent status)	Y	Y	Y	Y

12.4.3 Resource-Level Permissions

Beyond global permissions, the system supports resource-level access control:

Resource Type	Granularity	Example
Cameras	Per-camera access	Operator A can only view CAM-01, CAM-02
Zones	Per-zone access	Operator B can only view "entrance" zone
Alerts	Per-camera origin	Viewer can only see alerts from specific cameras
Persons	Per-department	HR can only view employee records
Watchlists	Per-watchlist	Security can only view "blacklist", not "vip"
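
A combined check of global and resource-level permissions might look like the sketch below. It encodes only a small subset of the permission matrix, and it assumes that a per-user camera allowlist of `None` means "no resource restriction"; both the data layout and the function name are hypothetical.

```python
from typing import Dict, FrozenSet, Optional, Set

# Subset of the permission matrix in 12.4.2 (illustrative, not exhaustive).
_VIEWER = frozenset({"cameras:view", "alerts:view"})
_OPERATOR = _VIEWER | {"cameras:control", "alerts:acknowledge"}
_ADMIN = _OPERATOR | {"cameras:manage", "alerts:manage"}
_SUPER_ADMIN = _ADMIN | {"cameras:full_access"}

ROLE_PERMISSIONS: Dict[str, FrozenSet[str]] = {
    "viewer": _VIEWER,
    "operator": _OPERATOR,
    "admin": _ADMIN,
    "super_admin": _SUPER_ADMIN,
}


def can_access_camera(role: str, permission: str, camera_id: str,
                      allowed_cameras: Optional[Set[str]]) -> bool:
    """Require the global permission first, then the per-camera scope."""
    if permission not in ROLE_PERMISSIONS.get(role, frozenset()):
        return False
    # Assumption: None means the user has no camera-level restriction.
    return allowed_cameras is None or camera_id in allowed_cameras
```

Checking the global permission before the resource scope keeps the deny path cheap and means an unknown role is always rejected.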

12.5 VPN and Network Security

12.5.1 WireGuard VPN Configuration

WireGuard provides the encrypted tunnel between cloud infrastructure and the edge site:

Parameter	Value	Notes
Protocol	WireGuard	Modern, simple, fast VPN
Port	UDP 51820	Single port, firewall-friendly
Authentication	Curve25519 key pairs + Preshared Key (PSK)	Defense in depth
Encryption	ChaCha20-Poly1305	Fast on hardware without AES-NI
Key exchange	Curve25519 elliptic curve	128-bit security
Tunnel network	10.200.0.0/24	Dedicated VPN subnet
Cloud endpoint	10.200.0.1/32	Single IP for cloud side
Edge endpoint	10.200.0.2/32	Single IP for edge side
AllowedIPs (cloud)	10.200.0.2/32, 192.168.29.0/24	Edge + camera network only
AllowedIPs (edge)	10.100.0.0/16, 10.200.0.0/24	Full cloud VPC + VPN
Keepalive	25 seconds	Prevents NAT timeout
Key rotation	365 days	Annual rotation via maintenance window
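
The edge-side values from the table can be rendered into a `wg-quick` style configuration. This is a sketch only: the key arguments are placeholders, the cloud hostname is hypothetical, and the function name is illustrative. The addresses, port, AllowedIPs, and keepalive come from the table above.

```python
import ipaddress

EDGE_ADDRESS = "10.200.0.2/32"
EDGE_ALLOWED_IPS = ("10.100.0.0/16", "10.200.0.0/24")


def render_edge_config(edge_private_key: str, cloud_public_key: str,
                       preshared_key: str, cloud_host: str,
                       port: int = 51820, keepalive_s: int = 25) -> str:
    """Render the edge gateway's WireGuard configuration as text."""
    # Fail early if any CIDR value is malformed.
    for cidr in (EDGE_ADDRESS, *EDGE_ALLOWED_IPS):
        ipaddress.ip_network(cidr)
    lines = [
        "[Interface]",
        f"Address = {EDGE_ADDRESS}",
        f"PrivateKey = {edge_private_key}",
        "",
        "[Peer]",
        f"PublicKey = {cloud_public_key}",
        f"PresharedKey = {preshared_key}",
        f"Endpoint = {cloud_host}:{port}",
        f"AllowedIPs = {', '.join(EDGE_ALLOWED_IPS)}",
        f"PersistentKeepalive = {keepalive_s}",
    ]
    return "\n".join(lines) + "\n"
```

In practice the key material would be injected from the secret store at deploy time rather than passed around as plain strings.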

12.5.2 Network Segmentation Architecture

┌──────────────────────────────────────────────────────────────────────────────┐
│                         NETWORK ARCHITECTURE                                  │
│                                                                              │
│   INTERNET                                                                   │
│      │                                                                       │
│      ▼                                                                       │
│   ┌──────────────┐                                                           │
│   │  AWS WAF     │                                                           │
│   │  + ALB       │                                                           │
│   └──────┬───────┘                                                           │
│          │                                                                   │
│   ═══════╪════════════════ AWS CLOUD VPC: 10.100.0.0/16 ═══════════════════  │
│          │                                                                   │
│          │    ┌──────────────────────────────────────────────────────┐       │
│          │    │  PUBLIC SUBNET: 10.100.1.0/24                      │       │
│          │    │  - ALB (Application Load Balancer)                  │       │
│          │    │  - NAT Gateway                                       │       │
│          │    │  - WireGuard VPN Gateway (10.200.0.1)               │       │
│          │    │  - Bastion Host (emergency SSH, admin IPs only)     │       │
│          │    └──────────────────────────────────────────────────────┘       │
│          │                                                                   │
│          └────▶┌──────────────────────────────────────────────────────┐      │
│               │  PRIVATE SUBNET: 10.100.2.0/24 (App Tier)            │      │
│               │  - EKS Worker Nodes (API, AI, Web pods)              │      │
│               │  - Stream Ingestion Service                          │      │
│               │  - Alert Engine                                      │      │
│               │  - Notification Service                              │      │
│               └──────────────────────────────────────────────────────┘      │
│               ▲                                                              │
│               │    ┌──────────────────────────────────────────────────────┐ │
│               │    │  DATA SUBNET: 10.100.3.0/24 (No Internet)          │ │
│               │    │  - RDS PostgreSQL (Multi-AZ)                        │ │
│               │    │  - ElastiCache Redis Cluster                        │ │
│               │    │  - Amazon MSK Kafka                                 │ │
│               │    │  - NO INTERNET ACCESS (VPC endpoints only)          │ │
│               │    └──────────────────────────────────────────────────────┘ │
│               │                                                              │
│               │    ┌──────────────────────────────────────────────────────┐ │
│               │    │  MONITORING SUBNET: 10.100.4.0/24                  │ │
│               │    │  - Prometheus, Grafana, Alertmanager                │ │
│               │    │  - Loki (log aggregation)                           │ │
│               │    │  - Jaeger (distributed tracing)                     │ │
│               │    └──────────────────────────────────────────────────────┘ │
│               │                                                              │
│   ════════════╪══════════════════════════════════════════════════════════    │
│               │                                                              │
│               │         WireGuard VPN Tunnel (UDP 51820)                   │
│               │                                                              │
│   ════════════╪══════════════════════════════════════════════════════════    │
│               │                                                              │
│               │    ┌──────────────────────────────────────────────────────┐ │
│               │    │  EDGE GATEWAY: 192.168.29.5/24 (Intel NUC)           │ │
│               │    │  OS: Ubuntu Server 22.04 LTS (minimal)               │ │
│               │    │  - Docker Compose stack                              │ │
│               │    │  - WireGuard Client (10.200.0.2)                     │ │
│               │    │  - Local MinIO (hot storage)                         │ │
│               │    │  - Redis (local cache)                               │ │
│               │    │  - Video Capture Service                             │ │
│               │    │  - AI Inference (edge models)                        │ │
│               │    └──────────────────────────────────────────────────────┘ │
│               │                              │                               │
│               │    ┌─────────────────────────┴──────────────────────┐       │
│               │    │  CAMERA LAN: 192.168.29.0/24                    │       │
│               │    │  - CP PLUS DVR: 192.168.29.200 (8 channels)     │       │
│               │    │  - RTSP streams on port 554                     │       │
│               │    │  - NO INTERNET ACCESS                           │       │
│               │    │  - NO ROUTE TO CLOUD (only via edge gateway)    │       │
│               │    └────────────────────────────────────────────────┘       │
│                                                                              │
└──────────────────────────────────────────────────────────────────────────────┘

12.5.3 Firewall Rules

Edge Gateway Firewall (iptables):

Direction	Protocol	Port	Source	Destination	Action	Purpose
IN	TCP	22	Admin IP range only	Edge gateway	ACCEPT	SSH management
IN	UDP	51820	Cloud VPN IP	Edge gateway	ACCEPT	WireGuard tunnel
IN	TCP	8080	Local LAN only	Edge gateway	ACCEPT	Admin UI
IN	—	—	Any	Edge gateway	DROP	Default deny
OUT	TCP	443	Edge gateway	AWS S3 endpoint	ACCEPT	Cloud storage sync
OUT	UDP	51820	Edge gateway	Cloud VPN IP	ACCEPT	WireGuard tunnel
OUT	TCP	8080	Edge gateway	Local LAN	ACCEPT	Internal services
OUT	—	—	Edge gateway	Internet	DROP	No direct internet

Cloud Firewall (AWS Security Groups):

Direction	Protocol	Port	Source	Action	Purpose
IN	TCP	443	0.0.0.0/0	ACCEPT	Public HTTPS
IN	UDP	51820	Edge gateway IP	ACCEPT	WireGuard
IN	TCP	5432	App security group	ACCEPT	PostgreSQL
IN	TCP	6379	App security group	ACCEPT	Redis
IN	TCP	9092	App security group	ACCEPT	Kafka
IN	TCP	22	Admin IPs only	ACCEPT	Bastion SSH
IN	—	—	Any	DENY (implicit)	Default deny; security groups are allow-only, so no explicit rule is configured

12.6 Secret Management

12.6.1 Vault Integration

All secrets are stored in HashiCorp Vault with automatic rotation policies:

Secret Type	Encryption	Rotation Frequency	Rotation Method	Access Pattern
Database passwords	AES-256-GCM	90 days	Terraform + Vault dynamic credentials	Short-lived (1-hour TTL)
JWT signing keys	AES-256-GCM	180 days	Dual-key grace period	Zero-downtime rotation
Internal API keys	AES-256-GCM	90 days	Zero-downtime rotation	Automated
Telegram bot tokens	AES-256-GCM	180 days	Regenerate via BotFather	Semi-automated
WhatsApp API tokens	AES-256-GCM	180 days	Regenerate via Meta Business Manager	Semi-automated
DVR credentials	AES-256-GCM	180 days	Manual via DVR web UI	Manual
TLS certificates	ACME auto	60 days	cert-manager + Let's Encrypt	Fully automated
WireGuard keys	AES-256-GCM	365 days	Maintenance window rotation	Scripted
Backup encryption keys	AES-256-GCM	365 days	Re-encrypt all backups	Automated
Session secrets	AES-256-GCM	On security incident	Immediate revocation	Admin trigger

12.6.2 Dynamic Database Credentials

Instead of static database passwords, the system uses Vault's dynamic credential engine:

Application → Vault (request db credentials)
                  │
                  ▼
           Vault creates temporary DB user
           (TTL: 1 hour, auto-revoke)
                  │
                  ▼
           Application receives credentials
           Uses them for DB connections
                  │
                  ▼
           After TTL expires → Vault revokes DB user
           Application requests new credentials

Benefits:

No long-lived database passwords in application configuration
Each application instance gets unique credentials
Automatic credential rotation without application restart
Full audit trail of credential issuance and revocation
Instant credential revocation on compromise
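
On the application side, the lease cycle above amounts to caching credentials and refreshing them before the TTL runs out. The sketch below assumes a caller-supplied `fetch` function that returns `(username, password, ttl_seconds)`; how that function talks to Vault is not shown, and the 80% refresh point is an illustrative choice.

```python
import time
from typing import Callable, Optional, Tuple

Lease = Tuple[str, str, float]  # (username, password, ttl_seconds)


class LeasedCredentials:
    """Cache dynamic credentials and refresh them before the lease expires."""

    def __init__(self, fetch: Callable[[], Lease],
                 refresh_fraction: float = 0.8,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._fetch = fetch
        self._refresh_fraction = refresh_fraction
        self._clock = clock
        self._creds: Optional[Tuple[str, str]] = None
        self._refresh_at = 0.0

    def get(self) -> Tuple[str, str]:
        now = self._clock()
        if self._creds is None or now >= self._refresh_at:
            username, password, ttl = self._fetch()
            self._creds = (username, password)
            # Refresh early so in-flight connections never hit a revoked user.
            self._refresh_at = now + ttl * self._refresh_fraction
        return self._creds
```

A connection pool would call `get()` whenever it opens a new connection, so fresh credentials are picked up without restarting the service.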

12.6.3 Field-Level Encryption

PII and biometric data in the database uses AES-256-GCM field-level encryption:

Field Category	Example Fields	Encryption
Personal identification	`name_encrypted`, `email_encrypted`, `phone_encrypted`	AES-256-GCM per-field
Employment data	`employee_id_encrypted`, `department_encrypted`	AES-256-GCM per-field
Biometric data	`face_encoding_encrypted` (512-D vector)	AES-256-GCM per-field
Media metadata	`location_encrypted` (GPS coordinates)	AES-256-GCM per-field

Encryption Architecture:

Application receives plaintext data
       │
       ▼
[Encrypt field-by-field using Vault KMS]
       │
       ▼
Store ciphertext in PostgreSQL
       │
       ▼
[Decrypt only in application layer when needed]
       │
       ▼
Decrypted data never logged, never cached

12.7 Audit Logging

12.7.1 Tamper-Resistant Hash-Chain

The audit log implements a cryptographically linked chain to ensure integrity:

Field	Purpose	Example
`event_id`	Unique UUID for each audit event	`550e8400-e29b-41d4-a716-446655440000`
`timestamp`	ISO 8601 timestamp	`2025-01-16T14:32:15Z`
`event_type`	Category of event	`user_login`, `person_viewed`, `alert_acknowledged`
`actor_id`	User who performed the action	`user-uuid-here`
`actor_role`	Role of the actor at the time	`operator`
`resource_type`	Type of resource accessed	`person`, `camera`, `alert`
`resource_id`	Specific resource identifier	`person-123`, `cam-01`
`action`	Action performed	`view`, `edit`, `delete`, `create`
`result`	Success or failure	`success`, `failure`, `denied`
`ip_address`	Source IP address	`10.100.2.15`
`session_id`	Session identifier	`sess-uuid-here`
`previous_hash`	SHA-256 hash of the previous entry	`a3f5c2...`
`entry_hash`	SHA-256 hash of current entry content (including `previous_hash`)	`b7e1d9...`
`signature`	ECDSA signature of the entry hash	`30450221...`

Chain Verification: Any modification to historical entries invalidates all subsequent hashes and signatures, making tampering detectable.
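
The chaining and verification logic can be sketched with `hashlib`. This is an illustration under stated assumptions: events are serialized as canonical JSON, the first entry links to an all-zero genesis hash, and the ECDSA `signature` field is omitted because it needs a third-party library. All names are hypothetical.

```python
import hashlib
import json
from typing import Dict, List

GENESIS_HASH = "0" * 64  # assumed anchor for the first entry


def entry_hash(event: Dict, previous_hash: str) -> str:
    """Hash the canonical event JSON together with the previous entry's hash."""
    canonical = json.dumps(event, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((previous_hash + canonical).encode("utf-8")).hexdigest()


def append_event(chain: List[Dict], event: Dict) -> Dict:
    previous = chain[-1]["entry_hash"] if chain else GENESIS_HASH
    record = {"event": event, "previous_hash": previous,
              "entry_hash": entry_hash(event, previous)}
    chain.append(record)
    return record


def verify_chain(chain: List[Dict]) -> bool:
    """Recompute every hash; any edited or reordered entry breaks the chain."""
    previous = GENESIS_HASH
    for record in chain:
        if record["previous_hash"] != previous:
            return False
        if record["entry_hash"] != entry_hash(record["event"], previous):
            return False
        previous = record["entry_hash"]
    return True
```

Because each hash covers the previous one, changing a single historical event invalidates that entry and every entry after it.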

12.7.2 Log Retention Policy

Log Type	Online Retention	Archive Retention	Storage Type
Authentication events	1 year	6 years	WORM (Write-Once-Read-Many)
Authorization decisions	1 year	6 years	WORM
Person data modifications	1 year	6 years	WORM
Alert actions (ack, resolve)	1 year	3 years	Standard
Configuration changes	2 years	5 years	Standard
Security events	1 year	6 years	WORM
System health events	90 days	1 year	Standard
API access logs	90 days	1 year	Standard

12.7.3 Real-Time Security Alerting

Automated detection rules trigger alerts on suspicious patterns:

Rule ID	Rule Name	Condition	Auto-Response
SEC-001	Brute force login	> 5 failed logins from same IP in 5 minutes	Block IP for 1 hour; alert security team
SEC-002	Credential stuffing	> 10 unique usernames from same IP in 5 minutes	Block IP for 24 hours; alert security team
SEC-003	Impossible travel	Logins > 500 km apart within 1 hour	Force MFA re-verification; alert security team
SEC-004	Privilege escalation	> 20 admin actions in 10 minutes from new user	Alert security team; log for review
SEC-005	Data exfiltration	> 1 GB downloaded by single user in 1 hour	Suspend account; alert security team
SEC-006	Off-hours admin	Admin action between 22:00-06:00	Log + notify security manager
SEC-007	MFA bypass attempt	> 3 MFA failures then success without MFA	Block account; alert security team
SEC-008	Suspicious media access	> 50 media downloads by non-security role	Alert security team
SEC-009	Unknown device login	Login from unrecognized device fingerprint	Require MFA; notify user
SEC-010	Concurrent sessions	> 3 concurrent sessions for same user	Force logout of oldest session
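
Rules such as SEC-001 are sliding-window counters. A minimal in-memory sketch is shown below; the class name is illustrative, and a real deployment would more likely keep the counters in Redis so that all API instances share them.

```python
from collections import defaultdict, deque
from typing import Deque, Dict


class FailedLoginMonitor:
    """Flag an IP once it exceeds `threshold` failures within `window_s` seconds."""

    def __init__(self, threshold: int = 5, window_s: float = 300.0) -> None:
        self.threshold = threshold
        self.window_s = window_s
        self._events: Dict[str, Deque[float]] = defaultdict(deque)

    def record_failure(self, ip: str, now: float) -> bool:
        """Record one failed login; return True when the rule fires."""
        events = self._events[ip]
        events.append(now)
        # Drop failures that have aged out of the window.
        while events and events[0] <= now - self.window_s:
            events.popleft()
        return len(events) > self.threshold
```

With the defaults, the sixth failure from one IP inside five minutes triggers the rule, matching the "> 5 failed logins" condition.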

12.8 Media Access Security

12.8.1 Signed URL Architecture

Media files are never served directly from object storage. All access is mediated through signed URLs:

Parameter	Value	Notes
Default expiration	5 minutes	Short-lived to prevent sharing
Maximum expiration	1 hour	For bulk exports only
URL binding	Tied to user session	Invalidated on logout
Single-use option	Available for sensitive media	Blacklist incident footage
Access logging	Every media request logged	User ID, media ID, timestamp, IP
IP binding	Optional	URL valid only from requesting IP
Watermarking	Optional	Username/timestamp overlay on images

Signed URL Flow:

1. User requests to view media
2. System checks: authentication + authorization + consent
3. If allowed: generate signed URL with HMAC-SHA256 signature
4. URL format: https://cdn.example.com/media/{id}?token={jwt}&sig={hmac}
5. Redirect user to signed URL
6. CDN/Object storage validates signature and expiry
7. Media served if valid; 403 if expired or invalid
8. Access logged with full context
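
Steps 3, 6, and 7 of the flow can be sketched with an HMAC over the media ID, session ID, and expiry. This is a simplification: the URL carries an `expires` parameter instead of the JWT shown in step 4, the message layout is an assumption, and the function names are illustrative.

```python
import hashlib
import hmac
import time
from typing import Optional
from urllib.parse import parse_qs, urlencode, urlparse


def _signature(media_id: str, session_id: str, expires: int, secret: bytes) -> str:
    message = f"{media_id}:{session_id}:{expires}".encode("utf-8")
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


def sign_media_url(base_url: str, media_id: str, session_id: str, secret: bytes,
                   ttl_s: int = 300, now: Optional[float] = None) -> str:
    """Build a URL that is valid for ttl_s seconds (default: 5 minutes)."""
    expires = int((time.time() if now is None else now) + ttl_s)
    query = urlencode({"expires": expires,
                       "sig": _signature(media_id, session_id, expires, secret)})
    return f"{base_url}/media/{media_id}?{query}"


def verify_media_url(url: str, session_id: str, secret: bytes,
                     now: Optional[float] = None) -> bool:
    """Reject expired URLs and URLs signed for another session or media item."""
    parsed = urlparse(url)
    params = parse_qs(parsed.query)
    media_id = parsed.path.rsplit("/", 1)[-1]
    try:
        expires = int(params["expires"][0])
        received = params["sig"][0]
    except (KeyError, IndexError, ValueError):
        return False
    if (time.time() if now is None else now) > expires:
        return False
    expected = _signature(media_id, session_id, expires, secret)
    return hmac.compare_digest(expected.encode(), received.encode())
```

Including the session ID in the signed message is what ties the URL to the requesting session, so a shared link fails verification for anyone else.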

12.8.2 Media Access Controls

Control	Implementation
No direct S3/MinIO URLs	All access via signed URL proxy
Authentication required	Valid JWT session required for all media requests
Authorization enforced	RBAC checks per media item; camera-level permissions respected
Access logging	Every media request logged with user ID, media ID, timestamp, IP, session
DPO notification	Automatic notification for access to sensitive media (blacklist incidents)
Secure deletion	Overwrite with random data + verification before removal
Download tracking	Number of downloads per media item tracked and reported

12.9 API Security

12.9.1 Defense Layers

Layer	Implementation	Details
Rate limiting	Per-endpoint, per-user tiers	Token bucket algorithm; 100 req/min default; 10 req/min for auth endpoints
Input validation	Pydantic models on all endpoints	Strict type checking; reject unknown fields; max length limits
SQL injection prevention	Parameterized queries only	No dynamic SQL construction; ORM for all database access
XSS prevention	Output encoding + CSP headers	User input never rendered as HTML; Content-Security-Policy enforced
CSRF protection	SameSite=Strict cookies + tokens	State-changing operations require CSRF token validation
CORS	Restricted to known origins	No wildcard origins; explicit allowlist per environment
Request size limits	10 MB default; 50 MB for media upload	Prevents DoS via large payloads
Request timeout	30 seconds default	Prevents resource exhaustion
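
The token bucket algorithm named in the rate-limiting row can be sketched as follows. The class is illustrative and single-process; per-user, per-endpoint buckets shared across instances would need a common store such as Redis.

```python
import time
from typing import Callable


class TokenBucket:
    """Allow bursts up to `capacity` while refilling at `rate_per_min`."""

    def __init__(self, rate_per_min: float, capacity: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.rate_per_s = rate_per_min / 60.0
        self.capacity = capacity
        self.tokens = capacity
        self._clock = clock
        self._last = clock()

    def allow(self, cost: float = 1.0) -> bool:
        """Refill based on elapsed time, then spend `cost` tokens if available."""
        now = self._clock()
        elapsed = now - self._last
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate_per_s)
        self._last = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False
```

For the defaults in the table, a general endpoint would use `TokenBucket(rate_per_min=100, capacity=100)` and an auth endpoint `TokenBucket(rate_per_min=10, capacity=10)`; setting capacity equal to the per-minute rate is an assumption.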

12.9.2 Security Headers

Header	Value	Purpose
`Strict-Transport-Security`	`max-age=63072000; includeSubDomains; preload`	Enforce HTTPS for 2 years
`X-Content-Type-Options`	`nosniff`	Prevent MIME-type sniffing
`X-Frame-Options`	`DENY`	Prevent clickjacking
`X-XSS-Protection`	`0`	Disabled — CSP is preferred defense
`Referrer-Policy`	`strict-origin-when-cross-origin`	Minimal referrer information
`Permissions-Policy`	`camera=(), microphone=(), geolocation=()`	Disable browser APIs not needed
`Content-Security-Policy`	`default-src 'self'; script-src 'self' 'nonce-{random}'; style-src 'self' 'unsafe-inline'; img-src 'self' blob: data: https://*.amazonaws.com; media-src 'self' blob: https://*.amazonaws.com; connect-src 'self' wss://*.example.com; frame-ancestors 'none'; base-uri 'self'; form-action 'self';`	Comprehensive CSP
`Cache-Control` (API)	`no-store, no-cache, must-revalidate, proxy-revalidate`	Prevent caching of API responses
`Pragma` (API)	`no-cache`	Legacy cache directive

12.10 Session Security

Parameter	Value	Notes
Cookie flags	`HttpOnly; Secure; SameSite=Strict`	`HttpOnly` blocks script access to cookies; `Secure` restricts them to HTTPS; `SameSite=Strict` mitigates CSRF
Access token storage	Memory only (JavaScript variable)	Never stored in localStorage
Access token max-age	15 minutes	Short-lived
Refresh token storage	`HttpOnly` secure cookie	Cannot be accessed by JavaScript
Refresh token max-age	7 days	Long-lived but revocable
Session absolute timeout	8 hours	Force re-login after 8 hours
Idle timeout	30 minutes	Expire if no activity
Max concurrent sessions	3 per user	Prevents session abuse
Session fixation protection	Regenerate session ID on login	Prevent fixation attacks
Session binding	Browser fingerprint + IP validation	Detect session theft
Force logout capability	Admin can revoke all sessions for any user	Immediate effect via Redis
Session storage	Redis with AUTH enabled	Encrypted at rest
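
The interaction between the absolute and idle timeouts can be sketched as below. The `Session` dataclass and `session_status` function are hypothetical names for illustration; only the 8-hour and 30-minute values come from the table.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

ABSOLUTE_TIMEOUT = timedelta(hours=8)    # force re-login after 8 hours
IDLE_TIMEOUT = timedelta(minutes=30)     # expire if no activity


@dataclass
class Session:
    created_at: datetime         # set once at login
    last_activity_at: datetime   # updated on every authenticated request


def session_status(session: Session, now: datetime) -> str:
    """Return 'active', 'expired_absolute', or 'expired_idle'."""
    # The absolute timeout is checked first: activity cannot extend it.
    if now - session.created_at >= ABSOLUTE_TIMEOUT:
        return "expired_absolute"
    if now - session.last_activity_at >= IDLE_TIMEOUT:
        return "expired_idle"
    return "active"


if __name__ == "__main__":
    login = datetime(2025, 1, 16, 8, 0, tzinfo=timezone.utc)
    s = Session(created_at=login, last_activity_at=login + timedelta(hours=7, minutes=50))
    print(session_status(s, login + timedelta(hours=7, minutes=55)))  # within both limits
    print(session_status(s, login + timedelta(hours=8)))              # absolute limit reached
```

Checking the absolute limit before the idle limit means a continuously active operator is still required to re-authenticate once the 8-hour window closes.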

12.11 Data Privacy (GDPR Compliance)

12.11.1 GDPR Compliance Matrix

GDPR Principle	Implementation Detail	Evidence
Lawful Basis	Legitimate interest assessment documented per processing purpose	LIA document filed with DPO
Data Minimization	Only facial feature embeddings (512-D vector) stored; raw images discarded after encoding	Architecture documentation
Purpose Limitation	Facial data used ONLY for security/safety purposes; no marketing or secondary use	Privacy policy
Storage Limitation	Automated retention enforcement; cryptographic deletion after expiry	Retention policy configuration
Accuracy	Regular review and correction procedures; user can request correction	Data correction workflow
Integrity & Confidentiality	AES-256-GCM encryption, RBAC access controls, audit logging	Security architecture
Accountability	DPO appointed; Privacy Impact Assessment completed; Records of Processing maintained	Compliance documentation
Transparency	Privacy notice displayed at camera entry points; privacy policy on website	Physical signage + web policy

12.11.2 Consent Management

Consent for each person record moves through a four-stage lifecycle:

Stage	Description	Transition Trigger
`pending`	Consent requested but not yet obtained	Initial system setup
`granted`	Explicit consent obtained	User signs consent form
`withdrawn`	Consent actively withdrawn	User requests deletion/stop processing
`deleted`	All data removed; audit trail only	Deletion workflow complete
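
The lifecycle above can be modeled as a small state machine. The sketch below assumes a strictly linear order (`pending` → `granted` → `withdrawn` → `deleted`); the table does not say whether other transitions, such as withdrawing a request that is still pending, are allowed, so the transition map and function names here are illustrative assumptions.

```python
from enum import Enum


class ConsentState(Enum):
    PENDING = "pending"
    GRANTED = "granted"
    WITHDRAWN = "withdrawn"
    DELETED = "deleted"


# Assumption: only the linear path from the table is permitted.
ALLOWED_TRANSITIONS = {
    ConsentState.PENDING: {ConsentState.GRANTED},
    ConsentState.GRANTED: {ConsentState.WITHDRAWN},
    ConsentState.WITHDRAWN: {ConsentState.DELETED},
    ConsentState.DELETED: set(),   # terminal: only the audit trail remains
}


def transition(current: ConsentState, target: ConsentState) -> ConsentState:
    """Return the new state, or raise ValueError for a transition not in the map."""
    if target not in ALLOWED_TRANSITIONS[current]:
        raise ValueError(f"illegal consent transition: {current.value} -> {target.value}")
    return target


def may_process_biometrics(state: ConsentState) -> bool:
    """Face data may be processed only while consent is granted."""
    return state is ConsentState.GRANTED


if __name__ == "__main__":
    state = transition(ConsentState.PENDING, ConsentState.GRANTED)
    print(state.value, may_process_biometrics(state))
```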

Consent Metadata:

Field	Description
Consent method	`written` / `digital` / `verbal`
Consent document reference	ID of signed consent form
Consent date	When consent was obtained
Consent recorder	Who recorded the consent
Consent expiry	Annual expiry date
Consent scope	What processing is consented to

Withdrawal Processing:

User submits withdrawal request (any channel)
System flags person record for deletion
Delete face embeddings (biometric data) within 72 hours
Delete all personal images from storage
Anonymize detection events (keep event, replace name with [REDACTED], remove person link)
Delete related event clips
Log all deletion actions in audit trail
Confirm completion to user within 30 days
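
The anonymization step (keep the event, replace the name, remove the person link) might look like the sketch below. The field names `person_name` and `person_id` are hypothetical; the actual event schema is not given in this section.

```python
from copy import deepcopy

REDACTED = "[REDACTED]"


def anonymize_event(event: dict) -> dict:
    """Return a copy of a detection event with the person identity removed.

    The event itself (camera, timestamp, type) is kept for statistics and audit.
    """
    anonymized = deepcopy(event)           # never mutate the caller's record
    anonymized["person_name"] = REDACTED   # replace the display name
    anonymized["person_id"] = None         # break the link to the person record
    return anonymized


if __name__ == "__main__":
    original = {
        "event_id": "EVT-1",
        "camera": "CAM-01",
        "timestamp": "2025-01-16T14:32:00Z",
        "person_name": "Jane Doe",
        "person_id": 42,
    }
    print(anonymize_event(original))
```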

12.11.3 Privacy Mode Controls

Four privacy modes are available per camera:

Mode	Recording	Face Recognition	Alerts	Live View	Use Case
Full Operation	Yes	Yes	All	Yes	Standard surveillance
Recording Only	Yes	No	Motion only (no face)	Yes	Areas where facial recognition is not needed
Live View Only	No	No	No	Yes	Privacy-sensitive areas; viewing only
Privacy Mode	No	No	No	Privacy overlay	Break rooms, restrooms — privacy completely protected

12.12 Edge Gateway Security

12.12.1 Hardening Checklist

#	Hardening Measure	Implementation
1	Minimal OS	Ubuntu Server 22.04 LTS — no desktop packages
2	Disabled Bluetooth	`systemctl stop bluetooth; systemctl disable bluetooth`
3	Disabled WiFi	`nmcli radio wifi off; modprobe -r iwlwifi`
4	Disabled CUPS	`systemctl stop cups; systemctl disable cups`
5	Disabled avahi/mDNS	`systemctl stop avahi-daemon; systemctl disable avahi-daemon`
6	Disabled snapd	`systemctl stop snapd; systemctl disable snapd`
7	Disabled modemmanager	`systemctl stop ModemManager; systemctl disable ModemManager`
8	SSH key-only	`PasswordAuthentication no; PubkeyAuthentication yes`
9	SSH LAN-only	`ListenAddress 192.168.29.5`
10	SSH root disabled	`PermitRootLogin no`
11	SSH rate limit	`MaxAuthTries 3; ClientAliveInterval 300`
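12	SSH protocol 2	`Protocol 2` (only; OpenSSH 7.4 and later support protocol 2 exclusively, so this is already the default on Ubuntu 22.04)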
13	SSH modern ciphers	`Ciphers chacha20-poly1305@openssh.com`
14	Auto-updates	`unattended-upgrades` — security updates only
15	Update schedule	Daily at 03:00; auto-reboot at 04:00 if required
16	Disk encryption	LUKS + TPM2 auto-unseal
17	Tamper detection	File integrity monitoring (AIDE) for critical config
18	Container security	Non-root users, read-only root FS, no new privileges
19	Firewall	iptables default deny; explicit allow only
20	No internet access	All outbound traffic via VPN tunnel only

12.12.2 LUKS Disk Encryption with TPM2

The edge gateway uses LUKS full-disk encryption with TPM2 auto-unseal for headless operation:

# During setup: encrypt the data partition (prompts for a recovery passphrase)
cryptsetup luksFormat /dev/nvme0n1p2 \
    --type luks2 \
    --cipher aes-xts-plain64 \
    --key-size 512 \
    --pbkdf argon2id

# Enroll a TPM2-sealed key bound to PCR measurements.
# TPM2 enrollment is done with systemd-cryptenroll; cryptsetup has no --tpm2-* options.
systemd-cryptenroll /dev/nvme0n1p2 \
    --tpm2-device=auto \
    --tpm2-pcrs=0+2+7

# During boot: systemd-cryptsetup unseals via TPM2 if the PCRs match.
# Corresponding /etc/crypttab entry:
#   data  /dev/nvme0n1p2  none  tpm2-device=auto

PCR Measurements Bound:

PCR	Purpose
PCR 0	Core system firmware executable code
PCR 2	Extended or pluggable executable code
PCR 7	Secure Boot state

12.13 Cloud Infrastructure Security

Control	Implementation	Verification
Private subnets	All internal services in private subnets; no public IPs	VPC flow logs
Security groups	Least privilege; explicit allow only; no default allow-all	Quarterly review
Database access	No public access; app servers only via security group reference	AWS Config rule
Bastion host	Emergency access only; non-standard SSH port (2222); admin IP allowlist only	Access log audit
IMDSv2	Enforced on all EC2 instances; no IMDSv1 fallback	Instance metadata check
Container security	Non-root users, read-only root FS, no new privileges, drop ALL capabilities	Pod Security admission
Image scanning	Trivy + Snyk on every build; HIGH/CRITICAL vulnerabilities block deployment	CI/CD pipeline gate
Image signing	Cosign signature verification required before deployment	Admission controller
Resource quotas	Kubernetes LimitRange on all namespaces	Resource quota monitoring
Network policies	Default deny all ingress/egress; explicit rules per service	Policy audit
Pod Security	Restricted standard enforced cluster-wide	Pod Security admission
Secrets management	Vault + External Secrets Operator; no secrets in Git	Secret scanning
Logging	All AWS API calls logged via CloudTrail; VPC Flow Logs enabled	Log analysis

12.14 Secrets Rotation Policy

Secret Type	Frequency	Method	Automation	Rollback
Database passwords	90 days	Terraform + Vault dynamic credentials	Full	N/A (short-lived)
JWT signing keys	180 days	Dual-key grace period; new key signs, old key verifies for 7 days	Full	Keep old key for 7 days
Internal API keys	90 days	Zero-downtime: add new key, deploy, remove old key	Full	Immediate via config revert
Telegram/WhatsApp tokens	180 days or on suspicion	Generate new via provider, update Vault, 5-min grace, revoke old	Semi	Old token valid for 5-minute grace
TLS certificates	60 days	cert-manager + Let's Encrypt auto-renewal	Full	Previous certificate cached
WireGuard keys	365 days	Maintenance window: generate new keys, update both endpoints simultaneously	Scripted	Manual key restore
DVR credentials	180 days	Manual via DVR web UI	Manual	Previous password documented
Backup encryption keys	365 days	Generate new key, re-encrypt all backups in background	Full	Previous key kept for 30 days
Session secrets	On security incident	Immediate: generate new secret, force all re-authentication	Admin trigger	Not applicable
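
The dual-key grace period for JWT signing keys can be sketched as a key-ring rule: the newest active key signs, and a retired key remains valid for verification for 7 days. The `SigningKey` type and helper names are assumptions for illustration; no actual cryptographic signing is performed here.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import List, Optional

VERIFY_GRACE = timedelta(days=7)   # old key verifies for 7 days after rotation


@dataclass(frozen=True)
class SigningKey:
    kid: str                               # key ID, carried in the JWT header
    activated_at: datetime
    retired_at: Optional[datetime] = None  # set when a newer key takes over signing


def current_signing_key(keys: List[SigningKey]) -> SigningKey:
    """New tokens are always signed with the newest non-retired key."""
    active = [k for k in keys if k.retired_at is None]
    return max(active, key=lambda k: k.activated_at)


def can_verify(key: SigningKey, now: datetime) -> bool:
    """Active keys always verify; retired keys verify only inside the grace window."""
    if key.retired_at is None:
        return True
    return now < key.retired_at + VERIFY_GRACE


if __name__ == "__main__":
    rotation = datetime(2025, 7, 1, tzinfo=timezone.utc)
    old = SigningKey("2025-01", datetime(2025, 1, 1, tzinfo=timezone.utc), retired_at=rotation)
    new = SigningKey("2025-07", rotation)
    print(current_signing_key([old, new]).kid)
    print(can_verify(old, rotation + timedelta(days=6)), can_verify(old, rotation + timedelta(days=7)))
```

Because access tokens live for 15 minutes, a 7-day verification window is far longer than any token signed by the old key can remain valid, which is what makes the rollback column's "keep old key for 7 days" a safe margin.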

12.15 Incident Response

12.15.1 Security Event Detection and Response

Phase	Timeline	Actions	Responsible
Detection	Automated (real-time)	Automated rules + behavioral analysis detect anomaly; alert generated	System
Assessment	0-15 minutes	On-call engineer evaluates severity; determines if genuine security event	On-call Engineer
Containment	15-60 minutes	Isolate affected systems; revoke compromised credentials; block malicious IPs	Security Team
Eradication	1-4 hours	Remove root cause; patch vulnerabilities; rotate all exposed secrets	Engineering
Recovery	4-24 hours	Restore from clean backups; verify system integrity; re-enable services	Platform Team
Lessons Learned	24-48 hours	Post-mortem; update procedures; implement preventive measures	Security Team

12.15.2 Breach Notification Procedure

Phase	Timeline	GDPR Requirement	Actions
Detection & Assessment	0-24 hours	—	Confirm breach; contain; assemble response team
Investigation	24-72 hours	Article 33(1)	Forensic analysis; determine scope of affected data
Supervisory Authority	Within 72 hours	Article 33	Notify Data Protection Authority
Data Subjects	Without undue delay	Article 34	Notify affected individuals if high risk
Recovery	Post-notification	—	Restore from clean backups; apply patches
Post-Incident	Within 48 hours	Article 5(2)	Root cause analysis; update plans; document

12.15.3 Breach Severity Classification

Level	Criteria	Notification Required	Example
Low	No personal data accessed	Internal only	Failed attack attempt; no data exposure
Medium	Limited personal data; no sensitive data	DPA notification	Username/email list exposed
High	Sensitive personal data or biometric data accessed	DPA + Data subjects	Facial embeddings database accessed
Critical	Large-scale biometric exfiltration; ongoing threat	DPA + Data subjects + Public	Ransomware attack with biometric data theft

12.16 Security Checklist Summary

The complete security checklist contains 130+ items across 14 categories. The following table gives the item count and key requirements for each category:

Category	Items	Key Requirements
SSL/TLS	8	TLS 1.3, strong cipher suites only, HSTS, OCSP stapling, auto-renewal
Authentication	13	Argon2id, JWT ES256, MFA enforcement, password policy, HaveIBeenPwned
RBAC	7	4 roles, 30+ permissions, resource-level access, default deny
VPN & Network	10	WireGuard + PSK, 5 security zones, firewall deny-all, network policies
Secret Management	10	Vault storage, dynamic credentials, field encryption, rotation schedule
Audit Logging	11	Hash-chain integrity, 20+ fields per entry, WORM storage, real-time alerts
Media Access	8	Signed URLs, session-bound, 5-min expiry, single-use option, watermarking
API Security	11	Rate limiting, Pydantic validation, parameterized queries, CSP, CSRF, CORS
Session Security	8	HttpOnly/Secure/Strict cookies, 8h absolute timeout, 30m idle timeout
Data Privacy (GDPR)	13	Consent tracking, right to deletion, anonymization, DPO, PIA
Edge Gateway	12	20-point hardening, LUKS + TPM2, tamper detection, auto-updates
Cloud Infrastructure	11	Private subnets, image scanning, Pod Security, IMDSv2, CloudTrail
Secrets Rotation	7	All types scheduled, 60-day TLS, 90-day DB, dual-key JWT
Incident Response	9	Detection rules, breach notification, severity classification, post-mortem
Total	130+	—

Section 13: UX / Website Structure

13.1 Design System

13.1.1 Design Philosophy

The UX design follows a "dark cockpit" philosophy optimized for 24/7 surveillance operations. The interface minimizes eye strain during long monitoring shifts while ensuring critical information is immediately visible. All design decisions prioritize operator efficiency and rapid threat identification.

Principle	Implementation
Dark mode default	Near-black background with blue-tinted grays to reduce eye strain in low-light environments
Information density	High-density layouts that maximize data visible without scrolling
At-a-glance status	Color-coded status indicators for immediate situational awareness
Progressive disclosure	Advanced controls hidden behind "Expand" toggles; essential info always visible
Consistent patterns	Same interaction patterns reused across all 18 pages
Responsive feedback	Every action produces visible feedback within 100ms

13.1.2 Color Palette

Token	Hex	RGB	Usage	Contrast Ratio
`--bg-primary`	`#0B0E14`	rgb(11, 14, 20)	Main application background	—
`--bg-secondary`	`#151922`	rgb(21, 25, 34)	Card and panel backgrounds	—
`--bg-tertiary`	`#1E2330`	rgb(30, 35, 48)	Elevated surfaces, modals, dropdowns	—
`--bg-sidebar`	`#0D1117`	rgb(13, 17, 23)	Sidebar navigation background	—
`--bg-hover`	`#1A2030`	rgb(26, 32, 48)	Row/card hover state	—
`--bg-selected`	`#1E3A5F`	rgb(30, 58, 95)	Selected item background	—
`--text-primary`	`#E2E8F0`	rgb(226, 232, 240)	Headings, important content	15.8:1
`--text-secondary`	`#94A3B8`	rgb(148, 163, 184)	Labels, descriptions, metadata	9.2:1
`--text-muted`	`#64748B`	rgb(100, 116, 139)	Placeholder text, disabled states	6.1:1
`--accent-blue`	`#3B82F6`	rgb(59, 130, 246)	Primary accent — buttons, links, active states	4.5:1
`--accent-blue-hover`	`#2563EB`	rgb(37, 99, 235)	Button/link hover state	5.1:1
`--accent-green`	`#10B981`	rgb(16, 185, 129)	Success, online status, positive trends	5.3:1
`--accent-red`	`#EF4444`	rgb(239, 68, 68)	Critical alerts, errors, offline status	5.0:1
`--accent-orange`	`#F59E0B`	rgb(245, 158, 11)	Warnings, medium severity	5.4:1
`--accent-yellow`	`#FBBF24`	rgb(251, 191, 36)	Watchlist indicators, highlights	6.1:1
`--accent-purple`	`#8B5CF6`	rgb(139, 92, 246)	AI features, special highlights	4.8:1
`--border-color`	`#1E293B`	rgb(30, 41, 59)	Card borders, dividers, separators	—
`--border-focus`	`#3B82F6`	rgb(59, 130, 246)	Focus ring color	—
`--shadow-sm`	`0 1px 2px rgba(0,0,0,0.3)`	—	Subtle elevation	—
`--shadow-md`	`0 4px 6px rgba(0,0,0,0.4)`	—	Card elevation	—
`--shadow-lg`	`0 10px 25px rgba(0,0,0,0.5)`	—	Modal/dialog elevation	—
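
Contrast ratios such as those in the table can be checked with the WCAG 2.x relative-luminance formula. The sketch below is a verification aid, not part of the design system; the table does not state which background each ratio was measured against, so pairing `--text-primary` with `--bg-primary` in the example is an assumption and other rows may have been computed against a different surface.

```python
from typing import Tuple


def hex_to_rgb(hex_color: str) -> Tuple[int, int, int]:
    """Convert '#RRGGBB' to an (r, g, b) tuple of 0-255 integers."""
    h = hex_color.lstrip("#")
    return int(h[0:2], 16), int(h[2:4], 16), int(h[4:6], 16)


def _linearize(channel_8bit: int) -> float:
    """Undo sRGB gamma encoding for one channel."""
    c = channel_8bit / 255.0
    return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(hex_color: str) -> float:
    r, g, b = hex_to_rgb(hex_color)
    return 0.2126 * _linearize(r) + 0.7152 * _linearize(g) + 0.0722 * _linearize(b)


def contrast_ratio(foreground: str, background: str) -> float:
    """WCAG contrast ratio, from 1.0 (identical) to 21.0 (black on white)."""
    l1, l2 = relative_luminance(foreground), relative_luminance(background)
    lighter, darker = max(l1, l2), min(l1, l2)
    return (lighter + 0.05) / (darker + 0.05)


if __name__ == "__main__":
    # Assumed pairing: --text-primary on --bg-primary.
    print(round(contrast_ratio("#E2E8F0", "#0B0E14"), 1))
    print(hex_to_rgb("#64748B"))
```

WCAG AA requires at least 4.5:1 for normal-size text, so running each text and accent token against every background it is actually used on is a quick way to confirm the palette before it is frozen.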

13.1.3 Typography

Token	Font Family	Size	Weight	Line Height	Letter Spacing	Usage
Display	Inter	28px	700 (Bold)	1.2	-0.02em	Page titles
H1	Inter	22px	600 (Semi-bold)	1.3	-0.01em	Section headings
H2	Inter	18px	600 (Semi-bold)	1.4	0	Card titles, modal headers
H3	Inter	15px	500 (Medium)	1.4	0	Sub-sections, form labels
Body	Inter	14px	400 (Regular)	1.5	0	General text, descriptions
Body Small	Inter	13px	400 (Regular)	1.5	0	Secondary body text
Caption	Inter	12px	400 (Regular)	1.4	0.01em	Captions, metadata, footnotes
Timestamp	JetBrains Mono	12px	400 (Regular)	1.4	0	All timestamps, durations
Code	JetBrains Mono	13px	400 (Regular)	1.5	0	Code snippets, IDs, technical data
Badge	Inter	11px	500 (Medium)	1	0.02em	Status badges, tags

13.1.4 Spacing and Layout

Token	Value	Usage
Sidebar expanded	260px	Full navigation with labels and icons
Sidebar collapsed	72px	Icons only; hover for tooltip
Top bar height	56px	Clock, alerts, user menu
Content padding	24px	Page content horizontal padding
Content max-width	1400px	Maximum content width; content is centered on wider viewports
Card padding	16px	Internal card padding
Card border radius	12px	Card and panel corners
Card gap	16px	Gap between cards in grid
Button border radius	8px	Button corners
Input border radius	6px	Form input corners
Modal border radius	16px	Modal/dialog corners
Toast border radius	8px	Toast notification corners
Avatar size (small)	24px	Inline avatars
Avatar size (medium)	40px	Card headers, lists
Avatar size (large)	64px	Profile pages
Icon size (default)	20px	Navigation and actions
Icon size (small)	16px	Inline icons
Scrollbar width	8px	Custom styled scrollbar

13.2 Global Navigation Structure

13.2.1 Layout Architecture

┌──────────────────────────────────────────────────────────────────────────────┐
│ [Logo]  Sentinel AI Surveillance              [Clock] [Alerts] [👤 User] │  ▲ 56px
├────────┬───────────────────────────────────────────────────────────────────┤
│        │                                                                    │
│  [📊]  │                    MAIN CONTENT AREA                              │
│  Dash  │                                                                    │
│  board │    ┌──────────────┐  ┌──────────────┐  ┌──────────────┐         │
│        │    │   Card 1     │  │   Card 2     │  │   Card 3     │         │
│  [📹]  │    │              │  │              │  │              │         │
│  Live  │    └──────────────┘  └──────────────┘  └──────────────┘         │
│        │                                                                    │
│  [🔔]  │    ┌──────────────────────────────────────────────────┐         │
│ Alerts │    │              Wide Card / Table                   │         │
│        │    └──────────────────────────────────────────────────┘         │
│  [🔍]  │                                                                    │
│ Detec  │                                                                    │
│ tions  │                                                                    │
│        │                                                                    │
│ [remaining navigation items...]                                            │
│        │                                                                    │
├────────┤                                                                    │
│◁ / ▷  │                                                                    │
└────────┴───────────────────────────────────────────────────────────────────┘
  ◄── 260px (expanded) / 72px (collapsed) ──►

13.2.2 Navigation Menu Items

#	Icon	Label	Route	Badge Type	Required Permission
1	`LayoutDashboard`	Dashboard	`/dashboard`	None	Any
2	`Video`	Live View	`/live`	Online camera count	`cameras:view`
3	`Bell`	Alert Center	`/alerts`	Pending alert count	`alerts:view`
4	`ScanEye`	Detections	`/detections`	None	`cameras:view`
5	`Users`	Person Gallery	`/persons`	Total person count	`persons:view`
6	`UserQuestion`	Unknown Review	`/unknowns`	Queue count	`persons:view`
7	`ClockAlert`	Suspicious Activity	`/timeline`	None	`alerts:view`
8	`Search`	Search	`/search`	None	Any
9	`ShieldAlert`	Watchlists	`/watchlists`	None	`watchlists:view`
10	`Sparkles`	AI Vibe Settings	`/settings/ai`	None	`ai_settings:view`
11	`Brain`	Training Review	`/training`	Pending suggestions	`ai_settings:view`
12	`Activity`	System Health	`/health`	Status dot (green/yellow/red)	`system:view`
13	`Settings`	Settings	`/settings`	None	Admin functions

Settings Submenu:

#	Icon	Label	Route	Required Permission
13a	`Camera`	Camera Management	`/settings/cameras`	`cameras:manage`
13b	`HardDrive`	Retention & Storage	`/settings/storage`	`storage:manage`
13c	`UserCog`	Admin Users	`/settings/users`	`users:manage`
13d	`BellRing`	Notification Settings	`/settings/notifications`	`notifications:manage`

13.2.3 Top Bar

Element	Position	Content	Update Frequency
Logo + Brand	Left	Sentinel AI logo + text	Static
Current Time	Center-Right	`HH:MM:SS` live clock	Every second
Alert Badge	Right	Bell icon with red count badge	On alert change
User Menu	Far right	Avatar + dropdown menu	Static

User Menu Dropdown:

Item	Action
Profile	Navigate to user profile
Preferences	Theme, timezone, notification preferences
Keyboard Shortcuts	Show shortcut reference modal
Help & Documentation	Open help center
Logout	End session (clears all tokens)

13.3 Page Descriptions

13.3.1 Page 1: Login (`/login`)

The login page is the entry point to the system. It is designed for quick, secure access with minimal friction.

Feature	Specification
Layout	Centered card on dark background
Logo	Sentinel AI logo (large) centered above form
Fields	Username/email (text input), Password (password input with show/hide toggle)
Remember me	Checkbox — "Keep me signed in for 7 days"
Submit	"Sign In" button — full width, accent blue
MFA step	Appears after successful password; 6-digit TOTP input with auto-focus
Error states	Inline validation; shake animation on error
Footer	"v2.3.1" version number, copyright, privacy policy link
Security	Rate limiting (5 attempts / 15 min), CAPTCHA after 3 failures
Redirect	After login, redirect to originally requested URL (or Dashboard)
Session	JWT access token (15 min) + refresh token cookie (7 days)

13.3.2 Page 2: Dashboard (`/dashboard`)

The Dashboard is the primary landing page providing at-a-glance situational awareness.

┌──────────────────────────────────────────────────────────────────────────────┐
│  Dashboard                                          [Refresh] [Date Range] │
├──────────────────────────────────────────────────────────────────────────────┤
│                                                                              │
│  ┌────────────────┐ ┌────────────────┐ ┌────────────────┐ ┌──────────────┐ │
│  │  📹 8/8       │ │  🔔 12        │ │  👥 47        │ │  ✓ Healthy  │ │
│  │  Cameras      │ │  Alerts Today │ │  Persons      │ │  System     │ │
│  │  Online       │ │  3 Critical   │ │  Detected     │ │  All Good   │ │
│  └────────────────┘ └────────────────┘ └────────────────┘ └──────────────┘ │
│                                                                              │
│  ┌────────────────────────────────────────┐ ┌──────────────────────────┐   │
│  │  Alert Distribution (Last 24 Hours)   │ │  Recent Alerts           │   │
│  │                                        │ │                          │   │
│  │  8 ┤          ██                       │ │  🔴 CAM-01  Unknown     │   │
│  │  6 ┤    ██    ██  ██                   │ │     14:32 — Entrance    │   │
│  │  4 ┤    ██ ██ ██  ██ ██                │ │  🟡 CAM-03  Watchlist   │   │
│  │  2 ┤ ██ ██ ██ ██  ██ ██ ██             │ │     13:15 — Parking     │   │
│  │  0 ┼────┬────┬────┬────┬────┬────┬──  │ │  🟠 CAM-05  System      │   │
│  │     00  04  08  12  16  20           │ │     12:08 — Storage 90% │   │
│  │                                        │ │                          │   │
│  └────────────────────────────────────────┘ └──────────────────────────┘   │
│                                                                              │
│  ┌──────────────────────────────────────────────────────────────────┐     │
│  │  Camera Status Grid (2x4)                                        │     │
│  │                                                                    │     │
│  │  ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐           │     │
│  │  │ CAM-01  │ │ CAM-02  │ │ CAM-03  │ │ CAM-04  │           │     │
│  │  │ [LIVE]  │ │ [LIVE]  │ │ [LIVE]  │ │ [LIVE]  │           │     │
│  │  │ ● Online│ │ ● Online│ │ ● Online│ │ ● Online│           │     │
│  │  └──────────┘ └──────────┘ └──────────┘ └──────────┘           │     │
│  │  ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐           │     │
│  │  │ CAM-05  │ │ CAM-06  │ │ CAM-07  │ │ CAM-08  │           │     │
│  │  │ [LIVE]  │ │ [LIVE]  │ │ [LIVE]  │ │ [LIVE]  │           │     │
│  │  │ ● Online│ │ ● Online│ │ ● Online│ │ ● Online│           │     │
│  │  └──────────┘ └──────────┘ └──────────┘ └──────────┘           │     │
│  └──────────────────────────────────────────────────────────────────┘     │
│                                                                              │
│  ┌──────────────────────────────────────────────────────────────────┐     │
│  │  Activity Feed                                                    │     │
│  │  14:32 — Unknown person detected at CAM-01 (Entrance)            │     │
│  │  14:15 — Watchlist match: John Smith at CAM-03 (Parking)         │     │
│  │  13:58 — Operator Alice acknowledged alert #ALT-2847             │     │
│  │  13:42 — Camera CAM-05 stream reconnected                        │     │
│  │  13:30 — Daily training completed: 3 new face clusters          │     │
│  └──────────────────────────────────────────────────────────────────┘     │
└──────────────────────────────────────────────────────────────────────────────┘

Dashboard Components:

Component	Refresh Rate	Description
Stat cards	30 seconds	Active cameras, alerts today, persons detected, system health
Alert distribution chart	5 minutes	Bar chart showing alerts by hour for last 24 hours
Recent alerts card	30 seconds	Last 5 alerts with severity badge, camera, timestamp
Camera status grid	30 seconds	2x4 grid of all 8 cameras with live thumbnail and status dot
Activity feed	Real-time (WebSocket)	Recent system events — detections, alerts, operator actions

13.3.3 Page 3: Live Camera View (`/live`)

The live view is the primary monitoring interface, showing real-time streams from all 8 cameras.

Feature	Specification
Default layout	2x4 grid (8 cameras)
Layout options	1x1 (single), 2x2 (4 cameras), 2x4 (8 cameras), 4x4 (16 cameras for future scaling)
Stream format	HLS (HTTP Live Streaming) by default, with WebRTC as the lower-latency alternative
Per-camera overlay	Camera name, status dot, expand button, snapshot button
Grid controls	Play all / Pause all, Refresh all streams, Layout selector
Camera states	Loading (spinner), Playing, Paused, Error (retry button), Offline (gray placeholder)
Fullscreen	Click any camera to expand; press `F` to toggle fullscreen for focused camera
Camera switching	Press `1`-`8` to focus camera by number
Snapshot	Press `S` or click camera snapshot button to capture current frame
Recording indicator	Red pulsing dot on cameras actively recording
Alert overlay	Flashing border on camera that triggered recent alert

13.3.4 Page 4: Alert Center (`/alerts`)

The Alert Center provides comprehensive alert management with filtering, batch actions, and detailed investigation tools.

Feature	Specification
Filter bar	Date range picker, severity multi-select (Critical/High/Medium/Low/Info), camera multi-select, status filter (Pending/Acknowledged/Resolved/Ignored), type filter
Severity legend	Color-coded badges: Critical (red), High (orange), Medium (yellow), Low (blue), Info (gray)
Alert cards	Each card: thumbnail image, camera name, timestamp, severity badge, person name (if known), description, current status
Card actions	Acknowledge, Resolve, Ignore, View Details, Mark False Positive
Bulk actions	Checkbox selection; batch Acknowledge or Ignore
Sort options	Newest first (default), Oldest first, Severity (highest first), Camera name
Pagination	20 alerts per page; infinite scroll option
Empty state	"No alerts in the selected period" with illustration
Detail panel	Slide-out panel with full alert info: images, video clip, AI confidence, detection metadata, person profile link

13.3.5 Page 5: Recent Detections (`/detections`)

Shows all recent detection events with face thumbnails and recognition results.

Feature	Specification
Filter controls	Known/Unknown/All toggle, date range picker, camera selector, person name search
Detection cards	Face thumbnail + name (or "Unknown") + confidence percentage + camera name + timestamp + watchlist badge
Card click	Opens detail view with full-size image, sighting history for that person, camera info
Actions	"Name This Person" (unknowns), "View Profile" (known), "Add to Watchlist"
Confidence indicator	Visual bar showing confidence level; color-coded (green > 90%, yellow 70-90%, orange < 70%)
Grid layout	4 columns desktop, 3 tablet, 2 mobile
Auto-refresh	New detections appear at top without page reload (WebSocket)
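
The confidence color bands leave the boundary values slightly ambiguous. One reading, taking "green > 90%" literally and treating exactly 70% and 90% as yellow, is sketched below; the function name is hypothetical.

```python
def confidence_color(confidence_pct: float) -> str:
    """Map a recognition confidence (0-100) to the indicator color."""
    if confidence_pct > 90:
        return "green"
    if confidence_pct >= 70:    # 70 through 90 inclusive
        return "yellow"
    return "orange"             # below 70


if __name__ == "__main__":
    for value in (95.0, 90.0, 70.0, 69.9):
        print(value, confidence_color(value))
```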

13.3.6 Page 6: Person Gallery (`/persons`)

A browsable gallery of all known persons in the system.

Feature	Specification
Search bar	Full-text search across names, roles, departments, tags
Role filters	Employee / Visitor / Vendor / Contractor / Other — pill-style toggle buttons
Sort options	Name (A-Z), Last Seen (recent first), Sightings Count (highest first), Date Added (newest first)
Person cards	Face image, name, role badge, department, last seen timestamp, total sightings count
Grid layout	5 columns desktop (xl), 4 columns (lg), 3 columns (md), 2 columns (sm)
Pagination	50 persons per page
Actions	Click card → navigate to Person Profile; right-click context menu
Bulk actions	Select multiple for bulk add to watchlist
Empty state	"No persons found" with "Add your first person" CTA

13.3.7 Page 7: Unknown Persons Review (`/unknowns`)

The review queue for unidentified persons, a critical workflow for building the person database.

Feature	Specification
Queue view	Cards of unknown person clusters (grouped by face similarity via DBSCAN)
Cluster card	Representative face image + cluster size (number of sightings) + first/last seen + cameras detected at + confidence range
Actions per cluster	Name This Person, Merge with Existing, Ignore Cluster, Mark as Reviewed
AI insight panel	Pattern suggestion: "Seen 5x at entrance between 08:00-09:00 — possibly employee"
Progress indicator	"23 unknown clusters remaining" with progress bar
Batch review	Keyboard navigation (arrow keys + Enter to select action) for rapid review
Empty state	"Great job! No unknown persons to review. All caught up!" with celebration animation
Reviewed history	Tab to view previously reviewed clusters

13.3.8 Page 8: Person Profile (`/persons/{id}`)

Detailed view of a single person's information, detection history, and management options.

Feature	Specification
Header	Name, role badge, status (Active/Inactive), action buttons (Edit, Delete, Add to Watchlist)
Photo gallery	Primary face photo (large) + additional reference photos in thumbnail grid below
Info panel	Department, employee ID, contact information, notes, tags, date added, added by
Sighting history	Timeline of all detections — timestamp, camera name, confidence, thumbnail image
Sighting stats	Total sightings, first seen, last seen, most common camera, most common time
Watchlist memberships	Which watchlists this person belongs to, with badge per watchlist
Activity log	Who created/edited the profile and when; full audit trail
Danger zone	Delete person (with confirmation dialog explaining consequences)

13.3.9 Page 9: Suspicious Activity Timeline (`/timeline`)

A timeline-based visualization of flagged events for pattern analysis.

Feature	Specification
Timeline view	Horizontal time axis with event markers positioned by timestamp
Event types	Unusual movement (orange), Loitering (yellow), Unauthorized access (red), Crowd gathering (purple)
Color coding	Each event type has a distinct color; severity affects marker size
Filters	Event type multi-select, camera selector, date range, severity threshold
Zoom levels	Hour view, Day view (default), Week view, Month view
Click marker	Opens detail panel with description, evidence images, AI reasoning, confidence
Density heatmap	Background shows detection density to identify high-activity periods

13.3.10 Page 10: Search (`/search`)

Global search across all data types in the system.

Feature	Specification
Search bar	Prominent centered search input with clear button
Category filters	Person, Camera, Event, Alert — toggle pills
Results grouping	Results grouped by category with section headers
Person search	Type name or upload a photo for face recognition similarity search
Camera search	By name, location, or status
Event search	By description, camera, person, or event type
Alert search	By ID, description, or camera
Keyboard shortcut	`/` (forward slash) focuses search from any page
Recent searches	Dropdown shows recent searches for quick access
Empty state	"No results found" with search tips

13.3.11 Page 11: Watchlists (`/watchlists`)

Management interface for watchlist categories and their members.

Feature	Specification
Watchlist cards	Name, icon (selected from preset), color, member count, alert settings summary
Create button	"+ New Watchlist" with modal: name, icon picker, color picker, alert configuration
Default watchlists	VIP (green), Blacklist (red), Authorized (blue), Temporary Access (yellow)
Card click	Opens watchlist detail with full member list
Member management	Add from gallery (search + select), remove member, bulk import via CSV
Alert settings	Per-watchlist: alert timing, severity override, notify groups, quiet hours override
Test button	"Test Alert" — sends test notification for this watchlist to verify configuration
Member table	Sortable by name, date added, added by, sightings count

13.3.12 Page 12: AI Vibe Settings (`/settings/ai`)

The AI Vibe Settings page presents AI configuration as friendly questions rather than technical parameters.

#	Setting	Question	Options	Description
1	Detection Sensitivity	"How carefully should the AI watch?"	Relaxed / Balanced / High / Maximum	Controls how aggressively the AI reports detections
2	Face Match Threshold	"How confident should the AI be before naming someone?"	Lenient / Normal / Strict / Very Strict	A lower (more lenient) threshold yields more matches but also more false positives
3	Night Mode	"How should the AI behave at night?"	Off / Diminished / Active / Enhanced	Night-specific model and sensitivity adjustment
4	Evidence Capture	"What should be saved when someone is detected?"	Photo Only / Photo + 5s Clip / Photo + 10s Clip / Full Recording	Media stored per detection event
5	Alert Style	"When should alerts be sent?"	Silent / Digest / Normal / Urgent / Critical	Controls alert frequency and channels used
6	Learning Mode	"Should the AI learn from new sightings?"	Off / Review First / Auto-Learn Cautiously / Auto-Learn Aggressively	How unknown face clusters are handled
7	Privacy Mode	"How should privacy be handled?"	Full Recognition / Blur Unrecognized / Blur All Faces / Privacy Zones	Face processing and display privacy

Each setting control:

Segmented button group (pill-shaped options)
Selected option highlighted in accent blue
Brief description below updates on selection
Current value displayed as badge
Auto-save (no save button); toast confirms: "Detection Sensitivity updated to High"
Expand toggle reveals internal numerical values (Admin permission required)

Advanced Mode (Admin only): When expanded, each control shows the internal parameter values:

Setting	Option	Internal Value
Detection Sensitivity	Relaxed	Confidence threshold: 0.85, NMS: 0.5
Detection Sensitivity	Balanced	Confidence threshold: 0.70, NMS: 0.45
Detection Sensitivity	High	Confidence threshold: 0.55, NMS: 0.4
Detection Sensitivity	Maximum	Confidence threshold: 0.40, NMS: 0.35
Face Match Threshold	Lenient	Similarity threshold: 0.60
Face Match Threshold	Normal	Similarity threshold: 0.70
Face Match Threshold	Strict	Similarity threshold: 0.80
Face Match Threshold	Very Strict	Similarity threshold: 0.90
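
The option-to-parameter mapping in the table above could be held in a typed lookup. This is a minimal sketch: the numeric values are copied from the table, while the names (`DETECTION_PRESETS`, `FACE_MATCH_PRESETS`, `isFaceMatch`) and the inclusive `>=` comparison are assumptions for illustration.

```typescript
type DetectionSensitivity = "Relaxed" | "Balanced" | "High" | "Maximum";
type FaceMatchThreshold = "Lenient" | "Normal" | "Strict" | "Very Strict";

interface DetectionParams {
  confidenceThreshold: number; // minimum detector confidence to report a detection
  nmsThreshold: number; // non-maximum suppression threshold
}

// Values from the Advanced Mode table.
const DETECTION_PRESETS: Record<DetectionSensitivity, DetectionParams> = {
  Relaxed: { confidenceThreshold: 0.85, nmsThreshold: 0.5 },
  Balanced: { confidenceThreshold: 0.7, nmsThreshold: 0.45 },
  High: { confidenceThreshold: 0.55, nmsThreshold: 0.4 },
  Maximum: { confidenceThreshold: 0.4, nmsThreshold: 0.35 },
};

const FACE_MATCH_PRESETS: Record<FaceMatchThreshold, number> = {
  Lenient: 0.6,
  Normal: 0.7,
  Strict: 0.8,
  "Very Strict": 0.9,
};

// Hypothetical helper: true when a similarity score is high enough to name
// the person under the selected option (inclusive comparison is assumed).
function isFaceMatch(similarity: number, option: FaceMatchThreshold): boolean {
  return similarity >= FACE_MATCH_PRESETS[option];
}
```

For example, a similarity of 0.75 would count as a match under "Normal" but not under "Strict".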

13.3.13 Page 13: Training Review (`/training`)

Interface for reviewing AI-suggested face clusters and approving them for model training.

Feature	Specification
Suggestion cards	Face cluster the AI is uncertain about — multiple face images + AI confidence + reason for suggestion
Card layout	Grid of face thumbnails + confidence bar + suggestion reason ("Seen 8x at different cameras, high confidence match")
Actions per suggestion	Approve (add to training data), Reject (not a valid cluster), Merge with Existing Person
Batch actions	Select multiple suggestions for bulk Approve/Reject
Queue status	"12 suggestions pending review" with progress bar
Filter	By confidence level, camera, date range
History	Tab showing previously reviewed suggestions with outcome
Training metrics	Model accuracy trend, training data count, last training time

13.3.14 Page 14: System Health (`/health`)

Real-time system health monitoring dashboard.

Feature	Specification
Status overview	Large status indicator: All Systems Operational (green) / Degraded (yellow) / Critical (red)
Service cards	Per-service status card: Video Capture, AI Inference, Database, Storage, Notifications, VPN
Per-service metrics	Status dot, uptime percentage, last restart, CPU, memory
Camera health table	All 8 cameras: stream status, FPS, bitrate, last seen, error count
System metrics	CPU usage (%), memory usage (%), disk usage (%), network I/O
Logs viewer	Recent system logs with severity filtering (DEBUG/INFO/WARNING/ERROR/CRITICAL); `tail -f`-style auto-scroll
Refresh	Auto-refresh every 30 seconds; manual refresh button
Historical view	Toggle to show metrics history (last 1h, 6h, 24h, 7d)
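
One way to derive the large status indicator from the per-service cards is a worst-status roll-up. The rule below is an assumption (this section does not define how the overview is computed), and `ServiceHealth`, `overallHealth`, and `HEALTH_REFRESH_MS` are hypothetical names.

```typescript
type ServiceHealth = "operational" | "degraded" | "critical";

// Labels and colors from the "Status overview" row.
const OVERVIEW_LABEL: Record<ServiceHealth, { text: string; color: string }> = {
  operational: { text: "All Systems Operational", color: "green" },
  degraded: { text: "Degraded", color: "yellow" },
  critical: { text: "Critical", color: "red" },
};

// Auto-refresh interval from the "Refresh" row (30 seconds).
const HEALTH_REFRESH_MS = 30000;

// Assumed roll-up rule: the overview shows the worst status of any service.
// An empty list is treated as operational.
function overallHealth(services: readonly ServiceHealth[]): ServiceHealth {
  if (services.some((s) => s === "critical")) return "critical";
  if (services.some((s) => s === "degraded")) return "degraded";
  return "operational";
}
```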

13.3.15 Page 15: Notifications Settings (`/settings/notifications`)

Configuration interface for the notification system.

Feature	Specification
Recipient groups	Add/edit/delete groups; each group has name, Telegram chat IDs, WhatsApp numbers, alert preferences
Routing rules	Visual rule builder with drag-and-drop condition blocks (camera, person, role, event_type, zone, time, day, severity, watchlist)
Quiet hours	Schedule builder with day-of-week checkboxes, time range pickers, timezone selector
Template editor	Edit message templates per alert type; live preview with sample data; variable reference panel
Delivery status	Real-time view showing notification delivery states (pending/sent/delivered/failed)
Test buttons	"Send Test Alert" per channel to verify configuration
DLQ viewer	Dead letter queue entries with retry/discard actions
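
The condition blocks assembled in the rule builder could serialize to a structure like the one below. This is a sketch under stated assumptions: the field names come from the "Routing rules" row, but the rule shape, the AND semantics across condition blocks, and all identifiers (`RoutingRule`, `ruleMatches`, `groupsToNotify`) are hypothetical and may differ from the routing logic defined in Section 11.

```typescript
// Condition fields listed in the "Routing rules" row.
type RuleField =
  | "camera" | "person" | "role" | "event_type" | "zone"
  | "time" | "day" | "severity" | "watchlist";

// Hypothetical condition block: the field must equal one of the listed values.
interface RuleCondition {
  field: RuleField;
  anyOf: string[];
}

interface RoutingRule {
  ruleName: string;
  conditions: RuleCondition[];
  notifyGroups: string[];
}

// An alert is described here as a partial field-to-value map (assumption).
type AlertFacts = Partial<Record<RuleField, string>>;

// Assumed semantics: a rule matches when every condition block matches (AND).
// A rule with no conditions therefore matches every alert.
function ruleMatches(rule: RoutingRule, facts: AlertFacts): boolean {
  return rule.conditions.every((c) => {
    const value = facts[c.field];
    return value !== undefined && c.anyOf.indexOf(value) !== -1;
  });
}

// Collects the distinct recipient groups of all matching rules, in rule order.
function groupsToNotify(rules: readonly RoutingRule[], facts: AlertFacts): string[] {
  const groups: string[] = [];
  for (const rule of rules) {
    if (!ruleMatches(rule, facts)) continue;
    for (const g of rule.notifyGroups) {
      if (groups.indexOf(g) === -1) groups.push(g);
    }
  }
  return groups;
}
```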

13.3.16 Page 16: Admin Users (`/settings/users`)

User management interface for administrators.

Feature	Specification
Users table	Username, email, role badge, status (Active/Inactive), last login, MFA status, actions menu
Add user	Modal: username, email, role selector, password (or send invite link), MFA toggle
Edit user	Role, status, force password change on next login, reset MFA, session revocation
User activity log	Login history (timestamp, IP, device), actions taken, settings changed
Bulk actions	Deactivate multiple accounts simultaneously
Filter	By role, status, last login date range
Sort	By username, role, last login, created date
Pagination	25 users per page
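
The 25-per-page rule for the users table can be expressed as a small helper. This sketch assumes client-side pagination over an already filtered and sorted list; `paginate` and `PageResult` are hypothetical names, and a server-side implementation would apply the same arithmetic to offsets.

```typescript
// Page size from the "Pagination" row.
const USERS_PER_PAGE = 25;

interface PageResult<T> {
  page: number; // 1-based page actually returned (after clamping)
  totalPages: number; // at least 1, even for an empty list
  items: T[];
}

// Returns one page of items; out-of-range page numbers are clamped.
function paginate<T>(
  items: readonly T[],
  page: number,
  pageSize: number = USERS_PER_PAGE,
): PageResult<T> {
  const totalPages = Math.max(1, Math.ceil(items.length / pageSize));
  const current = Math.min(Math.max(1, Math.floor(page)), totalPages);
  const start = (current - 1) * pageSize;
  return { page: current, totalPages, items: items.slice(start, start + pageSize) };
}
```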

13.3.17 Page 17: Camera Management (`/settings/cameras`)

Configuration interface for camera setup and zone management.

Feature	Specification
Camera cards	Name, status (Online/Offline/Disabled), IP/connection string, stream info (resolution, FPS), action buttons (Edit, Test, Disable)
Add camera	Modal: name, location, stream URL, credentials, channel number, description
Edit camera	All camera properties; test connection button
Zone configuration	Interactive polygon drawing on live camera feed; zone name, color, sensitivity, type (Entrance/Restricted/Detection/Ignore)
Stream settings	Resolution (720p/1080p), frame rate (5/10/15/25/30 FPS), codec (H.264/H.265), night mode toggle
Recording settings	Continuous/event-triggered, retention policy, storage location
Camera ordering	Drag to reorder cameras in grid layout

13.3.18 Page 18: Retention & Storage (`/settings/storage`)

Storage management and retention policy configuration.

Feature	Specification
Storage overview	Donut chart showing usage breakdown: Video recordings, Detection snapshots, Training data, System logs, Free space
Numerical values	Total capacity / Used / Free; warning at > 80% (yellow), critical at > 95% (red)
Retention policies	Dropdown per category: 7 days / 14 days / 30 days / 60 days / 90 days / 180 days / 365 days / Forever
Auto-cleanup	Enable toggle + schedule time picker (daily at 03:00 default)
Actions	"Save Settings", "Run Cleanup Now" (with confirmation), "Export Storage Report"
Growth projection	Estimated days until full based on current growth rate
Storage alerts	Configure alert thresholds (80% warning, 90% high, 95% critical)
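
The alert thresholds and the growth projection reduce to simple arithmetic. The sketch below uses the default thresholds from the table (80% warning, 90% high, 95% critical) with strict "greater than" comparisons as in the "Numerical values" row; the function names and the linear growth model are assumptions.

```typescript
type StorageLevel = "ok" | "warning" | "high" | "critical";

interface StorageThresholds {
  warning: number; // percent used
  high: number;
  critical: number;
}

// Defaults from the "Storage alerts" row.
const DEFAULT_STORAGE_THRESHOLDS: StorageThresholds = { warning: 80, high: 90, critical: 95 };

// Classifies disk usage; a level applies when usage is strictly above its threshold.
function classifyUsage(
  usedBytes: number,
  totalBytes: number,
  t: StorageThresholds = DEFAULT_STORAGE_THRESHOLDS,
): StorageLevel {
  if (totalBytes <= 0) throw new RangeError("totalBytes must be positive");
  const percentUsed = (usedBytes * 100) / totalBytes;
  if (percentUsed > t.critical) return "critical";
  if (percentUsed > t.high) return "high";
  if (percentUsed > t.warning) return "warning";
  return "ok";
}

// Estimated days until the disk is full, assuming constant (linear) growth.
// Returns null when usage is flat or shrinking, since no fill date exists.
function daysUntilFull(
  usedBytes: number,
  totalBytes: number,
  growthBytesPerDay: number,
): number | null {
  if (growthBytesPerDay <= 0) return null;
  return Math.max(0, (totalBytes - usedBytes) / growthBytesPerDay);
}
```

For instance, a volume with 600 of 1000 units used and growing by 20 units per day would be projected to fill in 20 days.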

13.4 Key User Flows

13.4.1 Flow 1: Daily Operator — Monitor & Respond

┌──────────────────────────────────────────────────────────────────────────────┐
│                  FLOW 1: DAILY OPERATOR (Monitor & Respond)                   │
├──────────────────────────────────────────────────────────────────────────────┤
│                                                                              │
│  STEP 1: LOGIN                                                              │
│  ──────────────                                                              │
│  Enter username → Enter password → MFA code (if enabled)                    │
│  → Redirect to Dashboard                                                     │
│                                                                              │
│  STEP 2: DASHBOARD REVIEW (~30 seconds)                                     │
│  ─────────────────────────────────────                                       │
│  Glance at stat cards:                                                       │
│    ├─ All 8 cameras online? ✓                                               │
│    ├─ Any critical alerts pending? (red badge)                               │
│    ├─ Any unknown persons detected?                                          │
│    └─ System health OK?                                                      │
│                                                                              │
│  If critical alert visible:                                                  │
│    → Click alert card → Go to Alert Center                                  │
│  If no urgent alerts:                                                        │
│    → Click "Live View" in sidebar                                           │
│                                                                              │
│  STEP 3: LIVE CAMERA MONITORING (ongoing)                                   │
│  ─────────────────────────────────────                                       │
│  View 2x4 grid of all cameras                                               │
│  Observe feeds for anomalies                                                │
│                                                                              │
│  When alert toast appears (top-right):                                       │
│    → Toast slides in with sound notification                                 │
│    → Click toast to view alert details                                       │
│                                                                              │
│  STEP 4: ALERT RESPONSE                                                     │
│  ──────────────────                                                          │
│  Click alert toast OR navigate to Alert Center                               │
│  Review alert card:                                                          │
│    ├─ Thumbnail image                                                        │
│    ├─ Camera name, timestamp                                                 │
│    ├─ Alert type (unknown person, watchlist match, etc.)                     │
│    └─ Severity level                                                         │
│                                                                              │
│  Click "View Details" for full information:                                  │
│    ├─ Full-size image / video clip                                           │
│    ├─ AI confidence score                                                    │
│    ├─ Detection metadata (bounding box, zone)                                │
│    └─ Person profile link (if known)                                         │
│                                                                              │
│  DECISION:                                                                   │
│    ├─ False detection → Click "Mark as False Positive"                       │
│    ├─ Legitimate alert → Click "Acknowledge" or "Resolve"                    │
│    ├─ Unknown person → Click "Name This Person"                              │
│    ├─ Needs escalation → Click "Escalate"                                    │
│    └─ Need live view → Click "View Live" to jump to camera                   │
│                                                                              │
│  STEP 5: RETURN TO MONITORING                                               │
│  ────────────────────────────                                                │
│  After handling alert, return to Live View                                   │
│  Continue monitoring cycle                                                   │
│                                                                              │
│  STEP 6: END OF SHIFT                                                       │
│  ──────────────────                                                          │
│  Review unacknowledged alerts (if any)                                       │
│  Check System Health page                                                    │
│  Hand over to next operator (verbal + note any pending issues)               │
│  Click user menu → Logout                                                    │
└──────────────────────────────────────────────────────────────────────────────┘

13.4.2 Flow 2: New Person Onboarding

┌──────────────────────────────────────────────────────────────────────────────┐
│                    FLOW 2: NEW PERSON ONBOARDING                              │
├──────────────────────────────────────────────────────────────────────────────┤
│                                                                              │
│  TRIGGER: System detects unknown person → Alert created → Operator notified │
│                                                                              │
│  STEP 1: REVIEW DETECTION                                                    │
│  ────────────────────                                                        │
│  Navigate to "Recent Detections" via sidebar                                 │
│  Filter: "Unknown" (toggle button)                                           │
│  Click on unknown detection card                                             │
│                                                                              │
│  Detail view shows:                                                          │
│    ├─ Full-size face image                                                   │
│    ├─ Camera: CAM-01 (Entrance)                                              │
│    ├─ Timestamp: 2025-01-16 14:32:15                                        │
│    ├─ Confidence: 87.3%                                                      │
│    └─ AI note: "No matching person found in database"                        │
│                                                                              │
│  STEP 2: NAME THE PERSON                                                     │
│  ────────────────────                                                        │
│  Click "Name This Person" button                                             │
│  Modal dialog appears:                                                       │
│                                                                              │
│    ┌────────────────────────────────────┐                                   │
│    │  Name This Person                   │                                   │
│    │                                     │                                   │
│    │  Face: [thumbnail]                  │                                   │
│    │                                     │                                   │
│    │  Full Name *     [____________]     │                                   │
│    │  Role *          [Employee ▼]       │                                   │
│    │  Department      [____________]     │                                   │
│    │  Employee ID     [____________]     │                                   │
│    │  Notes           [____________]     │                                   │
│    │  Tags            [____________]     │                                   │
│    │                                     │                                   │
│    │  Similar existing persons:          │                                   │
│    │  [No similar persons found]         │                                   │
│    │                                     │                                   │
│    │  [Cancel]  [Save & Create Profile]  │                                   │
│    └────────────────────────────────────┘                                   │
│                                                                              │
│  STEP 3: SIMILARITY CHECK                                                    │
│  ────────────────────                                                        │
│  System searches for similar existing persons                                │
│  If matches found: display side-by-side comparison                           │
│    → Option to merge with existing person instead of creating new            │
│  If no matches: proceed with creation                                        │
│                                                                              │
│  STEP 4: SAVE PROFILE                                                        │
│  ──────────────                                                              │
│  Click "Save & Create Profile"                                               │
│  Toast notification: "Profile created for [Name]"                            │
│  Detection card updates with person name                                     │
│  Person now appears in Person Gallery                                        │
│                                                                              │
│  STEP 5: ADD TRAINING IMAGES (Optional)                                      │
│  ────────────────────────────────────                                        │
│  Navigate to Person Profile                                                  │
│  Click "Upload Reference Photos"                                             │
│  Select additional clear face images                                         │
│  System queues for model retraining                                          │
│  Toast: "3 new training images added. Model will retrain automatically."     │
└──────────────────────────────────────────────────────────────────────────────┘

13.4.3 Flow 3: Unknown Person Review Queue

┌──────────────────────────────────────────────────────────────────────────────┐
│                  FLOW 3: UNKNOWN PERSON REVIEW QUEUE                          │
├──────────────────────────────────────────────────────────────────────────────┤
│                                                                              │
│  STEP 1: OPEN REVIEW QUEUE                                                   │
│  ──────────────────────                                                      │
│  Sidebar → "Unknown Persons Review"                                          │
│  View: Grid of unknown person cluster cards                                  │
│  Header: "23 unknown clusters remaining"                                     │
│                                                                              │
│  STEP 2: SELECT CLUSTER                                                      │
│  ────────────────                                                            │
│  Click on a cluster card to expand                                           │
│  Shows:                                                                      │
│    ├─ Representative face (largest)                                          │
│    ├─ Gallery of all face instances in cluster                               │
│    ├─ Sighting history (camera, time, count)                                 │
│    ├─ AI pattern insight: "Seen 5x at entrance between 08:00-09:00"         │
│    └─ Confidence distribution graph                                          │
│                                                                              │
│  STEP 3: MAKE DECISION                                                       │
│  ────────────────                                                            │
│  Options:                                                                    │
│    ├─ [Name This Person] → Enter details → Create new profile               │
│    ├─ [Merge with Existing] → Search/select person → Confirm merge          │
│    ├─ [Ignore Cluster] → "False detection / not a person" → Remove         │
│    └─ [Mark Reviewed] → "Unsure, keep in queue for later"                   │
│                                                                              │
│  STEP 4: QUEUE UPDATES                                                       │
│  ────────────────                                                            │
│  Processed item removed from queue                                           │
│  Toast confirms action: "Cluster marked as [Name]. 22 remaining."            │
│  Auto-advance to next cluster (optional)                                     │
│  Keyboard shortcut: Right arrow → next cluster                               │
│                                                                              │
│  STEP 5: CONTINUE REVIEW                                                     │
│  ────────────────                                                            │
│  Process all clusters or stop and resume later                               │
│  Queue persists across sessions                                              │
│  New clusters automatically added as detected                                │
└──────────────────────────────────────────────────────────────────────────────┘
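
The queue behavior in Flow 3 can be sketched as a pure state update. The decision names mirror the four buttons in Step 3; moving a "Mark Reviewed" cluster to the end of the queue is an assumption (the flow only says it stays in the queue), and all identifiers are hypothetical.

```typescript
type ClusterDecision = "name" | "merge" | "ignore" | "mark-reviewed";

interface ReviewQueueState {
  pending: string[]; // cluster IDs still awaiting a decision, in display order
}

// Applies an operator decision to the queue without mutating the input.
// "name", "merge", and "ignore" remove the cluster; "mark-reviewed" keeps it
// but moves it to the back so the operator advances to a different cluster.
function applyClusterDecision(
  state: ReviewQueueState,
  clusterId: string,
  decision: ClusterDecision,
): ReviewQueueState {
  const rest = state.pending.filter((id) => id !== clusterId);
  if (rest.length === state.pending.length) return state; // unknown cluster ID
  if (decision === "mark-reviewed") return { pending: rest.concat([clusterId]) };
  return { pending: rest };
}

// Header text such as "23 unknown clusters remaining".
function remainingLabel(state: ReviewQueueState): string {
  const n = state.pending.length;
  return n + " unknown " + (n === 1 ? "cluster" : "clusters") + " remaining";
}
```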

13.4.4 Flow 4: AI Settings Adjustment

┌──────────────────────────────────────────────────────────────────────────────┐
│                    FLOW 4: AI SETTINGS ADJUSTMENT                             │
├──────────────────────────────────────────────────────────────────────────────┤
│                                                                              │
│  STEP 1: NAVIGATE TO AI VIBE SETTINGS                                        │
│  ────────────────────────────────────                                        │
│  Sidebar → "AI Vibe Settings" (Sparkles icon)                                │
│  View: Scrollable page with 7 setting sections                               │
│                                                                              │
│  STEP 2: ADJUST DETECTION SENSITIVITY                                        │
│  ────────────────────────────────                                            │
│  Section: "How carefully should the AI watch?"                               │
│  Current: [Relaxed] [Balanced] [High] [Maximum]                              │
│  Change: Click "High"                                                        │
│  Description updates:                                                        │
│    "High: The AI will catch almost everything.                               │
│     Expect more alerts, including some false positives."                     │
│  Toast: "Detection Sensitivity updated to High"                              │
│  Change takes effect immediately                                             │
│                                                                              │
│  STEP 3: ADJUST ALERT STYLE                                                  │
│  ────────────────────                                                        │
│  Section: "When should alerts be sent?"                                      │
│  Current: [Silent] [Digest] [Normal] [Urgent] [Critical]                     │
│  Change: Click "Critical"                                                    │
│  Description updates:                                                        │
│    "Critical: Only truly important events trigger alerts.                    │
│     All other activity is logged but not alerted."                           │
│  Toast: "Alert Style updated to Critical"                                    │
│                                                                              │
│  STEP 4: REVIEW ADVANCED (Admin only)                                        │
│  ────────────────────────────────────                                        │
│  Click "Expand" on Advanced Settings                                         │
│  Shows internal values:                                                      │
│    Detection Sensitivity: High                                               │
│    └─ Confidence Threshold: 0.55                                             │
│    └─ NMS Threshold: 0.40                                                    │
│    └─ Model: yolo11m.onnx                                                    │
│  Admin can directly edit numerical values                                    │
│                                                                              │
│  STEP 5: DONE                                                                │
│  ────────                                                                    │
│  All changes auto-saved                                                      │
│  Return to monitoring — changes effective immediately                        │
└──────────────────────────────────────────────────────────────────────────────┘

13.4.5 Flow 5: Watchlist Alert Configuration

┌──────────────────────────────────────────────────────────────────────────────┐
│                 FLOW 5: WATCHLIST ALERT CONFIGURATION                         │
├──────────────────────────────────────────────────────────────────────────────┤
│                                                                              │
│  STEP 1: NAVIGATE TO WATCHLISTS                                              │
│  ────────────────────────────                                                │
│  Sidebar → "Watchlists"                                                      │
│  View: Grid of existing watchlist cards                                      │
│  Default: VIP, Blacklist, Authorized, Temporary Access                      │
│                                                                              │
│  STEP 2: CREATE NEW WATCHLIST (Optional)                                     │
│  ────────────────────────────────────                                        │
│  Click "+ New Watchlist"                                                     │
│  Modal:                                                                      │
│    Name: [Security Escort Required]                                          │
│    Icon: [🛡️] (icon picker)                                                 │
│    Color: [Orange] (color picker)                                            │
│    Description: [People who require security escort]                         │
│    Click "Create"                                                            │
│  New watchlist card appears in grid                                          │
│                                                                              │
│  STEP 3: ADD MEMBERS                                                         │
│  ────────────────                                                            │
│  Click on watchlist card                                                     │
│  Click "Add from Gallery"                                                    │
│  Search/select persons to add:                                               │
│    [☑] John Doe                                                             │
│    [☑] Jane Smith                                                           │
│    [☐] Bob Johnson (not selected)                                            │
│  Click "Add to Watchlist"                                                    │
│  Toast: "2 persons added to Security Escort Required"                        │
│                                                                              │
│  STEP 4: CONFIGURE ALERTS                                                    │
│  ────────────────────                                                        │
│  Click "Settings" tab on watchlist detail                                    │
│  Configure:                                                                  │
│    Alert Timing:    [☑] Immediate    [☐] Delayed (___ min)                  │
│    Severity:        [☐] Inherit    [☑] Force Critical                       │
│    Notify Groups:   [☑] Security Team    [☐] Management                     │
│    Media:           [☑] Image    [☑] Video                                  │
│    Quiet Hours:     [☐] Respect global    [☑] Always alert                  │
│    Escalation:      [☑] Enable escalation (5/10/20 min)                     │
│  Click "Save"                                                                │
│                                                                              │
│  STEP 5: TEST                                                                │
│  ────────                                                                    │
│  Click "Test Alert" button                                                   │
│  System sends test alert through configured channels                         │
│  Verify: Telegram message received ✓                                         │
│  Verify: WhatsApp message received ✓                                         │
│  Watchlist is now active and monitoring                                      │
└──────────────────────────────────────────────────────────────────────────────┘
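
The settings chosen in Step 4 of Flow 5 might be stored as an object like the one below. This is an illustrative sketch: the field names, the reading of "5/10/20 min" as escalation steps at 5, 10, and 20 minutes, and the validation rules are all assumptions rather than a specified schema.

```typescript
interface WatchlistAlertSettings {
  timing: { mode: "immediate" } | { mode: "delayed"; delayMinutes: number };
  severity: "inherit" | "force-critical";
  notifyGroups: string[];
  media: { image: boolean; video: boolean };
  quietHours: "respect-global" | "always-alert";
  // Minutes after the initial alert at which escalation steps fire; null = disabled.
  escalationMinutes: number[] | null;
}

// The example configuration from Flow 5 ("Security Escort Required").
const securityEscortSettings: WatchlistAlertSettings = {
  timing: { mode: "immediate" },
  severity: "force-critical",
  notifyGroups: ["Security Team"],
  media: { image: true, video: true },
  quietHours: "always-alert",
  escalationMinutes: [5, 10, 20],
};

// Hypothetical pre-save checks; returns a list of human-readable problems.
function validateWatchlistSettings(s: WatchlistAlertSettings): string[] {
  const errors: string[] = [];
  if (s.notifyGroups.length === 0) errors.push("Select at least one notify group");
  const timing = s.timing;
  if (timing.mode === "delayed" && !(timing.delayMinutes > 0)) {
    errors.push("Delay must be a positive number of minutes");
  }
  if (s.escalationMinutes !== null) {
    // Steps must be non-empty, positive, and strictly increasing.
    let previous = 0;
    let ascending = s.escalationMinutes.length > 0;
    for (const minutes of s.escalationMinutes) {
      if (minutes <= previous) ascending = false;
      previous = minutes;
    }
    if (!ascending) errors.push("Escalation steps must be positive and increasing");
  }
  return errors;
}
```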

13.5 Component Specifications

13.5.1 Camera Feed Component

State	Visual	Interaction
Loading	Centered spinner overlay, camera name visible	None — wait for stream
Playing	Live stream active, recording dot if applicable	Click to focus, hover for controls
Paused	Stream paused, large play button overlay	Click to resume
Error	Error icon + "Connection failed" + Retry button	Click Retry to reconnect
Offline	Gray placeholder with camera icon + "Offline"	Shows last online timestamp
Disabled	Grayed out with "Disabled" badge	No stream attempted

Prop	Type	Required	Default	Description
`cameraId`	`string`	Yes	—	Unique camera identifier (e.g., "cam-01")
`name`	`string`	Yes	—	Display name shown as overlay
`streamUrl`	`string`	Yes	—	HLS or WebRTC stream URL
`status`	`'online' | 'offline' | 'reconnecting' | 'disabled'`	Yes	—	Current camera status
`layout`	`'grid' | 'fullscreen'`	No	`'grid'`	Current layout mode
`quality`	`'auto' | 'hd' | 'sd'`	No	`'auto'`	Stream quality preference
`showControls`	`boolean`	No	`true`	Show overlay controls
`onFocus`	`(id: string) => void`	No	—	Callback when camera is focused
`onSnapshot`	`(id: string) => void`	No	—	Callback when snapshot is taken

13.5.2 Alert Card Component

Prop	Type	Required	Description
`id`	`string`	Yes	Alert unique identifier
`severity`	`'critical' | 'high' | 'medium' | 'low' | 'info'`	Yes	Alert severity level
`type`	`string`	Yes	Alert type classification
`cameraName`	`string`	Yes	Source camera display name
`timestamp`	`Date`	Yes	When the alert occurred
`thumbnail`	`string`	No	URL to thumbnail image
`personName`	`string`	No	Identified person name (if known)
`status`	`'pending' | 'acknowledged' | 'resolved' | 'ignored'`	Yes	Current alert status
`onAcknowledge`	`() => void`	No	Acknowledge callback
`onResolve`	`() => void`	No	Resolve callback
`onIgnore`	`() => void`	No	Ignore callback
`onViewDetails`	`() => void`	No	View details callback
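
The Alert Card props translate directly into an interface. The sketch below also includes a hypothetical `sortAlerts` helper that orders cards by severity and then by recency; that ordering is an assumption for illustration, not a documented behavior of the Alert Center.

```typescript
type AlertSeverity = "critical" | "high" | "medium" | "low" | "info";
type AlertStatus = "pending" | "acknowledged" | "resolved" | "ignored";

interface AlertCardProps {
  id: string;
  severity: AlertSeverity;
  type: string;
  cameraName: string;
  timestamp: Date;
  thumbnail?: string; // URL to thumbnail image
  personName?: string; // identified person name, if known
  status: AlertStatus;
  onAcknowledge?: () => void;
  onResolve?: () => void;
  onIgnore?: () => void;
  onViewDetails?: () => void;
}

// Lower rank sorts first, following the order of the severity union.
const SEVERITY_RANK: Record<AlertSeverity, number> = {
  critical: 0,
  high: 1,
  medium: 2,
  low: 3,
  info: 4,
};

// Returns a new array: most severe first, newest first within a severity.
function sortAlerts<T extends Pick<AlertCardProps, "severity" | "timestamp">>(
  alerts: readonly T[],
): T[] {
  return alerts.slice().sort(
    (a, b) =>
      SEVERITY_RANK[a.severity] - SEVERITY_RANK[b.severity] ||
      b.timestamp.getTime() - a.timestamp.getTime(),
  );
}
```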

13.5.3 Stat Card Component

Prop	Type	Required	Description
`title`	`string`	Yes	Card label (e.g., "Cameras Online")
`value`	`string | number`	Yes	Main displayed value (e.g., "8/8")
`icon`	`LucideIcon`	Yes	Icon component from Lucide React
`color`	`'green' | 'blue' | 'orange' | 'red' | 'purple'`	No	Color theme (default: blue)
`trend`	`number`	No	Percentage change from previous period
`subtitle`	`string`	No	Secondary text below value
`href`	`string`	No	Navigation link (e.g., to detail page)

13.6 Toast Notification System

Type	Icon	Color	Duration	Use Case
Success	Check circle	Green (`#10B981`)	3 seconds	Action completed successfully
Error	X circle	Red (`#EF4444`)	5 seconds (or persistent)	Action failed; may require user attention
Warning	Alert triangle	Orange (`#F59E0B`)	4 seconds	Non-critical issue; may need attention
Info	Info circle	Blue (`#3B82F6`)	3 seconds	Informational message
Alert	Bell	Red (`#EF4444`)	Persistent (until dismissed)	Critical alert notification

Toast behavior:

Appears in top-right corner
Stacks up to 5 toasts simultaneously
Older toasts pushed down when new ones arrive
Hovering pauses auto-dismiss timer
Click to dismiss immediately
Swipe right to dismiss (mobile)
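
The stacking rules above can be captured in a small, framework-independent model. In this sketch the durations come from the table (the Error type is modeled with its 5-second default, and `null` means persistent); dropping the oldest toast when the stack is full is an assumption, and `pushToast` and `dismissToast` are hypothetical names.

```typescript
type ToastType = "success" | "error" | "warning" | "info" | "alert";

// Auto-dismiss durations in milliseconds; null means persistent until dismissed.
const TOAST_DURATION_MS: Record<ToastType, number | null> = {
  success: 3000,
  error: 5000,
  warning: 4000,
  info: 3000,
  alert: null,
};

interface ToastItem {
  id: number;
  type: ToastType;
  message: string;
}

// Maximum number of toasts shown at once.
const MAX_VISIBLE_TOASTS = 5;

// Adds a toast at the top of the stack, pushing older toasts down.
// Assumption: when the stack is full, the oldest toast is dropped.
function pushToast(stack: readonly ToastItem[], toast: ToastItem): ToastItem[] {
  return [toast].concat(stack).slice(0, MAX_VISIBLE_TOASTS);
}

// Removes a toast, e.g. on click, swipe, or when its timer expires.
function dismissToast(stack: readonly ToastItem[], id: number): ToastItem[] {
  return stack.filter((t) => t.id !== id);
}
```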

13.7 Modal System

Size	Width	Use Case
Small	400px	Confirmations, simple forms
Medium (default)	560px	Standard forms, detail views
Large	800px	Complex forms, image viewers
Fullscreen	100%	Camera fullscreen, large data tables

Modal behavior:

Backdrop click to close (configurable)
Escape key to close (configurable)
Focus trap — Tab cycles within modal
Return focus to trigger element on close
Body scroll locked when modal open
Enter key submits primary action (forms)

13.8 Responsive Behavior

Breakpoint	Width	Layout Changes
`xs`	< 576px	Single column; stacked layouts; bottom tab bar; hamburger menu; camera grid 1x1 or 2x1
`sm`	576-767px	Two column layouts; sidebar as overlay drawer; camera grid 2x2
`md`	768-991px	Collapsed sidebar (72px); filters as drawer; camera grid 2x3; 3-column person gallery
`lg`	992-1199px	Sidebar expanded (260px); full desktop layout; 4-column person gallery
`xl`	1200-1399px	Full desktop layout; 5-column person gallery; 2x4 camera grid
`xxl`	1400px+	Max content width 1400px centered; all features visible

13.9 Keyboard Shortcuts

Shortcut	Context	Action
`?`	Global	Show keyboard shortcuts reference modal
`/`	Global	Focus global search bar
`Escape`	Global	Close modal / exit fullscreen / deselect
`F`	Live View	Toggle fullscreen on focused camera
`S`	Live View	Take snapshot of focused camera
`1-8`	Live View	Focus camera 1-8
`Space`	Live View	Pause/play focused camera stream
`A`	Alert Center	Acknowledge selected alert
`R`	Alert Center	Resolve selected alert
`N`	Detections / Unknowns	Name unknown person
`→`	Unknown Review	Next cluster
`←`	Unknown Review	Previous cluster
`Ctrl+K`	Global	Command palette (quick navigation)
`Ctrl+Shift+A`	Global	Acknowledge most recent alert
`M`	Live View	Toggle mute on camera audio
`+` / `-`	Timeline	Zoom in / zoom out

13.10 Animation Guidelines

Animation	Duration	Easing	Description
Page transition	200ms	`ease-out`	Fade in on route change
Modal open	250ms	`cubic-bezier(0.16, 1, 0.3, 1)`	Scale up + fade in
Modal close	150ms	`ease-in`	Scale down + fade out
Sidebar toggle	250ms	`ease-in-out`	Width transition 260px ↔ 72px
Toast slide-in	300ms	`ease-out`	Slide from right + fade in
Toast fade-out	200ms	`ease-in`	Fade out before removal
Card hover lift	150ms	`ease`	Subtle translateY(-2px) + shadow increase
Segmented slider	200ms	`ease`	Sliding background between options
Pulse (recording)	2s	`ease-in-out` infinite	Red dot opacity oscillation
Stats update	500ms	`ease`	Number count-up animation
Skeleton shimmer	1.5s	`linear infinite`	Shimmer gradient sweep
Alert flash	1s	`ease-out`	Border flash on camera with new alert
Camera focus	300ms	`ease-out`	Expand to fullscreen
Dropdown open	150ms	`ease-out`	Fade + slight translateY
Tooltip	100ms	`ease`	Fade in on hover

13.11 Technology Stack

Layer	Technology	Version	Purpose
Framework	React	18.x	UI library
Meta-framework	Next.js	14.x	SSR, routing, API routes
Language	TypeScript	5.x	Type safety
Styling	Tailwind CSS	3.x	Utility-first CSS
Theme	CSS Custom Properties	—	Dark mode via `dark` class
UI Components	shadcn/ui	latest	Base component library
Icons	Lucide React	latest	Consistent icon set
State Management	Zustand	4.x	Lightweight global state
Data Fetching	TanStack Query (React Query)	5.x	Server state management
Real-time	Socket.IO Client	4.x	WebSocket for live updates
Video	hls.js	latest	HLS stream playback
Video (WebRTC)	native	—	WebRTC stream fallback
Charts	Recharts	2.x	Data visualization
Date/Time	date-fns	2.x	Date formatting and manipulation
Forms	React Hook Form	7.x	Form state management
Validation	Zod	3.x	Schema validation
Zone Drawing	SVG + native events	—	Polygon drawing on camera feed
Testing	Vitest	1.x	Unit testing
E2E Testing	Playwright	1.x	Browser automation testing
Build	Next.js built-in	—	Production optimization

Section 14: Deployment Plan

14.1 Deployment Architecture Overview

The deployment architecture spans two physical environments: AWS cloud for centralized services and an Intel NUC edge gateway at the surveillance site, connected by an encrypted WireGuard VPN tunnel. Every service runs in a container: cloud workloads run on Kubernetes (EKS) with GitOps-based continuous delivery through ArgoCD, and edge services run under Docker Compose.

┌──────────────────────────────────────────────────────────────────────────────┐
│                         DEPLOYMENT ARCHITECTURE                               │
│                                                                              │
│    ┌────────────────────────────────────────────────────────────────────┐   │
│    │                        AWS CLOUD                                   │   │
│    │                                                                    │   │
│    │  ┌──────────┐  ┌──────────┐  ┌──────────┐  ┌──────────────────┐ │   │
│    │  │ Route 53 │──▶  ALB     │──▶  EKS     │──▶  App Pods       │ │   │
│    │  │   DNS    │  │ TLS 1.3  │  │ Cluster  │  │  (FastAPI/Next)  │ │   │
│    │  └──────────┘  └──────────┘  └────┬─────┘  └──────────────────┘ │   │
│    │                                    │                              │   │
│    │  ┌──────────┐  ┌──────────┐  ┌────┴─────┐  ┌──────────────────┐ │   │
│    │  │   S3     │  │   RDS    │  │ ElastiCache│  │  MSK Kafka       │ │   │
│    │  │  Media   │  │ Postgres │  │  Redis    │  │  (Event Bus)     │ │   │
│    │  └──────────┘  └──────────┘  └──────────┘  └──────────────────┘ │   │
│    │                                                                    │   │
│    │  ┌──────────────────────────────────────────────────────────────┐ │   │
│    │  │  WireGuard VPN Gateway (EC2)  ←────→  Edge Gateway          │ │   │
│    │  │  UDP 51820                    Tunnel   (Intel NUC, Site)     │ │   │
│    │  └──────────────────────────────────────────────────────────────┘ │   │
│    └────────────────────────────────────────────────────────────────────┘   │
│                                                                              │
│    ┌────────────────────────────────────────────────────────────────────┐   │
│    │                        EDGE SITE                                   │   │
│    │                                                                    │   │
│    │  ┌─────────────────────────────────────────────────────────────┐  │   │
│    │  │              Intel NUC (Ubuntu Server 22.04)                │  │   │
│    │  │                                                             │  │   │
│    │  │  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐    │  │   │
│    │  │  │ Video Capture│  │ AI Inference │  │   MinIO      │    │  │   │
│    │  │  │  (RTSP/FFmpeg)│  │ (YOLO/Face) │  │  (Storage)   │    │  │   │
│    │  │  └──────────────┘  └──────────────┘  └──────────────┘    │  │   │
│    │  │                                                             │  │   │
│    │  │  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐    │  │   │
│    │  │  │    Redis     │  │  WireGuard   │  │ Node Exporter│    │  │   │
│    │  │  │   (Cache)    │  │   (VPN)      │  │  (Metrics)   │    │  │   │
│    │  │  └──────────────┘  └──────────────┘  └──────────────┘    │  │   │
│    │  │                                                             │  │   │
│    │  └─────────────────────────────────────────────────────────────┘  │   │
│    │                              │                                     │   │
│    │                    ┌─────────┴──────────┐                         │   │
│    │                    │  Camera LAN         │                         │   │
│    │                    │  CP PLUS DVR        │                         │   │
│    │                    │  192.168.29.200:554 │                         │   │
│    │                    │  (8 channels)       │                         │   │
│    │                    └─────────────────────┘                         │   │
│    └────────────────────────────────────────────────────────────────────┘   │
└──────────────────────────────────────────────────────────────────────────────┘

14.2 Cloud Deployment (AWS EKS)

14.2.1 EKS Cluster Configuration

Parameter	Value	Notes
Kubernetes version	1.28+	Latest stable at deployment
Control plane	Managed by AWS	Multi-AZ availability
Node group type	Managed (EC2)	t3.large for general, g4dn.xlarge for GPU
CNI	Amazon VPC CNI	Native VPC networking for pods
Ingress controller	NGINX Ingress + cert-manager	TLS termination at ALB
GitOps	ArgoCD	Declarative continuous deployment
Pod identity	IRSA (IAM Roles for Service Accounts)	No long-term AWS credentials

14.2.2 Cloud Service Resources

Service	AWS Service	Instance/Tier	HA Mode	Monthly Est.
Orchestration	Amazon EKS	Managed control plane	Multi-AZ	$73
Application nodes	EC2 (t3.large)	3 nodes (on-demand)	Multi-AZ spread	$200
GPU nodes	EC2 (g4dn.xlarge)	1 node (spot preferred)	Single + auto-recovery	$350
Database	RDS PostgreSQL 15	db.r6g.xlarge Multi-AZ	Multi-AZ with failover	$520
Cache	ElastiCache Redis	cache.r6g.large (2 shards)	Cluster mode	$260
Message bus	Amazon MSK	kafka.m5.large (3 brokers)	Multi-AZ	$350
Object storage	S3	Standard + IA + Glacier	Cross-region replication	$200
Load balancer	ALB	Application Load Balancer	Multi-AZ	$25
DNS	Route 53	Hosted zone + health checks	Global	$15
VPN gateway	EC2 (t3.micro)	WireGuard endpoint	Single (monitor for HA)	$15
Secrets	AWS Secrets Manager	Vault integration	Multi-AZ	$10
Monitoring	CloudWatch	Logs + metrics + alarms	Multi-AZ	$50
Total				~$2,068/month
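
The line items above can be summed in a few lines so the total stays in step whenever an estimate changes. This is a minimal sketch: the dictionary keys are shorthand labels chosen here, and the figures are the monthly estimates from the table.

```python
from typing import Dict

# Monthly estimates in USD, copied from the table in 14.2.2.
# The keys are shorthand labels chosen for this sketch.
MONTHLY_ESTIMATES_USD: Dict[str, int] = {
    "EKS control plane": 73,
    "EC2 t3.large application nodes": 200,
    "EC2 g4dn.xlarge GPU node": 350,
    "RDS PostgreSQL": 520,
    "ElastiCache Redis": 260,
    "Amazon MSK": 350,
    "S3": 200,
    "ALB": 25,
    "Route 53": 15,
    "EC2 t3.micro VPN gateway": 15,
    "Secrets Manager": 10,
    "CloudWatch": 50,
}


def monthly_total(estimates: Dict[str, int]) -> int:
    """Sum all monthly line items."""
    return sum(estimates.values())


def annual_total(estimates: Dict[str, int]) -> int:
    """Project the monthly total over twelve months."""
    return 12 * monthly_total(estimates)


if __name__ == "__main__":
    print(f"Monthly: ${monthly_total(MONTHLY_ESTIMATES_USD):,}")
    print(f"Annual:  ${annual_total(MONTHLY_ESTIMATES_USD):,}")
```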

14.3 Edge Deployment (Intel NUC)

14.3.1 Edge Hardware Specification

Component	Specification	Notes
Device	Intel NUC 13 Pro (or equivalent)	Fanless preferred for reliability
CPU	Intel Core i7-1360P (12 cores, 16 threads)	Sufficient for 8 streams + AI inference
RAM	32 GB DDR4-3200 (2x16 GB)	Dual channel for memory bandwidth
Storage (OS)	500 GB NVMe SSD (Samsung 980 Pro or equivalent)	Fast boot and application loading
Storage (Data)	2 TB NVMe SSD (Samsung 990 Pro or equivalent)	7-day local recording buffer
Network	Intel i226-V 2.5 GbE (dual port)	Dual NIC for WAN + LAN separation
WiFi	Disabled in BIOS	Security — no wireless
Bluetooth	Disabled in BIOS	Security — no wireless
TPM	TPM 2.0 enabled	For LUKS auto-unseal
OS	Ubuntu Server 22.04 LTS (minimal install)	No desktop environment

14.3.2 Edge Docker Compose Configuration

version: "3.8"

services:
  # RTSP stream capture and frame extraction
  video-capture:
    image: sentinel/surveillance-video-capture:v2.3.1
    restart: unless-stopped
    network_mode: host
    environment:
      - DVR_IP=192.168.29.200
      - DVR_PORT=554
      - NUM_CHANNELS=8
      - FRAME_EXTRACT_FPS=1
      - RECORDING_SEGMENT_SEC=10
      - REDIS_HOST=localhost
      - REDIS_PORT=6379
      - MINIO_ENDPOINT=localhost:9000
    volumes:
      - /data/frames:/app/frames
      - /data/recordings:/app/recordings
      - ./secrets:/run/secrets:ro
    depends_on:
      - redis
      - minio
    deploy:
      resources:
        limits:
          cpus: '4.0'
          memory: 4G
    logging:
      driver: "json-file"
      options:
        max-size: "100m"
        max-file: "5"

  # AI inference service (lightweight edge models)
  ai-inference:
    image: sentinel/surveillance-ai-inference:edge-v2.3.1
    restart: unless-stopped
    runtime: nvidia  # Requires the NVIDIA Container Toolkit on the host; remove this line on CPU-only hosts
    environment:
      - MODEL_PATH=/models
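      # This service is on the default Compose network, so it reaches peers by service name;
      # localhost would resolve to this container itself.
      - REDIS_HOST=redis
      - REDIS_PORT=6379
      - MINIO_ENDPOINT=minio:9000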
      - INFERENCE_BATCH_SIZE=8
      - CONFIDENCE_THRESHOLD=0.7
      - NMS_THRESHOLD=0.45
    volumes:
      - ./models:/models:ro
      - /data/frames:/app/frames:ro
      - ./secrets:/run/secrets:ro
    depends_on:
      - redis
    deploy:
      resources:
        limits:
          cpus: '6.0'
          memory: 8G
    logging:
      driver: "json-file"
      options:
        max-size: "100m"
        max-file: "5"

  # Local object storage (S3-compatible)
  minio:
    # Placeholder tag: pin to a concrete RELEASE.<timestamp> tag published by MinIO
    image: minio/minio:RELEASE.2024-latest
    restart: unless-stopped
    command: server /data --console-address ":9001"
    ports:
      - "9000:9000"
      - "9001:9001"
    volumes:
      - /data/minio:/data
    environment:
      - MINIO_ROOT_USER_FILE=/run/secrets/minio_user
      - MINIO_ROOT_PASSWORD_FILE=/run/secrets/minio_password
    secrets:
      - minio_user
      - minio_password
    deploy:
      resources:
        limits:
          cpus: '1.0'
          memory: 1G

  # Local cache and Pub/Sub
  redis:
    image: redis:7-alpine
    restart: unless-stopped
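    # REDIS_PASSWORD is an assumed variable name; Compose substitutes it from the shell or a .env file
    command: >
      redis-server
      --requirepass "${REDIS_PASSWORD}"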
      --appendonly yes
      --maxmemory 512mb
      --maxmemory-policy allkeys-lru
    volumes:
      - redis_data:/data
    ports:
      - "127.0.0.1:6379:6379"
    deploy:
      resources:
        limits:
          cpus: '0.5'
          memory: 512M

  # WireGuard VPN client
  wireguard:
    image: linuxserver/wireguard:latest
    restart: unless-stopped
    cap_add:
      - NET_ADMIN
      - SYS_MODULE
    environment:
      - PUID=1000
      - PGID=1000
    volumes:
      - ./wireguard-config:/config
    sysctls:
      - net.ipv4.conf.all.src_valid_mark=1
    deploy:
      resources:
        limits:
          cpus: '0.25'
          memory: 64M

  # Metrics exporter for Prometheus
  node-exporter:
    image: prom/node-exporter:latest
    restart: unless-stopped
    volumes:
      - /proc:/host/proc:ro
      - /sys:/host/sys:ro
      - /:/rootfs:ro
    command:
      - '--path.procfs=/host/proc'
      - '--path.rootfs=/rootfs'
      - '--path.sysfs=/host/sys'
      - '--collector.filesystem.mount-points-exclude=^/(sys|proc|dev|host|etc)($$|/)'

volumes:
  redis_data:
    driver: local

secrets:
  minio_user:
    file: ./secrets/minio_user.txt
  minio_password:
    file: ./secrets/minio_password.txt
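
A quick way to confirm that the container limits above fit the hardware in 14.3.1 is to add them up against the host's 16 threads and 32 GB of RAM. The sketch below hard-codes the limits from the Compose file; node-exporter is left out because it declares no limit, and the helper names are illustrative only.

```python
from dataclasses import dataclass
from typing import Dict, Tuple

# Host capacity from the hardware table in 14.3.1.
HOST_THREADS = 16
HOST_RAM_MIB = 32 * 1024


@dataclass(frozen=True)
class Limit:
    cpus: float
    memory_mib: int


# Limits copied from the deploy.resources.limits entries above.
LIMITS: Dict[str, Limit] = {
    "video-capture": Limit(cpus=4.0, memory_mib=4 * 1024),
    "ai-inference": Limit(cpus=6.0, memory_mib=8 * 1024),
    "minio": Limit(cpus=1.0, memory_mib=1024),
    "redis": Limit(cpus=0.5, memory_mib=512),
    "wireguard": Limit(cpus=0.25, memory_mib=64),
}


def headroom(limits: Dict[str, Limit]) -> Tuple[float, int]:
    """Return (spare CPU threads, spare RAM in MiB) after all limits are reserved."""
    cpu_reserved = sum(limit.cpus for limit in limits.values())
    mem_reserved = sum(limit.memory_mib for limit in limits.values())
    return HOST_THREADS - cpu_reserved, HOST_RAM_MIB - mem_reserved


if __name__ == "__main__":
    spare_cpu, spare_mem = headroom(LIMITS)
    print(f"Spare CPU threads: {spare_cpu}, spare RAM: {spare_mem} MiB")
```

A negative value in either position would mean the limits oversubscribe the host and should be revisited before deployment.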

14.4 Configuration and Environment Variables

14.4.1 Environment Structure

Environment	URL Pattern	Data	Purpose
Development	`*.dev.internal`	Synthetic test data	Feature development, local testing
Staging	`*.staging.example.com`	Anonymized production-like data	Integration testing, UAT
Production	`*.example.com`	Real operational data	Live surveillance operations

14.4.2 Required Environment Variables

# ─── APPLICATION ───
APP_ENV=production                    # dev | staging | production
APP_NAME="Sentinel AI Surveillance"
APP_VERSION=2.3.1
APP_DEBUG=false
APP_SECRET_KEY=<random-256-bit-key>   # Used for session signing
LOG_LEVEL=INFO                         # DEBUG | INFO | WARNING | ERROR | CRITICAL

# ─── SERVER ───
API_HOST=0.0.0.0
API_PORT=8080
WORKERS=4                              # Uvicorn worker processes
TIMEZONE=Asia/Kolkata

# ─── DATABASE ───
DATABASE_URL=postgresql://user:pass@rds-endpoint:5432/surveillance
DB_POOL_SIZE=20
DB_MAX_OVERFLOW=10
DB_POOL_TIMEOUT=30
DB_ECHO=false                          # Set true for SQL logging (dev only)

# ─── REDIS ───
REDIS_URL=redis://:password@redis-endpoint:6379/0
REDIS_POOL_SIZE=50
REDIS_SOCKET_TIMEOUT=5

# ─── OBJECT STORAGE (S3 or MinIO) ───
STORAGE_TYPE=s3                        # s3 | minio
STORAGE_ENDPOINT=s3.amazonaws.com
STORAGE_BUCKET=sentinel-surveillance-media
STORAGE_REGION=ap-south-1
STORAGE_ACCESS_KEY=<access-key>
STORAGE_SECRET_KEY=<secret-key>
STORAGE_SECURE=true
STORAGE_URL_EXPIRY=300                 # Signed URL expiry in seconds

# ─── DVR / CAMERA CONNECTION ───
DVR_IP=192.168.29.200
DVR_PORT=554
DVR_USERNAME=admin
DVR_PASSWORD=<dvr-password>
DVR_CHANNELS=8
DVR_STREAM_QUALITY=0                   # 0=main (high), 1=sub (low)
DVR_RTSP_TEMPLATE="rtsp://{user}:{pass}@{ip}:{port}/user={user}&password={pass}&channel={ch}&stream={quality}.sdp?"

# ─── AI MODELS ───
MODEL_PATH=/models
HUMAN_DETECTION_MODEL=yolo11m.onnx
FACE_DETECTION_MODEL=scrfd_10g_bnkps.onnx
FACE_RECOGNITION_MODEL=arcface_r100.onnx
CONFIDENCE_THRESHOLD=0.7
NMS_THRESHOLD=0.45
FACE_MATCH_THRESHOLD=0.70
UNKNOWN_CLUSTER_EPS=0.35
UNKNOWN_CLUSTER_MIN_SAMPLES=3

# ─── TELEGRAM NOTIFICATIONS ───
TELEGRAM_ENABLED=true
TELEGRAM_BOT_TOKEN=<bot-token>
TELEGRAM_WEBHOOK_URL=https://api.example.com/webhooks/telegram
TELEGRAM_WEBHOOK_SECRET=<webhook-secret>
TELEGRAM_ADMIN_CHAT_ID=<admin-chat-id>

# ─── WHATSAPP NOTIFICATIONS ───
WHATSAPP_ENABLED=true
WHATSAPP_API_VERSION=v18.0
WHATSAPP_ACCESS_TOKEN=<access-token>
WHATSAPP_PHONE_NUMBER_ID=<phone-number-id>
WHATSAPP_WEBHOOK_VERIFY_TOKEN=<verify-token>
WHATSAPP_BUSINESS_ACCOUNT_ID=<business-account-id>

# ─── VPN ───
VPN_ENABLED=true
VPN_TYPE=wireguard
VPN_ENDPOINT=wg.example.com:51820
VPN_PUBLIC_KEY=<server-public-key>
VPN_PRIVATE_KEY=<client-private-key>
VPN_PRESHARED_KEY=<preshared-key>
VPN_ALLOWED_IPS=10.100.0.0/16
VPN_KEEPALIVE=25

# ─── AUTHENTICATION ───
JWT_SECRET_KEY=<ecdsa-private-key-pem>
JWT_PUBLIC_KEY=<ecdsa-public-key-pem>
JWT_ALGORITHM=ES256
JWT_ACCESS_TOKEN_EXPIRE_MINUTES=15
JWT_REFRESH_TOKEN_EXPIRE_DAYS=7
MFA_REQUIRED_ROLES=super_admin,admin
MFA_ISSUER="Sentinel AI Surveillance"

# ─── MONITORING ───
PROMETHEUS_ENABLED=true
METRICS_PORT=9090
GRAFANA_URL=https://grafana.example.com
SENTRY_DSN=<sentry-dsn>
HEALTH_CHECK_INTERVAL=30

# ─── RETENTION ───
RECORDING_RETENTION_DAYS=90
DETECTION_SNAPSHOT_RETENTION_DAYS=90
EVENT_LOG_RETENTION_DAYS=365
AUDIT_LOG_RETENTION_DAYS=365
TRAINING_DATA_RETENTION_DAYS=365
AUTO_CLEANUP_ENABLED=true
AUTO_CLEANUP_HOUR=3                    # 3:00 AM daily

# ─── SECURITY ───
CORS_ALLOWED_ORIGINS=https://app.example.com,https://staging.example.com
CSP_REPORT_ONLY=false
RATE_LIMIT_DEFAULT=100/minute
RATE_LIMIT_AUTH=10/minute
SESSION_MAX_AGE_HOURS=8
SESSION_IDLE_TIMEOUT_MINUTES=30
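
The DVR_RTSP_TEMPLATE value is a format string that has to be expanded once per channel. A minimal sketch of that expansion follows, assuming channels are numbered from 1 and that credentials should be percent-encoded so characters such as `@` or `/` in a password do not break the URL. The function name `build_rtsp_urls` is hypothetical.

```python
from typing import List
from urllib.parse import quote

# Same template as DVR_RTSP_TEMPLATE above.
DVR_RTSP_TEMPLATE = (
    "rtsp://{user}:{pass}@{ip}:{port}/"
    "user={user}&password={pass}&channel={ch}&stream={quality}.sdp?"
)


def build_rtsp_urls(
    template: str,
    *,
    user: str,
    password: str,
    ip: str,
    port: int,
    channels: int,
    quality: int,
) -> List[str]:
    """Expand the template into one URL per channel (1-based, an assumption)."""
    # Percent-encode credentials so reserved characters stay inside their field.
    encoded_user = quote(user, safe="")
    encoded_pass = quote(password, safe="")
    urls = []
    for channel in range(1, channels + 1):
        # format_map accepts the "pass" key even though it is a Python keyword.
        urls.append(
            template.format_map(
                {
                    "user": encoded_user,
                    "pass": encoded_pass,
                    "ip": ip,
                    "port": port,
                    "ch": channel,
                    "quality": quality,
                }
            )
        )
    return urls


if __name__ == "__main__":
    for url in build_rtsp_urls(
        DVR_RTSP_TEMPLATE,
        user="admin",
        password="example-password",  # placeholder, never log real credentials
        ip="192.168.29.200",
        port=554,
        channels=8,
        quality=0,
    ):
        print(url)
```

Because the expanded URLs embed the DVR password, they should be treated as secrets and kept out of application logs.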

14.5 Rollout Stages

14.5.1 Stage 1: Foundation (Weeks 1-4)

Objective: Infrastructure, VPN connectivity, and core data layer operational.

Week	Tasks	Deliverables	Success Criteria
1	AWS account setup, VPC creation (3 AZs), EKS cluster deployment, IAM roles	Cloud network ready	VPC flow logs active; EKS nodes Ready
1	RDS PostgreSQL Multi-AZ, ElastiCache Redis cluster	Data layer ready	DB connections successful; replication lag < 1s
2	S3 buckets (media, backups, logs), lifecycle policies, CORS	Storage ready	Upload/download test successful
2	WireGuard VPN gateway (EC2), key generation, firewall rules	VPN endpoint ready	Tunnel handshake successful
3	Edge gateway: OS install, hardening, Docker, WireGuard client	Edge device ready	Edge connects to cloud over VPN
3	Edge services: MinIO, Redis, video capture container	Edge services running	RTSP streams reachable from edge
4	Database schema migration (29 tables), seed data (admin user, 8 cameras)	Database ready	Schema matches design; seed data present
4	Monitoring: Prometheus, Grafana, CloudWatch dashboards	Monitoring active	Dashboards accessible; metrics flowing
4	End-to-end connectivity test	Full pipeline verified	Video from DVR → Edge → Cloud (VPN) → S3

Milestone M1 — Infrastructure Ready (End of Week 4):

All cloud services deployed and healthy
VPN tunnel established and stable (< 100ms latency)
Edge gateway online, all Docker services running
Database schema deployed with migrations and seed data
All 8 camera streams reachable from edge
Basic monitoring and alerting in place

14.5.2 Stage 2: Core AI Pipeline (Weeks 5-8)

Objective: Video ingestion, AI detection, face recognition, and basic API operational.

Week	Tasks	Deliverables	Success Criteria
5	Video capture service: RTSP ingestion, frame extraction, segment recording	Stream ingestion working	All 8 streams connected; FPS > 5 per stream
5	Kafka topic setup, stream ingestion producer	Event streaming ready	Frames published to Kafka
6	AI Inference Service: YOLO (human detection), SCRFD (face detection)	Detection models running	mAP > 0.90 for human detection
6	Detection event storage in PostgreSQL	Detection database working	Events queryable via API
7	ArcFace (face recognition) model deployment, embedding generation, pgvector	Face recognition working	Rank-1 accuracy > 95% on test set
7	Person matching logic: known person lookup, unknown person handling	Person matching working	Correct identification in < 100ms
8	FastAPI core: health endpoints, camera endpoints, detection endpoints	API core functional	All endpoints return correct data
8	Basic authentication: login, JWT token issuance, password hashing	Auth working	Login → token → authenticated requests

Milestone M2 — AI Pipeline Operational (End of Week 8):

All 8 camera streams ingesting at target FPS
Human detection, face detection, and face recognition operational
Detection events stored and queryable
Person matching (known/unknown) working
Basic REST API serving authenticated requests
End-to-end: Camera → Detection → Database → API

14.5.3 Stage 3: Application Layer (Weeks 9-12)

Objective: Web dashboard, alerting, notifications, and person management operational.

Week	Tasks	Deliverables	Success Criteria
9	Next.js project setup, design system, Tailwind config, dark theme	Frontend foundation	Login page renders correctly
9	Authentication flow: login form, MFA input, token management, logout	Auth UI working	Full login → dashboard flow
10	Dashboard page: stat cards, alert chart, camera grid, activity feed	Dashboard live	All widgets populated with real data
10	Live camera view: HLS player, grid layout, fullscreen, camera controls	Live view working	All 8 streams visible, playable
10	Alert engine: rule evaluation, severity assignment, routing	Alert generation working	Alerts created within 5s of detection
11	Telegram integration: bot setup, message templates, inline keyboards	Telegram alerts working	Test alert received in Telegram
11	WhatsApp integration: template messages, session messages	WhatsApp alerts working	Test template message received
11	Person management: gallery, profile, CRUD, face matching display	Person management working	Person created, detected, viewed
12	Unknown review queue: cluster display, naming, merging, ignore	Review queue working	Unknown person processed through queue
12	Watchlists: CRUD, member management, alert routing	Watchlists working	Watchlist match triggers correct alert
12	WebSocket: real-time alert feed, dashboard updates	Real-time working	Alerts appear without page refresh

Milestone M3 — Application Live (End of Week 12):

Web dashboard accessible with live camera feeds
Alerts generated and delivered via Telegram and WhatsApp
Person management (add, view, match, review unknowns) working
Watchlist alerts functional with correct routing
Real-time updates via WebSocket
All RBAC permissions enforced in UI

14.5.4 Stage 4: Intelligence (Weeks 13-16)

Objective: Night mode, training pipeline, self-learning, and advanced features.

Week	Tasks	Deliverables	Success Criteria
13	Night mode: low-light model training, deployment, auto-scheduling	Night mode working	Detection mAP > 0.75 in < 5 lux conditions
13	AI Vibe Settings page: all 7 controls, auto-save, advanced mode	Settings page working	All controls functional, changes effective immediately
14	Training pipeline: data collection, model training job, evaluation	Training pipeline working	Model accuracy improves with new training data
14	Model versioning: A/B testing, shadow mode, promotion workflow	Model management working	Blue/green model deployment
15	Self-learning service: automatic unknown clustering, suggestions	Self-learning working	Suggestions generated for unknown clusters
15	Privacy mode: face blurring, privacy zones, per-camera settings	Privacy mode working	Faces blurred according to settings
15	Suspicious activity detection: pattern rules, anomaly scoring	Advanced alerts working	Anomaly alerts generated for unusual behavior
16	Search service: face similarity search, text search, filters	Search working	Results returned in < 500ms
16	System health dashboard: service cards, metrics, logs viewer	Health dashboard working	All systems visible with status

Milestone M4 — Intelligence Features Live (End of Week 16):

Night mode detection operational
Training pipeline runs and improves models
Self-learning suggestions appear in review queue
Privacy modes configurable and effective
Suspicious activity alerts functional
Search returns results in acceptable time
All AI Vibe Settings controls operational

14.5.5 Stage 5: Hardening (Weeks 17-20)

Objective: Security hardening, testing framework, operations readiness, production go-live.

Week	Tasks	Deliverables	Success Criteria
17	Security penetration test (external vendor)	Pen test report	All critical/high findings addressed
17	SAST/DAST scans, dependency vulnerability scan	Scan reports	Zero critical vulnerabilities
17	Self-test framework: 21 test suites, scheduling, reporting	Testing framework deployed	All test suites execute successfully
18	Backup configuration: pgBackRest, S3 sync, restore procedures	Backup system ready	Restore test successful
18	DR environment setup, failover procedures, quarterly drill schedule	DR ready	DR failover test: RTO < 1 hour
18	Incident response runbooks: 5 documented procedures	Runbooks complete	All scenarios documented
19	Load testing: 8/16/32/64 camera simulation	Load test report	System handles 64 cameras within SLA
19	Performance tuning: database queries, API response times, cache optimization	Tuning complete	p95 API response < 200ms
19	Operations team training: system overview, runbooks, escalation procedures	Team trained	Training sign-off complete
19	98-item go-live checklist review	Checklist complete	All items pass
20	Final readiness review, security sign-off, management approval	Go approval	All stakeholders sign off
20	Production DNS cutover, monitoring, 72-hour stability period	Production live	72-hour stability confirmed

Milestone M5 — Production Go-Live (End of Week 20):

Security audit complete with all findings addressed
Self-test framework passing (score >= 85)
DR tested and verified (RTO < 1 hour, RPO < 15 minutes)
Operations team trained and runbooks reviewed
Load test passed at 64-camera target
98-item go-live checklist: all items complete
System stable in production for 72+ hours
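
The M5 criteria above lend themselves to an automated go/no-go gate. The following is a sketch only: the `ReadinessSnapshot` fields and the `go_live_blockers` function are hypothetical names, and the thresholds are the ones listed for Milestone M5.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ReadinessSnapshot:
    """Hypothetical summary of the evidence collected before go-live."""
    open_security_findings: int
    self_test_score: float
    rto_minutes: float
    rpo_minutes: float
    ops_training_signed_off: bool
    load_test_cameras: int
    checklist_items_passed: int
    stable_hours: float


def go_live_blockers(snapshot: ReadinessSnapshot) -> List[str]:
    """Return the unmet M5 criteria; an empty list means go."""
    checks = [
        (snapshot.open_security_findings == 0, "security findings still open"),
        (snapshot.self_test_score >= 85, "self-test score below 85"),
        (snapshot.rto_minutes < 60, "RTO not under 1 hour"),
        (snapshot.rpo_minutes < 15, "RPO not under 15 minutes"),
        (snapshot.ops_training_signed_off, "operations training not signed off"),
        (snapshot.load_test_cameras >= 64, "load test below 64 cameras"),
        (snapshot.checklist_items_passed == 98, "go-live checklist incomplete"),
        (snapshot.stable_hours >= 72, "less than 72 hours of stable production"),
    ]
    return [reason for passed, reason in checks if not passed]


if __name__ == "__main__":
    snapshot = ReadinessSnapshot(0, 91.0, 45, 10, True, 64, 98, 80)
    blockers = go_live_blockers(snapshot)
    print("GO" if not blockers else f"NO-GO: {blockers}")
```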

14.6 Kubernetes Manifests Overview

Resource Type	Name	Purpose	Namespace
Deployment	`api`	FastAPI application server (3 replicas)	sentinel
Deployment	`ai-inference`	AI model serving (GPU node)	sentinel
Deployment	`video-capture`	RTSP stream ingestion (edge)	sentinel
Deployment	`alert-engine`	Alert generation and routing	sentinel
Deployment	`notification-service`	Telegram/WhatsApp delivery	sentinel
Deployment	`frontend`	Next.js web application	sentinel
Deployment	`websocket`	WebSocket real-time server	sentinel
StatefulSet	`redis`	Session cache and Pub/Sub	sentinel-data
Service	`api-service`	Internal API access (ClusterIP)	sentinel
Service	`ai-service`	AI inference access (ClusterIP)	sentinel
Service	`frontend-service`	Web app access (ClusterIP)	sentinel
Ingress	`sentinel-ingress`	External HTTPS routing	sentinel
ConfigMap	`app-config`	Application configuration	sentinel
ConfigMap	`nginx-config`	Ingress/Nginx configuration	sentinel
Secret	`app-secrets`	Encrypted secrets (Vault agent injector)	sentinel
Secret	`tls-cert`	TLS certificate (cert-manager)	sentinel
HPA	`api-hpa`	Auto-scale API: 3-10 replicas	sentinel
HPA	`ai-hpa`	Auto-scale AI: 1-4 replicas	sentinel
NetworkPolicy	`default-deny`	Block all unauthorized traffic	sentinel
NetworkPolicy	`allow-api`	API ingress rules	sentinel
NetworkPolicy	`allow-ai`	AI service communication rules	sentinel
PodDisruptionBudget	`api-pdb`	Ensure 2 API pods minimum	sentinel
ServiceMonitor	`api-metrics`	Prometheus scraping config	sentinel-monitoring
PrometheusRule	`alert-rules`	Alerting rules for platform	sentinel-monitoring
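
To make the HPA rows concrete, the sketch below reproduces the replica calculation documented for the Kubernetes Horizontal Pod Autoscaler: the desired count is the current count scaled by the ratio of observed to target metric, rounded up, skipped when the ratio is within a tolerance of 1.0, and clamped to the configured range. The 60% target utilization is an assumption for illustration; this document does not specify the target.

```python
import math

# Replica ranges from the table above.
API_RANGE = (3, 10)
AI_RANGE = (1, 4)

# Assumed CPU utilization target in percent (not specified in this document).
ASSUMED_TARGET_UTILIZATION = 60.0


def desired_replicas(
    current_replicas: int,
    current_utilization: float,
    target_utilization: float,
    min_replicas: int,
    max_replicas: int,
    tolerance: float = 0.1,
) -> int:
    """Apply the HPA scaling rule and clamp to [min_replicas, max_replicas]."""
    ratio = current_utilization / target_utilization
    if abs(ratio - 1.0) <= tolerance:
        # Close enough to the target: keep the current replica count.
        desired = current_replicas
    else:
        desired = math.ceil(current_replicas * ratio)
    return max(min_replicas, min(max_replicas, desired))


if __name__ == "__main__":
    low, high = API_RANGE
    for replicas, utilization in [(3, 90.0), (3, 20.0), (8, 120.0)]:
        result = desired_replicas(
            replicas, utilization, ASSUMED_TARGET_UTILIZATION, low, high
        )
        print(f"api: {replicas} replicas at {utilization}% -> {result}")
```

The clamp is what keeps `api-hpa` from dropping below the 3 replicas that `api-pdb` and the Deployment assume, even when load is very low.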

14.7 VPN Setup Procedure

14.7.1 Cloud VPN Gateway Setup

#!/bin/bash
# cloud-vpn-setup.sh — Run on cloud VPN EC2 instance

# 1. System preparation
sudo apt update && sudo apt install -y wireguard wireguard-tools iptables-persistent

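# 2. Generate WireGuard keys and restrict the private key to root
wg genkey | sudo tee /etc/wireguard/privatekey | wg pubkey | sudo tee /etc/wireguard/publickey
sudo chmod 600 /etc/wireguard/privatekey

# 3. Create WireGuard configuration
#    Replace the <...> placeholders with the real keys before starting the tunnel.
#    eth0 is assumed to be the primary interface; adjust PostUp/PostDown if it differs.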
sudo tee /etc/wireguard/wg0.conf << 'EOF'
[Interface]
Address = 10.200.0.1/24
ListenPort = 51820
PrivateKey = <CLOUD_PRIVATE_KEY>
PostUp = iptables -A FORWARD -i wg0 -j ACCEPT; iptables -t nat -A POSTROUTING -o eth0 -j MASQUERADE
PostDown = iptables -D FORWARD -i wg0 -j ACCEPT; iptables -t nat -D POSTROUTING -o eth0 -j MASQUERADE

# Edge Gateway peer
[Peer]
PublicKey = <EDGE_PUBLIC_KEY>
PresharedKey = <PRESHARED_KEY>
AllowedIPs = 10.200.0.2/32, 192.168.29.0/24
PersistentKeepalive = 25
EOF

# 4. Enable IP forwarding
sudo sysctl -w net.ipv4.ip_forward=1
echo "net.ipv4.ip_forward=1" | sudo tee -a /etc/sysctl.conf

# 5. Start WireGuard
sudo systemctl enable wg-quick@wg0
sudo systemctl start wg-quick@wg0

# 6. Verify (the ping succeeds only after the edge client from 14.7.2 is connected)
sudo wg show
ping -c 3 10.200.0.2

14.7.2 Edge VPN Client Setup

#!/bin/bash
# edge-vpn-setup.sh — Run on Intel NUC edge gateway

# 1. Install WireGuard (wg-quick needs resolvconf to apply the DNS line below)
sudo apt update && sudo apt install -y wireguard wireguard-tools resolvconf

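# 2. Generate keys and restrict the private key to root
wg genkey | sudo tee /etc/wireguard/privatekey | wg pubkey | sudo tee /etc/wireguard/publickey
sudo chmod 600 /etc/wireguard/privatekey

# 3. Configure (replace the <...> placeholders with the real keys and cloud IP)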
sudo tee /etc/wireguard/wg0.conf << 'EOF'
[Interface]
Address = 10.200.0.2/32
PrivateKey = <EDGE_PRIVATE_KEY>
DNS = 10.100.0.2

[Peer]
PublicKey = <CLOUD_PUBLIC_KEY>
PresharedKey = <PRESHARED_KEY>
Endpoint = <CLOUD_PUBLIC_IP>:51820
AllowedIPs = 10.100.0.0/16, 10.200.0.0/24
PersistentKeepalive = 25
EOF

# 4. Start and enable
sudo systemctl enable wg-quick@wg0
sudo systemctl start wg-quick@wg0

# 5. Verify connectivity
ping -c 3 10.200.0.1        # Cloud VPN gateway
ping -c 3 10.100.0.2        # Cloud DNS/internal service
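
Before applying the two configurations, the AllowedIPs lists can be checked offline. WireGuard uses AllowedIPs both to pick the peer for outbound traffic and to filter the source of inbound traffic, so every address a side needs to reach must fall inside the list it holds for the other peer. The sketch below uses only the addresses that appear in 14.7.1 and 14.7.2; the function name `is_routed` is illustrative.

```python
from ipaddress import ip_address, ip_network
from typing import List

# AllowedIPs the cloud gateway holds for the edge peer (14.7.1).
CLOUD_ALLOWED_FOR_EDGE = ["10.200.0.2/32", "192.168.29.0/24"]

# AllowedIPs the edge gateway holds for the cloud peer (14.7.2).
EDGE_ALLOWED_FOR_CLOUD = ["10.100.0.0/16", "10.200.0.0/24"]


def is_routed(address: str, allowed_ips: List[str]) -> bool:
    """True if the address falls inside at least one AllowedIPs network."""
    target = ip_address(address)
    return any(target in ip_network(cidr) for cidr in allowed_ips)


if __name__ == "__main__":
    checks = [
        ("edge -> cloud VPN gateway", "10.200.0.1", EDGE_ALLOWED_FOR_CLOUD),
        ("edge -> cloud DNS", "10.100.0.2", EDGE_ALLOWED_FOR_CLOUD),
        ("cloud -> edge tunnel address", "10.200.0.2", CLOUD_ALLOWED_FOR_EDGE),
        ("cloud -> DVR on camera LAN", "192.168.29.200", CLOUD_ALLOWED_FOR_EDGE),
    ]
    for label, address, allowed in checks:
        status = "ok" if is_routed(address, allowed) else "NOT ROUTED"
        print(f"{label:32s} {address:16s} {status}")
```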

14.8 Database Initialization

14.8.1 Migration Strategy

Database migrations are managed with Alembic (SQLAlchemy) and executed as Kubernetes init containers before application startup:

initContainers:
  - name: db-migrations
    image: sentinel/surveillance-api:v2.3.1
    command: ["alembic", "upgrade", "head"]
    env:
      - name: DATABASE_URL
        valueFrom:
          secretKeyRef:
            name: db-credentials
            key: url
    resources:
      limits:
        cpu: "500m"
        memory: "256Mi"
    securityContext:
      readOnlyRootFilesystem: true
      allowPrivilegeEscalation: false

Migration Rules:

Rule	Implementation
Backward compatibility	All migrations must be backward-compatible within a release
Destructive changes	2-phase deployment: add new column in release N, drop old in release N+1
Automatic execution	Migrations run automatically before application startup via init container
Health check	Migration status exposed via `/health/ready` endpoint
Rollback	`alembic downgrade` script available for emergency rollback
Version tracking	`alembic_version` table tracks current schema version
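
The health-check rule can be sketched as a comparison between the revision stored in `alembic_version` and the revision the running release expects. The example below uses an in-memory SQLite database purely as a stand-in for PostgreSQL so that it runs without external services; the function names and the revision string are hypothetical, while the `version_num` column is the one Alembic maintains.

```python
import sqlite3
from typing import Any, Dict, Optional, Tuple


def current_revision(conn: sqlite3.Connection) -> Optional[str]:
    """Read the applied revision, or None if migrations have never run."""
    try:
        row = conn.execute("SELECT version_num FROM alembic_version").fetchone()
    except sqlite3.OperationalError:
        # The alembic_version table does not exist yet.
        return None
    return row[0] if row else None


def readiness(
    conn: sqlite3.Connection, expected_head: str
) -> Tuple[int, Dict[str, Any]]:
    """Return an HTTP status and body suitable for a /health/ready response."""
    current = current_revision(conn)
    ready = current == expected_head
    body = {"migrations": {"current": current, "expected": expected_head, "ready": ready}}
    return (200 if ready else 503), body


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    print(readiness(conn, "a1b2c3d4e5f6"))  # not ready: table is missing

    conn.execute("CREATE TABLE alembic_version (version_num VARCHAR(32) NOT NULL)")
    conn.execute("INSERT INTO alembic_version VALUES ('a1b2c3d4e5f6')")
    print(readiness(conn, "a1b2c3d4e5f6"))  # ready: revisions match
```

Returning 503 until the revisions match keeps Kubernetes from routing traffic to a pod whose schema is behind the code it runs.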

14.9 SSL Certificate Setup

14.9.1 cert-manager Configuration

# ClusterIssuer for Let's Encrypt production
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-prod
spec:
  acme:
    server: https://acme-v02.api.letsencrypt.org/directory
    email: admin@example.com
    privateKeySecretRef:
      name: letsencrypt-prod-key
    solvers:
      - http01:
          ingress:
            class: nginx
        selector: {}

---
# Certificate resource
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: sentinel-tls
  namespace: sentinel
spec:
  secretName: sentinel-tls-secret
  issuerRef:
    name: letsencrypt-prod
    kind: ClusterIssuer
  dnsNames:
    - app.example.com
    - api.example.com
    - ws.example.com
  usages:
    - digital signature
    - key encipherment
  privateKey:
    algorithm: ECDSA
    size: 256

Section 15: Testing Plan

15.1 Testing Strategy Overview

The testing strategy encompasses five levels of testing, from isolated unit tests to full system end-to-end validation. The goal is comprehensive coverage of all functional and non-functional requirements with automated execution in CI/CD.

┌──────────────────────────────────────────────────────────────────────────────┐
│                         TESTING PYRAMID                                       │
│                                                                              │
│                        ┌─────────┐                                          │
│                        │  E2E    │  ~20 tests                               │
│                        │ Tests   │  Full system scenarios                   │
│                        ├─────────┤                                          │
│                      ┌─────────────┐                                        │
│                      │ Integration │  ~100 tests                            │
│                      │   Tests     │  Service-to-service                    │
│                      ├─────────────┤                                        │
│                   ┌───────────────────┐                                    │
│                   │    Unit Tests      │  ~300 tests                        │
│                   │  (Components, AI)  │  Isolated functions                  │
│                   └───────────────────┘                                    │
└──────────────────────────────────────────────────────────────────────────────┘

15.2 Unit Testing Strategy

Component	Framework	Coverage Target	Mock Strategy	CI Execution
API backend (Python)	pytest + pytest-asyncio	85%+	pytest-mock, moto (AWS), responses (HTTP)	Every commit
Frontend (React/TS)	Vitest + React Testing Library	80%+	MSW (API mocking), jsdom	Every commit
AI models (Python)	pytest	70%+ (model logic)	Mock inference engine, fixture data	Every commit
Database models	pytest + asyncpg	80%+	testcontainers-postgres	Every commit
Notification adapters	pytest	80%+	responses library for HTTP mocking	Every commit
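
To illustrate the mock strategy for notification adapters, here is a minimal sketch. It swaps in the standard library's `unittest.mock` for the `responses` library named in the table so that it runs without extra dependencies. `TelegramAdapter`, its `http_post` parameter, and `send_alert` are hypothetical names; only the `sendMessage` method and its `chat_id`/`text` fields come from the Telegram Bot API.

```python
from unittest.mock import Mock


class TelegramAdapter:
    """Hypothetical adapter; the real one is not shown in this document."""

    def __init__(self, http_post):
        # http_post(method_name, payload) -> dict is an injected transport (assumption).
        self._http_post = http_post

    def send_alert(self, chat_id: int, text: str) -> bool:
        try:
            reply = self._http_post("sendMessage", {"chat_id": chat_id, "text": text})
        except ConnectionError:
            return False  # transport failure is reported to the caller, not raised
        return bool(reply.get("ok"))


def test_send_alert_success():
    http_post = Mock(return_value={"ok": True})
    adapter = TelegramAdapter(http_post)
    assert adapter.send_alert(42, "Camera 3 offline") is True
    http_post.assert_called_once_with(
        "sendMessage", {"chat_id": 42, "text": "Camera 3 offline"}
    )


def test_send_alert_transport_error():
    http_post = Mock(side_effect=ConnectionError("network down"))
    assert TelegramAdapter(http_post).send_alert(42, "test") is False
```

Injecting the transport keeps the adapter testable without network access; the same tests could be rewritten against `responses` once a concrete HTTP client is chosen.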

15.3 Integration Testing

Integration Pair	Scope	Framework	Strategy
API + Database	CRUD operations, transactions, query performance	pytest + testcontainers	PostgreSQL container per test run
API + Redis	Caching, Pub/Sub, session storage	pytest + Redis container	Redis container per test run
API + S3/MinIO	Media upload, download, presigned URLs	pytest + LocalStack	S3 mock via LocalStack
Alert Engine + Router	Rule evaluation, routing decisions	pytest	Mock channel adapters
Telegram Adapter	Message formatting, API calls, error handling	pytest + responses	HTTP request/response mocking
WhatsApp Adapter	Template rendering, API calls, error handling	pytest + responses	HTTP request/response mocking
Auth + Database	User CRUD, password hashing, session management	pytest + testcontainers	Full auth flow testing

15.4 System Testing (End-to-End)

#	Scenario	Steps	Expected Result
1	Full detection pipeline	Trigger motion → verify detection stored → verify alert created → verify notification sent	All components process correctly within SLA
2	Person recognition flow	Known person walks by → verify face detected → verify identity matched → verify no false alert	Correct person identified with > 95% confidence
3	Unknown person flow	Unknown person detected → verify "Unknown" classification → verify review queue updated	Unknown queued for operator review within 5 seconds
4	Watchlist alert (blacklist)	Blacklist person detected → verify immediate critical alert → verify notification to security team	Alert within 5 seconds, correct severity, all channels
5	Night mode detection	Low-light detection scenario → verify night model used → verify detection confidence acceptable	Detection mAP > 0.75 in < 5 lux conditions
6	Privacy mode	Enable privacy mode → verify face blurring in live view → verify no face recognition occurs	Faces blurred, no biometric processing
7	Alert escalation	Create critical alert → don't acknowledge → verify escalation levels trigger at correct times	Level 1 at 5min, Level 2 at 10min, Level 3 at 20min
8	VPN failure recovery	Disconnect VPN → verify local operation continues → reconnect VPN → verify sync resumes	No data loss; automatic recovery
9	Database failover	Trigger RDS failover → verify application continues → verify no data loss	< 60 second downtime; zero data loss
10	Complete user flow	Login → view dashboard → view live cameras → receive alert → acknowledge → logout	All pages load; all actions succeed
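
The escalation thresholds in scenario 7 can be expressed as a small lookup, which is also a convenient oracle for the E2E assertion. The sketch below is illustrative only: `ESCALATION_THRESHOLDS_MIN` and `escalation_level` are hypothetical names, and it assumes time is measured in minutes from alert creation while the alert remains unacknowledged.

```python
from bisect import bisect_right

# Thresholds from scenario 7: Level 1 at 5 min, Level 2 at 10 min, Level 3 at 20 min.
ESCALATION_THRESHOLDS_MIN = (5, 10, 20)


def escalation_level(minutes_unacknowledged: float) -> int:
    """Return 0 (not yet escalated) or the highest escalation level reached."""
    if minutes_unacknowledged < 0:
        raise ValueError("elapsed time cannot be negative")
    # bisect_right counts how many thresholds are <= the elapsed time.
    return bisect_right(ESCALATION_THRESHOLDS_MIN, minutes_unacknowledged)
```

A test can advance a fake clock to 5, 10, and 20 minutes and compare the level reported by the alert engine against this function.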

15.5 Load Testing Plan

Scenario	Camera Count	Duration	Users	Target Metrics
Baseline	8	1 hour	5 concurrent	Establish baseline metrics
Scale-up	16	2 hours	10 concurrent	Verify 2x capacity; p95 latency < 500ms
Scale-up	32	2 hours	20 concurrent	Verify 4x capacity; auto-scaling triggers
Stress test	64	1 hour	50 concurrent	Find breaking point; error rate < 1%
Sustained	8	24 hours	5 concurrent	Memory leak detection; stability verification
Spike test	8→64→8	30 minutes	Ramp up/down	Verify auto-scaling response time
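
The latency and error-rate targets in the table can be checked mechanically once a load run has produced raw samples. The following sketch assumes the nearest-rank definition of a percentile and, for brevity, applies the p95 < 500 ms and error rate < 1% targets together, although the table attaches them to different scenarios. `percentile` and `meets_load_targets` are hypothetical helper names.

```python
import math


def percentile(samples, pct: float) -> float:
    """Nearest-rank percentile: the smallest sample with at least pct% of samples at or below it."""
    ordered = sorted(samples)
    if not ordered:
        raise ValueError("no samples")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based rank
    return ordered[rank - 1]


def meets_load_targets(latencies_ms, errors: int, requests: int,
                       p95_limit_ms: float = 500.0,
                       max_error_rate: float = 0.01) -> bool:
    """True if p95 latency and error rate are both strictly below their limits."""
    if requests <= 0:
        raise ValueError("requests must be positive")
    p95 = percentile(latencies_ms, 95)
    return p95 < p95_limit_ms and (errors / requests) < max_error_rate
```

Load tools often use interpolated percentiles instead of nearest-rank; the two agree closely on large sample sets, but the method should be fixed before comparing runs.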

15.6 Failover Testing

Test Case	Description	Pass Criteria
API pod failure	Kill 1 API pod	Traffic routed to healthy pods; zero failed requests
Database failover	Trigger RDS Multi-AZ failover	< 60s downtime; no data loss; connections re-established
Redis failure	Restart Redis cluster	Session recovery; cache warm within 5 minutes
VPN tunnel failure	Disconnect WireGuard	Auto-reconnect within 30s; streams resume
Edge gateway restart	Reboot edge device	Full recovery within 5 minutes; all streams reconnect
AI inference failure	Kill inference container	Queue buffers frames; recovery < 30s; no frame loss
Complete cloud failure	Simulate region outage	DR test: RTO < 1 hour; RPO < 15 minutes

15.7 Security Testing

Test Type	Tool	Scope	Frequency	Gate
Static Analysis (SAST)	Bandit, Semgrep	Source code	Every commit	Block on HIGH/CRITICAL
Dependency Scan	Snyk, pip-audit	All dependencies	Daily	Block on HIGH/CRITICAL
Container Image Scan	Trivy	Docker images	Every build	Block on HIGH/CRITICAL
Dynamic Analysis (DAST)	OWASP ZAP	Running application	Weekly	Review findings
Penetration Test	External vendor	Full stack	Quarterly	All findings addressed
TLS Configuration	testssl.sh	SSL/TLS endpoints	Monthly	Grade A+ required
API Security	OWASP ZAP API scan	All REST endpoints	Weekly	Review findings
Secrets Scan	TruffleHog, GitLeaks	Git repositories	Every commit	Block on findings
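
The "Block on HIGH/CRITICAL" gate used by several rows above amounts to a simple filter over scanner findings. This is a minimal sketch under the assumption that each scanner's output has already been normalized to dictionaries with a `severity` field; the function name and finding shape are hypothetical and do not reflect any specific tool's report format.

```python
BLOCKING_SEVERITIES = frozenset({"HIGH", "CRITICAL"})


def gate_passes(findings):
    """Return (passed, blockers); the gate fails if any finding is HIGH or CRITICAL."""
    blockers = [
        finding for finding in findings
        if str(finding.get("severity", "")).upper() in BLOCKING_SEVERITIES
    ]
    return (not blockers, blockers)
```

A CI step could exit non-zero when `passed` is false and print the blockers, so that the pipeline stops before the build stage.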

15.8 AI Pipeline Testing

Test	Description	Target Metric	Test Data
Human detection accuracy	Evaluate YOLO on held-out test set	mAP > 0.90	1000 labeled frames
Face detection accuracy	Evaluate SCRFD on test set	Detection rate > 0.85	500 labeled face images
Face recognition accuracy	Evaluate ArcFace on test set	Rank-1 accuracy > 0.95	200 person gallery
False positive rate	Measure incorrect person matches	< 2%	Simulated impostor set
False negative rate	Measure missed person matches	< 5%	Known person test set
Inference latency	Measure end-to-end processing	< 200ms per frame (p95)	Benchmark suite
Night mode accuracy	Test low-light detection	mAP > 0.75	200 low-light frames
Batch processing	Test throughput at batch size 8	> 40 FPS aggregate	Benchmark suite
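
The false positive and false negative targets above can be computed from a list of labeled match trials. The sketch below uses the usual verification definitions (false positives over impostor trials, false negatives over genuine trials); this is an assumption about how the rates are measured here, and `Trial`, `error_rates`, and `meets_targets` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Trial:
    is_genuine: bool  # True if the probe really is the enrolled person
    accepted: bool    # True if the system declared a match


def error_rates(trials):
    """Return (false_positive_rate, false_negative_rate) for a set of trials."""
    impostor = [t for t in trials if not t.is_genuine]
    genuine = [t for t in trials if t.is_genuine]
    if not impostor or not genuine:
        raise ValueError("need at least one impostor and one genuine trial")
    fpr = sum(t.accepted for t in impostor) / len(impostor)
    fnr = sum(not t.accepted for t in genuine) / len(genuine)
    return fpr, fnr


def meets_targets(fpr: float, fnr: float,
                  max_fpr: float = 0.02, max_fnr: float = 0.05) -> bool:
    """Targets from the table: false positives < 2%, false negatives < 5%."""
    return fpr < max_fpr and fnr < max_fnr
```

Both rates depend on the match threshold, so they should be reported together for the same threshold setting.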

15.9 Notification Testing

Test	Description	Verification
Telegram delivery	Send test alert via Telegram	Message received; formatting correct; buttons functional
WhatsApp delivery	Send test alert via WhatsApp	Template message received; parameters correct
Routing rules	Trigger alert matching specific rule	Delivered to correct recipients only
Quiet hours	Send alert during quiet hours	Non-critical suppressed; critical bypasses
Escalation	Leave critical alert unacknowledged	Escalation notifications at correct thresholds
Rate limiting	Trigger burst of 50 alerts	Rate limiting applied; no provider blocks
Media attachments	Send alert with image + video	Media processed to correct size; delivered
Delivery tracking	Verify webhook receipts	Status updated correctly in dashboard
DLQ handling	Force 5 failed deliveries	Messages moved to DLQ; admin notification sent
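
The quiet-hours rule (non-critical suppressed, critical bypasses) is easy to get wrong when the window crosses midnight. The following sketch shows one way to express it; the 22:00 to 07:00 window and the names `in_quiet_hours` and `should_deliver` are assumptions for illustration, not values taken from the platform configuration.

```python
from datetime import time


def in_quiet_hours(now: time, start: time, end: time) -> bool:
    """True if now is in [start, end), including windows that wrap past midnight."""
    if start <= end:
        return start <= now < end
    return now >= start or now < end


def should_deliver(severity: str, now: time,
                   quiet_start: time = time(22, 0),
                   quiet_end: time = time(7, 0)) -> bool:
    """Critical alerts always go out; everything else is held during quiet hours."""
    if severity.lower() == "critical":
        return True
    return not in_quiet_hours(now, quiet_start, quiet_end)
```

The quiet-hours test case can then assert both branches at a boundary time, for example 06:59 versus 07:00.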

15.10 Test Environments

Environment	Data	Purpose	Pipeline Stage
Local dev	Synthetic (10 cameras, 100 persons)	Developer testing	Pre-commit
CI	Synthetic (generated per run)	Automated test execution	Every commit
Staging	Anonymized production-like (8 cameras, 500 persons)	Pre-production validation	Post-merge
Load test	Generated (64 cameras, 10,000 persons)	Performance testing	Weekly schedule
DR	Minimal (2 cameras, 10 persons)	Disaster recovery validation	Quarterly

15.11 CI/CD Pipeline for Testing

┌─────────┐   ┌──────────┐   ┌──────────┐   ┌──────────┐   ┌──────────┐
│  Push   │──▶│   Lint   │──▶│  Unit    │──▶│  SAST    │──▶│  Build   │
│         │   │ + Format │   │  Tests   │   │ + Scan   │   │  Images  │
└─────────┘   └──────────┘   └──────────┘   └──────────┘   └──────────┘
                                                                   │
                                                                   ▼
┌─────────┐   ┌──────────┐   ┌──────────┐   ┌──────────┐   ┌──────────┐
│ Deploy  │◀──│   E2E    │◀──│   DAST   │◀──│  Image   │◀──│  Push    │
│Staging  │   │  Tests   │   │  Scan    │   │  Scan    │   │ Registry │
└─────────┘   └──────────┘   └──────────┘   └──────────┘   └──────────┘
      │
      ▼
┌─────────┐
│ Deploy  │ (Manual approval required)
│   Prod  │
└─────────┘

Stage	Tools	Gate	Duration
Lint + Format	ruff, black, mypy, ESLint, Prettier	Zero lint errors	30s
Unit Tests	pytest, Vitest	80%+ coverage	3 min
SAST + Secrets	Bandit, Semgrep, TruffleHog	No HIGH/CRITICAL	2 min
Build	Docker buildx	Build succeeds	5 min
Image Scan	Trivy, Snyk	No HIGH/CRITICAL CVEs	2 min
DAST	OWASP ZAP	No HIGH/CRITICAL findings	10 min
E2E Tests	Playwright, pytest	All scenarios pass	8 min
Deploy Staging	ArgoCD	Health checks pass	3 min

Section 16: Self-Test Framework

16.1 Framework Architecture

The Self-Test Framework is a standalone FastAPI service that continuously validates platform health and readiness through automated test execution.

┌──────────────────────────────────────────────────────────────────────────────┐
│                     SELF-TEST FRAMEWORK ARCHITECTURE                          │
│                                                                              │
│   ┌─────────────────────────────────────────────────────────────────────┐   │
│   │                    TEST ORCHESTRATOR (FastAPI)                       │   │
│   │                                                                      │   │
│   │   Scheduler        Queue         Executor       Aggregator          │   │
│   │   (cron/APScheduler) │           (asyncio)        │                 │   │
│   │        │            │            │               │                  │   │
│   │   15m health ◄─────┼────────────┼───────────────┤                  │   │
│   │   Daily 3am  ◄─────┼────────────┼───────────────┤                  │   │
│   │   On-demand  ◄─────┼────────────┼───────────────┤                  │   │
│   │                      │            │               │                  │   │
│   │                      ▼            ▼               ▼                  │   │
│   │              ┌─────────────────────────────────────┐                 │   │
│   │              │         Reporter + Storage           │                 │   │
│   │              │  PostgreSQL + S3 (evidence)          │                 │   │
│   │              └─────────────────────────────────────┘                 │   │
│   └─────────────────────────────────────────────────────────────────────┘   │
│                                                                              │
│   ┌─────────────────────────────────────────────────────────────────────┐   │
│   │                     21 TEST SUITES (170+ CASES)                      │   │
│   │                                                                      │   │
│   │   Infrastructure (TC-01..04)    │   Streaming (TC-05..06)           │   │
│   │   Core AI (TC-07..10)           │   Alerts (TC-11..13)              │   │
│   │   Capture (TC-14..15)           │   Search (TC-16)                  │   │
│   │   Training (TC-17)              │   Security (TC-18..19)            │   │
│   │   Resilience (TC-20..21)        │                                   │   │
│   └─────────────────────────────────────────────────────────────────────┘   │
└──────────────────────────────────────────────────────────────────────────────┘

16.2 Test Suite Catalog (21 Suites)

Suite ID	Name	Tests	Priority	Description
TC-INF-01	DVR Connectivity	8	P0	RTSP handshake, stream access, credential validation
TC-INF-02	VPN Health	6	P0	Tunnel status, latency, packet loss, throughput
TC-INF-03	Database Health	8	P0	Connection pool, query performance, replication lag
TC-INF-04	Storage Health	7	P0	Disk space, read/write performance, object storage
TC-STR-05	Camera Stream Access	10	P0	All 8 channels streaming, FPS, bitrate verification
TC-STR-06	Live Streaming	6	P1	HLS stream delivery to browsers, latency check
TC-AI-07	Human Detection	12	P0	YOLO accuracy, confidence thresholds, edge cases
TC-AI-08	Face Detection	10	P0	SCRFD accuracy, face bounding box quality
TC-AI-09	Face Recognition	12	P0	ArcFace embeddings, person matching accuracy
TC-AI-10	Unknown Clustering	8	P1	Face grouping quality, similarity thresholds
TC-ALT-11	Alert Generation	10	P0	Rule evaluation, severity assignment, routing
TC-ALT-12	Telegram Delivery	8	P1	Message delivery, formatting, media, error handling
TC-ALT-13	WhatsApp Delivery	8	P1	Template delivery, session messages, error handling
TC-CAP-14	Image Capture	6	P1	Frame extraction quality, storage, metadata
TC-CAP-15	Video Clip Capture	6	P1	Clip generation, compression, storage
TC-SEA-16	Search Retrieval	8	P1	Face search accuracy, text search, performance
TC-TRA-17	Training Workflow	8	P2	Model retraining, evaluation, deployment
TC-SEC-18	Admin Login Security	10	P0	Auth flow, MFA, session management, brute force
TC-SEC-19	RBAC Enforcement	12	P0	Permission checks, role-based access, resource-level
TC-RES-20	Restart Recovery	8	P1	Service restart, state recovery, data integrity
TC-RES-21	Load Handling	7	P1	8/16/32/64 camera simulation, throughput

Total: 21 suites, 178 test cases (sum of the Tests column)

16.3 Test Scheduling

Schedule	Suites	Trigger	Notification
Every 15 minutes	Infrastructure (TC-01..04)	APScheduler cron	Alert on failure
Daily at 03:00 UTC	All 21 suites	APScheduler cron	Full report via email + Slack
On-demand	Any subset	Admin API call	Immediate report
Post-deployment	Critical path (TC-01,05,07,11,18)	CI/CD webhook	Pipeline gate
Weekly (Sunday 04:00)	Full suite + extended load tests	APScheduler cron	Weekly report
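
The schedule-to-suite mapping above can be sketched as a set of selectors over the suite catalog. The example below is an illustration only: the `Suite` dataclass, the schedule keys, and `select_suites` are hypothetical, and the catalog is truncated to the suites needed to show the 15-minute and post-deployment selections.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Suite:
    suite_id: str
    name: str
    priority: str


# Subset of the 16.2 catalog, enough to illustrate selection.
CATALOG = [
    Suite("TC-INF-01", "DVR Connectivity", "P0"),
    Suite("TC-INF-02", "VPN Health", "P0"),
    Suite("TC-INF-03", "Database Health", "P0"),
    Suite("TC-INF-04", "Storage Health", "P0"),
    Suite("TC-STR-05", "Camera Stream Access", "P0"),
    Suite("TC-AI-07", "Human Detection", "P0"),
    Suite("TC-ALT-11", "Alert Generation", "P0"),
    Suite("TC-SEC-18", "Admin Login Security", "P0"),
]


def suite_number(suite: Suite) -> int:
    """Extract the numeric part of an ID such as 'TC-INF-01'."""
    return int(suite.suite_id.rsplit("-", 1)[1])


SCHEDULE_SELECTORS = {
    "every_15_minutes": lambda s: 1 <= suite_number(s) <= 4,           # infrastructure
    "post_deployment": lambda s: suite_number(s) in {1, 5, 7, 11, 18},  # critical path
    "daily": lambda s: True,                                            # all suites
}


def select_suites(schedule: str, catalog=CATALOG):
    """Return the suite IDs that a given schedule should run."""
    selector = SCHEDULE_SELECTORS[schedule]
    return [s.suite_id for s in catalog if selector(s)]
```

Keeping the selectors as data makes it straightforward for a scheduler job or an admin API handler to resolve the same suite list.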

16.4 Production Readiness Scoring

Base Score: 100.0

Deductions:
  P0 failure:  -20.0 points each
  P1 failure:  -10.0 points each
  P2 failure:  -5.0 points each
  P3 failure:  -2.0 points each

Minimum score: 0.0
Maximum score: 100.0

Verdict	Score Range	Meaning	Recommended Action
GO	95.0 - 100.0	All critical systems healthy	Proceed with confidence
GO WITH CAVEATS	85.0 - 94.9	Minor issues, non-critical	Proceed with monitoring plan
CONDITIONAL GO	70.0 - 84.9	Significant issues	Fix P1 issues before deployment
NO-GO	0.0 - 69.9	Critical failures	Do not deploy; address P0 issues first
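
The deduction rules and verdict bands above translate directly into code. The sketch below applies the flat per-failure deductions literally and clamps the result to the 0 to 100 range; a production scorer may weight or normalize results differently, and the names `readiness_score` and `verdict` are hypothetical.

```python
DEDUCTIONS = {"P0": 20.0, "P1": 10.0, "P2": 5.0, "P3": 2.0}

# Lower bound of each band, checked from the highest band down.
VERDICT_FLOORS = [
    (95.0, "GO"),
    (85.0, "GO WITH CAVEATS"),
    (70.0, "CONDITIONAL GO"),
    (0.0, "NO-GO"),
]


def readiness_score(failures: dict) -> float:
    """failures maps a priority ('P0'..'P3') to its number of failed test cases."""
    penalty = sum(DEDUCTIONS[priority] * count for priority, count in failures.items())
    return max(0.0, min(100.0, 100.0 - penalty))


def verdict(score: float) -> str:
    """Map a score in [0, 100] to its verdict band."""
    if not 0.0 <= score <= 100.0:
        raise ValueError("score must be between 0 and 100")
    for floor, label in VERDICT_FLOORS:
        if score >= floor:
            return label
    raise AssertionError("unreachable: 0.0 is the lowest floor")
```

For example, one P1 failure and one P3 failure give 100 - 10 - 2 = 88.0, which falls in the GO WITH CAVEATS band. Comparing against band floors also avoids the gaps between listed ranges such as 94.9 and 95.0.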

16.5 Report Generation

Format	Use Case	Generation	Retention
JSON API	Programmatic consumption, CI/CD integration	Immediate	90 days
HTML Dashboard	Web-based viewing, trend analysis	~5 seconds	90 days
PDF Report	Email distribution, compliance archiving	~30 seconds	1 year

Section 17: Sample Self-Test Report

17.1 Report Header

================================================================================
           SENTINEL AI SURVEILLANCE PLATFORM — SELF-TEST REPORT
================================================================================

Report ID:          STR-20250116-030015
Generated:          2025-01-16 03:00:15 UTC
Environment:        production
Version:            v2.3.1
Triggered By:       Scheduled (Daily 3:00 AM)
Duration:           18 minutes 42 seconds
Overall Status:     GO WITH CAVEATS

17.2 Executive Summary

Metric	Value
Verdict	GO WITH CAVEATS
Production Readiness Score	94.8 / 100
Total Test Cases	170
Passed	168 (98.8%)
Failed	2 (1.2%)
Skipped	0 (0.0%)
Previous Run Score	97.2 / 100
Score Change	-2.4 (downward)

Priority Breakdown:

Priority	Total	Passed	Failed	Pass Rate
P0 (Critical)	42	42	0	100.0%
P1 (High)	70	68	2	97.1%
P2 (Medium)	38	38	0	100.0%
P3 (Low)	20	20	0	100.0%

17.3 System Metrics at Test Time

Metric	Value	Status
Active Cameras	8 / 8	Online
Stream FPS (avg)	28.5	Normal
AI Inference Latency (p95)	42ms	Normal
Detection Rate (last hour)	47 events	Normal
Database Connections	18 / 100	Healthy
Storage Usage	67%	Healthy
VPN Latency	12ms	Excellent
API Response Time (p95)	78ms	Normal
Telegram Delivery Rate (24h)	99.2%	Healthy
WhatsApp Delivery Rate (24h)	99.8%	Healthy

17.4 Failed Test Cases

Failure 1: TC-ALT-12-004 — Telegram Media Group Delivery

Field	Value
Test Case	TC-ALT-12-004
Suite	Telegram Delivery (TC-ALT-12)
Priority	P1
Status	FAILED
Duration	12,450 ms
Severity	Medium

Description: Verify that media group (multiple images) is delivered correctly via Telegram when an alert contains multiple evidence images.

Expected Result: All 3 images delivered as a media group album within 10 seconds.

Actual Result: Only 2 of 3 images delivered. Third image failed with error: telegram_api_error: Request Entity Too Large (413). Image size after processing: 10.8 MB (exceeds Telegram's 10 MB per-image limit for media groups).

Root Cause: The media processing pipeline resizes images to 1280x720 but does not enforce a hard 10 MB per-image cap for Telegram media groups. The iterative quality reduction loop stops at quality 50 but can still produce files > 10 MB.

Recommended Fix: Add a hard size-cap check after image processing. If an image still exceeds 10 MB once quality has been reduced to 50, apply additional compression (reduce dimensions or switch to WebP format).

Workaround: Single-image delivery mode works correctly. Multi-image alerts temporarily deliver images individually.
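
One way to enforce the hard size cap is to extend the quality loop with a dimension-reduction fallback. The sketch below is deliberately library-agnostic: `encode` is a hypothetical callable that wraps whatever image library the pipeline uses and returns encoded bytes for a given quality and scale factor. The quality and scale ladders, and reading "10 MB" as 10 x 1024 x 1024 bytes, are assumptions.

```python
from typing import Callable, Optional, Sequence

TELEGRAM_MAX_IMAGE_BYTES = 10 * 1024 * 1024  # 10 MB cap cited in the failure analysis


def fit_under_cap(encode: Callable[[int, float], bytes],
                  max_bytes: int = TELEGRAM_MAX_IMAGE_BYTES,
                  qualities: Sequence[int] = (85, 70, 60, 50),
                  scales: Sequence[float] = (1.0, 0.75, 0.5)) -> Optional[bytes]:
    """Return the first encoding that fits under max_bytes, or None if none does.

    Quality is lowered first at full size; only when quality 50 is still too
    large does the loop move on to a smaller scale and try the ladder again.
    """
    for scale in scales:
        for quality in qualities:
            data = encode(quality, scale)
            if len(data) <= max_bytes:
                return data
    return None  # caller decides: skip the image, or fall back to single delivery
```

Returning `None` instead of an oversized payload makes the cap a hard guarantee, so the caller can never submit an image that Telegram would reject with a 413 response.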

Failure 2: TC-RES-20-006 — AI Inference Recovery After Simulated Crash

Field	Value
Test Case	TC-RES-20-006
Suite	Restart Recovery (TC-RES-20)
Priority	P1
Status	FAILED
Duration	65,200 ms
Severity	Medium

Description: Verify that the AI inference service recovers and resumes processing within 60 seconds after a simulated process crash.

Expected Result: AI inference pod restarts and resumes processing frames within 60 seconds for all 8 cameras.

Actual Result: Pod restarted successfully (18 seconds), but detection did not resume for Camera 3 and Camera 7. The other 6 cameras resumed within 45 seconds.

Root Cause: Model warm-up failed because of a race condition in GPU memory allocation during concurrent channel initialization. All 8 channel processors attempt to load the face recognition model simultaneously; on resource-constrained edge hardware, this causes out-of-memory (OOM) errors for the channels that lose the initialization race.

Recommended Fix: Implement shared model loading — load each model once and share across all channel processors. Add initialization semaphore.

Workaround: Manual restart of affected channel processors via admin API.
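
The recommended fix (load each model once, share it, and serialize initialization) can be sketched with `asyncio` primitives. This is an illustration under stated assumptions, not the inference service's actual code: `SharedModelRegistry`, its `loader` callback, and `demo` are hypothetical, and the fake loader stands in for real GPU allocation and warm-up.

```python
import asyncio


class SharedModelRegistry:
    """Load each model at most once and hand the same instance to every channel."""

    def __init__(self, loader, max_concurrent_loads: int = 1):
        self._loader = loader            # async callable: name -> model (assumption)
        self._models = {}
        self._locks = {}
        self._semaphore = asyncio.Semaphore(max_concurrent_loads)
        self.load_count = 0

    async def get(self, name: str):
        if name in self._models:
            return self._models[name]
        lock = self._locks.setdefault(name, asyncio.Lock())
        async with lock:
            # Re-check: another channel may have finished loading while we waited.
            if name not in self._models:
                async with self._semaphore:  # limits simultaneous model loads
                    self._models[name] = await self._loader(name)
                    self.load_count += 1
            return self._models[name]


async def demo(channels: int = 8):
    async def fake_loader(name):
        await asyncio.sleep(0.01)  # stands in for GPU allocation and warm-up
        return object()

    registry = SharedModelRegistry(fake_loader)
    models = await asyncio.gather(
        *(registry.get("face_recognition") for _ in range(channels))
    )
    # Number of loads performed, and number of distinct model instances handed out.
    return registry.load_count, len({id(m) for m in models})
```

Running `asyncio.run(demo())` with 8 channels performs a single load and returns the same instance to every channel, which removes the initialization race described in the root cause.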

17.5 Trending (Last 15 Days)

Date	Score	Verdict	Notes
2025-01-02	96.5	GO	—
2025-01-03	98.2	GO	—
2025-01-04	97.1	GO	—
2025-01-05	98.8	GO	—
2025-01-06	97.5	GO	—
2025-01-07	98.2	GO	—
2025-01-08	96.8	GO	TC-RES-21 had 1 P3 failure
2025-01-09	97.2	GO	—
2025-01-10	98.2	GO	—
2025-01-11	97.5	GO	—
2025-01-12	98.2	GO	—
2025-01-13	97.2	GO	—
2025-01-14	98.2	GO	—
2025-01-15	97.2	GO	—
2025-01-16	94.8	GO WITH CAVEATS	2 P1 failures (see above)

17.6 Conclusion and Recommendations

Verdict: GO WITH CAVEATS

The Sentinel AI Surveillance Platform is operational and safe to use. All 42 P0 (Critical) test cases passed, confirming that core surveillance functions are working correctly.

Two P1 (High) priority issues were identified with documented workarounds. Both fixes are scheduled for v2.3.2.

Recommended Actions:

- Address TC-ALT-12-004: Add aggressive compression for Telegram media group images
- Address TC-RES-20-006: Implement shared model loading in AI inference service
- Monitor Telegram multi-image alert delivery metrics (workaround active)
- Monitor AI inference recovery metrics (manual restart documented in runbook)
- Validate both fixes in next daily test run after v2.3.2 deployment

Section 18: Risks and Mitigations

18.1 Risk Register Summary

#	Category	Risk	Likelihood	Impact	Score	Mitigation	Owner
T1	Technical	DVR disk full (0 bytes free)	High	Critical	20	Auto-rotation at 85%; emergency cleanup; secondary storage	Platform
T2	Technical	AI false positives in low light	Medium	High	12	Night models; adjustable thresholds; operator review	AI Team
T3	Technical	Face rec accuracy with masks/angles	Medium	Medium	9	Multi-angle training; pose normalization	AI Team
T4	Technical	VPN tunnel instability	Medium	High	12	Auto-reconnect; local buffering; redundant endpoints	Platform
T5	Technical	DB performance at scale	Medium	Medium	9	Partitioning; read replicas; archiving	Platform
O1	Operational	Edge hardware failure	Medium	Critical	15	Cold spare; config backup; documented replacement	Operations
O2	Operational	Internet loss at edge site	Medium	High	12	Local storage buffer; 4G failover; local AI continues	Operations
O3	Operational	Operator training gaps	Medium	Medium	9	Training program; inline help; escalation procedures	Operations
O4	Operational	Alert fatigue	Medium	High	12	Escalation rules; alert grouping; severity routing	Operations
S1	Security	Biometric data breach	Low	Critical	10	AES-256-GCM; signed URLs; GDPR deletion; audit	Security
S2	Security	Unauthorized feed access	Low	Critical	10	RBAC; JWT; MFA; session binding; rate limiting	Security
S3	Security	Bot token compromise	Low	High	8	Vault encryption; 180-day rotation; IP allowlist	Security
A1	AI/ML	Model drift over time	Medium	High	12	Monthly evaluation; auto-monitoring; retraining	AI Team
A2	AI/ML	Training data poisoning	Low	Critical	10	Validation; multi-person review; audit trail	AI Team
A3	AI/ML	Demographic bias	Medium	High	12	Diverse data; fairness audits; human-in-loop	AI Team
A4	AI/ML	Edge hardware insufficient	Medium	High	12	CPU models; cloud offloading; GPU upgrade path	AI Team
I1	Integration	DVR firmware incompatibility	Medium	High	12	RTSP compliance check; firmware validation	Engineering
C1	Compliance	GDPR non-compliance	Low	Critical	10	PIA; consent mgmt; right to deletion; DPO	DPO
R1	Resource	Budget overrun	Medium	Medium	9	Reserved instances; cost monitoring; quotas	Finance
R3	Resource	Timeline delay	Medium	High	12	Phased delivery; parallel work; weekly tracking	PMO
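
The Score column is consistent with a likelihood x impact product. The numeric scale in the sketch below (Low = 2, Medium = 3, High = 4, Critical = 5) is inferred from the rows of the table rather than stated anywhere in the register, so treat it as an assumption; `risk_score` is a hypothetical helper name.

```python
# Scale inferred from the register: e.g. High x Critical = 20, Low x High = 8.
LIKELIHOOD = {"Low": 2, "Medium": 3, "High": 4}
IMPACT = {"Medium": 3, "High": 4, "Critical": 5}


def risk_score(likelihood: str, impact: str) -> int:
    """Score a risk as the product of its likelihood and impact ratings."""
    return LIKELIHOOD[likelihood] * IMPACT[impact]
```

For instance, T1 (High likelihood, Critical impact) scores 4 x 5 = 20, and O1 (Medium, Critical) scores 3 x 5 = 15.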

18.2 Critical Risks Requiring Immediate Action

T1 — DVR Disk Full (Score: 20)
- Action: Emergency disk cleanup within 24 hours
- Implement automatic rotation at 85% capacity
- Configure critical alerts at 90%, 95%, 98%
- Owner: Platform Team | Due: 2025-01-17

O1 — Edge Hardware Failure (Score: 15)
- Action: Procure cold spare device
- Document hardware replacement runbook
- Automate configuration restoration from GitOps
- Owner: Operations Team | Due: 2025-02-01

Section 19: Final Implementation Roadmap

19.1 Five-Phase Implementation (20 Weeks)

Phase	Weeks	Name	Theme	Key Milestone
1	1-4	Foundation	Infrastructure, VPN, edge, database	M1: Infrastructure Ready
2	5-8	Core AI Pipeline	Video ingestion, detection, recognition	M2: AI Pipeline Operational
3	9-12	Application Layer	Dashboard, alerts, notifications	M3: Application Live
4	13-16	Intelligence	Night mode, training, self-learning	M4: Intelligence Features
5	17-20	Hardening	Security, testing, operations, go-live	M5: Production Go-Live

19.2 Key Milestones and Deliverables

Milestone	Target Week	Deliverables	Entry Criteria	Exit Criteria
M1 Infrastructure	Week 4	Cloud services, VPN, edge gateway, database, monitoring	Project kickoff, hardware delivered	All services healthy, VPN stable, schema deployed
M2 AI Pipeline	Week 8	Video capture, YOLO, SCRFD, ArcFace, detection DB, API	M1 complete, models ready	All 8 streams ingesting, AI accuracy targets met, API functional
M3 Application	Week 12	Dashboard, alerts, Telegram, WhatsApp, person mgmt, WebSocket	M2 complete, frontend env ready	Dashboard live, alerts delivered, person management working
M4 Intelligence	Week 16	Night mode, training pipeline, self-learning, privacy, search	M3 complete, training data accumulated	All intelligence features operational
M5 Go-Live	Week 20	Security audit, test framework, DR, runbooks, load test, checklist	M4 complete, security audit scheduled	All audits passed, checklist complete, 72h stability

19.3 Phase Details

Phase 1 (Weeks 1-4): VPC, EKS, RDS, Redis, Kafka, S3, WireGuard VPN, edge gateway OS hardening, Docker setup, database schema with migrations, monitoring stack (Prometheus, Grafana).

Phase 2 (Weeks 5-8): RTSP capture service, YOLO human detection, SCRFD face detection, ArcFace face recognition, embedding storage with pgvector, person matching logic, FastAPI core, authentication.

Phase 3 (Weeks 9-12): Next.js frontend, design system, dashboard, live camera view (HLS), alert engine with rules, Telegram Bot API integration, WhatsApp Business API integration, person gallery and profile, unknown review queue, watchlists, WebSocket real-time updates.

Phase 4 (Weeks 13-16): Night mode AI model, AI Vibe Settings page, training pipeline with model versioning, self-learning service for unknown clusters, privacy mode with face blurring, suspicious activity detection, search service (face + text), system health dashboard.

Phase 5 (Weeks 17-20): Penetration testing, SAST/DAST, self-test framework (21 suites), backup/DR setup, incident response runbooks, load testing (8-64 cameras), performance tuning, operations training, go-live checklist (98 items), production cutover, 72-hour stability monitoring.

19.4 Resource Allocation

Phase	Engineering	AI/ML	DevOps	QA	Security
1: Foundation	2	—	2	—	1
2: Core AI	2	2	1	1	—
3: Application	3	1	1	2	—
4: Intelligence	2	2	1	1	—
5: Hardening	2	1	2	2	2

Section 20: Final Production-Readiness Summary

20.1 System at a Glance

Category	Specification
Architecture	Cloud (AWS EKS) + Edge (Intel NUC) + VPN (WireGuard)
Services	12 containerized microservices
Security Zones	5 (Public, App Private, Database, Edge LAN, Camera LAN)
AI Pipeline	YOLO11m (human detection) + SCRFD (face detection) + ArcFace (recognition)
Embeddings	512-dimensional face vectors stored in pgvector
Database	PostgreSQL 15, 29 tables, partitioned, AES-256-GCM encrypted
Web Application	18 pages, dark mode, Next.js 14, real-time WebSocket
Notifications	Telegram Bot API + WhatsApp Business API (dual channel)
Security	TLS 1.3, Argon2id, JWT ES256, TOTP MFA, RBAC (4 roles, 30+ permissions)
Testing	21 test suites, 170+ test cases, automated readiness scoring
Reliability	99.9% uptime target, RTO 1 hour, RPO 15 minutes
Timeline	20 weeks (5 months) to production
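
To make the reliability row concrete, the uptime target can be converted into a downtime budget. This is a simple arithmetic sketch; the 30-day measurement period and the function name are assumptions, since the period over which the 99.9% target is measured is not stated here.

```python
def allowed_downtime_minutes(availability_pct: float, period_days: float = 30) -> float:
    """Downtime budget in minutes for a given availability target and period."""
    if not 0 <= availability_pct <= 100:
        raise ValueError("availability must be a percentage between 0 and 100")
    total_minutes = period_days * 24 * 60
    return total_minutes * (1 - availability_pct / 100)
```

For the 99.9% target, `allowed_downtime_minutes(99.9)` gives roughly 43.2 minutes per 30-day period, so under this simple model a single incident that uses the full 1-hour RTO would exceed that month's budget.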

20.2 Readiness Checklist Summary

Category	Items	Status
Infrastructure	14	Ready to implement
Security	18	Ready to implement
AI/ML Pipeline	15	Ready to implement
Application	16	Ready to implement
Operations	15	Ready to implement
Data & Privacy	10	Ready to implement
Documentation	10	Ready to implement
Total	98	Ready to implement

20.3 Estimated Timeline

Milestone	Target	Duration
M1: Infrastructure Ready	Week 4	4 weeks
M2: AI Pipeline Operational	Week 8	4 weeks
M3: Application Live	Week 12	4 weeks
M4: Intelligence Features	Week 16	4 weeks
M5: Production Go-Live	Week 20	4 weeks
Total to Production	20 weeks	~5 months

Appendices

Appendix A: Cross-Reference to Specialist Documents

Document	Path	Content
Notification System	`/mnt/agents/output/notification_system.md`	Telegram, WhatsApp, routing rules, templates, retry logic
Security Architecture	`/mnt/agents/output/security_architecture.md`	SSL/TLS, auth, RBAC, VPN, secrets, audit, GDPR, checklist
Web UX Design	`/mnt/agents/output/web_ux_design.md`	Design system, 18 pages, navigation, user flows, AI vibe settings
Self-Test Framework	`/mnt/agents/output/self_test_framework.md`	Framework architecture, 21 suites, scheduling, sample report
Operations Plan	`/mnt/agents/output/operations_plan.md`	Monitoring, logging, backup, DR, incident response, runbooks
Architecture	`/mnt/agents/output/architecture.md`	System architecture, data flow, scaling strategy, cost estimates

Appendix B: Acronyms

Acronym	Full Form
AI	Artificial Intelligence
ALB	Application Load Balancer
API	Application Programming Interface
ArcFace	Additive Angular Margin Loss for Deep Face Recognition
CORS	Cross-Origin Resource Sharing
CSP	Content Security Policy
CSRF	Cross-Site Request Forgery
DLQ	Dead Letter Queue
DVR	Digital Video Recorder
EKS	Elastic Kubernetes Service
ES256	ECDSA using P-256 and SHA-256
FFmpeg	Fast Forward MPEG (multimedia framework)
FPS	Frames Per Second
GDPR	General Data Protection Regulation
GPU	Graphics Processing Unit
HLS	HTTP Live Streaming
HPA	Horizontal Pod Autoscaler
HSTS	HTTP Strict Transport Security
JWT	JSON Web Token
LUKS	Linux Unified Key Setup
mAP	mean Average Precision
MFA	Multi-Factor Authentication
mTLS	Mutual TLS
NMS	Non-Maximum Suppression
NUC	Next Unit of Computing
OCSP	Online Certificate Status Protocol
PII	Personally Identifiable Information
PSK	Pre-Shared Key
RBAC	Role-Based Access Control
RDS	Relational Database Service
RPO	Recovery Point Objective
RTO	Recovery Time Objective
RTSP	Real Time Streaming Protocol
S3	Simple Storage Service
SAST	Static Application Security Testing
SCRFD	Sample and Computation Redistribution for Efficient Face Detection
SLA	Service Level Agreement
SQL	Structured Query Language
SSL	Secure Sockets Layer
TLS	Transport Layer Security
TOTP	Time-based One-Time Password
TPM	Trusted Platform Module
UAT	User Acceptance Testing
VPC	Virtual Private Cloud
VPN	Virtual Private Network
WAF	Web Application Firewall
WORM	Write Once Read Many
XSS	Cross-Site Scripting
YOLO	You Only Look Once

End of Document

Document Version: 1.0 Classification: Confidential — Internal Use Only Next Review: 2025-04-16 Owner: Sentinel AI Architecture Team
