Alerts

Alerts notify you when agent runs or content source pulls match specific conditions — failures, performance regressions, error rate spikes, or unusual activity. Configure them once, and Seclai monitors your resources 24/7 so you don't have to.

Overview

The Alerts system has two sides:

  1. Alert Configuration — Rules you define that specify when to trigger a notification (e.g., "alert me after 3 consecutive failures")
  2. Alert Instances — Individual triggered alerts created when a rule fires (e.g., "Agent 'Daily Report' failed 3 times in a row at 2:14 PM")

You manage both from the Alerts section in the left sidebar.

Alert Scopes

Alerts can be configured at three levels:

| Scope | Where to Configure | Applies To |
| --- | --- | --- |
| Account-wide | Alerts → Agent Alerts / Content Source Alerts tabs | All agents or all sources in the account |
| Per-agent | Agent → Alerts tab | A specific agent only |
| Per-source | Content Source → Alerts tab | A specific content source only |

Per-resource alerts override account-wide settings for that resource, giving you fine-grained control.

Example: You might set a conservative account-wide "consecutive failures" threshold of 5, then override it to 2 for a business-critical agent that needs faster notification.
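
The override rule can be sketched in a few lines of Python. This is only an illustration: the dictionary layout, the `resolve_alert_config` helper, and the agent ID are hypothetical, and the sketch assumes a per-agent configuration replaces the account-wide one for that alert type as a whole.

```python
# Hypothetical in-memory representation of alert settings.
ACCOUNT_WIDE = {"consecutive_failures": {"count": 5}}
PER_AGENT = {"financial-reporter": {"consecutive_failures": {"count": 2}}}


def resolve_alert_config(alert_type, agent_id, account_wide, per_agent):
    """Return the settings that apply to one agent for one alert type.

    Assumption: a per-agent entry, when present, wins over the
    account-wide entry for the same alert type.
    """
    override = per_agent.get(agent_id, {}).get(alert_type)
    if override is not None:
        return override
    return account_wide.get(alert_type)


print(resolve_alert_config("consecutive_failures", "financial-reporter",
                           ACCOUNT_WIDE, PER_AGENT))
print(resolve_alert_config("consecutive_failures", "some-other-agent",
                           ACCOUNT_WIDE, PER_AGENT))
```

The first call returns the per-agent threshold of 2, the second falls back to the account-wide threshold of 5.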


Agent Alert Types

Five alert types are available for monitoring agent runs:

Run Failed

Triggers an alert every time an agent run fails, regardless of context. This is the simplest alert type — useful for agents where any failure is unacceptable.

| Setting | Value |
| --- | --- |
| Thresholds | None (every failure triggers) |
| Best for | Critical, low-volume agents |

Example use case: A financial reporting agent that runs once daily. Every failure must be investigated immediately.

Consecutive Failures

Triggers an alert when an agent fails a configurable number of times in a row. This filters out one-off transient errors and only alerts on persistent issues.

| Setting | Range | Default | Description |
| --- | --- | --- | --- |
| Count | 2 – 100 | 3 | Number of consecutive failures before alerting |

Example use case: A web scraping agent that occasionally encounters timeout errors. Setting the count to 3 means you're only alerted when the upstream site is genuinely down, not when a single request times out.

Example configuration:

Alert type: Consecutive Failures
Count: 3
Cooldown: 60 minutes

→ Alert fires after the 3rd failure in a row
→ Won't fire again for at least 60 minutes
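
As a rough sketch of the streak check, the condition amounts to counting failures at the end of the run history. The function names are hypothetical, run outcomes are modeled as booleans (`True` for success), and cooldown handling is left out here.

```python
def trailing_failures(outcomes):
    """Count how many runs at the end of the history failed in a row."""
    streak = 0
    for succeeded in reversed(outcomes):
        if succeeded:
            break
        streak += 1
    return streak


def consecutive_failures_met(outcomes, count=3):
    """True when the most recent `count` runs (or more) all failed."""
    return trailing_failures(outcomes) >= count


# One success in the middle resets the streak, so this does not alert.
print(consecutive_failures_met([False, True, False, False], count=3))
# Three failures in a row at the end meet the threshold.
print(consecutive_failures_met([True, False, False, False], count=3))
```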

Error Rate Spike

Triggers when the failure rate within a sliding window of runs exceeds a threshold. Ideal for high-volume agents where some failures are tolerable but a spike indicates a systemic problem.

| Setting | Range | Default | Description |
| --- | --- | --- | --- |
| Rate | 0.01 – 1.0 | 0.5 | Failure rate threshold (e.g., 0.5 = 50%) |
| Window runs | 5 – 1,000 | 20 | Number of recent runs to evaluate |

Example use case: A customer support agent that handles hundreds of conversations daily. A 5% failure rate is normal, but if it spikes to 50% across the last 20 runs, something is wrong.

Example configuration:

Alert type: Error Rate Spike
Rate: 0.25 (25%)
Window runs: 50
Cooldown: 120 minutes

→ Alert fires when 13+ of the last 50 runs failed (26% or more, exceeding the 25% threshold)
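
The arithmetic behind this configuration can be checked with a short sketch. It assumes "exceeds" means strictly greater than the configured rate, and that nothing fires until a full window of runs exists; both are assumptions, and the function names are hypothetical.

```python
import math


def error_rate_exceeded(outcomes, rate=0.5, window_runs=20):
    """Check the failure rate over the most recent `window_runs` runs.

    `outcomes` is a chronological list of booleans (True = success).
    Assumption: evaluation waits until the window is full.
    """
    recent = outcomes[-window_runs:]
    if len(recent) < window_runs:
        return False
    failures = sum(1 for succeeded in recent if not succeeded)
    return failures / len(recent) > rate


def min_failures_to_fire(rate, window_runs):
    """Smallest failure count whose rate is strictly above `rate`."""
    return math.floor(rate * window_runs) + 1


print(min_failures_to_fire(0.25, 50))
```

With a rate of 0.25 and a window of 50, 12 failures is 24% and 13 failures is 26%, so 13 is the smallest count that crosses the threshold.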

Run Burst

Triggers when too many runs start within a short time window. This detects unusual activity — accidental loops, abuse, or misconfigured triggers.

| Setting | Range | Default | Description |
| --- | --- | --- | --- |
| Max runs | 2 – 10,000 | 50 | Maximum allowed runs in the time window |
| Window minutes | 1 – 1,440 | 10 | Length of the evaluation window (in minutes) |

Example use case: An agent triggered by content updates. If a bulk import accidentally adds 500 items at once, the run burst alert fires so you can investigate before consuming thousands of credits.

Example configuration:

Alert type: Run Burst
Max runs: 100
Window minutes: 15
Cooldown: 30 minutes

→ Alert fires when more than 100 runs start within any 15-minute window
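
One way to picture burst detection is a sliding window over run start times. The sketch below is illustrative only: `burst_detected` is a hypothetical helper, and it assumes the alert fires once the number of runs in a window goes above Max runs (the exact boundary is not specified here).

```python
from collections import deque
from datetime import datetime, timedelta


def burst_detected(start_times, max_runs=50, window_minutes=10):
    """Return True if any window of `window_minutes` holds more than
    `max_runs` run starts."""
    window = timedelta(minutes=window_minutes)
    recent = deque()
    for started in sorted(start_times):
        recent.append(started)
        # Drop starts that are a full window (or more) older than this one.
        while started - recent[0] >= window:
            recent.popleft()
        if len(recent) > max_runs:
            return True
    return False


base = datetime(2024, 1, 1, 12, 0)
bulk_import = [base + timedelta(seconds=i) for i in range(500)]
print(burst_detected(bulk_import, max_runs=100, window_minutes=15))
```

The 500 runs started within a few minutes in this example stand in for the bulk-import scenario described above.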

Slow Run

Triggers when a run takes significantly longer than the agent's historical p95 (95th percentile) duration. This catches performance regressions that might indicate upstream issues, model slowdowns, or inefficient step configurations.

| Setting | Range | Default | Description |
| --- | --- | --- | --- |
| P95 multiplier | 1 – 10 | 2 | How many times the p95 duration a run must exceed to trigger |
| Min duration (seconds) | 0 – 86,400 | 30 | Minimum run duration to even consider (filters out fast runs) |
| Min historical runs | 1 – 1,000 | 10 | Minimum number of past runs needed to calculate a reliable p95 |

Example use case: A retrieval agent that normally completes in 5 seconds. If a model provider experiences latency issues and runs start taking 30+ seconds, the slow run alert fires.

Example configuration:

Alert type: Slow Run
P95 multiplier: 2.5
Min duration: 10 seconds
Min historical runs: 20
Cooldown: 60 minutes

→ Alert fires when a run exceeds 2.5× the p95 duration
→ Only if the run took at least 10 seconds
→ Only once at least 20 historical runs exist for comparison
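
The three conditions above combine roughly as follows. This is a sketch under stated assumptions: the percentile uses the nearest-rank method (how Seclai computes p95 is not specified here), and the function names are hypothetical.

```python
def p95(durations):
    """Nearest-rank 95th percentile of a non-empty list of durations."""
    ordered = sorted(durations)
    rank = (95 * len(ordered) + 99) // 100  # ceil(0.95 * n) in integers
    return ordered[rank - 1]


def is_slow_run(duration, history, multiplier=2.0,
                min_duration=30.0, min_historical_runs=10):
    """Apply the three Slow Run conditions to one finished run.

    `duration` and `history` are in seconds.
    """
    if len(history) < min_historical_runs:
        return False  # not enough history for a reliable p95
    if duration < min_duration:
        return False  # fast runs are ignored regardless of the ratio
    return duration > multiplier * p95(history)


history = [5.0] * 19 + [6.0]  # 20 past runs, p95 is 5.0 seconds
print(is_slow_run(30.0, history, multiplier=2.5,
                  min_duration=10, min_historical_runs=20))
```

Here the threshold is 2.5 × 5.0 = 12.5 seconds, so a 30-second run qualifies while a 12-second run would not.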

Source Alert Types

Three alert types are available for monitoring content source pulls. The terminology adapts based on the source type:

  • Websites and RSS feeds use the term "pull"
  • File upload sources use the term "upload"

Pull Failed

Triggers every time a source pull (or upload) fails.

| Setting | Value |
| --- | --- |
| Thresholds | None (every failure triggers) |
| Best for | Critical sources where freshness matters |

Consecutive Pull Failures

Triggers after a configurable number of consecutive pull failures.

| Setting | Range | Default | Description |
| --- | --- | --- | --- |
| Count | 2 – 100 | 3 | Consecutive failures before alerting |

Example use case: An RSS feed that occasionally returns 503 during maintenance windows. Set count to 3 so you're only alerted when the feed is persistently unreachable.

Pull Error Rate Spike

Triggers when the pull failure rate in a window exceeds a threshold.

| Setting | Range | Default | Description |
| --- | --- | --- | --- |
| Rate | 0.01 – 1.0 | 0.5 | Failure rate threshold |
| Window pulls | 3 – 100 | 10 | Number of recent pulls to evaluate |

Configuring Alerts

Creating an Alert Configuration

  1. Navigate to Alerts in the left sidebar (for account-wide), or open an Agent / Content Source and click the Alerts tab
  2. Find the alert type you want to enable
  3. Toggle the switch to activate it
  4. Configure the threshold settings for your use case
  5. Set the cooldown period
  6. Choose the notification recipients
  7. Click Save Settings

Threshold Settings

Every alert type has threshold parameters that control when it fires. These vary by type — see the detailed tables in each alert type section above. All thresholds have validated ranges to prevent misconfiguration.
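
For a sense of what range validation looks like, here is a minimal sketch built from the agent alert ranges listed above. The setting keys and the `validate_threshold` helper are hypothetical, and the sketch assumes both ends of each range are inclusive.

```python
# (alert type, setting) -> (minimum, maximum), taken from the tables above.
THRESHOLD_RANGES = {
    ("consecutive_failures", "count"): (2, 100),
    ("error_rate_spike", "rate"): (0.01, 1.0),
    ("error_rate_spike", "window_runs"): (5, 1000),
    ("run_burst", "max_runs"): (2, 10000),
    ("run_burst", "window_minutes"): (1, 1440),
    ("slow_run", "p95_multiplier"): (1, 10),
    ("slow_run", "min_duration_seconds"): (0, 86400),
    ("slow_run", "min_historical_runs"): (1, 1000),
}


def validate_threshold(alert_type, setting, value):
    """Return `value` if it is within range, otherwise raise ValueError."""
    low, high = THRESHOLD_RANGES[(alert_type, setting)]
    if not low <= value <= high:
        raise ValueError(
            f"{alert_type}.{setting} must be between {low} and {high}, "
            f"got {value}"
        )
    return value


validate_threshold("slow_run", "p95_multiplier", 2.5)  # accepted
```

A count of 1 for Consecutive Failures, for example, would be rejected because the minimum is 2.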

Cooldown Period

The cooldown controls how long Seclai waits after firing an alert before it can fire the same alert again. This prevents alert fatigue during extended outages.

| Setting | Range | Default |
| --- | --- | --- |
| Cooldown | 1 – 1,440 minutes | 60 minutes |

Example: With a 60-minute cooldown, if a "Run Failed" alert fires at 2:00 PM, the next alert of the same type won't fire until after 3:00 PM, even if additional failures occur in between.
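
The cooldown check itself is a simple time comparison. In this sketch, `cooldown_elapsed` is a hypothetical helper, and it assumes the alert becomes eligible again once the full cooldown has passed; the behavior at the exact boundary is an assumption.

```python
from datetime import datetime, timedelta


def cooldown_elapsed(last_fired_at, now, cooldown_minutes=60):
    """True if the same alert is allowed to fire again at `now`."""
    if last_fired_at is None:
        return True  # the alert has never fired
    return now - last_fired_at >= timedelta(minutes=cooldown_minutes)


fired = datetime(2024, 1, 1, 14, 0)  # alert fired at 2:00 PM
print(cooldown_elapsed(fired, datetime(2024, 1, 1, 14, 30)))  # 2:30 PM
print(cooldown_elapsed(fired, datetime(2024, 1, 1, 15, 1)))   # 3:01 PM
```

Failures at 2:30 PM are suppressed, while a failure at 3:01 PM can raise a new alert.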

Disabling an Alert

Toggle the switch off to disable an alert configuration without deleting it. Your settings (thresholds, cooldown, recipients) are preserved, and you can re-enable the alert at any time.

Removing an Alert

Click Remove (red button) to permanently delete an alert configuration. This cannot be undone.


Alert Lifecycle

When an alert configuration fires, it creates an alert instance that progresses through a lifecycle:

Triggered → Acknowledged → Resolved
                ↘ Dismissed

| Status | Meaning | Color |
| --- | --- | --- |
| Triggered | Alert has fired and needs attention | Red |
| Acknowledged | Someone is investigating the issue | Yellow |
| Resolved | The issue has been fixed | Green |
| Dismissed | The alert was a false positive or not actionable | Gray |

Status Transitions

| From | To | When to Use |
| --- | --- | --- |
| Triggered | Acknowledged | You've seen the alert and are investigating |
| Triggered | Resolved | The issue was already fixed or resolved itself |
| Triggered | Dismissed | False positive or not worth investigating |
| Acknowledged | Resolved | Investigation complete, issue fixed |
| Acknowledged | Dismissed | Investigation revealed a false positive |

Each status change can include an optional note explaining the transition.
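
The transitions listed above form a small state machine, which could be modeled like this. The status strings and the `change_status` helper are hypothetical, and the sketch assumes Resolved and Dismissed are final because no transitions out of them are listed.

```python
ALLOWED_TRANSITIONS = {
    "triggered": {"acknowledged", "resolved", "dismissed"},
    "acknowledged": {"resolved", "dismissed"},
    "resolved": set(),   # assumed terminal
    "dismissed": set(),  # assumed terminal
}


def change_status(current, new, note=None):
    """Validate a status change and return a history entry for it."""
    if new not in ALLOWED_TRANSITIONS.get(current, set()):
        raise ValueError(f"cannot move alert from {current!r} to {new!r}")
    return {"from": current, "to": new, "note": note}


entry = change_status("triggered", "acknowledged", note="Looking into it")
print(entry)
```

Attempting `change_status("resolved", "triggered")` raises a `ValueError` under these assumptions.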

Filtering Alerts

The alerts list supports filtering by:

  • Status — All, Triggered, Acknowledged, Resolved, or Dismissed
  • Time frame — Same time frame selector used across the Dashboard

The table displays:

| Column | Description |
| --- | --- |
| Status | Color-coded badge |
| Type | Alert type (e.g., "Consecutive Failures") |
| Title | Human-readable summary |
| Triggered | Date and time the alert fired |
| Comments | Number of comments on the alert |
| Subscribers | Number of users subscribed to updates |

Click any row to open the alert detail page.


Alert Detail Page

Each alert instance has a dedicated detail page with full context and collaboration features.

Metadata

  • Alert type — Which alert rule triggered
  • Triggered date — When the alert was created
  • Updated date — Last time the alert was modified
  • Current status — Color-coded badge

Structured Details

The detail view shows context specific to each alert type:

| Alert Type | Details Shown |
| --- | --- |
| Run Failed | Failed step ID, step type, error message |
| Consecutive Failures | Failed step info, consecutive failure count |
| Error Rate Spike | Current error rate, failed/total runs, configured threshold |
| Run Burst | Run count, window duration, configured threshold |
| Slow Run | Run duration, p95 duration, multiplier, slow threshold, historical run count |

Status History

A timeline tracks every status change, showing:

  • Who made the change (user name)
  • When it happened (timestamp)
  • The status transition
  • Any note attached to the change

Comments

Add comments to discuss the alert with your team:

  1. Type your message in the text area
  2. Click Comment

Comments show the author's name, timestamp, and message body. Use them to document investigation findings, root cause analysis, or remediation steps.

Subscriptions

Subscribe to an alert instance to receive updates when its status changes or new comments are added.

  • Subscribe — Click the Subscribe button to follow the alert
  • Unsubscribe — Click Unsubscribe to stop receiving updates
  • The subscriber count is visible in the alert list and detail page

Notifications

When configuring an alert, you choose who receives email notifications when it fires.

Personal Accounts

Notifications are sent to the email associated with your account. No additional configuration needed.

Organization Accounts

Three distribution options:

| Option | Who Receives Notifications |
| --- | --- |
| Account owner only | Only the account owner |
| Owner & administrators | Owner plus all users with administrator role |
| Selected members | Specific organization members you choose |

When "Selected members" is chosen, a searchable multi-select dropdown appears where you pick individual team members.

Example: For a critical production agent, select "Owner & administrators" so the on-call team is always notified. For a lower-priority staging agent, select "Account owner only."
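
The three distribution options can be thought of as filters over the organization's member list. The sketch below is illustrative: the option keys, the role names, and the member dictionaries are all hypothetical.

```python
def notification_recipients(option, members, selected_emails=()):
    """Return the email addresses that should receive an alert.

    `members` is a list of dicts with "email" and "role" keys.
    """
    if option == "owner_only":
        return [m["email"] for m in members if m["role"] == "owner"]
    if option == "owner_and_admins":
        return [m["email"] for m in members
                if m["role"] in ("owner", "administrator")]
    if option == "selected_members":
        chosen = set(selected_emails)
        return [m["email"] for m in members if m["email"] in chosen]
    raise ValueError(f"unknown distribution option: {option!r}")


team = [
    {"email": "owner@example.com", "role": "owner"},
    {"email": "admin@example.com", "role": "administrator"},
    {"email": "dev@example.com", "role": "editor"},
]
print(notification_recipients("owner_and_admins", team))
```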


Permissions

| Role | View Alerts | Configure Alerts | Change Status / Comment |
| --- | --- | --- | --- |
| Owner | | | |
| Admin | | | |
| Editor | | | |
| Viewer | ❌ Redirected | ❌ 403 Forbidden | ❌ 403 Forbidden |

Examples

Example: Critical Agent Monitoring

For a daily financial reporting agent that must never fail silently:

Account-wide alerts (baseline):
  • Consecutive Failures: count=5, cooldown=120 min
  • Error Rate Spike: rate=0.3, window=50, cooldown=60 min

Per-agent overrides (Financial Reporter):
  • Run Failed: enabled, cooldown=10 min
  • Consecutive Failures: count=2, cooldown=15 min
  • Slow Run: multiplier=1.5, min_duration=60s, cooldown=30 min

Notification: Owner & administrators

Example: High-Volume Agent with Burst Detection

For a customer support bot handling thousands of runs per day:

Per-agent configuration:
  • Error Rate Spike: rate=0.10 (10%), window=100, cooldown=60 min
  • Run Burst: max_runs=500, window=5 min, cooldown=30 min
  • Slow Run: multiplier=3, min_duration=15s, min_historical_runs=50, cooldown=120 min

Notification: Selected members → [ops-team@company.com, lead@company.com]

Example: Content Source Freshness Monitoring

For RSS feeds that must stay current:

Account-wide source alerts:
  • Consecutive Pull Failures: count=3, cooldown=60 min
  • Pull Error Rate Spike: rate=0.5, window=10, cooldown=120 min

Per-source overrides (Primary News Feed):
  • Pull Failed: enabled, cooldown=15 min
  • Consecutive Pull Failures: count=2, cooldown=30 min

Notification: Account owner only

Next Steps

  • Dashboard — Monitor aggregated metrics for agents, sources, and credits
  • Agents — Learn about creating and configuring agents
  • Content Sources — Set up sources for your knowledge bases
  • Organizations — Manage team members and notification recipients