idataraya

Observability & SRE.

Production systems you can see into, with operational practices to match.

We build observability stacks and site reliability practices that let your engineering team understand what production is doing and respond effectively when it misbehaves. Metrics, logs, traces, alerts, on-call rotations, incident response playbooks, and service-level objectives, deployed and operated.

  • Metrics, logs, and distributed tracing stacks
  • SLO-based alerting with meaningful thresholds
  • Incident response runbooks and on-call rotations
  • Post-incident review templates and process
payments-svc · prod · Error budget: 96% remaining
  • Availability: SLO 99.95 · 99.99
  • Apdex: 0.94 · target 0.90
  • Burn rate: 1h · 0.2x · nominal
  • On-call: N. Chong · 0 pages 24h
Telemetry path: service to dashboard · Running
App (OTel SDK) → Collector (batching) → Backend (metrics · logs · traces) → Dashboards (team views) → Paging (routed to on-call)
End-to-end · 4s median ingestion

See production.

A complete observability and SRE practice covers instrumentation, alerting, incident response, and continuous learning. We deploy the tooling and the process.

Every service, the same four signals.

OpenTelemetry-instrumented metrics, logs, and traces from every service, rolling up to dashboards built around latency, errors, saturation, and traffic, not generic host graphs.

checkout-svc · prod · Healthy
  • Latency: p95 · 182ms
  • Errors: 0.04% · budget ok
  • Traces: sampled · 10%
  • Saturation: CPU 48% · mem 61%
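
As a rough illustration of what "the same four signals" can mean in practice, the sketch below summarises one service over one time window. The `Request` record, the `four_signals` function, and the choice of CPU utilisation as the saturation figure are all assumptions made for this example; in a real deployment these numbers would come from the telemetry backend rather than an in-memory list.

```python
from dataclasses import dataclass
from math import ceil


@dataclass
class Request:
    """Hypothetical record of one handled request."""
    duration_ms: float
    ok: bool


def percentile(values, pct):
    """Nearest-rank percentile; pct is in (0, 100]."""
    ordered = sorted(values)
    rank = max(1, ceil(pct * len(ordered) / 100))  # 1-based rank
    return ordered[rank - 1]


def four_signals(requests, window_s, cpu_utilisation):
    """Summarise latency, errors, traffic, and saturation for one window."""
    if not requests:
        raise ValueError("no requests in window")
    failures = sum(1 for r in requests if not r.ok)
    return {
        "latency_p95_ms": percentile([r.duration_ms for r in requests], 95),
        "error_rate": failures / len(requests),
        "traffic_rps": len(requests) / window_s,
        # Saturation is passed in: it is measured on the host, not per request.
        "saturation_cpu": cpu_utilisation,
    }


# Illustrative data: 100 requests over 50 seconds, one of them failed.
sample = [Request(duration_ms=100 + i, ok=(i != 0)) for i in range(100)]
summary = four_signals(sample, window_s=50, cpu_utilisation=0.48)
```

Keeping the summary shape identical for every service is what lets one dashboard layout be reused across teams.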

Alerts tied to SLOs, not CPU graphs.

Every alert traces back to a user-visible objective. Error budgets drive paging thresholds, so on-call wakes up when real reliability slips, not when a host breathes heavily.

SLO burn rate · last 24h
  • checkout · availability: SLO 99.95% · 99.97%
  • checkout · p95 latency: SLO 200ms · 182ms
  • search · availability: SLO 99.9% · 2% budget left
  • search · error rate: SLO 0.5% · 0.31%
4 SLOs · 1 burning fast
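
The arithmetic behind error budgets and burn rates is short enough to sketch. The function names and request counts below are illustrative assumptions, not taken from a real deployment. The default threshold of 14.4 is the commonly cited fast-burn value from the Google SRE Workbook (spending 2% of a 30-day budget in one hour), not a figure stated on this page.

```python
def error_budget(slo_target: float) -> float:
    """Fraction of requests allowed to fail, e.g. 0.9995 -> 0.0005."""
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be strictly between 0 and 1")
    return 1.0 - slo_target


def burn_rate(bad: int, total: int, slo_target: float) -> float:
    """Observed error ratio divided by the allowed error ratio.

    A sustained rate of 1.0 spends the budget exactly over the SLO period.
    """
    if total == 0:
        return 0.0
    return (bad / total) / error_budget(slo_target)


def budget_remaining(bad: int, total: int, slo_target: float) -> float:
    """Share of the period's error budget still unspent (can go negative)."""
    if total == 0:
        return 1.0
    allowed = total * error_budget(slo_target)
    return 1.0 - bad / allowed


def should_page(long_rate: float, short_rate: float, threshold: float = 14.4) -> bool:
    """Page only when both windows burn fast, so brief blips do not wake anyone."""
    return long_rate >= threshold and short_rate >= threshold


# Hypothetical counts for a 99.9% availability SLO.
hour = burn_rate(bad=160, total=10_000, slo_target=0.999)    # last 1h
five_min = burn_rate(bad=15, total=800, slo_target=0.999)    # last 5m
page_now = should_page(hour, five_min)
```

Requiring both a long and a short window to exceed the threshold is one common way to keep pages tied to real budget consumption; the exact windows and thresholds are a per-service decision.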

On-call, with a runbook behind every page.

PagerDuty rotations with humane handover, escalation policies, and every alert wired to a runbook that covers the first fifteen minutes. No more five a.m. pages into a blank terminal.

INC-2847 · P2 · 00:04
  • search · error rate spike
  • Runbook linked · auto-paged A. Lee
INC-2846 · P3 · 01:12
  • checkout · latency warn
  • Auto-resolved · budget intact
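
One way to keep "a runbook behind every page" true over time is a small check that fails when an alert rule has no runbook link. The sketch below assumes a hypothetical `AlertRule` record and uses placeholder example.com URLs. It does not call any incident management tool's API, so in practice the rules would be loaded from wherever alert definitions are stored.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class AlertRule:
    """Hypothetical alert definition; field names are illustrative."""
    name: str
    severity: str
    runbook_url: Optional[str] = None


def missing_runbooks(rules: List[AlertRule]) -> List[str]:
    """Return the names of alert rules that have no usable runbook link."""
    return [
        rule.name
        for rule in rules
        if not (rule.runbook_url and rule.runbook_url.strip())
    ]


rules = [
    AlertRule("search-error-rate", "P2", "https://runbooks.example.com/search-error-rate"),
    AlertRule("checkout-latency", "P3", "   "),   # whitespace only counts as missing
    AlertRule("payments-availability", "P1"),     # no link at all
]
unlinked = missing_runbooks(rules)
```

Run in CI against the alert definitions, a check like this blocks a new alert from shipping until someone has written down the first steps for responding to it.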

Post-incident, structured and blameless.

Every incident ends in a blameless review with action items tracked to closure. Quarterly reliability reports turn recurring pain into architectural decisions, not tribal knowledge.

POSTMORTEM · INC-2841 · Reliability review · Verified
  • Root cause identified: DB connection pool
  • Action items: 4 · all assigned
  • Timeline accuracy: verified by 3 responders
  • Customer impact: 8 min · 214 users
Review shipped · 48h after incident

An observability stack and an SRE practice to run it.

  • Observability platform

    Deployed metrics, logging, and tracing stack with retention, access control, and cost controls configured for your scale.

  • Alerting and SLOs

    Service-level objectives, error budgets, and alerting rules defined per service, tied to meaningful user-facing signals.

  • On-call tooling

    Incident management tool configured with rotations, escalation policies, and runbook links for every alert.

  • Incident response kit

    Runbooks, post-incident review templates, status page automation, and communication playbooks for your engineering team.

Ready to talk about observability & SRE?

Book a discovery call. We will walk through how this fits your business, the likely scope and timeline, and what you will get at the end.