Job Title:
Grafana Engineer
Company: Solvex Solutions
Location: Vellore, Tamil Nadu
Created: 2026-05-10
Job Type: Full Time
Job Description:
We are seeking a Grafana Engineer with 5+ years of experience to design, build, and operate cloud-native observability solutions across our platforms. You will lead instrumentation, dashboards, alerting, and SLO/SLA reporting using Grafana and its ecosystem, integrating with cloud services and metrics/logs/tracing backends to deliver reliable, actionable insights for platform and product teams.

Key Responsibilities:
Observability Architecture: Define and implement end-to-end observability patterns (metrics, logs, traces, events, SLOs) using Grafana, Prometheus-compatible systems, and cloud-native services.
Data Sources & Integrations: Configure and manage Grafana data sources (Prometheus, Elastic/OpenSearch, CloudWatch/Azure Monitor/Stackdriver, SQL), enabling cross-system correlation and unified views.
Dashboarding & Visualization: Build reusable, templated, role-based dashboards (Grafana panels, variables, transformations) that provide meaningful KPIs, health checks, and executive reporting.
Alerting & Incident Response: Implement alerting (Grafana Alerting, Alertmanager, ServiceNow/Teams) with noise reduction, deduplication, and escalation policies; contribute to on-call runbooks and post-incident reviews.
SLO/SLI Engineering: Define SLIs and SLOs with error budgets, implement tracking and burn-rate alerts, and partner with service owners to align reliability goals with business outcomes.

Required Qualifications:
Hands-on expertise with Grafana (v11+), including dashboard design, templating, transformations, and Grafana Alerting; experience with Grafana Enterprise features is a plus.
Dashboard automation and development, for example EOSL and sustainability dashboards.
Strong multi-cloud experience: AWS (CloudWatch, EKS), Azure (Azure Monitor, AKS), or GCP (Cloud Monitoring, GKE), and integrating those with Grafana.
Proficiency with metrics/logs/traces backends: Prometheus and/or Mimir/Cortex; Loki for logs; Tempo/Jaeger or OpenTelemetry for tracing; familiarity with Elastic/OpenSearch or Splunk is beneficial.
Kubernetes fundamentals and production operations experience: exporters, service discovery, Helm/Operators, and cluster monitoring patterns.
Solid understanding of SRE/observability principles: SLIs/SLOs, error budgets, runbooks, and incident management workflows.
Infrastructure-as-Code and CI/CD: Terraform (especially the Grafana and cloud providers), GitHub/GitLab, and automated pipeline practices.
Scripting/automation skills in CloudFormation, Terraform, GitHub, Python, Go, or Node; ability to build exporters or transform telemetry as needed.