Job Title:

Sr SRE

Company: Insight Global

Location: Belgaum, Karnataka

Created: 2025-08-23

Job Type: Full Time

Job Description:

Required Skills & Experience

• 10+ years of experience in SRE or DevOps roles.
• Deep expertise in Kubernetes (deployment, troubleshooting, performance tuning), networking (firewalls, routing, connectivity issues), and relational databases (patching, auditing, performance tuning).
• Strong scripting skills (e.g., Python, Bash) for tooling and automation.
• Experience with operations and development, and the ability to debug at the code layer.
• Proven ability to lead through influence and solve problems across teams.
• Comfortable navigating organizational blockers and driving issues to resolution.
• Experience with incident response and postmortem processes.
• Familiarity with monitoring and observability tools (Splunk Observability, Dynatrace, Datadog, Grafana, Prometheus, etc.).
• Ability to mentor and coach other engineers and development teams.
• Strong communication skills and the ability to explain complex technical issues clearly to both technical and non-technical audiences.
• Ability to work cross-functionally with DBAs, network engineers, developers, and leadership.

Job Description

An employer is looking for an SRE to join their enterprise-level SRE team. They are building a specialized team of Senior Site Reliability Engineers to act as embedded technical experts across their IT organization. This team will be responsible for solving complex production issues, guiding development teams, and building tools that improve system resilience and observability.

This is not a traditional SRE role. You will be a technical leader, coach, and hands-on problem solver who thrives in ambiguity and drives results across organizational boundaries. The role is not on the infrastructure side (no Terraform or server provisioning); it supports applications in production and requires both development and operations skills.

Responsibilities

• Investigate and resolve high-impact production issues across infrastructure and applications.
• Embed with dev teams to guide them through performance, reliability, and architectural challenges.
• Participate in incident response bridges as a technical expert.
• Build tools and scripts to detect vulnerabilities, automate checks, and improve system visibility.
• Conduct post-incident audits and ensure follow-through on remediation.
• Collaborate with DBAs, network engineers, and platform teams to unblock and resolve issues.
• Proactively identify issues and drive them to resolution without waiting for direction.
