Job Title: Incident Manager
Company: Talentoj
Location: Sangli, Maharashtra
Created: 2025-08-23
Job Type: Full Time
Job Description:
As Incident Manager IV, you will be the link between our Support, Engineering and Infrastructure teams. You will enable a better experience for our customers by organizing and driving the investigation of production issues in our application, and by reporting on these to Engineering, Support and other stakeholders. The application is a SaaS product consisting of Spring-based microservices, ML models and data pipelines hosted within the AWS infrastructure. In doing so, you will also have a positive impact on the quality of the product.

We are looking for somebody who is passionate about product quality, has extreme customer empathy, and is constantly looking to improve the quality of our services. This is an engineering position, not a management position.

Role Value: Your work will directly contribute to greater customer satisfaction by providing information about product issues in a timely manner. You will also help our Sales teams by answering technical questions about our infrastructure in customer RFPs.

Key Responsibilities:
- Investigate production issues raised by customers, Support and Engineering
- Work as a liaison between Support and Engineering to facilitate issue resolution and root cause analysis (RCA), and drive the implementation of learnings
- Create and track progress of problem tickets in Jira
- Create incident analysis reports with the support of Engineering teams
- Perform log file analysis with Datadog
- Debug basic REST API calls for investigations
- Execute SQL database queries to provide more information for investigations
- Create and update knowledge base articles in Confluence
- Participate in security audits (PCI DSS, ISO 27001, SOC 2) and prepare supporting evidence

Skills & Qualifications

Must-Have Skills:
- At least 8 years of working experience in IT (SRE, sysadmin, developer, QA, technical support, or similar)
- University degree in a relevant field
- Strong analytical, problem-solving and collaboration skills
- Basic understanding of the systems architecture of cloud-hosted applications
- Data analysis skills: creating and interpreting dashboards to distinguish between real issues and false positives
- Project management and documentation skills with tools such as Jira and Confluence
- Excellent written and verbal communication skills in English
- Knowledge of cloud infrastructure components, preferably AWS
- Experience with REST APIs and tools, e.g. Postman
- Experience with application logging/monitoring tools, e.g. Kibana, Datadog
- Experience with SQL, Linux and network environments
- Willingness to learn new technical skills

Nice-to-Have Skills:
- Understanding of basic ML concepts and LLMs
- Experience with Git or a similar version control system
- Experience with agile software development processes
- Jenkins or a similar CI pipeline
- Bash scripting for Linux
- Basic skills in software development, e.g. Java, Python, JavaScript, Go
- Experience with Docker and microservices
- Network and application security
- Working within a PCI DSS environment