SW Engineer - SRE
Job Description
- Engage with product, architecture, development, certification, project management, operations, and infrastructure teams from the start of the SDLC.
- Become a subject matter expert for the assigned product verticals.
- Analyze complex systems from a reliability and resilience perspective.
- Run the production environment by monitoring availability and taking a holistic view of system health.
- Understand the end-to-end product topology from both infrastructure and application perspectives.
- Identify sources of instability in large-scale distributed systems and drive operational excellence. Dive deep into every issue that occurs and own it through to end-to-end closure.
- Perform functional analysis of products by gathering and analyzing metrics from both operating systems and applications to assist in performance tuning and fault finding, including integration and operational challenges.
- Fix code bugs in production and recommend architectural improvements during issue/incident analysis.
- Work closely with development and product teams to suggest new features and enhancements based on live issues.
- Drive down the burden of toil with tooling and automation to achieve operational efficiency and a smoother customer experience.
- Provide technical consultancy for monitoring, incident, and problem management. Lead technical bridges and interact with both technical staff and management during the incident and change management processes.
- Provide Level 3 on-call support (within working hours only, and over weekends once a quarter).
- Engage with technical and non-technical partners on a regular basis to analyze functional and technical solutions in depth.
- Understand new changes to production systems and assess their risk from an application perspective to drive reliability and availability.
- Have a working understanding of network engineering to assist in incident/issue triage.
- Provide guidance and technical expertise to junior team members.
Job Requirements
- Bachelor's degree in Computer Science or another technology field.
- Must have at least 3 years of professional working experience in a Java environment.
- Coding experience beyond simple scripting.
- 3+ years of development experience with Java, SQL, automation, bug fixing, and handling production and application operations.
- Experience working with log analysis tools and observability applications such as Grafana, Tableau, or Splunk.
- Linux systems engineering capabilities and network analysis expertise.
- Knowledge of Docker/Kubernetes, NGINX, CDN, and caching technologies would be a great addition.