Site Reliability Engineer
Identity, Access & User Platform
Description:
CORE PROFILE:
We are looking for a Senior Site Reliability Engineer to help us develop solutions and administer a highly available architecture in AWS. You will help us improve our engineering practice, integrate new technologies into existing and future architecture, and develop the technical expertise of the team.
NATURE OF WORK:
- Architect, build, and maintain AWS cloud infrastructure.
- Work closely with development teams to integrate their projects into the production AWS environment and ensure their ongoing support once there.
- Troubleshoot and resolve issues related to application development, deployment and operations.
- Build reliable infrastructure services in AWS from the ground up to deliver highly scalable services.
- Develop tools and utilities to provide short term solutions to immediate needs as required.
- Support and develop automation frameworks (IaC, CI/CD pipelines) and tooling that will be consumed by the entire Enterprise Engineering team.
- Create deployment strategies that allow the team to successfully deploy software and hardware in any environment.
- Identify roadblocks and inefficiencies in the current workflow, and work with teams to plan and implement improvements.
- Research, test, and implement tools to monitor performance, cost, uptime, and availability metrics.
- Provide second-level support for troubleshooting issues.
- Drive project progress and communicate status to partners.
DISPLAYED SKILL MASTERY:
- Communicate Product Service Engineering status effectively to the group
- Drive task completion with other partner groups
- Strong troubleshooting and problem-solving skills, including application and network-level troubleshooting ability
- Strong focus on maintaining a high Quality of Service
- Work in a fast-paced, multi-tasking environment
- Take ownership of all assigned tasks
REQUIRED QUALIFICATIONS:
- Highly technical and analytical, with 2+ years of demonstrated IT experience
- Experience working with and implementing AWS services
- Intermediate scripting experience with languages such as Python, Shell, or Perl
- Hands-on experience in building continuous integration and continuous delivery pipelines using GitLab CI, Jenkins, CodePipeline, etc.
- Experience working with Docker, Kubernetes and other container management solutions
- Experience with infrastructure tools like Terraform and CloudFormation
- Extensive system/network administration background in a large-scale Linux environment
- Experience with monitoring and APM tools such as Splunk, Dynatrace, New Relic or any other monitoring tools/processes
- Ability to handle multiple competing priorities in a fast-paced environment
- Strong automation and problem-solving skills
- Experience with system hardening and implementing security controls. PCI DSS experience is a plus.