Tech Ops Engineer - Incident Management, Central Technical Operations Services (C-TOS)

Details of the offer

DESCRIPTION
Amazon is seeking an exceptional Systems Engineer to join our world-class Central Technical Operations Services (C-TOS) team as an Incident Manager. As the first line of defense for maintaining high availability on the Amazon Retail Website, our C-TOS group provides critical incident response and management for the entire Amazon ecosystem. When issues arise that could impact our hundreds of millions of customers worldwide, our skilled Incident Managers spring into action to make events shorter, less frequent, and less severe.
This is immensely important, high-stakes work. The Amazon Retail Website is where we directly engage and delight our global customer base - any disruption can have a real impact on real people. That's why our C-TOS Incident Managers are so vital; leveraging deep operational expertise and the latest incident management tools, they work quickly to mitigate customer-impacting events.
This is an excellent opportunity to join one of Amazon's world-class engineering teams, working alongside some of the best and brightest minds in technology. Our engineers are encouraged to build solutions that enhance our incident management practice, including tooling and processes, as well as fix software problems - and then share those innovations across the organization. You'll have access to mentoring programs, regular tech talks with technical leaders, and well-defined career paths for motivated engineers who want to contribute to our culture of operational excellence and customer-focused innovation. The C-TOS team is globally distributed, with groups in Austin, Dublin, and Sydney providing 24/7 coverage, each working 10-hour shifts for 4 days per week.
Key job responsibilities
Serve as a technical evangelist, leveraging deep expertise to devise innovative solutions to complex business problems.
Drive down mean time to resolution for incidents through proactive monitoring, rapid response, and continuous process improvement.
Design, implement, and optimize world-class event detection, alerting, and incident management systems.
Evolve operations management processes and technologies to accommodate Amazon's rapid growth.
Create, review, and continuously improve documentation, procedures, and knowledge resources.
Identify and resolve recurring platform issues by collaborating cross-functionally with service owners.
Provide exceptional customer service by responding to and resolving requests within defined SLAs.
Participate in a global "follow the sun" rotation, ensuring 24/7 coverage including weekends and holidays.
Contribute to the interviewing and hiring process to build a world-class Incident Management team.

BASIC QUALIFICATIONS
Bachelor's degree in Computer Science, Engineering, or a related technical field; or at least 7 years of relevant experience in a large-scale online operations environment.
Fluent written and verbal communication skills in English, with the ability to effectively collaborate cross-functionally.
Proficient in scripting and automation using at least one interpreted language (e.g. Java, Python, Perl) as well as shell scripting.
Strong working knowledge of Linux operating systems and networking fundamentals.
Proven track record of driving complex, collaborative projects from conception through successful delivery.
Experience with incident management, event detection, and operational excellence in a fast-paced, customer-centric environment.
Ability to thrive in a geographically distributed, "follow the sun" coverage model, including off-hours and weekend work as needed.

PREFERRED QUALIFICATIONS
Experience with distributed systems at scale
Experienced with Agile software development practices, including Scrum ceremonies and continuous improvement
Background in architecting and supporting large-scale, distributed systems
Track record of effectively leading and managing cross-functional incident response efforts
Deep understanding of network technologies and troubleshooting to rapidly resolve complex issues
Ability to collaborate closely with customers during high-pressure problem resolution, while remaining calm and focused
Excellent prioritization, time management, and organizational skills in a fast-paced environment

Acknowledgement of country:
In the spirit of reconciliation Amazon acknowledges the Traditional Custodians of country throughout Australia and their connections to land, sea and community. We pay our respect to their elders past and present and extend that respect to all Aboriginal and Torres Strait Islander peoples today.
IDE statement:
Amazon is committed to a diverse and inclusive workplace. Amazon is an equal opportunity employer, and does not discriminate on the basis of race, national origin, gender, gender identity, sexual orientation, disability, age, or other legally protected attributes.


Nominal Salary: To be agreed

Source: Talent2_Ppc


Built at: 2024-11-24T12:37:34.064Z