Senior Software Reliability Engineer - Availability & Detection (Remote Across ANZ)

Company:

Canva


Details of the offer

Join the team redefining how the world experiences design. Thanks for stopping by.
We know job hunting can be a little time-consuming and you're probably keen to find out what's on offer, so we'll get straight to the point.

Where and how you can work

Our flagship campus is in Sydney.
We also have a campus in Melbourne and co-working spaces in Brisbane, Perth and Adelaide.
But you have a choice in where and how you work. We trust our Canvanauts to choose the balance that empowers them and their team to achieve their goals.

What you'd be doing in this role

As Canva scales, change continues to be part of our DNA.
But we like to think that's all part of the fun.
This will give you a flavour of the kind of things you'll be working on when you start, but it will likely evolve.
At the moment, this role is focused on:
Designing and implementing processes, tools, automation, and libraries that service teams can use to improve the reliability of the services they own. For instance, adding a new long-awaited feature in our circuit breaker library.
Working with product engineering teams to ensure reliability best practices and tools are rolled out in every service across the whole organization.
Fostering a culture within the Engineering org that puts reliability first and establishes processes and policies that drive reliability within product engineering teams.
Investigating production incidents in depth and applying the learnings to code.
Researching, developing, and justifying the best choices in the form of design docs for tools and processes that will shape the future of reliability at Canva.
Proposing new approaches and solutions to ensure we future-proof Canva's distributed cloud infrastructure as we scale.
Participating in design meetings, hiring interviews, and code reviews.

You're probably a match if

You have advanced coding proficiency in Python/Golang/Java and strong Computer Science and OOP fundamentals.
You have at least 5 years of commercial experience developing complex, distributed web applications.
You have experience diagnosing and addressing issues across the "full stack", including front-end code, backend, network/infrastructure and data layer.
You have a solid understanding of observability principles, such as metrics, logs, tracing, synthetic testing, query construction, dashboarding and alerting.
You have experience guiding others in the principles of incident review, investigation and remedial activity.
You have disciplined coding practices, experience with code reviews and pull requests, and a creative and conceptual problem-solving approach.
You have strong communication and team collaboration skills, both written and verbal.

Nice to have, not required!

Experience in Java is a nice to have. Our platform and infrastructure tooling is primarily written in Python, Go and Terraform.
Experience working with microservice architectures in large containerised, distributed cloud environments (ideally AWS).
Experience working with data warehouse, analytics and reporting tools such as Snowflake, Mode Analytics and Looker.

About the team

This role is with the Availability and Detection team, which sits within the Reliability Platform Group.
The Reliability Platform Group is responsible for providing the tools and processes to scale reliability across all Canva services.

What's in it for you?

Achieving our crazy big goals motivates us to work hard - and we do - but you'll experience lots of moments of magic, connectivity and fun woven throughout life at Canva, too.
We also offer a range of benefits to set you up for every success in and outside of work.
Equity packages - we want our success to be yours too.
Inclusive parental leave policy that supports all parents & carers.
An annual Vibe & Thrive allowance to support your wellbeing, social connection, office setup & more.
Flexible leave options that empower you to be a force for good and take time to recharge, and that support you personally.

Other stuff to know

We make hiring decisions based on your experience, skills and passion, as well as how you can enhance Canva and our culture.
When you apply, please tell us the pronouns you use and any reasonable adjustments you may need during the interview process.
Please note that interviews are conducted virtually.




