You have most likely used Algolia in the last week without even knowing it.
What about joining the team and enabling more developers to build great search experiences with little worry about the reliability of their search engine?

Site Reliability Engineers (SRE) at Algolia are both software and systems engineers that ensure we can reliably serve over 4 billion queries every day and over 1 trillion queries a year, for users all around the world, despite data centers being on fire and undersea cables being cut.
Since at Algolia we operate many services including our Search API, DocSearch and Analytics, you'll keep learning new things every day and share what you have learned.

The platform we develop uses both cloud and bare-metal systems spanning over 80 data centers in 17 different regions serving hundreds of millions of users from every corner of the globe.
Because search is a critical component of many applications, the SRE team maintains a high level of expertise in system failures in order to prevent them and provide reliable service to our customers.

As a Site Reliability Engineer, you'll actively work with software engineers in application teams to improve the reliability, predictability, and performance of our applications and services.
While part of the application team, you'll closely work with the SRE community of engineers at Algolia and share the knowledge and needs of your application team.

No two problems are the same because all the systems evolve all the time.
We expect you to be a curious problem solver who isn't afraid to think outside of the box and use the knowledge of system interactions in your favor.
When you're ready, you'll also take ownership of complete projects and execute them.
The team is composed of engineers with different backgrounds and experience both in the industry and academia, both senior and junior.
The diversity works in our favor and you should increase it by bringing your experience, your knowledge, and your point of view.
Thinking differently is a plus, not a minus.
We're transparent with each other and with other teams about both our successes and our failures.
This way we learn, we accept our weaknesses and continuously strive to improve both personally and professionally.

YOUR ROLE WILL CONSIST OF:
- Being a team player
- Working with other teams to identify, troubleshoot, and resolve high impact issues
- Evaluating performance of current and future systems, both software and hardware
- Participating in design of new systems
- Developing and maintaining the automation tools used for all systems
- Participating in on-call rotation to ensure fast response to production issues
- Ensuring that infrastructure best practices are followed

YOU MIGHT BE A FIT IF YOU HAVE:
- Collaborative approach to problem solving
- Willingness to make independent decisions and take ownership of them
- 4+ years of software engineering experience
- Knowledge of Shell scripting and at least one scripting language (Python, Ruby, etc.)
- Willingness to learn Go (golang)
- Understanding of Linux systems: I/O, process scheduling, filesystems
- Understanding of computer networks: TCP/IP, DNS, load-balancing
- Proficient spoken and written English skills
- Rigor in high code quality, automated testing, and other engineering best practices

NICE TO HAVE:
- Knowledge of low level principles of computers and network components
- Performance profiling of applications both in development and production
- Knowledge of Public Cloud platforms (AWS, GCP, Azure)
- Knowledge of Go (golang)
- Knowledge of automated integration tests
- Knowledge of Chaos engineering
- Ability to use a configuration management tool like Ansible, Puppet or Chef

We're looking for someone who can live our values:
- GRIT - Problem-solving and perseverance capability in an ever-changing and growing environment
- TRUST - Willingness to trust our co-workers and to take ownership
- CANDOR - Ability to receive and give constructive feedback.
- CARE - Genuine care about other team members, our clients and the decisions we make in the company.
- HUMILITY - Aptitude for learning from others, putting ego aside.