Principal Software Engineer, SRE (Mailchimp)

Salary undisclosed

Mailchimp is the leading marketing platform for small businesses. We empower millions of customers around the world to build their brands and grow their companies with a suite of marketing automation, multichannel campaigns, CRM, and analytics tools.

We are seeking a highly skilled Principal Software Engineer focused on Site Reliability to join our dynamic engineering team. In this role, you will be responsible for ensuring the reliability, scalability, and performance of our application, which is used by both internal engineers and external customers. You will collaborate with cross-functional teams to design, implement, and maintain systems that are robust and resilient. You will also lead a cultural shift toward operational excellence across the organization.

We are looking for experienced technologists with deep technical experience grounded in years of hands-on development of high-scale, highly available systems that achieved outstanding levels of operational excellence, and who have taken those learnings and applied them at scale in their organization.

Intuit Mailchimp gives employees the opportunity to collaborate in person with team members in our Atlanta or New York office two or more days per week.

Responsibilities
  • Design, develop, and maintain software and systems to ensure the reliability and scalability of our application.
  • Strategize for system-wide resiliency and for a degraded experience when dependencies are impaired.
  • Write clear and persuasive technical documents and RCAs that have an impact on the engineering community.
  • Determine and implement monitoring and alerting solutions to proactively identify and resolve issues.
  • Collaborate with development teams to ensure best practices in code quality, deployment, and automation.
  • Delegate comfortably to a team while staying hands-on in some areas and maintaining a deep understanding of the end-to-end system.
  • Drive incident response efforts, conduct root cause analysis, and surface operational issues with dependency teams and partners to implement corrective actions.
  • Optimize application performance and resource utilization for operational excellence.
  • Develop and maintain infrastructure as code (IaC) using tools such as Terraform or CloudFormation.
  • Take calculated risks and help teams navigate change.
  • Stay up-to-date with industry trends and emerging technologies to drive continuous improvement.