Staff SRE, Performance & Reliability at Fastly
Fastly
San Francisco, CA
Administration
Posted 0 days ago
Job Description
Fastly helps people stay better connected with the things they love. Fastly's edge cloud platform enables customers to create great digital experiences quickly, securely, and reliably by processing, serving, and securing our customers' applications as close to their end-users as possible, at the edge of the Internet. The platform is designed to take advantage of the modern internet, to be programmable, and to support agile software development. Fastly's customers include many of the world's most prominent companies, including Vimeo, Pinterest, The New York Times, and GitHub.
We're building a more trustworthy Internet. Come join us.
Posting Open Date: Dec. 8, 2025
Anticipated Posting Close Date*: Jan. 6, 2026
*Job posting may close early due to the volume of applicants.
Staff SRE, Performance & Reliability
We're seeking a versatile and experienced Site Reliability Engineer who thrives in a fast-paced, high-scale environment and is passionate about reliability, performance, automation, and tooling. Reporting to the VP of Performance Center Operations, you'll serve as a key individual contributor within the Performance Center Operations team. The Fastly Performance Center is the strategic and operational engine that ensures the highest level of performance for the most demanding workloads on the Internet. We proactively safeguard quality of service at global scale, drive technical and product strategies that shape our platform's evolution, and directly influence revenue outcomes by ensuring our customers succeed.
You'll partner cross-functionally across Engineering, Infrastructure, Product, Revenue, and Account teams to build tooling and processes that drive scale, availability, and intelligent automation. Your work will help ensure Fastly remains the most performant, trusted, and customer-aligned partner in the industry.
The scope of this role will evolve with the needs of the business and the maturity of the program.
Additional responsibilities may be assigned based on individual expertise and strategic priorities from the Office of the Founder & CTO.
What You'll Do:
This role is approximately 35% Site Reliability Engineering, 35% Data Analysis / Traffic Insights, and 30% Cross-functional Operations, balancing technical expertise with collaboration and strategic impact.
Drive the development of automation and observability tooling that improves operational efficiency and platform reliability, including traffic monitoring, alerting, and surveillance tools.
Partner with observability teams to implement and improve existing dashboards (Grafana, Prometheus) and metrics pipelines that provide meaningful visibility into traffic patterns, surges, and seasonal trends.
Help define SLIs/SLOs and improve monitoring frameworks, ensuring alerts and dashboards reflect operational reality and proactively surface issues before customer impact.
Collaborate with data/analytics teams to leverage data pipelines (e.g. SQL, BigQuery, or other large-scale data stores) for trend analysis, capacity planning, and traffic pattern recognition.
Step in to run daily operational standups or coordination meetings as needed.
Ensure priorities are clear, follow-ups are tracked, and cross-functional execution maintains momentum.
Facilitate cross-team communication during high-impact initiatives or incident reviews, surfacing blockers early and maintaining execution momentum.
Perform root-cause investigations of performance, scalability, or traffic anomalies, and translate learnings into improvements in tooling and architecture.
Act as a technical liaison, helping contextualize traffic behavior, system performance, and support escalations with clear insight.
Help define and evolve run-books, incident response processes, post-mortems, and the knowledge base, ensuring that repeated issues are proactively surfaced and addressed via automation or tooling rather than reactive firefighting.
Provide leadership in incident response, mitigation, and communication across teams.
Monitor seasonal patterns, major events, and global traffic distribution, helping ensure the platform remains resilient during shifts in demand.
What We're Looking For:
8 years of experience in Site Reliability Engineering, Systems Engineering, Platform/Infrastructure Engineering, or equivalent roles.
Professional experience operating in CDN, streaming media, or other high-volume internet traffic environments.
Deep understanding of network/distributed/cloud systems: TCP/IP, DNS, HTTP/S, TLS, and caching/proxy/CDN technologies; direct experience in CDN, Web Application, and API Security a plus.
Demonstrated ability to build automation, tooling, and observability systems, e.g. dashboards, alerts, instrumentation, and data pipelines. Experience with Prometheus, Grafana, BigQuery/SQL, etc.
Hands-on experience with scripting or programming (e.g. Python, Go, Shell), and comfortable building tooling rather than just consuming it.
Experience working cross-functionally with engineering, infrastructure, operations, analytics, and customer/account teams.
Strong communication skills and the ability to translate technical findings to non-technical stakeholders.
Demonstrated ability to coordinate complex technical work across multiple teams, facilitate daily standups or working sessions, and maintain operational momentum in complex, fast-moving environments.
Proven track record of driving mission-critical reliability and performance improvements in production systems. Strong sense of ownership and accountability.
Experience with monitoring/alerting systems and incident response. Bonus for experience with live streaming, high-variability traffic, or global seasonality at scale.
We'll be super impressed if you have experience in any of these:
Experience with large-scale data analytics systems (BigQuery, Spark, Presto) to derive operational insights from traffic telemetry.
Familiarity with cloud platforms (AWS, GCP, Azure), infrastructure as code, or container orchestration (Terraform, Kubernetes).
Experience evaluating build-vs-buy decisions and driving platform-wide tooling improvements.
Background in media, live events, or streaming operations in a high-throughput, latency-sensitive environment a plus.
Work Hours:
This position will require you to be available during core business hours, and on occasional nights and weekends as required for on-call and incident response.
Work Location(s) & Travel Requirements:
The preferred locations for this position are:
San Francisco, CA
New York, NY
Denver, CO
Fastly currently embraces a largely hybrid model for most roles, which allows employees flexibility to split their time between the office and home.
There is a strong preference for Hybrid near a local office.
However, we may be willing to consider exceptionally qualified remote candidates within the US.
This position will require travel as required by your role or requested by your manager.
SF / LA Fair Chance Ordinance Statement
Pursuant to the San Francisco Fair Chance Ordinance and the Los Angeles Fair Chance Initiative for Hiring Ordinance, we will consider for employment qualified applicants with arrest and conviction records.
Salary
The estimated salary range for this position is $181,220 to $217,464. Starting salary may vary based on permissible, non-discriminatory factors such as experience, skills, qualifications, and location.
This role may be eligible to participate in Fastly's equity and discretionary bonus programs.
Benefits
We care about you. Fastly works hard to create a positive environment for our employees, and we think your life outside of work is important too. We support our teams with great benefits that start on the first day of your employment with Fastly. Curious about our offerings?
We offer a comprehensive benefits package including medical, dental, and vision insurance. Family planning and mental health support, along with an Employee Assistance Program, Insurance (Life, Disability, and Accident), a Flexible Vacation policy, and up to 18 days of accrued paid sick leave, are there to help support our employees. We also offer 401(k) (including company match) and an Employee Stock Purchase Program. For 2025, we offer 11 paid local holidays and 11 paid company wellness days.
Why Fastly?
We have a huge impact. Fastly is a small company with a big reach. Not only do our customers have a tremendous user base, but we also support a growing number of open source projects. Employees are encouraged to share causes close to their heart with others so we can help lend a supportive hand.
We love distributed teams. Fastly's home base is in San Francisco, but we have multiple offices and employees sprinkled around the globe.
As a new hire, you will be able to attend our IN-PERSON new hire orientation in our San Francisco office! It is an exciting week-long experience that we offer to new employees to build connections with colleagues across Fastly, participate in hands-on learning opportunities, and immerse yourself in our culture firsthand.
We value diversity. Growing and maintaining our inclusive and diverse team matters to us. We are committed to being a company where our employees feel comfortable bringing their authentic selves to work and have the ability to be successful, every day.
We are passionate. Fastly is chock full of passionate people, and we're not one size fits all. Fastly employs authors, pilots, skiers, parents (of humans and animals), makeup geeks, coffee connoisseurs, and more. We love employees for who they are and what they are passionate about.
We're always looking for humble, sharp, and creative folks to join the Fastly team. If you think you might be a fit, please apply! A fully completed application and resume or CV are required when applying.
All job applications must be submitted through our official careers site. We will never request sensitive information such as your Social Security number, bank account, or credit card information during the application process. All official communication will come from an @ or @ email address.
Fastly is committed to ensuring equal employment opportunity and to providing employees with a safe and welcoming work environment free of discrimination and harassment. Our employment decisions are based on business needs, job requirements, and individual qualifications.
All qualified applicants will receive consideration for employment without regard to age, ancestry, color, family or medical care leave, gender identity or expression, genetic information, marital status, medical condition, national origin, family or parental status, physical or mental disability, political affiliation, protected veteran status, race, religion, sex (including pregnancy), sexual orientation, or any other characteristic protected by applicable laws, regulations, and ordinances.
Consistent with the Americans with Disabilities Act (ADA) and federal or state disability laws, Fastly will provide reasonable accommodations for applicants and employees with disabilities. If reasonable accommodation is needed to participate in the job application or interview process, to perform essential job functions, and/or to receive other benefits and privileges of employment, please contact your Recruiter or the Fastly Employee Relations team.
Fastly collects and processes personal data submitted by job applicants in accordance with our Privacy Policy. Please see our privacy notice for job applicants.
Required Experience: Staff IC
Key Skills: Arabic Speaking, Access Control System, B2C, Account Management, Legal Operations, Broadcast
Employment Type: Full-Time
Experience: years
Vacancy: 1
Salary: $181,220 - $217,464
Resume Suggestions
Highlight relevant experience and skills that match the job requirements to demonstrate your qualifications.
Quantify your achievements with specific metrics and results whenever possible to show impact.
Emphasize your proficiency in relevant technologies and tools mentioned in the job description.
Showcase your communication and collaboration skills through examples of successful projects and teamwork.