Xiaohongshu (小红书) Hiring: SRE Engineer - International

Position:

SRE Engineer - International
Posted:
2026-05-22
Location:
Singapore, United States
Employment type:
Full-time
Job category:
Operations development
Source:
Xiaohongshu official website
Responsibilities:
1. International Architecture & Disaster Recovery: Participate in the design and implementation of Rednote's international infrastructure architecture. Build and evolve cross-region architecture, disaster recovery, and high-availability capabilities. Drive critical services toward multi-region deployment, failover, and fault isolation to improve the overall stability of overseas operations.
2. Overseas Infrastructure Platform Development & Operations: Own the deployment, operations, and continuous optimization of core internal technical platforms (release systems, monitoring and alerting, configuration services, service management, traffic scheduling, etc.) in overseas regions. Ensure consistency and availability across overseas and domestic platform environments.
3. Reliability Engineering & Incident Response: Build and continuously improve the reliability framework for overseas business, including observability capabilities, incident response, root cause analysis, and post-mortem mechanisms. Lead cross-functional coordination during major incidents to restore services quickly and drive long-term, systemic improvements.
4. International Technical Solution Delivery: Develop a deep understanding of overseas business requirements and architecture characteristics. Adapt infrastructure capabilities to overseas scenarios, including multi-region architecture design, network and data architecture optimization, and adaptation of foundational services.
5. Cross-functional Collaboration & Best Practice Development: Work closely with domestic infrastructure, product engineering, and platform teams to keep the overseas technical stack aligned with domestic architecture standards. Consolidate overseas stability best practices and promote them across the organization.
Requirements:
1. Reliability Engineering & SRE Experience: Familiar with stability assurance frameworks for large-scale internet systems; experienced in high-availability architecture design, fault governance, capacity planning, and incident response. Experience on an SRE, platform engineering, or infrastructure team is preferred.
2. International Architecture Experience: Familiar with cross-region architecture design and disaster recovery systems (multi-region deployment, traffic scheduling, data synchronization, failover, etc.). Experience with overseas business architecture or international infrastructure development is preferred.
3. Core Technical Skills: Proficient in Linux systems, networking, and the principles of common middleware (MySQL, Redis, Kafka, etc.); good understanding of cloud-native infrastructure (Kubernetes, Service Mesh, etc.) and observability stacks (monitoring, logging, tracing).
4. Development & Automation Skills: Proficient in at least one of Python, Go, or Java; experience building operations automation platforms, stability tooling, or infrastructure systems.
5. Problem Solving & Collaboration: Strong troubleshooting and root cause analysis skills in complex system environments; excellent communication and teamwork.
6. Language: Fluent in both English and Chinese (spoken and written), and able to communicate and collaborate on technical matters in an international team.

Bonus Points:
1. Experience with multi-cloud, cross-cloud, or overseas cloud providers (AWS / GCP / Azure / Alibaba Cloud International / Volcano International)
2. Experience with cross-region disaster recovery and traffic scheduling (DNS / GSLB / Anycast / Global LB)
3. Experience with stability engineering (Chaos Engineering / drills / automated recovery)

FAQ: Xiaohongshu Hiring

Xiaohongshu hiring work location:
Singapore, United States
Xiaohongshu hiring experience requirement:
No limit