We are the TikTok Cloud Infrastructure team, primarily responsible for TikTok's infrastructure and technical architecture, which is also an industry-leading cloud computing solution. We support many of the company's star products and core business lines, serving hundreds of millions of users every day. In our work, we manage clusters of hundreds of thousands of servers, several exabytes of data storage, and tens of thousands of hybrid compute/storage deployments and their scheduling. We are also building a range of infrastructure to ensure the best R&D practices and support the overall development of the company.
If you have a background in large-scale cloud computing and want to keep evolving industry-leading cloud infrastructure while taking on challenging technical architecture work at industry-leading business scale, we welcome you to join us.
Responsibilities
1. Storage direction: Build massive distributed storage solutions, including but not limited to NoSQL, graph, object storage, and DFS systems
2. PaaS direction: Build an ultra-large-scale, fully managed application scheduling platform that covers the full lifecycle of business development, online operation, governance, auto-scaling, etc.
3. Developer Products direction: For large-scale microservices and complex business call chains, build intelligent monitoring, alerting, DevOps, and governance platforms that integrate deeply with the PaaS platform
4. Orchestration direction: Design and develop new, innovative orchestration and scheduling systems that balance resources and efficiency well in ultra-large clusters and complex mixed scenarios
5. As a mentor, give junior engineers on the team systematic training in professional skills and domain knowledge to build a technical culture
6. Formulate development goals according to the project schedule, write detailed design documents and be responsible for module implementation, performance tuning, and functional testing
7. Provide timely technical support for our online applications, identify potential needs and optimization opportunities from that work, and continuously improve the system