Department: Engineering and Technology
Level: Experienced (Individual Contributor)
Location: Singapore
The Engineering and Technology team is at the core of the Shopee platform development. The team is made up of a group of passionate engineers from all over the world, striving to build the best systems with the most suitable technologies. Our engineers do not merely solve problems at hand; we build foundations for a long-lasting future. We don't limit ourselves on what we can or can't do; we take matters into our own hands even if it means drilling down to the bottom layer of the computing platform. Shopee's hyper-growing business scale has transformed most "innocent" problems into huge technical challenges, and there is no better place to experience it first-hand if you love technologies as much as we do.
About the Team:
The EGO team is dedicated to building an industry-leading machine learning platform to effectively support the implementation of algorithms across various business domains such as recommendation, search, and advertising. It focuses on extreme optimization for CTR/CVR prediction in large-scale sparse parameter scenarios, ensuring maximum performance in e-commerce applications and delivering greater value to the company.
The EGO platform covers the entire deep machine learning workflow, from sample organization and training to model building and publishing, and on to online model loading and inference services. It comes with a user-friendly Web UI and RESTful API, providing an end-to-end, one-stop machine learning platform.
Job Description:
- Develop distributed Parameter Server (PS) systems for large-scale sparse model training and inference platforms in the search, advertising, and recommendation domains. The system should support high-throughput parameter read/write and update operations, handle hundreds of billions of features and TB-level sparse models, enable online real-time learning, and meet algorithmic needs such as feature admission and expiration.
- Participate in the development of the one-stop machine learning platform, integrating the PS system into the platform to provide a user-friendly, stable, high-performance, and platform-level distributed parameter service system. Enhance the platform’s efficiency and usability, accelerating the model iteration process for algorithm teams.
Requirements:
- Bachelor’s degree or above in Computer Science, Electronics, Automation, Software Engineering, or related fields
- At least 3 years of relevant hands-on experience
- Proficient in C++ programming with strong low-level technical skills; adept at multi-threaded programming, lock optimization, memory pools, thread pools, template programming, GDB debugging, performance tuning, and RPC frameworks.
- Familiarity with distributed PS systems, distributed system backend optimization, high-performance in-memory KV systems, KV storage systems based on NVMe-SSD, and high-performance client-server architecture systems is a plus.
- Highly passionate about computer technology, proactive in learning, with a strong spirit of in-depth research and hands-on practice. Maintains high standards and strict requirements for delivered code; works with rigor and attention to detail.
- Strong team player with excellent continuous learning ability.