The Data Cycling Center (DCC) is a data science team that develops AI-driven capabilities for understanding content (unstructured data), identifies business opportunities from that understanding, and builds products and solutions to capture those opportunities.

Our mission is to simplify the acquisition and utilization of unstructured/unlabeled data. The team acts as a data modeling factory, analyzing data at scale to find insights that support business growth.

About the Role:
We are looking for experienced data scientists to join our team and apply advanced analytics and machine learning techniques, including Prompt Engineering (PE), multi-modal large language models (LLMs), computer vision (CV), natural language processing (NLP), and audio signal processing, to optimize intelligent labeling workflows and data products within TikTok's ecosystem. Your work will help improve user experience, enhance content integrity, and support data-driven strategic decision-making. You will collaborate closely with cross-functional teams across product, operations, and algorithms to build scalable, end-to-end Prompt Engineering and LLM workflows for intelligent content moderation and labeling applications.

Key Responsibilities:
• Collaborate with cross-functional stakeholders to gather and refine requirements for data labeling projects and identify opportunities for optimization through data-driven solutions.
• Design and manage the full lifecycle of data labeling and policy testing workflows, from aligning with business needs to deployment, iteration, and monitoring.
• Establish and maintain a centralized knowledge base for Retrieval-Augmented Generation (RAG) systems, incorporating both structured (e.g., SOPs, guidelines) and unstructured (e.g., annotations, case logs) data to support LLM-based policy QA and labeling efforts.
• Operationalize intelligent labeling pipelines leveraging Prompt Engineering, agent-based workflows, and labeling models to ensure availability of high-quality data for model training and policy evolution.
• Translate complex policy documents into machine- and human-readable formats, support agent and PE strategy development, and keep the handling of nuanced policy edge cases in sync with fast-changing regulatory and platform dynamics.
• Apply multi-modal LLM techniques to extract latent signals from content that inform moderation strategies and highlight policy gaps.
• Lead applied ML and data science research and experimentation to solve business-critical use cases.
• Own the model lifecycle from data sourcing and preprocessing to training, deployment, and post-launch maintenance.
