Wonbeom Jang

Hi, I’m Wonbeom Jang, an LLM Research Engineer at SK Telecom.

I’m a data-centric LLM engineer: I build the benchmarks that reveal what a model gets wrong, then design and construct the training data that fixes it.

I’m currently contributing to SK Telecom’s A.X Foundation Model, where I design and build the training data that goes into the model — both Korean pre-training corpora and post-training datasets.

Previously, I built TelcoLLM, a telecom-domain LLM developed in collaboration with OpenAI and Anthropic, and led the development of TelBench (EMNLP 2024) and TelAgentBench (EMNLP 2025), benchmarks for evaluating this LLM. TelcoLLM is now deployed in production at SK Telecom’s AI Contact Center (AICC), powering in-call RAG and post-call analysis.

I also led the Smaller, Safer, Stronger project — a collaboration with AdaptiveML — where we fine-tuned Gemma 3 4B with SFT + PPO to match GPT-4.1 / Claude Sonnet 3.7 on multilingual content moderation. The project is featured as a reference case in Google DeepMind’s Gemmaverse showcase.

My research interests are LLM Evaluation, Synthetic Data, Model-based Data Curation, and LLM Safety & Red-Teaming. What ties them together is a data-centric approach in which evaluation results determine what training data to build next.

Happy to chat about any of the above — reach me at jtiger958@gmail.com.

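Hello, I’m Wonbeom Jang, and I work as an LLM Research Engineer at SK Telecom.

I am a data-centric LLM engineer. I build benchmarks that expose a model’s weaknesses and, based on the results, design and construct the training data that addresses them.

I currently contribute to the development of SK Telecom’s A.X Foundation Model, where I design and build the Korean pre-training data and post-training data used to train the model.

Previously, I built a telecom-domain LLM (TelcoLLM) in collaboration with OpenAI and Anthropic, and developed TelBench (EMNLP 2024) and TelAgentBench (EMNLP 2025), the benchmarks used to evaluate it. TelcoLLM is deployed in production at SK Telecom’s AICC (AI Contact Center), where it is used for in-call RAG and post-call analysis.

I also served as project lead for Smaller, Safer, Stronger, a collaboration with AdaptiveML, in which we tuned Gemma 3 4B with SFT + PPO to reach multilingual content moderation performance on par with GPT-4.1 and Claude Sonnet 3.7. This work is listed as a reference case in Google DeepMind’s Gemmaverse showcase.

My main interests are LLM Evaluation, Synthetic Data, Model-based Data Curation, and LLM Safety & Red-Teaming. I am particularly interested in a data-centric approach that analyzes a model’s weaknesses from evaluation results and designs high-quality training data to address them.

Collaboration and discussion on these topics are always welcome by email.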

news

Mar 03, 2026	A.X K1 publicly released — a 519B-parameter MoE language model by SK Telecom.
Feb 12, 2026	A.X K1 Technical Report released on arXiv. A 519B-parameter MoE language model trained from scratch.
Sep 29, 2025	TelAgentBench accepted at EMNLP 2025 Industry Track.
Aug 20, 2025	Smaller, Safer, Stronger — SK Telecom × AdaptiveML’s Gemma 3 4B multilingual moderation model has been featured in Google DeepMind’s Gemmaverse showcase. (Project Lead)
Jun 01, 2025	SK Telecom TelcoLLM deployed to production — powering AICC in-call RAG and post-call analysis for millions of customers.

latest posts

Jul 07, 2026	Kubernetes Extensions and Ecosystem: Operators and CNCF Projects
Jul 06, 2026	Kubernetes Access Control: ServiceAccount and RBAC
Jul 06, 2026	Kubernetes Storage and Configuration: PV/PVC, ConfigMap, Secret

selected publications

A.X K1 Technical Report

Sung Jun Cheon, Jaekyung Cho, Seongho Choi, and 57 more authors

2026

TelAgentBench: A Multi-faceted Benchmark for Evaluating LLM-based Agents in Telecommunications

Sunwoo Lee, Daseong Jang, Dhammiko Arya, and 12 more authors

In Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing: Industry Track, Nov 2025

TelBench: A Benchmark for Evaluating Telco-Specific Large Language Models

Sunwoo Lee, Dhammiko Arya, Seung-Mo Cho, and 9 more authors

In Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing: Industry Track, Nov 2024

Abstract

The telecommunications industry, characterized by its vast customer base and complex service offerings, necessitates a high level of domain expertise and proficiency in customer service center operations. Consequently, there is a growing demand for Large Language Models (LLMs) to augment the capabilities of customer service representatives. This paper introduces a methodology for developing a specialized Telecommunications LLM (Telco LLM) designed to enhance the efficiency of customer service agents and promote consistency in service quality across representatives. We present the construction process of TelBench, a novel dataset created for performance evaluation of customer service expertise in the telecommunications domain. We also evaluate various LLMs and demonstrate the ability to benchmark both proprietary and open-source LLMs on predefined telecommunications-related tasks, thereby establishing metrics that define telecommunications performance.

Deep-plane sweep generative adversarial network for consistent multi-view depth estimation

Dong Wook Shu, Wonbeom Jang, Heebin Yoo, and 2 more authors

Machine Vision and Applications, 2022
