Research Area

We are interested in a broad range of topics in Data Science (DS) and Artificial Intelligence (AI). We identify real-world challenges with significant practical impact and address them with DS/AI methodologies that leverage {Evolving, (Un)structured data} × {Foundation models} × {Human-curated knowledge}.

Evolving Data

  • Anomaly & drift detection KDD24, WWW24, KDD22, SIGMOD21, KDD20, VLDB19
  • Time-series analysis WWW26, KDD25, NeurIPS24, WWW24, ICML23, ICLR22, MiLeTS19
  • Streaming text & event processing KDD26, SIGIR23, WWW23, MiLeTS19, ICDE19

(Un)structured Data

  • Graph & network mining SIGMOD26, WSDM26, CIKM24
  • Spatio-temporal networks KBS25, KDD25, ICWSM25, TITS23, ICDM22
  • Tabular data understanding ICML26, EMNLP25, SIGIR25

Foundation Models

  • Pretraining & feature engineering WSDM26, EMNLP25, SIGIR25
  • Prompt tuning & continual learning KDD26, SIGIR26, ICML24
  • Retrieval & augmentation KDD26, ICML26, ACL26, SIGIR26

Human-curated Knowledge

  • Taxonomy & topic discovery NC26, ACL26, SIGIR26, ACL23, EMNLP22, WWW22
  • Societal & behavioral analysis TCSS25, ICDM22, AAAI22, CHI16
  • Weak-supervision & summarization EMNLP23, WWW23

Publications

  1. Can Structural Cues Save LLMs? Evaluating Language Models in Massive Document Streams
    Yukyung Lee, Yebin Lim*, Woojun Jung*, Wonjun Choi, Susik Yoon
    KDD26 | ACM SIGKDD Conference on Knowledge Discovery and Data Mining, August 2026
  2. CREAM: Continual Retrieval on Dynamic Streaming Corpora with Adaptive Soft Memory
    Huijeong Son, Hyeongu Kang, Sunho Kim, Subeen Ho, SeongKu Kang, Dongha Lee, Susik Yoon
    KDD26 | ACM SIGKDD Conference on Knowledge Discovery and Data Mining, August 2026
  3. Segment-driven Structural Induction and Semantic Alignment for Heterogeneous Tabular Representation
    Woojun Jung, Susik Yoon
    ICML26 | International Conference on Machine Learning, July 2026
  4. Breaking the Reference Bottleneck via Learning to Rewrite Conversational Queries without Gold Reference Passages
    Doyoung Kim, Youngjun Lee, Joeun Kim, Jihwan Bang, Hwanjun Song, Susik Yoon, Jae-Gil Lee
    ICML26 | International Conference on Machine Learning, July 2026
  5. MUDY: Multi-Granular Dynamic Candidate Contextualization for Unsupervised Keyphrase Extraction
    Hyeongu Kang, Susik Yoon
    SIGIR26 | ACM SIGIR Conference on Research and Development in Information Retrieval, July 2026
  6. SPRINT: Scalable and Predictive Intent Refinement for LLM-Enhanced Session-based Recommendation
    Gyuseok Lee, Wonbin Kweon, Zhenrui Yue, Yaokun Liu, Yifan Liu, Susik Yoon, Dong Wang, SeongKu Kang
    SIGIR26 | ACM SIGIR Conference on Research and Development in Information Retrieval, July 2026
  7. Why These Documents? Explainable Generative Retrieval with Hierarchical Category Paths
    Sangam Lee, Ryang Heo, SeongKu Kang, Susik Yoon, Jinyoung Yeo, Dongha Lee
    ACL26 (Findings) | Annual Meeting of the Association for Computational Linguistics, July 2026
  8. Back to the Future: Look-ahead Augmentation and Parallel Self-Refinement for Time Series Forecasting
    Sunho Kim, Susik Yoon
    WWW26 (Short) | ACM The Web Conference, June 2026
  9. AI-Driven Text Mining of the Female Reproductive System: Enabling Multiscale Biomedical Modeling and Personalized Medicine
    Gaeun Lee*, Jeehyo Jeon*, Sharon Jeeho Ham*, Sieun Shin, Seo Yeon Kim, Hongsock Kim, Ju Yeon Lee, Heejin Woo, Jongwoo Ahn, Jungseub Lee, Seokyoung Bang, Susik Yoon+, Jungho Ahn+
    NC26 | Nano Convergence, May 2026 (SCI(E), IF: 11)
  10. LMSC: Local Sketch Modularity Optimisation for Size-Constrained Community Search in Networks
    Dahee Kim, Taejoon Han, Kaiyu Feng, Junghoon Kim, Susik Yoon
    SIGMOD26 | ACM Conference on Management of Data, May 2026
  11. Metadata Meets LLMs: Constructing Knowledge-Rich Citation Networks with CoT-Enhanced Representations
    Soohwan Jeong, Mingyu Choi, Joon-Young Kim, Susik Yoon+, Sungsu Lim+
    WSDM26 (Short) | ACM Conference on Web Search and Data Mining, February 2026
  12. Sequence-aware Adaptive Graph Convolutional Recurrent Networks for Traffic Forecasting
    Seunghoon Han, Hyewon Lee, Yoonhwan Lee, Sung-Soo Kim, Susik Yoon+, Sungsu Lim+
    KBS25 | Knowledge-Based Systems, November 2025
  13. Multi-level Diagnosis and Evaluation for Robust Tabular Feature Engineering with Large Language Models
    Yebin Lim, Susik Yoon
    EMNLP25 (Findings) | Conference on Empirical Methods in Natural Language Processing, November 2025
  14. Bi-Modal Learning for Networked Time Series
    Youngeun Nam*, Jihye Na*, Susik Yoon, Hwanjun Song, Jae-Gil Lee, Byung Suk Lee
    KDD25 | ACM SIGKDD Conference on Knowledge Discovery and Data Mining, August 2025
  15. An Analysis of City Image by Exploiting Social Media: Toward a Deeper Understanding of Multiple Characteristics and Their Temporal Changes using Machine Learning
    Hyeonchoel Jeong, Bumsu Cho, Susik Yoon, Yun Wook Choo, Dookie Kim, Jungeun Kim
    TCSS25 | IEEE Transactions on Computational Social Systems, July 2025 (SCI(E), IF: 4.9)
  16. HAETAE: In-domain Table Pretraining with Header Anchoring
    Woojun Jung, Susik Yoon
    SIGIR25 (Short) | ACM SIGIR Conference on Research and Development in Information Retrieval, July 2025
  17. Mobility Networked Time-Series Forecasting Benchmark Datasets
    Jihye Na*, Youngeun Nam*, Susik Yoon, Hwanjun Song, Byung Suk Lee, Jae-Gil Lee
    ICWSM25 | AAAI International Conference on Web and Social Media, June 2025
  18. Exploiting Representation Curvature for Boundary Detection in Time Series
    Yooju Shin, Jaehyun Park, Susik Yoon, Hwanjun Song, Byung Suk Lee, Jae-Gil Lee
    NeurIPS24 | Conference on Neural Information Processing Systems, December 2024
  19. Flexi-clique: Exploring Flexible and Sub-linear Clique Structures
    Song Kim, Junghoon Kim, Susik Yoon, Jungeun Kim
    CIKM24 (Short) | ACM Conference on Information and Knowledge Management, October 2024
  20. Online Drift Detection with Maximum Concept Discrepancy
    Ke Wan, Yi Liang, Susik Yoon
    KDD24 | ACM SIGKDD Conference on Knowledge Discovery and Data Mining, August 2024
  21. One Size Fits All for Semantic Shifts: Adaptive Prompt Tuning for Continual Learning
    Doyoung Kim, Susik Yoon, Dongmin Park, Youngjun Lee, Hwanjun Song, Jihwan Bang, Jae-Gil Lee
    ICML24 | International Conference on Machine Learning, July 2024
  22. Breaking the Time-Frequency Granularity Discrepancy in Time-Series Anomaly Detection
    Youngeun Nam, Susik Yoon, Yooju Shin, Minyoung Bae, Hwanjun Song, Jae-Gil Lee, Byung Suk Lee
    WWW24 | ACM The Web Conference, May 2024
  23. MEGClass: Text Classification with Extremely Weak Supervision via Mutually-Enhancing Text Granularities
    Priyanka Kargupta, Tanay Komarlu, Susik Yoon, Xuan Wang, Jiawei Han
    EMNLP23 (Findings) | Conference on Empirical Methods in Natural Language Processing, December 2023
  24. DynaMiTE: Discovering Explosive Topic Evolutions with User Guidance
    Nishant Balepur, Shivam Agarwal, Karthik Venkat Ramanan, Susik Yoon, Diyi Yang, Jiawei Han
    ACL23 (Findings) | Annual Meeting of the Association for Computational Linguistics, July 2023
  25. Context Consistency Regularization for Label Sparsity in Time Series
    Yooju Shin, Susik Yoon, Hwanjun Song, Dongmin Park, Byunghyun Kim, Jae-Gil Lee, Byung Suk Lee
    ICML23 | International Conference on Machine Learning, July 2023
  26. Unsupervised Story Discovery from Continuous News Streams via Scalable Thematic Embedding
    Susik Yoon, Dongha Lee, Yunyi Zhang, Jiawei Han
    SIGIR23 | ACM SIGIR Conference on Research and Development in Information Retrieval, July 2023
  27. PDSum: Prototype-driven Continuous Summarization of Evolving Multi-document Sets Stream
    Susik Yoon, Hou Pong Chan, Jiawei Han
    WWW23 | ACM The Web Conference, April 2023
  28. SCStory: Self-supervised and Continual Online Story Discovery
    Susik Yoon, Yu Meng, Dongha Lee, Jiawei Han
    WWW23 | ACM The Web Conference, April 2023
  29. MG-TAR: Multi-view Graph Convolutional Networks for Traffic Accident Risk Prediction
    Patara Trirat, Susik Yoon, Jae-Gil Lee
    TITS23 | IEEE Transactions on Intelligent Transportation Systems, 2023 (SCI(E), IF: 8.4)
  30. Topic Taxonomy Expansion via Hierarchy-Aware Topic Phrase Generation
    Dongha Lee, Jiaming Shen, Seonghyeon Lee, Susik Yoon, Hwanjo Yu, Jiawei Han
    EMNLP22 (Findings) | Conference on Empirical Methods in Natural Language Processing, December 2022
  31. Multi-view POI-level Cellular Trajectory Reconstruction for Digital Contact Tracing of Infectious Diseases
    Dongmin Park, Junhyeok Kang, Hwanjun Song, Susik Yoon, Jae-Gil Lee
    ICDM22 (Short) | IEEE International Conference on Data Mining, November 2022
  32. Adaptive Model Pooling for Online Deep Anomaly Detection from a Complex Evolving Data Stream
    Susik Yoon, Youngjun Lee, Jae-Gil Lee, Byung Suk Lee
    KDD22 | ACM SIGKDD Conference on Knowledge Discovery and Data Mining, August 2022
  33. Coherence-based Label Propagation over Time Series for Accelerated Active Learning
    Yooju Shin, Susik Yoon, Sundong Kim, Hwanjun Song, Jae-Gil Lee, Byung Suk Lee
    ICLR22 | International Conference on Learning Representations, April 2022
  34. TaxoCom: Topic Taxonomy Completion with Hierarchical Discovery of Novel Topic Clusters
    Dongha Lee, Jiaming Shen, SeongKu Kang, Susik Yoon, Jiawei Han, Hwanjo Yu
    WWW22 | ACM The Web Conference, April 2022
  35. COVID-EENet: Predicting Fine-Grained Impact of COVID-19 on Local Economies
    Doyoung Kim, Hyangsuk Min, Youngeun Nam, Hwanjun Song, Susik Yoon, Minseok Kim, Jae-Gil Lee
    AAAI22 | AAAI Conference on Artificial Intelligence, February 2022
  36. Multiple Dynamic Outlier-Detection from a Data Stream by Exploiting Duality of Data and Queries
    Susik Yoon, Yooju Shin, Jae-Gil Lee, Byung Suk Lee
    SIGMOD21 | ACM Conference on Management of Data, June 2021
  37. Ultrafast Local Outlier Detection from a Data Stream with Stationary Region Skipping
    Susik Yoon, Jae-Gil Lee, Byung Suk Lee
    KDD20 | ACM SIGKDD Conference on Knowledge Discovery and Data Mining, August 2020
  38. NETS: Extremely Fast Outlier Detection from a Data Stream via Set-Based Processing
    Susik Yoon, Jae-Gil Lee, Byung Suk Lee
    VLDB19 | International Conference on Very Large Data Bases, August 2019
  39. MLAT: Metric Learning for kNN in Streaming Time Series
    Dongmin Park, Susik Yoon, Hwanjun Song, Jae-Gil Lee
    MiLeTS19 (KDD Workshop) | Workshop on Mining and Learning from Time Series, August 2019
  40. CEP-Wizard: Automatic Deployment of Distributed Complex Event Processing
    Yooju Shin, Susik Yoon, Patara Trirat, Jae-Gil Lee
    ICDE19 (Demo) | IEEE International Conference on Data Engineering, April 2019
  41. Social or Financial Goals? Comparative Analysis of User Behaviors in Couchsurfing and Airbnb
    Jiwon Jung*, Susik Yoon*, SeungHyun Kim*, SangKeun Park, Kun-Pyo Lee, Uichin Lee
    CHI16 (Late-Breaking Work) | ACM Conference on Human Factors in Computing Systems, May 2016