KU NMT

  1. Model-Based Data-Centric AI: Bridging the Divide Between Academic Ideals and Industrial Pragmatism
    Chanjun Park, Minsoo Khang, Dahyun Kim
    ICLR 2024 – Data-centric Machine Learning Research (DMLR) Workshop, 2024

  2. Leveraging Pre-existing Resources for Data-Efficient Counter-Narrative Generation in Korean
    Seungyoon Lee, Chanjun Park (Corresponding Author), DaHyun Jung, Hyeonseok Moon, Jaehyung Seo, Sugyeong Eo, Heuiseok Lim (Corresponding Author)
    LREC-COLING 2024, 2024

  3. KNOTICED: A Dataset for Critical Error Detection in English-Korean Machine Translation
    Sugyeong Eo, Jungwoo Lim, Chanjun Park, Hyeonseok Moon, Jaehyung Seo, Heuiseok Lim
    LREC-COLING 2024, 2024

  4. Hyper-BTS Dataset: Scalability and Enhanced Analysis of Back TranScription (BTS) for ASR Post-Processing
    Chanjun Park, Jaehyung Seo, Seolhwa Lee, Junyoung Son, Hyeonseok Moon, Sugyeong Eo, Chanhee Lee, Heuiseok Lim
    EACL 2024 (Findings of EACL 2024), 2024

  5. Generative Interpretation: Toward Human-Like Evaluation for Educational Question-Answer Pair Generation
    Hyeonseok Moon, Jaewook Lee, Sugyeong Eo, Chanjun Park, Jaehyung Seo, Heuiseok Lim
    EACL 2024 (Findings of EACL 2024), 2024

  6. KEBAP: Korean Error Explainable Benchmark Dataset for ASR and Post-processing
    Seonmin Koo (*), Chanjun Park (*), Jinsung Kim, Jaehyung Seo, Sugyeong Eo, Hyeonseok Moon, Heuiseok Lim
    EMNLP 2023, 2023

  7. CHEF in the Language Kitchen: A Generative Data Augmentation Leveraging Korean Morpheme Ingredients
    Jaehyung Seo, Hyeonseok Moon, Jaewook Lee, Sugyeong Eo, Chanjun Park, Heuiseok Lim
    EMNLP 2023, 2023

  8. Proceedings of the Seventh Widening NLP Workshop (WiNLP 2023)
    Bonaventure F. P. Dossou, Isidora Tourni, Hatem Haddad, Shaily Bhatt, Fatemehsadat Mireshghallah, Sunipa Dev, Tanvi Anand, Weijia Xu, Atnafu Lambebo Tonja, Alfredo Gomez, Chanjun Park
    EMNLP 2023, 2023

  9. Alternative Speech: Complementary Method to Counter-Narrative for Better Discourse
    Seungyoon Lee (*), DaHyun Jung (*), Chanjun Park (*), Seolhwa Lee, Heuiseok Lim
    ICDM 2023 – The First Workshop on Data-Centric AI, 2023

  10. Informative Evidence-guided Prompt-based Fine-tuning for English-Korean Critical Error Detection
    DaHyun Jung, Sugyeong Eo, Chanjun Park, Hyeonseok Moon, Jaehyung Seo, Heuiseok Lim
    IJCNLP-AACL 2023, 2023

  11. Synthetic Alone: Exploring the Dark Side of Synthetic Data for Grammatical Error Correction
    Chanjun Park (*), Seonmin Koo (*), Seolhwa Lee (*), Jaehyung Seo, Sugyeong Eo, Hyeonseok Moon, Heuiseok Lim
    ICML 2023 – Data-centric Machine Learning Research (DMLR) Workshop, 2023

  12. DMOps: Data Management Operation and Recipes
    Eujeong Choi (*), Chanjun Park (*)
    ICML 2023 – Data-centric Machine Learning Research (DMLR) Workshop, 2023

  13. Inter-Annotator Agreement in the Wild: Uncovering Its Emerging Roles and Considerations in Real-World Scenarios
    NamHyeok Kim (*), Chanjun Park (*)
    ICML 2023 – Data-centric Machine Learning Research (DMLR) Workshop, 2023

  14. Transcending Traditional Boundaries: Leveraging Inter-Annotator Agreement (IAA) for Enhancing Data Management Operations (DMOps)
    Damrin Kim, NamHyeok Kim, Chanjun Park, Harksoo Kim
    ICML 2023 – Data-centric Machine Learning Research (DMLR) Workshop, 2023

  15. Data-Driven Approach for Formality-Sensitive Machine Translation: Language-Specific Handling and Synthetic Data Generation
    Seungjun Lee, Hyeonseok Moon, Chanjun Park, Heuiseok Lim
    ICML 2023 – Data-centric Machine Learning Research (DMLR) Workshop, 2023

  16. Toward Practical Automatic Speech Recognition and Post-Processing: a Call for Explainable Error Benchmark Guideline
    Seonmin Koo (*), Chanjun Park (*), Jinsung Kim, Jaehyung Seo, Sugyeong Eo, Hyeonseok Moon, Heuiseok Lim
    ICML 2023 – Data-centric Machine Learning Research (DMLR) Workshop, 2023

  17. Knowledge Graph-Augmented Korean Generative Commonsense Reasoning
    Dahyun Jung, Jaehyung Seo, Jaewook Lee, Chanjun Park, Heuiseok Lim
    ICML 2023 – Data-centric Machine Learning Research (DMLR) Workshop, 2023

  18. Improving Formality-Sensitive Machine Translation using Data-Centric Approaches and Prompt Engineering
    Seungjun Lee, Hyeonseok Moon, Chanjun Park, Heuiseok Lim
    IWSLT 2023 – ACL 2023, 2023

  19. Towards Diverse and Effective Question-Answer Pair Generation from Children Storybooks
    Sugyeong Eo, Hyeonseok Moon, Jinsung Kim, Yuna Hur, Jeongwook Kim, SongEun Lee, Changwoo Chun, Sungsoo Park, Heuiseok Lim
    ACL 2023 – Findings, 2023

  20. PEEP-Talk: A Situational Dialogue-based Chatbot for English Education
    Seungjun Lee, Yoonna Jang, Chanjun Park, Jungseob Lee, Jaehyung Seo, Hyeonseok Moon, Sugyeong Eo, Seounghoon Lee, Bernardo Nugroho Yahya, Heuiseok Lim
    ACL 2023 – Demo Track, 2023

  21. PicTalky: Augmentative and Alternative Communication for Language Developmental Disabilities
    Chanjun Park, Yoonna Jang, Seolhwa Lee, Jaehyung Seo, Kisu Yang, Heuiseok Lim
    AACL 2022 – Demo Track, 2022

  22. KU X Upstage’s submission for the WMT22 Quality Estimation: Critical Error Detection Shared Task
    Sugyeong Eo, Chanjun Park, Hyeonseok Moon, Jaehyung Seo, Heuiseok Lim
    WMT 2022 – EMNLP 2022, 2022

  23. QUAK: A Synthetic Quality Estimation Dataset for Korean-English Neural Machine Translation
    Sugyeong Eo, Chanjun Park, Hyeonseok Moon, Jaehyung Seo, Gyeongmin Kim, Jungseob Lee, Heuiseok Lim
    COLING 2022, 2022

  24. Focus on FoCus: Is FoCus focused on Context, Knowledge and Persona?
    SeungYoon Lee, Jungseob Lee, Chanjun Park, Sugyeong Eo, Hyeonseok Moon, Jaehyung Seo, Jeongbae Park, Heuiseok Lim
    COLING 2022 – The 1st Workshop on Customized Chat Grounding Persona and Knowledge, 2022

  25. A Self-Supervised Automatic Post-Editing Data Generation Tool
    Hyeonseok Moon, Chanjun Park, Sugyeong Eo, Jaehyung Seo, Seungjun Lee, Heuiseok Lim
    ICML 2022 – DataPerf workshop, 2022

  26. A Dog Is Passing Over The Jet? A Text-Generation Dataset for Korean Commonsense Reasoning and Evaluation
    Jaehyung Seo, Seounghoon Lee, Chanjun Park, Yoonna Jang, Hyeonseok Moon, Sugyeong Eo, Seonmin Koo, Heuiseok Lim
    NAACL 2022 – Findings, 2022

  27. Priming Ancient Korean Neural Machine Translation
    Chanjun Park, Seolhwa Lee, Hyeonseok Moon, Sugyeong Eo, Jaehyung Seo, Heuiseok Lim
    LREC 2022, 2022

  28. FreeTalky: Don’t Be Afraid! Conversations Made Easier by a Humanoid Robot using Persona-based Dialogue
    Chanjun Park, Yoonna Jang, Seolhwa Lee, Sungjin Park, Heuiseok Lim
    LREC 2022, 2022

  29. Empirical Analysis of Synthetic Data Generation Using Noising Strategies for Automatic Post-editing
    Hyeonseok Moon, Chanjun Park, Seolhwa Lee, Jaehyung Seo, Jeongsub Lee, Sugyeong Eo, Heuiseok Lim
    LREC 2022, 2022

  30. FreeTalky: Don’t Be Afraid! Conversations Made Easier by a Humanoid Robot using Persona-based Dialogue
    Chanjun Park (*), Yoonna Jang (*), Seolhwa Lee (*), Sungjin Park (*), Heuiseok Lim
    AAAI 2022 – Artificial Intelligence for Education (AI4EDU), 2022

  31. How should human translation coexist with NMT? Efficient tool for building high quality parallel corpus
    Chanjun Park, Seolhwa Lee, Hyeonseok Moon, Sugyeong Eo, Jaehyung Seo, Heuiseok Lim
    NeurIPS 2021 – Data-centric AI (DCAI) workshop, 2021

  32. A New Tool for Efficiently Generating Quality Estimation Datasets
    Sugyeong Eo, Chanjun Park, Jaehyung Seo, Hyeonseok Moon, Heuiseok Lim
    NeurIPS 2021 – Data-centric AI (DCAI) workshop, 2021

  33. Automatic Knowledge Augmentation for Generative Commonsense Reasoning
    Jaehyung Seo, Chanjun Park, Sugyeong Eo, Hyeonseok Moon, Heuiseok Lim
    NeurIPS 2021 – Data-centric AI (DCAI) workshop, 2021

  34. Syntax-enhanced Dialogue Summarization using Syntax-aware information
    Seolhwa Lee, Kisu Yang, Chanjun Park, João Sedoc, Heuiseok Lim
    NeurIPS 2021 – Women in Machine Learning (WiML 2021) workshop, 2021

  35. Towards Syntax-Aware Dialogue Summarization using Multi-task Learning
    Seolhwa Lee, Kisu Yang, Chanjun Park, João Sedoc, Heuiseok Lim
    EMNLP 2021 – Widening NLP (WiNLP 2021) workshop, 2021

  36. Two Heads are Better than One? Verification of Ensemble Effect in Neural Machine Translation
    Chanjun Park, Sungjin Park, Seolhwa Lee, Taesun Whang, Heuiseok Lim
    EMNLP 2021 – The Second Workshop on Insights from Negative Results in NLP, 2021 – (Oral presentation)

  37. BTS: Back TranScription for Speech-to-Text Post-Processor using Text-to-Speech-to-Text
    Chanjun Park, Jaehyung Seo, Seolhwa Lee, Chanhee Lee, Hyeonseok Moon, Sugyeong Eo, Heuiseok Lim
    ACL 2021 – WAT (Workshop on Asian Translation) 2021 Workshop, 2021 – (Oral presentation)

  38. Dealing with the Paradox of Quality Estimation
    Sugyeong Eo (*), Chanjun Park (*), Jaehyung Seo, Hyeonseok Moon, Heuiseok Lim
    MT Summit 2021 – LoResMT, 2021 – (Oral presentation)

  39. Should we find another model?: Improving Neural Machine Translation Performance with ONE-Piece Tokenization Method without Model Modification
    Chanjun Park (*), Sugyeong Eo (*), Hyeonseok Moon (*), Heuiseok Lim
    NAACL-HLT 2021 Industry Track, 2021 – (Poster/Oral presentation)

KU NMT Group.

School of Computer Science

College of Engineering, Korea University

© 2024 KU NMT GROUP.

Contact Us

  • Group Leader Email
    bcj1210@naver.com
  • Address
    #311 Aegineung Student Center, College of Informatics, Korea University, 145 Anam-ro, Seongbuk-gu, Seoul, 02841, Korea