Paper Title: Grammatical Error Correction in Customer Support Chat using Semi Supervision on Pretrained Language models
ISSN: 2395-1303
Year of Publication: 2022
DOI: 10.5281/zenodo.7047597

MLA Style: Cherukuri, Nikhilesh, and Aditya Kiran Brahma. "Grammatical Error Correction in Customer Support Chat using Semi Supervision on Pretrained Language models." International Journal of Engineering and Techniques (IJET), vol. 8, no. 5, September-October 2022, ISSN 2395-1303, www.ijetjournal.org.

APA Style: Cherukuri, N., & Brahma, A. K. (2022). Grammatical error correction in customer support chat using semi supervision on pretrained language models. International Journal of Engineering and Techniques (IJET), 8(5). ISSN 2395-1303. www.ijetjournal.org

Abstract

Understanding the grammatical errors present in chat conversations is crucial for developing a chatbot from high-quality data. On food delivery platforms, conversational AI assistance is continuously developed on the chatbot platform to understand the context of customer conversations and suggest the next utterance by retrieving it from similar scenarios that occurred in the past. These suggestions, when used by agents, are sometimes further edited manually based on their relevance to the current scenario and on suggestion quality. The grammatical quality of the suggestions plays a significant role in whether agents utilize the conversational AI assistance and provide better customer resolutions in a quick and effective manner. In this paper, we analyse a use case of identifying the frequent grammatical errors present in texts typed by customer care agents and utilize them to build an automatic grammatical error correction model for data specific to food delivery conversations. We show that using large pretrained encoder-decoder transformer models, systematically fine-tuned on a smaller downstream task-specific dataset (grammatical error correction), achieves an overall gain of 15.5% in GLEU score over the baseline of using the pretrained model alone.
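The approach described above frames correction as text-to-text rewriting, so a pretrained encoder-decoder checkpoint can be fine-tuned directly on parallel (erroneous, corrected) utterances. The following is a minimal sketch of that setup, assuming a Hugging Face T5 checkpoint, a "gec:" task prefix, and toy agent-utterance pairs; the paper's actual model choice, data, and hyperparameters are not given on this page, so every name and value here is an assumption.

    # Minimal fine-tuning sketch: GEC framed as text-to-text rewriting with a
    # pretrained encoder-decoder transformer. The checkpoint ("t5-base"), the
    # "gec: " task prefix, the learning rate, and the toy utterance pairs are
    # illustrative assumptions, not the paper's actual configuration.
    import torch
    from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

    tokenizer = AutoTokenizer.from_pretrained("t5-base")
    model = AutoModelForSeq2SeqLM.from_pretrained("t5-base")

    # Parallel (erroneous -> corrected) agent utterances; toy examples only.
    pairs = [
        ("i has refunded you amount", "I have refunded your amount."),
        ("please wait order is arriving soon", "Please wait, your order is arriving soon."),
    ]

    optimizer = torch.optim.AdamW(model.parameters(), lr=3e-5)
    model.train()
    for epoch in range(3):
        for src, tgt in pairs:
            inputs = tokenizer("gec: " + src, return_tensors="pt", truncation=True)
            labels = tokenizer(tgt, return_tensors="pt", truncation=True).input_ids
            # Standard cross-entropy loss against the corrected target text.
            loss = model(**inputs, labels=labels).loss
            loss.backward()
            optimizer.step()
            optimizer.zero_grad()

    # Inference: beam-search decode a correction for a new agent utterance.
    model.eval()
    ids = tokenizer("gec: we has send the rider", return_tensors="pt").input_ids
    out = model.generate(ids, max_new_tokens=40, num_beams=4)
    print(tokenizer.decode(out[0], skip_special_tokens=True))

Framing the task as full rewriting keeps fine-tuning aligned with the model's pretraining objective, which is what makes a small domain-specific corpus viable.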
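The reported 15.5% improvement is measured in GLEU, an n-gram overlap metric designed for grammatical error correction. As a rough illustration of scoring corrected outputs against reference corrections, the sketch below uses NLTK's gleu_score module; note that NLTK implements the Google-GLEU variant, while the GEC-specific GLEU of Napoles et al. (2015) also conditions on the uncorrected source sentence, so this is an approximation for quick sanity checks only. The tokenized sentences are made-up examples, not the paper's data.

    # Quick GLEU sanity check using NLTK's Google-GLEU implementation.
    from nltk.translate.gleu_score import corpus_gleu

    # One (or more) reference corrections per hypothesis, as token lists.
    references = [
        [["I", "have", "refunded", "your", "amount", "."]],
        [["Please", "wait", ",", "your", "order", "is", "arriving", "soon", "."]],
    ]
    hypotheses = [
        ["I", "have", "refunded", "your", "amount", "."],
        ["Please", "wait", "your", "order", "is", "arriving", "soon", "."],
    ]

    print(f"corpus GLEU: {corpus_gleu(references, hypotheses):.3f}")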
Keywords — Grammatical Error Correction (GEC), Language Models, Transfer Learning, Natural Language Understanding (NLU)