Khang Nhut Lam

Abstractive Text Summarization Using the BRIO Training Paradigm

May 23, 2023

Khang Nhut Lam, Thieu Gia Doan, Khang Thua Pham, Jugal Kalita

Abstract: Summary sentences produced by abstractive summarization models may be coherent and comprehensive, but they lack control and rely heavily on reference summaries. The BRIO training paradigm assumes a non-deterministic distribution to reduce the model's dependence on reference summaries and to improve model performance during inference. This paper presents a straightforward but effective technique to improve abstractive summaries by fine-tuning pre-trained language models and training them with the BRIO paradigm. We build a text summarization dataset for Vietnamese, called VieSum. We perform experiments with abstractive summarization models trained with the BRIO paradigm on the CNNDM and the VieSum datasets. The results show that the models, trained on basic hardware, outperform all existing abstractive summarization models, especially for Vietnamese.

* Findings of the Association for Computational Linguistics: ACL 2023
* 6 pages

Facial Expression Recognition and Image Description Generation in Vietnamese

Aug 12, 2022

Khang Nhut Lam, Kim-Ngoc Thi Nguyen, Loc Huu Nguy, Jugal Kalita

Abstract: This paper discusses a facial expression recognition model and a description generation model to build descriptive sentences for images and for the facial expressions of people in images. Our study shows that YOLOv5 achieves better results than a traditional CNN for all emotions on the KDEF dataset. In particular, the accuracies of the CNN and YOLOv5 models for emotion recognition are 0.853 and 0.938, respectively. A model for generating image descriptions based on a merged architecture is proposed, using VGG16 with the descriptions encoded by an LSTM model. YOLOv5 is also used to recognize the dominant colors of objects in the images and to correct the color words in the generated descriptions if necessary. If the description contains words referring to a person, we recognize the emotion of the person in the image. Finally, we combine the results of all models to create sentences that describe the visual content and the human emotions in the images. On the Flickr8k dataset in Vietnamese, the approach achieves BLEU-1, BLEU-2, BLEU-3, and BLEU-4 scores of 0.628, 0.425, 0.280, and 0.174, respectively.

* Fuzzy Systems and Data Mining VII: Proceedings of FSDM 2021 340 (2021): 63
* 7 pages
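
The BLEU-1 to BLEU-4 figures in the abstract above can be read more easily with the metric in front of you. The following is a minimal sketch of sentence-level BLEU with a single reference, uniform weights, and no smoothing. The function names and the toy sentences are illustrative assumptions; the paper's own evaluation script is not shown in the source and may differ (for example, corpus-level scoring or multiple references).

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count all n-grams of length n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def bleu(candidate, reference, max_n):
    """Sentence-level BLEU-max_n against one reference (no smoothing)."""
    precisions = []
    for n in range(1, max_n + 1):
        cand = ngrams(candidate, n)
        ref = ngrams(reference, n)
        # Clipped overlap: a candidate n-gram counts at most as often
        # as it appears in the reference.
        overlap = sum(min(count, ref[gram]) for gram, count in cand.items())
        total = sum(cand.values())
        if total == 0 or overlap == 0:
            return 0.0
        precisions.append(overlap / total)
    # Brevity penalty punishes candidates shorter than the reference.
    if len(candidate) >= len(reference):
        brevity_penalty = 1.0
    else:
        brevity_penalty = math.exp(1 - len(reference) / len(candidate))
    mean_log_precision = sum(math.log(p) for p in precisions) / max_n
    return brevity_penalty * math.exp(mean_log_precision)


# Toy example (illustrative data, not from the paper).
reference = "a dog runs on the grass".split()
candidate = "a dog runs on grass".split()
bleu_1 = bleu(candidate, reference, 1)
bleu_2 = bleu(candidate, reference, 2)
```

Because higher-order n-grams are harder to match, BLEU-n typically falls as n grows, which is the pattern the reported scores follow.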

Automatically Creating a Large Number of New Bilingual Dictionaries

Aug 12, 2022

Khang Nhut Lam, Feras Al Tarouti, Jugal Kalita

Abstract: This paper proposes approaches to automatically create a large number of new bilingual dictionaries for low-resource languages, especially resource-poor and endangered languages, from a single input bilingual dictionary. Our algorithms translate words in a source language into many target languages using available Wordnets and a machine translator (MT). Since our approaches rely on just one input dictionary, available Wordnets, and an MT, they are applicable to any bilingual dictionary as long as one of the two languages is English or has a Wordnet linked to the Princeton Wordnet. Starting with 5 available bilingual dictionaries, we create 48 new bilingual dictionaries. Of these, 30 language pairs are not supported by the popular MTs Google and Bing.

* Proceedings of the AAAI Conference on Artificial Intelligence, vol. 29, no. 1. 2015
* 7 pages
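
To make the pivoting idea in the abstract above concrete, here is a minimal sketch that composes one source-to-English dictionary with Wordnets linked through shared synset identifiers. Everything in it is an assumption for illustration: the function name, the synset identifier `toy-synset-house`, and the toy data. It omits the machine translator and any sense disambiguation, so it should not be read as the authors' algorithm.

```python
def build_dictionary(source_to_english, english_synsets, target_wordnet):
    """Map source words to target words by pivoting through English synsets.

    source_to_english: {source_word: [english_word, ...]}
    english_synsets:   {english_word: [synset_id, ...]}
    target_wordnet:    {synset_id: [target_word, ...]}
    """
    new_dictionary = {}
    for source_word, english_words in source_to_english.items():
        targets = set()
        for english_word in english_words:
            for synset_id in english_synsets.get(english_word, []):
                targets.update(target_wordnet.get(synset_id, []))
        if targets:
            new_dictionary[source_word] = sorted(targets)
    return new_dictionary


# Toy data (hypothetical identifiers and entries).
vietnamese_to_english = {"nhà": ["house"]}
english_synsets = {"house": ["toy-synset-house"]}
target_wordnets = {
    "fr": {"toy-synset-house": ["maison"]},
    "es": {"toy-synset-house": ["casa"]},
}

# One input dictionary yields one new dictionary per linked Wordnet.
new_dictionaries = {
    language: build_dictionary(vietnamese_to_english, english_synsets, wordnet)
    for language, wordnet in target_wordnets.items()
}
```

Taking every synset of an English word, as this sketch does, would pull in unrelated senses on real data, so a practical system needs some way to filter or rank the candidates.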

Using Artificial Intelligence and IoT for Constructing a Smart Trash Bin

Aug 12, 2022

Khang Nhut Lam, Nguyen Hoang Huynh, Nguyen Bao Ngoc, To Thi Huynh Nhu, Nguyen Thanh Thao, Pham Hoang Hao, Vo Van Kiet, Bui Xuan Huynh, Jugal Kalita

Abstract: The research reported in this paper transforms a normal trash bin into a smarter one by applying computer vision technology. With the support of sensors and actuator devices, the trash bin can automatically classify garbage. In particular, a camera on the trash bin takes pictures of the trash, then the central processing unit analyzes them and decides which bin to drop the trash into. Our trash bin system achieves an accuracy of 90%. In addition, our model is connected to the Internet to update the bin status for further management. A mobile application is developed for managing the bin.

* International Conference on Future Data and Security Engineering, pp. 427-435. Springer, Singapore, 2021
* 8 pages

Building a Chatbot on a Closed Domain using RASA

Aug 12, 2022

Khang Nhut Lam, Nam Nhat Le, Jugal Kalita

Abstract: In this study, we build a chatbot system in a closed domain with the RASA framework, using several models such as SVM for classifying intents, CRF for extracting entities, and LSTM for predicting actions. To improve responses from the bot, the kNN algorithm is used to map incorrectly extracted entities to correct ones. The knowledge domain of our chatbot is the College of Information and Communication Technology of Can Tho University, Vietnam. We manually construct a chatbot corpus with 19 intents, 441 sentence patterns of intents, 253 entities, and 133 stories. Experimental results show that the bot responds well to relevant questions.

* Proceedings of the 4th International Conference on Natural Language Processing and Information Retrieval, pp. 144-148. 2020
* 5 pages
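
The entity-correction step mentioned in the abstract above can be pictured as a nearest-neighbour lookup against a list of known entities. The sketch below uses a single nearest neighbour under string similarity from Python's standard `difflib` module. The function name, the cutoff, and the entity list are assumptions; the source does not say which features or distance the authors' kNN uses.

```python
import difflib


def correct_entity(extracted, known_entities, cutoff=0.6):
    """Return the closest known entity, or the input if nothing is close enough."""
    matches = difflib.get_close_matches(extracted, known_entities, n=1, cutoff=cutoff)
    return matches[0] if matches else extracted


# Hypothetical entity list for a university chatbot.
known_entities = ["computer science", "information systems"]
corrected = correct_entity("computer sciense", known_entities)
unchanged = correct_entity("xyz", known_entities)
```

Falling back to the original string when no candidate passes the cutoff keeps the sketch from replacing an unfamiliar but valid entity with an unrelated one.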

Creating Lexical Resources for Endangered Languages

Aug 08, 2022

Khang Nhut Lam, Feras Al Tarouti, Jugal Kalita

Abstract: This paper examines approaches for generating lexical resources for endangered languages. Our algorithms construct bilingual dictionaries and multilingual thesauruses using public Wordnets and a machine translator (MT). Since our work relies on only one bilingual dictionary between an endangered language and an "intermediate helper" language, it is applicable to languages that lack many existing resources.

* Proceedings of the 2014 Workshop on the Use of Computational Methods in the Study of Endangered Languages, pp. 54-62. 2014
* 9 pages

Automatically constructing Wordnet synsets

Aug 08, 2022

Khang Nhut Lam, Feras Al Tarouti, Jugal Kalita

Abstract: Manually constructing a Wordnet is a difficult task that requires years of expert time. As a first step toward automatically constructing full Wordnets, we propose approaches to generate Wordnet synsets for both resource-rich and resource-poor languages, using publicly available Wordnets, a machine translator, and/or a single bilingual dictionary. Our algorithms translate synsets of existing Wordnets to a target language T, then apply a ranking method to the translation candidates to find the best translations in T. Our approaches are applicable to any language that has at least one existing bilingual dictionary translating from English to it.

* Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), pp. 106-111. 2014
* 6 pages
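
The translate-then-rank step in the abstract above can be sketched with a simple voting scheme: each word of an English synset proposes translations, and a candidate ranks higher the more synset members propose it. This voting rule is an assumed stand-in, since the source does not specify the ranking method, and the function name and toy data are likewise hypothetical.

```python
from collections import Counter


def rank_synset_translations(synset_words, english_to_target):
    """Rank target-language candidates by how many synset members yield them."""
    votes = Counter()
    for english_word in synset_words:
        # Use a set so one word cannot vote twice for the same candidate.
        for candidate in set(english_to_target.get(english_word, [])):
            votes[candidate] += 1
    # Highest vote count first; ties broken alphabetically for determinism.
    return sorted(votes.items(), key=lambda item: (-item[1], item[0]))


# Toy English synset and toy English-to-French dictionary (illustrative only).
synset = ["car", "automobile", "machine"]
english_to_french = {
    "car": ["voiture", "wagon"],
    "automobile": ["voiture", "automobile"],
    "machine": ["machine", "voiture"],
}
ranked = rank_synset_translations(synset, english_to_french)
```

The intuition is that a translation shared by several members of a synset is more likely to express the synset's common sense than one proposed by a single ambiguous word.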

Creating Reverse Bilingual Dictionaries

Aug 08, 2022

Khang Nhut Lam, Jugal Kalita

Abstract: Bilingual dictionaries are expensive resources, and not many are available when one of the languages is resource-poor. In this paper, we propose algorithms for creating new reverse bilingual dictionaries from existing bilingual dictionaries in which English is one of the two languages. Our algorithms exploit the similarity between word-concept pairs using the English Wordnet to produce reverse dictionary entries. Since our algorithms rely on available bilingual dictionaries, they are applicable to any bilingual dictionary as long as one of the two languages has a Wordnet-type lexical ontology.

* Proceedings of the 2013 conference of the North American chapter of the Association for Computational Linguistics: Human Language Technologies, pp. 524-528. 2013
* 5 pages
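
As a rough picture of what a reverse dictionary is, the sketch below inverts a source-to-English mapping and optionally expands each English headword with related words. The `synonyms` argument is a hypothetical stand-in for Wordnet lookups, and the function name and toy entries are assumptions; the similarity computation between word-concept pairs described in the abstract above is not reproduced here.

```python
from collections import defaultdict


def reverse_dictionary(forward, synonyms=None):
    """Invert {source_word: [english, ...]} into {english: [source_word, ...]}.

    synonyms: optional {english_word: [related_english_word, ...]} used to
    add extra English headwords (a toy substitute for Wordnet).
    """
    synonyms = synonyms or {}
    reverse = defaultdict(set)
    for source_word, translations in forward.items():
        for english_word in translations:
            reverse[english_word].add(source_word)
            for related in synonyms.get(english_word, []):
                reverse[related].add(source_word)
    return {english: sorted(sources) for english, sources in reverse.items()}


# Toy Vietnamese-to-English entries (illustrative only).
toy_forward = {
    "nhà": ["house", "home"],
    "ngôi nhà": ["house"],
    "căn hộ": ["apartment", "flat"],
}
toy_synonyms = {"house": ["dwelling"]}
reversed_entries = reverse_dictionary(toy_forward, toy_synonyms)
```

Plain inversion only recovers English headwords that already appear as translations, which is why some form of lexical expansion is needed to reach headwords such as `dwelling` in this toy example.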

Phrase translation using a bilingual dictionary and n-gram data: A case study from Vietnamese to English

Aug 05, 2022

Khang Nhut Lam, Feras Al Tarouti, Jugal Kalita

Abstract: Past dictionary-based approaches to translating a phrase from a language L1 into a language L2 require grammar rules to restructure the initial translations. This paper introduces a novel method that translates a given phrase in L1, which does not exist in the dictionary, into L2 without using any grammar rules. We require at least one L1-L2 bilingual dictionary and n-gram data in L2. The average manual evaluation score of our translations is 4.29/5.00, which indicates very high quality.

* Proceedings of the 11th Workshop on Multiword Expressions, pp. 65-69. 2015
* 5 pages
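
One way to translate a phrase with only a dictionary and target-language n-gram data, as the abstract above describes, is to generate every combination and ordering of word translations and keep the one that the n-gram data attests most often. The sketch below follows that assumption. The function name, the whitespace tokenization, and the toy counts are illustrative, and the paper's actual candidate generation and scoring may differ.

```python
from itertools import permutations, product


def translate_phrase(phrase, dictionary, ngram_counts):
    """Pick the candidate translation with the highest n-gram count.

    dictionary:   {l1_word: [l2_word, ...]}
    ngram_counts: {l2_phrase: count}
    Returns (best_candidate, count), or (None, 0) if a word is missing.
    """
    options = []
    for word in phrase.split():
        translations = dictionary.get(word)
        if not translations:
            return None, 0
        options.append(translations)
    best, best_count = None, -1
    # Try every choice of word translations in every word order, so no
    # grammar rules are needed to restructure the output.
    for choice in product(*options):
        for ordering in permutations(choice):
            candidate = " ".join(ordering)
            count = ngram_counts.get(candidate, 0)
            if count > best_count:
                best, best_count = candidate, count
    return best, best_count


# Toy Vietnamese-English dictionary and toy English bigram counts.
toy_dictionary = {"nhà": ["house", "home"], "trắng": ["white"]}
toy_counts = {"white house": 120, "house white": 3, "white home": 5}
result = translate_phrase("nhà trắng", toy_dictionary, toy_counts)
```

The number of candidates grows factorially with phrase length, so this exhaustive search is only reasonable for short phrases.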
