Agricultural question classification model based on BERT word vector and TextCNN

BAO Tong; LUO Rui; GUO Ting; GUI Shu-ting; REN Ni

doi:10.3969/j.issn.2095-1191.2022.07.031

BAO Tong, LUO Rui, GUO Ting, GUI Shu-ting, REN Ni. 2022: Agricultural question classification model based on BERT word vector and TextCNN. Journal of Southern Agriculture, 53(7): 2068-2076. DOI: 10.3969/j.issn.2095-1191.2022.07.031


Abstract

【Objective】To study the effects of different word vectors and deep learning models on the classification of agricultural questions, so as to provide technical support for building an agricultural intelligent question answering system.【Method】Question-and-answer data were crawled from websites such as the Agricultural Planting Network, and 20 thousand questions were selected and manually annotated to construct a classification corpus of agricultural questions. Bidirectional encoder representations from transformers (BERT) was used to encode the agricultural questions, and a text convolutional neural network (TextCNN) was used to extract high-dimensional features from the questions in order to classify them.【Result】In the word vector comparison experiment, when BERT word vectors were combined with TextCNN, the F1 value of agricultural question classification reached 93.32%, which was 2.10 percentage points higher than that obtained with Word2vec. In the comparison of deep learning models, TextCNN combined with Word2vec and with BERT reached F1 values of 91.22% and 93.32%, respectively, both higher than those of the other models. In the subdivision experiment of agricultural questions, the F1 values of BERT-TextCNN in the classification of cultivation technology, field management, soil, fertilizer and water management reached 86.06%, 90.56%, 95.04% and 85.55%, respectively, which were higher than those of the other deep learning models. In terms of hyperparameter settings, the BERT-TextCNN agricultural question classification model performed best when the convolution kernel sizes were set to 3, 4 and 5, the learning rate was set to 5e-5, and the number of iterations was set to 5. In the case of unbalanced data samples, the average classification accuracy of agricultural questions could still reach more than 93.00%, which could meet the question classification requirements of the agricultural intelligent question answering system.【Suggestion】The quality of data annotation can be improved through open source platforms such as Ali NLP; classification accuracy can be improved by supplementing word frequency and document features in the classification process; agriculture-related government departments need to strengthen cooperation to explore new models for the digital popularization and service of agricultural technology.
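
The feature-extraction step described in the abstract (convolution windows of width 3, 4 and 5 over encoded question tokens, followed by pooling and classification) can be illustrated with a minimal forward-pass sketch. This is not the authors' implementation: it uses only the Python standard library, random vectors stand in for the BERT token encodings, and the embedding size, number of filters per width, number of classes and all function names are assumptions chosen for illustration. Training (the reported learning rate of 5e-5 and 5 iterations) is not shown.

```python
import math
import random

KERNEL_SIZES = (3, 4, 5)  # convolution window widths reported in the abstract
NUM_FILTERS = 2           # filters per width (illustrative assumption)
EMBED_DIM = 8             # stand-in for the BERT hidden size (assumption)
NUM_CLASSES = 4           # illustrative number of question categories (assumption)


def conv_max_pool(tokens, kernel, bias):
    """Slide one filter over the token sequence, apply ReLU, then max-pool over time."""
    width = len(kernel)
    best = 0.0  # ReLU output is never below zero
    for start in range(len(tokens) - width + 1):
        total = bias
        for offset in range(width):
            total += sum(w * x for w, x in zip(kernel[offset], tokens[start + offset]))
        best = max(best, max(0.0, total))
    return best


def softmax(scores):
    """Convert raw class scores into probabilities that sum to one."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    norm = sum(exps)
    return [e / norm for e in exps]


def init_params(rng):
    """Create random (untrained) filters for each width and a dense output layer."""
    filters = {
        k: [([[rng.uniform(-0.5, 0.5) for _ in range(EMBED_DIM)] for _ in range(k)], 0.0)
            for _ in range(NUM_FILTERS)]
        for k in KERNEL_SIZES
    }
    feature_dim = NUM_FILTERS * len(KERNEL_SIZES)
    dense = [[rng.uniform(-0.5, 0.5) for _ in range(feature_dim)]
             for _ in range(NUM_CLASSES)]
    return filters, dense


def textcnn_forward(tokens, filters, dense):
    """Concatenate the pooled features of all filters, then score each class."""
    features = [conv_max_pool(tokens, kernel, bias)
                for k in KERNEL_SIZES
                for kernel, bias in filters[k]]
    logits = [sum(w * f for w, f in zip(row, features)) for row in dense]
    return features, softmax(logits)


rng = random.Random(0)
filters, dense = init_params(rng)
# Ten random token vectors stand in for the BERT encoding of one question.
question = [[rng.uniform(-1.0, 1.0) for _ in range(EMBED_DIM)] for _ in range(10)]
features, probs = textcnn_forward(question, filters, dense)
print(len(features), round(sum(probs), 6))
```

Each window width contributes its own set of pooled features, so the classifier sees evidence from 3-, 4- and 5-token phrases at once. In a practical system the random vectors would be replaced by the outputs of a pretrained BERT encoder and the weights would be learned from the annotated corpus.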
