Document Type
Article
Publication Date
3-20-2025
Abstract
Business, political, and other social structures create strong motivation to understand the attitudes, motivations, feelings, and emotions of a population of interest. Social media is a rich source of information self-disclosed by individuals from all walks of life about virtually every domain of the human experience, but the vast quantity of data is impossible to analyze effectively without advanced natural language processing algorithms. This research creates a transfer-learning-based emotion classification model for Indonesian-language Twitter data. Transfer learning consists of two steps: pre-training and fine-tuning. Three variations of Indonesian Bidirectional Encoder Representations from Transformers (IndoBERT) are tested, with hyperparameters tuned via a designed experiment. The top IndoBERT model, tested on an open-source corpus of 4,401 labeled Indonesian Tweets, outperforms all known prior studies with an F1 score of approximately 0.791. Additionally, this research explores the relationship between training set size and model validity for fine-tuning of the transfer learning models; models are trained on datasets ranging from 100 to 3,900 observations and then validated on five unique test sets. Results indicate that as few as 1,000 observations can obtain results comparable to using the full training corpus.
DOI
10.1007/s13278-025-01439-6
Source Publication
Social Network Analysis and Mining (ISSN 1869-5469)
Recommended Citation
Shaw, C., LaCasse, P. & Champagne, L. Exploring emotion classification of indonesian tweets using large scale transfer learning via IndoBERT. Soc. Netw. Anal. Min. 15, 22 (2025). https://doi.org/10.1007/s13278-025-01439-6
Included in
Applied Behavior Analysis Commons, Artificial Intelligence and Robotics Commons, Social Media Commons
Comments
© 2025 The Authors.
This article is published by Springer, licensed under a Creative Commons Attribution 4.0 International License (CC BY 4.0), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
Sourced from the published version of record cited above.