%0 Journal Article
%T Enhancing BERTopic with Pre-Clustered Knowledge: Reducing Feature Sparsity in Short Text Topic Modeling
%A Qian Wang
%A Biao Ma
%J Journal of Data Analysis and Information Processing
%P 597-611
%@ 2327-7203
%D 2024
%I Scientific Research Publishing
%R 10.4236/jdaip.2024.124032
%X Modeling topics in short texts presents significant challenges due to feature sparsity, particularly when analyzing content generated by large-scale online users. This sparsity can substantially impair semantic capture accuracy. We propose a novel approach that incorporates pre-clustered knowledge into the BERTopic model while reducing the l2 norm for low-frequency words. Our method effectively mitigates feature sparsity during cluster mapping. Empirical evaluation on the StackOverflow dataset demonstrates that our approach outperforms baseline models, achieving superior Macro-F1 scores. These results validate the effectiveness of our proposed feature sparsity reduction technique for short-text topic modeling.
%K Topic Model
%K BERTopic
%K Short Text
%K Feature Sparsity
%K Cluster
%U http://www.scirp.org/journal/PaperInformation.aspx?PaperID=137513