Topic modeling is one of the most powerful techniques of unsupervised learning🤖

Let's implement a topic modeling pipeline without going into the heavy statistics behind it⚡

A 🧵👇
This thread is derived from my newest article on cyberbullying detection using topic modeling and sentiment analysis

Check out the article for a more detailed explanation with code👇
www.analyticsvidhya.com/blog/2023/03/detect-cyberbullying-using-topic-modeling-and-sentiment-analysis...
1) Load the Twitter dataset

Preprocess the dataset using lemmatization and stop word removal

2) Select the corpora from the Twitter dataset

Write the preprocess function in a Pythonic way🐍
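
A minimal sketch of steps 1 and 2, not the article's exact code. The in-memory CSV, the column names `tweet_text` and `label`, the tiny stop word list, and the `identity` placeholder lemmatizer are all assumptions made so the snippet runs with the standard library alone. For real lemmatization you could pass something like NLTK's `WordNetLemmatizer().lemmatize` as the `lemmatize` argument (it needs the WordNet data downloaded first).

```python
import csv
import io
import re

# Tiny stand-in for the Twitter dataset; the column names are hypothetical.
SAMPLE_CSV = """tweet_text,label
"You are such a loser, nobody likes you",bullying
"Had a great day at the park with friends",not_bullying
"""

# Deliberately small stop word list for illustration only; a real pipeline
# would use a fuller list from an NLP library.
STOP_WORDS = {"a", "an", "the", "is", "are", "at", "with", "you", "such", "had"}


def identity(token):
    """Placeholder lemmatizer: returns the token unchanged."""
    return token


def preprocess(text, lemmatize=identity, stop_words=STOP_WORDS):
    """Lowercase, tokenize, drop stop words and very short tokens, lemmatize."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [lemmatize(t) for t in tokens if t not in stop_words and len(t) > 2]


# Step 1: load the dataset (here from an in-memory string).
rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
# Step 2: select the text column as the corpus, then preprocess each document.
corpus = [row["tweet_text"] for row in rows]
processed = [preprocess(doc) for doc in corpus]
print(processed)
```

The list comprehension inside `preprocess` is the "Pythonic" part: filtering and lemmatizing happen in a single pass, with no manual loop or temporary list.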
3) Set up a topic modeling pipeline

We use the LDA (Latent Dirichlet Allocation) model from the gensim library

See the code below👇
4) Visual Interpretation of Topic Modeling Output
The output of the above code👇
Check out the full article + code using the link below

Don't forget to share your feedback on it👇
www.analyticsvidhya.com/blog/2023/03/detect-cyberbullying-using-topic-modeling-and-sentiment-analysis...
Check out the GitHub repository for more👇

Feel free to open an issue if you find a problem
github.com/avikumart/Cyberbullying-Detection-NLP