I am working on a use case that requires predicting polarity, but the results are not accurate. Our main focus is on negative inputs, and TextBlob is unable to identify them with confidence.
I went through the GitHub code base to understand exactly how the algorithm predicts sentiment, but could not get a clear picture.
So I have 3 questions:
- Can we modify and retrain the algorithm by passing more training data? If yes, how can we do that?
- As I understand it, TextBlob's sentiment analysis uses Naive Bayes, but what I want to understand is which steps happen after passing the data to `tb = TextBlob(data)` and then calling `tb.sentiment` on it. I would really appreciate a detailed list of the steps, including preprocessing, etc.
- I am performing the following preprocessing steps before passing the data to TextBlob:
  - removing numbers, dates, months, URLs, hashtags, mentions, etc.
  - lowercasing
  - removing punctuation marks
  - stop word removal, and converting negative words like `don't` to just `not`, as `do` is a stop word, etc.
Can you suggest whether removing or adding any of the above steps will lead to greater confidence and accuracy in polarity prediction?
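
To make the question concrete, here is a minimal sketch of the preprocessing steps listed above, using only the Python standard library. The regular expressions, the tiny stop-word set, and the function name `preprocess` are illustrative assumptions, not the exact code in use; a real pipeline would likely use a fuller stop-word list and more careful date handling.

```python
import re
import string

# Illustrative stop-word set (assumption). Note that "not" is deliberately absent.
STOP_WORDS = {"a", "an", "the", "is", "are", "do", "does", "did",
              "to", "of", "in", "and", "i", "it", "this"}

URL_RE = re.compile(r"https?://\S+|www\.\S+")
TAG_RE = re.compile(r"[#@]\w+")                 # hashtags and mentions
NUMBER_RE = re.compile(r"\d+(?:[./:-]\d+)*")    # plain numbers and numeric dates
# Full month names only; removing "may" also removes the modal verb.
MONTH_RE = re.compile(
    r"\b(?:january|february|march|april|may|june|july|august|"
    r"september|october|november|december)\b",
    re.IGNORECASE,
)
# Negative contractions such as don't, isn't, can't (straight or curly apostrophe).
NEGATION_RE = re.compile(r"\b\w+n['’]t\b", re.IGNORECASE)
# Replace every ASCII punctuation character with a space.
PUNCT_TABLE = str.maketrans(string.punctuation, " " * len(string.punctuation))


def preprocess(text):
    """Apply the preprocessing steps described above, in order."""
    text = URL_RE.sub(" ", text)
    text = TAG_RE.sub(" ", text)
    text = NUMBER_RE.sub(" ", text)
    text = MONTH_RE.sub(" ", text)
    text = text.lower()
    # Collapse negations to "not" before the apostrophe is stripped,
    # so the negation survives stop-word removal.
    text = NEGATION_RE.sub(" not ", text)
    text = text.translate(PUNCT_TABLE)
    tokens = [tok for tok in text.split() if tok not in STOP_WORDS]
    return " ".join(tokens)


print(preprocess("I don't like this phone, check https://example.com #fail @bob 12/05/2020"))
# -> not like phone check
```

The returned string is what would then be passed to `TextBlob(data)`. One design choice worth noting in this sketch: negations are rewritten before punctuation removal, since stripping the apostrophe first would leave fragments like `don t` that no longer match.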