sentiment analysis v

Scraping customer reviews from an e-commerce site and analyzing their sentiment with the Vader library.

Introduction

Sentiment analysis is a powerful technique in Natural Language Processing (NLP) that allows us to determine the sentiment or emotional tone of a given text. In this mini-project, we will explore how to perform sentiment analysis using the Vader library in Python. We will scrape customer reviews from a website, save them to an Excel file, and then apply sentiment analysis using the Vader library to categorize the reviews as positive, negative, or neutral.

Here is the step-by-step explanation.

Step 1: Importing the necessary libraries and resources

To begin, we import the required libraries: requests, BeautifulSoup, pandas, and nltk. We also download the 'vader_lexicon' resource from the NLTK library, which is necessary for sentiment analysis using Vader.

import requests
from bs4 import BeautifulSoup
import pandas as pd
from nltk.sentiment.vader import SentimentIntensityAnalyzer
import nltk
nltk.download('vader_lexicon')

Step 2: Scraping the customer reviews

We start by defining the URL of the webpage containing the customer reviews. Using the requests library, we send a GET request to the URL and retrieve the HTML content. We then create a BeautifulSoup object to parse the HTML content.

url = 'https://www.flipkart.com/hamtex-polycotton-double-bed-cover/product-reviews/itma5c9f08efe504?pid=BCVG2ZGSDZ3WSGTF&lid=LSTBCVG2ZGSDZ3WSGTFDBZ9IO&marketplace=FLIPKART'

response = requests.get(url)
content = response.content

soup = BeautifulSoup(content, 'html.parser')
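A plain GET request may be rejected or may hang on some sites, so the fetch can be made a little more defensive. The following is a minimal sketch, not part of the original project: the function name fetch_html, the DEFAULT_HEADERS value, and the 10-second timeout are all assumptions chosen for illustration.

```python
import requests

# Hypothetical header value; some sites respond differently to requests
# that do not send a browser-like User-Agent.
DEFAULT_HEADERS = {"User-Agent": "Mozilla/5.0 (compatible; review-scraper-example)"}


def fetch_html(url, session=None, timeout=10):
    """Return the raw HTML bytes for url, raising on HTTP error statuses."""
    # Use the given session if there is one, otherwise the requests module itself.
    http = requests if session is None else session
    response = http.get(url, headers=DEFAULT_HEADERS, timeout=timeout)
    # Raise requests.HTTPError for 4xx/5xx responses instead of parsing an error page.
    response.raise_for_status()
    return response.content
```

With this helper, the content variable from the step above could be obtained as content = fetch_html(url) and passed to BeautifulSoup unchanged.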

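Step 3: Extracting the reviews

Next, we identify the container element that holds the reviews using its class name. We find all the review divs within the container and iterate over them. For each review div, we extract the text from the div nested two levels inside it (the third div, counting the review div itself), strip the surrounding whitespace, and store it in a list. Class names like these are generated by the site and can change, so they may need updating.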

reviews_container = soup.find('div', {'class': '_1YokD2 _3Mn1Gg col-9-12'})

review_divs = reviews_container.find_all('div', {'class': 't-ZTKy'})

reviews = []
for child in review_divs:
    third_div = child.div.div
    text = third_div.text.strip()
    reviews.append(text)
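If the page layout changes, soup.find() returns None and the loop above fails with an AttributeError. The sketch below shows one way to guard against that, run on a small made-up HTML snippet so it works offline. The function name extract_reviews and the SAMPLE_HTML content are assumptions for illustration; only the class names come from the code above.

```python
from bs4 import BeautifulSoup

# Made-up markup that mimics the nesting described above (not real page content).
SAMPLE_HTML = """
<div class="_1YokD2 _3Mn1Gg col-9-12">
  <div class="t-ZTKy"><div><div>Good quality bed cover.</div></div></div>
  <div class="t-ZTKy"><div><div>  Colour faded after one wash.  </div></div></div>
  <div class="t-ZTKy"><div></div></div>
</div>
"""


def extract_reviews(html):
    """Return review texts, skipping review divs that lack the expected nesting."""
    soup = BeautifulSoup(html, "html.parser")
    container = soup.find("div", {"class": "_1YokD2 _3Mn1Gg col-9-12"})
    if container is None:
        # The container class was not found, for example after a site redesign.
        return []
    reviews = []
    for review_div in container.find_all("div", {"class": "t-ZTKy"}):
        outer = review_div.div
        inner = outer.div if outer is not None else None
        if inner is not None:
            reviews.append(inner.get_text(strip=True))
    return reviews


print(extract_reviews(SAMPLE_HTML))
```

The third sample review has no inner div, so it is skipped rather than raising an error.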

Step 4: Saving the reviews to an Excel file

We create a pandas DataFrame from the collected reviews and save it to an Excel file using the to_excel() function. This gives us a structured dataset for further analysis. Writing .xlsx files with pandas requires the openpyxl package to be installed.

# Save the reviews to an Excel file in current directory
data = pd.DataFrame({'review': reviews})
data.to_excel('reviews.xlsx', index=False)

Step 5: Performing sentiment analysis using Vader

We define a helper function, sentiment_Vader, which scores a given text with a SentimentIntensityAnalyzer instance and categorizes it based on the compound score: 0.05 or above is positive, -0.05 or below is negative, and anything in between is neutral. The function uses an analyzer named sid, which is created in Step 6 and must exist before the function is called.

def sentiment_Vader(text):
    over_all_polarity = sid.polarity_scores(text)
    if over_all_polarity['compound'] >= 0.05:
        return "positive"
    elif over_all_polarity['compound'] <= -0.05:
        return "negative"
    else:
        return "neutral"
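The polarity_scores() method returns a dictionary with the keys 'neg', 'neu', 'pos', and 'compound', where the compound score lies between -1 and 1. To see how the thresholds behave at the boundaries, here is a small sketch that applies the same rule to a few hand-picked compound values. The function name label_from_compound and the sample scores are assumptions for illustration, not Vader output.

```python
def label_from_compound(compound, threshold=0.05):
    """Map a compound score to a label using the same cut-offs as sentiment_Vader."""
    if compound >= threshold:
        return "positive"
    if compound <= -threshold:
        return "negative"
    return "neutral"


# Hand-picked scores, including the two boundary values.
for score in (0.62, 0.05, 0.0, -0.04, -0.05):
    print(score, label_from_compound(score))
# 0.62 positive
# 0.05 positive
# 0.0 neutral
# -0.04 neutral
# -0.05 negative
```

Both boundary values are inclusive, so a score of exactly 0.05 or -0.05 is not treated as neutral.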

Step 6: Applying sentiment analysis to the reviews

Using the sentiment_Vader function, we apply sentiment analysis to each review in the DataFrame by creating a new column called ‘polarity’. The polarity is determined based on the sentiment score returned by the Vader library.

# Create the analyzer used by sentiment_Vader, then label each review
sid = SentimentIntensityAnalyzer()
data['polarity'] = data['review'].apply(sentiment_Vader)
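Once the polarity column exists, a quick tally shows how the reviews split across the three labels. This is a minimal sketch under the assumption that the DataFrame has a 'polarity' column as created above; the function name summarize_polarity and the sample rows are made up for illustration.

```python
import pandas as pd


def summarize_polarity(data):
    """Return the count and share of each label in the 'polarity' column."""
    counts = data["polarity"].value_counts()
    shares = (counts / counts.sum()).round(2)
    return pd.DataFrame({"count": counts, "share": shares})


# Made-up labels standing in for the real results.
sample = pd.DataFrame(
    {"polarity": ["positive", "positive", "negative", "neutral"]}
)
print(summarize_polarity(sample))
```

On the real data this could be called as summarize_polarity(data) before saving the results.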

Step 7: Saving the result to an Excel file

Finally, we save the updated DataFrame with the sentiment analysis results to a new Excel file using the to_excel() function. This file will contain the original reviews along with their corresponding sentiment polarity.

# to_excel() writes the file and returns None, so there is nothing to assign
data.to_excel('G:/..../sentiment_result.xlsx')

The full code is available on my GitHub account. Please feel free to visit.

Conclusion

In this mini-project, we scraped customer reviews, saved them to an Excel file, and used the Vader library to label each review as positive, negative, or neutral. The same approach can be applied to other kinds of text, such as social media posts and customer feedback. Because Vader is a lexicon- and rule-based analyzer, it needs no training data, which makes it a quick way to label large volumes of text.

By Akshay Tekam

Software developer, data science enthusiast, content creator.
