
Commit

revised
Signed-off-by: Finbarrs Oketunji <f@finbarrs.eu>
0xnu committed Jul 27, 2023
1 parent f566e89 commit 0ec30c4
Showing 4 changed files with 5 additions and 13 deletions.
2 changes: 1 addition & 1 deletion LONG_DESCRIPTION.rst
@@ -5,7 +5,7 @@ Amazon Products Scraper
:target: https://badge.fury.io/py/amazon-scrape
:alt: amazon-scrape Python Package Version

-Scrape Amazon product data such as Product Name, Product Images, Product URL, Number of Reviews, ASIN, Rating Count, and Price.
+Scrape Amazon product data such as Product Name, Product Images, Product URL, Number of Reviews, ASIN, and Price.

Requirements
------------
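After this commit each scraped record carries six fields instead of seven (the rating count column is gone). A minimal sketch of reading the resulting CSV with the standard library, assuming an output file named products.csv; the actual filename is not shown in this diff:

```python
import csv

# The output filename below is an assumption; the diff does not show where
# self.writer points. The column names come from the writerow header in
# start_scraping() after this commit.
with open("products.csv", newline="", encoding="utf-8") as f:
    for row in csv.DictReader(f):
        print(row["product_name"], row["price"], row["asin"])
```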
2 changes: 1 addition & 1 deletion README.md
@@ -2,7 +2,7 @@

[![PyPI version](https://badge.fury.io/py/amazon-scrape.svg)](https://badge.fury.io/py/amazon-scrape)

-Scrape Amazon product data such as Product Name, Product Images, Product URL, Number of Reviews, ASIN, Rating Count, and Price.
+Scrape Amazon product data such as Product Name, Product Images, Product URL, Number of Reviews, ASIN, and Price.

## Requirements

12 changes: 2 additions & 10 deletions amazon_scraper/scraper.py
@@ -44,7 +44,7 @@ def __init__(self, locale="co.uk", keyword=None, url=None, api_key=None, pages=2
self.locale = locale

def start_scraping(self):
-self.writer.writerow(["product_name", "product_images", "rating_count", "price", "product_url", "number_of_reviews", "asin"])
+self.writer.writerow(["product_name", "product_images", "price", "product_url", "number_of_reviews", "asin"])
for page in range(1, self.pages + 1):
url = self.url + "&page=" + str(page)
headers = {"User-Agent": random.choice(self.user_agents)}
@@ -67,13 +67,6 @@ def start_scraping(self):
else:
images = []

-# Rating count
-rating_count = product.find("span", {"class": "a-size-base"})
-if rating_count is not None:
-rating_count = rating_count.text
-else:
-rating_count = ''

# Price
price = product.find("span", {"class": "a-offscreen"})
if price is not None:
@@ -99,13 +92,12 @@ def start_scraping(self):
asin = product_url.split("/dp/")[1].split("/")[0] if "/dp/" in product_url else ''

# Write to CSV
-self.writer.writerow([name, ", ".join(images), rating_count, price, product_url, number_of_reviews, asin])
+self.writer.writerow([name, ", ".join(images), price, product_url, number_of_reviews, asin])

# Add to JSON data
self.json_data.append({
"product_name": name,
"product_images": images,
"rating_count": rating_count,
"price": price,
"product_url": product_url,
"number_of_reviews": number_of_reviews,
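Unchanged by this commit, but visible in the hunk above, is the ASIN extraction, which splits the product URL around "/dp/". A small illustration of the same split logic; the URL below is a made-up example, not one produced by the scraper:

```python
# Same expression as the asin line in scraper.py; example URL is invented.
product_url = "https://www.amazon.co.uk/Example-Product/dp/B0EXAMPLE12/ref=sr_1_1"
asin = product_url.split("/dp/")[1].split("/")[0] if "/dp/" in product_url else ''
print(asin)  # -> B0EXAMPLE12
```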
2 changes: 1 addition & 1 deletion setup.py
@@ -23,7 +23,7 @@
setup(
name=NAME,
version=VERSION,
description="Scrape Amazon product data such as Product Name, Product Images, Product URL, Number of Reviews, ASIN, Rating Count, and Price.",
description="Scrape Amazon product data such as Product Name, Product Images, Product URL, Number of Reviews, ASIN, and Price.",
long_description=long_description,
long_description_content_type="text/x-rst",
author="Finbarrs Oketunji",
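For context, the __init__ signature shown in the scraper.py hunk header suggests usage roughly like the sketch below. The class name AmazonScraper and the import path are assumptions, since neither appears in this diff:

```python
# Sketch only: the class name and import are guesses based on the
# amazon_scraper/scraper.py path; the keyword arguments mirror the
# __init__ signature visible in the hunk header above.
from amazon_scraper.scraper import AmazonScraper  # hypothetical name

scraper = AmazonScraper(locale="co.uk", keyword="wireless earbuds", pages=2)
scraper.start_scraping()  # writes the six CSV columns and appends to json_data
```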
