How to Scrape Amazon Reviews: Complete Guide (2026)

Amazon reviews are a goldmine of customer intelligence. For any popular product, thousands of customers have documented exactly what they love, what they hate, and what they wish were different. Businesses use scraped review data for sentiment analysis, product development, competitive intelligence, and marketing.

This guide covers how Amazon review scraping works, what data you can extract, and how to use it effectively.

What Data Can You Extract from Amazon Reviews?

Amazon review pages contain rich, structured data:

Field	Description
Review title	Short headline written by reviewer
Review body	Full review text
Star rating	1–5 stars
Reviewer name	Display name (public)
Review date	Date review was submitted
Verified purchase	Whether Amazon has confirmed that the reviewer bought the product on Amazon
Helpful votes	Number of users who found the review helpful
Total votes	Total votes on the review
Reviewer location	Country of reviewer (sometimes shown)
Vine review	Whether it's an Amazon Vine programme review
Review images	Customer-uploaded images with the review
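
If you want to hold these fields in code, a small record type is one option. The sketch below is an assumption about how you might model a review, not a schema that Amazon publishes; the class name Review and all of its field names are hypothetical.

```python
from dataclasses import dataclass, field, asdict
from datetime import date
from typing import Optional


@dataclass
class Review:
    """One scraped review. Field names are illustrative, not an Amazon schema."""
    title: str
    body: str
    rating: int                                # 1-5 stars
    reviewer_name: str
    review_date: date
    verified_purchase: bool = False
    helpful_votes: int = 0
    total_votes: Optional[int] = None          # not always available
    reviewer_location: Optional[str] = None    # only sometimes shown
    vine: bool = False
    image_urls: list[str] = field(default_factory=list)

    def __post_init__(self) -> None:
        # Reject ratings outside the 1-5 star range early.
        if not 1 <= self.rating <= 5:
            raise ValueError(f"rating must be 1-5, got {self.rating}")


example = Review(
    title="Works as described",
    body="Battery lasts about two days.",
    rating=4,
    reviewer_name="J. Doe",
    review_date=date(2025, 1, 15),
    verified_purchase=True,
)
record = asdict(example)  # plain dict, ready for JSON or CSV export
```

Validating the rating at construction time means a parsing mistake surfaces immediately instead of skewing later analysis.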

Why Businesses Scrape Amazon Reviews

1. Competitive Product Intelligence

Read your competitors' 1-star and 2-star reviews. They are free customer research: they tell you exactly what the market wants that competitors aren't delivering.

2. Sentiment Analysis

With hundreds of reviews scraped, you can run NLP analysis to identify:

Most common complaints (negative sentiment clusters)
Most praised features (positive sentiment clusters)
Feature gaps mentioned repeatedly

3. Review Monitoring for Your Own Products

Get alerted to new negative reviews faster than checking manually. A sudden spike in 1-star reviews often signals a product defect or fulfilment issue.
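
A spike check of this kind can be sketched in a few lines, assuming you already have each review's date and star rating. The function name one_star_spike_days and its thresholds (a 7-day window, a factor of 3, a minimum of 3 reviews) are hypothetical starting points, not tuned values.

```python
from collections import Counter
from datetime import date, timedelta


def one_star_spike_days(reviews, window=7, factor=3.0, min_count=3):
    """Return days whose 1-star count exceeds `factor` x the trailing daily mean.

    `reviews` is an iterable of (review_date, star_rating) pairs.
    """
    daily = Counter(d for d, stars in reviews if stars == 1)
    if not daily:
        return []

    spikes = []
    day, last = min(daily), max(daily)
    while day <= last:
        # Mean 1-star count over the `window` days before `day`.
        prior = [daily.get(day - timedelta(days=i), 0) for i in range(1, window + 1)]
        baseline = sum(prior) / window
        count = daily.get(day, 0)
        if count >= min_count and count > factor * baseline:
            spikes.append(day)
        day += timedelta(days=1)
    return spikes


# One 1-star review per day for a week, then five on the eighth day.
sample = [(date(2025, 1, d), 1) for d in range(1, 8)]
sample += [(date(2025, 1, 8), 1)] * 5
spike_days = one_star_spike_days(sample)
```

The min_count floor keeps a single bad review on a quiet product from triggering an alert.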

4. Marketing Copy

The language customers use in positive reviews is your best marketing copy. It reflects how real buyers describe the benefits — use it in your own listing and ad copy.
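
One simple way to surface that language is to count the short phrases that recur in 4-star and 5-star reviews. This is a rough sketch, assuming each review is a dict with a numeric rating and a body string; the function name top_phrases is hypothetical.

```python
import re
from collections import Counter


def top_phrases(reviews, min_rating=4, n=2, top_n=10):
    """Count the most frequent n-word phrases in reviews rated >= min_rating."""
    counts = Counter()
    for review in reviews:
        if review["rating"] < min_rating:
            continue
        words = re.findall(r"[a-z']+", review["body"].lower())
        # Slide a window of n words across the review text.
        for i in range(len(words) - n + 1):
            counts[" ".join(words[i:i + n])] += 1
    return counts.most_common(top_n)


sample = [
    {"rating": 5, "body": "Easy to clean and easy to store"},
    {"rating": 4, "body": "So easy to clean!"},
    {"rating": 1, "body": "Not easy to clean"},
]
phrases = top_phrases(sample, top_n=2)
```

Raising n to 3 or 4 tends to produce phrases that read more like usable copy, at the cost of lower counts.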

5. Fake Review Detection

Analyse review patterns to spot review manipulation by competitors, such as sudden bursts of 5-star reviews, a high share of unverified purchases, or near-identical wording across reviews.
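
For the wording signal, one possible check is Jaccard similarity between the word sets of each pair of reviews. The sketch below is only a heuristic under that assumption; the 0.7 threshold and the function names are hypothetical, and a high score is a prompt for manual review rather than proof of manipulation.

```python
import re
from itertools import combinations


def jaccard(a: str, b: str) -> float:
    """Jaccard similarity between the word sets of two texts (0.0 to 1.0)."""
    words_a = set(re.findall(r"[a-z']+", a.lower()))
    words_b = set(re.findall(r"[a-z']+", b.lower()))
    if not words_a or not words_b:
        return 0.0
    return len(words_a & words_b) / len(words_a | words_b)


def similar_pairs(bodies: list[str], threshold: float = 0.7):
    """Return (i, j, score) for every pair of reviews at or above threshold."""
    pairs = []
    for (i, a), (j, b) in combinations(enumerate(bodies), 2):
        score = jaccard(a, b)
        if score >= threshold:
            pairs.append((i, j, round(score, 2)))
    return pairs


bodies = [
    "Great product, works perfectly, highly recommend",
    "Great product works perfectly. Highly recommend!",
    "Stopped charging after a week",
]
suspicious = similar_pairs(bodies)
```

The pairwise comparison grows quadratically with the number of reviews, so it suits a few thousand reviews per product rather than a whole catalogue.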

How Amazon Review Scraping Works

Amazon paginates its reviews. Each ASIN (Amazon's product identifier) typically has:

A star rating summary page (ratings breakdown by 1–5 stars)
Paginated review pages (10 reviews per page)
Filter options (by star rating, verified only, with images, etc.)
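
These filters are applied through query parameters on the review URL. The sketch below builds such URLs; reviewerType and pageNumber match the scraper in this guide, while filterByStar and the value avp_only_reviews are assumptions based on URLs observed on amazon.com and may change without notice. The helper name build_reviews_url is hypothetical.

```python
from urllib.parse import urlencode

# Assumed values for the star filter; verify against a live review page.
STAR_FILTERS = {
    1: "one_star",
    2: "two_star",
    3: "three_star",
    4: "four_star",
    5: "five_star",
}


def build_reviews_url(asin, page=1, stars=None, verified_only=False):
    """Build a review-page URL with optional star and verified-purchase filters."""
    params = {
        "reviewerType": "avp_only_reviews" if verified_only else "all_reviews",
        "pageNumber": page,
    }
    if stars is not None:
        params["filterByStar"] = STAR_FILTERS[stars]
    return f"https://www.amazon.com/product-reviews/{asin}?{urlencode(params)}"


url = build_reviews_url("B09G3HRMVB", page=2)
one_star_url = build_reviews_url("B09G3HRMVB", stars=1, verified_only=True)
```

Requesting each star rating separately is also a way to focus collection on the 1-star and 2-star reviews that matter most for complaint analysis.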

A complete scraper needs to:

Identify the total review count
Calculate the number of pages
Iterate through all pages with delays
Parse each review's fields
Handle anti-bot detection (the review endpoint is heavily protected)
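
The first two steps come down to a little arithmetic. Here is a minimal sketch, assuming the summary text looks like '1,234 total ratings, 567 with reviews'; that wording is an assumption and should be checked against the live page. The function names are hypothetical.

```python
import math
import re
from typing import Optional

REVIEWS_PER_PAGE = 10  # page size given in this guide


def parse_review_count(summary_text: str) -> Optional[int]:
    """Extract the review count from text like '1,234 total ratings, 567 with reviews'."""
    match = re.search(r"(\d[\d,]*)\s+with reviews?", summary_text)
    if not match:
        return None
    return int(match.group(1).replace(",", ""))


def page_count(total_reviews: int, max_pages: Optional[int] = None) -> int:
    """Number of pages needed to cover total_reviews, optionally capped."""
    pages = math.ceil(total_reviews / REVIEWS_PER_PAGE)
    return pages if max_pages is None else min(pages, max_pages)


total = parse_review_count("1,234 total ratings, 567 with reviews")
pages = page_count(total)
```

With 567 reviews at 10 per page, the scraper needs 57 requests; passing max_pages caps that number when only a sample is required.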

Python Example — Basic Review Scraper

import requests
from bs4 import BeautifulSoup
import json
import time
import random

HEADERS = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) '
                  'AppleWebKit/537.36 (KHTML, like Gecko) '
                  'Chrome/124.0.0.0 Safari/537.36',
    'Accept-Language': 'en-US,en;q=0.9',
}

def scrape_reviews_page(asin: str, page: int = 1) -> list[dict]:
    """
    Scrape a single page of reviews for a given ASIN.
    Returns a list of review dicts.
    """
    url = (
        f'https://www.amazon.com/product-reviews/{asin}'
        f'?reviewerType=all_reviews&pageNumber={page}'
    )
    
    response = requests.get(url, headers=HEADERS, timeout=15)
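    if response.status_code != 200:
        # A non-200 response usually means the request was blocked or the
        # ASIN is wrong, so report it instead of failing silently
        print(f'HTTP {response.status_code} for page {page}')
        return []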
    
    soup = BeautifulSoup(response.content, 'lxml')
    reviews = []
    
    for review_el in soup.select('[data-hook="review"]'):
        # Extract fields
        title_el    = review_el.select_one('[data-hook="review-title"]')
        body_el     = review_el.select_one('[data-hook="review-body"]')
        rating_el   = review_el.select_one('[data-hook="review-star-rating"]')
        date_el     = review_el.select_one('[data-hook="review-date"]')
        verified_el = review_el.select_one('[data-hook="avp-badge"]')
        helpful_el  = review_el.select_one('[data-hook="helpful-vote-statement"]')
        
        reviews.append({
            'asin':      asin,
            'title':     title_el.text.strip() if title_el else None,
            'body':      body_el.text.strip() if body_el else None,
            'rating':    rating_el.text.strip() if rating_el else None,
            'date':      date_el.text.strip() if date_el else None,
            'verified':  verified_el is not None,
            'helpful':   helpful_el.text.strip() if helpful_el else '0',
        })
    
    return reviews


def scrape_all_reviews(asin: str, max_pages: int = 10) -> list[dict]:
    all_reviews = []
    
    for page in range(1, max_pages + 1):
        print(f'Scraping page {page} for ASIN {asin}...')
        page_reviews = scrape_reviews_page(asin, page)
        
        if not page_reviews:
            print(f'No reviews on page {page}, stopping.')
            break
        
        all_reviews.extend(page_reviews)
        time.sleep(random.uniform(3, 7))  # Be respectful
    
    return all_reviews


# Usage
asin = 'B09G3HRMVB'
reviews = scrape_all_reviews(asin, max_pages=5)

with open(f'{asin}_reviews.json', 'w', encoding='utf-8') as f:
    json.dump(reviews, f, indent=2, ensure_ascii=False)

print(f'Scraped {len(reviews)} reviews for {asin}')
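
The scraper above stores rating, date, and helpful votes as raw text. Before analysis you will probably want typed values. The helpers below are a sketch that assumes the English wording shown on amazon.com ('4.0 out of 5 stars', 'Reviewed in the United States on March 3, 2024', '12 people found this helpful'); other marketplaces use other wording, and the helper names are hypothetical.

```python
import re
from datetime import datetime


def parse_rating(text):
    """'4.0 out of 5 stars' -> 4.0, or None if the text does not match."""
    match = re.match(r"\s*(\d(?:\.\d)?)\s+out of 5", text or "")
    return float(match.group(1)) if match else None


def parse_helpful(text):
    """'12 people found this helpful' -> 12; 'One person ...' -> 1; else 0."""
    parts = (text or "").split()
    if not parts:
        return 0
    first = parts[0].replace(",", "")
    if first.lower() == "one":
        return 1
    return int(first) if first.isdigit() else 0


def parse_date(text):
    """'Reviewed in the United States on March 3, 2024' -> (country, date)."""
    match = re.search(r"Reviewed in (?:the )?(.+?) on (.+)$", (text or "").strip())
    if not match:
        return None, None
    country, raw = match.groups()
    try:
        # %B expects an English month name under the default C locale.
        return country, datetime.strptime(raw, "%B %d, %Y").date()
    except ValueError:
        return country, None


country, reviewed_on = parse_date("Reviewed in the United States on March 3, 2024")
```

Each helper returns a neutral value (None or 0) on unexpected input rather than raising, so one malformed review does not stop a long run.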

Running Sentiment Analysis on Reviews

Once you have scraped the reviews, a simple first step is to count the words that appear most often in negative reviews:

from collections import Counter
import re

def find_common_complaints(reviews: list[dict], top_n: int = 20) -> list[tuple[str, int]]:
    """Find most-mentioned words in 1-2 star reviews."""
    # Ratings are scraped as text (e.g. '1.0 out of 5 stars'),
    # so match on the leading digit
    negative = [r for r in reviews if r['rating']
                and r['rating'].startswith(('1', '2'))]
    
    # Combine all negative review text; body can be None, so fall back to ''
    all_text = ' '.join(r.get('body') or '' for r in negative).lower()
    
    # Keep words of 4+ letters, then drop common stopwords (simplified list;
    # the shorter entries are already excluded by the length filter)
    stopwords = {'the','a','an','is','it','in','and','or','to','this','that',
                 'was','for','of','with','my','i','but','not','very','so','be'}
    words = re.findall(r'\b[a-z]{4,}\b', all_text)
    meaningful = [w for w in words if w not in stopwords]
    
    return Counter(meaningful).most_common(top_n)

complaints = find_common_complaints(reviews)
print('Most common words in negative reviews:')
for word, count in complaints:
    print(f'  {word}: {count}')
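
Word counts show what people talk about, not how they feel about it. A lexicon-based score is one basic way to attach polarity to each review. The sketch below uses a tiny hand-made word list purely for illustration; the lexicon and the function name sentiment_score are hypothetical, and a real project would use a much larger lexicon or a trained model.

```python
import re

# Tiny illustrative lexicon; a real one would be far larger.
POSITIVE = {"great", "love", "excellent", "perfect", "sturdy", "easy"}
NEGATIVE = {"broke", "broken", "poor", "terrible", "refund", "cheap"}


def sentiment_score(text: str) -> float:
    """Return (positive hits - negative hits) / total words, in [-1, 1]."""
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return 0.0
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    return (pos - neg) / len(words)


positive_example = sentiment_score("Great product, easy setup")
negative_example = sentiment_score("Broke after a week, want a refund")
```

This approach ignores negation ('not great' still counts as positive), so treat the scores as a rough sorting aid rather than a measurement.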

Scale Considerations

Volume	Recommended Approach
< 5,000 reviews	DIY Python scraper
5,000 – 100,000 reviews	Python + proxy rotation
100,000 – 1M reviews	Managed scraping service
1M+ reviews	Enterprise managed service

Important Notes on Review Data

Only scrape public reviews, meaning reviews that are visible without logging in
Don't store personally identifiable data beyond reviewer display names (which are public)
If you operate in the EU, document your legitimate interest under GDPR for processing review data
Amazon heavily protects the review endpoint — expect higher block rates than product pages
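
Given those block rates, a scraper usually needs to retry with increasing delays instead of giving up on the first failure. Below is a sketch of exponential backoff with jitter. Treating HTTP 429 and 503 as block signals is an assumption, and the function names are hypothetical; fetch stands for any zero-argument callable that returns a (status_code, body) pair.

```python
import random
import time

RETRY_STATUSES = {429, 503}  # assumed block/throttle signals


def backoff_delay(attempt: int, base: float = 2.0, cap: float = 60.0) -> float:
    """Random delay between 0 and min(cap, base * 2**attempt) seconds."""
    return random.uniform(0, min(cap, base * 2 ** attempt))


def fetch_with_backoff(fetch, max_attempts: int = 4, sleep=time.sleep):
    """Call fetch() until it returns a non-retry status or attempts run out."""
    for attempt in range(max_attempts):
        status, body = fetch()
        if status not in RETRY_STATUSES:
            return status, body
        # Wait before the next attempt, but not after the final one.
        if attempt < max_attempts - 1:
            sleep(backoff_delay(attempt))
    return status, body


# Simulated responses: two blocks followed by a success.
responses = iter([(503, ""), (429, ""), (200, "ok")])
waits = []
result = fetch_with_backoff(lambda: next(responses), sleep=waits.append)
```

Passing sleep as a parameter keeps the function testable without real delays, and the random jitter avoids retrying on a fixed, easily detected rhythm.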

Our Amazon Review Scraping Service

For large-scale review extraction, our Amazon review scraper delivers:

All review fields (title, body, rating, date, verified, helpful votes)
Bulk extraction across thousands of ASINs
All star rating filters
Vine review identification
Review image URLs
Clean JSON or CSV delivery
All 12+ Amazon marketplaces

Get a free quote with a sample review dataset for your target ASINs.