We need 70 Scrapy spiders written for the following URLs:
[login to view URL]
The most important requirement is that ALL CONDITIONS are handled, so that a failure never stops Scrapy. The QUALITY of the spiders is just as important.
The following attributes should be scraped:
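To illustrate what "handling all conditions" could look like, here is a minimal sketch of a defensive field extractor. The helper name `safe_field` and the logger name are assumptions made for illustration only; they are not part of Scrapy. The idea is that each field is extracted separately, so a missing element or a changed page layout is logged and the spider continues.

```python
import logging
from typing import Callable, Optional, TypeVar

T = TypeVar("T")
logger = logging.getLogger("spiders")  # hypothetical logger name


def safe_field(extract: Callable[[], T], field: str,
               default: Optional[T] = None) -> Optional[T]:
    """Run one field extractor; log and return `default` instead of raising."""
    try:
        value = extract()
    except Exception as exc:  # deliberately broad: one bad field must not abort the item
        logger.warning("could not extract %s: %s", field, exc)
        return default
    # Treat missing and empty values the same way, but keep legitimate zeros.
    return default if value in (None, "") else value
```

Inside a spider callback this might be used as `safe_field(lambda: response.css("h1::text").get().strip(), "name")`, where the CSS selector is only an example; if `.get()` returns `None`, the resulting `AttributeError` is logged and the field is left empty.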
product_id => SKU / ISBN (if books) / Item ID at store / site
name
description
price => MUST BE IN AED (United Arab Emirates dirham, not USD or EUR)
author_brand => Author in case of Books or Brand in case of other products
breadcrumb
parent_category
url
rating => if available
weight => if available
dimensions => if available
availability => 1 or 0
image_urls => ONLY 1 image URL, as a string
images => ONLY 1 image, as Scrapy will resize it later and upload it to Amazon S3
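As a rough sketch of how the attribute list above could map to an item definition, the example below uses a plain Python dataclass (Scrapy accepts dataclass objects as items). The class name `ProductItem`, the field types, the choice of which fields are required, and the helper `first_image_url` are assumptions for illustration, not a fixed specification.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class ProductItem:
    # Assumed to be always present on a product page.
    product_id: str                      # SKU / ISBN / item ID at the store
    name: str
    url: str
    # Everything else may be missing and defaults to None.
    description: Optional[str] = None
    price: Optional[float] = None        # must be in AED
    author_brand: Optional[str] = None
    breadcrumb: Optional[str] = None
    parent_category: Optional[str] = None
    rating: Optional[float] = None       # if available
    weight: Optional[str] = None         # if available
    dimensions: Optional[str] = None     # if available
    availability: int = 0                # 1 or 0
    image_urls: Optional[str] = None     # ONLY 1 image URL, as a string
    images: Optional[str] = None         # filled in later by the image step

    def __post_init__(self) -> None:
        # Normalise any truthy/falsy input (True, "yes", 0, None) to 1 or 0.
        self.availability = 1 if self.availability else 0


def first_image_url(urls: Optional[Iterable[str]]) -> Optional[str]:
    """Return the first non-empty URL, or None if there is none."""
    for url in urls or []:
        if url:
            return url
    return None
```

For example, `ProductItem(product_id="123", name="Some book", url="https://example.com/p/123", availability=True)` would store `availability` as `1`; the URL here is a placeholder.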
P.S. The project will not be closed until all spiders have been tested twice and checked against some random data from the sites.