
Write 2 Python web crawlers using the Scrapy framework to read Wikipedia data

$30-250 SGD

Completed
Posted about 8 years ago

Paid on delivery
Implement two web crawlers in Python using the Scrapy (1.0.5) framework.

1) Get the full list of countries and territories from here [login to view URL] and write
   a. Country / territory name
   b. Wikipedia URL
   c. Status (Membership)
   d. Dispute status
   e. Further information
   f. Polling date
into a MariaDB-based database. DB schema: id (auto-increment), createdate (timestamp); all other fields are of type text.

2) Take the list of URLs from 1b) and crawl each of the countries' pages to extract the following information for each of them:
   a. Abstract
   b. VCard data:
      i. Name
      ii. URL for flag
      iii. URL for emblem
      iv. Motto
      v. Anthem
      vi. URL to location on globe
      vii. URL to map
      viii. Capital(s)
         1. Name
         2. URL
      ix. Official language(s)
         1. Name
         2. URL
      x. Religion(s)
         1. Name
         2. URL
      xi. Demonym(s)
         1. Name
         2. URL
      xii. Government
         1. Name
         2. URL
      xiii. Establishment(s)
         1. Name
         2. Date
      xiv. Area
         1. Total km2
         2. Water km2
      xv. Population
         1. Total estimate
         2. Date of counting / estimate
      xvi. GDP
         1. Total
         2. Per capita
      xvii. HDI index
         1. Total
         2. Rank
      xviii. Currency
         1. Name
         2. 3-letter code
      xix. TimeZone(s)
         1. Name
         2. Deviation from GMT
         3. URL to timezone
      xx. Driving on left or right?
      xxi. Calling code(s)
      xxii. ISO code
      xxiii. Internet TLD(s)
   c. Date of polling
Write the data above into a MariaDB-based database. DB schema: id (auto-increment), createdate (timestamp); all other fields are of type text.

In case of multiple entries (e.g. languages), write a comma-separated list into the DB text field. Make sure the original text is comma-free.
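A minimal sketch of the comma rule above, in Python: each extracted entry is stripped of its own commas before the entries are joined, so the only commas left in the stored field are separators. The function name `to_csv_field` and the choice of `;` as the stand-in character are assumptions made for illustration; the posting does not say what should replace a comma inside an entry. The sample strings are illustrative input, not crawler output.

```python
import re


def to_csv_field(values, replacement=";"):
    """Join several extracted strings into one comma-separated text field.

    `replacement` (an assumption) is substituted for any comma found inside
    a single entry, so the entry itself ends up comma-free.
    """
    cleaned = []
    for value in values:
        # Collapse whitespace runs; scraped infobox text often contains newlines.
        text = re.sub(r"\s+", " ", value).strip()
        # Remove commas from the entry so they cannot be mistaken for separators.
        text = text.replace(",", replacement)
        if text:
            cleaned.append(text)
    return ",".join(cleaned)


print(to_csv_field(["Pretoria (executive)", "Bloemfontein, judicial"]))
```

Empty or whitespace-only entries are dropped, so a field with no usable values is stored as an empty string.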
Requirements:
- Running on Ubuntu 14.04 LTS (x64) standard installation (Scrapy 1.0.5 installation [login to view URL])
- MariaDB 5.5
- Needs to run failsafe with correct results for all countries, especially for countries with several entries for capital, time zones, languages, etc.

Copyright:
- All code belongs to the employer

Delivery:
- 2 web crawlers with pipeline into MariaDB
- Code with comments / documentation so that it can be maintained
- After the full code is uploaded, I will run it on my system and proofread the code before payment
- No milestones; payment in full after a successful test
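As a rough illustration of the "pipeline into MariaDB" deliverable, the sketch below follows the usual Scrapy item pipeline shape (`open_spider`, `process_item`, `close_spider`) for the first crawler's table. The table name `countries`, the column names, and the injected `connect` callable are all assumptions; the posting only fixes `id` (auto-increment), `createdate` (timestamp) and text for every other field. No database driver is imported here: `connect` stands for any zero-argument callable that returns a DB-API connection, and the `%s` placeholders assume a driver that uses that parameter style.

```python
# Hypothetical table and column names; only id, createdate and the
# "all other fields are text" rule come from the posting.
TABLE = "countries"
COLUMNS = ["name", "wikipedia_url", "status", "dispute_status",
           "further_information", "polling_date"]

CREATE_SQL = (
    "CREATE TABLE IF NOT EXISTS {table} ("
    "id INT AUTO_INCREMENT PRIMARY KEY, "
    "createdate TIMESTAMP DEFAULT CURRENT_TIMESTAMP, "
    "{cols})"
).format(table=TABLE, cols=", ".join(c + " TEXT" for c in COLUMNS))

# Parameterized insert: values are passed separately, never formatted into the SQL.
INSERT_SQL = "INSERT INTO {table} ({cols}) VALUES ({marks})".format(
    table=TABLE,
    cols=", ".join(COLUMNS),
    marks=", ".join(["%s"] * len(COLUMNS)),
)


class MariaDBPipeline(object):
    """Plain class with the method names Scrapy calls on an item pipeline."""

    def __init__(self, connect):
        # `connect` is an assumed zero-argument factory for a DB-API connection.
        self.connect = connect
        self.conn = None

    def open_spider(self, spider):
        # Open the connection once per crawl and make sure the table exists.
        self.conn = self.connect()
        self.conn.cursor().execute(CREATE_SQL)
        self.conn.commit()

    def process_item(self, item, spider):
        # Missing fields are stored as empty text rather than failing the crawl.
        row = tuple(item.get(col, "") for col in COLUMNS)
        self.conn.cursor().execute(INSERT_SQL, row)
        self.conn.commit()
        return item

    def close_spider(self, spider):
        self.conn.close()
```

Because the constructor takes an argument, registering this class in a Scrapy project would also need a `from_crawler` classmethod that supplies `connect`; that wiring, and the choice of driver, are left out of the sketch.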
Project ID: 10207457

About the project

13 proposals
Remote project
Active 8 years ago

Awarded to:
Hi, I can do this job for you. Message me if you want me to get started on it; I'll do it right away.
$83 SGD in 4 days
4.9 (4 reviews)
2.7
13 freelancers are bidding an average of $284 SGD for this job
Hello Sir, We've done a number of web scraping projects for our clients. We have scraped many directory websites, including Yellow Pages and Yelp, and e-commerce websites, including Amazon, Walmart and many more. We can deliver the data very quickly. We use proxies with IP rotation to avoid being detected as bots. We use Python with wget, Scrapy, urllib and other tools to fetch webpages, and parsers such as HtmlXPathSelector, regular expressions, etc. to extract information from the HTML. We have the right skill set to do this job effectively and on time, and would like to discuss this opportunity further. Looking forward to hearing from you. Thanks, Shiv Agrawal SuiGen Solutions
$421 SGD in 3 days
4.8 (88 reviews)
6.6
Hi there. I would be glad to help you out with this project. I am a professional data scraper with experience creating easy-to-use scripts to extract data from the web. I can guarantee you an excellent job and deliver ASAP. However, I do not work without milestone creation; I only ask for milestones to be created, with no payment in advance. Once the job is completed you can release them. Thanks, Daniel
$336 SGD in 3 days
4.7 (94 reviews)
6.9
Hi, I have read the description and would like to discuss it. I have good web scraping experience and reviews, and can develop web scraping scripts in Python and C#. I hope we can discuss the details.
$200 SGD in 2 days
5.0 (137 reviews)
6.2
Hello! I'm a web scraping expert and I can finish your project in 3 days. I use the Python language and the Scrapy framework. My scripts work on Windows, Mac or Linux, but Linux is preferred. I can schedule scripts on a server if required. I have more than 200 finished projects (Google scraping, Facebook scraping, Yellow Pages, LinkedIn, Amazon, webshops and other sites with lists of any items). I can export data to JSON, XML, CSV (Excel), or any database (MySQL, MongoDB, MSSQL, etc.). Message me if you have any questions!
$299 SGD in 5 days
4.8 (110 reviews)
6.5
Hi, I am an expert in Python/Scrapy and have completed many Scrapy projects here. Your project looks OK to me at first glance; please contact me to discuss the requirements in more detail. Thanks
$222 SGD in 5 days
5.0 (25 reviews)
5.2
Hi, I'm a frequent user of Scrapy and I've already written solutions that integrate with MySQL. I have a few questions regarding your project: 1- Are you looking to run the script periodically? If so, it should be aware that the data may already exist, to avoid duplication or unique constraint violations. 2- If you plan to run it periodically, will you do it manually or do you need an additional solution to schedule the spiders? 3- Is the installation of the required software part of the project, or will you be doing that yourself? Thanks.
$200 SGD in 10 days
5.0 (19 reviews)
4.6
Hey there! We're 2 developers with broad knowledge of Python and scripting, specializing in web scraping. We'll gladly do your project, as it seems like something we can pull off with a script. You can look at our profile for previous web scraping projects we've done. Contact us for further details.
$277 SGD in 3 days
5.0 (7 reviews)
4.0
I have 7+ years of work experience in data collection, bulk email campaigns, Excel VBA and Internet research in IT companies here. I can create crawlers and scrape data from directories and Yellow Pages using C++, Python and Perl, delivered in Excel as per your requirements, with multiple IP rotations. I have successfully dealt with presidents, directors and managers of US, UK and Australian companies on web design and development projects, and I have good communication and writing skills. I am well versed in the Internet, MS Office applications and phone etiquette, and in the latest technologies. I can accept your payment terms.
$188 SGD in 2 days
3.9 (6 reviews)
4.4
I am a seasoned python programmer with past experience using scrapy. I can write this software for you and even test it on Ubuntu 14.04 LTS before you receive the code. I frequently comment my code and try to write code that is simple and easy to understand. I require no milestone until you are satisfied with the work. Please let me know if you have any questions. Regards, Daniel
$277 SGD in 15 days
0.0 (0 reviews)
0.0

About the client

Singapore, Singapore
4.9
2
Payment method verified
Member since Apr 10, 2016
