Python modules you should know: Scrapy
Next in our series of Python modules you should know is Scrapy. Do you want to be the next Google? Well, read on.
Scrapy is a fast, high-level screen scraping and web crawling framework, used to crawl websites and extract structured data from their pages. It can be used for a wide range of purposes, from data mining to monitoring and automated testing.
You can use Scrapy to extract any kind of data from a web page, whether the source is HTML, XML, CSV or another format. I recently used it to automate the extraction of domains and email addresses from the ISPA Spam Hall of Shame list, for use in a DNSBL.