Scrape web content faster

2006 Oct 1

When scraping websites, I usually use the file_get_contents() function. However, there are times when we only need a specific portion of the site; for instance, the page title or the description.
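As a minimal sketch of that idea (my own addition, not from the original post): once a partial chunk of a page has been fetched, the title can be pulled out with a simple regex. The helper name, the sample chunk, and the pattern here are assumptions; the pattern expects a plain, well-formed <title> tag.

```php
<?php
// Hypothetical helper: extract the contents of the <title> tag
// from a (possibly partial) piece of HTML.
function extract_title($html) {
    if (preg_match('/<title>(.*?)<\/title>/is', $html, $m)) {
        return trim($m[1]);
    }
    return null;
}

// Most pages declare their title early in the HTML head, so the
// first few hundred bytes of a page are usually enough input.
$chunk = "<html><head><title>Example Domain</title></head><body>";
echo extract_title($chunk); // prints "Example Domain"
?>
```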

Instead of using the file_get_contents() function, we can use the built-in fopen() and fgets() file functions, like this:

<?php
// Open the page as a stream and read only the first line,
// instead of downloading the whole document.
$handle = fopen("http://www.example.com/", "r");
$buffer = fgets($handle, 256);
fclose($handle);
echo "$buffer<br />";
?>

But using the cURL functions will be a lot faster. We will use CURLOPT_RANGE to fetch a specific amount of data from a given URL. CURLOPT_RANGE defines the range(s) of data to retrieve, in the format "X-Y", where either X or Y may be omitted. HTTP transfers also support several intervals, separated by commas, in the format "X-Y,N-M".

<?php
// Ask the server for only the first 500 bytes of the page.
// Note: this relies on the server honoring the HTTP Range header.
$ch = curl_init("http://www.example.com/");
curl_setopt($ch, CURLOPT_RANGE, "0-500");
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
$content = curl_exec($ch);
curl_close($ch);
echo "$content<br />";
?>

1 Comment

This range thing doesn't work when we are using POST :s

It downloads the whole page...



About Me

Alfredo Sanchez is an internet professional focusing in particular on the study of search engine behavior. He supports Free and Open Source Software and currently develops applications using XAMPP.
