What is the best PHP web content crawler class?: Extracting content by passing the URL of a web site

What is the best PHP web content crawler class? #web content crawler

by Adeagbo Moruf Adedeji - 10 years ago (2015-01-09)

Extracting content by passing the URL of a web site

+3	The class should take the URL of a web site, extract the specified content, and save it in a database, continuing until all the related content has been extracted.

2 Recommendations

Very simple page details: Parse and extract Web page information details

This class can parse and extract Web page information details.

It can retrieve a Web page from a given URL and parse it to extract details like:

- Page title
- Page head and body
- Meta tags
- Character set
- Links expanded to full path
- Images
- Page headers from H1 through H6
- Internal and external links checking if they are broken
- Page elements by class or id value
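The kind of extraction listed above can be sketched with PHP's built-in DOM extension. This is not the API of the recommended class; `extractPageDetails()` and `resolveUrl()` are hypothetical helper names, and the URL resolution is deliberately simplified (it ignores ports, `<base>` tags, and `../` segments).

```php
<?php
// Hypothetical helper: expand a link to a full URL (simplified on purpose).
function resolveUrl(string $href, string $baseUrl): string
{
    if (parse_url($href, PHP_URL_SCHEME) !== null) {
        return $href; // already absolute (http:, https:, mailto:, ...)
    }
    $base = parse_url($baseUrl);
    $origin = $base['scheme'] . '://' . $base['host'];
    if (strpos($href, '//') === 0) {
        return $base['scheme'] . ':' . $href; // protocol-relative
    }
    if (strpos($href, '/') === 0) {
        return $origin . $href; // root-relative
    }
    $path = $base['path'] ?? '/';
    $dir = substr($path, 0, strrpos($path, '/') + 1);
    return $origin . $dir . $href; // relative to the base directory
}

// Hypothetical helper: parse HTML and collect a few page details.
function extractPageDetails(string $html, string $baseUrl): array
{
    $doc = new DOMDocument();
    // Real-world HTML is rarely well formed, so collect parser warnings quietly.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);
    $xpath = new DOMXPath($doc);

    $titleNode = $xpath->query('//title')->item(0);
    $details = [
        'title'    => $titleNode ? trim($titleNode->textContent) : '',
        'meta'     => [],
        'headings' => [],
        'links'    => [],
        'images'   => [],
    ];
    foreach ($xpath->query('//meta[@name and @content]') as $node) {
        $details['meta'][strtolower($node->getAttribute('name'))] = $node->getAttribute('content');
    }
    foreach ($xpath->query('//h1|//h2|//h3|//h4|//h5|//h6') as $node) {
        $details['headings'][] = [
            'level' => (int) substr($node->nodeName, 1),
            'text'  => trim($node->textContent),
        ];
    }
    foreach ($xpath->query('//a[@href]') as $node) {
        $details['links'][] = resolveUrl($node->getAttribute('href'), $baseUrl);
    }
    foreach ($xpath->query('//img[@src]') as $node) {
        $details['images'][] = resolveUrl($node->getAttribute('src'), $baseUrl);
    }
    return $details;
}

// Usage with an inline sample page; the URL is a placeholder.
$html = '<html><head><title> Demo </title><meta name="Description" content="A test"></head>'
      . '<body><h1>Top</h1><h2>Sub</h2><a href="/about">About</a><a href="post.html">Post</a>'
      . '<img src="//cdn.example.com/a.png"></body></html>';
$details = extractPageDetails($html, 'https://example.com/blog/index.html');
```

For a live page, the HTML string could come from `file_get_contents($url)` (which needs `allow_url_fopen` enabled) or from the cURL extension.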

0	by zinsou A.A.E.Moïse (package author) 6835 - 7 years ago (2017-12-22) Comment: One may also need this...

PHP Scraper: Extract structured data from remote HTML pages

This class is meant to fetch remote HTML pages and parse them to extract structured information into arrays.

It can take a model of the definition of the structure of a given page and process it to clip the relevant fields of information.
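To illustrate the general idea of model-driven extraction, here is a minimal sketch built on `DOMXPath`. The model format shown (an `item` XPath plus a map of field names to relative XPath expressions) is an assumption made for this example and is not the model format of the PHP Scraper package; `scrapeWithModel()` is a hypothetical function name.

```php
<?php
// Hypothetical function: apply an assumed model format to an HTML string.
function scrapeWithModel(string $html, array $model): array
{
    $doc = new DOMDocument();
    $previous = libxml_use_internal_errors(true); // tolerate sloppy markup
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);
    $xpath = new DOMXPath($doc);

    $records = [];
    // Each node matched by 'item' becomes one record.
    foreach ($xpath->query($model['item']) as $item) {
        $record = [];
        foreach ($model['fields'] as $name => $expression) {
            // Evaluate the field expression relative to the current item.
            $record[$name] = trim($xpath->evaluate('string(' . $expression . ')', $item));
        }
        $records[] = $record;
    }
    return $records;
}

// Assumed model: where the repeated items are and how to clip each field.
$model = [
    'item'   => '//div[@class="product"]',
    'fields' => [
        'name'  => './/h2',
        'price' => './/span[@class="price"]',
        'url'   => './/a/@href',
    ],
];

// Sample markup standing in for a fetched page.
$html = '<html><body>'
      . '<div class="product"><h2>Lamp</h2><span class="price">19.90</span><a href="/lamp">more</a></div>'
      . '<div class="product"><h2>Desk</h2><span class="price">120.00</span><a href="/desk">more</a></div>'
      . '</body></html>';
$records = scrapeWithModel($html, $model);
```

Keeping the page structure in a data array means a different site layout only needs a different model, not different code.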

by Manuel Lemos 26695 - 10 years ago (2015-01-21) Comment

It seems you want to scrape information from Web pages, but in general the actual scraping configuration depends on the format of the pages you want to scrape.

This class can solve your problem if you pass it a model of the data you want to extract from the pages you want to scrape.

Adding the scraped content to a database is something you need to do yourself with additional code, as it depends a lot on what you want to store in the database.
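As a rough sketch of that additional code, the following stores scraped records with PDO. Everything here is an assumption for illustration: the `scraped_items` table, its columns, the record shape, and the use of an in-memory SQLite database (which requires the `pdo_sqlite` extension). `INSERT OR REPLACE` is SQLite syntax; other databases need their own upsert statement.

```php
<?php
// Hypothetical function: store records keyed by URL so re-scraping updates rows.
function saveRecords(PDO $pdo, array $records): int
{
    $pdo->exec(
        'CREATE TABLE IF NOT EXISTS scraped_items (
            url   TEXT PRIMARY KEY,
            name  TEXT NOT NULL,
            price TEXT
        )'
    );
    // A prepared statement keeps scraped text from being interpreted as SQL.
    $stmt = $pdo->prepare(
        'INSERT OR REPLACE INTO scraped_items (url, name, price) VALUES (:url, :name, :price)'
    );
    $pdo->beginTransaction();
    foreach ($records as $record) {
        $stmt->execute([
            ':url'   => $record['url'],
            ':name'  => $record['name'],
            ':price' => $record['price'],
        ]);
    }
    $pdo->commit();
    return count($records);
}

// In-memory database for demonstration; a real DSN would point at your server.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

// Sample records standing in for scraper output.
$records = [
    ['url' => '/lamp', 'name' => 'Lamp', 'price' => '19.90'],
    ['url' => '/desk', 'name' => 'Desk', 'price' => '120.00'],
];
saveRecords($pdo, $records);
saveRecords($pdo, $records); // running twice does not duplicate rows
$rowCount = (int) $pdo->query('SELECT COUNT(*) FROM scraped_items')->fetchColumn();
```

Using the URL as the primary key is one possible design choice; it makes repeated crawls of the same page idempotent.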
