Tuesday, 19 December 2017

Scraping Data from a Website in PHP

Chandan Sharma 12:39 CodeIgniter, HTML, PHP, Scraping Data, Scraping Data From Website


One option is PHP Simple HTML DOM Parser. It is fast, easy to use and very flexible.
It loads an entire HTML page into an object, and you can then access any element of the page through that object.

Documentation: http://simplehtmldom.sourceforge.net/

For example, to list all images and links on the Google home page:

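// Load the library (the path is an assumption; adjust it to where simple_html_dom.php lives)
require_once 'simple_html_dom.php';

// Create DOM from URL or file
$html = file_get_html('http://www.google.com/');

// Find all images and print their source URLs
foreach ($html->find('img') as $element) {
    echo $element->src . '<br>';
}

// Find all links and print their targets
foreach ($html->find('a') as $element) {
    echo $element->href . '<br>';
}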
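
If installing a third-party parser is not an option, the same "load the page into an object, then query it" idea can be sketched with PHP's bundled DOM extension. This is not Simple HTML DOM. The helper name `collectAttribute` and the sample markup are made up for illustration, and the snippet assumes PHP 7 or later with the DOM extension enabled.

```php
<?php
// Sketch using PHP's bundled DOM extension instead of Simple HTML DOM.
// The HTML string below is made-up sample data, not a real page.
$sampleHtml = <<<HTML
<html><body>
  <img src="logo.png" alt="Logo">
  <a href="/about">About</a>
  <a href="https://example.com/">Example</a>
</body></html>
HTML;

/**
 * Collect the values of one attribute from every element with the given tag.
 * (Hypothetical helper, named here for illustration only.)
 */
function collectAttribute(string $html, string $tag, string $attribute): array
{
    $doc = new DOMDocument();

    // Real-world markup is often invalid, so buffer parser warnings
    // instead of printing them, then restore the previous setting.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $values = [];
    foreach ($doc->getElementsByTagName($tag) as $element) {
        if ($element->hasAttribute($attribute)) {
            $values[] = $element->getAttribute($attribute);
        }
    }
    return $values;
}

// Find all image sources and all link targets in the sample markup
$images = collectAttribute($sampleHtml, 'img', 'src');
$links  = collectAttribute($sampleHtml, 'a', 'href');

echo implode("\n", $images), "\n";
echo implode("\n", $links), "\n";
```

To scrape a live page, the sample string could be replaced with the result of `file_get_contents()`, after checking that the call did not return `false`.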

Alternatively, we can use the PHPPowertools/DOM-Query library.
Documentation: https://github.com/PHPPowertools/DOM-Query

Under the hood, it uses a customized version of Masterminds/html5-php to parse an HTML5 string into a DomDocument, and symfony/DomCrawler to convert CSS selectors to XPath selectors.
It always reuses the same DomDocument, even when passing one object to another, to keep performance reasonable.
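
To give a feel for what converting a CSS selector to XPath means, here is a small sketch with hand-written XPath equivalents of two CSS selectors, evaluated with PHP's bundled DOMXPath. It does not call DOM-Query or any symfony code, the expressions a real converter generates may look different, and the sample markup is invented for illustration.

```php
<?php
// Hand-written XPath equivalents of two CSS selectors, run with the
// bundled DOM extension. Sample markup is made up for this sketch.
$sampleHtml = '<div class="foo bar"><p>one</p><p>two</p></div>'
            . '<div class="food"><p>three</p></div>';

$doc = new DOMDocument();
$previous = libxml_use_internal_errors(true); // buffer parser warnings
$doc->loadHTML($sampleHtml);
libxml_clear_errors();
libxml_use_internal_errors($previous);

$xpath = new DOMXPath($doc);

// CSS "div.foo": class names are space separated, so pad the attribute
// with spaces and search for " foo " to avoid matching "food".
$divFoo = "//div[contains(concat(' ', normalize-space(@class), ' '), ' foo ')]";

// CSS "p + p": a <p> whose immediately preceding sibling element is a <p>.
$adjacentP = "//p[preceding-sibling::*[1][self::p]]";

$fooCount = $xpath->query($divFoo)->length;

$adjacentText = [];
foreach ($xpath->query($adjacentP) as $node) {
    $adjacentText[] = $node->textContent;
}

echo "div.foo matches: ", $fooCount, "\n";
echo "p + p matches: ", implode(', ', $adjacentText), "\n";
```

A library that performs this translation automatically lets you write the shorter CSS form while still querying the document through XPath.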

Example usage of DOM-Query:
namespace PowerTools;

// Get file content
$pagecontent = file_get_contents( 'http://www.4wtech.com/csp/web/Employee/Login.csp' );

// Define your DOMCrawler based on file string
$H = new DOM_Query( $pagecontent );

// Define your DOMCrawler based on an existing DOM_Query instance
$H = new DOM_Query( $H->select('body') );

// Passing a string (CSS selector)
$s = $H->select( 'div.foo' );

// Passing an element object (DOM Element);
// $documentBody is assumed to be an existing DOMElement defined elsewhere
$s = $H->select( $documentBody );

// Passing a DOM Query object
$s = $H->select( $H->select('p + p') );

// Select the body tag
$body = $H->select('body');

// Combine different classes as one selector to get all site blocks
$siteblocks = $body->select('.site-header, .masthead, .site-body, .site-footer');

// Nest your methods just like you would with jQuery
$siteblocks->select('button')->add('span')->addClass('icon icon-printer');

// Use a lambda function to set the text of all site blocks
$siteblocks->text(function( $i, $val) {
    return $i . " - " . $val->attr('class');
});

// Append the following HTML to all site blocks
$siteblocks->append('<div class="site-center"></div>');

// Use a descendant selector to select the site's footer
$sitefooter = $body->select('.site-footer > .site-center');

// Set some attributes for the site's footer
$sitefooter->attr(array('id' => 'aweeesome', 'data-val' => 'see'));

// Use a lambda function to set the attributes of all site blocks
$siteblocks->attr('data-val', function( $i, $val) {
    return $i . " - " . $val->attr('class') . " - photo by Kelly Clark";
});

// Select the parent of the site's footer
$sitefooterparent = $sitefooter->parent();

// Remove the class of all i-tags within the site's footer's parent
$sitefooterparent->select('i')->removeAttr('class');

// Wrap the site's footer within two new elements
$sitefooter->wrap('<section><div class="footer-wrapper"></div></section>');
