amsify42/php-domfinder

PHP DOM Finder

A PHP package for searching the Document Object Model (DOM) efficiently and in a more readable way.

Installation

$ composer require amsify42/php-domfinder

Table of Contents

  1. Loading Source
  2. Important Notes
  3. Meta Tags
  4. Elements
  5. Element Class
  6. Element Id
  7. Element Attribute
  8. Regex Extraction
  9. Element Methods
  10. Multi Level Finder

1. Loading Source


File

$domFinder 	= new Amsify42\DOMFinder\DOMFinder('path/to/file.html');
// or
$domFinder 	= new Amsify42\DOMFinder\DOMFinder();
$domFinder->load('path/to/file.html');

HTML

$domFinder 	= new Amsify42\DOMFinder\DOMFinder('path/to/file.html', 'html');
// or
$domFinder 	= new Amsify42\DOMFinder\DOMFinder();
$domFinder->loadHTML('path/to/file.html');

XML

$domFinder 	= new Amsify42\DOMFinder\DOMFinder('path/to/file.xml', 'xml');
// or
$domFinder 	= new Amsify42\DOMFinder\DOMFinder();
$domFinder->loadXML('path/to/file.xml');

URL

For HTML

$domFinder 	= new Amsify42\DOMFinder\DOMFinder('http://www.site.com/file.html', 'html', true);
// or
$domFinder 	= new Amsify42\DOMFinder\DOMFinder();
$domFinder->loadHTML('http://www.site.com/file.html', true);

For XML

$domFinder 	= new Amsify42\DOMFinder\DOMFinder('http://www.site.com/file.xml', 'xml', true);
// or
$domFinder 	= new Amsify42\DOMFinder\DOMFinder();
$domFinder->loadXML('http://www.site.com/file.xml', true);

Using helper method

$domFinder = get_dom_finder('http://www.site.com/file.html', 'html', true);

Note: To load content from a URL, pass true as the 3rd parameter to the constructor or helper method, or as the 2nd parameter to the load method.

2. Important Notes


The Amsify42\DOMFinder\DOMFinder class uses Amsify42\DOMFinder\DOM\Document, which extends PHP's built-in DOMDocument class. You can call any DOMDocument method through this instance

$domFinder->dom();	

Example:

$domFinder->dom()->getElementsByTagName('p');	

The Amsify42\DOMFinder\DOMFinder class uses PHP's built-in DOMXPath class for querying the document. To call DOMXPath methods directly, use this instance

$domFinder->finder();

Example:

$domFinder->finder()->query("//div[@class='body-entry']"); // "//" matches the div anywhere in the document
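
When writing raw XPath for finder(), the leading slashes matter. The sketch below uses only PHP's built-in DOMDocument and DOMXPath (not php-domfinder itself) and made-up markup to show that a single leading "/" is anchored at the document root, while "//" searches the whole tree.

```php
<?php
// Built-in DOM extension only; the markup is invented for illustration.
$dom = new DOMDocument();
$dom->loadHTML('<html><body><div class="body-entry">Hello</div></body></html>');
$xpath = new DOMXPath($dom);

// "/div" asks for a <div> as the root element; the root here is <html>.
$rootOnly = $xpath->query("/div[@class='body-entry']");

// "//div" looks for a <div> at any depth.
$anywhere = $xpath->query("//div[@class='body-entry']");

echo $rootOnly->length, "\n"; // 0
echo $anywhere->length, "\n"; // 1
```

For HTML documents the "//" form is usually the one you want, since the root element is always html.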

Every element returned by a query is of type Amsify42\DOMFinder\DOM\Element, which extends PHP's built-in DOMElement class.

$anchors = $domFinder->find('a')->byClass('action-link')->all();
if($anchors->length)
{
	foreach($anchors as $anchor)
	{
		var_dump($anchor); // Will be of type Amsify42\DOMFinder\DOM\Element which extends DOMElement
	}
}

You can call any DOMElement method on each of these items. Example:

foreach($anchors as $anchor)
{
	$anchor->getAttribute('href');
}

Most importantly, whenever you fetch the first element or an element at a particular index, the result is either NULL or an element of type Amsify42\DOMFinder\DOM\Element, so check for NULL before using it. Examples:

$para = $domFinder->getFirstElement('p');
// or
$para = $domFinder->getElement('p', 1);
// or
$para = $domFinder->findFirst('p');
// or
$para = $domFinder->find('p')->first();
// or
$para = $domFinder->find('p')->get(1);
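
The NULL-or-element behaviour mirrors what PHP's built-in DOMNodeList::item() does. As a minimal sketch using only the built-in DOM extension (not php-domfinder itself, and with invented markup), an out-of-range index yields null, so the result needs a guard before use:

```php
<?php
$dom = new DOMDocument();
$dom->loadHTML('<html><body><p>one</p><p>two</p></body></html>');
$paras = $dom->getElementsByTagName('p');

$second  = $paras->item(1); // zero-based index: the 2nd <p>
$missing = $paras->item(5); // out of range: null

// Guard before calling methods or reading properties on the result.
if ($second !== null) {
    echo $second->textContent, "\n"; // two
}
var_dump($missing === null); // bool(true)
```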

3. Meta Tags


Once the source has been loaded, you can use the following meta tag methods.

$metaTags = $domFinder->metaTags();

To get specific meta tag value

<meta name="title" content="Amsify42">

$title = $domFinder->getMetaValue('name', 'title');

By default the value is read from the meta element's content attribute. To read it from a different attribute, pass the attribute name as the 3rd parameter

<meta name="title" myattr="Amsify42">

$title = $domFinder->getMetaValue('name', 'title', 'myattr');
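
To make the lookup concrete, here is a rough equivalent written against PHP's built-in DOMXPath. The function name metaValue and its signature are hypothetical, chosen for this sketch; it is not the library's implementation, only an illustration of matching a meta element by one attribute and reading another.

```php
<?php
// Hypothetical helper, not part of php-domfinder.
// Note: $attr and $value are interpolated naively, so they must not contain quotes.
function metaValue(DOMXPath $xpath, string $attr, string $value, string $from = 'content'): ?string
{
    $node = $xpath->query("//meta[@{$attr}='{$value}']")->item(0);

    return $node instanceof DOMElement ? $node->getAttribute($from) : null;
}

$dom = new DOMDocument();
$dom->loadHTML('<html><head><meta name="title" content="Amsify42"></head><body></body></html>');
$xpath = new DOMXPath($dom);

$title   = metaValue($xpath, 'name', 'title');   // "Amsify42"
$nothing = metaValue($xpath, 'name', 'missing'); // null, no such meta tag

echo $title, "\n";
```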

4. Elements


To get specific elements from DOM

$paras = $domFinder->getElements('p');

To get first element

$para = $domFinder->getFirstElement('p');

To get the element by index position

$para = $domFinder->getElement('p', 1);

5. Element Class


Equals

Find all elements by class name

$elements = $domFinder->findByClass('section-items')->all();

Find first element by class

$element = $domFinder->findByClass('section-items')->first();
// or
$element = $domFinder->findFirstByClass('section-items');

Find all div elements by class

$elements = $domFinder->find('div')->byClass('section-items')->all();

Find the first div element by class

$element = $domFinder->find('div')->byClass('section-items')->first();

Get an element by its index position

$element = $domFinder->find('div')->byClass('section-items')->get(1); // Index is zero-based, so this returns the 2nd element

Like

Find all elements whose class contains the given string

$elements = $domFinder->findClassLike('section-items')->all();

Find the first element whose class contains the given string

$element = $domFinder->findClassLike('section-items')->first();
// or
$element = $domFinder->findFirstClassLike('section-items');

Find all div elements whose class contains the given string

$divs = $domFinder->find('div')->classLike('section-items')->all();

Find the first div element whose class contains the given string

$div = $domFinder->find('div')->classLike('section-items')->first();

Get an element by its index position

$div = $domFinder->find('div')->classLike('section-items')->get(1); // Index is zero-based, so this returns the 2nd element
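
The difference between the Equals and Like variants is easiest to see in plain XPath. The sketch below assumes that "equals" compares the whole class attribute and "like" matches a substring; the README only says "contains", so treat that reading as an assumption. It uses PHP's built-in DOMXPath and invented markup, not php-domfinder itself.

```php
<?php
$dom = new DOMDocument();
$dom->loadHTML(
    '<html><body>'
    . '<div class="section-items">a</div>'
    . '<div class="section-items active">b</div>'
    . '<div class="other">c</div>'
    . '</body></html>'
);
$xpath = new DOMXPath($dom);

// Exact comparison: the attribute must be exactly "section-items".
$exact = $xpath->query("//div[@class='section-items']");

// Substring comparison: also matches "section-items active".
$like = $xpath->query("//div[contains(@class, 'section-items')]");

echo $exact->length, "\n"; // 1
echo $like->length, "\n";  // 2
```

Under that assumption, the Like form is the safer choice when elements may carry several class names.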

6. Element Id


Equals

Find all elements by id

$elements = $domFinder->findById('body-entry')->all();

Find first element by id

$element = $domFinder->findById('body-entry')->first();
// or
$element = $domFinder->findFirstById('body-entry');

Find all div elements by id

$divs = $domFinder->find('div')->byId('body-entry')->all();

Find the first div element by id

$div = $domFinder->find('div')->byId('body-entry')->first();

Like

Find all elements whose id contains the given string

$elements = $domFinder->findIdLike('section-')->all();

Find the first element whose id contains the given string

$element = $domFinder->findIdLike('section-')->first();
// or
$element = $domFinder->findFirstIdLike('section-');

Find all div elements whose id contains the given string

$divs = $domFinder->find('div')->idLike('section-')->all();

Find the first div element whose id contains the given string

$div = $domFinder->find('div')->idLike('section-')->first();

Get an element by its index position

$div = $domFinder->find('div')->idLike('section-')->get(1); // Index is zero-based, so this returns the 2nd element

7. Element Attribute


Equals

Find all elements by attribute

$elements = $domFinder->findByAttr('data-section', 'paragraph')->all();

Find first element by attribute

$element = $domFinder->findByAttr('data-section', 'paragraph')->first();
// or
$element = $domFinder->findFirstByAttr('data-section', 'paragraph');

Find all div elements by attribute

$divs = $domFinder->find('div')->byAttr('data-section', 'paragraph')->all();

Find the first div element by attribute

$div = $domFinder->find('div')->byAttr('data-section', 'paragraph')->first();

Get an element by its index position

$div = $domFinder->find('div')->byAttr('data-section', 'paragraph')->get(1); // Index is zero-based, so this returns the 2nd element

Like

Find all elements whose attribute value contains the given string

$elements = $domFinder->findAttrLike('my-att', 'some-')->all();

Find the first element whose attribute value contains the given string

$element = $domFinder->findAttrLike('my-att', 'some-')->first();
// or
$element = $domFinder->findFirstAttrLike('my-att', 'some-');

Find all div elements whose attribute value contains the given string

$divs = $domFinder->find('div')->attrLike('my-att', 'some-')->all();

Find the first div element whose attribute value contains the given string

$div = $domFinder->find('div')->attrLike('my-att', 'some-')->first();

Get an element by its index position

$div = $domFinder->find('div')->attrLike('my-att', 'some-')->get(1); // Index is zero-based, so this returns the 2nd element

8. Regex Extraction


To extract a particular item from HTML, consider this sample

$html = '<div class="section">
			<script>var data={"name": "my name", "id":12345};</script>
		</div>';
$domFinder = new Amsify42\DOMFinder\DOMFinder();
$domFinder->loadHTML($html);

$section = $domFinder->findFirstByClass('section');
if($section)
{
	$data = $section->extractByRegex("/data\=(.*?)\;</"); // Here you will get js dictionary data
}

To extract every match of a regex rather than only the first, pass true as the 2nd parameter

$html = '<div class="section">
			<some-element class="some-class">{"name": "name one", "id":1}</some-element>
			<some-element class="some-class">{"name": "name two", "id":2}</some-element>
			<some-element class="some-class">{"name": "name three", "id":3}</some-element>
		</div>';
$domFinder = new Amsify42\DOMFinder\DOMFinder();
$domFinder->loadHTML($html);

$section = $domFinder->findFirstByClass('section');
if($section)
{
	$data = $section->extractByRegex("/class=\"some-class\">(.*?)\<\//", true); // Here you will get multiple js dictionary data as array
}

You can also pass several regexes as an array for multi-level matching and extraction

$data = $section->extractByRegex(["/<some-element(.*?)some-element>/", "/class=\"some-class\">(.*?)\<\//"], true);
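
The single-match and multiple-match modes correspond to what preg_match and preg_match_all do in plain PHP. The sketch below shows that idea with built-in functions only, using the same pattern and a shortened version of the sample markup. It makes no claim about how extractByRegex is implemented internally; decoding the captured text with json_decode is an extra step added here for illustration.

```php
<?php
$html = '<div class="section">'
    . '<some-element class="some-class">{"name": "name one", "id":1}</some-element>'
    . '<some-element class="some-class">{"name": "name two", "id":2}</some-element>'
    . '</div>';

$pattern = '/class="some-class">(.*?)<\//';

// Single match: preg_match stops at the first hit; index 1 holds the captured group.
preg_match($pattern, $html, $first);
$one = json_decode($first[1], true);

// Multiple matches: preg_match_all collects the captured group of every hit in $all[1].
preg_match_all($pattern, $html, $all);
$many = array_map(fn ($json) => json_decode($json, true), $all[1]);

echo $one['name'], "\n";   // name one
echo count($many), "\n";   // 2
echo $many[1]['id'], "\n"; // 2
```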

9. Element Methods


These are the methods you can use at element level

<ul class="list-items">
	<li>Item one</li>
	<li>Item two</li>
	<li>Item three</li>
</ul>

$ul = $domFinder->getElement('ul');
// or
$ul = $domFinder->findFirst('ul');

For getting outer and inner HTML of element, you can use these methods

echo $ul->outerHTML();

Outer HTML will print

<ul class="list-items">
	<li>Item one</li>
	<li>Item two</li>
	<li>Item three</li>
</ul>

echo $ul->innerHTML();

Inner HTML will print

<li>Item one</li>
<li>Item two</li>
<li>Item three</li>
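
For comparison, the same two views of an element can be produced with PHP's built-in DOMDocument::saveHTML(), which accepts a node argument. This is a sketch of the general technique, not necessarily how outerHTML() and innerHTML() are implemented in this package; exact whitespace in the output depends on the underlying libxml serializer.

```php
<?php
$dom = new DOMDocument();
$dom->loadHTML('<html><body><ul class="list-items"><li>Item one</li><li>Item two</li></ul></body></html>');
$ul = $dom->getElementsByTagName('ul')->item(0);

// Outer HTML: serialise the node itself, including its own tag.
$outer = $dom->saveHTML($ul);

// Inner HTML: serialise each child node and join the pieces.
$inner = '';
foreach ($ul->childNodes as $child) {
    $inner .= $dom->saveHTML($child);
}

echo $outer, "\n";
echo $inner, "\n";
```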

10. Multi Level Finder


This section demonstrates how DOMFinder queries work across multiple levels of nesting.

<div class="parent-class">
	<div class="child-class">
		<ul class="list">
			<li class="item">one</li>
			<li class="item">two</li>
			<li class="item">three</li>
		</ul>
	</div>
	<div class="child-class">
		<ul class="list">
			<li class="item">one</li>
			<li class="item">two</li>
			<li class="item">three</li>
		</ul>
	</div>
</div>

Simple

$uls = $domFinder->find('div')->byClass('child-class')->find('ul')->all();
// or
$uls = $domFinder->find('div')->byClass('child-class')->findAll('ul');

The query above is equivalent to this DOMXPath query

$uls = $domFinder->finder()->query("//div[@class='child-class']/ul");

You will get all the ul elements

if($uls->length)
{
	foreach($uls as $ul)
	{
		var_dump($ul);
	}
}

Element Level

With this approach, a DOMFinder instance is created for the element itself the first time you run a query on it.

$div = $domFinder->find('div')->byClass('parent-class')->first();
if($div)
{
	$divs = $div->find('div')->byClass('child-class')->all(); // At this level DOMFinder instance will be created and assigned to this element
	if($divs->length)
	{
		echo $divs->length;
	}
}
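
In plain DOMXPath, the closest counterpart to an element-level query is passing a context node as the second argument of query(). The sketch below uses only PHP's built-in classes and invented markup (including an extra child-class div outside the parent) to show how the search is confined to one element; it does not describe the package's internals.

```php
<?php
$dom = new DOMDocument();
$dom->loadHTML(
    '<html><body>'
    . '<div class="parent-class">'
    . '<div class="child-class"><ul><li>one</li></ul></div>'
    . '<div class="child-class"><ul><li>two</li></ul></div>'
    . '</div>'
    . '<div class="child-class">outside</div>'
    . '</body></html>'
);
$xpath = new DOMXPath($dom);

$parent = $xpath->query("//div[@class='parent-class']")->item(0);

// The leading "." keeps the search inside $parent.
$children = $xpath->query(".//div[@class='child-class']", $parent);

// Without a context node the whole document is searched.
$everywhere = $xpath->query("//div[@class='child-class']");

echo $children->length, "\n";   // 2
echo $everywhere->length, "\n"; // 3
```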
