Methods |
public
|
__construct(DOMNodeList|DOMNode|DOMNode[]|string|null $node = null, ?string $uri = null, ?string $baseHref = null)
Parameters
$node |
A Node to use as the base for the crawling
|
|
#
|
public
|
getUri(): string|null
Returns the current URI.
|
#
|
public
|
getBaseHref(): string|null
Returns base href.
|
#
|
public
|
clear()
Removes all the nodes.
|
#
|
public
|
add(DOMNodeList|DOMNode|DOMNode[]|string|null $node)
Adds a node to the current list of nodes.
This method uses the appropriate specialized add*() method based
on the type of the argument.
Parameters
Throws
|
#
|
public
|
addContent(string $content, ?string $type = null)
Adds HTML/XML content.
If the charset is not set via the content type, it is assumed to be UTF-8,
or ISO-8859-1 as a fallback, which is the default charset defined by the
HTTP 1.1 specification.
|
#
|
public
|
addHtmlContent(string $content, string $charset = 'UTF-8')
Adds HTML content to the list of nodes.
libxml errors are suppressed while the content is parsed.
To collect parsing errors, enable internal errors with
libxml_use_internal_errors(true) before parsing, then read them
with libxml_get_errors(). Be sure to clear the errors with
libxml_clear_errors() afterward.
|
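The error-handling advice above can be sketched as follows. This is a minimal example, assuming the component is installed through Composer; the `vendor/autoload.php` path and the sample markup are assumptions. The number and wording of the reported errors depend on the libxml version and on which parser handles the content, so no output is shown.

```php
<?php
// Assumption: symfony/dom-crawler is installed via Composer in this directory.
require __DIR__ . '/vendor/autoload.php';

use Symfony\Component\DomCrawler\Crawler;

// Enable internal errors before parsing and remember the previous setting.
$previous = libxml_use_internal_errors(true);

$crawler = new Crawler();
// Hypothetical, deliberately sloppy markup.
$crawler->addHtmlContent('<html><body><p>Unclosed <b>bold</p><unknown-tag></body></html>');

// Read whatever libxml collected while parsing.
$errors = libxml_get_errors();
foreach ($errors as $error) {
    printf("line %d: %s\n", $error->line, trim($error->message));
}

// Clear the error buffer and restore the previous setting.
libxml_clear_errors();
libxml_use_internal_errors($previous);
```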
#
|
public
|
addXmlContent(string $content, string $charset = 'UTF-8', int $options = LIBXML_NONET)
Adds XML content to the list of nodes.
libxml errors are suppressed while the content is parsed.
To collect parsing errors, enable internal errors with
libxml_use_internal_errors(true) before parsing, then read them
with libxml_get_errors(). Be sure to clear the errors with
libxml_clear_errors() afterward.
Parameters
|
#
|
public
|
addDocument(DOMDocument $dom)
Adds a \DOMDocument to the list of nodes.
Parameters
$dom |
A \DOMDocument instance
|
|
#
|
public
|
addNodeList(DOMNodeList $nodes)
Adds a \DOMNodeList to the list of nodes.
Parameters
$nodes |
A \DOMNodeList instance
|
|
#
|
public
|
addNodes(DOMNode[] $nodes)
Adds an array of \DOMNode instances to the list of nodes.
Parameters
$nodes |
An array of \DOMNode instances
|
|
#
|
public
|
addNode(DOMNode $node)
Adds a \DOMNode instance to the list of nodes.
Parameters
$node |
A \DOMNode instance
|
|
#
|
public
|
eq(int $position): static
Returns a node given its position in the node list.
|
#
|
public
|
each(Closure $closure): array
Calls an anonymous function on each node of the list.
The anonymous function receives the position and the node wrapped
in a Crawler instance as arguments.
Example:
$crawler->filter('h1')->each(function ($node, $i) {
return $node->text();
});
Parameters
$closure |
An anonymous function
|
Returns
An array of values returned by the anonymous function
|
#
|
public
|
slice(int $offset = 0, ?int $length = null): static
Slices the list of nodes by $offset and $length.
|
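A short sketch of how $offset and $length select nodes. It assumes the component is installed through Composer; the `vendor/autoload.php` path and the sample list are assumptions. filterXPath() is used so the example does not need the CssSelector component.

```php
<?php
// Assumption: symfony/dom-crawler is installed via Composer in this directory.
require __DIR__ . '/vendor/autoload.php';

use Symfony\Component\DomCrawler\Crawler;

// Hypothetical sample markup with four list items.
$crawler = new Crawler('<ul><li>a</li><li>b</li><li>c</li><li>d</li></ul>');
$items = $crawler->filterXPath('//li');

// Skip the first node, then keep the next two.
$middle = $items->slice(1, 2)->each(function (Crawler $node, int $i): string {
    return $node->text();
});
// $middle is ['b', 'c']

// Without a length, the slice runs to the end of the list.
$tail = $items->slice(2)->each(function (Crawler $node, int $i): string {
    return $node->text();
});
// $tail is ['c', 'd']
```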
#
|
public
|
reduce(Closure $closure): static
Reduces the list of nodes by calling an anonymous function.
To remove a node from the list, the anonymous function must return false.
Parameters
$closure |
An anonymous function
|
|
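The rule that returning false drops a node can be illustrated with a small sketch. It assumes the component is installed through Composer; the `vendor/autoload.php` path, the sample markup, and the "done" class are assumptions.

```php
<?php
// Assumption: symfony/dom-crawler is installed via Composer in this directory.
require __DIR__ . '/vendor/autoload.php';

use Symfony\Component\DomCrawler\Crawler;

// Hypothetical task list; two items carry class="done".
$crawler = new Crawler('<ul><li class="done">a</li><li>b</li><li class="done">c</li></ul>');

// Returning false removes the node; any other return value keeps it.
$pending = $crawler->filterXPath('//li')->reduce(function (Crawler $node, int $i): bool {
    return 'done' !== $node->attr('class');
});

// Only the item without class="done" remains.
$remaining = $pending->each(function (Crawler $node, int $i): string {
    return $node->text();
});
// $remaining is ['b']
```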
#
|
public
|
first(): static
Returns the first node of the current selection.
|
#
|
public
|
last(): static
Returns the last node of the current selection.
|
#
|
public
|
siblings(): static
Returns the sibling nodes of the current selection.
Throws
|
#
|
public
|
matches(string $selector): bool
|
#
|
public
|
closest(string $selector): ?self
Returns the first parent (heading toward the document root) of the element that matches the provided selector.
Throws
|
#
|
public
|
nextAll(): static
Returns the next sibling nodes of the current selection.
Throws
|
#
|
public
|
previousAll(): static
Returns the previous sibling nodes of the current selection.
Throws
|
#
|
public
|
parents(): static
Returns the parent nodes of the current selection.
Throws
|
#
|
public
|
ancestors(): static
Returns the ancestors of the current selection.
Throws
|
#
|
public
|
children(?string $selector = null): static
Returns the child nodes of the current selection.
Throws
|
#
|
public
|
attr(string $attribute): string|null
Returns the attribute value of the first node of the list.
Throws
|
#
|
public
|
nodeName(): string
Returns the node name of the first node of the list.
Throws
|
#
|
public
|
text(string|null $default = null, bool $normalizeWhitespace = true): string
Returns the text of the first node of the list.
Whitespace is normalized by default; pass false as the second argument to keep it unchanged.
Parameters
$default |
When not null: the value to return when the current node is empty
|
$normalizeWhitespace |
Whether whitespace should be trimmed and normalized to single spaces
|
Throws
|
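The two parameters can be seen side by side in a minimal sketch. It assumes the component is installed through Composer; the `vendor/autoload.php` path, the sample markup, and the 'n/a' fallback are assumptions.

```php
<?php
// Assumption: symfony/dom-crawler is installed via Composer in this directory.
require __DIR__ . '/vendor/autoload.php';

use Symfony\Component\DomCrawler\Crawler;

// Hypothetical markup with irregular whitespace inside the paragraph.
$crawler = new Crawler("<div><p>  Hello \n   world  </p></div>");
$p = $crawler->filterXPath('//p');

// Default: whitespace is trimmed and collapsed to single spaces.
$normalized = $p->text();
// $normalized is 'Hello world'

// Second argument false: the text is returned with its original whitespace.
$raw = $p->text(null, false);

// No <h2> exists, so the node list is empty and $default is returned
// instead of an exception being thrown.
$fallback = $crawler->filterXPath('//h2')->text('n/a');
// $fallback is 'n/a'
```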
#
|
public
|
innerText(): string
Returns only the inner text that is the direct descendant of the current node, excluding any child nodes.
|
#
|
public
|
html(string|null $default = null): string
Returns the first node of the list as HTML.
Parameters
$default |
When not null: the value to return when the current node is empty
|
Throws
|
#
|
public
|
outerHtml(): string
|
#
|
public
|
evaluate(string $xpath): array|Crawler
Evaluates an XPath expression.
Since an XPath expression might evaluate to either a simple type or a \DOMNodeList,
this method will return either an array of simple types or a new Crawler instance.
|
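Both return shapes can be shown with a minimal sketch. It assumes the component is installed through Composer; the `vendor/autoload.php` path, the sample markup, and the data-id attribute are assumptions.

```php
<?php
// Assumption: symfony/dom-crawler is installed via Composer in this directory.
require __DIR__ . '/vendor/autoload.php';

use Symfony\Component\DomCrawler\Crawler;

// Hypothetical markup with a made-up data-id attribute.
$crawler = new Crawler('<ul><li data-id="a1">first</li><li data-id="b2">second</li></ul>');

// An expression that yields a simple type is evaluated once per node in the
// list, so the result is an array with one entry per node.
$ids = $crawler->filterXPath('//li')->evaluate('string(@data-id)');
// $ids is ['a1', 'b2']

// An expression that yields a node set produces a new Crawler instead.
$nodes = $crawler->evaluate('//li');
// $nodes is a Crawler holding both <li> elements
```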
#
|
public
|
extract(array $attributes): array
Extracts information from the list of nodes.
You can extract attributes and/or the node value (_text).
Example:
$crawler->filter('h1 a')->extract(['_text', 'href']);
|
#
|
public
|
filterXPath(string $xpath): static
Filters the list of nodes with an XPath expression.
The XPath expression is evaluated in the context of the crawler, which
is treated as a fake parent of the elements inside it.
This means that a child selector "div" or "./div" will match only
the div elements of the current crawler, not their children.
|
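The "fake parent" behavior is easier to see on nested elements. The sketch below assumes the component is installed through Composer; the `vendor/autoload.php` path, the sample markup, and the ids are assumptions.

```php
<?php
// Assumption: symfony/dom-crawler is installed via Composer in this directory.
require __DIR__ . '/vendor/autoload.php';

use Symfony\Component\DomCrawler\Crawler;

// Hypothetical markup: one div nested inside another.
$crawler = new Crawler('<div id="outer"><div id="inner">x</div></div>');
$outer = $crawler->filterXPath('//div[@id="outer"]');

// "div" is a child of the fake parent, so it matches the element the
// crawler already holds, not the div nested inside it.
$own = $outer->filterXPath('div');
// count($own) is 1 and $own->attr('id') is 'outer'

// "//div" matches the held element and every div below it.
$all = $outer->filterXPath('//div');
// count($all) is 2
```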
#
|
public
|
filter(string $selector): static
Filters the list of nodes with a CSS selector.
This method only works if you have installed the CssSelector Symfony Component.
Throws
|
#
|
public
|
selectLink(string $value): static
Selects links by name or alt value for clickable images.
|
#
|
public
|
selectImage(string $value): static
Selects images by alt value.
|
#
|
public
|
selectButton(string $value): static
Selects a button by name or alt value for images.
|
#
|
public
|
link(string $method = 'get'): Link
Returns a Link object for the first node in the list.
Throws
|
#
|
public
|
links(): Link[]
Returns an array of Link objects for the nodes in the list.
Throws
|
#
|
public
|
image(): Image
Returns an Image object for the first node in the list.
Throws
|
#
|
public
|
images(): Image[]
Returns an array of Image objects for the nodes in the list.
|
#
|
public
|
form(?array $values = null, ?string $method = null): Form
Returns a Form object for the first node in the list.
Throws
|
#
|
public
|
setDefaultNamespacePrefix(string $prefix)
Overloads a default namespace prefix to be used with XPath and CSS expressions.
|
#
|
public
|
registerNamespace(string $prefix, string $namespace)
|
#
|
public
static
|
xpathLiteral(string $s): string
Converts a string into a literal for use in XPath expressions.
Escaped characters are: quotes (") and apostrophes (').
Examples:
echo Crawler::xpathLiteral('foo " bar');
//prints 'foo " bar'
echo Crawler::xpathLiteral("foo ' bar");
//prints "foo ' bar"
echo Crawler::xpathLiteral('a\'b"c');
//prints concat('a', "'", 'b"c')
|
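A typical use is embedding an arbitrary value in an XPath expression passed to filterXPath(). The sketch below assumes the component is installed through Composer; the `vendor/autoload.php` path, the sample markup, and the title value are assumptions.

```php
<?php
// Assumption: symfony/dom-crawler is installed via Composer in this directory.
require __DIR__ . '/vendor/autoload.php';

use Symfony\Component\DomCrawler\Crawler;

// Hypothetical markup: the title attribute contains both quote characters.
$crawler = new Crawler('<p><a title="O\'Reilly &quot;books&quot;" href="/oreilly">publisher</a></p>');

// A value that cannot be wrapped in either quote style on its own.
$title = 'O\'Reilly "books"';

// xpathLiteral() turns the value into a concat(...) expression, so the
// surrounding XPath stays valid whatever quotes the value contains.
$xpath = sprintf('//a[@title=%s]', Crawler::xpathLiteral($title));

$href = $crawler->filterXPath($xpath)->attr('href');
// $href is '/oreilly'
```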
#
|
public
|
getNode(int $position): DOMNode|null
|
#
|
public
|
count(): int
|
#
|
public
|
getIterator(): ArrayIterator<int, DOMNode>
|
#
|
protected
|
sibling(DOMNode $node, string $siblingDir = 'nextSibling'): array
|
#
|