How to keep <p><img … /></p> with XPATH?
11-02-2021
Question
I use XPath to remove untidy HTML tags:
$nodeList = $xpath->query("//*[normalize-space(.)='' and not(self::br)]");
foreach ($nodeList as $node) {
    $node->parentNode->removeChild($node);
}
This removes messy input like the following:
<p><em><br /></em></p>
<p><span style="text-decoration: underline;"><em><br /></em></span></p>
but it also removes an img tag like the one below, which I want to keep:
<p><img title="picture summit" src="images/32913430_127001_e.jpg" alt="picture summit" width="590" height="366" /></p>
How can I keep the img tag with XPath?
Solution
Use:
//p[not(descendant::*[self::img or self::br]) and normalize-space()='']
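As a rough illustration, the expression above could be applied with PHP's DOMDocument and DOMXPath as sketched below. The sample markup and the helper name countParagraphsAfterRemoval are assumptions made for this example, not part of the original answer. Because the predicate also excludes paragraphs that have a br descendant, the br-only paragraphs from the question are left in place; the second query drops "or self::br" as one possible variant that removes them while still keeping paragraphs that contain an img.

```php
<?php
// Hypothetical helper: removes every node matched by $query and
// returns how many <p> elements are still in the document afterwards.
function countParagraphsAfterRemoval(string $html, string $query): int
{
    $doc = new DOMDocument();
    $doc->loadHTML($html);
    $xpath = new DOMXPath($doc);

    // Copy the result into an array before modifying the tree.
    foreach (iterator_to_array($xpath->query($query)) as $node) {
        if ($node->parentNode !== null) {
            $node->parentNode->removeChild($node);
        }
    }

    return $xpath->query('//p')->length;
}

// Assumed sample input: an empty paragraph, a br-only paragraph,
// an image paragraph, and an ordinary text paragraph.
$html = '<div>'
      . '<p></p>'
      . '<p><em><br /></em></p>'
      . '<p><img src="images/32913430_127001_e.jpg" alt="picture summit" /></p>'
      . '<p>Some text</p>'
      . '</div>';

// Expression from the answer: only paragraphs with no img or br descendant.
$asAnswered = countParagraphsAfterRemoval(
    $html,
    "//p[not(descendant::*[self::img or self::br]) and normalize-space()='']"
);

// Variant (an assumption, not the answer's expression): br-only paragraphs go too.
$imgOnly = countParagraphsAfterRemoval(
    $html,
    "//p[not(descendant::img) and normalize-space()='']"
);
```

Which of the two fits depends on whether paragraphs holding only a line break count as content worth keeping.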
Other tips
Maybe you could use an XPath 1.0 expression like the one below to remove unwanted paragraphs:
//p[count(text())=0 and count(img)=0]
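One thing to watch with this expression: text() and img are evaluated against the direct children of each p only, so a paragraph whose text or image is wrapped in another element is also selected. The sketch below, using assumed sample markup with made-up id values and a hypothetical helper matchedIds, compares it with a stricter variant that looks at all descendants.

```php
<?php
// Assumed sample input; the id attributes exist only to label the cases.
$html = '<div>'
      . '<p id="a"><em><br /></em></p>'
      . '<p id="b"><img src="images/32913430_127001_e.jpg" alt="picture summit" /></p>'
      . '<p id="c"><em>nested text</em></p>'
      . '<p id="d"><span><img src="images/32913430_127001_e.jpg" alt="picture summit" /></span></p>'
      . '</div>';

$doc = new DOMDocument();
$doc->loadHTML($html);
$xpath = new DOMXPath($doc);

// Hypothetical helper: returns the id of every paragraph selected by $query.
function matchedIds(DOMXPath $xpath, string $query): array
{
    $ids = [];
    foreach ($xpath->query($query) as $p) {
        $ids[] = $p->getAttribute('id');
    }
    return $ids;
}

// Expression from the hint: checks direct children only,
// so paragraphs "c" and "d" are selected along with "a".
$hint = matchedIds($xpath, '//p[count(text())=0 and count(img)=0]');

// Stricter variant (an assumption): checks descendant text and images,
// so only paragraph "a" is selected.
$strict = matchedIds($xpath, "//p[normalize-space()='' and count(.//img)=0]");
```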
Not affiliated with StackOverflow