Gaao ka chhora

March 30, 2017

Some useful XPath expressions to extract the correct elements from HTML/XML.
---------------------------------------------------------------------------------

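I ran into many problems while learning to write XPath expressions that pull the correct elements out of HTML and XML documents.

Here are some of the longer, more complex XPaths I ended up with, along with their explanations. The calls use xpathApply/xpathSApply and xmlValue from R's XML package, with "doc" being an already parsed document.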

1. xpathApply(doc,"//p[not(ancestor::footer[@class='footer footer-intl'])][not(@class='email-signup-description')]",xmlValue)

Explanation: Get the value of those "p" nodes which do not have a "footer" ancestor whose "class" attribute is exactly "footer footer-intl", and whose own "class" attribute is not "email-signup-description". Note that the second predicate tests the "p" node itself, not its ancestors.
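To see the two predicates at work, here is a minimal sketch run against a small made-up document. The markup and the variable names (page, kept) are my own assumptions for illustration, not taken from a real site. It uses xmlParse on a well-formed snippet; for a real web page you would more likely use htmlParse.

```r
library(XML)

# Hypothetical sample markup, invented for this illustration
page <- '<html><body>
  <p>Main story text</p>
  <p class="email-signup-description">Sign up for our newsletter</p>
  <footer class="footer footer-intl"><p>Copyright notice</p></footer>
  <footer class="footer"><p>Other footer text</p></footer>
</body></html>'

doc <- xmlParse(page, asText = TRUE)

# First predicate: drop "p" nodes inside <footer class="footer footer-intl">
# Second predicate: drop "p" nodes whose own class is "email-signup-description"
kept <- xpathSApply(
  doc,
  "//p[not(ancestor::footer[@class='footer footer-intl'])][not(@class='email-signup-description')]",
  xmlValue
)

print(kept)
```

The "p" inside the footer with class "footer" is kept, because @class='...' compares the whole attribute string and "footer" is not equal to "footer footer-intl".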

2. xpathSApply(doc,"//p[not(contains(@class,'promo__summary'))]", xmlValue)

Explanation: Get the value of those "p" nodes whose "class" attribute does not contain the text "promo__summary". A "p" node with no "class" attribute at all also matches.
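A small sketch of how contains() behaves, again on made-up markup (the class names other than "promo__summary" are assumptions for illustration). Because contains() is a substring test, it also catches longer class values that merely include the text.

```r
library(XML)

# Hypothetical sample markup, invented for this illustration
page <- '<div>
  <p class="promo__summary">Promo one</p>
  <p class="card promo__summary--large">Promo two</p>
  <p class="body-text">Article text</p>
  <p>No class at all</p>
</div>'

doc <- xmlParse(page, asText = TRUE)

# Keep "p" nodes whose class does not contain the substring "promo__summary"
kept <- xpathSApply(doc, "//p[not(contains(@class,'promo__summary'))]", xmlValue)

print(kept)
```

Both promo paragraphs are dropped, including the one with class "card promo__summary--large", while the paragraph without any class attribute is kept.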

3. xpathApply(doc,"//p[not(ancestor::div[@class='story'])][not(ancestor::div[@class='disclaimer'])][not(ancestor::div[@class='footer-policy-links'])]",xmlValue)

Explanation: Get the value of those "p" nodes that meet all of the following conditions:

No ancestor is a "div" element whose "class" attribute is 'story'.
No ancestor is a "div" element whose "class" attribute is 'disclaimer'.
No ancestor is a "div" element whose "class" attribute is 'footer-policy-links'.
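Chained predicates like these are combined with a logical AND, so the same filter can also be written as a single not() with "or" inside it. The sketch below checks that on made-up markup; the sample document and the names chained, combined, kept_chained and kept_combined are my own assumptions for illustration.

```r
library(XML)

# Hypothetical sample markup, invented for this illustration
page <- '<body>
  <p>Intro</p>
  <div class="story"><p>Story text</p></div>
  <div class="disclaimer"><section><p>Legal text</p></section></div>
  <div class="footer-policy-links"><p>Privacy policy</p></div>
  <div class="content"><p>Body text</p></div>
</body>'

doc <- xmlParse(page, asText = TRUE)

# Three chained predicates, as in the expression above
chained <- "//p[not(ancestor::div[@class='story'])][not(ancestor::div[@class='disclaimer'])][not(ancestor::div[@class='footer-policy-links'])]"

# Equivalent single predicate: no ancestor div with any of the three classes
combined <- "//p[not(ancestor::div[@class='story' or @class='disclaimer' or @class='footer-policy-links'])]"

kept_chained  <- xpathSApply(doc, chained, xmlValue)
kept_combined <- xpathSApply(doc, combined, xmlValue)

print(kept_chained)
print(identical(kept_chained, kept_combined))
```

"Legal text" is dropped even though its "div" is not the direct parent, because the ancestor axis looks at every level above the node.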

These XPaths should give you an idea of how to extract the appropriate node values from HTML/XML.
