Some useful XPath expressions for extracting the right elements from HTML/XML
------------------------------------------------------------------------------

I ran into many problems while learning to write XPath expressions that select the right elements from HTML and XML documents.

Here are some of the longer, more complex XPaths I ended up with, along with their explanations.

1. xpathApply(doc,"//p[not(ancestor::footer[@class='footer footer-intl'])][not(@class='email-signup-description')]",xmlValue)

Explanation: Get the value of those "p" nodes which do not have a "footer" ancestor whose "class" attribute is exactly "footer footer-intl", and whose own "class" attribute is not "email-signup-description". Note that the second predicate tests the "p" node itself, not its ancestors.
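As a rough illustration, expression 1 can be tried on a small document. This is only a sketch: the markup below is made up for the example, and it is parsed with xmlParse() from the XML package because the snippet is well formed. The first predicate looks at the ancestors of each "p" node, while the second looks at the "p" node's own class attribute.

```r
library(XML)

# Hypothetical sample markup, invented purely for this illustration.
markup <- '<html><body>
  <p>Main story text.</p>
  <p class="email-signup-description">Sign up for our newsletter.</p>
  <footer class="footer footer-intl"><p>Copyright notice.</p></footer>
</body></html>'

doc <- xmlParse(markup, asText = TRUE)

# Keep "p" nodes that are outside the international footer
# and whose own class is not "email-signup-description".
kept <- xpathSApply(
  doc,
  "//p[not(ancestor::footer[@class='footer footer-intl'])][not(@class='email-signup-description')]",
  xmlValue
)
print(kept)
```

Only the first paragraph survives both predicates in this sample.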

2. xpathSApply(doc,"//p[not(contains(@class,'promo__summary'))]", xmlValue)

Explanation: Get the value of those "p" nodes whose "class" attribute does not contain the text "promo__summary". Because contains() is a substring test, "p" nodes with no "class" attribute at all are also returned.
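A small sketch of expression 2, again on made-up markup. It is meant to show two things about contains(): it matches substrings, so a longer class value that includes "promo__summary" is excluded too, and a "p" node with no class attribute is kept because a missing attribute is treated as an empty string.

```r
library(XML)

# Hypothetical sample markup, invented purely for this illustration.
markup <- '<html><body>
  <p class="promo__summary">Promo text.</p>
  <p class="promo__summary promo__summary--large">Large promo text.</p>
  <p class="intro">Intro text.</p>
  <p>Text without a class.</p>
</body></html>'

doc <- xmlParse(markup, asText = TRUE)

# Keep "p" nodes whose class value does not contain "promo__summary".
kept <- xpathSApply(doc, "//p[not(contains(@class,'promo__summary'))]", xmlValue)
print(kept)
```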

3. xpathApply(doc,"//p[not(ancestor::div[@class='story'])][not(ancestor::div[@class='disclaimer'])][not(ancestor::div[@class='footer-policy-links'])]",xmlValue)

Explanation: Get the value of those "p" nodes for which all of the following conditions are met:

  •   No ancestor is a "div" element whose "class" attribute is 'story'.
  •   No ancestor is a "div" element whose "class" attribute is 'disclaimer'.
  •   No ancestor is a "div" element whose "class" attribute is 'footer-policy-links'.
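The three conditions of expression 3 can be checked with a sketch like the one below. The markup is hypothetical; the "section" wrapper is there only to show that the ancestor axis matches at any depth, not just the direct parent.

```r
library(XML)

# Hypothetical sample markup, invented purely for this illustration.
markup <- '<html><body>
  <div class="content"><p>Kept paragraph.</p></div>
  <div class="story"><section><p>Nested story text.</p></section></div>
  <div class="disclaimer"><p>Disclaimer text.</p></div>
  <div class="footer-policy-links"><p>Privacy policy.</p></div>
</body></html>'

doc <- xmlParse(markup, asText = TRUE)

# Chained predicates are combined with a logical AND:
# a "p" node is kept only if none of the three div types is among its ancestors.
kept <- xpathSApply(
  doc,
  "//p[not(ancestor::div[@class='story'])][not(ancestor::div[@class='disclaimer'])][not(ancestor::div[@class='footer-policy-links'])]",
  xmlValue
)
print(kept)
```

Since @class='story' is an exact comparison, a div with a class value such as "story featured" would not be excluded by this expression; contains() would be needed for that.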

These XPaths should give you an idea of how to extract the appropriate node values from HTML/XML.
