Posts

Showing posts from March, 2017
Some useful XPath expressions to extract the correct element from HTML/XML.
---------------------------------------------------------------------------------
I ran into many problems while learning to write XPath expressions to pull the correct elements out of HTML and XML documents. Here are some of the longer, more complex XPaths with their explanations.

1. xpathApply(doc,"//p[not(ancestor::footer[@class='footer footer-intl'])][not(@class='email-signup-description')]",xmlValue)
Explanation: Get the value of those "p" nodes that do not have a "footer" ancestor whose class attribute is exactly "footer footer-intl", and that do not themselves have the class attribute "email-signup-description".

2. xpathSApply(doc,"//p[not(contains(@class,'promo__summary'))]", xmlValue)
Explanation: Get the value of those "p" nodes, where the value of the "class" …
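The two expressions in this post can be tried on a small document. The following is only a sketch: it assumes the XML package is installed, the markup is made up for illustration, and xpathSApply is used for both expressions so that the results come back as character vectors rather than lists.

```r
library(XML)

# Made-up markup, for illustration only (not taken from a real page)
markup <- '<html><body>
  <p>Main text</p>
  <p class="email-signup-description">Sign up</p>
  <p class="promo__summary big">Promo</p>
  <footer class="footer footer-intl"><p>Footer text</p></footer>
</body></html>'

doc <- xmlParse(markup, asText = TRUE)

# Expression 1: drop "p" nodes inside the footer and the email sign-up "p"
kept_1 <- xpathSApply(
  doc,
  "//p[not(ancestor::footer[@class='footer footer-intl'])][not(@class='email-signup-description')]",
  xmlValue
)
print(kept_1)  # "Main text" "Promo"

# Expression 2: drop "p" nodes whose class contains 'promo__summary'
kept_2 <- xpathSApply(doc, "//p[not(contains(@class,'promo__summary'))]", xmlValue)
print(kept_2)  # "Main text" "Sign up" "Footer text"
```

Note that @class='...' is an exact match on the whole attribute value, while contains() also matches when the element carries additional classes, as with "promo__summary big" here.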
Facebook extraction using R (for beginners of Facebook App development).
----------------------------------------------------------------------------------------------
It took me a lot of time to authenticate with Facebook when I was trying to extract data from Facebook using the Rfacebook package in R. One important step in authenticating our app from R code is not documented properly.

After creating the application, go to Settings and click on "Add Platform". There you will find a text box named "Site URL".

fbOAuth(app_id = "XXXXXXXXXXXXXX",app_secret = "XXXXXXXXXXXXXXXXX")

When we run the above code, we get a short link on the console: "http://localhost:1410/". This link has to be pasted into the Site URL text box in the app settings and saved. After doing this, we press "Enter" on the R console, which lets the API authenticate the app with Facebook. Once this is done, the app is authenticated and a browse…
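The fbOAuth call in this post can be wrapped so that the token is kept for later sessions. This is a rough sketch, not the package's recommended workflow: the helper name fb_authenticate, the environment variable names FB_APP_ID and FB_APP_SECRET, and the file name fb_oauth.rds are all hypothetical, and it assumes the Rfacebook package is installed and that the Site URL step has been completed.

```r
# Hypothetical helper around Rfacebook::fbOAuth(); names and defaults are
# illustrative assumptions, not part of the Rfacebook package.
fb_authenticate <- function(app_id = Sys.getenv("FB_APP_ID"),
                            app_secret = Sys.getenv("FB_APP_SECRET"),
                            token_file = "fb_oauth.rds") {
  if (!requireNamespace("Rfacebook", quietly = TRUE)) {
    stop("The Rfacebook package is not installed.")
  }
  # Interactive step: shows the localhost link on the console and waits
  # for Enter once the link has been saved as the app's Site URL.
  token <- Rfacebook::fbOAuth(app_id = app_id, app_secret = app_secret)
  # Keep the token on disk so the interactive step is not repeated each time.
  saveRDS(token, token_file)
  invisible(token)
}
```

Reading the app ID and secret from environment variables keeps them out of the script itself; calling fb_authenticate() with no arguments then runs the interactive authentication once.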
A very important thing to remember while making an RMarkdown file.
---------------------------------------------------------------------------------
When coding the different R code chunks of a markdown file, it is very intuitive and tempting to copy and paste code from other chunks in the same file for similar analysis. Please do not just copy and paste code from other chunks. All chunks in a document are evaluated in the same R session, so when you use the same variable names in more than one chunk, a later chunk overwrites the values set by the chunk the code was copied from. I faced this issue, and it took me a very long time to realize that my variables were being modified in many chunks, so I was getting wrong values. Even if you copy the logic from another chunk, don't forget to rename the variables, so that each chunk has its own variables and does not modify variables from other chunks.
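The effect can be reproduced outside RMarkdown by evaluating two "chunks" in one shared environment, which is roughly what happens when a document is knitted. This is a sketch only; the chunk contents and variable names are made up for illustration.

```r
# One environment shared by all chunks, standing in for the knitting session
session <- new.env()

chunk_1 <- quote({
  sales <- c(10, 20, 30)
  total <- sum(sales)
})

# Copy-pasted from chunk 1 and adapted, but with the same variable names
chunk_2 <- quote({
  sales <- c(1, 2, 3)
  total <- sum(sales)
})

eval(chunk_1, envir = session)
total_after_chunk_1 <- get("total", envir = session)  # 60

eval(chunk_2, envir = session)
total_after_chunk_2 <- get("total", envir = session)  # 6: chunk 1's value is overwritten

# Same logic with renamed variables: each chunk now owns its values
chunk_2_renamed <- quote({
  returns <- c(1, 2, 3)
  returns_total <- sum(returns)
})

eval(chunk_1, envir = session)
eval(chunk_2_renamed, envir = session)
print(get("total", envir = session))          # 60
print(get("returns_total", envir = session))  # 6
```

Any later chunk that reads total expecting the value from chunk 1 silently gets the value from chunk 2 instead, which is the kind of wrong result described in this post.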