Posts

Showing posts from 2019
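Selecting rows of a DataFrame based on a value/string pattern in a column

Comparing a string pattern with the values in a column of a pandas DataFrame: many times I have seen that x_df.loc[x_df.column_A == "xyz", :] fails to return the rows which contain "xyz" in column_A of the DataFrame x_df. The == comparison matches only cells whose value is exactly "xyz", so it misses cells with surrounding whitespace or with "xyz" embedded in a longer string. To select every row where column_A contains "xyz", it is better to use x_df[x_df['column_A'].str.contains("xyz")]. Note that this is a substring match (a regular-expression match by default), so it also returns rows such as "abcxyz".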
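A minimal sketch of the difference between the two selections. The sample values in column_A are made up for illustration, and na=False and regex=False are optional arguments I am assuming you would want here (treat missing values as non-matches, and treat "xyz" as a literal string rather than a regular expression).

```python
import pandas as pd

# Hypothetical sample data: an exact value, a padded value,
# an embedded value, a non-match and a missing value.
x_df = pd.DataFrame({"column_A": ["xyz", " xyz", "abc xyz def", "abc", None]})

# Exact comparison: only cells equal to "xyz" are selected.
exact = x_df.loc[x_df["column_A"] == "xyz", :]

# Substring match: every cell that contains "xyz" is selected.
contains = x_df[x_df["column_A"].str.contains("xyz", na=False, regex=False)]

# If stray whitespace is the only problem, stripping keeps the match exact.
stripped = x_df[x_df["column_A"].str.strip() == "xyz"]

print(exact.index.tolist())
print(contains.index.tolist())
print(stripped.index.tolist())
```

Which form is right depends on the intent: str.contains() is the broadest, while stripping and then comparing with == avoids picking up longer strings that merely include the pattern.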
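'Sum' with groupby() on a column in pandas
======================================================
When we use the aggregation function 'sum' on the result of groupby() on a column, as in x_df.groupby(['Column1']).sum(), and the DataFrame x_df has multiple columns, not every column necessarily appears in the result. Columns with a numeric dtype (both int and float) are summed. A column whose values look like integers but which is stored with dtype "object" (for example, numbers read in as strings) is not treated as numeric: older pandas versions silently dropped such columns from the result, while pandas 2.0 and later no longer drop them by default and instead attempt the sum on the values as stored (for strings this typically means concatenation). If a column is missing from the result, or looks wrong, check its dtype with x_df.dtypes and convert it to a numeric type before grouping.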
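A small sketch of how to check for this situation. The column names ints, floats and digits_as_text and their values are hypothetical. Passing numeric_only=True is used here only to show which columns pandas considers numeric, and pd.to_numeric is one way to convert a text column.

```python
import pandas as pd

# Hypothetical data: one column holds digits stored as text.
x_df = pd.DataFrame({
    "Column1": ["a", "a", "b"],
    "ints": [1, 2, 3],
    "floats": [0.5, 1.5, 2.5],
    "digits_as_text": ["10", "20", "30"],
})

# Inspect the dtypes first: the text column is not numeric.
print(x_df.dtypes)

# Restricting the sum to numeric columns leaves the text column out.
numeric_only_result = x_df.groupby(["Column1"]).sum(numeric_only=True)

# Convert the text column, then it is summed like any other number.
x_df["digits_as_text"] = pd.to_numeric(x_df["digits_as_text"])
result = x_df.groupby(["Column1"]).sum()

print(numeric_only_result.columns.tolist())
print(result)
```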
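When we use apply() with a custom function on the result of groupby() on a pandas DataFrame, the custom function inside apply() gets called twice for the first group. This is by design in pandas: the first call is used to decide how to process the remaining groups, so the custom function should not rely on side effects. (pandas 0.25 and later evaluate the first group only once.) The issue in pandas: https://github.com/pandas-dev/pandas/issues/2656 My question related to this issue on Stack Overflow: https://stackoverflow.com/questions/53698605/all-columns-are-not-passed-when-we-use-apply-on-result-of-groupby-with-a-custom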
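One way to see how often the custom function runs on a given pandas installation is to record each call. This is only a sketch: the data, the calls list and the total_value function are hypothetical, and the number of recorded calls depends on the pandas version, so no expected count is given.

```python
import pandas as pd

# Hypothetical data with two groups, "a" and "b".
x_df = pd.DataFrame({"Column1": ["a", "a", "b"], "value": [1, 2, 3]})

calls = []  # one entry per invocation of the custom function

def total_value(group):
    # Record the size of the group that was passed in, then aggregate it.
    calls.append(len(group))
    return group["value"].sum()

# Select the data column explicitly so only it is passed to the function.
result = x_df.groupby("Column1")[["value"]].apply(total_value)

print(len(calls), "calls for", result.shape[0], "groups")
print(result)
```

If the count is higher than the number of groups, the extra call is the repeated evaluation of the first group. The returned result is the same either way, which is why the function itself should stay free of side effects.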