Saturday, December 9, 2017

Dictionary: data.table/pandas

This post contains translations from Python pandas to R data.table. All examples are based on the Titanic dataset (easy to find online), loaded as df in both languages.

Get sizes of groups


pandas

df.groupby('Pclass').size()

data.table

df[, .N, by=Pclass]
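A small sketch to check the pandas side on a toy frame. The rows below are made up for illustration and are not real Titanic records. It also shows that size() counts every row, like .N, while count() skips missing values.

```python
import pandas as pd

# Toy stand-in for the Titanic data; the values are invented for illustration.
df = pd.DataFrame({
    "Pclass": [1, 1, 2, 3, 3, 3],
    "Sex": ["male", "female", "female", "male", "male", "female"],
    "Age": [40.0, 38.0, 27.0, 22.0, None, 19.0],
})

# Number of rows per class, missing ages included (same idea as .N).
sizes = df.groupby("Pclass").size()

# Number of non-missing ages per class (rows with NaN are skipped).
known_ages = df.groupby("Pclass")["Age"].count()

print(sizes)
print(known_ages)
```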

Filter groups by mean


pandas

df.groupby('Pclass').filter(lambda df: df['Age'].mean() > 30)

data.table

df[, .SD[mean(Age, na.rm=TRUE) > 30], by=Pclass]
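To make the behaviour concrete, here is a sketch on a toy frame (made-up rows, not real Titanic records). filter keeps or drops whole groups: every row of a class survives if that class's mean age is above 30. pandas skips missing values in mean() by default, which is why the data.table version needs na.rm=TRUE.

```python
import pandas as pd

# Toy stand-in for the Titanic data; the values are invented for illustration.
df = pd.DataFrame({
    "Pclass": [1, 1, 2, 3, 3, 3],
    "Sex": ["male", "female", "female", "male", "male", "female"],
    "Age": [40.0, 38.0, 27.0, 22.0, None, 19.0],
})

# Keep all rows of the classes whose mean age (NaN ignored) exceeds 30.
kept = df.groupby("Pclass").filter(lambda g: g["Age"].mean() > 30)

print(kept)
```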

Filter groups by function


pandas

df.groupby('Pclass').filter(func)

data.table

df[, .SD[func(.SD)], by=Pclass]
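Here func is a placeholder for any function that takes one group's rows and returns a single True or False. As a sketch, the hypothetical has_enough_known_ages below (a name made up for this example) keeps a class only if at least two ages are known. The toy rows are invented as well.

```python
import pandas as pd

# Toy stand-in for the Titanic data; the values are invented for illustration.
df = pd.DataFrame({
    "Pclass": [1, 1, 2, 3, 3, 3],
    "Sex": ["male", "female", "female", "male", "male", "female"],
    "Age": [40.0, 38.0, 27.0, 22.0, None, 19.0],
})


def has_enough_known_ages(group: pd.DataFrame, min_known: int = 2) -> bool:
    """Hypothetical predicate: True if the group has at least min_known non-missing ages."""
    return bool(group["Age"].notna().sum() >= min_known)


# The predicate is called once per class and must return one boolean.
kept = df.groupby("Pclass").filter(has_enough_known_ages)

print(kept)
```

The same rule applies on the data.table side: func(.SD) should return a single logical value per group.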

Apply functions to column in each group


pandas

df.groupby('Sex').agg({'Age': ['mean', 'std']})

data.table

df[, list(mean=mean(Age, na.rm=TRUE),
          std=sd(Age, na.rm=TRUE)), by=Sex]
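One difference worth knowing: the pandas call above returns two-level column names, ('Age', 'mean') and ('Age', 'std'), while data.table returns flat columns. A sketch of a flat-column variant using named aggregation, assuming pandas 0.25 or newer (the toy rows are made up for illustration):

```python
import pandas as pd

# Toy stand-in for the Titanic data; the values are invented for illustration.
df = pd.DataFrame({
    "Pclass": [1, 1, 2, 3, 3, 3],
    "Sex": ["male", "female", "female", "male", "male", "female"],
    "Age": [40.0, 38.0, 27.0, 22.0, None, 19.0],
})

# Dict form: columns come back as a MultiIndex, ('Age', 'mean') and ('Age', 'std').
nested = df.groupby("Sex").agg({"Age": ["mean", "std"]})

# Named aggregation (pandas >= 0.25): flat columns called mean and std.
flat = df.groupby("Sex").agg(mean=("Age", "mean"), std=("Age", "std"))

print(nested)
print(flat)
```

Both pandas std and R sd compute the sample standard deviation (n - 1 in the denominator), so the numbers should agree.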
