3/17/2016

Django Admin and millions of objects

The Django admin does not cope well with large amounts of data, and things get even worse if you use PostgreSQL as your database backend. The admin panel issues a lot of select count(*) from table queries, and PostgreSQL is quite slow at executing this statement. There is even a page on the PostgreSQL wiki dedicated to this issue: https://wiki.postgresql.org/wiki/Slow_Counting. So when you deal with a lot of data in Django, you need to tweak the admin a bit. Here is how.
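One common tweak, sketched here as an assumption rather than as the exact method of this post, is to replace the exact count with the planner's row estimate that PostgreSQL keeps in pg_class.reltuples. The helper name estimate_row_count is hypothetical; it only needs a DB-API style cursor, so it does not depend on Django itself.

```python
def estimate_row_count(cursor, table_name):
    """Return PostgreSQL's planner estimate of a table's row count, or None.

    Assumptions: `cursor` is a DB-API cursor connected to PostgreSQL, and
    `table_name` is unique across schemas (pg_class is not filtered by
    schema here). The estimate is only as fresh as the last VACUUM/ANALYZE.
    """
    cursor.execute(
        "SELECT reltuples::bigint FROM pg_class WHERE relname = %s",
        [table_name],
    )
    row = cursor.fetchone()
    # No such table, or a table that has never been analyzed (negative
    # estimate): let the caller fall back to an exact count.
    if row is None or row[0] is None or row[0] < 0:
        return None
    return int(row[0])
```

In a Django project this could be called with django.db.connection.cursor() from a Paginator subclass that overrides count, with that subclass assigned to ModelAdmin.paginator. Setting ModelAdmin.show_full_result_count = False (available since Django 1.8) additionally skips the full count on filtered change lists.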

12/11/2015

Multicolumn aggregation in pandas using function over multiple columns (weighted average example)

Here is how to perform multicolumn aggregation over a dataframe with user-defined functions that depend on data in other dataframe columns. The code below is an example of a weighted average implementation.
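A minimal sketch of such an aggregation, assuming a dataframe with hypothetical columns group, value and weight (the sample data is made up for illustration). Because the function needs two columns at once, it is applied to each group's sub-dataframe rather than to a single column.

```python
import pandas as pd

# Made-up sample data; the column names are assumptions for this sketch.
df = pd.DataFrame({
    "group": ["a", "a", "b", "b"],
    "value": [10.0, 20.0, 30.0, 50.0],
    "weight": [1.0, 3.0, 1.0, 1.0],
})


def weighted_average(frame, value_col, weight_col):
    """Weighted mean of value_col using weight_col as the weights."""
    values = frame[value_col]
    weights = frame[weight_col]
    return (values * weights).sum() / weights.sum()


# Select only the columns the function needs, then apply it per group.
result = (
    df.groupby("group")[["value", "weight"]]
    .apply(lambda frame: weighted_average(frame, "value", "weight"))
)
print(result)
```

The result is a Series indexed by group, with one weighted average per group.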

12/09/2015

Convert datetime to unixtime and back

Very often I need to convert a datetime into a Unix timestamp in order to pass datetime values to JavaScript in a JSON response. One way to do this using the standard library is:
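A minimal sketch of the round trip, assuming that naive datetimes represent UTC (the function names are hypothetical). It uses calendar.timegm, which interprets a time tuple as UTC, so the result does not depend on the local time zone of the server.

```python
import calendar
from datetime import datetime, timezone


def datetime_to_unixtime(dt):
    """Convert a datetime to whole seconds since the Unix epoch.

    Assumption: a naive datetime is treated as UTC; an aware datetime
    is converted to UTC first (both handled by utctimetuple()).
    """
    return calendar.timegm(dt.utctimetuple())


def unixtime_to_datetime(timestamp):
    """Convert seconds since the Unix epoch to an aware UTC datetime."""
    return datetime.fromtimestamp(timestamp, tz=timezone.utc)


ts = datetime_to_unixtime(datetime(2015, 12, 9, 0, 0, 0))
print(ts)
print(unixtime_to_datetime(ts).isoformat())
```

JavaScript's Date works in milliseconds, so the value usually needs to be multiplied by 1000 before it is passed to new Date(...).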

12/07/2015

Running Apache Spark step from Python on AWS EMR

Here is how to send an Apache Spark step from a Python script with Boto3 to an Amazon Elastic MapReduce cluster. I recently needed to do this, and I spent some time figuring it out.
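A rough sketch of what this can look like, not necessarily the exact code from this post. It assumes an EMR release that provides command-runner.jar (4.x or later) and a PySpark script already uploaded to S3; the helper names, the default region, the bucket path and the cluster id in the usage note are all hypothetical. The step definition is built separately from the API call so it can be inspected without AWS credentials.

```python
def build_spark_step(name, script_uri, script_args=()):
    """Build an EMR step definition that runs spark-submit on the cluster.

    Assumption: the cluster runs an EMR release with command-runner.jar.
    """
    return {
        "Name": name,
        "ActionOnFailure": "CONTINUE",
        "HadoopJarStep": {
            "Jar": "command-runner.jar",
            "Args": ["spark-submit", "--deploy-mode", "cluster", script_uri]
            + list(script_args),
        },
    }


def submit_spark_step(cluster_id, step, region_name="us-east-1"):
    """Send the step to a running cluster and return the new step id."""
    import boto3  # imported lazily so building a step does not require boto3

    client = boto3.client("emr", region_name=region_name)
    response = client.add_job_flow_steps(JobFlowId=cluster_id, Steps=[step])
    return response["StepIds"][0]


step = build_spark_step("example step", "s3://example-bucket/job.py", ["--date", "2015-12-07"])
print(step["HadoopJarStep"]["Args"])
```

With credentials configured, calling submit_spark_step("j-XXXXXXXXXXXXX", step) with a real cluster id in place of the placeholder would queue the step on that cluster.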