3/17/2016

Django Admin and millions of objects

Django admin is not really good when it comes to handling large amounts of data. Things got even worse if you use PostgreSQL as your database backend. Admin panel issuing a lot of select count(*) from table and postgres is quite slow performing this statement. Actually there is a page in postgres wiki dedicated to this issue - https://wiki.postgresql.org/wiki/Slow_Counting So when you deal with a lot of data in django you need to tweak admin a bit. Here is how.
First of all you will want to disable result count in entry admin model with the how_full_result_count = False option. But this disables result count only for filtered results. There is no way disabling this for plain admin section. To overcome slow counting problem with postgres we can use another postges feature. It is possible to obtain approximate object count from postgres statistics tables. So we are going to define our own version of QuerySet that uses this feature to count objects in result set (actually I'm not an author of the following snippet, found it somewhere some time ago but is works good for me):

class ApproxCountPgQuerySet(QuerySet):
  """approximate unconstrained count(*) with reltuples from pg_class"""

  def count(self):
      if self._result_cache is not None and not self._iter:
          return len(self._result_cache)

      if hasattr(connections[self.db].client.connection, 'pg_version'):
          query = self.query
          if (not query.where and query.high_mark is None
              and query.low_mark == 0 and not query.select
              and not query.group_by and not query.having and
              not query.distinct):
              # If query has no constraints, we would be simply doing
              # "SELECT COUNT(*) FROM foo". Monkey patch so the we get an
              # approximation instead.
              parts = [
                  p.strip('"') for p in self.model._meta.db_table.split('.')]
              cursor = connections[self.db].cursor()
              if len(parts) == 1:
                  cursor.execute(
                      "select reltuples::bigint FROM pg_class WHERE "
                      "relname = %s", parts)
              else:
                  cursor.execute(
                      "select reltuples::bigint FROM pg_class c JOIN "
                      "pg_namespace n on (c.relnamespace = n.oid) WHERE "
                      "n.nspname = %s AND c.relname = %s", parts)
          return cursor.fetchall()[0][0]
      return self.query.get_count(using=self.db)
and then we are going to use it in model admin as follows:

@admin.register(Entry)
class EntryAdmin(admin.ModelAdmin):

    def get_queryset(self, request):
        qs = super(EntryAdmin, self).get_queryset(request)
        return qs._clone(klass=ApproxCountPgQuerySet)
...
Now you can work with millions of records in django admin. Enjoy.

1 comment:

  1. Hello.. thanks for response, but.. it Doesn't for me

    ReplyDelete