SqlAlchemy: count of distinct over multiple columns SqlAlchemy: count of distinct over multiple columns postgresql postgresql

SqlAlchemy: count of distinct over multiple columns


distinct() accepts more than one argument when appended to the query object:

session.query(Hit).distinct(Hit.ip_address, Hit.user_agent).count()

It should generate something like:

SELECT count(*) AS count_1FROM (SELECT DISTINCT ON (hit.ip_address, hit.user_agent)hit.ip_address AS hit_ip_address, hit.user_agent AS hit_user_agentFROM hit) AS anon_1

which is even a bit closer to what you wanted.


The exact query can be produced using the tuple_() construct:

session.query(    func.count(distinct(tuple_(Hit.ip_address, Hit.user_agent)))).scalar()


Looks like sqlalchemy distinct() accepts only one column or expression.

Another way around is to use group_by and count. This should be more efficient than using concat of two columns - with group by database would be able to use indexes if they do exist:

session.query(Hit.ip_address, Hit.user_agent).\    group_by(Hit.ip_address, Hit.user_agent).count()

Generated query would still look different from what you asked about:

SELECT count(*) AS count_1 FROM (SELECT hittable.user_agent AS hittableuser_agent, hittable.ip_address AS sometable_column2 FROM hittable GROUP BY hittable.user_agent, hittable.ip_address) AS anon_1