Postgres String Aggregations
Contents
Aggregations in SQL are used to get useful insights from large set of data by doing some operations on it. Different scenarios of aggregations are,
- Aggregate the entire result set into 1 single value. eg: max value of a column
- Aggregate the entire result set into 1 single row. eg: max and min values of a column. Note that ALL the columns in this row have to be aggregates
- Aggregate the entire result set into a subset of rows, with
- one row for each unique value for column specified by group by clause. eg: max value of bill amount for each date
- one row for each unique combination of values for columns specified in group by clause eg: max value of bill amount for each date for each store
When we talk about aggregations, usually mathematical or statistical operations like sum, avg,count etc comes to mind. Postgres also provides pretty good text aggregation functions also. Let’s explore one use case
Scenario
Suppose you are a smoke alarm sensor rental company. You have installed your equipments in different locations
Assume for a moment that your database is not yet perfectly normalised or just that you are not storing Foreign keys in your time series
table after reading some guides which says that FK’s
affect insert performance. So your schema for alarms table is like this,
| device_id | location | alarm_time | other fields.. |
|---|
Then assume that you have a requirement to get the list of devices which generated the highest number of alarms in past 30 days. your query would be,
|
|
Now you know the devices which generated the highest number of alarms. Good. But the device id in itself doesn’t tell you much. You give this to your safety officer, and he asks where these devices are located. After all, the location info is readily available in the table itself. But there is a catch.
Once you do a group by, all the selected columns in that query has to be aggregated
You can’t just specify a single column, you have to aggregate it. So, how would you do that ? Well, you could use string_agg aggregation function. To eliminate duplicates, we can use the distinct keyword inside it. The final query is
|
|
Coming from the land of Django ORM, I have only started my pure SQL journey only about a year back. So, please let me know if there are better ways to do things that I mention in these posts.
Good Resources
Author Jeffin Mathew
LastMod 2021-09-22