userid | last_login |
---|---|
651651 | 2013-01-05 21:39:45 |
894156 | 2013-01-04 11:31:00 |
132546 | 2013-01-04 15:01:49 |
654713 | 2013-01-04 01:05:16 |
654231 | 2013-01-02 16:40:57 |
687413 | 2012-12-31 23:40:57 |
We would like to know on each day, how many users logged in last, so wee need an SQL to list all days and select the count of users last logged in on that day.
The first task is to find an SQL with which consecutive days can be listed.
The query we are looking for is the following:
select i::date from generate_series('2012-12-30', '2013-01-05', '1 day'::interval) i;This produces the output:
day |
---|
2012-12-30 |
2012-12-31 |
2013-01-01 |
2013-01-02 |
2013-01-03 |
2013-01-04 |
2013-01-05 |
Great, now we need a criteria for the last_login for each day. The previous query generates timestamps with 00:00:00.000, so we need an interval with the current day as the beginning, and the next day as the end of the interval we are looking for.
Withe the help of the query
select (i::date + '1 day'::interval) from generate_series('2012-12-30', '2013-01-05', '1 day'::interval) i;we will be able to accomplish this task. Its output is quite similar with the previous one, but in this case the result set is shifted into the future with one day (and the precision changed, but this is not important for our point of view, because we could achieved the same precision with our first query selecting i:timestamp instead of i:date).
day |
---|
2012-12-31 00:00:00.000 |
2013-01-01 00:00:00.000 |
2013-01-02 00:00:00.000 |
2013-01-03 00:00:00.000 |
2013-01-04 00:00:00.000 |
2013-01-05 00:00:00.000 |
2013-01-06 00:00:00.000 |
Superb, we now only need to combine the queries to build our desired selection:
select to_char(i::date, 'YYYY.MM.DD') as last_login_day, count(u.userid) as users from generate_series('2012-12-30', '2013-01-05', '1 day'::interval) i, users u where (i::date + '1 day'::interval) > u.last_login and i::date < u.last_login group by last_login_day order by last_login_day:
The result shows the correct values, but unfortunately, the days which none of the users chose to be their last login date are not show up in the result set:
last_login_day | users |
---|---|
2012-12-31 | 1 |
2013-01-02 | 1 |
2013-01-04 | 3 |
2013-01-05 | 1 |
This is actually not a big problem, and can be solved by using a union. We create an other selection for displaying all days with 0 user counts, and take the union of the two results by grouping by the days and summing up the users (of course the additional order by is needed as well):
select tmp.last_login_day, sum(tmp.users) from ( ( select to_char(i::date, 'YYYY.MM.DD') as last_login_day, count(u.userid) as users from generate_series('2012-12-30', '2013-01-05', '1 day'::interval) i, users u where (i::date + '1 day'::interval) > u.last_login and i::date < u.last_login group by last_login_day order by last_login_day ) union ( select to_char(i::date, 'YYYY.MM.DD') as last_login_day, 0 as users from generate_series('2012-12-30', '2013-01-05', '1 day'::interval) i ) ) as tmp group by 1 order by 1
The results are what we wanted to achieve in the first place:
last_login_day | users |
---|---|
2012-12-30 | 0 |
2012-12-31 | 1 |
2013-01-01 | 0 |
2013-01-02 | 1 |
2013-01-03 | 0 |
2013-01-04 | 3 |
2013-01-05 | 1 |
No comments:
Post a Comment