{"id":195,"date":"2018-11-15T10:13:51","date_gmt":"2018-11-15T04:43:51","guid":{"rendered":"https:\/\/www.virtueinfo.com\/blog\/?p=195"},"modified":"2018-11-26T12:37:01","modified_gmt":"2018-11-26T07:07:01","slug":"get-last-record-in-each-mysql-group-the-efficient-way-and-all-possible-alternatives","status":"publish","type":"post","link":"https:\/\/www.virtueinfo.com\/blog\/get-last-record-in-each-mysql-group-the-efficient-way-and-all-possible-alternatives\/","title":{"rendered":"Get Last Record in Each MySQL Group, the efficient way and all possible alternatives"},"content":{"rendered":"<p>This article discusses all possible alternatives for getting the first or last records from each mysql group.<br \/>\nConsider this sample table ( user_log ) for example:<\/p>\n<table class=\"table-database\" cellspacing=\"0\" cellpadding=\"0\" border=\"0\" width=\"100%\">\n<thead>\n<tr>\n<td colspan=\"4\">Table ( user_log )<\/td>\n<\/tr>\n<tr>\n<th>id<\/th>\n<th>user_id<\/th>\n<th>first_name<\/th>\n<th>last_name<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td>1<\/td>\n<td>1<\/td>\n<td>fname1.1<\/td>\n<td>lname 1.1<\/td>\n<\/tr>\n<tr>\n<td>2<\/td>\n<td>1<\/td>\n<td>fname1.2<\/td>\n<td>lname 1.2<\/td>\n<\/tr>\n<tr>\n<td>3<\/td>\n<td>1<\/td>\n<td>fname1.3<\/td>\n<td>lname 1.3<\/td>\n<\/tr>\n<tr>\n<td>4<\/td>\n<td>1<\/td>\n<td>fname1.4<\/td>\n<td>lname 1.4<\/td>\n<\/tr>\n<tr>\n<td>5<\/td>\n<td>2<\/td>\n<td>fname2.1<\/td>\n<td>lname 2.1<\/td>\n<\/tr>\n<tr>\n<td>6<\/td>\n<td>2<\/td>\n<td>fname2.2<\/td>\n<td>lname 2.2<\/td>\n<\/tr>\n<tr>\n<td>7<\/td>\n<td>2<\/td>\n<td>fname2.3<\/td>\n<td>lname 2.3<\/td>\n<\/tr>\n<tr>\n<td>8<\/td>\n<td>2<\/td>\n<td>fname2.4<\/td>\n<td>lname 2.4<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<pre class=\"lang:mysql decode:true \" title=\"SQL\" >CREATE TABLE IF NOT EXISTS `user_log` (\r\n`id` int(11) NOT NULL AUTO_INCREMENT,\r\n\r\n`user_id` int(11) NOT NULL,\r\n\r\n`first_name` varchar(255) NOT NULL,\r\n\r\n`last_name` varchar(255) NOT NULL, PRIMARY KEY (`id`)\r\n\r\n) ENGINE=InnoDB DEFAULT CHARSET=latin1 AUTO_INCREMENT=1;<\/pre>\n<pre class=\"lang:mysql decode:true \" title=\"Insert\" >INSERT INTO `user_log` \r\n(`id`, `user_id`, `first_name`, `last_name`) VALUES\r\n\r\n(1, 1, 'fname1.1', 'lname1.1'), (2, 1, 'fname1.2', 'lname1.2'),\r\n\r\n(3, 1, 'fname1.3', 'lname1.3'), (4, 1, 'fname1.4', 'lname1.4'),\r\n\r\n(5, 2, 'fname2.1', 'lname2.1'), (6, 2, 'fname2.2', 'lname2.2'),\r\n\r\n(7, 2, 'fname2.3', 'lname2.3'), (8, 2, 'fname2.4', 'lname2.4');<\/pre>\n<p>In this table, there are 8 records of 2 users [user_id = 1, 2].<br \/>\nWe want a query to get either first or last record in each user_id group. i.e. ( first or last records with user_id = 1, 2 ).<\/p>\n<p>Here are possible solutions, their &#8220;Explain&#8221; statements and performance matrix (every query executed 3 times to check timing) to check which query would be best according to your requirements.<\/p>\n<h5>Query 1<\/h5>\n<p>Query 1 shows the conceptually simplest approach which is querying the table with id where id is in sub query which returns the maximum id group by user_id. Explain statement shows the query is executed in two parts as PRIMARY and DEPENDENT SUBQUERY iterating whole table 2 times.<br \/>\nAs we can see in the performance matrix below the Query 1 is time consuming and hence not recommend.<\/p>\n<pre class=\"lang:mysql decode:true \" title=\"Query\" >SELECT * \r\nFROM user_log \r\nwhere id IN \r\n( \r\nSELECT max(id) \r\nFROM `user_log` \r\ngroup by user_id \r\n);<\/pre>\n<table class=\"table-database\" cellspacing=\"0\" cellpadding=\"0\" border=\"0\" width=\"100%\">\n<thead>\n<tr>\n<td colspan=\"4\">SQL Result<\/td>\n<\/tr>\n<tr>\n<th>id<\/th>\n<th>user_id<\/th>\n<th>first_name<\/th>\n<th>last_name<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td>4<\/td>\n<td>1<\/td>\n<td>fname1.4<\/td>\n<td>lname 1.4<\/td>\n<\/tr>\n<tr>\n<td>8<\/td>\n<td>2<\/td>\n<td>fname2.4<\/td>\n<td>lname 2.4<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<div class=\"table-responsive\">\n<table class=\"table-database\" cellspacing=\"0\" cellpadding=\"0\" border=\"0\" width=\"100%\">\n<thead>\n<tr>\n<td colspan=\"10\">Explain Result<\/td>\n<\/tr>\n<tr>\n<th>id<\/th>\n<th>select_type<\/th>\n<th>table<\/th>\n<th>type<\/th>\n<th>possible_keys<\/th>\n<th>key<\/th>\n<th>key_len<\/th>\n<th>ref<\/th>\n<th>rows<\/th>\n<th>extra<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td>1<\/td>\n<td>PRIMARY<\/td>\n<td>user_log<\/td>\n<td>ALL<\/td>\n<td><i>NULL<\/i><\/td>\n<td><i>NULL<\/i><\/td>\n<td><i>NULL<\/i><\/td>\n<td><i>NULL<\/i><\/td>\n<td>8<\/td>\n<td>Using where<\/td>\n<\/tr>\n<tr>\n<td>2<\/td>\n<td>DEPENDENT SUBQUERY<\/td>\n<td>user_log<\/td>\n<td>ALL<\/td>\n<td><i>NULL<\/i><\/td>\n<td><i>NULL<\/i><\/td>\n<td><i>NULL<\/i><\/td>\n<td><i>NULL<\/i><\/td>\n<td>8<\/td>\n<td>Using temporary; Using filesort<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<\/div>\n<table class=\"table-matrix\" cellspacing=\"0\" cellpadding=\"0\" border=\"0\" width=\"100%\">\n<thead>\n<tr>\n<td colspan=\"4\">Performance Matrix*<\/td>\n<\/tr>\n<tr>\n<th>Rows<\/th>\n<th>2GB<\/th>\n<th>4GB<\/th>\n<th>8GB<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<th>10000<\/th>\n<td class=\"red-bg\">99.24<\/td>\n<td class=\"red-bg\">152.7567<\/td>\n<td class=\"red-bg\">46.21667<\/td>\n<\/tr>\n<tr>\n<th>25000<\/th>\n<td class=\"red-bg\">617.7533<\/td>\n<td class=\"red-bg\">929.76<\/td>\n<td class=\"red-bg\">295.9333<\/td>\n<\/tr>\n<tr>\n<th>50000<\/th>\n<td class=\"red-bg\">2646.973<\/td>\n<td class=\"red-bg\">3742.683<\/td>\n<td class=\"red-bg\">1212.503<\/td>\n<\/tr>\n<tr>\n<th>100000<\/th>\n<td class=\"red-bg\">11447.51<\/td>\n<td class=\"red-bg\">15600.39<\/td>\n<td class=\"red-bg\">4719.04<\/td>\n<\/tr>\n<tr>\n<th>200000<\/th>\n<td class=\"red-bg\">32400<\/td>\n<td class=\"red-bg\">28800<\/td>\n<td class=\"red-bg\">19544.14<\/td>\n<\/tr>\n<tr>\n<td colspan=\"4\">*Time is in seconds<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<table class=\"table-matrix\" cellspacing=\"0\" cellpadding=\"0\" border=\"0\" width=\"100%\">\n<thead>\n<tr>\n<td>Performance Matrix Chart<\/td>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td class=\"p0\"><img loading=\"lazy\" decoding=\"async\" width=\"748\" height=\"457\" src=\"https:\/\/www.virtueinfo.com\/blog\/wp-content\/uploads\/2018\/11\/chart-query-1.png\" alt=\"\" class=\"alignnone size-full wp-image-197\" srcset=\"https:\/\/www.virtueinfo.com\/blog\/wp-content\/uploads\/2018\/11\/chart-query-1.png 748w, https:\/\/www.virtueinfo.com\/blog\/wp-content\/uploads\/2018\/11\/chart-query-1-300x183.png 300w\" sizes=\"auto, (max-width: 706px) 89vw, (max-width: 767px) 82vw, 740px\" \/><\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<table class=\"table-matrix\" cellspacing=\"0\" cellpadding=\"0\" border=\"0\" width=\"100%\">\n<thead>\n<tr>\n<td width=\"50%\">Pros<\/td>\n<td width=\"50%\">Cons<\/td>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td>\n<ul class=\"list-bullets\">\n<li>Simple to understand<\/li>\n<li>Runs fine with small data set<\/li>\n<\/ul>\n<\/td>\n<td>\n<ul class=\"list-bullets\">\n<li>Not ideal for big data sets<\/li>\n<li>It refers to the whole table twice<\/li>\n<li>Slow for handling table with many rows<\/li>\n<\/ul>\n<\/td>\n<\/tr>\n<tr>\n<td colspan=\"2\" class=\"red-bg\">Conclusion : NOT RECOMMENDED<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<div class=\"block-separator\"><\/div>\n<h5>Query 2<\/h5>\n<p>Query 2 shows the approach where we query the table with the help of derived table t_max. Explain statement shows the query is executed in two parts as PRIMARY and DERIVED iterating whole table 2 times.<br \/>\nAs we can see in the performance matrix below the Query 2 is better than Query 1 but still time consuming and hence not recommended.<\/p>\n<pre class=\"lang:mysql decode:true \" title=\"Query\" >SELECT * \r\nFROM ( \r\nSELECT max(id) as high, user_id, \r\nfirst_name, last_name \r\nFROM user_log \r\nGROUP BY user_id DESC, id DESC \r\n) as t_max \r\nGROUP BY t_max.user_id;<\/pre>\n<table class=\"table-database\" cellspacing=\"0\" cellpadding=\"0\" border=\"0\" width=\"100%\">\n<thead>\n<tr>\n<td colspan=\"4\">SQL Result<\/td>\n<\/tr>\n<tr>\n<th>id<\/th>\n<th>user_id<\/th>\n<th>first_name<\/th>\n<th>last_name<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td>4<\/td>\n<td>1<\/td>\n<td>fname1.4<\/td>\n<td>lname 1.4<\/td>\n<\/tr>\n<tr>\n<td>8<\/td>\n<td>2<\/td>\n<td>fname2.4<\/td>\n<td>lname 2.4<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<div class=\"table-responsive\">\n<table class=\"table-database\" cellspacing=\"0\" cellpadding=\"0\" border=\"0\" width=\"100%\">\n<thead>\n<tr>\n<td colspan=\"10\">Explain Result<\/td>\n<\/tr>\n<tr>\n<th>id<\/th>\n<th>select_type<\/th>\n<th>table<\/th>\n<th>type<\/th>\n<th>possible_keys<\/th>\n<th>key<\/th>\n<th>key_len<\/th>\n<th>ref<\/th>\n<th>rows<\/th>\n<th>extra<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td>1<\/td>\n<td>PRIMARY<\/td>\n<td><derived2><\/td>\n<td>ALL<\/td>\n<td><i>NULL<\/i><\/td>\n<td><i>NULL<\/i><\/td>\n<td><i>NULL<\/i><\/td>\n<td><i>NULL<\/i><\/td>\n<td>8<\/td>\n<td>Using temporary; Using filesort<\/td>\n<\/tr>\n<tr>\n<td>2<\/td>\n<td>DERIVED<\/td>\n<td>user_log<\/td>\n<td>ALL<\/td>\n<td><i>NULL<\/i><\/td>\n<td><i>NULL<\/i><\/td>\n<td><i>NULL<\/i><\/td>\n<td><i>NULL<\/i><\/td>\n<td>8<\/td>\n<td>Using temporary; Using filesort<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<\/div>\n<table class=\"table-matrix\" cellspacing=\"0\" cellpadding=\"0\" border=\"0\" width=\"100%\">\n<thead>\n<tr>\n<td colspan=\"4\">Performance Matrix*<\/td>\n<\/tr>\n<tr>\n<th>Rows<\/th>\n<th>2GB<\/th>\n<th>4GB<\/th>\n<th>8GB<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<th>10000<\/th>\n<td class=\"green-bg\">0.106667<\/td>\n<td class=\"green-bg\">0.06<\/td>\n<td class=\"green-bg\">0.03<\/td>\n<\/tr>\n<tr>\n<th>25000<\/th>\n<td class=\"green-bg\">0.203333<\/td>\n<td class=\"green-bg\">0.173333<\/td>\n<td class=\"green-bg\">0.08<\/td>\n<\/tr>\n<tr>\n<th>50000<\/th>\n<td class=\"red-bg\">1.516667<\/td>\n<td class=\"red-bg\">1.773333<\/td>\n<td class=\"red-bg\">1.01<\/td>\n<\/tr>\n<tr>\n<th>100000<\/th>\n<td class=\"red-bg\">4.013333<\/td>\n<td class=\"red-bg\">4.16<\/td>\n<td class=\"red-bg\">2.896667<\/td>\n<\/tr>\n<tr>\n<th>200000<\/th>\n<td class=\"red-bg\">9.356667<\/td>\n<td class=\"red-bg\">10.03333<\/td>\n<td class=\"red-bg\">6.173333<\/td>\n<\/tr>\n<tr>\n<td colspan=\"4\">*Time is in seconds<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<table class=\"table-matrix\" cellspacing=\"0\" cellpadding=\"0\" border=\"0\" width=\"100%\">\n<thead>\n<tr>\n<td>Performance Matrix Chart<\/td>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td class=\"p0\"><img loading=\"lazy\" decoding=\"async\" width=\"477\" height=\"291\" src=\"https:\/\/www.virtueinfo.com\/blog\/wp-content\/uploads\/2018\/11\/chart-query-2.png\" alt=\"\" class=\"alignnone size-full wp-image-198\" srcset=\"https:\/\/www.virtueinfo.com\/blog\/wp-content\/uploads\/2018\/11\/chart-query-2.png 477w, https:\/\/www.virtueinfo.com\/blog\/wp-content\/uploads\/2018\/11\/chart-query-2-300x183.png 300w\" sizes=\"auto, (max-width: 477px) 100vw, 477px\" \/><\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<table class=\"table-matrix\" cellspacing=\"0\" cellpadding=\"0\" border=\"0\" width=\"100%\">\n<thead>\n<tr>\n<td width=\"50%\">Pros<\/td>\n<td width=\"50%\">Cons<\/td>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td>\n<ul class=\"list-bullets\">\n<li>Simple to understand<\/li>\n<li>Runs fine with small data set<\/li>\n<\/ul>\n<\/td>\n<td>\n<ul class=\"list-bullets\">\n<li>Not ideal for big data sets<\/li>\n<li>It refers to the whole table twice<\/li>\n<li>Slow for handling table with many rows<\/li>\n<li>Joins on derived tables are more expensive since you lose control over indexing<\/li>\n<\/ul>\n<\/td>\n<\/tr>\n<tr>\n<td colspan=\"2\" class=\"red-bg\">Conclusion : NOT RECOMMENDED<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<div class=\"block-separator\"><\/div>\n<h5>Query 3<\/h5>\n<p>Query 3 shows the approach of Joins. Explain statement shows the query is executed in two parts as SIMPLE and SIMPLE iterating whole table 2 times.<br \/>\nAs we can see in the performance matrix below the Query 3 too time consuming so not recommended.<\/p>\n<pre class=\"lang:mysql decode:true \" title=\"Query\" >SELECT MAX(u1.id) as id, u1.user_id, \r\nu1.first_name, u1.last_name\r\nFROM user_log u1 \r\nLEFT JOIN user_log u2\r\nON u1.user_id = u2.user_id \r\nAND u1.id < u2.id\r\nWHERE u2.id IS NULL \r\nGROUP BY u1.user_id;<\/pre>\n<table class=\"table-database\" cellspacing=\"0\" cellpadding=\"0\" border=\"0\" width=\"100%\">\n<thead>\n<tr>\n<td colspan=\"4\">SQL Result<\/td>\n<\/tr>\n<tr>\n<th>id<\/th>\n<th>user_id<\/th>\n<th>first_name<\/th>\n<th>last_name<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td>4<\/td>\n<td>1<\/td>\n<td>fname1.4<\/td>\n<td>lname 1.4<\/td>\n<\/tr>\n<tr>\n<td>8<\/td>\n<td>2<\/td>\n<td>fname2.4<\/td>\n<td>lname 2.4<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<div class=\"table-responsive\">\n<table class=\"table-database\" cellspacing=\"0\" cellpadding=\"0\" border=\"0\" width=\"100%\">\n<thead>\n<tr>\n<td colspan=\"10\">Explain Result<\/td>\n<\/tr>\n<tr>\n<th>id<\/th>\n<th>select_type<\/th>\n<th>table<\/th>\n<th>type<\/th>\n<th>possible_keys<\/th>\n<th>key<\/th>\n<th>key_len<\/th>\n<th>ref<\/th>\n<th>rows<\/th>\n<th>extra<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td>1<\/td>\n<td>SIMPLE<\/td>\n<td>u1<\/td>\n<td>ALL<\/td>\n<td><i>NULL<\/i><\/td>\n<td><i>NULL<\/i><\/td>\n<td><i>NULL<\/i><\/td>\n<td><i>NULL<\/i><\/td>\n<td>8<\/td>\n<td>Using temporary; Using filesort<\/td>\n<\/tr>\n<tr>\n<td>1<\/td>\n<td>SIMPLE<\/td>\n<td>u2<\/td>\n<td>ALL<\/td>\n<td>PRIMARY<\/td>\n<td><i>NULL<\/i><\/td>\n<td><i>NULL<\/i><\/td>\n<td><i>NULL<\/i><\/td>\n<td>8<\/td>\n<td>Using where; Not exists<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<\/div>\n<table class=\"table-matrix\" cellspacing=\"0\" cellpadding=\"0\" border=\"0\" width=\"100%\">\n<thead>\n<tr>\n<td colspan=\"4\">Performance Matrix*<\/td>\n<\/tr>\n<tr>\n<th>Rows<\/th>\n<th>2GB<\/th>\n<th>4GB<\/th>\n<th>8GB<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<th>10000<\/th>\n<td class=\"red-bg\">30.22667<\/td>\n<td class=\"red-bg\">39.01333<\/td>\n<td class=\"red-bg\">12.23667<\/td>\n<\/tr>\n<tr>\n<th>25000<\/th>\n<td class=\"red-bg\">188.9967<\/td>\n<td class=\"red-bg\">240.8033<\/td>\n<td class=\"red-bg\">80.59<\/td>\n<\/tr>\n<tr>\n<th>50000<\/th>\n<td class=\"red-bg\">756.0533<\/td>\n<td class=\"red-bg\">965.7833<\/td>\n<td class=\"red-bg\">308.7467<\/td>\n<\/tr>\n<tr>\n<th>100000<\/th>\n<td class=\"red-bg\">2946.127<\/td>\n<td class=\"red-bg\">3916.707<\/td>\n<td class=\"red-bg\">1195.153<\/td>\n<\/tr>\n<tr>\n<th>200000<\/th>\n<td class=\"red-bg\">11581.11<\/td>\n<td class=\"red-bg\">15778.77<\/td>\n<td class=\"red-bg\">4821.28<\/td>\n<\/tr>\n<tr>\n<td colspan=\"4\">*Time is in seconds<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<table class=\"table-matrix\" cellspacing=\"0\" cellpadding=\"0\" border=\"0\" width=\"100%\">\n<thead>\n<tr>\n<td>Performance Matrix Chart<\/td>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td class=\"p0\"><img loading=\"lazy\" decoding=\"async\" width=\"748\" height=\"458\" src=\"https:\/\/www.virtueinfo.com\/blog\/wp-content\/uploads\/2018\/11\/chart-query-3.png\" alt=\"\" class=\"alignnone size-full wp-image-199\" srcset=\"https:\/\/www.virtueinfo.com\/blog\/wp-content\/uploads\/2018\/11\/chart-query-3.png 748w, https:\/\/www.virtueinfo.com\/blog\/wp-content\/uploads\/2018\/11\/chart-query-3-300x184.png 300w\" sizes=\"auto, (max-width: 706px) 89vw, (max-width: 767px) 82vw, 740px\" \/><\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<table class=\"table-matrix\" cellspacing=\"0\" cellpadding=\"0\" border=\"0\" width=\"100%\">\n<thead>\n<tr>\n<td width=\"50%\">Pros<\/td>\n<td width=\"50%\">Cons<\/td>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td>\n<ul class=\"list-bullets\">\n<li>Simple to understand<\/li>\n<li>Runs fine with small data set<\/li>\n<\/ul>\n<\/td>\n<td>\n<ul class=\"list-bullets\">\n<li>Not ideal for big data sets<\/li>\n<li>It refers to the whole table twice<\/li>\n<li>Slow for handling table with many rows<\/li>\n<li>Joins on derived tables are more expensive since you lose control over indexing<\/li>\n<\/ul>\n<\/td>\n<\/tr>\n<tr>\n<td colspan=\"2\" class=\"red-bg\">Conclusion : NOT RECOMMENDED<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<div class=\"block-separator\"><\/div>\n<h5>Query 4<\/h5>\n<p>Query 4 shows the approach of Joins with derived table. Explain statement shows the query is executed in three parts as PRIMARY, PRIMARY and DERIVED but the table is not iterated 2 times here.<br \/>\nAs we can see in the performance matrix below the Query 4 is very time efficient so it is recommended.<\/p>\n<pre class=\"lang:mysql decode:true \" title=\"Query\" >SELECT u1.*\r\nFROM user_log u1\r\nINNER JOIN (\r\nSELECT user_id, MAX(id) AS mId\r\nFROM user_log\r\nGROUP BY user_id ASC\r\n) u2\r\nON u1.user_id = u2.user_id\r\nAND u1.id = u2.mId;<\/pre>\n<table class=\"table-database\" cellspacing=\"0\" cellpadding=\"0\" border=\"0\" width=\"100%\">\n<thead>\n<tr>\n<td colspan=\"4\">SQL Result<\/td>\n<\/tr>\n<tr>\n<th>id<\/th>\n<th>user_id<\/th>\n<th>first_name<\/th>\n<th>last_name<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td>4<\/td>\n<td>1<\/td>\n<td>fname1.4<\/td>\n<td>lname 1.4<\/td>\n<\/tr>\n<tr>\n<td>8<\/td>\n<td>2<\/td>\n<td>fname2.4<\/td>\n<td>lname 2.4<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<div class=\"table-responsive\">\n<table class=\"table-database\" cellspacing=\"0\" cellpadding=\"0\" border=\"0\" width=\"100%\">\n<thead>\n<tr>\n<td colspan=\"10\">Explain Result<\/td>\n<\/tr>\n<tr>\n<th>id<\/th>\n<th>select_type<\/th>\n<th>table<\/th>\n<th>type<\/th>\n<th>possible_keys<\/th>\n<th>key<\/th>\n<th>key_len<\/th>\n<th>ref<\/th>\n<th>rows<\/th>\n<th>extra<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td>1<\/td>\n<td>PRIMARY<\/td>\n<td><derived2><\/td>\n<td>ALL<\/td>\n<td><i>NULL<\/i><\/td>\n<td><i>NULL<\/i><\/td>\n<td><i>NULL<\/i><\/td>\n<td><i>NULL<\/i><\/td>\n<td>2<\/td>\n<td>&nbsp;<\/td>\n<\/tr>\n<tr>\n<td>1<\/td>\n<td>PRIMARY<\/td>\n<td>u1<\/td>\n<td>eq_ref<\/td>\n<td>PRIMARY<\/td>\n<td>PRIMARY<\/td>\n<td>4<\/td>\n<td>u2.mId<\/td>\n<td>1<\/td>\n<td>Using where<\/td>\n<\/tr>\n<tr>\n<td>2<\/td>\n<td>DERIVED<\/td>\n<td>user_log<\/td>\n<td>ALL<\/td>\n<td><i>NULL<\/i><\/td>\n<td><i>NULL<\/i><\/td>\n<td><i>NULL<\/i><\/td>\n<td><i>NULL<\/i><\/td>\n<td>8<\/td>\n<td>Using temporary; Using filesort<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<\/div>\n<table class=\"table-matrix\" cellspacing=\"0\" cellpadding=\"0\" border=\"0\" width=\"100%\">\n<thead>\n<tr>\n<td colspan=\"4\">Performance Matrix*<\/td>\n<\/tr>\n<tr>\n<th>Rows<\/th>\n<th>2GB<\/th>\n<th>4GB<\/th>\n<th>8GB<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<th>10000<\/th>\n<td class=\"green-bg\">0.033333<\/td>\n<td class=\"green-bg\">0.026667<\/td>\n<td class=\"green-bg\">0.016667<\/td>\n<\/tr>\n<tr>\n<th>25000<\/th>\n<td class=\"green-bg\">0.106667<\/td>\n<td class=\"green-bg\">0.063333<\/td>\n<td class=\"green-bg\">0.036667<\/td>\n<\/tr>\n<tr>\n<th>50000<\/th>\n<td class=\"green-bg\">0.13<\/td>\n<td class=\"green-bg\">0.15<\/td>\n<td class=\"green-bg\">0.046667<\/td>\n<\/tr>\n<tr>\n<th>100000<\/th>\n<td class=\"green-bg\">0.313333<\/td>\n<td class=\"green-bg\">0.286667<\/td>\n<td class=\"green-bg\">0.093333<\/td>\n<\/tr>\n<tr>\n<th>200000<\/th>\n<td class=\"green-bg\">0.55<\/td>\n<td class=\"green-bg\">0.566667<\/td>\n<td class=\"green-bg\">0.2<\/td>\n<\/tr>\n<tr>\n<td colspan=\"4\">*Time is in seconds<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<table class=\"table-matrix\" cellspacing=\"0\" cellpadding=\"0\" border=\"0\" width=\"100%\">\n<thead>\n<tr>\n<td>Performance Matrix Chart<\/td>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td class=\"p0\"><img loading=\"lazy\" decoding=\"async\" width=\"477\" height=\"292\" src=\"https:\/\/www.virtueinfo.com\/blog\/wp-content\/uploads\/2018\/11\/chart-query-4.png\" alt=\"\" class=\"alignnone size-full wp-image-200\" srcset=\"https:\/\/www.virtueinfo.com\/blog\/wp-content\/uploads\/2018\/11\/chart-query-4.png 477w, https:\/\/www.virtueinfo.com\/blog\/wp-content\/uploads\/2018\/11\/chart-query-4-300x184.png 300w\" sizes=\"auto, (max-width: 477px) 100vw, 477px\" \/><\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<table class=\"table-matrix\" cellspacing=\"0\" cellpadding=\"0\" border=\"0\" width=\"100%\">\n<thead>\n<tr>\n<td width=\"50%\">Pros<\/td>\n<td width=\"50%\">Cons<\/td>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td>\n<ul class=\"list-bullets\">\n<li>Fewer itration than previous queries<\/li>\n<li>Runs fine with small \/ medium \/ big data set<\/li>\n<\/ul>\n<\/td>\n<td>\n<ul class=\"list-bullets\">\n<li>Not ideal for medium to big data sets<\/li>\n<li>Little complex to understand compaired to first two queries<\/li>\n<li>Joins on derived tables are more expensive since you lose control over indexing<\/li>\n<\/ul>\n<\/td>\n<\/tr>\n<tr>\n<td colspan=\"2\" class=\"green-bg\">Conclusion : RECOMMENDED<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<div class=\"block-separator\"><\/div>\n<h5>Query 5<\/h5>\n<p>Query 5 shows the approach of Simple query and then string processing on columns to get the result we want. Explain statement shows the query is executed in only one part as SIMPLE so the table is iterated only once.<br \/>\nAs we can see in the performance matrix below the Query 5 is also time efficient compared with Query 1, Query 2 and Query 3 so it is recommended.<\/p>\n<pre class=\"lang:mysql decode:true \" title=\"Query\" >SELECT \r\nSUBSTRING_INDEX(\r\nGROUP_CONCAT(DISTINCT id ORDER BY id DESC), ',', 1\r\n) as id, user_id, \r\nSUBSTRING_INDEX(\r\nGROUP_CONCAT(DISTINCT first_name ORDER BY id DESC), ',', 1\r\n) as first_name, \r\nSUBSTRING_INDEX(\r\nGROUP_CONCAT(DISTINCT last_name ORDER BY id DESC), ',', 1\r\n) as last_name\r\nFROM user_log \r\nGROUP BY user_id \r\nORDER BY id ASC;<\/pre>\n<table class=\"table-database\" cellspacing=\"0\" cellpadding=\"0\" border=\"0\" width=\"100%\">\n<thead>\n<tr>\n<td colspan=\"4\">SQL Result<\/td>\n<\/tr>\n<tr>\n<th>id<\/th>\n<th>user_id<\/th>\n<th>first_name<\/th>\n<th>last_name<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td>4<\/td>\n<td>1<\/td>\n<td>fname1.4<\/td>\n<td>lname 1.4<\/td>\n<\/tr>\n<tr>\n<td>8<\/td>\n<td>2<\/td>\n<td>fname2.4<\/td>\n<td>lname 2.4<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<div class=\"table-responsive\">\n<table class=\"table-database\" cellspacing=\"0\" cellpadding=\"0\" border=\"0\" width=\"100%\">\n<thead>\n<tr>\n<td colspan=\"10\">Explain Result<\/td>\n<\/tr>\n<tr>\n<th>id<\/th>\n<th>select_type<\/th>\n<th>table<\/th>\n<th>type<\/th>\n<th>possible_keys<\/th>\n<th>key<\/th>\n<th>key_len<\/th>\n<th>ref<\/th>\n<th>rows<\/th>\n<th>extra<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td>1<\/td>\n<td>SIMPLE<\/td>\n<td>user_log<\/td>\n<td>ALL<\/td>\n<td><i>NULL<\/i><\/td>\n<td><i>NULL<\/i><\/td>\n<td><i>NULL<\/i><\/td>\n<td><i>NULL<\/i><\/td>\n<td>8<\/td>\n<td>Using temporary; Using filesort<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<\/div>\n<table class=\"table-matrix\" cellspacing=\"0\" cellpadding=\"0\" border=\"0\" width=\"100%\">\n<thead>\n<tr>\n<td colspan=\"4\">Performance Matrix*<\/td>\n<\/tr>\n<tr>\n<th>Rows<\/th>\n<th>2GB<\/th>\n<th>4GB<\/th>\n<th>8GB<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<th>10000<\/th>\n<td class=\"green-bg\">0.113333<\/td>\n<td class=\"green-bg\">0.113333<\/td>\n<td class=\"green-bg\">0.043333<\/td>\n<\/tr>\n<tr>\n<th>25000<\/th>\n<td class=\"green-bg\">0.55<\/td>\n<td class=\"green-bg\">0.293333<\/td>\n<td class=\"green-bg\">0.103333<\/td>\n<\/tr>\n<tr>\n<th>50000<\/th>\n<td class=\"green-bg\">0.963333<\/td>\n<td class=\"green-bg\">0.81<\/td>\n<td class=\"green-bg\">0.196667<\/td>\n<\/tr>\n<tr>\n<th>100000<\/th>\n<td class=\"red-bg\">3.64<\/td>\n<td class=\"red-bg\">2.366667<\/td>\n<td class=\"red-bg\">1.65<\/td>\n<\/tr>\n<tr>\n<th>200000<\/th>\n<td class=\"red-bg\">6.63<\/td>\n<td class=\"red-bg\">5.053333<\/td>\n<td class=\"red-bg\">4.406667<\/td>\n<\/tr>\n<tr>\n<td colspan=\"4\">*Time is in seconds<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<table class=\"table-matrix\" cellspacing=\"0\" cellpadding=\"0\" border=\"0\" width=\"100%\">\n<thead>\n<tr>\n<td>Performance Matrix Chart<\/td>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td class=\"p0\"><img loading=\"lazy\" decoding=\"async\" width=\"477\" height=\"292\" src=\"https:\/\/www.virtueinfo.com\/blog\/wp-content\/uploads\/2018\/11\/chart-query-5.png\" alt=\"\" class=\"alignnone size-full wp-image-201\" srcset=\"https:\/\/www.virtueinfo.com\/blog\/wp-content\/uploads\/2018\/11\/chart-query-5.png 477w, https:\/\/www.virtueinfo.com\/blog\/wp-content\/uploads\/2018\/11\/chart-query-5-300x184.png 300w\" sizes=\"auto, (max-width: 477px) 100vw, 477px\" \/><\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<table class=\"table-matrix\" cellspacing=\"0\" cellpadding=\"0\" border=\"0\" width=\"100%\">\n<thead>\n<tr>\n<td width=\"50%\">Pros<\/td>\n<td width=\"50%\">Cons<\/td>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td>\n<ul class=\"list-bullets\">\n<li>Best query for large data sets, because querying table once<\/li>\n<li>Runs fine with small \/ medium \/ big data set<\/li>\n<\/ul>\n<\/td>\n<td>\n<ul class=\"list-bullets\">\n<li>Complex to understand<\/li>\n<li>Uses String functions to get column value<\/li>\n<li>Takes more time as the data grows- takes more time as the data grows<\/li>\n<\/ul>\n<\/td>\n<\/tr>\n<tr>\n<td colspan=\"2\" class=\"green-bg\">Conclusion : RECOMMENDED<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>As we can see from the data matrix Query 4 and Query 5 are best suited to handle this problem.<\/p>\n<p>The performance of this queries can very from data sets to data sets, here the data set taken is symmetrical (each user wise 4 rows), if data is asymmetrical these result will slightly differ, but it is beyond doubt that Query 4 and Query 5 will outperform every other query.<\/p>\n<h6 class=\"steps\"><span>Common mistake in using MAX:<\/span><\/h6>\n<p>A simple mistake Developers make in this type of queries is using MAX function to get the last records from each group. Let's see by example.<\/p>\n<p>consider these 2 queries<br \/>\n<code>SELECT max(id),user_id,first_name,last_name FROM `user_log` group by user_id DESC<\/code><br \/>\n<code>SELECT max(id),id,user_id,first_name,last_name FROM `user_log` group by user_id DESC<\/code><\/p>\n<p>Most developers just use query 1 thinking, MAX(id) will give them the row with id = MAX(id) which is wrong.<\/p>\n<p>You can catch the problem if you run the 2nd query with id in column. You can clearly see the MAX(id) is 8 and id is returning 5, so the result is wrong. And unlike in our example if you do not have different data in columns other than id [first_name, last_name], it is difficult to catch this problem.<\/p>\n<p>So, let me clear one misconception that MAX(id), is a function that returns the maximum of the rows in that column, Using MAX(id) does not return maximum id of rows in the query result.<\/p>\n<h6 class=\"steps\"><span>Structural changes:<\/span><\/h6>\n<p>Let's discuss what structural changes we could have done to improve the performance.<\/p>\n<p>So basically we have an option of adding a column that returns the latest record, for each group last row(or first depend on your query) added a flag <code>`last`= 1<\/code>, and make all other rows' flag <code>`last`= 0<\/code>.<br \/>\n1. By Trigger<br \/>\n2. By Manual programming changes<\/p>\n<p>Which ever works for you, but if you can afford to add an column then it would largely simplify your query if not you can always use Query 4 or Query 5.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>This article discusses all possible alternatives for getting the first or last records from each mysql group. Consider this sample table ( user_log ) for example: Table ( user_log ) id user_id first_name last_name 1 1 fname1.1 lname 1.1 2 1 fname1.2 lname 1.2 3 1 fname1.3 lname 1.3 4 1 fname1.4 lname 1.4 5 &hellip; <\/p>\n<p class=\"link-more\"><a href=\"https:\/\/www.virtueinfo.com\/blog\/get-last-record-in-each-mysql-group-the-efficient-way-and-all-possible-alternatives\/\" class=\"more-link\">Continue reading<span class=\"screen-reader-text\"> &#8220;Get Last Record in Each MySQL Group, the efficient way and all possible alternatives&#8221;<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":196,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[1],"tags":[],"class_list":["post-195","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-all"],"acf":{"short_description":"This article discusses all possible alternatives for getting the first or last records from each mysql group. We have used live examples in these article so that it is easy to understand and implement.","section_heading_description":"","author_photo":{"ID":111,"id":111,"title":"Virtueinfo","filename":"logo-icon-512x512.jpg","filesize":26093,"url":"https:\/\/www.virtueinfo.com\/blog\/wp-content\/uploads\/2018\/10\/logo-icon-512x512.jpg","link":"https:\/\/www.virtueinfo.com\/blog\/virtueinfo-included-in-leading-development-research\/logo-icon-512x512-2\/","alt":"Virtueinfo","author":"1","description":"","caption":"Virtueinfo","name":"logo-icon-512x512-2","status":"inherit","uploaded_to":107,"date":"2018-10-11 13:48:29","modified":"2018-10-11 13:48:53","menu_order":0,"mime_type":"image\/jpeg","type":"image","subtype":"jpeg","icon":"https:\/\/www.virtueinfo.com\/blog\/wp-includes\/images\/media\/default.png","width":512,"height":512,"sizes":{"thumbnail":"https:\/\/www.virtueinfo.com\/blog\/wp-content\/uploads\/2018\/10\/logo-icon-512x512-150x150.jpg","thumbnail-width":150,"thumbnail-height":150,"medium":"https:\/\/www.virtueinfo.com\/blog\/wp-content\/uploads\/2018\/10\/logo-icon-512x512-300x300.jpg","medium-width":300,"medium-height":300,"medium_large":"https:\/\/www.virtueinfo.com\/blog\/wp-content\/uploads\/2018\/10\/logo-icon-512x512.jpg","medium_large-width":512,"medium_large-height":512,"large":"https:\/\/www.virtueinfo.com\/blog\/wp-content\/uploads\/2018\/10\/logo-icon-512x512.jpg","large-width":512,"large-height":512,"1536x1536":"https:\/\/www.virtueinfo.com\/blog\/wp-content\/uploads\/2018\/10\/logo-icon-512x512.jpg","1536x1536-width":512,"1536x1536-height":512,"2048x2048":"https:\/\/www.virtueinfo.com\/blog\/wp-content\/uploads\/2018\/10\/logo-icon-512x512.jpg","2048x2048-width":512,"2048x2048-height":512,"twentyseventeen-featured-image":"https:\/\/www.virtueinfo.com\/blog\/wp-content\/uploads\/2018\/10\/logo-icon-512x512.jpg","twentyseventeen-featured-image-width":512,"twentyseventeen-featured-image-height":512,"twentyseventeen-thumbnail-avatar":"https:\/\/www.virtueinfo.com\/blog\/wp-content\/uploads\/2018\/10\/logo-icon-512x512-100x100.jpg","twentyseventeen-thumbnail-avatar-width":100,"twentyseventeen-thumbnail-avatar-height":100}}},"_links":{"self":[{"href":"https:\/\/www.virtueinfo.com\/blog\/wp-json\/wp\/v2\/posts\/195","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.virtueinfo.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.virtueinfo.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.virtueinfo.com\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.virtueinfo.com\/blog\/wp-json\/wp\/v2\/comments?post=195"}],"version-history":[{"count":2,"href":"https:\/\/www.virtueinfo.com\/blog\/wp-json\/wp\/v2\/posts\/195\/revisions"}],"predecessor-version":[{"id":204,"href":"https:\/\/www.virtueinfo.com\/blog\/wp-json\/wp\/v2\/posts\/195\/revisions\/204"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.virtueinfo.com\/blog\/wp-json\/wp\/v2\/media\/196"}],"wp:attachment":[{"href":"https:\/\/www.virtueinfo.com\/blog\/wp-json\/wp\/v2\/media?parent=195"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.virtueinfo.com\/blog\/wp-json\/wp\/v2\/categories?post=195"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.virtueinfo.com\/blog\/wp-json\/wp\/v2\/tags?post=195"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}