Pantek Library
Hosting Provided By
CybrHost
High Speed Hosting

Re: Assistance avoiding a full table scan

From: Erik Giberti <erik(at)af-design.com>
Date: Wed Sep 26 2007 - 18:41:14 EDT


Brent,

I tried this and it definitely boosted performance. On a test query that would take 2+ seconds to run with 20 id's - it ran in 0.002 seconds.

Thanks everyone for your help and comments.

Erik

On Sep 21, 2007, at 2:01 PM, Brent Baisley wrote:

> As others have mentioned, mysql doesn't handle IN queries efficiently.
> You can try changing it to using derived tables/subqueries. I did some
> quick tests and the explain shows a different analysis.
>
> select comment, gid, date_posted from tbl
> JOIN
> (select max(id) as mid
> from tbl where gid in ( 1234,2345,3456 .. 7890 ) group by gid) as
> maxids
> ON tpl.id=maxids.mid;
>
> You're really just changing one of the IN statements to a join. My
> quick tests showed that a full table is still being done, but it's now
> on the derived table (maxids), which would only be as large as how
> many gids you are searching on (50?).
> I tested this on two related tables, 170K in one and 90K in the other.
> Your tables are currently much smaller, so mysql may come up with a
> different execution path. For me, the join syntax was far faster.
>
>
> On 9/21/07, Erik Giberti <erik@af-design.com> wrote:
>> Hello everyone,
>>
>> The app server in this case is PHP, and the database is MySQL 5.0.22
>> on RedHat linux
>>
>> I've got a database with about 7.5K records in it that I expect to
>> start growing very quickly ~10-12K records per day. The storage
>> engine is InnoDB. This table is growing quickly and will continue to
>> grow for a long time. This table stores comments (as you can see from
>> the structure) and is being used to display a list of comments based
>> on a users affiliations.
>>
>> The structure is approximately this - I'm leaving out unrelated
>> columns:
>>
>> id int - primary key - auto increment
>> gid bigint - indexed
>> comment varchar
>> date_posted timestamp
>>
>> I run a query with the following form
>>
>> select comment, gid, date_posted from tbl where id in (select max(id)
>> from tbl where gid in ( 1234,2345,3456 .. 7890 ) group by gid);
>>
>> I have an index on gid and id is the primary key
>>
>> When I describe the query with about 50 gid values inserted (where
>> indicated above) I get the following:
>>
>> +----+--------------------+-------+-------+---------------+---------
>> +---------+------+------+--------------------------+
>> | id | select_type | table | type | possible_keys | key |
>> key_len | ref | rows | Extra |
>> +----+--------------------+-------+-------+---------------+---------
>> +---------+------+------+--------------------------+
>> | 1 | PRIMARY | tbl | ALL | NULL | NULL |
>> NULL | NULL | 7533 | Using where |
>> | 2 | DEPENDENT SUBQUERY | tbl | range | idx_gid | idx_gid |
>> 9 | NULL | 58 | Using where; Using index |
>> +----+--------------------+-------+-------+---------------+---------
>> +---------+------+------+--------------------------+
>>
>> Running the query on a production machine with sufficient memory and
>> horsepower (box is only 20% utilized) it still takes 3 seconds to run
>> - obviously not quick enough for web use.
>>
>> What I really need is the most recent comment from each group based
>> on a variable set of gid's that change from user to user.
>>
>> Any thoughts on how to tweak this to avoid the full table scan? Thank
>> you in advance for your assistance.
>>
>> Erik Giberti
>>
>>
>> --
>> MySQL General Mailing List
>> For list archives: http://lists.mysql.com/mysql
>> To unsubscribe: http://lists.mysql.com/mysql?
>> unsub=brenttech@gmail.com
>>
>>
>
> --
> MySQL General Mailing List
> For list archives: http://lists.mysql.com/mysql
> To unsubscribe: http://lists.mysql.com/mysql?unsub=giberti@mac.com
>

-- 
MySQL General Mailing List
For list archives: 
http://lists.mysql.com/mysql
To unsubscribe:    
http://lists.mysql.com/mysql?unsub=lists@pantek.com
Received on Wed Sep 26 18:41:20 2007

This archive was generated by hypermail 2.1.8 : Sun Oct 07 2007 - 10:12:22 EDT


Contact Us  Legal Notices  Order Services Online 
Pantek Home  Privacy Policy  IT news  Site Map  Pantek Library