MySQL 6.0 Reference Manual :: 13 Storage Engines :: 13.8 The MEMORY (HEAP) Storage Engine
The
Each
To specify explicitly that you want to create a
CREATE TABLE t (i INT) ENGINE = MEMORY;
As indicated by the name,
This example shows how you might create, use, and remove a
mysql>
As mentioned earlier, the mysql>
Both tables will revert to the server's global
You can also specify a
© 1995-2008 MySQL AB. All rights reserved.
User Comments
I think the slowdown documented above is entirely unnecessary, and it is not directly correlated to cardinality:
"...The degree of slowdown is proportional to the degree of duplication...You can use a BTREE index to avoid this problem."
Only a very simple "MTF" optimization needs to be made to the HEAP storage engine:
http://bugs.mysql.com/bug.php?id=7817
BTREEs are much slower than hashing (about 5 to 6 times at least), and are necessary only when non-equality (range) indexing is required. See the research paper quoted at above link for benchmarks.
So consider the above advice to use BTREEs to solve performance issues incorrect: BTREEs are 5 - 6 times slower than correctly optimized HASH indexing. BTREEs may be faster in some cases than an *UN*optimized HASH index.
As for the issue of slowdown correlation to cardinality, see comment "16 Jan 9:32pm" in above link.
The current HASH key implementation is unoptimized and much slower than it needs to be when most queries result in a non-match:
http://bugs.mysql.com/7936
In this case, it is possible that BTREE is faster until HASH is optimized.
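For readers comparing the two index types discussed in these comments, here is a minimal sketch of how each is declared on a MEMORY table. The table and index names (lookup_hash, lookup_btree, idx_id) are hypothetical and chosen only for illustration. HASH indexes serve equality comparisons only, so the range query is shown against the BTREE table; what EXPLAIN actually reports will depend on your data and server version.

```sql
-- Hypothetical tables for illustration; names are not from the manual.
CREATE TABLE lookup_hash (
    id INT,
    INDEX idx_id USING HASH (id)      -- HASH is the default for MEMORY tables
) ENGINE = MEMORY;

CREATE TABLE lookup_btree (
    id INT,
    INDEX idx_id USING BTREE (id)     -- BTREE must be requested explicitly
) ENGINE = MEMORY;

-- Equality lookup: a HASH index can be used for this.
EXPLAIN SELECT * FROM lookup_hash WHERE id = 42;

-- Range lookup: a HASH index cannot serve this, a BTREE index can.
EXPLAIN SELECT * FROM lookup_btree WHERE id BETWEEN 10 AND 20;
```

Timing the same workload against both tables is probably the most reliable way to decide which index type suits a given data set.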
I would like to explain something for anyone who might be confused about this. Above it's stated:
MEMORY tables use a fixed record length format.
That does not mean you can't create a VARCHAR column. It means the column will be treated as CHAR and will occupy the whole size you defined it with, however short the stored value is.
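One way to observe this is sketched below, assuming a hypothetical table named fixed_demo. SHOW TABLE STATUS should report a fixed row format for a MEMORY table, and the average row length should reflect the declared column width rather than the length of the stored values. The exact byte counts depend on the character set, so none are shown here.

```sql
-- Hypothetical table for illustration only.
CREATE TABLE fixed_demo (
    name VARCHAR(100)
) ENGINE = MEMORY;

-- One short value and one longer value; both rows take the same space.
INSERT INTO fixed_demo VALUES ('a'), ('a much longer value');

-- Inspect the Row_format and Avg_row_length columns of the result.
SHOW TABLE STATUS LIKE 'fixed_demo';
```

If memory is tight, declaring the narrowest column widths your data allows is the obvious application-level mitigation.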
Insertion into HASH indexed columns is somewhat vulnerable to degenerate cases of "bad" data sets, which can cause insertion to be painfully slow (two orders of magnitude slower than a "normal" data set). See the examples (with suggestions for application-level fixes) below:
Create a table n:
mysql> create temporary table n (n int unsigned auto_increment primary key);
mysql> insert into n select NULL from SQ_SIMILAR2; -- a 1-million-row-table
Query OK, 1115156 rows affected (4.40 sec)
Records: 1115156 Duplicates: 0 Warnings: 0
Ok, now we have numbers 1-1e6 in table n.
mysql> create temporary table sq (sq int unsigned, key (sq)) engine memory;
Ok, now we're set. Look at the timings in the two insert statements:
mysql> insert into sq select floor(n/64*1024)*n from n;
Query OK, 1115156 rows affected, 65535 warnings (2.80 sec)
Records: 1115156 Duplicates: 0 Warnings: 1098773
mysql> truncate table sq;
Query OK, 0 rows affected (0.01 sec)
mysql> insert into sq select floor(n/(64*1024-1))*n from n;
Query OK, 1115156 rows affected (2 min 59.34 sec)
Records: 1115156 Duplicates: 0 Warnings: 0
In other words, a slow-down factor of 64! Obviously something weird is
going on that throws the adaptive cache algorithm to the ground!
Part of the problem can be solved by e.g. random reordering before
inserts (after truncating the table, of course):
mysql> insert into sq select floor(n/(64*1024-1))*n from n order by rand();
Query OK, 1115156 rows affected (52.64 sec)
Records: 1115156 Duplicates: 0 Warnings: 0
Now we're down to "only" a factor of about 20. But we can do even better:
mysql> insert into sq select floor(n/(64*1024-1))*n from n order by n desc;
Query OK, 1115156 rows affected (2.60 sec)
Records: 1115156 Duplicates: 0 Warnings: 0
Whee! Great.
Our actual data were a little different. The table SQ_SIMILAR2 contains
1.1 million non-unique numbers - about 180,000 distinct values between 1
and 1.1 million - in a, well, special [by accident] order. Here are some
timings (table sq is truncated before each insert):
mysql> insert into sq select SQ_SIMILAR2 from SQ_SIMILAR2;
Query OK, 1115156 rows affected (4 min 39.07 sec)
Records: 1115156 Duplicates: 0 Warnings: 0
I.e. a little worse than the test case above. Random ordering seems a tiny
bit worse. And ordering in ascending order is really, really bad:
mysql> insert into sq select SQ_SIMILAR2 from SQ_SIMILAR2 order by SQ_SIMILAR2;
Query OK, 1115156 rows affected (8 min 31.24 sec)
Records: 1115156 Duplicates: 0 Warnings: 0
Yikes, a slow-down factor of 182 compared to the floor(n/64*1024)*n
example above. Sorting in descending order gets back within the realm of
the reasonable again:
mysql> insert into sq select SQ_SIMILAR2 from SQ_SIMILAR2 order by SQ_SIMILAR2 $
Query OK, 1115156 rows affected (4.54 sec)
Records: 1115156 Duplicates: 0 Warnings: 0
But with non-unique data, can you do better? Try this:
mysql> insert into sq select distinct SQ_SIMILAR2 from SQ_SIMILAR2;
Query OK, 181272 rows affected (0.61 sec)
Records: 181272 Duplicates: 0 Warnings: 0
mysql> insert into sq select SQ_SIMILAR2 from SQ_SIMILAR2;
Query OK, 1115156 rows affected (1.50 sec)
Records: 1115156 Duplicates: 0 Warnings: 0
Altogether only 2.11 sec, half the time of the descending sort order,
although further table manipulations are necessary to delete the spurious
duplicates that have been created.
When joining a column in a MEMORY table against one in an InnoDB table, the kind of indexes on the columns is important.
In my case, when the index on the MEMORY table column was of type HASH and the index on the corresponding InnoDB column was of type BTREE, the query optimizer was not able to make use of the indexes and queries took a long time. The fix in this instance was to convert the default HASH index on the MEMORY table column to BTREE.
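A minimal sketch of that conversion follows, assuming a hypothetical MEMORY table mem_lookup with an index idx_ref on a column ref_id. None of these names come from the original comment. The idea is to drop the HASH index and recreate it with USING BTREE in a single ALTER TABLE.

```sql
-- Hypothetical MEMORY table; its index defaults to HASH.
CREATE TABLE mem_lookup (
    ref_id INT,
    INDEX idx_ref (ref_id)
) ENGINE = MEMORY;

-- The Index_type column shows whether each index is HASH or BTREE.
SHOW INDEX FROM mem_lookup;

-- Replace the HASH index with a BTREE index on the same column.
ALTER TABLE mem_lookup
    DROP INDEX idx_ref,
    ADD INDEX idx_ref USING BTREE (ref_id);

-- Confirm the change.
SHOW INDEX FROM mem_lookup;
```

Running EXPLAIN on the join before and after the change should show whether the optimizer now picks up the index.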