The MEMORY storage engine creates tables with contents that are stored in memory. Before MySQL 4.1, MEMORY tables were called HEAP tables. As of 4.1, MEMORY is the preferred term, although HEAP remains supported for backward compatibility.
Each MEMORY table is associated with one disk file. The file name begins with the table name and has an extension of .frm to indicate that it stores the table definition.
To specify explicitly that you want to create a MEMORY table, indicate that with an ENGINE table option:

CREATE TABLE t (i INT) ENGINE = MEMORY;
The older term TYPE is supported as a synonym for ENGINE for backward compatibility, but ENGINE is the preferred term from MySQL 4.0.18 on and TYPE is deprecated.
As indicated by the name, MEMORY tables are stored in memory. They use hash indexes by default, which makes them very fast, and very useful for creating temporary tables. However, when the server shuts down, all rows stored in MEMORY tables are lost. The tables themselves continue to exist because their definitions are stored in .frm files on disk, but they are empty when the server restarts.
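As an illustration (a minimal sketch; the table and data here are hypothetical), this is what that behavior looks like across a restart:

CREATE TABLE session_cache (k VARCHAR(64), v VARCHAR(255)) ENGINE = MEMORY;
INSERT INTO session_cache VALUES ('user:42', 'logged_in');
SELECT COUNT(*) FROM session_cache;   -- returns 1
-- ... the server is shut down and restarted ...
SELECT COUNT(*) FROM session_cache;   -- the table still exists, but is now empty: returns 0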
This example shows how you might create, use, and remove a MEMORY table:

mysql> CREATE TABLE test ENGINE = MEMORY
    -> SELECT ip, SUM(downloads) AS down
    -> FROM log_table GROUP BY ip;
mysql> SELECT COUNT(ip), AVG(down) FROM test;
mysql> DROP TABLE test;
MEMORY tables have the following characteristics:
Space for MEMORY tables is allocated in small blocks. Tables use 100% dynamic hashing for inserts. No overflow area or extra key space is needed. No extra space is needed for free lists. Deleted rows are put in a linked list and are reused when you insert new data into the table. MEMORY tables also have none of the problems commonly associated with deletes plus inserts in hashed tables.
MEMORY tables allow up to 32 indexes per table and 16 columns per index. Previously, the maximum key length supported by this storage engine was 255 bytes; as of MySQL 4.1.13, MEMORY tables support a maximum key length of 500 bytes. (See Section B.1.13, “Changes in MySQL 4.1.13 (15 July 2005)”.)
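For example, a sketch of a table definition that stays within these limits (the names are illustrative; key lengths assume a single-byte character set):

CREATE TABLE lookup2 (
  a CHAR(100),
  b CHAR(100),
  INDEX idx_ab (a, b)   -- two of the 16 columns allowed per index; 200-byte key, under the 500-byte limit
) ENGINE = MEMORY;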
Before MySQL 4.1, the MEMORY storage engine implemented only hash indexes. From MySQL 4.1 on, hash indexes are still the default, but you can specify explicitly that a MEMORY table index should be a HASH or BTREE by adding a USING clause as shown here:

CREATE TABLE lookup (id INT, INDEX USING HASH (id)) ENGINE = MEMORY;
CREATE TABLE lookup (id INT, INDEX USING BTREE (id)) ENGINE = MEMORY;

General characteristics of B-tree and hash indexes are described in Section 7.4.4, “How MySQL Uses Indexes”.
You can have nonunique keys in a MEMORY table. (This is an uncommon feature for implementations of hash indexes.)
If you have a hash index on a MEMORY table that has a high degree of key duplication (many index entries containing the same value), updates to the table that affect key values and all deletes are significantly slower. The degree of this slowdown is proportional to the degree of duplication (or, inversely proportional to the index cardinality). You can use a BTREE index to avoid this problem.
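For example, a hedged sketch of this workaround for a column with many duplicated values (the table is illustrative):

CREATE TABLE events (
  status TINYINT,                -- few distinct values, many duplicates
  msg CHAR(100),
  INDEX USING BTREE (status)     -- BTREE avoids the hash-index slowdown on duplicate-heavy keys
) ENGINE = MEMORY;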
As of MySQL 4.0.2, columns that are indexed can contain NULL values.
MEMORY tables use a fixed-length row storage format.
MEMORY supports AUTO_INCREMENT columns as of MySQL 4.1.0.
As of MySQL 4.1, you can use INSERT DELAYED with MEMORY tables. See Section 12.2.4.2, “INSERT DELAYED Syntax”.
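For example (using the table t created earlier in this section):

INSERT DELAYED INTO t (i) VALUES (1);   -- the server queues the row; the client does not wait for the insert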
MEMORY tables are shared among all clients (just like any other non-TEMPORARY table).
MEMORY table contents are stored in memory, which is a property that MEMORY tables share with internal temporary tables that the server creates on the fly while processing queries. However, the two types of tables differ in that MEMORY tables are not subject to storage conversion, whereas internal temporary tables are: if an internal temporary table becomes too large, the server automatically converts it to an on-disk table, as described in Section 7.5.8, “How MySQL Uses Internal Temporary Tables”. MEMORY tables are never converted to disk tables.
The maximum size of MEMORY tables is limited by the max_heap_table_size system variable, which has a default value of 16MB. To have larger (or smaller) MEMORY tables, you must change the value of this variable. The value in effect at the time a MEMORY table is created is the value used for the life of the table. (If you use ALTER TABLE or TRUNCATE TABLE, the value in effect at that time becomes the new maximum size for the table. A server restart also sets the maximum size of existing MEMORY tables to the global max_heap_table_size value.) You can set the size for individual tables as described later in this section.

The server needs sufficient memory to maintain all MEMORY tables that are in use at the same time.
To free memory used by a MEMORY table when you no longer require its contents, you should execute DELETE or TRUNCATE TABLE, or remove the table altogether using DROP TABLE.
If you want to populate a MEMORY table when the MySQL server starts, you can use the --init-file option. For example, you can put statements such as INSERT INTO ... SELECT or LOAD DATA INFILE into this file to load the table from a persistent data source. See Section 5.1.2, “Server Command Options”, and Section 12.2.5, “LOAD DATA INFILE Syntax”.
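A hedged sketch of such a file (the database, table, and file names are hypothetical); the server would then be started with mysqld --init-file=/path/to/init_memory.sql:

-- init_memory.sql: rebuild a MEMORY table from a persistent table at startup
DELETE FROM mydb.mem_lookup;
INSERT INTO mydb.mem_lookup SELECT id, name FROM mydb.disk_lookup;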
If you are using replication, the master server's MEMORY tables become empty when it is shut down and restarted. However, a slave is not aware that these tables have become empty, so it returns out-of-date content if you select data from them. Beginning with MySQL 4.0.18, when a MEMORY table is used on the master for the first time since the master was started, a DELETE FROM statement is written to the master's binary log automatically, thus synchronizing the slave to the master again. Note that even with this strategy, the slave still has outdated data in the table during the interval between the master's restart and its first use of the table. However, if you use the --init-file option to populate the MEMORY table on the master at startup, it ensures that this time interval is zero.
The memory needed for one row in a MEMORY table is calculated using the following expression:

SUM_OVER_ALL_BTREE_KEYS(max_length_of_key + sizeof(char*) × 4)
+ SUM_OVER_ALL_HASH_KEYS(sizeof(char*) × 2)
+ ALIGN(length_of_row + 1, sizeof(char*))

ALIGN() represents a round-up factor to cause the row length to be an exact multiple of the char pointer size. sizeof(char*) is 4 on 32-bit machines and 8 on 64-bit machines.
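As a worked example (hedged; the 9-byte row length here is illustrative, not taken from a real table): for a table with one HASH index, no BTREE indexes, and a 9-byte row on a 32-bit machine, the expression evaluates as follows:

SUM_OVER_ALL_HASH_KEYS:  sizeof(char*) × 2 = 4 × 2         =  8 bytes
ALIGN(9 + 1, 4):         10 rounded up to a multiple of 4  = 12 bytes
Total per row:           8 + 12                            = 20 bytes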
As mentioned earlier, the
max_heap_table_size
system
variable sets the limit on the maximum size of
MEMORY
tables. To control the maximum size for
individual tables, set the session value of this variable before
creating each table. (Do not change the global
max_heap_table_size
value unless
you intend the value to be used for MEMORY
tables created by all clients.) The following example creates two
MEMORY
tables, with a maximum size of 1MB and
2MB, respectively:
mysql> SET max_heap_table_size = 1024*1024;
Query OK, 0 rows affected (0.00 sec)

mysql> CREATE TABLE t1 (id INT, UNIQUE(id)) ENGINE = MEMORY;
Query OK, 0 rows affected (0.01 sec)

mysql> SET max_heap_table_size = 1024*1024*2;
Query OK, 0 rows affected (0.00 sec)

mysql> CREATE TABLE t2 (id INT, UNIQUE(id)) ENGINE = MEMORY;
Query OK, 0 rows affected (0.00 sec)
Both tables will revert to the server's global max_heap_table_size value if the server restarts.
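When a MEMORY table reaches its maximum size, further inserts fail. As a hedged sketch (the exact number of rows that fit depends on row size), an insert into a full t1 is rejected with an error along these lines:

mysql> INSERT INTO t1 VALUES (42);
ERROR 1114 (HY000): The table 't1' is full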
You can also specify a MAX_ROWS table option in CREATE TABLE statements for MEMORY tables to provide a hint about the number of rows you plan to store in them. This does not allow the table to grow beyond the max_heap_table_size value, which still acts as a constraint on maximum table size. For maximum flexibility in being able to use MAX_ROWS, set max_heap_table_size at least as high as the value to which you want each MEMORY table to be able to grow.
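For example, a sketch combining both settings (the values are illustrative):

SET max_heap_table_size = 1024*1024*4;    -- cap the next MEMORY table at 4MB
CREATE TABLE t3 (id INT, UNIQUE(id)) ENGINE = MEMORY MAX_ROWS = 100000;   -- hint: about 100,000 rows expected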
Additional Resources

A forum dedicated to the MEMORY storage engine is available at http://forums.mysql.com/list.php?92.
User Comments
I think the slowdown documented above is entirely unnecessary, and that the slowdown is not directly correlated with cardinality:

"...The degree of slowdown is proportional to the degree of duplication...You can use a BTREE index to avoid this problem."

Only a very simple "MTF" (move-to-front) optimization needs to be made to the HEAP storage engine:

http://bugs.mysql.com/bug.php?id=7817

BTREEs are much slower than hashing (about 5 to 6 times at least), and are necessary only when non-equality (range) indexing is required. See the research paper quoted at the above link for benchmarks.

So consider the above advice to use BTREEs to solve performance issues as incorrect: they are 5 to 6 times slower than a correctly optimized HASH index, although they may be faster in some cases than an *un*optimized HASH index.

As for the issue of slowdown correlation with cardinality, see the comment of "16 Jan 9:32pm" at the above link.

The current HASH key implementation is unoptimized and much slower than it needs to be for the case where most queries result in a non-match:

http://bugs.mysql.com/7936

In this case, it is possible that BTREE is faster until HASH is optimized.
I would like to explain something that can be confusing. Above, it is stated:

"MEMORY tables use a fixed record length format."

This means not that you can't create a VARCHAR column, but that it will be treated as CHAR and will occupy the full length you defined it with in every row.
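For example (a hypothetical table): every row of this table reserves the full 255 bytes for name, exactly as if the column had been declared CHAR(255):

CREATE TABLE fixed_demo (name VARCHAR(255)) ENGINE = MEMORY;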
Insertion into HASH-indexed columns is somewhat vulnerable to degenerate cases of "bad" data sets, which can cause insertion to be painfully slow (two orders of magnitude slower than for a "normal" data set). See the examples (with suggestions for application-level fixes) below.

Create a table n:

mysql> create temporary table n (n int unsigned auto_increment primary key);
mysql> insert into n select NULL from SQ_SIMILAR2; -- a 1-million-row table
Query OK, 1115156 rows affected (4.40 sec)
Records: 1115156 Duplicates: 0 Warnings: 0

Ok, now we have the numbers 1 to ~1.1 million in table n.

mysql> create temporary table sq (sq int unsigned, key (sq)) engine memory;

Ok, now we're set. Look at the timings in the two insert statements:
mysql> insert into sq select floor(n/64*1024)*n from n;
Query OK, 1115156 rows affected, 65535 warnings (2.80 sec)
Records: 1115156 Duplicates: 0 Warnings: 1098773
mysql> truncate table sq;
Query OK, 0 rows affected (0.01 sec)
mysql> insert into sq select floor(n/(64*1024-1))*n from n;
Query OK, 1115156 rows affected (2 min 59.34 sec)
Records: 1115156 Duplicates: 0 Warnings: 0
In other words, a slow-down factor of 64! Obviously something weird is going on that throws the adaptive cache algorithm to the ground!

Part of the problem can be solved by, e.g., random reordering before inserts (after truncating the table, of course):
mysql> insert into sq select floor(n/(64*1024-1))*n from n order by rand();
Query OK, 1115156 rows affected (52.64 sec)
Records: 1115156 Duplicates: 0 Warnings: 0
Now we're down to "only" a factor of about 20. But we can do even better:
mysql> insert into sq select floor(n/(64*1024-1))*n from n order by n desc;
Query OK, 1115156 rows affected (2.60 sec)
Records: 1115156 Duplicates: 0 Warnings: 0
Whee! Great.
Our actual data were a little different. The table SQ_SIMILAR2 contains 1.1 million non-unique numbers - about 180,000 distinct values between 1 and 1.1 million - in a, well, special [by accident] order. Here are some timings (table sq is truncated before each insert):
mysql> insert into sq select SQ_SIMILAR2 from SQ_SIMILAR2;
Query OK, 1115156 rows affected (4 min 39.07 sec)
Records: 1115156 Duplicates: 0 Warnings: 0
I.e. a little worse than the test case above. Random ordering seems a tiny bit worse. And ordering in ascending order is really, really bad:
mysql> insert into sq select SQ_SIMILAR2 from SQ_SIMILAR2 order by SQ_SIMILAR2;
Query OK, 1115156 rows affected (8 min 31.24 sec)
Records: 1115156 Duplicates: 0 Warnings: 0
Yikes, a slow-down factor of 182 compared to the floor(n/64*1024)*n example above. Sorting in descending order gets back within the realm of the reasonable again:
mysql> insert into sq select SQ_SIMILAR2 from SQ_SIMILAR2 order by SQ_SIMILAR2 desc;
Query OK, 1115156 rows affected (4.54 sec)
Records: 1115156 Duplicates: 0 Warnings: 0
But with non-unique data, can you do better? Try this:
mysql> insert into sq select distinct SQ_SIMILAR2 from SQ_SIMILAR2;
Query OK, 181272 rows affected (0.61 sec)
Records: 181272 Duplicates: 0 Warnings: 0
mysql> insert into sq select SQ_SIMILAR2 from SQ_SIMILAR2;
Query OK, 1115156 rows affected (1.50 sec)
Records: 1115156 Duplicates: 0 Warnings: 0
Altogether only 2.11 sec, half the time of the descending sort order, although further table manipulations are necessary to delete the spurious duplicates that have been created.
When joining a column in a MEMORY table against one in an InnoDB table, the kind of indexes on the columns is important.

In my case, the index on the MEMORY table column was of type HASH while the index on the corresponding InnoDB column was BTREE, and the query optimizer was not able to make use of the indexes, so queries were taking a long time. The fix in this instance was to convert the default HASH index on the MEMORY table column to BTREE.
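A hedged sketch of that fix (the table, column, and index names are hypothetical):

ALTER TABLE mem_lookup
    DROP INDEX idx_val,
    ADD INDEX idx_val USING BTREE (val);   -- replace the default HASH index with a BTREE index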