In the following section, we answer questions that are frequently
    asked about MySQL Cluster and the
    NDBCLUSTER storage engine.
  
Questions
15.6.1: Which versions of the MySQL software support Cluster? Do I have to compile from source?
15.6.2: What does “NDB” mean?
15.6.3: What is the difference between using MySQL Cluster vs using MySQL replication?
15.6.4: Do I need to do any special networking to run MySQL Cluster? How do computers in a cluster communicate?
15.6.5: How many computers do I need to run a MySQL Cluster, and why?
15.6.6: What do the different computers do in a MySQL Cluster?
15.6.7: 
          When I run the SHOW command in the MySQL
          Cluster management client, I see a line of output that looks
          like this:
        
id=2 @10.100.10.32 (Version: 4.1.26, Nodegroup: 0, Master)
What is a “master node”, and what does it do? How do I configure a node so that it is the master?
15.6.8: With which operating systems can I use Cluster?
15.6.9: What are the hardware requirements for running MySQL Cluster?
15.6.10: How much RAM do I need to use MySQL Cluster? Is it possible to use disk memory at all?
15.6.11: What file systems can I use with MySQL Cluster? What about network file systems or network shares?
15.6.12: Can I run MySQL Cluster nodes inside virtual machines (such as those created by VMWare, Parallels, or Xen)?
15.6.13: 
          I am trying to populate a MySQL Cluster database. The loading
          process terminates prematurely and I get an error message like
          this one:
        
          ERROR 1114: The table 'my_cluster_table'
          is full
        
          Why is this happening?
        
15.6.14: MySQL Cluster uses TCP/IP. Does this mean that I can run it over the Internet, with one or more nodes in remote locations?
15.6.15: Do I have to learn a new programming or query language to use MySQL Cluster?
15.6.16: How do I find out what an error or warning message means when using MySQL Cluster?
15.6.17: Is MySQL Cluster transaction-safe? What isolation levels are supported?
15.6.18: What storage engines are supported by MySQL Cluster?
15.6.19: In the event of a catastrophic failure — say, for instance, the whole city loses power and my UPS fails — would I lose all my data?
15.6.20: 
          Is it possible to use FULLTEXT indexes with
          MySQL Cluster?
        
15.6.21: Can I run multiple nodes on a single computer?
15.6.22: Can I add data nodes to a MySQL Cluster without restarting it?
15.6.23: Are there any limitations that I should be aware of when using MySQL Cluster?
15.6.24: How do I import an existing MySQL database into a MySQL Cluster?
15.6.25: How do cluster nodes communicate with one another?
15.6.26: What is an arbitrator?
15.6.27: What data types are supported by MySQL Cluster?
15.6.28: How do I start and stop MySQL Cluster?
15.6.29: What happens to MySQL Cluster data when the cluster is shut down?
15.6.30: Is it a good idea to have more than one management node for a MySQL Cluster?
15.6.31: Can I mix different kinds of hardware and operating systems in one MySQL Cluster?
15.6.32: Can I run two data nodes on a single host? Two SQL nodes?
15.6.33: Can I use host names with MySQL Cluster?
15.6.34: How do I handle MySQL users in a MySQL Cluster having multiple MySQL servers?
15.6.35: How do I continue to send queries in the event that one of the SQL nodes fails?
15.6.36: How do I back up and restore a MySQL Cluster?
15.6.37: What is an “angel process”?
Questions and Answers
15.6.1: Which versions of the MySQL software support Cluster? Do I have to compile from source?
          Beginning with MySQL 4.1.3, MySQL Cluster is supported in all
          MySQL-Max server binaries in the
          4.1 release series for operating systems on which
          MySQL Cluster is available. See Section 4.3.1, “mysqld — The MySQL Server”. You
          can determine whether your server has
          NDB support using either of the
          statements SHOW VARIABLES LIKE 'have_%' or
          SHOW ENGINES.
        
          You can also obtain NDB support
          by compiling MySQL from source, but it is not necessary to do
          so simply to use MySQL Cluster. To download the latest binary,
          RPM, or source distribution in the MySQL 4.1
          series, visit
          http://dev.mysql.com/downloads/mysql/4.1.html.
        
However, we recommend that you use MySQL Cluster NDB 6.3 or 7.0 for new deployments and, if you are already using MySQL 4.1 with clustering support, that you upgrade to one of these MySQL Cluster release series. For an overview of improvements made in MySQL Cluster NDB 6.2, 6.3, and 7.0, see MySQL Cluster Development in MySQL Cluster NDB 6.2, MySQL Cluster Development in MySQL Cluster NDB 6.3, and MySQL Cluster Development in MySQL Cluster NDB 7.0, respectively.
15.6.2: What does “NDB” mean?
          “NDB” stands for
          “Network
          Database”.
          NDB and
          NDBCLUSTER are both names for the
          storage engine that enables clustering support in MySQL.
          Either name is equally correct; both names appear in our
          documentation, and either name can be used in the
          ENGINE option of a
          CREATE TABLE statement for
          creating a MySQL Cluster table.
        
15.6.3: What is the difference between using MySQL Cluster vs using MySQL replication?
          In traditional MySQL replication, a master MySQL server
          updates one or more slaves. Transactions are committed
          sequentially, and a slow transaction can cause the slave to
          lag behind the master. This means that if the master fails, it
          is possible that the slave might not have recorded the last
          few transactions. If a transaction-safe engine such as
          InnoDB is being used, a
          transaction will either be complete on the slave or not
          applied at all, but replication does not guarantee that all
          data on the master and the slave will be consistent at all
          times. In MySQL Cluster, all data nodes are kept in synchrony,
          and a transaction committed by any one data node is committed
          for all data nodes. In the event of a data node failure, all
          remaining data nodes remain in a consistent state.
        
In short, whereas standard MySQL replication is asynchronous, MySQL Cluster is synchronous.
We have implemented (asynchronous) replication for Cluster in MySQL 5.1 and later. MySQL Cluster Replication (also sometimes known as “geo-replication”) includes the capability to replicate both between two MySQL Clusters, and from a MySQL Cluster to a non-Cluster MySQL server. However, we do not plan to backport this functionality to MySQL 4.1. See MySQL Cluster Replication.
15.6.4: Do I need to do any special networking to run MySQL Cluster? How do computers in a cluster communicate?
MySQL Cluster is intended to be used in a high-bandwidth environment, with computers connecting via TCP/IP. Its performance depends directly upon the connection speed between the cluster's computers. The minimum connectivity requirements for MySQL Cluster include a typical 100-megabit Ethernet network or the equivalent. Use gigabit Ethernet whenever available.
The faster SCI protocol is also supported, but requires special hardware. See Section 15.3.5, “Using High-Speed Interconnects with MySQL Cluster”, for more information about SCI.
15.6.5: How many computers do I need to run a MySQL Cluster, and why?
A minimum of three computers is required to run a viable cluster. However, the minimum recommended number of computers in a MySQL Cluster is four: one each to run the management and SQL nodes, and two computers to serve as data nodes. The purpose of the two data nodes is to provide redundancy; the management node must run on a separate machine to guarantee continued arbitration services in the event that one of the data nodes fails.
To provide increased throughput and high availability, you should use multiple SQL nodes (MySQL Servers connected to the cluster). It is also possible (although not strictly necessary) to run multiple management servers.
15.6.6: What do the different computers do in a MySQL Cluster?
A MySQL Cluster has both a physical and logical organization, with computers being the physical elements. The logical or functional elements of a cluster are referred to as nodes, and a computer housing a cluster node is sometimes referred to as a cluster host. There are three types of nodes, each corresponding to a specific role within the cluster. These are:
Management node. This node provides management services for the cluster as a whole, including startup, shutdown, backups, and configuration data for the other nodes. The management node server is implemented as the application ndb_mgmd; the management client used to control MySQL Cluster is ndb_mgm. See Section 15.4.3, “ndb_mgmd — The MySQL Cluster Management Server Daemon”, and Section 15.4.4, “ndb_mgm — The MySQL Cluster Management Client”, for information about these programs.
Data node. 
                This type of node stores and replicates data. Data node
                functionality is handled by instances of the
                NDB data node process
                ndbd. For more information, see
                Section 15.4.2, “ndbd — The MySQL Cluster Data Node Daemon”.
              
SQL node. 
                This is simply an instance of MySQL Server
                (mysqld) that is built with support
                for the NDBCLUSTER storage
                engine and started with the
                --ndb-cluster option to enable the
                engine and the --ndb-connectstring
                option to enable it to connect to a MySQL Cluster
                management server. For more about these options, see
                Section 15.3.4.2, “mysqld Command Options for MySQL Cluster”.
              
An API node is any application that makes direct use of Cluster data nodes for data storage and retrieval. An SQL node can thus be considered a type of API node that uses a MySQL Server to provide an SQL interface to the Cluster. You can write such applications (that do not depend on a MySQL Server) using the NDB API, which supplies a direct, object-oriented transaction and scanning interface to MySQL Cluster data; see The NDB API, for more information.
15.6.7: 
          When I run the SHOW command in the MySQL
          Cluster management client, I see a line of output that looks
          like this:
        
id=2 @10.100.10.32 (Version: 4.1.26, Nodegroup: 0, Master)
What is a “master node”, and what does it do? How do I configure a node so that it is the master?
The simplest answer is, “It's not something you can control, and it's nothing that you need to worry about in any case, unless you're a software engineer writing or analyzing the MySQL Cluster source code”.
If you don't find that answer satisfactory, here's a longer and more technical version:
A number of mechanisms in MySQL Cluster require distributed coordination among the data nodes. These distributed algorithms and protocols include global checkpointing, DDL (schema) changes, and node restart handling. To make this coordination simpler, the data nodes “elect” one of their number to be a “master”. There is no user-facing mechanism for influencing this selection, which is completely automatic; the fact that it is automatic is a key part of MySQL Cluster's internal architecture.
When a node acts as a master for any of these mechanisms, it is usually the point of coordination for the activity, and the other nodes act as “servants”, carrying out their parts of the activity as directed by the master. If the node acting as master fails, then the remaining nodes elect a new master. Tasks in progress that were being coordinated by the old master may either fail or be continued by the new master, depending on the actual mechanism involved.
          It is possible for some of these different mechanisms and
          protocols to have different master nodes, but in general the
          same master is chosen for all of them. The node indicated as
          the master in the output of SHOW in the
          management client is actually the DICT
          master (see
          The DBDICT Block, in the
          MySQL Cluster API Developer Guide, for
          more information), responsible for coordinating DDL and
          metadata activity.
        
MySQL Cluster is designed in such a way that the choice of master has no discernible effect outside the cluster itself. For example, the current master does not have significantly higher CPU or resource usage than the other data nodes, and failure of the master should not have a significantly different impact on the cluster than the failure of any other data node.
15.6.8: With which operating systems can I use Cluster?
MySQL Cluster is supported on most Unix-like operating systems, including Linux, Mac OS X, and Solaris. Beginning with MySQL Cluster NDB 6.4, it is also possible to run MySQL Cluster on Windows platforms on an experimental basis; we hope to release a GA version for Windows in MySQL Cluster NDB 7.1.
We are continuing to work on providing MySQL Cluster support for additional platforms; eventually we intend to offer MySQL Cluster on all platforms for which MySQL itself is supported.
For more detailed information concerning the level of support which is offered for MySQL Cluster on various operating system versions, OS distributions, and hardware platforms, please refer to http://www.mysql.com/support/supportedplatforms/cluster.html.
15.6.9: What are the hardware requirements for running MySQL Cluster?
          MySQL Cluster should run on any platform for which
          NDB-enabled binaries are
          available. For data nodes and API nodes, faster CPUs and more
          memory are likely to improve performance, and 64-bit CPUs are
          likely to be more effective than 32-bit processors. There must
          be sufficient memory on machines used for data nodes to hold
          each node's share of the database (see How much RAM
          do I Need? for more information). For a computer
          which is used only for running the MySQL Cluster management
          server, the requirements are minimal; a common desktop PC (or
          the equivalent) is generally sufficient for this task. Nodes
          can communicate via the standard TCP/IP network and hardware.
          They can also use the high-speed SCI protocol; however,
          special networking hardware and software are required to use
          SCI (see Section 15.3.5, “Using High-Speed Interconnects with MySQL Cluster”).
        
15.6.10: How much RAM do I need to use MySQL Cluster? Is it possible to use disk memory at all?
In MySQL 4.1, Cluster is in-memory only. This means that all table data (including indexes) is stored in RAM. Therefore, if your data takes up 1 GB of space and you want to replicate it once in the cluster, you need 2 GB of memory to do so (1 GB per replica). This is in addition to the memory required by the operating system and any applications running on the cluster computers.
          If a data node's memory usage exceeds what is available
          in RAM, then the system will attempt to use swap space up to
          the limit set for DataMemory. However, this
          will at best result in severely degraded performance, and may
          cause the node to be dropped due to slow response time (missed
          heartbeats). We do not recommend relying on disk swapping
          in a production environment for this reason. In any case, once
          the DataMemory limit is reached, any
          operations requiring additional memory (such as inserts) will
          fail.
        
We have implemented disk data storage for MySQL Cluster in MySQL 5.1 and later, but we have no plans to add this capability in MySQL 4.1. See MySQL Cluster Disk Data Tables, for more information.
You can use the following formula for obtaining a rough estimate of how much RAM is needed for each data node in the cluster:
(SizeofDatabase × NumberOfReplicas × 1.1 ) / NumberOfDataNodes
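As a rough illustration, the formula above can be written as a small Python function. The function name and the sample figures (a 1024 MB database, 2 replicas, 4 data nodes) are assumptions chosen for this sketch and do not come from the manual; the 1.1 factor is the multiplier from the formula.

```python
def estimate_ram_per_data_node(database_size_mb, number_of_replicas,
                               number_of_data_nodes, overhead=1.1):
    """Rough RAM estimate per data node, in the same unit as database_size_mb.

    Implements (SizeofDatabase x NumberOfReplicas x 1.1) / NumberOfDataNodes.
    """
    if number_of_replicas <= 0 or number_of_data_nodes <= 0:
        raise ValueError("replica and data node counts must be positive")
    return database_size_mb * number_of_replicas * overhead / number_of_data_nodes


# Hypothetical cluster: 1024 MB of data, 2 replicas, 4 data nodes.
estimate = estimate_ram_per_data_node(1024, 2, 4)
print(f"{estimate:.1f} MB per data node")  # prints "563.2 MB per data node"
```

This figure covers Cluster data only; memory for the operating system and other applications on each host must be added to it.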
To calculate the memory requirements more exactly requires determining, for each table in the cluster database, the storage space required per row (see Section 10.5, “Data Type Storage Requirements”, for details), and multiplying this by the number of rows. You must also remember to account for any column indexes as follows:
              Each primary key or hash index created for an
              NDBCLUSTER table requires
              21–25 bytes per record. These indexes use
              IndexMemory.
            
              Each ordered index requires 10 bytes storage per record,
              using DataMemory.
            
              Creating a primary key or unique index also creates an
              ordered index, unless this index is created with
              USING HASH. In other words:
            
A primary key or unique index on a Cluster table normally takes up 31 to 35 bytes per record.
                  However, if the primary key or unique index is created
                  with USING HASH, then it requires
                  only 21 to 25 bytes per record.
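The per-record figures above can be turned into a quick estimate of index overhead. The following Python sketch is illustrative only: the function name and the one-million-row table are assumptions, while the byte counts are the ones quoted in this answer.

```python
# Per-record index overhead in bytes, as quoted in the text above.
HASH_INDEX_BYTES = (21, 25)  # primary key or hash index (uses IndexMemory)
ORDERED_INDEX_BYTES = 10     # ordered index (uses DataMemory)


def key_overhead_bytes(rows, using_hash=False):
    """Return (low, high) total bytes for one primary key or unique index."""
    low, high = HASH_INDEX_BYTES
    if not using_hash:
        # Without USING HASH, an ordered index is created as well.
        low += ORDERED_INDEX_BYTES
        high += ORDERED_INDEX_BYTES
    return rows * low, rows * high


rows = 1_000_000  # hypothetical table size
print(key_overhead_bytes(rows))                   # (31000000, 35000000)
print(key_overhead_bytes(rows, using_hash=True))  # (21000000, 25000000)
```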
                
          Note that creating MySQL Cluster tables with USING
          HASH for all primary keys and unique indexes will
          generally cause table updates to run more quickly, in
          some cases by as much as 20 to 30 percent compared with updates
          on tables where USING HASH was not used in
          creating primary and unique keys. This is due to the fact that
          less memory is required (because no ordered indexes are
          created), and that less CPU must be utilized (because fewer
          indexes must be read and possibly updated). However, it also
          means that queries that could otherwise use range scans must
          be satisfied by other means, which can result in slower
          selects.
        
          When calculating Cluster memory requirements, you may find
          the ndb_size.pl utility useful; it is
          available in recent MySQL 4.1 releases. This Perl
          script connects to a current (non-Cluster) MySQL database and
          creates a report on how much space that database would require
          if it used the NDBCLUSTER storage
          engine. For more information, see
          Section 15.4.19, “ndb_size.pl — NDBCLUSTER Size Requirement Estimator”.
        
          It is especially important to keep in mind that
          every MySQL Cluster table must have a primary
          key. The NDB storage
          engine creates a primary key automatically if none is defined;
          this primary key is created without USING
          HASH.
        
          There is no easy way to determine exactly how much memory is
          being used for storage of Cluster indexes at any given time;
          however, warnings are written to the Cluster log when 80% of
          available DataMemory or
          IndexMemory is in use, and again when use
          reaches 85%, 90%, and so on.
        
15.6.11: What file systems can I use with MySQL Cluster? What about network file systems or network shares?
Generally, any file system that is native to the host operating system should work well with MySQL Cluster. If you find that a given file system works particularly well (or particularly poorly) with MySQL Cluster, we invite you to discuss your findings in the MySQL Cluster Forums.
          We do not test MySQL Cluster with FAT or
          VFAT file systems on Linux. Because of
          this, and due to the fact that these are not very useful for
          any purpose other than sharing disk partitions between Linux
          and Windows operating systems on multi-boot computers, we do
          not recommend their use with MySQL Cluster.
        
MySQL Cluster is implemented as a shared-nothing solution; the idea behind this is that the failure of a single piece of hardware should not cause the failure of multiple cluster nodes, or possibly even the failure of the cluster as a whole. For this reason, the use of network shares or network file systems is not supported for MySQL Cluster. This also applies to shared storage devices such as SANs.
15.6.12: Can I run MySQL Cluster nodes inside virtual machines (such as those created by VMWare, Parallels, or Xen)?
This is possible but not recommended for a production environment.
We have found that running MySQL Cluster processes inside a virtual machine can give rise to issues with timing and disk subsystems that have a strong negative impact on the operation of the cluster. The behavior of the cluster is often unpredictable in these cases.
If an issue can be reproduced outside the virtual environment, then we may be able to provide assistance. Otherwise, we cannot support it at this time.
15.6.13: 
          I am trying to populate a MySQL Cluster database. The loading
          process terminates prematurely and I get an error message like
          this one:
        
          ERROR 1114: The table 'my_cluster_table'
          is full
        
          Why is this happening?
        
          The cause is very likely to be that your setup does not
          provide sufficient RAM for all table data and all indexes,
          including the primary key required by the
          NDB storage engine and
          automatically created in the event that the table definition
          does not include the definition of a primary key.
        
It is also worth noting that all data nodes should have the same amount of RAM, since no data node in a cluster can use more memory than the least amount available to any individual data node. For example, if there are four computers hosting Cluster data nodes, and three of these have 3GB of RAM available to store Cluster data while the remaining data node has only 1GB RAM, then each data node can devote at most 1GB to MySQL Cluster data and indexes.
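The four-node example above amounts to taking the minimum across all data node hosts. The short Python sketch below restates that arithmetic; the function and variable names are hypothetical, and the cluster-wide total is simply the per-node limit multiplied by the number of nodes.

```python
def usable_memory_per_node_gb(ram_available_gb):
    """Each data node can use at most what the smallest host can offer."""
    return min(ram_available_gb)


# The example from the text: three hosts with 3 GB, one host with 1 GB.
hosts_gb = [3, 3, 3, 1]
per_node = usable_memory_per_node_gb(hosts_gb)
total_usable = per_node * len(hosts_gb)
total_installed = sum(hosts_gb)

print(per_node)         # 1
print(total_usable)     # 4
print(total_installed)  # 10
```

In this example only 4 GB of the 10 GB available is usable for Cluster data and indexes, which is why equally sized data node hosts are preferable.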
15.6.14: MySQL Cluster uses TCP/IP. Does this mean that I can run it over the Internet, with one or more nodes in remote locations?
It is very unlikely that a cluster would perform reliably under such conditions, as MySQL Cluster was designed and implemented with the assumption that it would be run under conditions guaranteeing dedicated high-speed connectivity such as that found in a LAN setting using 100 Mbps or gigabit Ethernet — preferably the latter. We neither test nor warrant its performance using anything slower than this.
Also, it is extremely important to keep in mind that communications between the nodes in a MySQL Cluster are not secure; they are neither encrypted nor safeguarded by any other protective mechanism. The most secure configuration for a cluster is in a private network behind a firewall, with no direct access to any Cluster data or management nodes from outside. (For SQL nodes, you should take the same precautions as you would with any other instance of the MySQL server.) For more information, see Section 15.5.8, “MySQL Cluster Security Issues”.
15.6.15: Do I have to learn a new programming or query language to use MySQL Cluster?
No. Although some specialized commands are used to manage and configure the cluster itself, only standard (My)SQL statements are required for the following operations:
Creating, altering, and dropping tables
Inserting, updating, and deleting table data
Creating, changing, and dropping primary and unique indexes
Some specialized configuration parameters and files are required to set up a MySQL Cluster — see Section 15.3.2, “MySQL Cluster Configuration Files”, for information about these.
A few simple commands are used in the MySQL Cluster management client (ndb_mgm) for tasks such as starting and stopping cluster nodes. See Section 15.5.2, “Commands in the MySQL Cluster Management Client”.
15.6.16: How do I find out what an error or warning message means when using MySQL Cluster?
There are two ways in which this can be done:
From within the mysql client, use SHOW ERRORS or SHOW WARNINGS immediately upon being notified of the error or warning condition.
              From a system shell prompt, use perror --ndb
              error_code.
            
15.6.17: Is MySQL Cluster transaction-safe? What isolation levels are supported?
          Yes. For tables created with the
          NDB storage engine, transactions
          are supported. Currently, MySQL Cluster supports only the
          READ COMMITTED transaction
          isolation level.
        
15.6.18: What storage engines are supported by MySQL Cluster?
          Clustering with MySQL is supported only by the
          NDB storage engine. That is, in
          order for a table to be shared between nodes in a MySQL
          Cluster, the table must be created using
          ENGINE=NDB (or the equivalent option
          ENGINE=NDBCLUSTER).
        
          It is possible to create tables using other storage engines
          (such as MyISAM or
          InnoDB) on a MySQL server being
          used with a MySQL Cluster, but these
          non-NDB tables do
          not participate in clustering; each such
          table is strictly local to the individual MySQL server
          instance on which it is created.
        
15.6.19: In the event of a catastrophic failure — say, for instance, the whole city loses power and my UPS fails — would I lose all my data?
All committed transactions are logged. Therefore, although it is possible that some data could be lost in the event of a catastrophe, this should be quite limited. Data loss can be further reduced by minimizing the number of operations per transaction. (It is not a good idea to perform large numbers of operations per transaction in any case.)
15.6.20: 
          Is it possible to use FULLTEXT indexes with
          MySQL Cluster?
        
          FULLTEXT indexing is not supported by any
          storage engine other than MyISAM.
          We are working to add this capability to MySQL Cluster tables
          in a future release.
        
15.6.21: Can I run multiple nodes on a single computer?
It is possible but not advisable. One of the chief reasons to run a cluster is to provide redundancy. To obtain the full benefits of this redundancy, each node should reside on a separate machine. If you place multiple nodes on a single machine and that machine fails, you lose all of those nodes. Given that MySQL Cluster can be run on commodity hardware loaded with a low-cost (or even no-cost) operating system, the expense of an extra machine or two is well worth it to safeguard mission-critical data. It is also worth noting that the requirements for a cluster host running a management node are minimal. This task can be accomplished with a 300 MHz Pentium or equivalent CPU and sufficient RAM for the operating system, plus a small amount of overhead for the ndb_mgmd and ndb_mgm processes.
It is acceptable to run multiple cluster data nodes on a single host for learning about MySQL Cluster, or for testing purposes; however, this is not generally supported for production use.
15.6.22: Can I add data nodes to a MySQL Cluster without restarting it?
Not in MySQL 4.1. While a rolling restart is all that is required for adding new management or API nodes to a MySQL Cluster (see Section 15.2.6.1, “Performing a Rolling Restart of a MySQL Cluster”), adding data nodes is more complex, and requires the following steps:
Make a complete backup of all Cluster data.
Completely shut down the cluster and all cluster node processes.
              Restart the cluster, using the --initial
              startup option for all instances of
              ndbd.
            
                Never use the --initial option when starting
                ndbd except when necessary to clear
                the data node file system. See
                Section 15.4.2, “ndbd — The MySQL Cluster Data Node Daemon”, for
                information about when this is required.
              
Restore all cluster data from the backup.
Beginning with MySQL Cluster NDB 6.4, it is possible to add new data nodes to a running MySQL Cluster without taking it offline. For more information, see Adding MySQL Cluster Data Nodes Online. However, we do not plan to add this capability in MySQL 4.1.
15.6.23: Are there any limitations that I should be aware of when using MySQL Cluster?
          Limitations on NDB tables in
          MySQL 4.1 include the following:
        
              Temporary tables are not supported; a
              CREATE
              TEMPORARY TABLE statement using
              ENGINE=NDB or
              ENGINE=NDBCLUSTER fails with an error.
            
              FULLTEXT indexes are not supported.
            
Index prefixes are not supported. Only complete columns may be indexed.
In MySQL 4.1, MySQL Cluster does not support spatial data types or spatial indexes. See Chapter 16, Spatial Extensions.
              Only complete rollbacks for transactions are supported.
              Partial rollbacks and rollbacks to savepoints are not
              supported. A failed insert due to a duplicate key or
              similar error causes a transaction to abort; when this
              occurs, you must issue an explicit
              ROLLBACK
              and retry the transaction.
            
The maximum number of attributes allowed per table is 128, and attribute names cannot be any longer than 31 characters. For each table, the maximum combined length of the table and database names is 122 characters.
              The maximum size for a table row is 8 kilobytes, not
              counting BLOB values. There
              is no set limit for the number of rows per table. Table
              size limits depend on a number of factors, in particular
              on the amount of RAM available to each data node.
            
              The NDB engine does not
              support foreign key constraints. As with
              MyISAM tables, if these are
              specified in a CREATE TABLE
              or ALTER TABLE statement,
              they are ignored.
            
For a complete listing of limitations in MySQL Cluster, see Section 15.1.4, “Known Limitations of MySQL Cluster”.
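Several of the numeric limits listed above can be checked mechanically before a table is created. The following Python sketch shows one way to do so; the function name, its arguments, and the sample table are hypothetical, and treating "8 kilobytes" as 8192 bytes is an assumption of this example rather than a figure from the manual.

```python
MAX_ATTRIBUTES = 128
MAX_ATTRIBUTE_NAME_LENGTH = 31
MAX_DB_PLUS_TABLE_NAME_LENGTH = 122
MAX_ROW_BYTES = 8 * 1024  # assumed reading of "8 kilobytes"; excludes BLOBs


def check_table_limits(database, table, columns, row_bytes):
    """Return a list of limit violations (empty if none were found).

    columns is a list of column names; row_bytes is the row size in
    bytes, not counting BLOB values.
    """
    problems = []
    if len(columns) > MAX_ATTRIBUTES:
        problems.append(f"too many attributes: {len(columns)}")
    for name in columns:
        if len(name) > MAX_ATTRIBUTE_NAME_LENGTH:
            problems.append(f"attribute name too long: {name}")
    if len(database) + len(table) > MAX_DB_PLUS_TABLE_NAME_LENGTH:
        problems.append("combined database and table name too long")
    if row_bytes > MAX_ROW_BYTES:
        problems.append(f"row too large: {row_bytes} bytes")
    return problems


# Hypothetical table definition that breaks two of the limits.
issues = check_table_limits("shop", "orders", ["id", "x" * 40], 9000)
for issue in issues:
    print(issue)
```

A check of this kind covers only the numeric limits; unsupported features such as FULLTEXT indexes or index prefixes still have to be reviewed by hand.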
15.6.24: How do I import an existing MySQL database into a MySQL Cluster?
          You can import databases into MySQL Cluster much as you would
          with any other version of MySQL. Other than the limitations
          mentioned elsewhere in this FAQ, the only other special
          requirement is that any tables to be included in the cluster
          must use the NDB storage engine.
          This means that the tables must be created with
          ENGINE=NDB or
          ENGINE=NDBCLUSTER.
        
          It is also possible to convert existing tables that use other
          storage engines to NDBCLUSTER
          using one or more ALTER TABLE
          statements. However, the definition of the table must be
          compatible with the NDBCLUSTER
          storage engine prior to making the conversion. In MySQL
          4.1, an additional workaround is also required;
          see Section 15.1.4, “Known Limitations of MySQL Cluster”, for details.
        
15.6.25: How do cluster nodes communicate with one another?
Cluster nodes can communicate via any of three different transport mechanisms: TCP/IP, SHM (shared memory), and SCI (Scalable Coherent Interface). Where available, SHM is used by default between nodes residing on the same cluster host; however, this is considered experimental. SCI is a high-speed (1 gigabit per second and higher), high-availability protocol used in building scalable multi-processor systems; it requires special hardware and drivers. See Section 15.3.5, “Using High-Speed Interconnects with MySQL Cluster”, for more about using SCI as a transport mechanism for MySQL Cluster.
15.6.26: What is an arbitrator?
If one or more data nodes in a cluster fail, it is possible that not all cluster data nodes will be able to “see” one another. In fact, it is possible that two sets of data nodes might become isolated from one another in a network partitioning, also known as a “split-brain” scenario. This type of situation is undesirable because each set of data nodes tries to behave as though it is the entire cluster. An arbitrator is required to decide between the competing sets of data nodes.
          When all data nodes in at least one node group are alive,
          network partitioning is not an issue, because no single subset
          of the cluster can form a functional cluster on its own. The
          real problem arises when no single node group has all its
          nodes alive, in which case network partitioning (the
          “split-brain” scenario) becomes possible. Then an
          arbitrator is required. All cluster nodes recognize the same
          node as the arbitrator, which is normally the management
          server; however, it is possible to configure any of the MySQL
          Servers in the cluster to act as the arbitrator instead. The
          arbitrator accepts the first set of cluster nodes to contact
          it, and tells the remaining set to shut down. Arbitrator
          selection is controlled by the
          ArbitrationRank configuration parameter for
          MySQL Server and management server nodes. (See
          Section 15.3.2.4, “Defining a MySQL Cluster Management Server”, for details.)
        
The role of arbitrator does not in and of itself impose any heavy demands upon the host so designated, and thus the arbitrator host does not need to be particularly fast or to have extra memory especially for this purpose.
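The following sketch prints an illustrative config.ini fragment showing how ArbitrationRank might be set so that the management server is the preferred arbitrator and one SQL node is a fallback. The host addresses and the function name are hypothetical, and only the parameters relevant to arbitration are shown. It assumes the usual meaning of the rank values: 0 means the node is never used as arbitrator, 1 means high priority, and 2 means low priority.

```shell
# Print an example config.ini fragment for review; it is not a complete
# cluster configuration.
sample_arbitration_config() {
    cat <<'EOF'
[NDB_MGMD]
HostName=192.168.0.10
# Preferred arbitrator
ArbitrationRank=1

[MYSQLD]
HostName=192.168.0.20
# Fallback arbitrator, used only if no high-priority node is available
ArbitrationRank=2

[MYSQLD]
HostName=192.168.0.21
# Never acts as arbitrator
ArbitrationRank=0
EOF
}
```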
15.6.27: What data types are supported by MySQL Cluster?
          In MySQL 4.1, MySQL Cluster supports all of the
          usual MySQL data types, except for those associated with
          MySQL's spatial extensions. (Spatial data types and
          spatial indexes are supported only by
          MyISAM; see
          Chapter 16, Spatial Extensions, for more information.)
          In addition, there are some differences with regard to indexes
          when used with NDB tables.
        
            In MySQL 4.1, MySQL Cluster tables (that is,
            tables created with ENGINE=NDB or
            ENGINE=NDBCLUSTER) have only fixed-width
            rows. This means that (for example) each record containing a
            VARCHAR(255)
            column will require space for 255 characters (as required
            for the character set and collation being used for the
            table), regardless of the actual number of characters stored
            therein. This issue is fixed in MySQL 5.1 and later;
            however, we do not plan to backport this functionality to
            MySQL 4.1.
          
See Section 15.1.4, “Known Limitations of MySQL Cluster”, for more information about these issues.
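To give a feel for what fixed-width rows mean in practice, the following sketch estimates the space reserved for a single VARCHAR column. The function name is hypothetical. The estimate multiplies the declared length by the maximum number of bytes per character for the character set (1 for latin1, up to 3 for utf8 in MySQL 4.1) and by the row count, and it ignores all per-row and per-column overhead, so it should be read as a lower bound.

```shell
# Approximate bytes reserved for one fixed-width VARCHAR column.
# Arguments: declared length in characters, maximum bytes per character,
# number of rows.
fixed_width_column_bytes() {
    echo $(( $1 * $2 * $3 ))
}

# One million rows of VARCHAR(255) in latin1:
#   fixed_width_column_bytes 255 1 1000000    prints 255000000
# The same column in utf8:
#   fixed_width_column_bytes 255 3 1000000    prints 765000000
```

The space is reserved even if every stored value is only a few characters long.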
15.6.28: How do I start and stop MySQL Cluster?
It is necessary to start each node in the cluster separately, in the following order:
Start the management node, using the ndb_mgmd command.
              You must include the -f or
              --config-file option to
              tell the management node where its configuration file can
              be found.
            
Start each data node with the ndbd command.
              Each data node must be started with the
              -c
              or --connect-string
              option so that the data node knows how to connect to the
              management server.
            
Start each MySQL Server (SQL node) using your preferred startup script, such as mysqld_safe.
              Each MySQL Server must be started with the
              --ndbcluster and
              --ndb-connectstring
              options. These options enable
              NDBCLUSTER storage engine
              support in mysqld and tell it how to connect to the
              management server.
            
          Each of these commands must be run from a system shell on the
          machine housing the affected node. (You do not have to be
          physically present at the machine — a remote login shell
          can be used for this purpose.) You can verify that the cluster
          is running by starting the NDB
          management client ndb_mgm on the machine
          housing the management node and issuing the
          SHOW or ALL STATUS
          command.
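The start order described above can be sketched as a set of shell functions, one per node type. The management host address and the configuration file path are hypothetical values, and the function names are ours. Each function is meant to be called only on the host that runs the corresponding node.

```shell
# Hypothetical values; adjust for your own cluster.
MGM_HOST="192.168.0.10"
CONFIG_FILE="/var/lib/mysql-cluster/config.ini"

# 1. Run on the management host.
start_management_node() {
    ndb_mgmd -f "$CONFIG_FILE"
}

# 2. Run on each data node host.
start_data_node() {
    ndbd -c "$MGM_HOST"
}

# 3. Run on each SQL node host. mysqld_safe stays in the foreground,
#    so it is started in the background here.
start_sql_node() {
    mysqld_safe --ndbcluster --ndb-connectstring="$MGM_HOST" &
}

# Run on the management host to check which nodes have connected.
show_cluster_status() {
    ndb_mgm -e "SHOW"
}
```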
        
          To shut down a running cluster, issue the command
          SHUTDOWN in the management client.
          Alternatively, you may enter the following command in a system
          shell:
        
shell> ndb_mgm -e "SHUTDOWN"
          (The quotation marks in this example are optional, since there
          are no spaces in the command string following the
          -e option; in addition, the
          SHUTDOWN command, like other management
          client commands, is not case-sensitive.)
        
Either of these commands causes the ndb_mgm, ndb_mgmd, and any ndbd processes to terminate gracefully. MySQL servers running as SQL nodes can be stopped using mysqladmin shutdown.
For more information, see Section 15.5.2, “Commands in the MySQL Cluster Management Client”, and Section 15.2.5, “Safe Shutdown and Restart of MySQL Cluster”.
15.6.29: What happens to MySQL Cluster data when the cluster is shut down?
The data that was held in memory by the cluster's data nodes is written to disk, and is reloaded into memory the next time that the cluster is started.
15.6.30: Is it a good idea to have more than one management node for a MySQL Cluster?
It can be helpful as a fail-safe. Only one management node controls the cluster at any given time, but it is possible to configure one management node as primary, and one or more additional management nodes to take over in the event that the primary management node fails.
See Section 15.3.2, “MySQL Cluster Configuration Files”, for information on how to configure MySQL Cluster management nodes.
15.6.31: Can I mix different kinds of hardware and operating systems in one MySQL Cluster?
Yes, as long as all machines and operating systems have the same “endianness” (all big-endian or all little-endian). We are working to overcome this limitation in a future MySQL Cluster release.
It is also possible to use software from different MySQL Cluster releases on different nodes. However, we support this only as part of a rolling upgrade procedure (see Section 15.2.6.1, “Performing a Rolling Restart of a MySQL Cluster”).
15.6.32: Can I run two data nodes on a single host? Two SQL nodes?
Yes, it is possible to do this. In the case of multiple data nodes, it is advisable (but not required) for each node to use a different data directory. If you want to run multiple SQL nodes on one machine, each instance of mysqld must use a different TCP/IP port. However, in MySQL 4.1, running more than one cluster node of a given type per machine is generally not encouraged or supported for production use.
We also advise against running data nodes and SQL nodes together on the same host, since the ndbd and mysqld processes may compete for memory.
15.6.33: Can I use host names with MySQL Cluster?
Yes, it is possible to use DNS and DHCP for cluster hosts. However, if your application requires “five nines” availability, you should use fixed (numeric) IP addresses, since making communication between Cluster hosts dependent on services such as DNS and DHCP introduces additional potential points of failure.
15.6.34: How do I handle MySQL users in a MySQL Cluster having multiple MySQL servers?
MySQL user accounts and privileges are not automatically propagated between different MySQL servers accessing the same MySQL Cluster. Therefore, you must make sure that these are copied between the SQL nodes yourself. You can do this manually, or automate the task with scripts.
            Do not attempt to work around this issue by converting the
            MySQL system tables to use the
            NDBCLUSTER storage engine. Only
            the MyISAM storage engine is
            supported for these tables.
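One way to script the copy is to dump the grant tables from one SQL node, load them into another, and then reload the privileges there. The sketch below is illustrative only. The function name and host names are hypothetical, and it assumes that both servers run the same MySQL version, that connection credentials are supplied through an option file, and that it is acceptable to overwrite the destination's existing accounts.

```shell
# Copy the MySQL 4.1 grant tables from one SQL node to another.
# Arguments: source host, destination host.
# The exit status reflects only the load and flush steps, so check the
# result before relying on it.
copy_grants() {
    mysqldump -h "$1" mysql user db host tables_priv columns_priv \
        | mysql -h "$2" mysql \
        && mysqladmin -h "$2" flush-privileges
}

# Possible usage (sql1 and sql2 are placeholder host names):
#   copy_grants sql1 sql2
```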
          
15.6.35: How do I continue to send queries in the event that one of the SQL nodes fails?
MySQL Cluster does not provide any sort of automatic failover between SQL nodes. Your application must be prepared to handle the loss of SQL nodes and to fail over between them.
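A minimal sketch of application-side failover, assuming the application knows the host names of all SQL nodes in advance: try each node in turn with mysqladmin ping and use the first one that responds. The function name and host names are hypothetical, and a real application would normally do this through its connector and retry failed statements as well.

```shell
# Print the first SQL node that answers mysqladmin ping.
# Arguments: candidate host names, in order of preference.
# Returns nonzero if none of them responds.
first_live_sql_node() {
    for host in "$@"; do
        if mysqladmin -h "$host" ping >/dev/null 2>&1; then
            echo "$host"
            return 0
        fi
    done
    return 1
}

# Possible usage (host, database, and table names are placeholders):
#   host=$(first_live_sql_node sql1 sql2 sql3) \
#       && mysql -h "$host" -e "SELECT COUNT(*) FROM mydb.t1"
```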
15.6.36: How do I back up and restore a MySQL Cluster?
You can use the NDB native backup and restore functionality in the MySQL Cluster management client and the ndb_restore program. See Section 15.5.3, “Online Backup of MySQL Cluster”, and Section 15.4.15, “ndb_restore — Restore a MySQL Cluster Backup”.
You can also use the traditional functionality provided for this purpose in mysqldump and the MySQL server. See Section 4.5.4, “mysqldump — A Database Backup Program”, for more information.
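Both approaches can be started from a system shell. The sketch below wraps each in a small function. The function names, database name, and output file are hypothetical. It assumes that ndb_mgm is run on the management host and that mysqldump can connect to an SQL node using its default options. Restoring a native backup with ndb_restore is not shown.

```shell
# Start a native NDB backup through the management client.
start_native_backup() {
    ndb_mgm -e "START BACKUP"
}

# Dump one database through an SQL node.
# Arguments: database name, output file.
dump_database() {
    mysqldump "$1" > "$2"
}

# Possible usage (mydb and the file name are placeholders):
#   start_native_backup
#   dump_database mydb mydb.sql
```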
15.6.37: What is an “angel process”?
This process monitors and, if necessary, attempts to restart the data node process. If you check the list of active processes on your system after starting ndbd, you can see that there are actually 2 processes running by that name, as shown here (we omit the output from ndb_mgmd and ndbd for brevity):
shell> ./ndb_mgmd

shell> ps aux | grep ndb
me      23002  0.0  0.0 122948  3104 ?        Ssl  14:14   0:00 ./ndb_mgmd
me      23025  0.0  0.0   5284   820 pts/2    S+   14:14   0:00 grep ndb

shell> ./ndbd -c 127.0.0.1 --initial

shell> ps aux | grep ndb
me      23002  0.0  0.0 123080  3356 ?        Ssl  14:14   0:00 ./ndb_mgmd
me      23096  0.0  0.0  35876  2036 ?        Ss   14:14   0:00 ./ndbd -c 127.0.0.1 --initial
me      23097  1.0  2.4 524116 91096 ?        Sl   14:14   0:00 ./ndbd -c 127.0.0.1 --initial
me      23168  0.0  0.0   5284   812 pts/2    R+   14:15   0:00 grep ndb
          The ndbd process showing 0 memory and CPU
          usage is the angel process. It actually does use a very small
          amount of each, of course. It simply checks to see if the main
          ndbd process (the primary data node process
          that actually handles the data) is running. If permitted to do
          so (for example, if the StopOnError
          configuration parameter is set to false — see
          Section 15.3.3.1, “MySQL Cluster Data Node Configuration Parameters”), the angel
          process tries to restart the primary data node process.
        

