For applications that store data using the default MySQL character set and collation (latin1, latin1_swedish_ci), no special configuration should be needed. If applications require data storage using a different character set or collation, you can configure character set information several ways:
Specify character settings per database. For example, applications that use one database might require utf8, whereas applications that use another database might require sjis.
Specify character settings at server startup. This causes the server to use the given settings for all applications that do not make other arrangements.
Specify character settings at configuration time, if you build MySQL from source. This causes the server to use the given settings for all applications, without having to specify them at server startup.
When different applications require different character settings, the per-database technique provides a good deal of flexibility. If most or all applications use the same character set, specifying character settings at server startup or configuration time may be most convenient.
For the per-database or server-startup techniques, the settings control the character set for data storage. Applications must also tell the server which character set to use for client/server communications, as described in the following instructions.
The examples shown here assume use of the utf8 character set and utf8_general_ci collation.
Specify character settings per database. To create a database such that its tables will use a given default character set and collation for data storage, use a CREATE DATABASE statement like this:
CREATE DATABASE mydb DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
Tables created in the database will use utf8 and utf8_general_ci by default for any character columns.
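To confirm that the defaults took effect, the database and a newly created table can be inspected. The following is a minimal sketch, assuming the mydb database from the preceding statement; the table name t1 and its column are hypothetical and used only for illustration.

```sql
-- Show the statement that would recreate the database, including its defaults.
SHOW CREATE DATABASE mydb;

-- Alternatively, read the defaults from INFORMATION_SCHEMA.
SELECT DEFAULT_CHARACTER_SET_NAME, DEFAULT_COLLATION_NAME
  FROM INFORMATION_SCHEMA.SCHEMATA
 WHERE SCHEMA_NAME = 'mydb';

-- A table created without explicit character settings
-- inherits the database defaults (t1 is a hypothetical name).
USE mydb;
CREATE TABLE t1 (name VARCHAR(50));
SHOW CREATE TABLE t1;
```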
Applications that use the database should also configure their connection to the server each time they connect. This can be done by executing a SET NAMES 'utf8' statement after connecting. The statement can be used regardless of connection method: the mysql client, PHP scripts, and so forth.
In some cases, it may be possible to configure the connection to use the desired character set some other way. For example, for connections made using mysql, you can specify the --default-character-set=utf8 command-line option to achieve the same effect as SET NAMES 'utf8'.
For more information about configuring client connections, see Section 9.1.4, “Connection Character Sets and Collations”.
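As a rough illustration of what the connection setup changes, the session variables can be inspected before and after SET NAMES. This sketch assumes an interactive session; the values shown before the SET NAMES statement depend on how the client connected.

```sql
-- Connection-related character set variables for the current session.
SHOW VARIABLES LIKE 'character_set_%';

-- Tell the server that the client sends and expects utf8.
SET NAMES 'utf8';

-- character_set_client, character_set_connection, and
-- character_set_results should now all report utf8.
SHOW VARIABLES LIKE 'character_set_%';

-- SET NAMES also sets the connection collation to the
-- default collation for the named character set.
SHOW VARIABLES LIKE 'collation_connection';
```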
Specify character settings at server startup. To select a character set and collation at server startup, use the --character-set-server and --collation-server options. For example, to specify the options in an option file, include these lines:
[mysqld]
character-set-server=utf8
collation-server=utf8_general_ci
These settings apply server-wide and serve as the defaults for databases created by any application, and for tables created in those databases.
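After restarting the server with these options, the effective server-level settings can be checked from any client. This is a sketch only; the database name testdb is hypothetical.

```sql
-- Server-level defaults as currently in effect.
SHOW VARIABLES LIKE 'character_set_server';
SHOW VARIABLES LIKE 'collation_server';

-- A database created without explicit character settings
-- picks up the server defaults (testdb is a hypothetical name).
CREATE DATABASE testdb;
SHOW CREATE DATABASE testdb;
```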
It is still necessary for applications to configure their connection using SET NAMES or equivalent after they connect, as described previously. You might be tempted to start the server with the --init_connect="SET NAMES 'utf8'" option to cause SET NAMES to be executed automatically for each client that connects. However, this will yield inconsistent results because the init_connect value is not executed for users who have the SUPER privilege.
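If init_connect is in use anyway, one way to see whether a given account is affected by this exception is to compare the configured value, the account's privileges, and the resulting session settings. The following is a sketch under that assumption, run while connected as the account in question.

```sql
-- The statement(s) the server runs for each connecting client, if any.
SELECT @@global.init_connect;

-- Privileges of the current account; if SUPER is listed
-- (or ALL PRIVILEGES ON *.*), init_connect was skipped for this session.
SHOW GRANTS FOR CURRENT_USER();

-- What this session actually ended up with.
SELECT @@character_set_client,
       @@character_set_connection,
       @@character_set_results;
```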
Specify character settings at MySQL configuration time. To select a character set and collation when you configure and build MySQL from source, use the DEFAULT_CHARSET and DEFAULT_COLLATION options for CMake:
shell> cmake . -DDEFAULT_CHARSET=utf8 \
           -DDEFAULT_COLLATION=utf8_general_ci
The resulting server uses utf8 and utf8_general_ci as the default for databases and tables and for client connections. It is unnecessary to use --character-set-server and --collation-server to specify those defaults at server startup. It is also unnecessary for applications to configure their connection using SET NAMES or equivalent after they connect to the server.
Regardless of how you configure the MySQL character set for
application use, you must also consider the environment within
which those applications execute. If you will send statements
using UTF-8 text taken from a file that you create in an editor,
you should edit the file with the locale of your environment set
to UTF-8 so that the file encoding is correct and so that the
operating system handles it correctly. If you use the
mysql client from within a terminal window,
the window must be configured to use UTF-8 or characters may not
display properly. For a script that executes in a Web
environment, the script must handle character encoding properly
for its interaction with the MySQL server, and it must generate
pages that correctly indicate the encoding so that browsers know
how to display the content of the pages. For example, you can include this <meta> tag within your <head> element:
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
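To check whether the environment and the connection settings agree, it can help to look at the bytes the server actually stored. The following is a diagnostic sketch, assuming a utf8 connection and a terminal or editor that sends UTF-8; the table name charset_check is hypothetical.

```sql
-- Hypothetical scratch table that stores utf8 data.
CREATE TABLE charset_check (name VARCHAR(50))
  DEFAULT CHARACTER SET utf8;

SET NAMES 'utf8';
INSERT INTO charset_check VALUES ('é');

-- Compare the stored bytes with the character count.
SELECT name,
       HEX(name)         AS stored_bytes,
       CHAR_LENGTH(name) AS characters,
       LENGTH(name)      AS bytes
  FROM charset_check;
```

When the environment and the connection both use UTF-8, the stored bytes for é should be C3A9 (one character, two bytes). A value such as C383C2A9 (two characters, four bytes) suggests that UTF-8 input was sent over a connection declared as latin1 and was converted a second time.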