9.1.3.4. Column Character Set and Collation

Every “character” column (that is, a column of type CHAR, VARCHAR, or TEXT) has a column character set and a column collation. Column definition syntax for CREATE TABLE and ALTER TABLE has optional clauses for specifying the column character set and collation:

col_name {CHAR | VARCHAR | TEXT} (col_length)
    [CHARACTER SET charset_name]
    [COLLATE collation_name]

These clauses can also be used for ENUM and SET columns:

col_name {ENUM | SET} (val_list)
    [CHARACTER SET charset_name]
    [COLLATE collation_name]

Examples:

CREATE TABLE t1
(
    col1 VARCHAR(5)
      CHARACTER SET latin1
      COLLATE latin1_german1_ci
);

ALTER TABLE t1 MODIFY
    col1 VARCHAR(5)
      CHARACTER SET latin1
      COLLATE latin1_swedish_ci;

MySQL chooses the column character set and collation in the following manner:

  • If both CHARACTER SET X and COLLATE Y are specified, character set X and collation Y are used.

    CREATE TABLE t1
    (
        col1 CHAR(10) CHARACTER SET utf8 COLLATE utf8_unicode_ci
    ) CHARACTER SET latin1 COLLATE latin1_bin;
    

    The character set and collation are specified for the column, so they are used. The column has character set utf8 and collation utf8_unicode_ci.

  • If CHARACTER SET X is specified without COLLATE, character set X and its default collation are used.

    CREATE TABLE t1
    (
        col1 CHAR(10) CHARACTER SET utf8
    ) CHARACTER SET latin1 COLLATE latin1_bin;
    

    The character set is specified for the column, but the collation is not. The column has character set utf8 and the default collation for utf8, which is utf8_general_ci. To see the default collation for each character set, use the SHOW COLLATION statement.

  • If COLLATE Y is specified without CHARACTER SET, the character set associated with Y and collation Y are used.

    CREATE TABLE t1
    (
        col1 CHAR(10) COLLATE utf8_polish_ci
    ) CHARACTER SET latin1 COLLATE latin1_bin;
    

    The collation is specified for the column, but the character set is not. The column has collation utf8_polish_ci and the character set is the one associated with the collation, which is utf8.

  • Otherwise, the table character set and collation are used.

    CREATE TABLE t1
    (
        col1 CHAR(10)
    ) CHARACTER SET latin1 COLLATE latin1_bin;
    

    Neither the character set nor collation are specified for the column, so the table defaults are used. The column has character set latin1 and collation latin1_bin.

The CHARACTER SET and COLLATE clauses are standard SQL.

If you use ALTER TABLE to convert a column from one character set to another, MySQL attempts to map the data values, but if the character sets are incompatible, there may be data loss.

Copyright © 2010-2024 Platon Technologies, s.r.o.           Home | Man pages | tLDP | Documents | Utilities | About
Design by styleshout