MySQL: convert utf8_bin to utf8_general_ci

Author Message

Piotrek Karaś

Thursday 12 June 2008 9:00:01 pm

Hi,

I've just discovered that, probably by mistake, one of our eZP databases was encoded with utf8_bin. The site works fine, but there's little I can do with the DB. Does anyone know how to convert that DB back to the utf8_general_ci?

Thanks,
Piotrek

--
Company: mediaSELF Sp. z o.o., http://www.mediaself.pl
eZ references: http://ez.no/partners/worldwide_partners/mediaself
eZ certified developer: http://ez.no/certification/verify/272585
eZ blog: http://ez.ryba.eu

Bartek Modzelewski

Friday 13 June 2008 3:04:27 am

Hi Piotrek,

Just try with simple MySQL query:

ALTER DATABASE database_name DEFAULT CHARACTER SET utf8 COLLATE utf8_general_ci;

You can do it also in phpMyAdmin.

Cheers
Bartek

Baobaz
http://www.baobaz.com

Piotrek Karaś

Friday 13 June 2008 6:19:28 am

Hey Bartek,

I was afraid that binary encoded fields aren't handled the same way. Thanks for the tip anyway! I had to use slightly different method, meaning CONVERT TO CHARACTER SET, which will actually convert values and apply that to all the text fields in a table.

Thanks,
Piotrek

--
Company: mediaSELF Sp. z o.o., http://www.mediaself.pl
eZ references: http://ez.no/partners/worldwide_partners/mediaself
eZ certified developer: http://ez.no/certification/verify/272585
eZ blog: http://ez.ryba.eu

Bruce Morrison

Friday 13 June 2008 5:22:07 pm

Hi Piotrek

There are two things in play here the Character Set and the Collation. I'll assume you know about character sets. A Collation is <i>a set of rules determining how to compare and sort character data. Collation properties include things like case sensitivity, accent sensitivity, trailing spaces sensitivity, and others. In MySQL, every CHAR, VARCHAR, TEXT column have some collation assumption.</i> The main effect of collations is how the ORDER BY sorts data.

utf8_bin and utf8_general_ci are both collations associated with the utf8 character set. The SQL posted by Bartek should have been adequate as all you needed to do was change the collation.

Hope this clears things up!

Cheers
Bruce

My Blog: http://www.stuffandcontent.com/
Follow me on twitter: http://twitter.com/brucemorrison
Consolidated eZ Publish Feed : http://friendfeed.com/rooms/ez-publish

Piotrek Karaś

Friday 13 June 2008 9:28:28 pm

Hi Bruce,

Yes, I do understand the distinction between character set and collation, or at least I thought I have until this problem that I described above. First, I changed the default collation at the database level, but that didn't change anything, of course. Then, I changed default collation for every db table, but unfortunately collation still had its default setting at the column level, and that one remained unchanged. I couldn't find any SQL statement to modify those column-level collations in an easy way, and that's when I found this CONVERT TO CHARACTER SET option, which - as stated in the MySQL doc - would take care of all column settings without having to explicitly address them (which in case of eZ Publish DB would be a long process or would require a little script, that I didn't have time for). So now I realize, that I haven't tried changing column's default collation, and as you and Bartek mention, that might have done the trick. I was also misled by the fact that until that convertion, every SQL dump/export included binary strings instead of plain text.

1) Is there any SQL statement to alter default collation "recursively"? I mean, at all levels: database, each table, each string-like field?
2) Would convert be a risk in this case, or just an unnecessary operation?

Thanks,
Piotrek

--
Company: mediaSELF Sp. z o.o., http://www.mediaself.pl
eZ references: http://ez.no/partners/worldwide_partners/mediaself
eZ certified developer: http://ez.no/certification/verify/272585
eZ blog: http://ez.ryba.eu

Powered by eZ Publish™ CMS Open Source Web Content Management. Copyright © 1999-2014 eZ Systems AS (except where otherwise noted). All rights reserved.

eZ debug

Timing: Jan 30 2025 21:42:07
Script start
Timing: Jan 30 2025 21:42:07
Module start 'layout'
Timing: Jan 30 2025 21:42:07
Module start 'content'
Timing: Jan 30 2025 21:42:07
Module end 'content'
Timing: Jan 30 2025 21:42:07
Script end

Main resources:

Total runtime0.0275 sec
Peak memory usage4,096.0000 KB
Database Queries3

Timing points:

CheckpointStart (sec)Duration (sec)Memory at start (KB)Memory used (KB)
Script start 0.00000.0056 588.2656151.2266
Module start 'layout' 0.00560.0038 739.4922220.7188
Module start 'content' 0.00940.0162 960.21091,005.9609
Module end 'content' 0.02570.0017 1,966.171937.9922
Script end 0.0274  2,004.1641 

Time accumulators:

 Accumulator Duration (sec) Duration (%) Count Average (sec)
Ini load
Load cache0.003010.9428140.0002
Check MTime0.00124.3318140.0001
Mysql Total
Database connection0.00093.292510.0009
Mysqli_queries0.00248.820730.0008
Looping result0.00000.051210.0000
Template Total0.00124.510.0012
Template load0.00103.601610.0010
Template processing0.00020.899510.0002
Override
Cache load0.00072.545810.0007
General
dbfile0.00124.422180.0002
String conversion0.00000.024340.0000
Note: percentages do not add up to 100% because some accumulators overlap

Templates used to render the page:

UsageRequested templateTemplateTemplate loadedEditOverride
1print_pagelayout.tpl<No override>extension/community/design/community/templates/print_pagelayout.tplEdit templateOverride template
 Number of times templates used: 1
 Number of unique templates used: 1

Time used to render debug report: 0.0001 secs