Fixing broken Unicode in WordPress

I tried to use Unicode stars in posts to show how much I liked some books. I was embarrassed to see them show up as ????? question marks. Yuck. Why is that happening?

I’ve been blogging for a long time and suspected the culprit was the age of this WordPress install. I’ve diligently upgraded the software, but nobody likes to migrate a DB schema. So I dug through the DB and, sure enough, my wp_posts table is set to latin1 with a _ci collation. For non-nerds, that means the table compares text case-insensitively and is only set up to store “Latin” characters – basic ABC123, punctuation, and Western European accented letters, but none of the fun Unicode symbols or characters from non-Latin scripts.
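The question marks are easy to reproduce outside the database. Here is a rough sketch in Python. It is not what MySQL does internally, just the same visible effect: any character with no slot in the character set gets swapped for a `?`. Using Python's "latin-1" codec as a stand-in for MySQL's latin1 is an assumption for illustration.

```python
# Sketch: what happens when text is forced into a Latin-1 column.
# Python's "latin-1" codec stands in for MySQL's latin1 here; that is an
# assumption for illustration (MySQL's latin1 is closer to cp1252).

def squeeze_into_latin1(text: str) -> str:
    """Encode to Latin-1, replacing anything that does not fit with '?'."""
    return text.encode("latin-1", errors="replace").decode("latin-1")

rating = "Great book ★★★★★"
stored = squeeze_into_latin1(rating)
print(stored)  # Great book ?????
```

Accented letters such as the é in "café" pass through untouched, which is probably why the problem can go unnoticed for years on an English-language blog.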

I’ve done some digging and this isn’t a push-button process, so I’m in for a weekend scripting adventure.
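As a starting point for that script, here is a minimal sketch that only prints the SQL it would run, so nothing touches the database until the output has been reviewed. The table list and the target collation are assumptions, not something read from this install. `ALTER TABLE ... CONVERT TO CHARACTER SET` is standard MySQL syntax.

```python
# Sketch: build the ALTER TABLE statements a conversion script might run.
# Table names and target collation are assumptions; adjust for your install.

TARGET_CHARSET = "utf8mb4"
TARGET_COLLATION = "utf8mb4_unicode_520_ci"

def conversion_sql(tables):
    """Return one ALTER TABLE ... CONVERT TO statement per table."""
    return [
        f"ALTER TABLE `{name}` CONVERT TO CHARACTER SET {TARGET_CHARSET} "
        f"COLLATE {TARGET_COLLATION};"
        for name in tables
    ]

for statement in conversion_sql(["wp_posts", "wp_comments"]):
    print(statement)
```

Take a full backup before running anything like this. If the latin1 columns already hold UTF-8 bytes that were written through a mismatched connection, a plain conversion can double-encode them, so it is worth testing on a copy first.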


One thought on “Fixing broken Unicode in WordPress”

  1. I’ve wondered about this, because I’ve had this error message in my Apache logs for several years:
    PHP message: WordPress database error Illegal mix of collations (latin1_swedish_ci,IMPLICIT) and (utf8mb4_unicode_520_ci,COERCIBLE) for operation 'like' for query
    followed by some horrible hex-encoded query.

    I knew it would be hard to fix, so I haven’t bothered yet, especially since that particular message looks like somebody attempting some sort of exploit.
