Who am I?

Hi, I’m Graeme and these are my notes, from my messy desk. I started this blog because Google proved to be more useful at finding content than anything else I’ve used.

So I started adding my own content in the hopes that Google would index it and allow me to find things again in the future.

It works.


Thursday, November 3, 2005

MySQL, Zope and Unicode

OK, after poking around in the source a little, I think I can debunk the supposed hack (that you can append UNICODE=1 to the connection string to set the MySQL connection to Unicode). This is the code in question, from db.py, which parses the connection string:
[code lang="python"]
def _parse_connection_string(self, connection):
    kwargs = {'conv': self.conv}
    items = split(connection)
    self._use_TM = None
    if not items: return kwargs
    lockreq, items = items[0], items[1:]
    if lockreq[0] == "*":
        self._mysql_lock = lockreq[1:]
        db_host, items = items[0], items[1:]
        self._use_TM = 1
    else:
        self._mysql_lock = None
        db_host = lockreq
    if '@' in db_host:
        db, host = split(db_host,'@',1)
        kwargs['db'] = db
        if ':' in host:
            host, port = split(host,':',1)
            kwargs['port'] = int(port)
        kwargs['host'] = host
    else:
        kwargs['db'] = db_host
    if kwargs['db'] and kwargs['db'][0] in ('+', '-'):
        self._try_transactions = kwargs['db'][0]
        kwargs['db'] = kwargs['db'][1:]
    else:
        self._try_transactions = None
    if not kwargs['db']:
        del kwargs['db']
    if not items: return kwargs
    kwargs['user'], items = items[0], items[1:]
    if not items: return kwargs
    kwargs['passwd'], items = items[0], items[1:]
    if not items: return kwargs
    kwargs['unix_socket'], items = items[0], items[1:]
    return kwargs
[/code]

which does everything it's documented to do (allows you to specify the database, host & port, whether transactions should be enabled, credentials and a path to the Unix socket), but nothing more. No Unicode enabling hacks, nothing.
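To see what actually happens to a trailing UNICODE=1, here is a rough standalone port of the parsing logic above. It is only a sketch: the function name parse_connection_string and the sample database name, user and password are made up for illustration, and the lock, transaction-manager and conv handling is left out because it doesn't affect the result.

```python
def parse_connection_string(connection):
    """Simplified, standalone port of the positional parsing shown above."""
    kwargs = {}
    items = connection.split()
    if not items:
        return kwargs
    lockreq, items = items[0], items[1:]
    if lockreq[0] == "*":
        # A leading "*lockname" token; the database spec is the next token.
        db_host, items = items[0], items[1:]
    else:
        db_host = lockreq
    if "@" in db_host:
        db, host = db_host.split("@", 1)
        kwargs["db"] = db
        if ":" in host:
            host, port = host.split(":", 1)
            kwargs["port"] = int(port)
        kwargs["host"] = host
    else:
        kwargs["db"] = db_host
    # A leading "+" or "-" is the transaction flag, not part of the name.
    if kwargs["db"] and kwargs["db"][0] in ("+", "-"):
        kwargs["db"] = kwargs["db"][1:]
    if not kwargs["db"]:
        del kwargs["db"]
    # The remaining tokens are purely positional; anything past the
    # third one is silently ignored.
    for key, value in zip(("user", "passwd", "unix_socket"), items):
        kwargs[key] = value
    return kwargs


# Hypothetical connection strings, for illustration only.
as_socket = parse_connection_string("mydb@localhost:3306 graeme secret UNICODE=1")
dropped = parse_connection_string("mydb graeme secret /tmp/mysql.sock UNICODE=1")
```

In the first case the extra token is taken as the Unix socket path (as_socket["unix_socket"] is "UNICODE=1"). In the second it is dropped without complaint. Either way, nothing Unicode-related gets set.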

So that answers that. It looks like the only solution is to set the default encoding globally in sitecustomize.py. Unfortunately, one can't set the default encoding anywhere else: one of the last things that site.py does is to delete sys.setdefaultencoding(), so once the Python interpreter is initialised, the encoding is fixed for the duration. (I guess there's a very good reason for this.)
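For the record, a minimal sitecustomize.py along those lines might look like the sketch below. The choice of "utf-8" is my assumption; use whatever encoding suits your data. The hasattr guard is there because sys.setdefaultencoding() only exists in Python 2, and only until site.py removes it.

```python
# sitecustomize.py: imported automatically by site.py at interpreter start-up.
import sys

DEFAULT_ENCODING = "utf-8"  # assumption: pick the encoding your data needs

# site.py deletes sys.setdefaultencoding() after importing this module,
# so this is the last point at which it can be called (Python 2 only).
if hasattr(sys, "setdefaultencoding"):
    sys.setdefaultencoding(DEFAULT_ENCODING)
```

The file needs to live somewhere on the interpreter's import path (site-packages is the usual spot), which is exactly the "outside our application's domain" problem.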

Bother. I don't like that we have to make changes outside our application's domain. But for email, you need Unicode. Think of all that foreign character set spam you get -- at least MailManager will render it beautifully!
