UnicodeEncodeError after using ü in loop variable

Eru_Iluvatar · February 2016

Hi

I've run into a character-encoding problem when I concatenate strings in the following way inside a loop-sequence block:
sendMessage(slideChangeEvent('Word_' + var.word))

It results in the following error message:

Error while executing inline script

item-stack: experiment[run].word_loop[run].word_sequence[run].sendMessageWord[run]
exception type: UnicodeEncodeError
exception message: 'ascii' codec can't encode character u'\xfc' in position 1: ordinal not in range(128)
item: sendMessageWord
time: Mon Feb 01 15:24:05 2016
phase: run

Traceback:
File "dist\libopensesame\inline_script.py", line 102, in run
File "dist\libopensesame\python_workspace.py", line 159, in _exec
File "", line 2, in
UnicodeEncodeError: 'ascii' codec can't encode character u'\xfc' in position 1: ordinal not in range(128)

These are the possible values of var.word:

glücklich
wütend
traurig
erstaunt
angeekelt
Of course the problem only arises with "glücklich" and "wütend".

How can I solve this problem?
Thanks for your suggestions.

Cheers,
Stefan

sebastiaan · February 2016

Hi Stefan,

I suspect that this bit:

'Word_' + var.word

... works fine, and returns a unicode object (e.g. u'Word_glücklich'). You can check this by doing the concatenation on its own, not embedded in a larger command.
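A quick way to check this in isolation might look like the sketch below. Here `word` is only a stand-in for var.word and is assumed to be a unicode object, as in the error message above; `label` and `is_text` are names made up for the illustration.

```python
# -*- coding: utf-8 -*-
# Stand-in for var.word (assumption: it is a unicode/text object).
word = u'glücklich'

# Concatenating a plain ASCII literal with a unicode object yields text,
# not bytes, so no encoding step happens here.
label = 'Word_' + word

# type(u'') is `unicode` on Python 2 and `str` on Python 3.
is_text = isinstance(label, type(u''))
print(repr(label), is_text)
```

If this runs without an error, the concatenation itself is not what raises the UnicodeEncodeError.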

So the question is: What is slideChangeEvent or sendMessage doing that triggers a UnicodeEncodeError? Without seeing the full code I cannot tell!

Cheers,
Sebastiaan

Eru_Iluvatar · February 2016

Hi Sebastiaan

You're right of course, a simple print('word' + var.word) works fine.

sendMessage() takes the string provided by slideChangeEvent() and sends it through a TCP socket to the external API of iMotions, where we capture people's facial responses together with markers sent from OpenSesame.

The relevant code is the following:

import socket

### Some global settings/variables used
lnbr = '\r\n'
IP = "127.0.0.1"
EAPI_PORT = 8089        # iMotions external API
CTRL_PORT = 8087        # iMotions Remote Control
# iMotions parameters
studyName = 'GamblingGame'  
subjectName = str(var.subject_nr)

# setup sockets and connect
sockExtAPI = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sockExtAPI.connect((IP, EAPI_PORT))
sockCTRL = socket.socket(socket.AF_INET,socket.SOCK_STREAM)
sockCTRL.connect((IP, CTRL_PORT))

# send external API message (mouseEvent or slideChangeEvent)    
def sendMessage(message):
    sockExtAPI.send(message)
    #response = sockCTRL.recv(4092)
    log.write('ExtAPI message sent: ' + message)

def slideChangeEvent(slideID):
    # discrete header
    # version 2
    header = 'M;2;'
    # field 5: slideID
    # field 7: marker type N
    # (marks the start of the next segment, automatically closing any currently
    # open segment.)
    event = ';;' + slideID + ';;N;I'
    return header + event + lnbr

Edit: nevermind my previous edit

sebastiaan · February 2016

I suspect the bad guy is here:

sockExtAPI.send(message)

In your case, message is a unicode object, which will be automatically converted to a str object. Kind of like this:

_message = str(message)
sockExtAPI.send(_message)

And this goes wrong, because the default encoding is assumed to be ascii, which doesn't contain special characters such as ü. To make this unicode-safe, you need to explicitly say which character encoding you want to use. For example:

_message = message.encode('utf-8')
sockExtAPI.send(_message)

This will send the message as a utf-8 encoded bytestring. Whether the receiver will take kindly to that is, of course, another matter.
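As a rough illustration of the difference, the sketch below encodes a sample message both ways without opening a socket. The message text mimics what slideChangeEvent() returns; `ascii_ok` and `payload` are names invented for this example.

```python
# -*- coding: utf-8 -*-
# Sample message in the format built by slideChangeEvent() above.
message = u'M;2;;;Word_glücklich;;N;I\r\n'

# Encoding as ascii fails, because ü is outside the 128 ASCII characters.
try:
    message.encode('ascii')
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False

# Encoding as utf-8 succeeds; ü becomes the two bytes 0xC3 0xBC.
payload = message.encode('utf-8')
print(ascii_ok, repr(payload))
```

The resulting `payload` is a bytestring and is what would be passed to sockExtAPI.send(). The receiver has to decode it as utf-8 to get the original text back.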

Cheers,
Sebastiaan

Eru_Iluvatar · February 2016

sockExtAPI.send(message.encode('utf-8')) did the trick, at least inside OpenSesame. However, log.write(message.encode('utf-8')) gives this funny mixture of symbols, and so does the exported data from iMotions.

Hm, looks like we will need to encode the stimuli differently.
But thank you very much for your patience. I'm far from being experienced in python... :-)

Cheers,
Stefan

sebastiaan · February 2016

However, log.write(message.encode('utf-8')) gives this funny mixture of symbols, and so does the exported data from iMotions.

The (very common) mistake that you're making is thinking that the problem lies in how the file is written, whereas it lies in how you're reading it. The OpenSesame log file is fine, but it's utf-8 encoded. If it looks funny, this is because the text editor or spreadsheet has used the wrong encoding to read it, and you'll have to explicitly tell it to use utf-8.
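To see the effect outside of any editor, the sketch below writes a word to a temporary utf-8 file and reads it back twice. The file name is made up, and cp1252 is only an assumed example of a wrong encoding (a common default on Windows); the actual wrong encoding on your system may differ.

```python
# -*- coding: utf-8 -*-
import io
import os
import tempfile

text = u'glücklich'

# Write a small utf-8 encoded file (hypothetical stand-in for a log file).
path = os.path.join(tempfile.mkdtemp(), 'example_log.txt')
with io.open(path, 'w', encoding='utf-8') as f:
    f.write(text)

# Reading with the wrong encoding turns ü into two unrelated characters.
with io.open(path, 'r', encoding='cp1252') as f:
    wrong = f.read()

# Reading with the encoding that was used for writing restores the text.
with io.open(path, 'r', encoding='utf-8') as f:
    right = f.read()

print(repr(wrong), repr(right))
```

The bytes in the file are identical in both cases; only the interpretation changes.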
