Accent and mask conversion: ANSI, UTF-8 format?

edited February 2019 in OpenSesame

Hello everyone,

I'm editing an OpenSesame script and I have a problem.
In this task we present 4 French words followed by a mask.
The material is in a pool directory containing 4 .txt files.

The mask depends on the size of the words so that there are as many '#' as letters when masking:

mask = ''
mask = len(word1)*'#'+' '+len(word2)*'#'+' '+len(word3)*'#'+' '+len(word4)*'#'

It works for words without accents.
For words with accents, each accented letter adds one '#' too many, e.g. 'poussé' -> '#######'

So I made a list:

mask = ''
acc_word = ["poussé","....","...."]

if word1 in acc_word:
    mask = (len(word1)-1)*'#'+' '+len(word2)*'#'+' '+len(word3)*'#'+' '+len(word4)*'#'

That works well !

The remaining problem concerns the first word of each file.
The words are displayed correctly, but switching to the mask generates 3 extra '#',
e.g., 'alan' -> '#######'

This only occurs for the words at the beginning of the file
Since there are 4 files it concerns 4 trials in the task.

So I wrote the material myself in a txt file in ANSI format.

That works well !

The first word of the file is now correctly coded
e.g. 'alan' -> '####'

But now this message is displayed for words with accents

exception type: UnicodeDecodeError
exception message: 'utf8' codec can't decode byte 0xea in position 15: invalid continuation byte

Finally, I tried csv files, but I get the same message.

Could anyone help me?
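
For readers who hit the same error, here is a minimal sketch of what the message above likely means. It assumes that "ANSI" on the machine in question is Windows-1252 (the usual case on a French Windows system, but an assumption here). In that encoding 'ê' is stored as the single byte 0xea, which is not valid UTF-8 when followed by a plain ASCII letter. The word 'bêtes' is taken from later in the thread; the variable names are only illustrative.

```python
# -*- coding: utf-8 -*-
word = u'bêtes'

# How the word is stored on disk under each encoding.
ansi_bytes = word.encode('cp1252')   # assumed meaning of "ANSI"
utf8_bytes = word.encode('utf-8')

# Decoding Windows-1252 bytes as UTF-8 fails on the byte 0xea.
try:
    ansi_bytes.decode('utf-8')
    decoded_ok = True
except UnicodeDecodeError:
    decoded_ok = False

# Decoding with the encoding the file was actually saved in works.
recovered = ansi_bytes.decode('cp1252')
```

In other words, the file has to be decoded with the same encoding it was saved in; saving as UTF-8 and decoding as UTF-8 is the simplest combination.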



  • Salut Chris,

    The devil lies in the details when it comes to character encoding. You mention that you read from a text file. To make things easy, I would ensure that the text file is saved in utf-8 encoding. (If possible, also indicate that there should not be a BOM [byte order mark], which often shows up as an extraneous invisible character at the start of the file.)

    Then, when you read in the text, convert everything to unicode as soon as possible and only then do stuff with it. The following script will do this in Python 2, which is what OpenSesame uses by default. (In Python 3, things are easier.)

    with open('my_file.txt') as f:
        for line in f:
            line = line.decode('utf-8')
            # Now do something with the line

    So in general, that's the flow you want to use. That's also what OpenSesame will do for you if you use a .csv file as a source for the loop table.
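
As a hedged variant of the same flow, the sketch below uses `io.open` with the `utf-8-sig` codec, which decodes UTF-8 and silently drops a leading BOM if one is present. This is not code from the thread: `read_words` and the temporary demo file are hypothetical, and the snippet is written so that it should run under both Python 2 and Python 3.

```python
# -*- coding: utf-8 -*-
import io
import os
import tempfile


def read_words(path):
    """Return the non-empty lines of a UTF-8 file as unicode strings.

    The 'utf-8-sig' codec strips a BOM at the start of the file, if any.
    """
    with io.open(path, encoding='utf-8-sig') as f:
        return [line.strip() for line in f if line.strip()]


# Demo: write a small file that starts with a UTF-8 BOM (3 bytes).
demo_path = os.path.join(tempfile.mkdtemp(), 'words.txt')
with io.open(demo_path, 'wb') as f:
    f.write(b'\xef\xbb\xbf' + u'alan\npoussé\n'.encode('utf-8'))

words = read_words(demo_path)
first_len = len(words[0])    # length of 'alan' without the BOM
second_len = len(words[1])   # length of 'poussé' in characters
```

Because the lines are already unicode when they leave `read_words`, `len()` counts characters, so neither the BOM nor the accents inflate the mask.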



  • edited February 2019

    And if you gaze long enough into an abyss, the abyss will gaze back into you.

    Thank you Sebastiaan! It was indeed a txt file in UTF-8 with BOM.
    I downloaded a good converter/editor that let me convert the files to plain UTF-8 only.
    Once that was done, the first word of each file no longer had three extra characters, but
    accented words were still counted with one extra character.

    But the problem was clearer now, so I made a list of the accented words:

    acc_words = ["purée", "bêtes", "..."]
    lw1, lw2, lw3, lw4 = len(word1), len(word2), len(word3), len(word4)
    if word1 in acc_words:
        lw1 = lw1 - 1
    # I did that for each word
    mask = lw1*'#' + ' ' + lw2*'#' + ' ' + lw3*'#' + ' ' + lw4*'#'

    Now it works !
    Thanks again

  • Salut Sebastiaan,

    First of all, thank you for answering.
    They were indeed text files saved in UTF-8 with a BOM (which adds 3 bytes to the beginning of the file).
    So I re-saved the text files as UTF-8 without BOM using the Sublime Text editor, and the 3 extra characters at the beginning of the file were gone.
    However, the problem with accented words persisted. It was now easier to handle, though: I created a list of accented words and simply coded the masks:

    awo = ["purée","bêtes","œufs","...."]
    lw1, lw2, lw3, lw4=len(word1), len(word2), len(word3), len(word4)
    if word1 in awo:
        lw1 = lw1-1
    if word2 in awo:
        lw2 = lw2-1
    if word3 in awo:
        lw3 = lw3-1
    if word4 in awo:
        lw4 = lw4-1
    mask = lw1*'#'+' '+lw2*'#'+' '+lw3*'#'+' '+lw4*'#'

    It works now !!
    Thank you again, it was a good clue
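
A possible alternative to the lookup list, sketched under assumptions: if each word is converted to unicode before its length is taken, the mask can be built directly and no list of accented words is needed. `to_text` and `make_mask` are hypothetical helper names, and the sketch assumes byte strings are UTF-8 encoded. The NFC normalization is an extra precaution so that a letter stored as base letter plus combining accent still counts as one character.

```python
# -*- coding: utf-8 -*-
import unicodedata


def to_text(word, encoding='utf-8'):
    """Return word as a normalized unicode string (decoding bytes if needed)."""
    if isinstance(word, bytes):
        word = word.decode(encoding)
    return unicodedata.normalize('NFC', word)


def make_mask(words, symbol=u'#'):
    """Build a mask with one symbol per character, words separated by spaces."""
    return u' '.join(symbol * len(to_text(w)) for w in words)


mask = make_mask([u'purée', u'bêtes', u'œufs', u'alan'])
mask_from_bytes = make_mask([u'poussé'.encode('utf-8')])
```

This keeps the mask logic independent of which words happen to contain accents, and a two-byte accented letter counts only once; a word with two accented letters would otherwise need a correction of 2 rather than 1.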

  • Hi Chris,

    When defining literal text in an inline_script, it's best to define unicode strings directly by prefixing them with u, as in u'poussé'. For unicode strings, the length is the number of characters. Otherwise (as in your case) they will be byte strings by default, and the length will be the number of bytes, which doesn't need to match the number of characters!

    This, by the way, is only true for Python 2.
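
The difference can be checked with a short sketch like the one below (illustrative only; the variable names are not from the thread). It makes the byte string explicit with `.encode('utf-8')`, so it behaves the same in Python 2 and Python 3.

```python
# -*- coding: utf-8 -*-
text = u'poussé'             # unicode string: a sequence of characters
raw = text.encode('utf-8')   # byte string: what a plain '...' literal holds in Python 2

n_chars = len(text)   # number of characters
n_bytes = len(raw)    # number of bytes; 'é' takes two bytes in UTF-8
```

Here `n_chars` is 6 and `n_bytes` is 7, which matches the extra '#' reported for 'poussé' at the start of the thread.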



  • Hi Sebastiaan,

    The people who work on this task are reporting problems to me.
    Could it have something to do with your last message?

    It's the same message that was displayed when I tried your suggestion with:

        with open(path) as file:
            # Read the first 23 lines and decode each one to unicode (Python 2)
            stimuli1_ord = [line.decode('utf-8') for line in file.readlines()[0:23]]

    The program would stop and display this message about an empty list, as if a file were empty.

    When I tried it myself, the error displayed was: "Python seems to have crashed. This should not happen. If Python crashes often, please report it on the OpenSesame forum."

    Details : item-stack: ``

    That seems to happen at the end of the task.

    Could it be related to the file format?
    Or to the way the stimuli are gathered?

    For information: the file "sentences1" was edited with Sublime Text 3, and I did not want to rename it with ".txt" because it worked well as it was.


  • Problem solved !

    It was the cycle repetition that was set incorrectly in the loop item.
    No problem with the file format.

    Have a good day,
