voice key and recording

Hi there,

We are trying to implement a voice key (logging RT from voice onset) while also recording everything participants say as their response. We started from here: https://forum.cogsci.nl/discussion/5965/voice-key-recording.

So far so good, I can detect the voice, print out the loudness and check whether it's above the defined threshold. I also manage to write a .wav file per trial.

BUT several issues remain. I'm sure the solution is trivial, but I can't wrap my head around it:

  • It seems the .wav files are recorded only UNTIL the threshold is reached. I'd like the response to be detected (and the RT logged), but I also want to record everything that is said from stimulus onset (start_time) for a defined amount of time (until timeout, for example).
  • The following line did not work, as apparently "set_response" is not defined.
set_response(response=response, response_time=response_time)
  • If I try to automatically log all the variables in a logger, I get the following error: UnicodeDecodeError: 'utf-8' codec can't decode byte 0xf7 in position 0: invalid start byte. However, if I list the variables to include, it works normally. What am I missing here?

Any help would be greatly appreciated! Best!


Here is what I have so far:

import pyaudio
import struct
import math
import wave

timeout = 5000
sound_threshold = 0.005

CHUNK = 1024
SHORT_NORMALIZE = (1.0/32768.0)
FORMAT = pyaudio.paInt16
CHANNELS = 2
RATE = 44100
RECORD_SECONDS = 5
WAVE_OUTPUT_FILENAME = "C://Users/kokor/Desktop/"+ str(target) + ".wav"

p = pyaudio.PyAudio()

def get_rms(block):
"""Get root mean square as a measure of loudness"""
count = len(block)/2
format = "%dh" % (count)
shorts = struct.unpack( format, block )
sum_squares = 0.0
for sample in shorts:
n = sample * SHORT_NORMALIZE
sum_squares += n*n
return math.sqrt( sum_squares / count )

stream = p.open(format=FORMAT,
channels=CHANNELS,
rate=RATE,
input=True,
frames_per_buffer=CHUNK)

print("* recording")
frames = []

start_time = clock.time()
while True:
if clock.time() - start_time >= timeout:
response_time = timeout
response = u'timeout'
var.loudness = None
break

try:
block = stream.read(CHUNK)
frames.append(block)
except IOError as e:
print(e)

loudness = get_rms(block)
print(loudness)

if loudness > sound_threshold:
var.response_time = clock.time() - start_time
var.response = u'detected'
var.in_clock_time = clock.time()
var.start = start_time
var.loudness = loudness
break

print(response)
print(response_time)
print("* done recording")

stream.stop_stream()
stream.close()
p.terminate()

wf = wave.open(WAVE_OUTPUT_FILENAME, 'wb')
wf.setnchannels(CHANNELS)
wf.setsampwidth(p.get_sample_size(FORMAT))
wf.setframerate(RATE)
wf.writeframes(b''.join(frames))
wf.close()

Comments

  • Hi,

    It is a bit hard to judge because the indentation in your code is incorrect, but I suppose the problem is this part:

    if loudness > sound_threshold:
        var.response_time = clock.time() - start_time
        var.response = u'detected'
        var.in_clock_time = clock.time()
        var.start = start_time
        var.loudness = loudness
        break
    

    Once you hit this if statement, you leave the while loop (because of the break statement) and continue with writing the file. What you probably want is either to wait until the loudness drops below the threshold again (the person stopped talking?) and only then stop the recording, or to wait a fixed amount of time before you leave the while loop. Do you see what I mean?
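
The first option (stop once the speaker has gone quiet again) can be sketched on a plain list of per-block loudness values. This is only an illustration: `find_onset_offset` and `min_quiet_blocks` are hypothetical names, not part of PyAudio or OpenSesame, and requiring a few consecutive quiet blocks is an assumption made here so that a short pause inside a word does not count as the end of the response.

```python
def find_onset_offset(loudness, threshold, min_quiet_blocks=3):
    """Return (onset_index, offset_index) of the first utterance.

    onset_index: first block whose loudness exceeds the threshold.
    offset_index: first block of the first run of `min_quiet_blocks`
    consecutive quiet blocks after the onset (None if speech never stops).
    Returns (None, None) if no block exceeds the threshold.
    """
    onset = None
    quiet_run = 0
    for i, value in enumerate(loudness):
        if onset is None:
            # Still waiting for the voice onset
            if value > threshold:
                onset = i
        elif value > threshold:
            # Speech continues, so reset the silence counter
            quiet_run = 0
        else:
            quiet_run += 1
            if quiet_run >= min_quiet_blocks:
                return onset, i - min_quiet_blocks + 1
    return onset, None
```

In a live loop the same bookkeeping (an onset flag plus a counter of quiet blocks) would run on each block as it is read, and the loop would break when the counter reaches the limit.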

    If I try to automatically log all the variables in a logger

    I'm not sure what you mean by automatically logging all the variables. In any case, try logging only a subset of the variables to find out which one causes the problem. Once you know, you might be able to figure out why it is a problem and whether and how you can fix it.
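
One way to narrow this down is to check which values cannot be decoded as UTF-8. The helper below is a hypothetical sketch that works on an ordinary dict of names and values. It is only a guess, not confirmed in this thread, that raw audio bytes (such as `block` or `frames`) are what the logger trips over.

```python
def find_undecodable(variables):
    """Return the names of bytes-like values that are not valid UTF-8."""
    bad = []
    for name, value in variables.items():
        if isinstance(value, (bytes, bytearray)):
            try:
                bytes(value).decode("utf-8")
            except UnicodeDecodeError:
                bad.append(name)
    return bad


# Example with made-up values: only the raw byte string is reported
suspects = find_undecodable({
    "response": "detected",
    "block": b"\xf7\x00\x12\x00",  # not valid UTF-8
    "label": b"abc",               # valid UTF-8
})
print(suspects)
```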

    Hope this helps,

    Eduard


  • Ah thanks for answering!

    You mean adding a time.sleep() before the break statement? I'll give that a go!

  • That by itself won't do. You would postpone breaking out of the loop, but you would also completely halt your experiment, so nothing else would be recorded. You want your loop to proceed normally until a certain amount of time has passed. So instead of breaking out of the loop right away, you start a timer that you check on every iteration, and you break only once the waiting period has elapsed.
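
A minimal sketch of that timer idea, written so it can run without a microphone: `read_block`, `loudness_of` and `now` are hypothetical stand-ins for `stream.read(...)`, `get_rms(...)` and `clock.time()`, and `tail` is an assumed name for the waiting period after voice onset.

```python
def record_with_tail(read_block, loudness_of, now, threshold, timeout, tail):
    """Record blocks until `timeout`, or until `tail` time units after onset.

    Returns (response_time, frames); response_time is None if no block
    exceeded the threshold before the timeout.
    """
    frames = []
    onset_time = None
    start = now()
    while True:
        t = now()
        if t - start >= timeout:
            break  # overall timeout reached
        if onset_time is not None and t - onset_time >= tail:
            break  # waiting period after voice onset has elapsed
        block = read_block()
        frames.append(block)  # keep recording before and after the onset
        if onset_time is None and loudness_of(block) > threshold:
            onset_time = now()  # start the timer instead of breaking
    response_time = None if onset_time is None else onset_time - start
    return response_time, frames
```

Setting `tail` equal to `timeout` makes the loop always record for the full trial duration, which is the behaviour asked for in the original question.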

    Does this make sense?

    Eduard


  • Thanks for this, it got me on the right track (well, I think). I think I get what I want now; however, I am not sure it's the best way to accomplish this :)

    I moved the computation of the variables out of the loop and did it "offline", in between trials. Does that make sense?

    import pyaudio
    import struct
    import math
    import wave
    import pandas as pd

    # A low threshold increases sensitivity, a high threshold
    # reduces it.
    sound_threshold = 0.0005
    # Maximum response time
    timeout = 2000

    FORMAT = pyaudio.paInt16
    SHORT_NORMALIZE = (1.0/32768.0)
    CHANNELS = 2
    RATE = 44100
    INPUT_BLOCK_TIME = 0.01
    INPUT_FRAMES_PER_BLOCK = int(RATE*INPUT_BLOCK_TIME)
    filename = "C://Users/kokor/Desktop/"+ str(target) + ".wav"
    chunk=1024

    p = pyaudio.PyAudio()

    def get_rms(block):
        """Get root mean square as a measure of loudness"""
        count = len(block)/2
        format = "%dh" % (count)
        shorts = struct.unpack( format, block )
        sum_squares = 0.0
        for sample in shorts:
            n = sample * SHORT_NORMALIZE
            sum_squares += n*n
        return math.sqrt( sum_squares / count )

    # Open the mic
    stream = p.open(
        format=FORMAT,
        channels=CHANNELS,
        rate=RATE,
        input=True,
        input_device_index=0,
        frames_per_buffer=INPUT_FRAMES_PER_BLOCK
    )

    # Listen for sounds until a sound is detected or a timeout occurs.
    print("* recording")
    frames = []
    list_ldn = []
    list_ct = []

    start_time = clock.time()
    while clock.time() - start_time <= timeout:
        try:
            block = stream.read(chunk)
            frames.append(block)
        except IOError as e:
            print(e)
        loudness = get_rms(block)
        list_ldn.append(loudness)
        list_ct.append(clock.time())

    df = pd.DataFrame({'clock_time':list_ct,'loudness':list_ldn})
    try:
        var.test = (df.loc[df['loudness'] > 0.003, 'clock_time'].iloc[0]) - start_time
    except IndexError:
        var.test = timeout

    # Close the audio stream
    stream.stop_stream()
    stream.close()
    p.terminate()

    wf = wave.open(filename, 'wb')
    wf.setnchannels(CHANNELS)
    wf.setsampwidth(p.get_sample_size(FORMAT))
    wf.setframerate(RATE)
    wf.writeframes(b''.join(frames))
    wf.close()
    
  • import pyaudio
    import struct
    import math
    import wave
    import pandas as pd

    # A low threshold increases sensitivity, a high threshold
    # reduces it.
    sound_threshold = 0.0005
    # Maximum response time
    timeout = 2000

    FORMAT = pyaudio.paInt16
    SHORT_NORMALIZE = (1.0/32768.0)
    CHANNELS = 2
    RATE = 44100
    INPUT_BLOCK_TIME = 0.01
    INPUT_FRAMES_PER_BLOCK = int(RATE*INPUT_BLOCK_TIME)
    filename = "C://Users/kokor/Desktop/"+ str(target) + ".wav"
    chunk=1024

    p = pyaudio.PyAudio()

    def get_rms(block):
        """Get root mean square as a measure of loudness"""
        count = len(block)/2
        format = "%dh" % (count)
        shorts = struct.unpack( format, block )
        sum_squares = 0.0
        for sample in shorts:
            n = sample * SHORT_NORMALIZE
            sum_squares += n*n
        return math.sqrt( sum_squares / count )

    # Open the mic
    stream = p.open(
        format=FORMAT,
        channels=CHANNELS,
        rate=RATE,
        input=True,
        input_device_index=0,
        frames_per_buffer=INPUT_FRAMES_PER_BLOCK
    )

    # Record until the timeout; the response is detected afterwards.
    print("* recording")
    frames = []
    list_ldn = []
    list_ct = []

    # Start a timer until timeout and compute loudness/clocktime for each block - append to lists
    start_time = clock.time()
    while clock.time() - start_time <= timeout:
        try:
            block = stream.read(chunk)
            frames.append(block)
        except IOError as e:
            print(e)
        loudness = get_rms(block)
        list_ldn.append(loudness)
        list_ct.append(clock.time())

    # Close the audio stream
    stream.stop_stream()
    stream.close()
    p.terminate()

    wf = wave.open(filename, 'wb')
    wf.setnchannels(CHANNELS)
    wf.setsampwidth(p.get_sample_size(FORMAT))
    wf.setframerate(RATE)
    wf.writeframes(b''.join(frames))
    wf.close()

    # merge the 2 lists to a df and compute response_time & response
    df = pd.DataFrame({'clock_time':list_ct,'loudness':list_ldn})
    try:
        var.response_time = (df.loc[df['loudness'] > sound_threshold, 'clock_time'].iloc[0]) - start_time
        var.response = u'detected'
    # use except to deal with errors when no response is detected
    except IndexError:
        var.response_time = timeout
        var.response = u'timeout'
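
For reference, the "offline" step at the end does not strictly need pandas. The sketch below does the same lookup on the two lists directly. `first_onset_rt` is a hypothetical helper name, and the block-duration constant assumes the `chunk` and `RATE` values used above.

```python
def first_onset_rt(clock_times, loudness, start_time, threshold, timeout):
    """Return (response_time, response) for the first block above threshold."""
    for t, level in zip(clock_times, loudness):
        if level > threshold:
            return t - start_time, 'detected'
    # No block exceeded the threshold
    return timeout, 'timeout'


# Duration of one block in ms, assuming 1024 frames per read at 44100 Hz
BLOCK_MS = 1000.0 * 1024 / 44100

# Made-up example values: the second block crosses the threshold
rt, response = first_onset_rt([110, 133, 156], [0.0001, 0.002, 0.01],
                              start_time=100, threshold=0.0005, timeout=2000)
```

Because each timestamp is taken after the block has been read, the logged response time is only resolved to about one block duration (roughly 23 ms with these settings).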
    
  • Hi,

    Optimality is overrated. As long as the code does what it is supposed to do and doesn't cause unbearable delays, you're good to go, I would say!

    Good luck,

    Eduard

