voice key and recording

cltsson · July 2023

Hi there,

We are trying to implement a voice key (logging RT from voice onset) while also recording everything participants say as their response. We started from here: https://forum.cogsci.nl/discussion/5965/voice-key-recording.

So far so good: I can detect the voice, print out the loudness and check whether it is above the defined threshold. I can also write a .wav file per trial.

BUT several issues remain. I'm sure they are quite trivial, but I can't wrap my head around them:

1. It seems the .wav files are only recorded UNTIL the threshold is reached. I'd like the response to be detected (and the RT logged), but I also want to record everything that is said from stimulus onset (start_time) for a defined amount of time (until the timeout, for example).

2. The following line did not work, as apparently "set_response" is not defined:

set_response(response=response, response_time=response_time)

3. If I try to automatically log all the variables in a logger, I get the following error: UnicodeDecodeError: 'utf-8' codec can't decode byte 0xf7 in position 0: invalid start byte. However, if I list the variables to include, it works normally. What am I missing here?

Any help would be greatly appreciated! Best!

Here is what I have so far:

import pyaudio
import struct
import math
import wave

timeout = 5000
sound_threshold = 0.005

CHUNK = 1024
SHORT_NORMALIZE = (1.0/32768.0)
FORMAT = pyaudio.paInt16
CHANNELS = 2
RATE = 44100
RECORD_SECONDS = 5
WAVE_OUTPUT_FILENAME = "C://Users/kokor/Desktop/"+ str(target) + ".wav"

p = pyaudio.PyAudio()

def get_rms(block):
"""Get root mean square as a measure of loudness"""
count = len(block)/2
format = "%dh" % (count)
shorts = struct.unpack( format, block )
sum_squares = 0.0
for sample in shorts:
n = sample * SHORT_NORMALIZE
sum_squares += n*n
return math.sqrt( sum_squares / count )

stream = p.open(format=FORMAT,
channels=CHANNELS,
rate=RATE,
input=True,
frames_per_buffer=CHUNK)

print("* recording")
frames = []

start_time = clock.time()
while True:
if clock.time() - start_time >= timeout:
response_time = timeout
response = u'timeout'
var.loudness = None
break

try:
block = stream.read(CHUNK)
frames.append(block)
except IOError as e:
print(e)

loudness = get_rms(block)
print(loudness)

if loudness > sound_threshold:
var.response_time = clock.time() - start_time
var.response = u'detected'
var.in_clock_time = clock.time()
var.start = start_time
var.loudness = loudness
break

print(response)
print(response_time)
print("* done recording")

stream.stop_stream()
stream.close()
p.terminate()

wf = wave.open(WAVE_OUTPUT_FILENAME, 'wb')
wf.setnchannels(CHANNELS)
wf.setsampwidth(p.get_sample_size(FORMAT))
wf.setframerate(RATE)
wf.writeframes(b''.join(frames))
wf.close()

eduard · August 2023

Hi,

It is a bit hard to judge because the indentation in your code is incorrect, but I suppose the problem is this part:

if loudness > sound_threshold:
    var.response_time = clock.time() - start_time
    var.response = u'detected'
    var.in_clock_time = clock.time()
    var.start = start_time
    var.loudness = loudness
    break

Once you hit this if statement, you will leave the while loop (because of the break statement) and continue with writing the file, so the recording stops at voice onset. What you probably want is either to wait until the loudness is below threshold again (the person stopped talking?) and only then stop the recording, or to wait a fixed amount of time before you leave the while loop. Do you see what I mean?
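The first option (keep recording until the speaker has gone quiet again) could look roughly like the sketch below. It is only an illustration: find_onset_and_offset, silence_ms and the (time, loudness) pairs are made-up names and data, not part of your script or of OpenSesame. In your loop, the pairs would come from clock.time() and get_rms(block).

```python
def find_onset_and_offset(samples, threshold, silence_ms):
    """Return (onset_time, stop_time) from (time_ms, loudness) pairs.

    onset_time: time of the first sample above threshold (None if never).
    stop_time: time at which loudness has stayed at or below threshold
    for silence_ms after the onset (None if that never happens).
    """
    onset_time = None
    quiet_since = None
    for t, loudness in samples:
        if loudness > threshold:
            if onset_time is None:
                onset_time = t      # voice onset: log it, but keep going
            quiet_since = None      # speech (re)started, reset silence timer
        elif onset_time is not None:
            if quiet_since is None:
                quiet_since = t     # first quiet sample after speech
            if t - quiet_since >= silence_ms:
                return onset_time, t
    return onset_time, None


# Hypothetical data: speech from 20 to 30 ms and again at 60 ms.
samples = [(0, 0.0), (10, 0.0), (20, 0.5), (30, 0.6), (40, 0.0), (50, 0.0),
           (60, 0.7), (70, 0.0), (80, 0.0), (90, 0.0), (100, 0.0)]
print(find_onset_and_offset(samples, threshold=0.1, silence_ms=30))
```

The short pause at 40 to 50 ms does not end the recording because it is shorter than silence_ms; only the longer silence after 60 ms does.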

"If I try to automatically log all the variables in a logger"

Not sure what you mean by automatically logging all the variables. In any case, try logging only some of the variables at a time to find out which variable causes the problem. Once you know, you might be able to figure out why it is a problem and whether and how you can fix it.
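One way to narrow it down, sketched under assumptions: the error message says that some value is raw bytes that are not valid UTF-8, so you could check which values fail to decode. The helper find_undecodable and the example dictionary are hypothetical; how you collect your variables into a dict depends on your setup and is not shown here. Raw audio data (such as block or frames) would be a plausible culprit, but that is a guess.

```python
def find_undecodable(variables, encoding="utf-8"):
    """Return the names of bytes values that cannot be decoded with `encoding`."""
    bad = []
    for name, value in variables.items():
        if isinstance(value, (bytes, bytearray)):
            try:
                value.decode(encoding)
            except UnicodeDecodeError:
                bad.append(name)
    return bad


# Hypothetical snapshot of variables, for illustration only.
example = {
    "response": "detected",
    "loudness": 0.01,
    "block": b"\xf7\x00",   # raw audio bytes, not valid UTF-8
    "label": b"ok",         # bytes, but decodable
}
print(find_undecodable(example))
```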

Hope this helps,

Eduard

cltsson · August 2023

Ah thanks for answering!

You mean adding a time.sleep() before the break statement? I'll give that a go!

eduard · August 2023

That by itself won't do. You will postpone breaking out of the loop, but you will also completely halt your experiment during the sleep, so nothing else will be recorded. You want your loop to proceed normally until a certain amount of time has passed. So instead of breaking out of the loop when the voice is detected, you start a timer, check it on every iteration until some waiting period has elapsed, and only then break out of the loop.
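A rough sketch of that idea, under assumptions: run_voice_key, post_onset_wait and FakeMic are names I made up for illustration. FakeMic stands in for the microphone and the clock so that the example runs without any hardware; in your script, read_loudness would read a block, append it to frames and return get_rms(block), and now would be clock.time.

```python
def run_voice_key(read_loudness, now, threshold, timeout, post_onset_wait):
    """Poll loudness until timeout, or until post_onset_wait ms after onset."""
    start_time = now()
    onset_time = None
    while now() - start_time < timeout:
        loudness = read_loudness()          # keeps recording on every pass
        if onset_time is None and loudness > threshold:
            onset_time = now()              # start the timer instead of breaking
        if onset_time is not None and now() - onset_time >= post_onset_wait:
            break                           # waiting period is over
    if onset_time is None:
        return "timeout", timeout
    return "detected", onset_time - start_time


class FakeMic:
    """Simulated microphone and clock: each read advances time by step_ms."""

    def __init__(self, levels, step_ms=10):
        self.levels = iter(levels)
        self.t = 0
        self.step_ms = step_ms

    def now(self):
        return self.t

    def read_loudness(self):
        self.t += self.step_ms
        return next(self.levels, 0.0)       # silence once the levels run out


mic = FakeMic([0.0, 0.0, 0.5, 0.5, 0.0])
print(run_voice_key(mic.read_loudness, mic.now,
                    threshold=0.1, timeout=2000, post_onset_wait=50))
```

The response time is fixed at the moment of detection, while the loop (and therefore the recording) carries on for post_onset_wait milliseconds afterwards.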

Does this make sense?

Eduard

cltsson · August 2023

Thanks for this, it got me on the right track (well, I think). I think I now get what I want, but I am not sure it's the best way to accomplish this :)

I moved the calculation of the variables out of the loop and now do it "offline", in between trials. Does that make sense?

import pyaudio
import struct
import math
import wave
import pandas as pd



# A low threshold increases sensitivity, a high threshold
# reduces it.
sound_threshold = 0.0005
# Maximum response time
timeout = 2000




FORMAT = pyaudio.paInt16
SHORT_NORMALIZE = (1.0/32768.0)
CHANNELS = 2
RATE = 44100
INPUT_BLOCK_TIME = 0.01
INPUT_FRAMES_PER_BLOCK = int(RATE*INPUT_BLOCK_TIME)
filename = "C://Users/kokor/Desktop/"+ str(target) + ".wav"
chunk=1024

p = pyaudio.PyAudio()
def get_rms(block):
    """Get root mean square as a measure of loudness"""

    count = len(block)/2
    format = "%dh" % (count)
    shorts = struct.unpack( format, block )
    sum_squares = 0.0
    for sample in shorts:
        n = sample * SHORT_NORMALIZE
        sum_squares += n*n
    return math.sqrt( sum_squares / count )

# Open the mic
stream = p.open(
format=FORMAT,
channels=CHANNELS,
rate=RATE,
input=True,
input_device_index=0,
frames_per_buffer=INPUT_FRAMES_PER_BLOCK
)

# Listen for sounds until a sound is detected or a timeout occurs.
print("* recording")
frames = []
list_ldn = []
list_ct = []

start_time = clock.time()
while clock.time() - start_time <= timeout:
    try:
        block = stream.read(chunk)
        frames.append(block)
    except IOError as e:
        print(e)
    loudness = get_rms(block)
    list_ldn.append(loudness)
    list_ct.append(clock.time())

df = pd.DataFrame({'clock_time':list_ct,'loudness':list_ldn})
try:
    var.test = (df.loc[df['loudness'] > 0.003, 'clock_time'].iloc[0]) - start_time
except IndexError:
    var.test = timeout

# Close the audio stream
stream.stop_stream()
stream.close()
p.terminate()

wf = wave.open(filename, 'wb')
wf.setnchannels(CHANNELS)
wf.setsampwidth(p.get_sample_size(FORMAT))
wf.setframerate(RATE)
wf.writeframes(b''.join(frames))
wf.close()

cltsson · August 2023

import pyaudio
import struct
import math
import wave
import pandas as pd


# A low threshold increases sensitivity, a high threshold
# reduces it.
sound_threshold = 0.0005
# Maximum response time
timeout = 2000


FORMAT = pyaudio.paInt16
SHORT_NORMALIZE = (1.0/32768.0)
CHANNELS = 2
RATE = 44100
INPUT_BLOCK_TIME = 0.01
INPUT_FRAMES_PER_BLOCK = int(RATE*INPUT_BLOCK_TIME)
filename = "C://Users/kokor/Desktop/"+ str(target) + ".wav"
chunk=1024


p = pyaudio.PyAudio()
def get_rms(block):
 """Get root mean square as a measure of loudness"""
 count = len(block)/2
 format = "%dh" % (count)
 shorts = struct.unpack( format, block )
 sum_squares = 0.0
 for sample in shorts:
  n = sample * SHORT_NORMALIZE
  sum_squares += n*n
 return math.sqrt( sum_squares / count )


# Open the mic
stream = p.open(
 format=FORMAT,
 channels=CHANNELS,
 rate=RATE,
 input=True,
 input_device_index=0,
 frames_per_buffer=INPUT_FRAMES_PER_BLOCK
 )

# Record until the timeout occurs; the voice onset is computed afterwards.
print("* recording")
frames = []
list_ldn = []
list_ct = []

# Start a timer until timeout and compute loudness/clocktime for each block - append to lists
start_time = clock.time()
while clock.time() - start_time <= timeout:
 try:
  block = stream.read(chunk)
  frames.append(block)
 except IOError as e:
  print(e)
 loudness = get_rms(block)
 list_ldn.append(loudness)
 list_ct.append(clock.time())


# Close the audio stream
stream.stop_stream()
stream.close()
p.terminate()


wf = wave.open(filename, 'wb')
wf.setnchannels(CHANNELS)
wf.setsampwidth(p.get_sample_size(FORMAT))
wf.setframerate(RATE)
wf.writeframes(b''.join(frames))
wf.close()


# merge the 2 lists to a df and compute response_time & response
df = pd.DataFrame({'clock_time':list_ct,'loudness':list_ldn})
try:
 var.response_time = (df.loc[df['loudness'] > sound_threshold, 'clock_time'].iloc[0]) - start_time
 var.response = u'detected'
# use except to deal with errors when no response is detected
except IndexError:
 var.response_time = timeout
 var.response = u'timeout'
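For reference, the same "offline" step can be sketched without pandas. This is only an illustration: first_onset and the example numbers are made up. The sketch also computes the duration of one audio block, because, as far as I can tell, stream.read(chunk) only returns once the whole block is available, so each timestamp is taken at roughly the end of its block and the true onset may lie up to about one block earlier.

```python
def first_onset(times, loudnesses, threshold, start_time, timeout):
    """Return (response, response_time) for the first block above threshold."""
    for t, loudness in zip(times, loudnesses):
        if loudness > threshold:
            return "detected", t - start_time
    return "timeout", timeout


# Duration of one block in ms, using the values from the script above.
CHUNK = 1024
RATE = 44100
block_ms = 1000.0 * CHUNK / RATE   # about 23 ms

# Hypothetical timestamps (ms) and loudness values, for illustration only.
list_ct = [110, 133, 156]
list_ldn = [0.0001, 0.002, 0.004]
print(first_onset(list_ct, list_ldn, threshold=0.0005,
                  start_time=100, timeout=2000))
print(round(block_ms, 2))
```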

eduard · August 2023

Hi,

Optimality is overrated. As long as the code does what it is supposed to do and doesn't cause unbearable delays, you're good to go, I would say!

Good luck,

Eduard

cltsson · August 1
