XMOS VocalFusion Dev Kit User manual

VOCALFUSION DEV KIT

USER GUIDE

Welcome to the VocalFusion Dev Kit for evaluation of the XMOS XVF3510 far-field voice processor. The XVF3510 voice processor interfaces to a 2-mic array, and uses the XMOS voice algorithms to capture commands several metres away from the device and deliver a voice stream optimised for an Automatic Speech Recognition (ASR) engine running on the system application processor. The algorithms include Acoustic Echo Cancellation and an Interference Canceller for static point noise sources, making the XVF3510 the ideal solution for smart TVs, set-top boxes and TV accessories that output multi-channel audio.

VOCALFUSION DEV KIT - USER GUIDE 2

CONTENTS

1. VocalFusion Dev Kit features
    1.1. VocalFusion Dev Kit firmware and control utility
2. Setting up the VocalFusion Dev Kit
    2.1. Amazon AVS support
    2.2. Choosing your speakers
3. Getting started - capture and playback over USB
    3.1. USB In/Out hardware configuration
    3.2. Configuring the sound settings
        3.2.1. Windows sound settings
    3.3. Mac OSX sound settings
        3.3.1. Linux Host utility
4. Recording the captured audio
5. Alternative hardware configurations
        5.3.1. USB/Analog Line-In configuration
        5.3.2. I2S slave configuration
6. Further information
    6.1. Documentation
    6.2. Updating the XVF3510 Firmware
    6.3. Hardware/firmware support

1. VOCALFUSION DEV KIT FEATURES

The VocalFusion Dev Kit includes:

• 2-mic linear array board with Infineon IM69D130 MEMS mics

• XVF3510 base board with XVF3510 voice processor

• RPi HAT board for connection to a Raspberry Pi 3

• XTAG debug adapter, ribbon cable to the mic array board and ribbon cable to the RPi HAT board

In addition you will need:

• Powered speakers with a 3.5mm input jack - see Section 2.2 for details on suitable speakers

• Music input source such as laptop/PC or Raspberry Pi - see Section 3 for connection details

• An audio recording application such as Audacity (http://www.audacityteam.org)

• Windows users - USB driver installer such as Zadig (https://zadig.akeo.ie)

• I2S configuration - Raspberry Pi 3 with RPi power supply, 16GB SD card running NOOBS, HDMI monitor and USB keyboard

1.1. VOCALFUSION DEV KIT FIRMWARE AND CONTROL UTILITY

The kit is shipped with pre-installed firmware for a USB In/Out configuration, which can be used out-of-the-box to demonstrate voice capture and playback. The kit also supports an I2S Slave configuration, in which the XVF3510 device is connected to an application processor (Raspberry Pi 3), and a hybrid USB/I2S configuration. See Table 1 for details of the configurations:

Table 1: Supported hardware configurations and files

Configuration        Audio Out  Control  Reference  Configuration File
USB In/Out           USB        USB      USB        xk_vf3510_l71_usb (pre-installed)
I2S Slave            I2S        I2C      I2S        xk_vf3510_l71_i2s_slave
USB Out / Analog In  I2S        USB      I2S        xk_vf3510_l71_hybrid_usb_i2s

Each configuration is provided by a firmware image; see www.xmos.com/xvf3510 for further details.

2. SETTING UP THE VOCALFUSION DEV KIT

Follow the steps below to set up and test the VocalFusion Dev Kit.

Table 2: Configuration steps

Step                      USB In/Out   I2S Slave      USB / Analog In
1. Configure hardware     Section 3.1  Section 5.3.2  Section 5.3.1
2. Test capture/playback  Section 4    Section 4      Section 4

Note: The I2S Slave and USB / Analog In configurations require a firmware update; see Section 6.2.

2.1. AMAZON AVS SUPPORT

Developers who want to integrate the XVF3510 kit with the Amazon Alexa Voice Service (AVS) should refer to the VocalFusion Dev Kit for Amazon AVS, which includes details about how to install the AVS SDK and Sensory TrulyHandsfree wake word engine on the Raspberry Pi and use the I2S Slave configuration.

2.2. CHOOSING YOUR SPEAKERS

The choice of stereo speakers can greatly affect overall system performance.

• The amplifier in the speakers should have linear gain. Non-linear gain (e.g. soft clipping) should be disabled or avoided.

• Any audio processing available on the speakers should be disabled.

• For low quality speakers it is best to use low volume settings to avoid non-linear distortions of the reference signal.

The Logitech Z130 stereo speakers, for example, work well with the VocalFusion Dev Kit.

Place the kit on a horizontal surface, for example on a table at the edge of the room. Place the powered speakers on either side of the kit, making sure they don’t point directly at the microphone array.

3. GETTING STARTED - CAPTURE AND PLAYBACK OVER USB

The quickest way to test the microphone array capture is to use the integrated VocalFusion USB interface, with the default USB In/Out firmware.

3.1. USB IN/OUT HARDWARE CONFIGURATION

The USB configuration uses the USB interface for the audio output, control and the AEC reference signal.

1. Connect the powered speaker to the 3.5mm audio socket on the host laptop/PC.

2. Connect the host PC to the micro-USB socket on the XVF3510 base board using the supplied cable.

3. Configure the sound settings for your platform.

3.2. CONFIGURING THE SOUND SETTINGS

The XVF3510 uses Adaptive USB transport, which means that developers can use the USB configuration to test on all platforms.

3.2.1. WINDOWS SOUND SETTINGS

1. Open the Sound settings window (click the Start menu and search for Sound).

2. On the Playback tab, make sure that the powered speakers are set as the default device.

3. On the Recording tab, select Show Disabled Devices, right-click on the Stereo Mix device (may be called “Wave Out Mix”, “Mono Mix”, or “Stereo Mix”) and select Enable.

4. Double-click the Stereo Mix device to open the Properties window.

5. Check the Listen to this device checkbox, and select USB Audio Device from the dropdown list. If it is not present, select “XVF3510 (UAC1.0) Adaptive”.

6. Right-click on Stereo Mix and choose Set as Default Device.

You are now ready to record the audio captured by the XVF3510 - see Section 4.

3.3. MAC OSX SOUND SETTINGS

1. Open Audio MIDI setup (Applications/Utilities).

2. Click the plus symbol in the bottom left corner to add a new device, select Create Multi-Output Device.

3. Select Built-in Output and XVF3510 device.

4. Select the new Multi-Output Device in the device list.

5. Click the gear icon at the bottom of the window to open the Options menu and select Use This Device For Sound Output.

6. Select XVF3510 from the Master Device drop-down list.

7. Select Drift Correction for the Built-in Output. The final configuration should look like this:

8. Close the Audio Devices window.

You are now ready to record the audio captured by the XVF3510 - see Section 4.

3.3.1. LINUX HOST UTILITY

Issues have been found when using recent versions of the Linux kernel. This is due to a USB driver mismatch.

The following steps have been tested successfully on Ubuntu 18.04 kernel version 4.15.0.

1. Install paprefs using your package manager.

2. Open paprefs and click the Simultaneous Output tab.

3. Select Add virtual output device for simultaneous output on all local sound cards option.

4. Close paprefs.

5. Open Terminal and run the following command to restart the pulseaudio server:

pulseaudio -k

6. Run the following command to check for the sink named combined.

pactl list sinks

7. Run the following command to set the combined output as the default sound output.

echo "set-default-sink combined" | pacmd

You are now ready to record the audio captured by the XVF3510 - see Section 4.
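Steps 5 to 7 above can also be scripted. The following is a minimal sketch, assuming a PulseAudio system where `pactl` is available. The helper name `has_combined_sink` is hypothetical; it reads the tab-separated output of `pactl list short sinks` on standard input and succeeds only if a sink named combined is listed. The usage in the comments relies on `pactl set-default-sink`, which is an alternative to piping the command into `pacmd`.

```shell
#!/bin/sh
# has_combined_sink: hypothetical helper (not part of PulseAudio or the XMOS tools).
# Reads `pactl list short sinks` output on stdin; each line is tab-separated:
#   index, name, driver, sample spec, state
# Exits 0 if a sink whose name is exactly "combined" is present, 1 otherwise.
has_combined_sink() {
    awk -F '\t' '$2 == "combined" { found = 1 } END { exit !found }'
}

# Usage on a live system (requires a running PulseAudio server):
#   pulseaudio -k
#   if pactl list short sinks | has_combined_sink; then
#       pactl set-default-sink combined
#   else
#       echo "combined sink not found; check the paprefs setting" >&2
#   fi
```

Checking for the sink before setting it as the default avoids silently leaving the previous output device selected if the paprefs option was not applied.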

4. RECORDING THE CAPTURED AUDIO

Make sure the XVF3510 kit and speakers remain static during testing. If they are moved to new positions, the adaptive algorithms will require a few seconds to adjust to the new audio environment.

1. Open a music player on the host PC (or RPi) and play a stereo music file.

Ensure that the default sound output device matches your evaluation setup. You should hear this music through the powered speakers. Adjust the volume using either the music player or the speakers.

2. Open Audacity and configure it to communicate with the VocalFusion Dev Kit.

• USB Adaptive: Input Device: XVF3510 (UAC1.0) Adaptive

• I2S: Input Device: i2s_48k

3. Make sure that the number of recording channels is set to 2 (Stereo) in the Device Toolbar.

4. Set the Project Rate to 48000 Hz using the Audacity Selection Toolbar at the bottom of the screen.

5. Click the Record button (or press r) to start capturing the audio streamed from the XVF3510 device.

6. Start talking over the music content. Move around the room and continue talking. Then stop the music player.

7. Stop the Audacity recording by clicking the Stop button (or pressing space).

Audacity records a single stereo track streamed from the XVF3510 kit, which includes any extracted voice signal. To listen to the recorded voice you need to split the processed audio into separate tracks.

8. Click the dropdown menu next to Audio Track and select Split Stereo To Mono.

9. Click Solo on the left channel of the split processed audio. You may need to increase the gain slider.

10. Click on the Play button (or press space) to play back the processed audio.

11. You will hear only your voice.

• The playback music is removed by the Acoustic Echo Cancellation.

• Your voice is isolated by the Interference Canceller.

• Background noise is removed by the Noise Suppression algorithms.
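On a Linux host, the capture and channel split can also be done from the command line instead of Audacity. The following is a minimal sketch, assuming the ALSA `arecord` utility and SoX are installed. The helper names `run` and `capture_left_channel`, the file names `capture.wav` and `left.wav`, the 16-bit sample format and the device name are illustrative assumptions, not values taken from the XMOS documentation; `arecord -l` lists the capture devices actually present.

```shell
#!/bin/sh
# run: hypothetical helper. Prints the command when DRY_RUN=1, otherwise executes it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# capture_left_channel DEVICE SECONDS
# Records a 48 kHz stereo capture from DEVICE for SECONDS seconds, then writes
# the left channel (the processed audio, per step 9 above) to left.wav.
# S16_LE is an assumed sample format; adjust it if the device reports another.
capture_left_channel() {
    device=$1
    seconds=$2
    run arecord -D "$device" -f S16_LE -r 48000 -c 2 -d "$seconds" capture.wav &&
        run sox capture.wav left.wav remix 1
}

# Example (device name is a placeholder; find the real card with `arecord -l`):
#   capture_left_channel plughw:1,0 10
```

Setting `DRY_RUN=1` before the call prints the two commands without touching the audio hardware, which is a convenient way to check the device name and duration first.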

5. ALTERNATIVE HARDWARE CONFIGURATIONS

The VocalFusion Dev Kit supports two additional configurations - see Table 1. To use these configurations you will need to flash the XVF3510 with the correct configuration file - see Section 6.2.

5.3.1. USB/ANALOG LINE-IN CONFIGURATION

The USB/Analog Line-In configuration uses an I2S signal to input the AEC reference signal to the device, and the USB interface for the audio output and control. You need to update to the USB Out / Analog In firmware to use this configuration - see Section 6.2.

1. Connect the XVF3510 base board to the host PC using the supplied USB A to Micro B cable.

2. Connect the 3.5mm LINE IN socket on the XVF3510 base board to your host PC. You can also connect powered speakers to the LINE IN socket using a splitter cable.

5.3.2. I2S SLAVE CONFIGURATION

The I2S Slave configuration uses an application processor (Raspberry Pi - not provided in kit) to run an ASR client and wakeword software. The captured audio is streamed to the application processor using an I2S interface with the XVF3510 acting as the I2S Slave, using an I2C interface for control functions.

Developers working with the Amazon Alexa Voice Service (AVS) must use this configuration. XMOS provides a setup script to help install the necessary AVS SDK and software. Developers using a different voice service need to install their own I2S Master software implementation and wakeword technology on the RPi SD card and integrate with the XMOS firmware.

For details of using the I2S hardware and software configuration please refer to the VocalFusion Dev Kit for Amazon AVS User Guide.

6. FURTHER INFORMATION

6.1. DOCUMENTATION

Title                                          Download
XVF3510-QF60 Datasheet                         http://www.xmos.com/file/xvf3510-qf60-datasheet
XVF3510-QF60 Control Guide                     http://www.xmos.com/file/xvf3510-qf60-control-guide
XMOS Tools User Guide                          http://www.xmos.com/file/xmos-tools-user-guide
VocalFusion Dev Kit for Amazon AVS User Guide  http://www.xmos.com/file/vocalfusion-dev-kit-for-amazon-avs-user-guide

6.2. UPDATING THE XVF3510 FIRMWARE

The VocalFusion Dev Kit is shipped with firmware for each configuration, allowing you to start evaluating the XVF3510 device without installing additional software. If you need to reinstall or upgrade, you will need to download the XMOS software tools, which are available free of charge for registered XMOS users:

Download                             Description                          URL
XVF3510 firmware                     XE binaries for all configurations:  https://www.xmos.com/xvf3510
                                       xk_vf3510_l71_usb
                                       xk_vf3510_l71_i2s_slave
                                       xk_vf3510_l71_hybrid_usb_i2s
Adesto AT25SF161 configuration file  ADESTO_AT25SF161.spispec

The VocalFusion Dev Kit includes an XTAG debug adapter. Plug the adapter into the XSYS DEBUG connector on the XVF3510 base board, and then use a USB cable to connect the XTAG to your development system.

1. Extract the XE files from the firmware download.

2. Open the xTIMEcomposer command prompt:

• Windows: Start > XMOS > xCommand Prompt

• Mac OSX: Finder > Applications > XMOS_xTIMEcomposer > SetEnv.command

3. Go to the folder that contains the binary and enter:

xflash --no-compression <filename>.xe --spi-spec <filename>.spispec

4. Wait for the flashing program to complete. When finished, the following message will be displayed:

Site 0 has finished successfully

5. All done! Close the tools command prompt and unplug the XTAG adapter from the kit and PC.

Further information on using xflash and the XMOS tools is available in the xTIMEcomposer User Guide.
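To make the flashing command concrete, here is a minimal sketch that assembles it for each configuration in Table 1. The helper name `xflash_cmd` and the short keys `usb`, `i2s_slave` and `hybrid` are hypothetical, and the sketch assumes that each XE binary is named after its configuration file with a `.xe` extension and that `ADESTO_AT25SF161.spispec` is in the current folder. The helper only prints the command, so it can be reviewed before being run from the xTIMEcomposer command prompt.

```shell
#!/bin/sh
# xflash_cmd CONFIG
# Hypothetical helper: prints the xflash command line for one of the
# configurations listed in Table 1. CONFIG is one of: usb, i2s_slave, hybrid.
# Assumes the XE binary is named <configuration file>.xe (an assumption,
# check the extracted firmware download for the actual file names).
xflash_cmd() {
    case "$1" in
        usb)       image=xk_vf3510_l71_usb ;;
        i2s_slave) image=xk_vf3510_l71_i2s_slave ;;
        hybrid)    image=xk_vf3510_l71_hybrid_usb_i2s ;;
        *)
            echo "unknown configuration: $1" >&2
            return 1
            ;;
    esac
    # Same flags as the command shown above: no compression, explicit SPI spec.
    echo "xflash --no-compression $image.xe --spi-spec ADESTO_AT25SF161.spispec"
}

# Example: print the command for the I2S Slave configuration.
#   xflash_cmd i2s_slave
```

Printing rather than executing keeps the sketch safe to try on a machine without the XMOS tools installed; an unknown key returns a non-zero status instead of producing a partial command.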

Xmos Ltd. is the owner or licensee of this design, code, or Information (collectively, the “Information”) and is

providing it to you “AS IS” with no warranty of any kind, express or implied and shall have no liability in relation to its

use. Xmos Ltd. makes no representation that the Information, or any particular implementation thereof, is or will be

free from any claims of infringement and again, shall have no liability in relation to any such claims.

6.3. HARDWARE/FIRMWARE SUPPORT

This user guide applies to the following hardware and firmware versions:

Hardware  XVF3510 base board  1V1           In kit
          Microphone array    1V0           In kit
          Raspberry Pi HAT    1V0           In kit
          Raspberry Pi 3      3.0 or later  Not in kit
Firmware                      v0.10.0       Available on xmos.com
