Speech Recognition Using FPGA Technology
- June 23rd, 2007
- Posted in FPGA . My Projects
- By Carlitos
My friends David and Kanwen and I implemented a speech recognition system on an FPGA development board (Altera DE2 Board) for the Design Project course at McGill (ECSE 494). We did this in two steps: first we wrote a prototype of the algorithm in MATLAB (I may port it to Octave), and then we wrote the hardware description for the FPGA.
MATLAB Prototype
Inspired by the algorithm described in a site from the University of Toronto, we wrote two MATLAB scripts: train.m and recogniz.m.
train.m deals with the training phase, in which many versions of a sound (a spoken word, for instance) are input and averaged in the frequency domain, thus generating the sound’s “reference fingerprint”.
recogniz.m deals with the recognition phase, where a sound is input, translated to the frequency domain (i.e. its fingerprint is generated), and compared to the reference fingerprint by computing the Euclidean distance between them (as if both fingerprints were vectors).
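As a rough illustration of the two phases (this is not the original train.m or recogniz.m code), here is a minimal Python sketch. It assumes the fingerprint is the magnitude of a plain DFT and that training averages those magnitudes bin by bin; the function names and the synthetic test tones are made up for the example.

```python
import cmath
import math

def fingerprint(samples):
    """Magnitude spectrum of a real signal via a naive DFT (first half of the bins)."""
    n = len(samples)
    mags = []
    for k in range(n // 2):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(samples))
        mags.append(abs(acc))
    return mags

def train(recordings):
    """Average the fingerprints of several recordings of the same sound."""
    prints = [fingerprint(r) for r in recordings]
    return [sum(col) / len(prints) for col in zip(*prints)]

def distance(a, b):
    """Euclidean distance between two fingerprints treated as vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def tone(freq_bin, n=64, amp=1.0):
    """Hypothetical test signal: a sine wave that falls exactly on one DFT bin."""
    return [amp * math.sin(2 * math.pi * freq_bin * i / n) for i in range(n)]

# "Train" on three slightly different versions of the same tone.
reference = train([tone(5), tone(5, amp=1.1), tone(5, amp=0.9)])

same = distance(fingerprint(tone(5)), reference)
other = distance(fingerprint(tone(12)), reference)
print(same < other)  # True: the matching sound is closer to the reference
```

In a real recognizer the smallest distance over several reference fingerprints (one per word) would pick the word, usually with a cut-off so that unknown sounds are rejected.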
Both scripts need to detect the beginning of the sound (i.e. tell when the spoken word begins). They do so by averaging two adjacent groups of 1024 sound samples (in the time domain) and computing the difference between the averages. If there is a sudden increase in the sound’s amplitude, the difference will be significant, and the sound is assumed to start after that sudden increase. The sound’s length is fixed to 1.024 s (see the picture below for more details).
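The detection step can be sketched in a few lines of Python. This is only an approximation of the idea under some assumptions: the windows are averaged over absolute amplitude (a plain average of signed audio would hover around zero), the windows do not overlap, and the threshold value of 10 is arbitrary. The names find_onset and window_mean are hypothetical.

```python
def window_mean(samples, start, size):
    """Mean absolute amplitude of samples[start:start + size]."""
    window = samples[start:start + size]
    return sum(abs(s) for s in window) / len(window)

def find_onset(samples, size=1024, threshold=10.0):
    """Return the index where the sound is assumed to start, or None if no jump is found."""
    for start in range(0, len(samples) - 2 * size + 1, size):
        previous = window_mean(samples, start, size)
        current = window_mean(samples, start + size, size)
        # A sudden increase between adjacent windows marks the start of the sound.
        if current - previous > threshold:
            return start + size
    return None

silence = [1, -1] * 1024   # 2048 samples of low-level background noise
word = [50, -50] * 1024    # 2048 samples standing in for a spoken word
print(find_onset(silence + word))  # 2048
```

Because the threshold is a fixed number, a quiet recording may never produce a large enough jump, in which case the function returns None.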
Note that the scripts use 16-bit WAV files sampled at 22050 Hz as input (this is the default Windows Sound Recorder output; I could not record in Linux because the mic did not want to work). The sound input is downsampled and quantized to bring it down to 8 bits/sample at 5 kHz for processing.
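For illustration, the downsampling and quantization could look roughly like the Python sketch below. The exact method used in the scripts is not described here, so this assumes a simple decimation by 4 with block averaging (22050 Hz / 4 is about 5.5 kHz, so only close to 5 kHz) and a quantization that keeps the top 8 bits of each 16-bit sample. The function names are hypothetical.

```python
def downsample(samples, factor=4):
    """Decimate by averaging each block of `factor` samples (a crude low-pass filter)."""
    usable = len(samples) - len(samples) % factor
    return [sum(samples[i:i + factor]) // factor for i in range(0, usable, factor)]

def quantize_to_8_bits(samples):
    """Map signed 16-bit samples (-32768..32767) to signed 8-bit values (-128..127)."""
    return [s >> 8 for s in samples]

pcm16 = [0, 256, 512, 768,
         32767, 32767, 32767, 32767,
         -32768, -32768, -32768, -32768]
print(quantize_to_8_bits(downsample(pcm16)))  # [1, 127, -128]
```

Averaging before dropping samples reduces aliasing a little; a proper resampler with a real low-pass filter would do better.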
Also, you might encounter problems if the sound file is too short (it should last more than 1.1 s) or if its volume level is too low (this happens because the detector threshold is fixed).
Hardware Implementation
Once we had played enough with the MATLAB prototype parameters, we mapped the algorithm into combinational logic and finite state machines (FSM) by breaking it down into independent modules.
For more details about the hardware implementation and the project in general you can read the full project report. You may also want to see the slides for a presentation we did (below).
Unfortunately, I cannot post the project files (i.e. VHDL code).
Here is a little video demo, enjoy:
Note that all the documentation for this project was done using the very excellent OpenOffice.org.









hello
I’m elio, philippines…
I’m having a hard time finding a good reference book that covers FPGAs thoroughly. I saw your project and was amazed at how it works. What books should I get to understand it better? If you have the books, can I download them too?
Thank you…
God Bless
I don’t know of any book specifically about FPGAs. The book I used for the Digital System Design class (the one where I learned to use FPGAs) was Fundamentals of Digital Logic with VHDL Design, Second Edition. I don’t really remember reading it, but it looks like I did. I think you could try to get it online (although that may be illegal). Please send me an email if you need more info on the book.
thank you…
By the way, may I know who the author of the book is?
The authors are Stephen Brown and Zvonko Vranesic. Here is a link to the book’s website: http://highered.mcgraw-hill.com/sites/0072460857/
Hi, I am Sarfaraz. I read this project and found it very good. Taking an idea from it, I now want to implement this speech recognition method using VHDL and an FPGA for car locking, unlocking and ignition, which is my final year project. Please help me with this. I hope you will.
my email addresses are
[email protected]
[email protected]
Hi Carlos! I am new to FPGA (had just started early this year) and I find your blog interesting and helpful. btw, I had added your blog site in my links. Here’s my url
http://fpga-dsp-scratch.blogspot.com/
Thank you so much for sharing! Your site helps many
.
Sir, I have read your entire project and found it very interesting. I am also working on FPGA-based voice recognition, using the same Altera DE2 board. I am having problems programming the codec: we have to load the control word into the codec. How can I check the output of the ADCDAT pin, where the digital data should appear? Please help me in this regard…
Thank you very much, very clear and easy code.
You are very welcome!
I was wondering if you knew how to save pictures taken with the de2 d5m ltm combo onto your computer.
hi!
i’m Nhat, im from vietnam!
I’m working with FPGAs and am currently researching speech recognition, using VHDL code for the FFT and the ADC.
Because I’m a newbie, can you help me code these?
Thank you!
Sorry for the late response. I have been very busy and without regular access to a computer lately.
I am afraid I never tried such a device. It seems that they have some Verilog sample code for it on the manufacturer’s website.
I am sorry I cannot be of much help. Good luck with your project.
Hi Nhat,
For the FFT, depending on the platform you are using, there are IP modules that you can import into your project. There is also opencores.org, which features lots of useful modules.
As for the ADC, you will need to interface with your hardware ADC and the best way to get started on that is to read your ADC datasheet.
I hope this helps and good luck.
I am a Computer Engineering student. I appreciate the work you have done; it is an excellent job, but the system is still not 100% accurate. How can that be achieved? Please help.
@fayaz kamali
I’m just an electronic engineering student, and I am doing a similar project now. Here are some ideas I can share with you.
I think you should improve your filter. Try to design a filter that closely matches the sample voice clip that will be compared with the incoming one.
Hey, I liked your project and wish to implement it. I have a Cyclone II FPGA DE2 board, but I have no knowledge of VHDL. Will you please provide me the VHDL code for speech recognition? Please mail it to my mail ID [email protected]
thank you
I have the same board you used.
Hi, that was a nice job. I am using the DE2 for image transfer. Can you help me use the SD card module to store and restore images? Thanks.
Hi Carlos,
Greeting from the Caribbean. I find your speech recognition project very interesting and nicely done. Is there a way to share/send-by-email your Quartus II project so I can try it on my DE2 board…plz, Thanks, GCG
I think this article made some interesting points, I read a textbook directly related to this topic, its called Digital Systems Design Using VHDL by , I found my used copy for less than the bookstores at http://www.belabooks.com/books/9780534384623.htm
@Alireza
Alireza, I do not have a lot of experience with SD cards but I think they use an SPI bus so you will require an SPI controller in order to be able to write or read from them.
(Sorry for the late response)
Hi Carlitos,
Can you please send me VHDL code, i need it in my project
Thank you in advance.
my Email is:
[email protected]
@kahtan
I get this question very often so I think I have to answer it publicly once and for all: no.
Allow me to elaborate: I do not exclusively own the copyright for this work and thus I cannot share it freely. Also, I do not know where my source files are (they are in some backup somewhere). Finally, I do believe that implementing such a system is a very rewarding experience and the instant gratification of having the source code might ruin the learning process.
Sorry for the negative response. Rest assured that I will answer any specific question that you (or anybody else) might have, within the limits of my knowledge.
I want to do a project on speech recognition using FPGA technology. I would like some guidance regarding this project.
Sir, I have read your entire project and found it very interesting. I am also working on FPGA-based speech recognition, using the same Altera DE2 board. I am having problems with the VHDL code for this project. Can you please help me with this and tell me how to develop the VHDL code?
my emailid is
[email protected]
@varalakshmi
@swapniln joshi
As always, I will be pleased to answer any specific questions you may have.
Some tips: when writing VHDL code, remember you are actually describing a circuit and not a program. Also, stick to standards, make everything modular and parameterizable (no hard-coded constants) and KISS (I guess this applies to programming in general).
Cheers and good luck.
Actually, my problem is the mapping from MATLAB to combinational logic and FSMs using Quartus; I am having trouble with the VHDL. Please guide me. I have no guidance but I want to do the project. If you have any details, send them to varalakshmi8882yahoo.co.in
Sorry, [email protected] is correct; the mail ID I sent previously is wrong.
Your project is nice. I am also doing that project; please give me some guidance.
I executed the MATLAB program, but how do I give it the input?
I downloaded a wave file. When I gave it as input, it reported an invalid format. Please help.
Hi,
I am working on speech-to-text conversion based on SOPC. It uses the audio codec on the DE2 for recording and sampling a signal. I don’t understand how to configure it or how to use it. Please help me.
Hi,
Will you please give the details of the detection part, i.e. how it detects the input word and so on?
@siri
The detection is rather simple. As shown in the slides, we used an averaging window. Whenever the average of the current window is larger by a certain threshold than the previous average, we consider a word is being said to the system.
I hope this helps.
please help..
I’m a student. I have read your project entirely, and I have a Spartan-3E 500 FPGA board. Now I want to implement speech recognition on my board for my final project. Could you give me some guidance? Anything?
Please write to my email: [email protected]
Thank you so much.
@rony
I’ll gladly answer any specific question that you (or anybody else for that matter) might have with respect to the project.
@Carlitos
Thanks for your reply. Can you start by helping me, step by step, with how to build it?
First, as you might know, the Spartan-3E does not have a port for a mic, so where should I connect the input signal from my mic? There are a few options, but which is best? Or do I have to connect it directly to an ADC?
If I want it to recognise some words to control something, maybe 4 words, which filter should I choose: FFT, DTW or HMM?
Which file should I create first, for convenience?
Is there any code that I can use as a guide?
I look forward to your reply…
Thanks for your attention.
@rony
1- You will receive your step-by-step guide in the mail in two to three business days. Would you like a cup of tea with that? Seriously though, I am a blogger, not your personal electrical engineer.
2- Yup, you need an ADC or, ideally, an audio codec for a mic.
3- You know best what algorithm best suits your application.
4- As stated before, I will not provide any sample code.
This project is very nice.
I need help with this project.
It is my final year project,
so I request you, sir, please
help.
Hi, I have been able to get the ADC data using the audio codec. Do I now need to store the data? How can I record and store it?
@Rushil
You need a memory device such as some SRAM or Flash.
Thanks for your reply, that helps me a lot…
I think it would be a good idea to put a tutorial on your blog, though, since you are an expert on FPGAs.
Hello, I am a Moroccan student learning VHDL. My teacher suggested I work on a project similar to the one on your site. I am just starting the project and hope you can help me: please send me any documents you have on it and tell me how you got started.
best regards.
@rony
I did this project a long time ago. Let’s just say I am a bit rusty on the workings of the DE2.
Also, writing a tutorial would take a lot of work and would be terribly specific and thus of no value for the great majority of people.
@badr
All the info I have is already posted on the website. Enjoy and good luck with your project!
Hello, your project is really important. I am a beginner in this field and my teacher suggested I work on a similar project. Could you please give me the VHDL code for the blocks of your architecture?
best regards.
@Carlitos
If you are not very busy, I’m still hoping for your email…
Thanks a lot.
@rony
Is there any question I missed?
Hi Mr. Carlos. I am reading your project report, but I have not understood the detection part: what do W1, W2 and Th represent? And why is the word considered to have started when the difference is greater than Th? I hope you will help me with that. Thank you in advance.
best regards.
@badr
If the average of Window 1 (W1) is greater (by a certain threshold Th) than the average of Window 2 (W2), then we consider that the sound started. Else, it is still silence (or background noise).
I hope this helps.
Hi Mr. Carlos,
Please, for the ADC, why did you use 8 bits of precision?