
Speech Recognition Using FPGA Technology

My friends David and Kanwen and I implemented a speech recognition system on an FPGA development board (Altera DE2 Board) for the Design Project course at McGill (ECSE 494). We did this in two steps: first we wrote a prototype of the algorithm in MATLAB (I may port it to Octave), and then we wrote the hardware description for the FPGA.

MATLAB Prototype

Inspired by the algorithm described in a site from the University of Toronto, we wrote two MATLAB scripts: train.m and recogniz.m.

train.m deals with the training phase, in which many versions of a sound (a spoken word, for instance) are input and averaged in the frequency domain, thus generating the sound’s “reference fingerprint”.
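The training step can be sketched roughly as follows. This is not the original train.m: it is a minimal Python illustration that assumes the fingerprint is the average of the magnitude spectra of equal-length recordings. The function names are made up for this example, and the naive DFT is only there to keep the sketch self-contained.

```python
import cmath


def magnitude_spectrum(samples):
    """Return the magnitude of each DFT bin of `samples` (naive O(n^2) DFT)."""
    n = len(samples)
    spectrum = []
    for k in range(n):
        acc = sum(
            samples[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n)
        )
        spectrum.append(abs(acc))
    return spectrum


def train_fingerprint(recordings):
    """Average the magnitude spectra of several equal-length recordings.

    Assumption: the "reference fingerprint" is a bin-by-bin mean of the
    magnitude spectra. The original scripts may average differently.
    """
    if not recordings:
        raise ValueError("need at least one recording")
    length = len(recordings[0])
    if any(len(r) != length for r in recordings):
        raise ValueError("all recordings must have the same length")
    spectra = [magnitude_spectrum(r) for r in recordings]
    return [sum(bins) / len(spectra) for bins in zip(*spectra)]
```

In practice you would use a real FFT routine instead of the naive DFT; the averaging step stays the same.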

recogniz.m deals with the recognition phase, where a sound is input, translated to the frequency domain (i.e. its fingerprint is generated), and compared to the reference fingerprint by computing the Euclidean distance between them (as if both fingerprints were vectors).
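The comparison can be sketched like this. It is only an illustration in Python, not the original recogniz.m. The post does not say how the distance is turned into a yes/no decision, so the `max_distance` threshold and the `is_match` helper are assumptions.

```python
import math


def euclidean_distance(fingerprint_a, fingerprint_b):
    """Treat both fingerprints as vectors and return the distance between them."""
    if len(fingerprint_a) != len(fingerprint_b):
        raise ValueError("fingerprints must have the same number of bins")
    return math.sqrt(
        sum((a - b) ** 2 for a, b in zip(fingerprint_a, fingerprint_b))
    )


def is_match(candidate, reference, max_distance):
    """Hypothetical decision rule: accept when the distance is small enough."""
    return euclidean_distance(candidate, reference) <= max_distance
```

A smaller distance means the input sounds more like the trained word; the threshold would have to be tuned by experiment.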

Both scripts need to detect the beginning of the sound (i.e. tell when the spoken word begins). They do so by averaging two adjacent groups of 1024 sound samples (in the time domain) and computing the difference between the averages. If there is a sudden increase in the sound’s amplitude, the difference will be significant, and the sound is assumed to start after that sudden increase. The sound’s length is fixed to 1.024 s (see the picture below for more details).
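Here is a rough Python sketch of that detector. It makes a few assumptions that the post does not spell out: the "average" is taken over absolute sample values, the windows advance one full window at a time, and the sound is reported as starting at the first sample of the louder window. The default threshold is an arbitrary placeholder.

```python
def find_onset(samples, window=1024, threshold=10.0):
    """Return the index where the sound is assumed to start, or None.

    Compares the mean absolute amplitude of two adjacent windows and
    reports an onset when the second one is louder by more than `threshold`.
    """

    def mean_abs(chunk):
        return sum(abs(s) for s in chunk) / len(chunk)

    for start in range(0, len(samples) - 2 * window + 1, window):
        previous = mean_abs(samples[start:start + window])
        current = mean_abs(samples[start + window:start + 2 * window])
        if current - previous > threshold:
            # Assumed convention: the sound starts with the louder window.
            return start + window
    return None
```

Because the threshold is fixed, a quiet recording may never trigger the detector, which matches the low-volume problem mentioned in the post.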

Note that the scripts take 16-bit WAV files sampled at 22050 Hz as input (this is the default output of the Windows Sound Recorder; I could not record in Linux because the mic did not want to work). The input is downsampled and quantized to 8 bits/sample at 5 kHz for processing.
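One simple way to do this kind of reduction is sketched below in Python. It is an assumption, not the method used in the original scripts: it averages blocks of 4 samples (a crude low-pass filter) and keeps one value per block, which turns 22050 Hz into about 5.5 kHz rather than exactly 5 kHz, and it quantizes by dropping the low 8 bits of each signed 16-bit sample.

```python
def downsample_and_quantize(samples_16bit, factor=4):
    """Reduce signed 16-bit samples to signed 8-bit at 1/factor the rate.

    Each output value is the floor-average of `factor` input samples,
    shifted right by 8 bits so it fits in the range -128..127.
    Leftover samples that do not fill a whole block are dropped.
    """
    out = []
    for start in range(0, len(samples_16bit) - factor + 1, factor):
        block = samples_16bit[start:start + factor]
        averaged = sum(block) // factor  # crude anti-aliasing by averaging
        out.append(averaged >> 8)        # keep the top 8 bits
    return out
```

A proper resampler with a real low-pass filter would give cleaner results, but this shows the two operations involved.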

Also, you might encounter problems if the sound file is too short (it should last for more than 1.1 s) or if its volume level is too low (this happens because the detector threshold is fixed).

Hardware Implementation

Once we had played enough with the MATLAB prototype parameters, we mapped the algorithm into combinational logic and finite state machines (FSM) by breaking it down into independent modules.

For more details about the hardware implementation and the project in general you can read the full project report. You may also want to see the slides for a presentation we did (below).

Unfortunately, I cannot post the project files (i.e. VHDL code).

Here is a little video demo, enjoy:

Note that all the documentation for this project was done using the very excellent OpenOffice.org.

Categories: FPGA, Project
  1. elio
    April 15th, 2008 at 07:15 | #1

    hello
    I’m elio, philippines…
    I’m having a hard time looking for a good reference book discussing thoroughly FPGA. i saw your project and i was amazed on how it works. what books must I acquire to understand more the program. if you have the books, can i download it and have it too?
    Thank you…
    God Bless

  2. April 15th, 2008 at 22:58 | #2

    I don’t know of any book dealing specifically with FPGAs. The book I used for the Digital System Design class (the one where I learned to use FPGAs) was Fundamentals of Digital Logic with VHDL Design, Second Edition. I don’t really remember reading it but it looks like I did. I think you could try to get it online (although it may be illegal). Please send me an email if you need more info on the book.

  3. elio
    April 17th, 2008 at 03:16 | #3

    thank you…
    by the way may i know who’s the author of the book?

  4. April 17th, 2008 at 10:35 | #4

    The authors are Stephen Brown and Zvonko Vranesic. Here is a link to the book’s website: http://highered.mcgraw-hill.com/sites/0072460857/

  5. sarfaraz
    July 29th, 2008 at 01:29 | #5

    hy i am sarfaraz.I read and found this project very good and taken an idea from this project now I wants to implement this speech recognition method using VHDL and FPGA for car locking, unlocking and ignition which is my final year project.Please help me for this.I hope u will.
    my email addresses are
    sarfarazattariciit@gmail.com

    sarfaraz_attari_ciit@yahoo.com

  6. tahder
    September 13th, 2008 at 03:22 | #6

    Hi Carlos! I am new to FPGA (had just started early this year) and I find your blog interesting and helpful. btw, I had added your blog site in my links. Here’s my url
    http://fpga-dsp-scratch.blogspot.com/

    Thank you so much for sharing! Your site helps many :) .

  7. saurabh chadha
    January 12th, 2009 at 11:21 | #7

    sir i have read your entire project. i found it very interesting.am also working on voice recognition based on fpga..am using the same altera de2 board..am facing problems in programming the codec..we have to load the control word in codec..how to check the output of adcdat pin where digital data will be there.plz help me in this regard…

  8. May 3rd, 2009 at 01:23 | #8

    Thanks u vv much , vv clear and easy code .

  9. May 5th, 2009 at 13:54 | #9

    You are very welcome!

  10. baumer
    May 28th, 2009 at 09:27 | #10

    I was wondering if you knew how to save pictures taken with the de2 d5m ltm combo onto your computer.

  11. nhat
    June 22nd, 2009 at 13:17 | #11

    hi!
    i’m Nhat, im from vietnam!
    i’m working with FPGA, currently, i’m researching in speech recognize by using VHDL code for the FFT, ADC.
    because i’m newbie, can u help me code these?
    thanks you!

  12. June 29th, 2009 at 12:42 | #12

    Sorry for the late response. I have been very busy and without regular access to a computer lately.

    I am afraid I never tried such a device. It seems that they have some Verilog sample code for it on the manufacturer’s website.

    I am sorry I cannot be of much help. Good luck with your project.

  13. June 29th, 2009 at 12:47 | #13

    Hi Nhat,

    For the FFT, depending on the platform you are using, there are IP modules that you can import into your project. There is also opencores.org, which features lots of useful modules.

    As for the ADC, you will need to interface with your hardware ADC and the best way to get started on that is to read your ADC datasheet.

    I hope this helps and good luck.

  14. fayaz kamali
  15. Ryan
    July 30th, 2009 at 04:49 | #15

    @fayaz kamali

    im just a Electronic engineering student,now i am doing such project also, here is some ideal can share which u..

    i think u shuold improve ur filter. try to design a filter which is perfectly match to the sample voice clip which is going to be compare with the incoming one..

  16. mahesh
    August 6th, 2009 at 00:21 | #16

    hey i liked ur project i wish to implement it im having cyclon ii fpga processor de2 board but m not having knowledge of vhdl ..will u plz provide me the vhdl code for speach recog.. plz mait it on my mail id rhythmdevine88@gmail.com
    thank you

  17. mahesh
    August 6th, 2009 at 00:22 | #17

    im hving the same board u hav used

  18. Alireza
    August 19th, 2009 at 10:07 | #18

    Hi, that was a nice job. I am using DE2 for image transferring, can you help me how to use the SD Card module to store and restore images? thanks.

  19. August 28th, 2009 at 09:20 | #19

    Hi Carlos,

    Greeting from the Caribbean. I find your speech recognition project very interesting and nicely done. Is there a way to share/send-by-email your Quartus II project so I can try it on my DE2 board…plz, Thanks, GCG

  20. Amanda
    September 29th, 2009 at 23:51 | #20

    I think this article made some interesting points, I read a textbook directly related to this topic, its called Digital Systems Design Using VHDL by , I found my used copy for less than the bookstores at http://www.belabooks.com/books/9780534384623.htm


  21. November 10th, 2009 at 17:04 | #21

    @Alireza
    Alireza, I do not have a lot of experience with SD cards but I think they use an SPI bus so you will require an SPI controller in order to be able to write or read from them.

    (Sorry for the late response)

  22. kahtan
    December 2nd, 2009 at 14:14 | #22

    Hi Carlitos,
    Can you please send me VHDL code, i need it in my project
    Thank you in advance.
    my Email is:
    kahtan.jwair@gmail.com

  23. December 2nd, 2009 at 14:22 | #23

    @kahtan
    I get this question very often so I think I have to answer it publicly once and for all: no.

    Allow me to elaborate: I do not exclusively own the copyright for this work and thus I cannot share it freely. Also, I do not know where my source files are (they are in some backup somewhere). Finally, I do believe that implementing such a system is a very rewarding experience and the instant gratification of having the source code might ruin the learning process.

    Sorry for the negative response. Rest assured that I will answer any specific question that you (or anybody else) might have within the limits of my knowledge.

  24. varalakshmi
    December 30th, 2009 at 14:07 | #24

    i want to do project on speech recognition using fpga technology . i want some guidance regarding this project

  25. swapniln joshi
    December 31st, 2009 at 00:22 | #25

    sir i have read your entire project. i found it very interesting.am also working on speech recognition based on fpga..am using the same altera de2 board..am facing problems in VHDL code for this project can u please help me for this & tell me how to devolope vhdl code .
    my emailid is
    swapnil01joshi@gmail.com

  26. December 31st, 2009 at 04:13 | #26

    @varalakshmi
    @swapniln joshi
    As always, I will be pleased to answer any specific questions you may have.

    Some tips: when writing VHDL code, remember you are actually describing a circuit and not a program. Also, stick to standards, make everything modular and parameterizable (no hard-coded constants) and KISS (I guess this applies to programming in general).

    Cheers and good luck.

  27. varalakshmi
    December 31st, 2009 at 15:29 | #27

    acutually there is mapping from matlab to combinational logic and fsm using quartes there is problem in vhdl plese guide me i have no guidance but i want do project if any details send varalakshmi8882yahoo.co.in

  28. varalakshmi
    December 31st, 2009 at 16:16 | #28

    sorry sir.varalakshmi888@yahoo.co.in previous sent mail id is wrong

  29. kshore
    January 3rd, 2010 at 13:02 | #29

    your project nice i also do that project give some guidance please

  30. anodt
    January 9th, 2010 at 02:29 | #30

    i am executed the matlab program but how give the input.
    i download one wave file.when i gave that input it gives invalid format.plese help

  31. Rushil
    January 16th, 2010 at 11:37 | #31

    Hi,
    I am working on speech to text conversion based on sopc.It uses audio codec in de2 ording and sampling of a signal.I dont understand how to make settings in it and how to use it.Please help me.

  32. siri
    February 1st, 2010 at 09:38 | #32

    hiiii
    Will u plz give the details of the detection part plzzz i.e, how it detects the input word and all…………

  33. February 1st, 2010 at 19:31 | #33

    @siri
    The detection is rather simple. As shown in the slides, we used an averaging window. Whenever the average of the current window is larger by a certain threshold than the previous average, we consider a word is being said to the system.

    I hope this helps.

  34. rony
    February 3rd, 2010 at 03:23 | #34

    please help..
    im student,i have read ur project entirely, n also i have spartan3e500 fpga board,now i want to implement speech recognition on my board for my final task, may u give me ur guidance??anythings??
    please send me on my email : shiro_nugros182@hotmail.com
    thanx u so much..

  35. February 3rd, 2010 at 19:39 | #35

    @rony
    I’ll gladly answer any specific question that you (or anybody else for that matter) might have with respect to the project.

  36. rony
    February 4th, 2010 at 03:17 | #36

    @Carlitos
    thx for ur reply,can u start 2 help me with step-by-step how to made it,,
    first,u might be know if spartan3e don’t have port for any mic,so,where i have to put my input signal for my mic??however there’s few of,but where the best to??or i have to connect it directly to adc??
    if i want it to recognise some word to control something,4 word maybe,which filter i have to choose??fft,dtw or hmm??
    what file i have to create first for convenience??
    is there any code that i can make it to be my guidance??
    im gladly to hear ur reply…
    thanx for ur att

  37. February 4th, 2010 at 04:54 | #37

    @rony
    1- You will receive your step-by-step guide in the mail in two to three business days. Would you like a cup of tea with that? Seriously though, I am a blogger, not your personal electrical engineer.

    2- Yup, you need an ADC or ideally an audio codec for a mic.

    3- You know best what algorithm best suits your application.

    4- As stated before, I will not provide any sample code.

  38. Milind Kawale
    February 4th, 2010 at 09:42 | #38

    this project very nice
    i need help for this project
    this my final year project
    so i requst you please sir
    help

  39. Rushil
    February 4th, 2010 at 10:34 | #39

    Hi, I have been able to get the adc data using audio codec. Now will i need to store the data? How can I record and store the data.

  40. February 5th, 2010 at 01:37 | #40

    @Rushil
    You need a memory device such as some SRAM or Flash.

  41. rony
    February 5th, 2010 at 03:02 | #41

    thx for ur rep,thats help me a lot…
    i think thats a good idea to put some tutorial in ur blog though,u r expert on this FPGA..

  42. badr
    February 6th, 2010 at 10:57 | #42

    Hello, I am a Moroccan student vhdl my teacher suggested I work on a project that seems to mind that you offered on your site I have to start the project I am beginning to hope that you will help me send me if you have any document on that and tell me how you started with this project.
    best regards.

  43. February 6th, 2010 at 11:51 | #43

    @rony
    I did this project a long time ago. Let’s just say I am a bit rusty on the workings of the DE2.

    Also, writing a tutorial would take a lot of work and would be terribly specific and thus of no value for the great majority of people.

  44. February 6th, 2010 at 11:52 | #44

    @badr
    All the info I have is already posted on the website. Enjoy and good luck with your project!

  45. Leo
    February 8th, 2010 at 07:53 | #45

    hello, your project is really important I am a beginner in this field and my teacher suggested I work on a project like please is what you can give me the vhdl code blocks of your architecture?
    best regards.

  46. rony
    February 8th, 2010 at 22:48 | #46

    @Carlitos
    if u not very busy,im still hope for ur email…
    thx a lot..

  47. February 8th, 2010 at 23:09 | #47

    @rony
    Is there any question I missed?

  48. badr
    February 10th, 2010 at 18:53 | #48

    hi Mr.Carlos. I am reading your project report but I have not understood that part of such detection W1 and W2 and Th represents what? and why if the difference is greater than Th then the word is considered from sc? I hope you will help me on that, thank you in advance.
    best regards.

  49. February 10th, 2010 at 22:15 | #49

    @badr
    If the average of Window 1 (W1) is greater (by a certain threshold Th) than the average of Window 2 (W2), then we consider that the sound started. Else, it is still silence (or background noise).

    I hope this helps.

  50. LEO
    February 15th, 2010 at 17:55 | #50

    hi Mr.Carlos,
    plz, for the ADC , why did you take 8 bits for the precision??
