DESIGN AND IMPLEMENTATION OF VOICE RECOGNITION CALCULATOR
(A CASE STUDY OF MICRO-FINANCE BANK FEDERAL POLYTECHNIC NASARAWA)
ABSTRACT
Speech recognition is an application that converts speech into text by analyzing and processing the speech with Natural Language Processing (NLP) and Digital Signal Processing (DSP) techniques to produce a text representation of what was said. This project centers on the creation of a voice recognition calculator to reduce the stress of calculation for the Microfinance Bank of Federal Polytechnic Nasarawa, using C# (C Sharp) 2013 for the programming.
CHAPTER ONE
INTRODUCTION
1.1 INTRODUCTION
Speech is the vocalized form of human communication. It ranges from 90 Hz to 7,000 Hz. Each spoken word is created out of the phonetic combination of a limited set of vowel and consonant speech sound units. Speech recognition is the process of automatically extracting and determining the linguistic information conveyed by a speech wave using computers. This speech recognition calculator can be implemented using the linear predictive coding (LPC) method. Implementation of the system includes two stages: (a) a training phase and (b) a testing phase.
1.2 BACKGROUND OF STUDY
A recent survey conducted among people who perform calculations on a daily basis shows that they prefer using a separate calculator to the calculator on their workstation computers, in order to save time and to multitask. They reported that using the computer-based calculator is time-consuming because they have to switch screens, which leads to typing mistakes. Given the option, they would prefer a voice-activated calculator running in the background on their computers, to which inputs can be given in the form of spoken digits and operations and which displays the result on the screen. This voice-activated calculator can be implemented on a basic computer system with no additional hardware. The developed software would take the operands and the operation commands as voice inputs, perform the mathematical operation and display the output on the screen.
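The step of turning recognized words into a result can be sketched as follows. This is a minimal illustration in Python rather than the project's C#, and it assumes that the recognizer has already produced a list of words. The vocabulary (DIGITS, OPERATIONS) and the function name evaluate_spoken are hypothetical and are not taken from the project.

```python
import operator

# Hypothetical vocabulary: the words a recognizer is assumed to emit.
DIGITS = {
    "zero": 0, "one": 1, "two": 2, "three": 3, "four": 4,
    "five": 5, "six": 6, "seven": 7, "eight": 8, "nine": 9,
}
OPERATIONS = {
    "plus": operator.add,
    "minus": operator.sub,
    "times": operator.mul,
    "divided": operator.truediv,
}


def evaluate_spoken(words):
    """Evaluate one spoken expression of the form <digits> <operation> <digits>."""
    left, right = 0, 0
    seen_left = seen_right = False
    operation = None
    for word in words:
        if word in DIGITS:
            if operation is None:
                # Digits spoken before the operation build the first operand.
                left = left * 10 + DIGITS[word]
                seen_left = True
            else:
                # Digits spoken after the operation build the second operand.
                right = right * 10 + DIGITS[word]
                seen_right = True
        elif word in OPERATIONS:
            if operation is not None or not seen_left:
                raise ValueError("expected digits before a single operation")
            operation = OPERATIONS[word]
        else:
            raise ValueError(f"unrecognized word: {word}")
    if operation is None or not seen_right:
        raise ValueError("incomplete expression")
    return operation(left, right)


print(evaluate_spoken(["one", "two", "plus", "seven"]))
```

In this sketch the digits are spoken one at a time, so "one two" is read as 12. Division by zero is left to raise Python's ZeroDivisionError; a complete calculator would report it to the user instead.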
This system includes two main stages. The first stage is the training phase, which consists of feature extraction using Mel Frequency Cepstral Coefficients (MFCC) and storage of the extracted features as training data in the form of reference templates. The second stage is the testing phase, in which the features of real-time voice inputs are extracted and compared with the reference templates using the Euclidean distance criterion to recognize the input digit.
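The two phases can be illustrated with a small sketch, written in Python rather than the project's C#. It assumes that feature extraction has already reduced each utterance to one fixed-length feature vector (for example, MFCC values averaged over frames). The numbers are made up for illustration, and the names train and recognize are hypothetical.

```python
import math


def train(examples):
    """Training phase: average each label's feature vectors into one reference template."""
    templates = {}
    for label, vectors in examples.items():
        count = len(vectors)
        # zip(*vectors) groups the values of each feature dimension together.
        templates[label] = [sum(column) / count for column in zip(*vectors)]
    return templates


def recognize(features, templates):
    """Testing phase: return the label whose template is nearest in Euclidean distance."""
    return min(templates, key=lambda label: math.dist(features, templates[label]))


# Made-up three-dimensional feature vectors for two spoken digits.
training_data = {
    "one": [[1.0, 2.0, 0.5], [1.2, 1.8, 0.7]],
    "two": [[4.0, 0.0, 1.5], [3.8, 0.2, 1.3]],
}
templates = train(training_data)
print(recognize([1.1, 2.1, 0.6], templates))
```

Averaging is only one simple way to build a reference template. The source states that features are stored as templates and compared by Euclidean distance, but it does not say how several training utterances are combined.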
1.3 STATEMENT OF PROBLEM
Humans introduce variability when they speak, for several reasons. Even a fluent speaker may adjust how they speak in response to the situation. Speaker variability can be further classified as within-speaker and across-speaker: the former source of variability is intra-speaker variability and the latter is inter-speaker variability.
Medium and transducer: A transducer is a device which converts one form of energy into another. The speech signal, or source signal, can be captured by various transducers and transmitted over different channels, for example as microphone speech or telephone speech. For instance, the telephone speech band is restricted to between 300 Hz and 3300 Hz, whereas microphone speech has a larger bandwidth. So, in passing through the transmission channels, speech signals acquire a great deal of variability.
Environment: The input to the transducer, that is, the speech signal, is not only the acoustic pressure wave generated by the speech production system of the intended speaker, but also the signals that come from the surrounding environment, i.e. noise along with other sound signals. The noise signal interferes with the speech signal, resulting in additional variability.
Phonetics of speech: The acoustic realization of a phoneme is highly dependent upon the neighbouring phonemes. For example, the acoustic realization of the phoneme /a/ in the word cat /k/ /a/ /t/ and in the word bat /b/ /a/ /t/ is different. This is mainly due to co-articulation; in other words, co-articulation is the "overlapping of adjacent articulations". Variability present in the speech signal affects the automatic speech recognition (ASR) system at several levels. For example, transmission channel variability affects the acoustic feature vectors, whereas pronunciation variation affects both the acoustic feature vectors and the lexical model. Additional sources of information can also provide extra knowledge to reduce variability.
1.4 AIM AND OBJECTIVES OF THE STUDY
The aim of the study is to design voice recognition calculator software for the Microfinance Bank of Federal Polytechnic Nasarawa, to ease the manual typing of figures and to make calculation faster and more accurate.
The objectives of the study are:
- To review existing work on voice recognition in general.
- To collect and review relevant information about the case study.
- To develop a voice recognition calculator for ease of calculation, using the Microfinance Bank as a case study.
1.5 SIGNIFICANCE OF THE STUDY
Given how far computing has advanced and how widely it is now applied in almost every field of operation, it has become important to look into the development of a voice recognition calculator.
1.6 SCOPE AND LIMITATION OF THE STUDY
In this project, the voice recognition calculator is designed for the Microfinance Bank of Federal Polytechnic Nasarawa, Nasarawa State, which is taken as the case study, to give employees a quicker method of calculation and so avoid wasting time.
1.7 DEFINITION OF TERMS
- Calculator: a small device that helps in performing either basic arithmetic operations or scientific calculations.
- Voice: the sound produced in a person’s larynx and uttered through the mouth, as speech or song.
- Recognition: identification of a thing or person from previous encounters or knowledge.
- User: a person who uses or operates something.
- C# (C Sharp): C# (pronounced “C-sharp”) is an object-oriented programming language from Microsoft that aims to combine the computing power of C++ with the programming ease of Visual Basic. C# is based on C++ and contains features similar to those of Java.
- Visual Studio: Microsoft Visual Studio is an integrated development environment (IDE) from Microsoft. It is used to develop computer programs, as well as websites, web apps, web services and mobile apps.
- Microphone: an instrument for converting sound waves into electrical energy variations which may then be amplified, transmitted, or recorded.