Hands-Free Searching Using Google Voice
IJCSNS International Journal of Computer Science and Network Security, VOL.10 No.8, August 2010

Hamzeh Al_bool, Maytham Ali, Hamdi Ibrahim, and Adib M. Monzer Habbal

InterNetWorks Research Group, UUM College of Arts and Sciences, University Utara Malaysia, 06010 UUM, Sintok, MALAYSIA

Manuscript received August 5, 2010. Manuscript revised August 20, 2010.

Abstract

Google is one of the most popular information retrieval systems, and its number of users is increasing rapidly. It can provide a foundation for improving learning and teaching environments. However, not all potential users have the capabilities that allow them to use Google without impediments. This is especially true for motor-disabled users. Therefore, a special approach to interacting with the Google interface is necessary for this category of users. The capabilities these users lack should be substituted by capabilities they do have in order to work with Google. This study entails the development of a Hands-Free voice recognition Google Search Engine that gives physically disabled users the opportunity to operate Google and browse the search results without using a keyboard or mouse. The implementation is carried out in the Java programming language, and the result of usability testing shows that 93% of the participants found the system usable.

Key words: Voice Recognition, motor-disabled users, Google, Java Programming Language

1. Introduction

Recent advances in computer technology have enabled users of voice recognition products to achieve desirable results which were previously impossible even with the largest mainframe computers or workstations. One of the most important reasons for the production of a large number of voice input systems is their ability to remove many literacy barriers in communicating instructions to the computer. Besides, technological innovations point to an emerging knowledge society, in which all our engagements in life will be focused on having knowledge of something. Such knowledge can be acquired in many ways and in different forms. On the other hand, many search engines have been developed to facilitate information search across the globe via the World Wide Web (www). Google is one of the most popular search engines and information retrieval systems. It is a huge, global web-based application built around a search engine. Furthermore, it is very easy to use [17] and has a very simple interface. On top of that, it is linked with hundreds of millions of web pages involving a similar number of different conditions. It has been recorded as handling over ten million queries daily [3].

Unfortunately, not all potential users have all the capabilities that allow them to use the Google information retrieval system without impediments. Many people have injuries and disabilities that prevent them from working with computer applications in a way that suits their conditions. It has been revealed that more than 50% of newly hired computer users reported musculoskeletal symptoms such as eyestrain, neck and shoulder pain, low back pain, elbow pain (tendonitis), forearm pain (muscles) and nerve entrapments [6]. Additionally, there are disability constraints in using the traditional keyboard input system. Hence, another form of input is necessary, such as voice recognition systems.

The development of an information retrieval system with voice recognition products provides a foundation for improving learning and teaching environments through technology. Voice recognition might help disabled users to explore information through a set of voice commands that provide easy ways and control mechanisms for searching information. It is expected that voice will replace physical manipulation as the dominant input modality. This shift will dramatically alter input needs and the way users interact with computers.

This study introduces a new service which is not currently available in the Google search engine (GSE) [18]. It proposes the implementation of voice recognition input in GSE. The voice recognition input will entail the development of Hands-Free GSE Using Voice Recognition to provide physically challenged individuals,
especially the visually impaired, with the opportunity to operate the GSE and browse the search results without using a keyboard or mouse. Furthermore, it will assist people who normally have problems with musculoskeletal disorders when working with computers by typing instructions.

This paper is organized into six main sections as follows. It starts with this section, which introduces the study and this paper. It is followed by a section that discusses some related works. Section 3 follows next, discussing the concept of hands-free voice recognition GSE. The implementation and evaluation are provided in Sections 4 and 5, which are finally followed by a concluding section.

2. Related Works

In the quest to assist disabled individuals in terms of their mode of interaction with computer-enhanced technologies, [5] proposed an assistive system utilizing voice recognition input technology. In the application, voice commands are generated to help disabled people operate basic electronic equipment without necessarily touching the equipment. In addition, the authors used mouse emulators and keyboard emulators for interaction with the computer for different computing purposes.

Later, for physically disabled people to enjoy part of the dividends of the 21st century technological innovations, [1] proposed a framework for controlling the computer desktop by issuing voice commands.

In fact, efforts to utilize voice recognition for searching purposes are not new. In this regard, a voice recognition system has also been designed for visually impaired people [7]. Although the system is specific to people with vision problems, the voice recognition system developed in their work is more general. If not for the provision of an alternative input system, these categories of challenged individuals might not have the opportunity to survive in this knowledge world.

On the other hand, voice recognition has also been employed as security information in place of the conventional knowledge-based authentication approach [4]. This implies that voice messaging is a promising stress-free approach to replace the conventional typing-based input system.

On top of the robustness of a computer implementation, its friendly usage experience is considered more important. Voice recognition systems are no exception in this regard. The usability of any voice recognition system is very important, as shown by [5]. They also suggest that human factors should surround the use of a voice recognition system.

However, it was also revealed that using voice recognition to replace the traditional keyboard input system requires additional support [8]. This implies that there is a need to take care of the sensitivity of the interaction mode, in terms of noise management and the like.

The related works are reviewed to gain a deeper understanding of the impediments faced by motor-disabled users and of the missing capabilities that should be substituted by capabilities these users do have in order to work with Google. Mainly, the studies in the literature provide important information needed to be more aware of each kind of end user of Hands-Free searching using voice recognition GSE. The importance of developing such a system is to help motor-disabled users explore information by developing a set of voice commands that provide an easy way and control mechanism for searching information over the Google information retrieval system.

Unlike other types of computer systems, a voice recognition system is better developed using a dynamic programming approach [9]. This serves as the justification for choosing the Java programming language for system development in this study.

3. Hands-Free Searching Using Google Voice Concepts

The concept behind the proposed voice recognition system over GSE is based on a native method, the Robot class, a grammar file, Sphinx-4, and a Wiener filter, as discussed in the following paragraphs.

Native method (a.k.a. Exec Method): it is primarily designed to enhance the interaction between the program and the operating system. It is a way of combining the power of C or C++ programming with Java [13]. In this study, this advantage is used to run the internet browser application and to display website addresses easily.

Robot Class in Java: this class is used to generate native operating system input events for the purposes of test automation, self-running demos, and other applications where control of the mouse and keyboard is needed [2]. The primary purpose of Robot is to facilitate automated testing of Java platform implementations. It is employed to
deal with voice commands by taking over control of the keyboard and mouse. The type of command method in this case is the voice command.

Grammar File: for the grammar file, a context-free grammar is specified as the input. It is capable of accepting the voice input and produces Java language functions that recognize the appropriate instances of the grammar. The implication of this is that instances are created for all possible grammars.

Sphinx-4: it is a state-of-the-art voice recognition system written entirely in Java, and is considered a suitable framework to adopt for voice recognition. The main blocks are the FrontEnd, the Decoder, and the Linguist. Supporting blocks include the Configuration Manager and the Tools blocks. The communication between the blocks, as well as communication with an application, is described in [16]. Sphinx-4 is used to analyze the voice commands, as shown in Figure 1.

Fig. 1 Sphinx-4 Decoder Framework.

Wiener filter: it is based on tracking the a priori SNR using the Decision-Directed method, as proposed by [19]. The two-step noise reduction (TSNR) technique removes the annoying reverberation effect while maintaining the benefits of the decision-directed approach. However, classic short-time noise reduction techniques, including TSNR, introduce harmonic distortion in the enhanced speech. To overcome this problem, a method called harmonic regeneration noise reduction (HRNR) is implemented in order to refine the a priori SNR used to compute a spectral gain able to preserve the speech harmonics.

4. Implementation

A system has been implemented with the techniques discussed in the previous section using a Java-based interface. Command-and-control, grammar-based recognition is implemented to make the system robust and speaker independent. At present, the system includes two types of functions. Firstly, through a microphone connected to a computer, the user can search for information with the Google search engine by using the write function with voice commands. The write function is a method built to allow users to enter, by voice, any sentence they want to search for in the Google search text box. Secondly, the user can access all Google search engine functions, such as Web, Images, Maps, and Books, and work with all the sub-functions Google provides, by using mouse voice commands and keyboard voice commands. Table 1 shows all the voice commands and the events each command triggers.

Table 1: Voice Command Description

| Voice Command Name | Command Event |
|---|---|
| Search | To put the mouse pointer in the Google text box. |
| Go | To run the Google process. |
| Up, Down, Right, and Left | To control the mouse in four directions. |
| Open | To click on any button or link, depending on the mouse position. |
| Tab | To move to the next section of a Google page. |
| Enter | To begin the desired process, which is usually an alternative to pressing an OK button. |
| Back | To return to the previous page. |
| Home | To return to the Google Home Page. |

Having specified the commands and their desired actions as listed in Table 1, this study proceeds with the steps for implementing the system. Figure 2 shows the steps involved in the system implementation, starting with opening the Google page. The diagram explains that users can utter any sentence that they want to search for. If the sentence is not included in the grammar file, the system shows a message stating that the sentence is not included in the grammar file. If the sentence is in the grammar file, it is displayed in the Google search text box.
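The command set in Table 1 and the grammar check described for Figure 2 can be sketched in Java as follows. This is a minimal illustration under stated assumptions, not the authors' code: the class name `VoiceCommandSketch`, the `Action` enum, the in-memory grammar set, the returned strings, and the choice to test for a command word before checking the grammar are all hypothetical. A real implementation would receive the recognized text from Sphinx-4 and would apply each action through `java.awt.Robot` instead of returning a string.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch of the decision logic around Table 1 and Figure 2. */
public class VoiceCommandSketch {

    /** Illustrative action names (assumed for this example, not from the paper). */
    public enum Action {
        FOCUS_SEARCH_BOX, RUN_SEARCH, MOVE_UP, MOVE_DOWN, MOVE_RIGHT, MOVE_LEFT,
        CLICK, NEXT_SECTION, CONFIRM, PREVIOUS_PAGE, HOME_PAGE
    }

    // Spoken command word -> action, one entry per command listed in Table 1.
    private static final Map<String, Action> COMMANDS = new HashMap<>();
    static {
        COMMANDS.put("search", Action.FOCUS_SEARCH_BOX);
        COMMANDS.put("go", Action.RUN_SEARCH);
        COMMANDS.put("up", Action.MOVE_UP);
        COMMANDS.put("down", Action.MOVE_DOWN);
        COMMANDS.put("right", Action.MOVE_RIGHT);
        COMMANDS.put("left", Action.MOVE_LEFT);
        COMMANDS.put("open", Action.CLICK);
        COMMANDS.put("tab", Action.NEXT_SECTION);
        COMMANDS.put("enter", Action.CONFIRM);
        COMMANDS.put("back", Action.PREVIOUS_PAGE);
        COMMANDS.put("home", Action.HOME_PAGE);
    }

    // In-memory stand-in for the sentences stored in the grammar file.
    private final Set<String> grammar = new HashSet<>();

    public VoiceCommandSketch(Collection<String> grammarSentences) {
        for (String sentence : grammarSentences) {
            grammar.add(normalize(sentence));
        }
    }

    private static String normalize(String text) {
        return text.trim().toLowerCase(Locale.ROOT);
    }

    /** Describes what the system would do with one recognized utterance. */
    public String handle(String recognizedText) {
        String text = normalize(recognizedText);
        Action action = COMMANDS.get(text);
        if (action != null) {
            return "ACTION:" + action;      // a real system would drive java.awt.Robot here
        }
        if (grammar.contains(text)) {
            return "WRITE:" + text;         // sentence goes into the Google search text box
        }
        return "NOT_IN_GRAMMAR";            // the user is told the sentence is not in the grammar file
    }

    public static void main(String[] args) {
        VoiceCommandSketch sketch =
                new VoiceCommandSketch(Arrays.asList("english grammar", "learning english"));
        System.out.println(sketch.handle("Search"));          // ACTION:FOCUS_SEARCH_BOX
        System.out.println(sketch.handle("English Grammar")); // WRITE:english grammar
        System.out.println(sketch.handle("weather today"));   // NOT_IN_GRAMMAR
    }
}
```

Keeping the command table separate from the grammar set mirrors the two kinds of function the section describes: navigation commands on one side and free search sentences limited to the grammar file on the other.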
[Figure 2 flowchart: Start -> Voice Input -> Voice Recognition -> decision "Is word in a grammar file?" with [No] and [Yes] branches; on [Yes]: "The word will be written in Google search text box" -> End]

Fig. 2 Opening Google page.

Further, this study also emphasizes using voice to control the input and navigate within a web page. Figure 3 illustrates the algorithm for this function. The diagram explains that users can use the voice commands shown in Table 1 to control the search process and explore the search results. In addition, users can control mouse movements and keyboard commands by voice to move to any place in the Google web page and to access any link or function.

[Figure 3 flowchart: Start -> Voice Input -> Voice Recognition -> decision "Is it command?" with [No] and [Yes] branches; on [Yes]: "Apply the action of command" -> End]

Fig. 3 Using voice commands to control search process.

5. Test and Evaluation

A. User test

The user test has been carried out. It was heavily based on the users' own words. This study chose not to collect any quantitative data or to define goals that the tests had to reach in order to count as successful. The reason for this is the same as the one that led us to a qualitative investigation in the first place. The users had informative opinions of the program after the tests. The evaluation includes direct questions on whether each user thinks he or she will use voice recognition input-based interaction in GSE situations in the future, and whether the tested program is a real alternative in this sense.

In terms of procedure, this study invited users personally. In the invitation, users were provided with the system and a suggestion on the details to be tested with each user. This study intends to test different aspects of the reliability and ease-of-use concepts with different users. The reason is that it is more effective to have the test users inspired and performing the tasks spontaneously than to make them all try every detail. The invitation also involves educating the users about dictation software as a way of preparing them for the actual tests that follow. Along with the invitation process, the program is customized to fit each task and user.

Once the participants in the test and evaluation process had been selected, it was decided that the investigation would focus on the reliability, ease of use, and satisfaction of Hands-Free searching using voice recognition GSE. Therefore, this study concluded that the users should all be familiar with using GSE. Furthermore, this study thought it would be rewarding to have at least a couple of users already familiar with normal speech recognition. Finally, there were thirty participants in total, divided as follows:

Table 2: The participants of user test

| Number of Users | Users description | Age average | Gender |
|---|---|---|---|
| Two | Programmers/engineers working with speech recognition development | 42.5 | M |
| Five | Students already familiar with normal speech recognition | 36 | M/F |
| Fifteen | Students who use Google search a lot but have no experience of speech recognition | 27.2 | M/F |
| Two | Users who have motor disabilities | 23.8 | M |
| Six | Newly hired computer users | 26.6 | M/F |

All participants were gathered from University Utara Malaysia, except the users with motor disabilities, who came from outside the university. They volunteered to perform the user test sessions.

Besides, the vocabulary stored in the grammar file consists of more than 3000 words. These words were collected manually from different sources and are related to the learning and education area. This study drew the participants' attention to the domain of words available for searching, which is restricted to the English education and learning area.

The user test focused on three aspects: (1) how would users describe good reliability of our techniques? (2) how would users describe good usability of our techniques? and (3) are users satisfied?

Therefore, the evaluation of the program in the context scenarios was carried out to address these three main aspects of the study, as follows:

Firstly, we introduced the Hands-Free GSE Using Voice Recognition program to all participants and explained the main functions of our system, to increase the participants' awareness of what the system can do and how they can work with it.

Secondly, we installed our prototype software on thirteen systems and divided the participants randomly into groups, except for the users with motor disabilities, whom we placed in an independent group so that they could receive more focus. Also, the programmers/engineers working with speech recognition development were not assigned to any group; they played an oversight role over all groups within the test and evaluation process.

Thirdly, before we started, we distributed a set of documents to the participants containing a list of one hundred words and key phrases organized for the purposes of the search processes. These documents also contain the voice commands and their descriptions. The main purpose of providing these voice commands is to make participants more familiar with the commands that are available and clearer about the function of each command when browsing the Google interfaces.

Finally, we asked the participants to use the system and browse the search results without using a keyboard or mouse.

The results of this laboratory experiment have been analyzed and are presented in the next section.

B. Result

Two obvious problems were found in the developed system. Firstly, there are technical problems such as voice recognition limitations in current voice input systems. Secondly, the grammar file size is limited, in that it can only contain 120,000 words. These problems reduce the reliability of the gathered results.

In the study, the observations reveal that users deal directly with the Google interface. This study interprets this to mean that the developed system is easy to use. This is partly because of the interface of GSE itself, which is very simple and minimal.

In addition, this study quantifies the data from the qualitative observation for a better view of the results. As stated earlier, this study investigates ease of use, reliability, and satisfaction among users in the user test. Hence, the results were gathered accordingly and are presented in Figure 4.

[Figure 4 bar chart, titled "Hands-free searching using Google voice": data labels 93%, 86.60% and 76.60% on a 0% to 100% axis; legend "Total participants …"]

Fig. 4 The result of the participants' satisfaction on the Voice-based Google search Instruction
From the graph in Figure 4, this study interprets that users found the system easy to use. In addition, the results of search activities were found to be highly reliable. With the high reliability and ease of use, the users were found to be highly satisfied with the system.

Two different noise suppression methods were tested in Matlab [19]: noise suppression using spectral enhancement and noise suppression using a Wiener filter. A Wiener filter following the method proposed by Plapous was tested. Furthermore, a better approach would be to add an implementation of the Wiener filter in Java to the front end of Sphinx-4, as the front end of Sphinx-4 is flexible and pluggable.

This study has introduced a new service which is not currently available in the Google search engine (GSE) [18]. Furthermore, to the best of our knowledge, there is no similar or closely related study against which we could compare our approach or our results.

6. Conclusion

Previous studies have shown that there is a need to provide an alternative to the conventional keyboard and mouse input system, not only because of disability on the part of the users but also because of the stress associated with these modes of interaction among users without disabilities. This has led to several techniques in the area of voice recognition. In this age, the world is ruled by knowledge through the potential of technological innovations. Voice input systems will remove the barriers imposed by the traditional input system and will give physically challenged individuals access to necessary information as long as they can communicate with an audible voice. In this study, the developed voice-based GSE is found usable. It is therefore recommended that future research look at the integration of language translation so that such a system can work with a variety of world languages. However, there are some limitations in existing voice input systems. One of them is recognition accuracy. Sphinx-4 is written in Java and does not have any noise suppression module. It is very important to have one to improve its performance in the presence of noise. It would be interesting to implement the recommended Wiener filter in Java. It can easily be added to the front end block of Sphinx-4.

7. References

[1] Abdeen, M., Mohammad, H. & Yagoub, M. C. E. An Architecture for Multi-lingual Hands-free Desktop Control System for PC Windows. IEEE Xplore, 1747-1750. (2008).
[2] Baldwin, R. Introduction to the Java Robot Class in Java, retrieved August 10, 2010, from http://www.dickbaldwin.com/java/Java1472.htm. (2003).
[3] Brin S., Page L. The anatomy of a large-scale hypertextual Web search engine. Computer Networks and ISDN Systems 30, 107-117. (1998).
[4] Cui, B. & Xue, T. Design and Realization of an Intelligent Access Control System Based on Voice Recognition, International Conference on Computing, Communication, Control, and Management, ISECS, IEEE, 229-232. (2009).
[5] Gao, X. T., Ong, S. K., Yuan, M. L. & Nee, A. Y. C. Assist Disabled to Control Electronic Devices and Access Computer Functions by Voice Commands, Communication of the ACM, 37-42. (2007).
[6] Gerr F., Marcus M., Ensor C., Kleinbaum D., Cohen S., Edwards A., Gentry E., Ortiz D., Monteilh C. A prospective study of computer users: I. Study design and incidence of musculoskeletal symptoms and disorders. American Journal of Industrial Medicine 41, 221-235. (2002).
[7] Halimah, B. Z., Azlina, A., Behrang, P. & Choo, W. O. Voice Recognition System for the Visually Impaired: Virtual Cognitive Approach, IEEE Xplore. (2008).
[8] Mills, S., Saadat, S. & Whiting, D. Is Voice Recognition the solution to keyboard-Based RIS?, IEEE Xplore, 1-6. (2006).
[9] Ney, H. & Ortmanns, S. Dynamic Programming Search for Continuous Voice recognition, Signal Processing Magazine, IEEE Xplore, 16(5), 64-83. (1999).
[10] Rashid, R. A., Mahalin, N. H., Sarijari, M. A. & Abdul Azizi, A. A. Security System Using Biometric Technology: Design and Implementation of Voice Recognition System (VRS), International Conference on Computer and Communication Engineering, IEEE, 898-902. (2008).
[11] Sergey B., Lawrence P. The anatomy of a large-scale hypertextual web search engine. Computer Networks and ISDN Systems 30, 107-117. (1998).
[12] Shrawankar U., Thakare V. Feature Extraction for a Voice Recognition System in Noisy Environment: A Study, Second International Conference on Computer Engineering and Applications (ICCEA), 2010, 1, 358-361. (2010).
[13] Vairale V., Honwadkar K. Wrapper Generator using Java Native Interface. International Journal of Computer Science, 2(2), 126-139. (2010).
[14] Yuan, K., Hou, W. & Zhao, Y. Human Factor Research of Voice Search on Mobile Internet, IEEE Xplore. (2008).
[15] Vaishnavi, V., & Kuechler, W. Design research in information systems, retrieved July 19, 2010, from http://www.isworld.org/Researchdesign/drisISworld.htm. (2005).
[16] Walker W., Lamere P., Kwok P., Raj B., Singh R., Gouvea E., Wolf P., Woelfel J. Sphinx-4: A flexible open source framework for voice recognition. Sun Microsystems, Inc. Mountain View, CA, USA: 18. (2004).
[17] SeniorNet. Lesson 4. Retrieved August 12, 2010, from http://www.seniornet.org/php/default.php?ClassOrgID=5337&PageID=5920 (2004).
[18] Google, ANNUAL REPORT PURSUANT TO SECTION 13 OR 15(d) OF THE SECURITIES EXCHANGE ACT OF 1934 For the fiscal year ended December 31, 2008. Retrieved July 19, 2010, from http://investor.google.com/ (2009).
[19] Plapous, C.; Marro, C.; Scalart, P. Improved Signal-to-Noise Ratio Estimation for Speech Enhancement, IEEE Transactions on Audio, Speech, and Language Processing, 14, 2098-2108. (2006).

Hamzeh Mohammad Al_Abool received the B.S. degree in Software Engineering from Al-Balqa' Applied University, Jordan, in 2008. Currently, he is doing Master studies in Information Technology (IT) at University Utara Malaysia and is expected to graduate this year. He participated as a presenter in the national conference titled "National Conference on Rural ICT Development (RICTD)" held in 2009, and he also participated in the ITU-UUM Asia Pacific Centre of Excellence Training Workshop on "Wireless communication System: Wireless Network Fundamentals" held in 2009.

Maytham Abdulhussein Ali received the B.S. degree in Computer Science from The University of Mustansiriya of Iraq in 2007. Currently, he is doing Master studies in Information Technology (IT) at University Utara Malaysia and is expected to graduate this year. He participated in the national conference titled "National Conference on Rural ICT Development (RICTD)" held in 2009, and he also participated in the ITU-UUM Asia Pacific Centre of Excellence Training Workshop on "Wireless communication System: Wireless Network Fundamentals" held in 2009.

Hamdi Khalifa Ibrahim received the B.S. degree in Computer Science from Sebha University, Libya, in 2004. Currently, he is doing Master studies in Information Technology (IT) at University Utara Malaysia and is expected to graduate this year. He obtained the International Computer Driver License (ICDL) certificate in 2006. He worked for two years as a teacher in computer institutes and for four years as a programmer in the General Company of Electricity. He participated in The 2010 International Conference on Education Technology and Computer.

Adib M. Monzer Habbal received his degree in computer engineering and postgraduate Diploma in Informatics engineering from Aleppo University, Syria, in 2003 and 2005 respectively. In March 2007, he received the Master's degree in Information Technology from University Utara Malaysia, Malaysia. Currently, he is a lecturer at UUM, Malaysia. His main research interests include Internet performance engineering, network security, network applications, TCP and congestion control in Mobile Ad-hoc Networks. He serves on the technical program committee of many international conferences, such as NETAPPS 2010, ICCAIE 2010, and ISCI 2011. He is also a technical reviewer for IEEE TENCON 2009 and IEEE WiMob 2010.