Hands-Free Searching Using Google Voice

IJCSNS International Journal of Computer Science and Network Security, VOL.10 No.8, August 2010 47

Hamzeh Al_bool, Maytham Ali, Hamdi Ibrahim, and Adib M.Monzer Habbal

InterNetWorks Research Group, UUM College of Arts and Sciences,
University Utara Malaysia, 06010 UUM, Sintok, MALAYSIA

Abstract
Google is one of the most popular information retrieval systems, and its
number of users is increasing rapidly. It can provide a foundation for
improving learning and teaching environments. However, not all potential
users have the capabilities that allow them to use Google without
impediments. This is especially true for motor-disabled users. For this
category of users, a special approach to interacting with the Google
interface is therefore necessary: the capabilities these users lack should
be substituted by capabilities they have in order to deal with Google. This
study entails the development of a Hands-Free voice recognition Google
Search Engine that gives physically disabled users the opportunity to
operate Google and browse the search results without using a keyboard or
mouse. The implementation is carried out in the Java programming language,
and the usability testing shows that 93% of the participants found the
system usable.

Key words:
Voice Recognition, motor-disabled users, Google, Java Programming Language

Manuscript received August 5, 2010
Manuscript revised August 20, 2010

1. Introduction

Recent advances in computer technology have enabled users of voice
recognition products to achieve results that were previously impossible
even with the largest mainframe computers or workstations. One of the most
important reasons behind the production of a large number of voice input
systems is their ability to remove many literacy barriers in communicating
instructions to the computer.

Besides, technological innovations point to an emerging knowledge society,
in which all our engagements in life will be focused on having knowledge of
something. Such knowledge can be acquired in many ways and in different
forms. At the same time, many search engines have been developed to
facilitate information search across the globe via the World Wide Web
(WWW). Google is one of the most popular search engines and information
retrieval systems. It is a huge, global web-based application built around
a search engine. Furthermore, it is very easy to use [17] and has a very
simple interface. On top of that, it is linked with hundreds of millions of
web pages involving a comparable number of distinct terms. It has been
recorded as handling over ten million queries daily [3].

Unfortunately, not all potential users have the capabilities that allow
them to use the Google information retrieval system without impediments.
Many people have injuries and disabilities that prevent them from working
with computer applications in a way that suits their conditions. It has
been reported that more than 50% of newly hired computer users experienced
musculoskeletal symptoms such as eyestrain, neck and shoulder pain, low
back pain, elbow pain (tendonitis), forearm pain (muscles), and nerve
entrapments [6]. Additionally, there are disability-related constraints in
using the traditional keyboard input system. Hence, another form of input,
such as voice recognition, is necessary.

The development of information retrieval systems with voice recognition
provides a foundation for improving learning and teaching environments
through technology. Voice recognition might help disabled users to explore
information through a set of voice commands that provide easy ways and
control mechanisms for searching information. It is expected that voice
will replace physical manipulation as the dominant input modality. This
shift will dramatically alter input needs and the way users interact with
computers.

This study introduces a new service that is not currently available in the
Google search engine (GSE) [18]. It proposes the implementation of voice
recognition input in GSE. This entails the development of a Hands-Free GSE
Using Voice Recognition to provide physically challenged individuals,
especially the visually impaired, with the opportunity to operate the GSE
and browse the search results without using a keyboard or mouse.
Furthermore, it will assist people who have problems with musculoskeletal
disorders when typing instructions on a computer.

This paper is organized into six main sections. This section introduces the
study and the paper. Section 2 discusses related works. Section 3 discusses
the concept of hands-free voice recognition GSE. The implementation and
evaluation are provided in Sections 4 and 5, which are followed by a
concluding section.

2. Related Works

In the quest to assist disabled individuals in their mode of interaction
with computer-enhanced technologies, [5] proposed an assistive system
utilizing voice recognition input technology. In that application, voice
commands are generated to help disabled people operate basic electronic
equipment without necessarily touching it. In addition, the authors used
mouse emulators and keyboard emulators for interaction with the computer
for different computing purposes.

Later, to let physically disabled people enjoy part of the dividends of
21st-century technological innovations, [1] proposed a framework for
controlling the computer desktop by issuing voice commands.

In fact, efforts to utilize voice recognition for searching purposes are
not new. A voice recognition system has also been designed for visually
impaired people [7]. Although that system targets people with vision
problems, the voice recognition system developed in their work is more
general. Without the provision of alternative input systems, these
categories of challenged individuals might not have the opportunity to take
part in this knowledge world.

On the other hand, voice recognition has also been employed as security
information in place of the conventional knowledge-based authentication
approach [4]. This suggests that voice is a promising, stress-free approach
to replacing the conventional typing-based input system.

On top of the robustness of a computer implementation, a friendly usage
experience is considered even more important. Voice recognition systems are
no exception in this regard. The usability of any voice recognition system
is very important, as shown by [5]. They also suggest that human factors
should be taken into account in the use of voice recognition systems.

However, it was also revealed that using voice recognition to replace the
traditional keyboard input system requires additional support [8]. This
implies that the sensitivity of the interaction mode, in terms of noise
management and the like, needs to be taken care of.

The related works are reviewed to gain a deeper understanding of the
capabilities that motor-disabled users lack, and the impediments they face,
which should be substituted by capabilities that these users have in order
to deal with Google. Mainly, the studies in the literature provide
important information for becoming more aware of the end users of
Hands-Free searching using voice recognition GSE. The importance of
developing such a system is to help motor-disabled users explore
information through a set of voice commands that provide an easy way and
control mechanism for searching information over the Google information
retrieval system.

Unlike other types of computer systems, a voice recognition system is
better developed using a dynamic programming approach [9]. This serves as
the justification for choosing the Java programming language for system
development in this study.

3. Hands-Free Searching Using Google Voice Concepts

The concept behind the proposed voice recognition system over GSE is based
on a native method, the Robot class, a grammar file, Sphinx-4, and a Wiener
filter, as discussed in the following paragraphs.

Native method (a.k.a. Exec method): it is primarily designed to enhance the
interaction between the program and the operating system. It is a way of
combining the power of C or C++ programming with Java [13]. In this study,
this advantage is utilized to run the internet browser application and to
display website addresses easily.

Robot class in Java: this class is used to generate native operating system
input events for the purposes of test automation, self-running demos, and
other applications where control of the mouse and keyboard is needed [2].
The primary purpose of Robot is to facilitate automated testing of Java
platform implementations. It is employed to
deal with voice commands by taking over control of the keyboard and mouse.
The type of command method in this case is the voice command.

Grammar file: for the grammar file, a context-free grammar is specified as
the input. It is capable of accepting the voice input and produces Java
language functions that recognize the appropriate instances of the grammar.
The implication is that instances are created for all possible grammars.

Sphinx-4: it is a state-of-the-art voice recognition system written
entirely in Java, and is considered a suitable framework for voice
recognition. The main blocks are the FrontEnd, the Decoder, and the
Linguist. Supporting blocks include the Configuration Manager and the Tools
blocks. The communication between the blocks, as well as communication with
an application, is described in [16]. Sphinx-4 is used to analyze the voice
commands, as shown in Figure 1.

Fig. 1 Sphinx-4 Decoder Framework.

Wiener filter: it is based on tracking the a priori SNR using the
decision-directed method, as proposed by [19]. The two-step noise reduction
(TSNR) technique removes the annoying reverberation effect while
maintaining the benefits of the decision-directed approach. However,
classic short-time noise reduction techniques, including TSNR, introduce
harmonic distortion in the enhanced speech. To overcome this problem, a
method called harmonic regeneration noise reduction (HRNR) is implemented
to refine the a priori SNR used to compute a spectral gain able to preserve
the speech harmonics.

4. Implementation

A system has been implemented with the techniques discussed in the previous
section, using a Java-based interface. Command-and-control, grammar-based
recognition is implemented to make the system robust and speaker
independent. At present, the system includes two types of functions.
Firstly, through a microphone connected to a computer, the user can search
for information with the Google search engine through the write function
voice commands. The write function is a method built to allow users to
enter, by voice, any sentence they want to search for in the Google search
text box. Secondly, the user can access all Google search engine functions,
such as Web, Images, Maps, and Books, and deal with all the sub-functions
Google provides, using mouse voice commands and keyboard voice commands.
Table 1 shows all the voice commands with the resulting events carried out
by the commands.

Table 1: Voice Command Description

Voice Command Name       Command Event
-----------------------  ----------------------------------------------------
Search                   To put the mouse pointer in the Google text box.
Go                       To run the Google process.
Up, Down, Right, Left    To control the mouse in four directions.
Open                     To click on any button or link, depending on the
                         mouse position.
Tab                      To move to the next section of a Google page.
Enter                    To begin the desired process, which is usually an
                         alternative to pressing an OK button.
Back                     To return to the previous page.
Home                     To return to the Google Home Page.

Having specified the commands and their desired actions as listed in
Table 1, this study proceeds with the steps in implementing the system.
Figure 2 shows the steps involved when opening the Google page. The diagram
explains that users can utter any sentence that they want to search for. If
the sentence is not included in the grammar file, the system shows a
message stating that the sentence is not included in the grammar file. If
the sentence is in the grammar file, it is displayed in the Google search
text box.
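
The decision in Figure 2 and the commands in Table 1 can be sketched
together as a small router. This is a minimal sketch under stated
assumptions, not the authors' code: the class name VoiceInputRouter, the
Outcome enum, and the three sample vocabulary phrases are hypothetical, and
the recognizer is assumed to deliver each utterance as plain text. Typing
uses only java.awt.Robot key events and is limited to lowercase letters and
spaces.

```java
import java.awt.Robot;
import java.awt.event.KeyEvent;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

public class VoiceInputRouter {

    /** What the system does with one recognized utterance. */
    public enum Outcome { COMMAND, WRITE_TEXT, NOT_IN_GRAMMAR }

    // Command names and events as listed in Table 1.
    public static final Map<String, String> COMMANDS = Map.ofEntries(
        Map.entry("search", "put the mouse pointer in the Google text box"),
        Map.entry("go", "run the Google process"),
        Map.entry("up", "move the mouse up"),
        Map.entry("down", "move the mouse down"),
        Map.entry("right", "move the mouse right"),
        Map.entry("left", "move the mouse left"),
        Map.entry("open", "click the button or link under the mouse pointer"),
        Map.entry("tab", "move to the next section of the page"),
        Map.entry("enter", "begin the desired process"),
        Map.entry("back", "return to the previous page"),
        Map.entry("home", "return to the Google Home Page"));

    // Hypothetical sample of the search vocabulary; the paper does not list
    // the contents of its grammar file.
    public static final Set<String> VOCABULARY =
        Set.of("english grammar", "learning styles", "distance education");

    /** Classifies an utterance: Table 1 command, searchable phrase, or unknown. */
    public static Outcome route(String utterance) {
        String text = utterance.trim().toLowerCase(Locale.ROOT);
        if (COMMANDS.containsKey(text)) {
            return Outcome.COMMAND;
        }
        return VOCABULARY.contains(text) ? Outcome.WRITE_TEXT : Outcome.NOT_IN_GRAMMAR;
    }

    /** Types lowercase letters and spaces into whichever field has focus. */
    public static void typeText(Robot robot, String text) {
        for (char c : text.toLowerCase(Locale.ROOT).toCharArray()) {
            int keyCode;
            if (c == ' ') {
                keyCode = KeyEvent.VK_SPACE;
            } else if (c >= 'a' && c <= 'z') {
                keyCode = KeyEvent.VK_A + (c - 'a'); // VK_A..VK_Z are contiguous
            } else {
                continue; // other characters are skipped in this sketch
            }
            robot.keyPress(keyCode);
            robot.keyRelease(keyCode);
        }
    }

    public static void main(String[] args) {
        // Classification only; no key events are generated here.
        for (String heard : new String[] {"Search", "english grammar", "quantum physics"}) {
            System.out.println(heard + " -> " + route(heard));
        }
    }
}
```

On a machine with a display, a caller could create a Robot with
new Robot() and pass it to typeText whenever route returns WRITE_TEXT. The
main method only classifies utterances, so it also runs in a headless
environment.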

Fig. 2 Opening Google page. (Flowchart: Start, Voice Input, Voice
Recognition, decision "Is word in a grammar file?" with [No] and [Yes]
branches; on [Yes], the word is written in the Google search text box; End.)

Further, this study also emphasizes using voice to control the input and
navigate in a web page. Figure 3 illustrates the algorithm for this
function. The diagram explains that users can use the voice commands shown
in Table 1 to control the search process and explore the search results. In
addition, users can control mouse movements and keyboard commands by voice
to move to any place in the Google web page and to access any link or
function.

Fig. 3 Using voice commands to control search process. (Flowchart: Start,
Voice Input, Voice Recognition, decision "Is it command?" with [No] and
[Yes] branches; on [Yes], apply the action of the command; End.)

5. Test and Evaluation

A. User test
The user test was heavily based on the users' own words. This study chose
not to collect any quantitative data or to define goals that the tests had
to reach in order to count as successful. The reason is the same one that
led us to a qualitative investigation in the first place. The users had
informative opinions of the program after the tests. The evaluation
included direct questions on whether each user thinks he or she will use
voice recognition input-based interaction in GSE situations in the future,
and whether the tested program is a real alternative in this sense.

In terms of procedure, this study invited users personally. In the
invitation, users were provided with the system and a suggestion on the
details to be tested with each user. This study intended to test different
aspects of reliability and ease of use with different users. The reason is
that it is more effective to have the test users inspired and performing
the tasks spontaneously than to make them all try every detail. The
invitation also involved educating the users about dictation software as a
way of preparing them for the actual tests that followed. Along with the
invitation process, the program was customized to fit each task and user.

Once the participants in the test and evaluation process had been selected,
the investigation was decided to focus on the reliability, ease of use, and
satisfaction of Hands-Free searching using voice recognition GSE. Therefore,
this study concluded that the users should all be familiar with using GSE.
Furthermore, this study thought it would be rewarding to have at least a
couple of users already familiar with normal speech recognition. Finally,
there were thirty participants in total, divided as follows:

Table 2: The participants of user test

Number of users  Users description                          Age average  Gender
---------------  -----------------------------------------  -----------  ------
Two              Programmers/engineers working with speech  42.5         M
                 recognition development
Five             Students already familiar with normal      36           M/F
                 speech recognition
Fifteen          Students who use Google search a lot but   27.2         M/F
                 have no experience of speech recognition
Two              Users who have motor disabilities          23.8         M
Six              Newly hired computer users                 26.6         M/F

All participants were gathered from University Utara Malaysia, except the
users who have motor disabilities, who came from outside the university.
They volunteered to perform the user test sessions.

Besides, the vocabulary stored in the grammar file consists of more than
3000 words. These words were collected manually from different sources and
are related to the learning and education area. This study drew the
participants' attention to the domain of words that are available to search
for, which are related only to the English education and learning area.

The user test focused on three aspects: (1) how would users describe good
reliability of our techniques? (2) how would users describe good usability
of our techniques? and (3) are users satisfied?

Therefore, the evaluation of the program in the context scenarios was
carried out to identify these three main aspects of this study, as follows:

Firstly, we introduced the Hands-Free GSE Using Voice Recognition program
to all participants and explained the main functions of our system, to
increase the participants' awareness of what this system can do and how
they can deal with it.

Secondly, we installed our prototype software on thirteen systems and
divided the participants randomly into groups, except for the users with
motor disabilities, whom we placed in an independent group so that we could
focus on them. The programmers/engineers working with speech recognition
development were not assigned to any group; they played an oversight role
over all groups within the test and evaluation process.

Thirdly, before we started, we distributed a set of documents to the
participants, containing a list of one hundred words and key phrases
organized for the purposes of the search processes. These documents also
contained the voice commands and their descriptions, with the main purpose
of making participants more familiar with the available voice commands and
clearer about the functions of these commands when browsing Google
interfaces.

Finally, we asked the participants to use the system and browse the search
results without using a keyboard or mouse.

The results of this laboratory experiment have been analyzed and are
presented in the next section.

B. Result
Two obvious problems were found in the developed system. Firstly, there are
technical problems such as the voice recognition limitations of current
voice input systems. Secondly, the grammar file is limited in size and can
only contain 120,000 words. These problems reduce the reliability of the
gathered results.

In the study, the observations reveal that users deal directly with the
Google interface. This study interprets this to mean that the developed
system is easy to use. This is partly because of the interface of GSE
itself, which is very simple and minimal.

In addition, this study quantifies the data from the qualitative
observation for a better view of the results. As stated earlier, this study
investigates ease of use, reliability, and satisfaction among users in the
user test. Hence, the results were gathered accordingly and are presented
in Figure 4.

Fig. 4 The result of the participants' satisfaction with the voice-based
Google search instruction. (Bar chart titled "Hands-free searching using
Google voice"; bar values 93%, 86.60%, and 76.60% on a 0% to 100% axis;
legend "Total participants …".)
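
One way to read the values in Figure 4 is as shares of the thirty
participants in Table 2. The short sketch below works through that
arithmetic. The counts 28, 26, and 23 are assumptions inferred for
illustration only; the paper reports the percentages but not the underlying
counts, and the class name ParticipantShare is hypothetical.

```java
public class ParticipantShare {

    /** Share of participants expressed as a percentage. */
    public static double percent(int count, int total) {
        return 100.0 * count / total;
    }

    public static void main(String[] args) {
        int total = 30; // total participants, from Table 2
        // Assumed counts, not reported in the paper.
        for (int count : new int[] {28, 26, 23}) {
            System.out.printf("%d of %d = %.2f%%%n", count, total, percent(count, total));
        }
    }
}
```

Under this assumption, the computed shares are close to the reported 93%,
86.60%, and 76.60%, which would be consistent with the figure values having
been cut off after one decimal place rather than rounded.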

From the graph in Figure 4, this study interprets that users found the
system easy to use. In addition, the results of search activities were
found to be highly reliable. With the high reliability and ease of use, the
users were found to be highly satisfied with the system.

Two different noise suppression methods were tested in Matlab [19]: noise
suppression using spectral enhancement and noise suppression using a Wiener
filter. A Wiener filter following the method proposed by Plapous was
tested. A better approach would be to add an implementation of the Wiener
filter in Java to the front end of Sphinx-4, as the front end of Sphinx-4
is flexible and pluggable.

This study has introduced a new service that is not currently available in
the Google search engine (GSE) [18]. Furthermore, to the best of our
knowledge, there is no similar or closely related study against which to
compare our approach or our results.

6. Conclusion

Previous studies have shown that there is a need to provide an alternative
to the conventional keyboard and mouse input system, not only because of
disability on the part of the users but also because of the stress
associated with these modes of interaction among users without
disabilities. This has led to several techniques in the area of voice
recognition. In this age, the world is being ruled by knowledge through the
potential of technological innovations. Voice input systems will remove all
barriers imposed by the traditional input system and will give physically
challenged individuals access to necessary information as long as they can
communicate with an audible voice. In this study, the developed voice-based
GSE is found usable. It is therefore recommended that future research look
at the integration of language translation, which would enable such a
system to work with a variety of world languages. However, there are some
limitations in the existing voice input systems, one of which is the
recognition accuracy. Sphinx-4 is written in Java and does not have any
noise suppression module. It is very important to have one to improve its
performance in the presence of noise. It would be interesting to implement
the recommended Wiener filter in Java. It can be easily added to the front
end block of Sphinx-4.

7. References

[1] Abdeen, M., Mohammad, H. & Yagoub, M. C. E. An Architecture for
    Multi-lingual Hands-free Desktop Control System for PC Windows. IEEE
    Xplore, 1747-1750. (2008).
[2] Baldwin, R. Introduction to the Java Robot Class in Java, retrieved
    August 10, 2010, from http://www.dickbaldwin.com/java/Java1472.htm.
    (2003).
[3] Brin S., Page L. The anatomy of a large-scale hypertextual Web search
    engine. Computer Networks and ISDN Systems 30, 107-117. (1998).
[4] Cui, B. & Xue, T. Design and Realization of an Intelligent Access
    Control System Based on Voice Recognition, International Conference on
    computing, communication, Control, and Management, ISECS, IEEE,
    229-232. (2009).
[5] Gao, X. T., Ong, S. K., Yuan, M. L. & Nee, A. Y. C. Assist Disabled to
    Control Electronic Devices and Access Computer Functions by Voice
    Commands, Communication of the ACM, 37-42. (2007).
[6] Gerr F., Marcus M., Ensor C., Kleinbaum D., Cohen S., Edwards A.,
    Gentry E., Ortiz D., Monteilh C. A prospective study of computer users:
    I. Study design and incidence of musculoskeletal symptoms and
    disorders. American Journal of Industrial Medicine 41, 221-235. (2002).
[7] Halimah, B. Z., Azlina, A., Behrang, P. & Choo, W. O. Voice Recognition
    System for the Visually Impaired: Virtual Cognitive Approach, IEEE
    Xplore. (2008).
[8] Mills, S., Saadat, S. & Whiting, D. Is Voice Recognition the solution
    to keyboard-Based RIS?, IEEE Xplore, 1-6. (2006).
[9] Ney, H. & Ortmanns, S. Dynamic Programming Search for Continuous Voice
    recognition, Signal Processing Magazine, IEEE Xplore, 16(5), 64-83.
    (1999).
[10] Rashid, R. A., Mahalin, N. H., Sarijari, M. A. & Abdul Azizi, A. A.
    Security System Using Biometric Technology: Design and Implementation
    of Voice Recognition System (VRS), International Conference on Computer
    and Communication Engineering, IEEE, 898-902. (2008).
[11] Sergey B., Lawrence P. The anatomy of a large-scale hypertextual web
    search engine. Computer Networks and ISDN Systems 30, 107-117. (1998).
[12] Shrawankar U., Thakare V. Feature Extraction for a Voice Recognition
    System in Noisy Environment: A Study, Second International Conference
    on Computer Engineering and Applications (ICCEA), 2010, 1, 358-361.
    (2010).
[13] Vairale V., Honwadkar K. Wrapper Generator using Java Native
    Interface. International Journal of Computer Science, 2(2), 126-139.
    (2010).
[14] Yuan, K., Hou, W. & Zhao, Y. Human Factor Research of Voice Search on
    Mobile Internet, IEEE Xplore. (2008).
[15] Vaishnavi, V., & Kuechler, W. Design research in information systems,
    retrieved July 19, 2010, from
    http://www.isworld.org/Researchdesign/drisISworld.htm. (2005).
[16] Walker W., Lamere P., Kwok P., Raj B., Singh R., Gouvea E., Wolf P.,
    Woelfel J. Sphinx-4: A flexible open source framework for voice
    recognition. Sun Microsystems, Inc. Mountain View, CA, USA:18. (2004).
[17] SeniorNet. Lesson 4. Retrieved August 12, 2010, from
    http://www.seniornet.org/php/default.php?ClassOrgID=5337&PageID=5920.
    (2004).
[18] Google, Annual Report Pursuant to Section 13 or 15(d) of the
    Securities Exchange Act of 1934 for the fiscal year ended December 31,
    2008. Retrieved July 19, 2010, from http://investor.google.com/ (2009).
[19] Plapous, C.; Marro, C.; Scalart, P. Improved Signal-to-Noise Ratio
    Estimation for Speech Enhancement, IEEE Transactions on Audio, Speech,
    and Language Processing, 14, 2098-2108. (2006).

Hamzeh Mohammad Al_Abool received the B.S. degree in Software engineering
from Al-Balqa' Applied University, Jordan, in 2008. Currently, he is doing
Master studies in Information Technology (IT) at University Utara Malaysia
and is expected to graduate this year. He participated as a presenter in
the national conference titled "National Conference on Rural ICT
Development (RICTD)" held in 2009, and he also participated in the ITU-UUM
Asia Pacific Centre of Excellence Training Workshop on "Wireless
communication System: Wireless Network Fundamentals" held in 2009.

Maytham Abdulhussein Ali received the B.S. degree in Computer Science from
The University of Mustansiriya of Iraq in 2007. Currently, he is doing
Master studies in Information Technology (IT) at University Utara Malaysia
and is expected to graduate this year. He participated in the national
conference titled "National Conference on Rural ICT Development (RICTD)"
held in 2009, and he also participated in the ITU-UUM Asia Pacific Centre
of Excellence Training Workshop on "Wireless communication System: Wireless
Network Fundamentals" held in 2009.

Hamdi Khalifa Ibrahim received the B.S. degree in Computer Science from
Sebha University, Libya, in 2004. Currently, he is doing Master studies in
Information Technology (IT) at University Utara Malaysia and is expected to
graduate this year. He obtained the International Computer Driving Licence
(ICDL) certificate in 2006. He worked for two years as a teacher in
computer institutes and for four years as a programmer in the General
Company of Electricity. He has participated in The 2010 International
Conference on Education Technology and Computer.

Adib M.Monzer Habbal received his degree in computer engineering and
postgraduate Diploma in Informatics engineering from Aleppo University,
Syria, in 2003 and 2005 respectively. In March 2007, he received the
Master's degree in Information Technology from University Utara Malaysia,
Malaysia. Currently, he is a lecturer at UUM, Malaysia. His main research
interests include Internet performance engineering, network security,
network application, TCP and congestion control in Mobile Ad-hoc Networks.
He serves on the Technical program committee of many international
conferences, such as NETAPPS 2010, ICCAIE 2010, and ISCI 2011. He is also a
technical reviewer for IEEE TENCON 2009 and IEEE WiMob 2010.
