Visualisation of the Java Runtime Environment
←
→
Page content transcription
If your browser does not render page correctly, please read the page content below
Visualisation of the Java Runtime Environment by Martin Alfred Henschke, BSc. School of Computer Science, Engineering and Mathematics, Faculty of Science and Engineering Supervisors: Mr. Graham Bignell, A/Prof Paul Calder November 19th, 2010 Submitted in partial fulfillment of the requirements of the Honours degree of Bachelor of Science in Computer Science at Flinders University of South Australia Adelaide, South Australia, 2010
Contents Declaration of Originality vi Abstract vii Acknowledgements viii Introduction 1 Literacy Review 4 Algorithm Animators . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4 Program Visualizations . . . . . . . . . . . . . . . . . . . . . . . . . . . 4 Visual Programming Environments . . . . . . . . . . . . . . . . . . . . 6 Effectiveness of Visualizations . . . . . . . . . . . . . . . . . . . . . . . 7 Java Virtual Machine Architecture . . . . . . . . . . . . . . . . . . . . 9 Problem Statement 10 1 Project Summary 12 1.1 Project Aims . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12 1.2 JVM as an information source . . . . . . . . . . . . . . . . . . . . 12 1.3 Visualisation as an information medium . . . . . . . . . . . . . . 13 1.4 Dynamic Visualisation . . . . . . . . . . . . . . . . . . . . . . . . 14 1.5 Experimental Procedure . . . . . . . . . . . . . . . . . . . . . . . 15 2 Java Virtual Machine Features 16 2.1 Object Heap . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16 2.2 Frame Stack . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18 ii
3 Visualisation Outline 19 3.1 Object Reference View . . . . . . . . . . . . . . . . . . . . . . . . 19 3.1.1 The Main Node . . . . . . . . . . . . . . . . . . . . . . . . 20 3.1.2 Color Coding . . . . . . . . . . . . . . . . . . . . . . . . . 21 3.1.3 Querying Nodes . . . . . . . . . . . . . . . . . . . . . . . . 21 3.1.4 Navigation of Tree . . . . . . . . . . . . . . . . . . . . . . 22 3.1.5 Arrays . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24 3.1.6 Multiple References . . . . . . . . . . . . . . . . . . . . . . 25 3.1.7 Looping references . . . . . . . . . . . . . . . . . . . . . . 26 3.2 Frame Stack Visualisation . . . . . . . . . . . . . . . . . . . . . . 27 3.2.1 Local Variables . . . . . . . . . . . . . . . . . . . . . . . . 27 3.2.2 Source Code View . . . . . . . . . . . . . . . . . . . . . . 28 3.2.3 Time Controls . . . . . . . . . . . . . . . . . . . . . . . . . 29 4 Implementation 31 4.1 Initializing TraneViz . . . . . . . . . . . . . . . . . . . . . . . . . 32 4.2 Data Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32 4.2.1 Heap . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32 4.2.2 Stack . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33 4.3 Display . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33 4.3.1 Time Controls . . . . . . . . . . . . . . . . . . . . . . . . . 33 4.4 Agent . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34 5 Experimental Design 35 5.1 Goal . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35 5.2 Design . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36 5.3 Materials . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36 5.3.1 Workshop Content . . . . . . . . . . . . . . . . . . . . . . 37 5.3.2 Testing Materials . . . . . . . . . . . . . . . . . . . . . . . 38 5.4 Expected Outcome . . . . . . . . . . . . . . . . . . . . . . . . . . 39 iii
6 Academic Survey 40 6.1 Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40 6.2 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40 6.3 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42 7 Conclusion 43 7.1 Contributions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43 7.1.1 Low-level Representation of JRE . . . . . . . . . . . . . . 43 7.1.2 Implementation of Dynamic Visualisation . . . . . . . . . 44 7.2 Future Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44 7.2.1 Limitations of User Interface . . . . . . . . . . . . . . . . . 44 7.2.2 Additions to Visualisation Features . . . . . . . . . . . . . 46 7.2.3 Experiment Execution . . . . . . . . . . . . . . . . . . . . 46 7.3 Research Implications . . . . . . . . . . . . . . . . . . . . . . . . . 47 Bibliography 48 A Academic Survey Questionnaire 51 iv
List of Figures 3.1 An application visualized in the Object Reference View . . . . . . 20 3.2 An Object Reference Visualisation accompanied by the Color Legend 22 3.3 A selected node displaying its instance variables in the Object Ref- erence Visualisation . . . . . . . . . . . . . . . . . . . . . . . . . . 23 3.4 An Object Reference Visualisation with Side Bar displaying depth 24 3.5 Arrays in the Object Reference view . . . . . . . . . . . . . . . . 25 3.6 An example of multiple referencing in the Object Reference view. The node highlighted in green points to the same object as the selected node. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26 3.7 A program’s execution trace shown in the Frame Stack Visualisation 28 3.8 A selected frame and it’s local variables in the Frame Stack Visu- alisation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29 3.9 The active frame highlighting the line of code it points to in the Frame Stack Visualisation . . . . . . . . . . . . . . . . . . . . . . 30 4.1 The System Architecture for the TraneViz . . . . . . . . . . . . . 31 v
Declaration of Originality I certify that this thesis does not incorporate without acknowledgement any ma- terial previously submitted for a degree or diploma in any university; and that to the best of my knowledge and belief it does not contain any material previously published or written by another person except where due reference is made in the text. Signed Dated Martin Alfred Henschke vi
Abstract Visualisation used as a demonstration tool, in the teaching of object-oriented con- cepts in computer science, exist both as standalone tools used in conjunction with running applications and integrated development environments. Although many of these tools have merit in teaching high-level conceptual information, issues can arise when students make the transition from the use of these tools to more advanced or low-level programming tools and techniques, as information about program functionality is typically abstracted out of the visualisations presented in these tools. This project attempts to bring the benefits of higher level comprehension and engagement inherent with the use of visual teaching tools while providing a low- level perspective of the functionality of a program’s runtime environment. This application, TraneViz, is designed for use in developing programs written in the Java programming language. The visualisation provides information about the JRE’s two most critical features, the frame stack and the object heap, while providing an additional layer of abstraction to demonstrate other features such as the reference relationship between variables and objects, variable assignments and the flow of execution. The visualisation developed is aimed at aiding students’ understanding of more complex object-oriented concepts at a first or second year level. The appli- cation has been implemented to support a single-threaded, small scale Java pro- gram. A thorough plan for testing and assessment resources has been developed to determine if the application is a viable alternative to traditional non-visual methods of teaching, but the experiment had to be simplified to use academic staff and students in place of beginner programming students due to a lack of volunteers. This project demonstrates that the Java Runtime Environment can be rep- resented visually, and whether or not the visualisation has positive learning out- comes for students. On a broader level, it provides a study on the use of visu- alisation in a teaching context, and whether this approach has merits in other scientific applications. vii
Acknowledgements I wish to acknowledge my supervisors for their assistance and encouragement in the preparation of this thesis: My primary supervisor, Mr. Graham Bignell aided in the early development of the project plan and research, and my assistant supervisor A/Prof Paul Calder in its implementation. I would also like to thank Mr. Brett Wilkinson for acting as a mediator between myself and the students I had intended to use in testing my visualisation, and Dr. Carl Mooney in aiding in the development of an experimental plan. Finally, I would like to thank all the staff and students at Flinders University who participated in the academic survey, who have accomodated my research and provided feedback on possible applications and improvements to the visuali- sation. viii
Introduction As object-oriented languages have become more popular as the first language taught to programming students, visualisations designed to demonstrate object- oriented features of these languages have become increasingly popular as teaching tools. Applications such as BlueJ and Alice are now standards in many schools around the world as tools used to first teach students about the basic principles of object orientation. These visualisations can vary broadly in scope; some provide a simple demonstration of a particular aspect of a language while others provide a fully integrated development environment. Many also provide additional levels of abstraction on top of the subject matter, to hide more advanced and complicated language features not considered relevant to teaching the broader principles of OO programming. These forms of visualisation support a high-level ‘objects first’ approach to teaching programming [Cooper et al., 2003], where the broad features of the data organisation structure of OO languages are taught to students initially, followed by programming structure and basic syntactic features. More advanced language features are typically not taught to students until they have gained a firm grasp of the paradigm; once this stage has been reached students make the transition to learning more traditional procedural programming techniques and how these relate to the object-oriented paradigm. Visualisations usually support only the first stage of this method of teaching, providing an abstract representation of objects and data structures while hiding some of the more complicated features of the language usually reserved for later years of study. Once this point is reached, students are advised to program in a different environment, either through the use of a conventional IDE or by the command line. While this method of teaching does provide a strong understanding of the OO paradigm, it can have negative consequences in later years of study. The visualisation tools used to initially teach these early aspects of object orientation have been observed to be used by students well beyond the scope of their use as teaching tools by the course providers. As these tools typically hide information about the more complicated language features and lack many of the more sophis- ticated features of an IDE, the use of these tools outside the context of teaching high-level OO features can restrict student learning outcomes [Kölling, 2008]. An 1
over-reliance of these tools can make the transition to a command line environ- ment or to an IDE challenging for students. Tools that hide information such as access to the main method or syntactic features can severely hamper a student’s ability to pick up the language with all of its features included in a different en- vironment. These problems can be linked back to the high level of abstraction typically seen in visualisation tools, which contribute to hiding information about how a programming language functions. This project attempts to provide a solution to this issue by providing a vi- sualisation with a significantly different scope to help teach students about the more complicated low-level features of the Java programming language, while re- taining a simple interface that is facilitative to student programming at a first or second year level. This approach is based on the observation that many is- sues with these visualisations lie in the abstraction of low-level information. The visualisation produced by this project will act as a compliment in teaching specif- ically about these features. The application developed as the main aspect of this project acts as an abstract visualisation of the Java Runtime Environment, or JRE. Many of the core features of the Java programming language can be di- rectly observed and are governed by the JRE. The visualisation developed takes advantage of this fact and uses this information to demonstrate some of the key features of how the Java programming language functions. The visualisation is divided into two main sections. The first section provides a visual representation of the object heap, which stores each object that has been constructed in the process of the application. Each object contains state information and pointers to other objects stored in the heap. The visualisation takes advantage of the pointer structure of referencing in the heap, and provides a hierarchical view of how objects reference one another. This visualisation is populated by a series of nodes organized in a tree-like structure, with each node representing an object variable pointing to another object. A connection between nodes indicates the parent node holds a reference to the child node. The resultant structure allows students to view the structure of pointers and references in their Java application, and allow them to develop an understanding about variable access and pointers without requiring a strong understanding of the material prior to observing the visualisation. The second section of the visualisation is a direct representation of the frame stack stored in the main thread of execution. The visualisation does not support multi-threading, so only the main thread’s stack is observable. In the stack, the JRE stores a series of frames that point to each method currently in the process of being executed. Frames are added and removed as methods are called and completed. At each point in an application’s execution, students are able to observe each frame in the stack, and the value of local variables stored in each 2
of the frames. The way the frame stack is presented is very similar to how many IDE and debuggers present execution history when reporting exceptions, and so should be familiar and easy to understand by most students. It also allows them to comprehend what each aspect of their application does, how it affects the flow of execution and object states, as well as demonstrating in detail many other features of the language typically taken for granted, such as the initialization of field variables outside of the constructor and the passing of parameters to methods. Finally it gives a detailed view of local variables and differentiates them from field variables, helping students understand how the scope of methods affects the accessibility of data. To determine the effectiveness of the visualisation as a teaching tool for first and second year computer science students, an experimental plan has been de- veloped, aimed at specifically determining if there are any measurable benefits to learning with this tool, and if so which method of interaction with the tool provides the greatest benefits. Students are tested through a two hour tutorial, where they are taught information considered to be relevant to the learning out- comes of the visualisation. The experiment consists of three major test groups- the first has access to the visualisation as a demonstration tool, and are only able to watch programs as they execute. The second group has full access to the visualisation as a development tool and are able to make modifications to the application and observe the results of their changes. The third and final group act as the control, and are taught the same material but do not have access to the visualisation. Each experimental group is administered two written tests, one before the tutorial and one after, which they must complete to determine any changes in their understanding of the material. Due to a lack of participants for the experiment, a shorter survey involv- ing honours students PhD students and academic staff was taken to determine whether or not the visualisation was percieved as a viable teaching tool. The sur- vey involved a brief 15 minute demonstration of the visualisation’s features and capabilities, followed by a short written survey including a quantitative analysis of the visualisation’s accessibility and usefulness, as well as several short-answer questions regarding the strongest and weakest aspects of the visualisation, and its possible applications. 3
Literacy Review Algorithm Animators The use of visualization in teaching Computer Science has its roots in algorithm animators, which provide a visual representation of the function of algorithms. Traditional algorithm animators are standalone computer applications that take in user specified data and produce an animation to accompany the output. Cur- rently, algorithm animations are mostly used in teaching broad computer science concepts, including searching, sorting and recursive algorithms. One such system is XTango [Stasko, 1992], an algorithm animator that al- lows users to develop custom animations for algorithms written in C. The system provides editing tools that allow the representation of variables and data struc- tures as shapes, and also provide a means of manipulating their position, color or size as their state changes. Fluid animations are used to represent changes in state; discrete ‘frames’ of each step in an algorithm make precisely detecting changes difficult, where an animation draws the users attention to a specific part of the visualization and allows them to observe how it changes. As the system allows custom animations to be made, it can serve as teaching tool for lecturers to demonstrate how particular algorithms operate. Program Visualizations Visualization tools became more popular with the mainstream acceptance of the object-oriented paradigm and the replacement of Pascal with Java as the most commonly used teaching language in introductory Computer Science courses. Object-oriented languages lend themselves well to visualizations, as objects can be represented as individual entities that are created and modified over the life of a process. The use of visualizations has become increasingly popular in teaching approaches that have a focus on the data structure and the abstract concepts of the language, due to their clear and effective representation of the OO paradigm, which can be difficult to understand or efficiently utilize at first [Dershem and 4
Vanderhyde, 1998, Machanick, 2007]. The varieties of visualization available to aid in learning object-oriented languages can vary from a windowed application accompanying a running program to a programming interface with visual com- ponents, but each have the fundamental goal of providing a visual abstraction of the problem space to make programming more accessible to students. The simplest of visualizations provide a view of one aspect of a running Java application. Jacot is a tool developed to demonstrate the concurrent capabilities of Java by providing a visualization of thread operation over the life of a program [Leroux et al., 2003]. There are two parts to the visualization: a sequence diagram which shows the execution history of each thread, and a thread state diagram which shows the state of each created thread at a given point in time. Jacot uses UML sequence diagrams for representing thread activity, because of their ability to represent time and the interaction between objects through method invocations. Thread state diagrams are represented as UML statechart diagrams, as they can accurately depict each of the various states of a thread over its lifetime. Jacot can be used to visualize concurrency issues such as starvation and deadlock, but its close adherence to UML has been criticized as not being able to represent some critical behaviors of threads in concurrent applications such as notification, suspension and sleeping. This problem has been addressed by other more recent visualizations, such as the lock-causality graph [Kim et al., 2009] which creates a new graphing notation to better represent concurrency, and extensions to the UML notation to supplement these shortcomings [Artho et al., 2007]. JAVAVIS is another system that provides a separate visual interface, but its focus is on object attributes and method calls rather than concurrency [Oechsle and Schmitt, 2002]. JAVAVIS uses an object diagram to visualize a frame on the JVM stack, providing information about the frame’s possessing object, local variables and method arguments. A sequence diagram is also used, with the same notation as a UML sequence diagram, to represent method calls between objects. As the visualization continues, method calls are made and recorded as events on the sequence diagram. During program execution different classes can be selected and visualized. The goal of the tool is to provide a gestalt perspective on the application’s function at runtime, to solidify concepts such as flow of execution. The Jeliot family of visualizations have a similar goal and system to JAVAVIS, but with a stronger focus on visualization as a learning tool [Moreno et al., 2004]. The Jeliot visualizations also show interactions between objects and variables, but provide several different visualizations which students may select depending on which is most relevant to what they want to understand about their program. 5
Visual Programming Environments In some Computer Science courses, programming is a secondary focus to higher level concepts such as object orientation and algorithm function. This approach, based on Bloom’s taxonomy, is based on recent research suggesting that pro- gramming is not considered to be the lowest level of cognitive skills in computer science [Machanick, 2007, Hitchner et al., 2001]. Particularly with lower level languages such as C++, where students have access to system resources through dynamic memory allocation or pointers, learning solely about the language’s func- tion without proper understanding of the overarching structure and capabilities can lead to poorly designed programs and systems. Learning to program at too early a phase can be considered to have a negative impact on a student’s learning outcomes in the field of Computer Science [Hitchner et al., 2001]. In response to this concern, there exist several visualization tools that provide a programming interface designed to take pressure off of the complex syntax and semantics of Java and focus more heavily on the concepts that govern it. One of the earliest and most commonly used of these visualization interfaces is BlueJ [Kölling et al., 2003], a visual programming interface that provides a static visualization of the class structure of the application. BlueJ is designed to provide an object-oriented program development environment in place of a more conventional, procedurally structured environment. The system is designed to remedy three major issues observed with programming environments that in- terfere with understanding the OO paradigm: they are traditionally not object oriented themselves, overly complex, and too heavily focused on user interface development. BlueJ extends UML notation to produce a class diagram, which displays class association, inheritance and polymorphism. Modifying the source code of a given class requires finding its place in the hierarchy, providing an in- formative display of the object-oriented concepts that govern a Java program as the students writes their code [Van Haaster and Hagan, 2004]. Programmers are also able to create instances of objects external to the main method of the appli- cation, and individual object methods can be called by the programmer, adding tangibility to data structures. A survey of the use of BlueJ as a teaching tool found students performed better and were more comfortable with the concepts of the OO paradigm when using the tool than those that used a text editor. BlueJ’s heavy focus on object orientation over industry standard programming practice have also been acknowledged by Kölling [2008] as making the transition from BlueJ to an IDE or to command-line programming a difficult one however, which can lead to an over reliance on the tools at inappropriate levels and restricted learning outcomes. Several visualization tools that have been developed provide an interface to 6
supplement or replace manual coding, foregoing the need to teach syntax in favor of higher level concepts. JPie is a teaching tool that provides a programming interface by dragging and dropping tags containing code instructions to form statements [Goldman, 2004]. Calling methods or modifying variables is done by finding the correct methods and variables from a list of tags and dragging and dropping them into the method body. Programming is done entirely through the visualisation, and requires no text modification; language features such as setting method access and modifiers are done through the manipulation of the interface rather than writing code. Another tool with a similar interface for program development is Alice, where statements are written in pseudocode rather than in Java syntax [Cooper et al., 2003]. Instead of a traditional programming tool, Alice’s programming interface allows users to create animated 3D scenes. Objects are placed in a 3D environment and can be manipulated using drag and drop programming commands. Alice also supports the teaching of many more complicated concepts such as recursion [Dann et al., 2001]- ordinarily a very complicated concept that is difficult to visualize Alice’s tools allows students to easily produce recursive methods and receive visual feedback on how they operate. Effectiveness of Visualizations The use of visualizations as a pedagogical tool is a dividing factor within the com- puter science community. Advocates of visualization argue visual representations provide an unique perspective on programming that aids student comprehension, and provides a more interesting and engaging way to learn. Visualizations are also argued to be more natural to programmers; Petre and Blackwell [1999] de- termined in a study of industry experts that visual thinking is a large part of the programming process, and programmers will often use or fabricate visual metaphors to help them in understanding a problem. Despite this, visualization tools have yet to become popular in mainstream teaching. A meta-study on the effectiveness of algorithm visualizations revealed that despite most studies con- cluding that visualisations are effective teaching tools, they remain unpopular with universities as they not effective enough to justify the amount of time neces- sary to become familiar with them [Hundhausen et al., 2002]. One of the primary advantages of using a visualization tool is intuition- the abstract representation should lead to an intuitive understanding of the subject. However visualizations are not always a perfect mapping of the problem domain, and their interpretation is a learned skill [Petre, 1995]. Despite the importance of interaction with these tools, surprisingly little research has gone into the impact of interface design and interactivity. Echoed in another study, it has been found that student interfac- ing with these tools can be just as important as the visualization produced by 7
the tools themselves. Lawrence et al. [1994] determined that students that ac- tively engaged with algorithm animation tools by providing their own data and manipulating the visualization had higher learning outcomes than students that were provided with ready-made data. It was also determined that these bene- fits seemed to have the greatest impact on deeper levels of comprehension, in understanding the fundamental aspects of the algorithm itself. Abstract visualizations also have several inherent dangers. Petre [1995] dis- cusses the heavy reliance images have on relative component positioning and size (dubbed ‘secondary notation’) in conveying meaning. Secondary notation is present in all visualizations, and implies the relevance of aspects of the image and relationships between components. Well constructed visualizations use secondary notation to improve clarity and simplicity of the diagram, but conversely images that fail to recognize these subtle distinctions can imply relationships and levels of relevance where they do not exist, making them misleading. The audience for which the visualization is intended is also important- using a visual metaphor in place of a concrete concept implies some level of information will be lost- what information is lost and how the metaphor makes up for this shortcoming will of- ten determine the purpose of the visualization itself. Visual metaphors designed to explain very high level concepts, as many of the program visualizations are, will often hide a large number of lower level concepts. These visualizations are useful for understanding the concepts the image accounts for, but will be of little use to programmers who need access to the hidden information. In the same vein visualizations with many concepts covered in the visual metaphor may be useful to skilled programmers, but will be too confusing for novices. The level of abstraction, i.e. how “close” the visual metaphor is to the underlying concept is also a major influencing factor in how useful the visualization is as a learning tool. Scheiter et al. [2008] performed a study comparing realistic and abstract biological visualizations of cell division, and the benefits each approach had to students. It was determined that the most realistic representations (in this case camera feeds from microscopes) had a negative impact on learning outcomes, and students performed best when using abstract representations. Similarly, abstract representations can shield students from certain lower level concepts, but when it is necessary to learn these concepts it can make the bridging process more difficult than it would have been otherwise [Kölling, 2008]. In addition to secondary notation, many visualizations use animation of com- ponents to draw the user’s attention to relevant aspects of the image. As most algorithm animators and program visualizations are showing changes to the state of a system, animation can be used to make specific changes more obvious and their effects on the rest of the system clearer. Despite this, the use of anima- tion in software visualization tools is also subject to some debate. In a survey 8
on commercial programmers, it was found that animation and 3D visualizations are the least important aspects of a visualization tool for this group [Bassil and Keller, 2001]. Furthermore, analysis of the Jeliot programming tools found that animations could often obstruct information and make the tool more difficult and frustrating to use [Moreno et al., 2004]. A meta-survey conducted by Tversky et al. [2002] on the use of animation found that they can be dangerous in some applications as it can be misleading or distracting, and benefits are only seen in systems where the animation conveys some additional meaning to that already provided in the visualizations; it has a null or negative effect if used to reiterate information that is already present or for cosmetic reasons. Java Virtual Machine Architecture As specified in Lindholm and Yellin [1994], the Java Virtual Machine architecture is made up of a series of components that interpret class files and execute instruc- tions. There are five major components that make up the JVM- the PC register, which points to the currently executing method for each thread, a stack unique to each JVM thread that stores frames and traces thread execution throughout the application, the heap accessible to all threads that is used by references to directly access data objects, the method area which contains loaded class meth- ods analogous to compiled code and used for thread executions and the constants pool which keeps a list of all final variables for each object. These five major com- ponents are used by the main thread of execution and all instantiated threads to access instructions and update the program’s state. The JVM state is accessible through the Java Platform Debugging Architecture (JDPA), a multi-tiered inter- face that has direct access to JVM execution through the use of agents [jdk, 2010]. It is made up of three levels: the Java Virtual Machine Tools Interface (JVMTI), a native-code interface that operates through the JDI to access VM state infor- mation directly, the Java Debug Wire Protocol (JDWP) which specifies the way in which debuggers interact with the JVM and the Java Debug Interface (JDI), a Java-side interface that accesses state information from the VM. The JVMTI is typically used for accessing information specifically about the functionality of the VM and is recommended only for debugging very low-level program features or the VM itself, while the JDWP/JDI are designed to produce debugging tools for use with other Java applications. 9
Problem Statement The goal of this project is to provide a solution to some of the major issues that exist with the currently available visualisations. At present a large host of visualisations exist that have beneficial results on students’ understanding of the OO paradigm. However many visualisations used as teaching tools have been acknowledged to also have negative effects on student learning. These relate to the method in which the visualisations are presented and the information they provide to students. Most high-level visualisations demand a level of abstraction to present their information clearly and cohesively without confusing the users with technical de- tail. Although this does help to make the visualisations more approachable to beginning students, the method in which this information is hidden makes it more difficult to comprehend when it is necessary to access at a later date. This infor- mation typically includes lower-level features of programming languages- in the context of Java this can include the method by which variables reference objects, the storage of field and local variables, the main method and its relationship with the rest of the program, the method of execution and in some systems even basic syntactic language features. If this information is not sufficiently covered by other learning sources, it can lead to a poor understanding of language functionality and the inability to write functional applications on a larger or more intricate scale. Visualisations have also been noted to make the transition to different devel- opment environments more difficult. Integrated development environments such as Alice or BlueJ successfully make programming easier by removing complicated features of language such as main methods, structuring of class files or detailed syntax, but as a consequence can make the transition to another environment that does not automatically handle these features more difficult. Students have been reported to continue using inappropriate development tools at much later stages in their education, which can act as a detriment to their ability to compre- hend presented material- without an understanding of the basic mechanics of the language, proper understanding and effective utilization of advanced language features is impossible. 10
This project will approach this problem by providing a similar programming tool with a focus on information provided by the Java Runtime Environment, and determine whether or not a similar programming tool can be developed to be used alongside these visualisations that allow students to better understand the lower- level features of Java. The project must also determine if such an approach will have positive learning outcomes for students, and if it is successful in solving or alleviating the issues associated with other visualisation tools. 11
Chapter 1 Project Summary 1.1 Project Aims The project has the overarching goal of providing a solution to the problem statement by developing a piece of software that would be used to help students understand some of the lower level features of the Java programming language. The software would be primarily aimed at a beginner programming audience as a demonstration tool rather than as a software-development tool, but provide sufficient flexibility for experimentation. The application is also intended to be used with running Java applications- the tool should be able to take a program as an input and use it to generate the relevant information. The scope of this project will encompass the development of a software tool that is capable of producing information representing the low-level functionality for any given Java program. It will also encompass the testing the efficacy of this software with beginner-level programmer students to determine if the software improves their understanding of the Java language and the way Java programs work. 1.2 JVM as an information source The source of information for the software will be the Java Virtual Machine, the platform on which all Java applications run. When the project software is executed, it will start a new instance of the JVM running the demonstration program, and access all information about the state of the program from the JVM. The JVM retains all runtime information about the program it is currently running, including the state and value of each of the objects that have been constructed and the state of each currently executing thread. 12
Information regarding the state of JVM can be accessed using the Java Plat- form Debugging Architecture, or JDPA. The JDPA defines a protocol for access- ing each of the various attributes the JVM stores as well as some basic abilities to control its execution such as requesting sleep and notify events on active threads. The JDPA is divided into two main levels- a Java-side level that accesses infor- mation about the program currently running on the JVM, and a C-side agent layer that is able to interface with the JVM directly and access all information regarding it’s state and execution. Support for these interfaces is typical in most virtual machine implementations, including Oracle’s HotSpot VM which is the standard used for most Java installations. The JVM was chosen as the source of information for the application as it provides a large amount of runtime information about a Java application while retaining a low-level focus. As an agent constructed to operate on the JVM is independent of either the platform or the application, it allows the software to be highly flexible in its application. The low-level perspective of the JVM combined with the wealth of information it provides also allows for abstraction to be used to remove information considered irrelevant, and provide a stronger focus on the aspects of the application considered most crucial to the understanding of the underlying features of the Java language. 1.3 Visualisation as an information medium One of the main tasks of the application will be the implementation of a visu- alisation that will use the information extracted from the JVM. The goal of the visualisation is to provide an easy-to-understand visual perspective of the running application, and giving the user the ability to control what information they wish to display through a visual interface. Visualisation was chosen as the main medium for presented information to exploit the beneficial learning features associated with the use of visuals, while retaining the depth necessary to present the information from the JVM. Visu- alisations have been shown to improve student learning outcomes when used to represent concepts that lend themselves well to a visual representation, and those that make effective use of diagrammatic secondary notation can improve compre- hension of material [Petre, 1995]. Well designed visualisations can also provide a highly detailed view of a particular area of a problem, or a broad, overarching gestalt perspective. This allows students to either view a problem as a whole or find a particular aspect of problem in the visualisation and extract more detailed information about it from that perspective. In the problem space of a program ex- ecuting in runtime, with a potentially large number of objects or runtime features 13
to visualize, the ability to have both a broad and specific perspective available is highly desirable. Visualisations are also considered to be more interesting and comprehensible than a flat textual representation. With the large amount of runtime information available from the JVM, simply viewing this information from a textual perspec- tive alone could make the location of specific information difficult and be less ben- eficial as a teaching tool. Organizing information visually makes relevant infor- mation not only easier to locate, but more interesting and captivating for the user. A visual representation of running Java applications is relatively common among both visualisations and other teaching materials, especially through the use of UML notation to demonstrate either program execution [Leroux et al., 2003] or class structure [Van Haaster and Hagan, 2004, Riley, 2006]. The widespread use of these applications demonstrates the effectiveness of using visualisations to represent object-orientation visually. The visualisation should be able to represent the object-oriented features of an application and provide sufficient depth to allow comprehension of some of the more complicated features of the JVM. Due to the JVM’s high level of complex- ity, a direct visual representation would not be beneficial as a learning tool for beginning programmers- a level of abstraction over the architecture will ensure information provided is relevant and easy to comprehend while maintaining a good level of depth A detailed user interface should also allow the scope of the visualisation to change from an overarching perspective of the entire application to a more precise, detailed view of individual components. The visual interface should also be simple, tidy and consistent with the features it is intended to demonstrate. 1.4 Dynamic Visualisation This application will be implementing a dynamic visualisation rather than a static one; this means that as the state of the application changes, the visualisation will update to demonstrate these changes in real time. A dynamic visualisation was chosen to be implemented for this project as it provides significantly better feedback on a program than a static one would in a similar context. This also adds a level of complexity to the application, as it demands the virtual machine be run in real time alongside the visualisation rather than separately, and measures must be taken to ensure it is properly controlled by the user without compromising the integrity of the application. For the application to run dynamically, the VM will be running in parallel alongside the application. This requires the visualisation to be able to start and 14
stop the VM at particular intervals. This approach was chosen over the alternate possibility of running the application before the visualisation and generating a profile with all information about the VM stored within it. Generating a profile has the advantage allowing users to move forward and backwards in execution, but runtime errors would be detected before the profile completes whereas in a real-time running VM, a runtime error is detected once the visualisation reaches it in execution. This allows students to witness exactly where it occurred, and monitor the state of the application before it occurred. Another benefit of live execution is other possible errors that cannot be detected easily, such as infinite loops. A profile cannot recreate an infinite loop as effectively as running the program in real time, as the error can be observed as the program is executing. 1.5 Experimental Procedure Once the visualisation has been completed, to determine whether or not it will be effective as a teaching tool an experimental procedure will be developed. The experiment will use volunteers from the target audience, namely first or second year programming students, and test the effect the visualisation has on their understanding of programs written in Java, as well as the Java language itself. It should include presenting low-level material about the Java language to students, with one of the student groups having access to the visualisation and a control group that does not, to determine if the first group has a better understanding by the end of the experiment. Execution of the experiment is an optional aspect of the project as time con- straints as well as the complexity of the software necessary for the experiment to go ahead are considered to make any formal experimentation unfeasible. How- ever, plans for such an experiment as a possible segment for future work on this project make the development of an experimental plan a desirable component. The ability to formally test the capabilities and value of the application will pro- vide insight into whether the visualisation is an effective learning tool, as well as whether or not a low-level approach such as the one employed in this project has merit in practical teaching applications. These conclusions are necessary to verify the validity of the approach, and the project in general. 15
Chapter 2 Java Virtual Machine Features As the visualisation takes its information directly from the Java Virtual Machine, and is intended to demonstrate runtime behavior of a program, the structure of the visualisation very closely follows the model used for the JVM itself. The two main features of the JVM, the object heap and the frame stack are used as the basis for the two main views of the visualisation. Abstraction has been used in devising the visualisation to hide information considered irrelevant to the purpose of the application and to ensure the information presented is as clear as possible. 2.1 Object Heap The main memory storage aspect of the JVM is the object heap, where all declared objects are stored and referenced. The heap has no organization or structure, but each object is dynamically allocated the amount of memory necessary for encapsulation, and their memory locations are used by other objects and methods to access their individual components. A typical Java application stores a series of objects over the duration of its execution. Due to the tight integration of the OO paradigm into Java, all ob- jects exist as references to another object with the exception of the main class, which instead stores objects as local variables of its method or static fields of its containing class. All objects in Java are stored in memory and are inaccessible for access by other objects unless they hold a reference, either a field or local variable, that points to the object’s location. Field and local variables function in Java in a very similar fashion to pointers in C. Each one holds a memory location for the object it refers to, and using the ’.’ operand in conjunction with a variable allows access to the object the reference points to. 16
In this form of memory structure, each object is intrinsically linked to one another through a hierarchical structure. Each object holds a series of variables that point to other objects. At the beginning of every Java application, in the main method a series of objects are defined and held as references, which in turn create more objects and store them as references. Each object created in a Java application can be traced back to the main method, and the initialisation of the main method’s containing class. This hierarchical structure forms the basis for the first component of the visual model. The object heap, although itself without an implemented organization structure, is organized by the way in which objects create and reference one another inherent with Java’s mechanism of execution. At the bottom of this structure are the variables accessible to the main method, with every other Java object that has been created and stored able to trace its containing objects’ references back to the main node. This creates a tree-like structure, which can be represented visually. There are some features of the Java language that must be taken as special cases to be accounted for in this model. Primitive values in Java such as integers and bytes are not stored in the same way objects are- references to primitives contain the value of the primitive itself, rather than a pointer to an address in memory where the primitive is stored. As a consequence, primitives do not appear as nodes on the tree. Static variables among classes also break the paradigm as a static variable does not belong to any one object. It is considered that each object instead has a reference to the same variable, and as a consequence the static variable of a class appears as a reference to each object of the class, with each reference pointing to the same object. In addition to this, there are several other possible mechanisms that can be taken in the Java language that break this model. These are each handled indi- vidually and discussed in the visualisation section This component of the model is designed to help students better understand how Java handles object references and its pointer-based structure. Making a modification to a object through a single reference will influence all references that point to that object, which can cause unexpected results in a program if not properly catered for- this model demonstrates when this will occur, and to what effect. It also allows students to observe the structure of their applications from a gestalt perspective and better understand what objects exist and their state at different points in execution. 17
2.2 Frame Stack The second feature considered relevant to this application is the frame stack stored in the JVM. For every active thread in an application, the JVM stores a frame stack that measures its current position in execution. The frame stack consists of a series of frames that demonstrate each method a Java program is in the process of calling. Every time a new method is called, a frame is added to the stack that stores information about the exact location of the execution and the state of local variables within that method. Frames are removed when a method completes its execution. The visualisation developed has a relatively literal interpretation of the frame stack, and visualises it as a series of ‘blocks’ that are stacked on top of one another. Each block contains a list of information found in the frame, namely the containing object’s type, the method, line number and a list of all local variables contained in the frame. The frame stack visualisation is designed to be completely compatible with the heap visualisation, and any local variables that are displayed in the frame stack should be viewable on the object tree. As with the frame stack in the JVM, frames are added and removed as methods are entered and exited. As it is not intended for this visualisation to support multiple threads of execution, only one frame stack, that belongs to the main stream of execution, will be present. The frame stack allows students to understand a number of features regarding the method in which Java executes code. The linear progression can be observed as the frame stack updates from one line to the next, and points where it devi- ates from a linear execution, such as to initialize field variables that have been declared and assigned outside of the constructor, will be viewable. It also makes an important distinction between field variables and local variables- where field variables are always accessible and have technically no scope of accessibility out- side of Java’s imposed member access structure, local variables are only accessible within the method that they are declared in, and reference to them is removed as soon as the method terminates. It will also demonstrate scoping within meth- ods, with local variables declared within conditional statements or loops being dereferenced once the frame has moved outside of that area of execution. Finally, as it allows the inspection of the value of variables, it may help with some very rudimentary debugging processes, such as detecting infinite loops. 18
Chapter 3 Visualisation Outline The JVM visualisation developed for this project, TraneViz, is capable of compil- ing and launching a Java program’s source code input and producing a dynamic runtime visualisation of the state of the application. The dynamic visualisation generated by TraneViz uses information taken directly from the runtime state of the VM used by the embedded application. The application demonstrates many of the Java programming languages features and how they apply to individual applications. 3.1 Object Reference View The object reference view takes its information directly from the JVM heap, but the subject of the visualisation is the relationship between objects and variables. The diagram forms a hierarchical tree-like structure, consistent of a series of nodes. Each node represents an instance variable held by an object, and each node is labeled with the variable name. The root of each tree is in the center of the diagram, with each sub node forming a circular shape around it. A connection between nodes indicates that the child node, the node farthest from the center of the circle is an instance variable owned by the object the parent node is pointing to. Figure 3.1 demonstrates a simple example of a series of objects connected by references. The center of the tree, bookcase, is a variable pointing to an object that holds three books- horrorbook, mysterybook and fairytale. Each are connected in a circular pattern outside of bookcase. In addition to this, mysterybook also holds a reference to a bookmark object, connected to the outside of the mysterybook node. To ensure the simplicity of the diagram, only objects created from classes that were written directly by the programmer are displayed on the tree diagram. 19
Figure 3.1. An application visualized in the Object Reference View All other objects, including those defined in the Java SE6 API are hidden from view. This is because many pre-defined or imported classes can have very large definitions with lots of variables, that can overwhelm the diagram and make navigation significantly more challenging. Hiding these objects from the tree diagram ensures all information is relevant and easy to comprehend. 3.1.1 The Main Node At the root of every single tree is the main node. When any Java applica- tion starts, typically the first operation that is performed is executing the main method, and the termination of any application occurs when this method is ex- ited by the main thread of execution. However, as the main method is static, all Java applications technically begin with no object being created or referenced; with the visual model specified, there is no technical ‘root’ to the application’s reference structure. To solve this issue, at the root of every tree is what is referred to as the main node. The main node is not technically an object, but represents the stem of all references from the beginning of the application. These references include every local variable that is declared and initialized in the main method, and any static field variables owned by the main method’s containing class. These two sources combined represent the beginning point for every object that is created at the start of an application. On the tree diagram, the main node is always given the 20
label ‘main’, reflecting its method name. An example of a main node can be seen in Figure 3.2. 3.1.2 Color Coding To increase organization of the display, a color legend is employed. Each class defined by the user is given a unique color code that determines the color of their nodes on the tree. Each color is weighted depending on the number of references an object holds- the more objects, the greater the saturation of the color. A legend displaying the code from colors to objects is displayed in the top left corner of the screen- mousing over any of the objects in the legend will highlight each node of that class with a gradient that demonstrates the variation in saturation- the lowest saturation of the gradient is used for objects containing no variables, while the highest saturation shows objects containing over ten variables. Figure 3.2 shows a object reference visualisation next to its color code. The application is a representation of a zoo, filled with animals of different types. Each animal is assigned a unique color code depending on its class, which is then used to represent them on the tree diagram. In this image, none of the animals contain any object references, and have a very low color saturation, while the center node, having six references has a relatively high color saturation. Saturation can be compared when observing the gradient on the tree- the lowest saturation makes up the left portion of the gradient, and the highest saturation makes up the right. In this image, the class ‘Chimpanzee’ has been moused over, which has caused the variable ‘ham’ to be highlighted- this helps in identifying individual classes, particularly if some are of a similar color such as dog and rhinoceros. 3.1.3 Querying Nodes The diagram’s interface allows the selection of variables to be performed with a single mouse click, to query that particular. When a node is selected, a red box surrounds the node, and its contents are displayed in the variable view window on the right side of the screen. The variable view window contains a table of all instance variables currently stored within that node- this information includes the name and type of each of the variables, and where appropriate, the value of the variable. Instances of classes not defined by the user are not displayed on the tree diagram, but they are displayed on variable view, so that they can be observed. This is also true for all primitive values, which are not technically objects but still require a display. Selecting any node will display not only the object references it possesses, but all primitive values as well. The variable view also displays any 21
You can also read