Clock Synchronization in Decentralized Systems
Latency aware peer-to-peer overlay topologies based on firefly synchronization models

SIDDHARTH SHARMA

Master's Thesis
Supervisor: Asst. Prof. Jim Dowling, KTH/SICS
Examiner: Prof. Seif Haridi, KTH/SICS

Stockholm, February 2012
TRITA-ICT-EX-2012:27
Dedicated to my guiding lights Preeti & Niranjan Sharma
Acknowledgment

I would, first of all, like to thank Dr. Jim Dowling for providing me with the opportunity to work on an immensely interesting research problem and for guiding me throughout the thesis work with his able technical and non-technical mentorship, through the ups and downs of the project. I am grateful for each instance when he was there to help me when I was in need. I am very thankful to my examiner, Prof. Seif Haridi, for his assistance, guidance and feedback throughout this project. I would also like to thank the Swedish Institute of Computer Science (SICS) for giving me the opportunity to work on this project. I also wish to thank all of my friends, who are my driving force not only for this project but for my whole life. And finally, I would like to thank my family for their love, sacrifice and constant support, without whom I would not have come this far.

Stockholm, Sweden
Siddharth Sharma
February 2012
Abstract

Clock synchronization between different entities in a system has been approached using two main methods, decentralized and centralized synchronization. Examples of centralized synchronization include the Network Time Protocol (NTP) and the use of the Global Positioning System (GPS) as a central clock. The synchronization of clocks in distributed systems is a well-studied and difficult problem. Current solutions exhibit a significant convergence delay and a non-perfect synchronization window.

This thesis approaches the problem of clock synchronization in decentralized systems by analysing and using pulse-coupled oscillator models, like the Kuramoto model and the Mirollo-Strogatz firefly model, while leveraging knowledge of internode latencies to form a biased gradient overlay topology, and by creating a custom firefly synchronization model.

The node coordinates are indicative of the internode latencies if they are assigned statically using a latency data set or through a dynamic coordinate protocol, which assigns coordinates according to the current internode latencies. The coordinates are then used to create an overlay over the physical topology by favouring links with low internode latency. Neighbours are selected on an information-need basis. Logical time on the nodes is synchronized along with the phase synchronization, using fine-tuned algorithms to set a common timestamp on each cycle and to optimize the synchronization window and the convergence time.

The results show that gradient firefly synchronization is efficient in both convergence time and synchronization window. The protocol works better with a single cluster of nodes than with multiple clusters. The thesis concludes that latency-aware gradient firefly synchronization protocols can be used per cluster, and that performance can be improved further by incorporating dynamic coordinate protocols.
Sammanfattning

Klocksynkronisering mellan olika enheter i ett system brukar vanligtvis angripas med en av två huvudsakliga metoder, decentraliserad eller centraliserad synkronisering. Exempel på centraliserad klocksynkronisering inkluderar Network Time Protocol (NTP) och användandet av Global Positioning System (GPS) som en centraliserad klocka. Synkroniseringen av klockor i distribuerade system är ett välstuderat område med många svåra problem. Nuvarande lösningar innefattar alla betydande konvergensproblem och ett icke-perfekt synkroniseringsfönster.

Den här avhandlingen tar sig an problemet med klocksynkronisering i decentraliserade system genom att analysera och använda sig av pulskopplade oscillatormodeller, såsom Kuramotos modell och Mirollo-Strogatzs eldflugsmodell, samtidigt som den lutar sig mot kunskap kring latens mellan noder för att forma sig en viktad gradient överlagringstopologi och skapa en anpassad eldflugs-synkroniserings-modell.

Systemnodskoordinaterna är indikativa för latensen mellan noder om de tilldelas statiskt med hjälp av ett set av latensdata eller dynamiskt genom ett koordinatprotokoll, vilket tilldelar koordinater enligt den aktuella latensen mellan noder. Koordinaterna används sedan för att skapa en överlagring över den fysiska topologin genom att ha ett större antal länkar med lägre latens mellan de olika noderna. Valet av grannar styrs efter vilket informationsbehov de har. Logisk tid på noderna sätts synkroniserat tillsammans med fassynkroniseringen genom användandet av fininställda algoritmer för att skapa en gemensam tidstämpel på varje cykel och för att optimera synkroniseringsfönstret och konvergenstiden.

Resultaten visar att den gradienta eldflugs-synkroniseringen är effektiv vad gäller både konvergenstid och synkroniseringsfönstret. Protokollet fungerar bättre med ett enstaka kluster av noder, jämfört med multipla kluster. Avhandlingen kommer också fram till att ett latensmedvetet gradient eldflugs-synkroniseringsprotokoll kan användas per kluster och att resultatet kan förbättras ytterligare genom införandet av dynamiska koordinatprotokoll.
Contents

1 Introduction
  1.1 Problem Statement
  1.2 Approach
  1.3 Report Structure
2 Background and Related Work
  2.1 Clock Synchronization
  2.2 Synchronization of Pulse-Coupled Oscillators
    2.2.1 Winfree Model
    2.2.2 Kuramoto Model
    2.2.3 Mirollo-Strogatz Model
    2.2.4 Ermentrout Model
    2.2.5 Heartbeat Synchronization using Firefly Models
    2.2.6 Clock Synchronization using a Modified Kuramoto Model
  2.3 Gradient Clock Synchronization
  2.4 Latency-based Network Coordinates
    2.4.1 Vivaldi
    2.4.2 Pyxida
    2.4.3 Htrae
    2.4.4 Hierarchical Vivaldi
3 Model
  3.1 Selection of the Base Firefly Model
  3.2 Pre-implemented Logic for Firefly Synchronization
  3.3 Latency Coordinate System
    3.3.1 Static Allocation
    3.3.2 Coordinate Protocol
  3.4 Synchronization of Logical Clock
    3.4.1 Overlay Network
    3.4.2 Neighbour Preference Function
  3.5 Implementation
4 Results and Evaluation
  4.1 Performance on Generated Static Topologies
    4.1.1 Single Gaussian Cluster
    4.1.2 Performance on Multiple Gaussian Clusters
    4.1.3 Performance on a Doughnut Cluster
  4.2 Performance of Random Overlay
  4.3 Performance of Gradient Overlay
    4.3.1 Neighbour Preference Parameters
    4.3.2 Degree Distribution Parameters
5 Conclusion and Future Work
  5.1 Conclusion
  5.2 Future Work
Bibliography
List of Figures

3.1 Single Gaussian Node Cluster
3.2 Doughnut Cluster
3.3 Gaussian Node Cluster Pair
3.4 Multiple Gaussian Node Clusters
3.5 Protocol stack running on each node
3.6 Class structure in gradient overlay topologies using latencies from King data sets
3.7 Class structure in random topologies using latencies from King data sets
3.8 Class structure in gradient overlay topologies using latencies mapped to location
3.9 Class structure in random topologies using latencies mapped to location
4.1 Synchronization of Flashes
4.2 Flashes in the 30th Synchronization Window for Figure 4.1
4.3 Degree Distribution of a Single Gaussian Cluster
4.4 Degree Distribution for two Gaussian Clusters
4.5 Degree Distribution for five Gaussian Clusters
4.6 Degree Distribution for a Doughnut Cluster
4.7 Standard Deviation Results on Varying Topology and Overlay for Generated Static Topologies
4.8 Synchronization Window Results on Varying Topology and Overlay for Generated Static Topologies
4.9 Random: Degree Distribution while Varying Degree
4.10 Random: Standard Deviation Results on Varying Degree
4.11 Random: Synchronization Window Results on Varying Degree
4.12 Degree Distribution while Varying Neighbour Preference Parameters
4.13 Standard Deviation Results on Varying Alpha
4.14 Synchronization Window Results on Varying Alpha
4.15 Standard Deviation Results on Varying Beta
4.16 Synchronization Window Results on Varying Beta
4.17 Standard Deviation Results on Varying Gamma
4.18 Synchronization Window Results on Varying Gamma
4.19 Degree Distribution while Varying Core Radius
4.20 Standard Deviation Results on Varying Core Radius
4.21 Synchronization Window Results on Varying Core Radius
4.22 Degree Distribution while Varying Minimum Degree
4.23 Standard Deviation Results on Varying Minimum Degree
4.24 Synchronization Window Results on Varying Minimum Degree
4.25 Degree Distribution while Varying Maximum Degree
4.26 Standard Deviation Results on Varying Maximum Degree
4.27 Synchronization Window Results on Varying Maximum Degree
List of Algorithms

3.1 Logical Clock Synchronization: Highest Received ID v1
3.2 Logical Clock Synchronization: Highest Received ID v2
3.3 Logical Clock Synchronization: Average Sync Number Received
3.4 Logical Clock Synchronization: Guarded Listening Period
List of Abbreviations

BMC   Best Master Clock
DNS   Domain Name System
GPS   Global Positioning System
IEEE  Institute of Electrical and Electronics Engineers
MIT   Massachusetts Institute of Technology
NTP   Network Time Protocol
P2P   Peer To Peer
PRC   Phase Response Curve
PTP   Precision Time Protocol
RTT   Round-Trip Time
TAI   Temps Atomique International
UTC   Coordinated Universal Time
Chapter 1
Introduction

Synchronization of natural or man-made cycles is one of the keys to the functioning of the universe as we know it. Synchronization of the pacemaker cells in a heart is crucial in keeping the beats uniform and within controllable limits, and hence in keeping us alive. The circadian rhythm of our body governs numerous system cycles to keep our body functions cyclic, predictable and under control. The circadian clock is in turn sourced from the brain's alpha wave, which acts as the grandmaster of the human body's clockwork. Certain species of fireflies are able to synchronize their flashing without a central coordinator. Several firefly models have been proposed for synchronizing clocks in distributed systems without a central coordinator, although all such models have been based on random networks. The goal of this thesis is to investigate the role of latency-aware P2P topologies in improving decentralized clock synchronization algorithms by biasing neighbour selection to favour connectivity to nodes with low latencies, while avoiding partitioning.

In the context of communication systems, synchronization of logical clocks is vital for communication between the nodes of a network. The success of communication in a system of nodes largely depends on the sense of reference time at the nodes. A common reference time is easy to negotiate between two nodes, but as the system size increases, issues involving convergence time and synchronization window come into the picture. There are many applications which can benefit from highly time-synchronized systems.

1.1 Problem Statement

Accurate synchronization of clocks in a decentralized system is an open problem, with the best emission window lengths being in the order of milliseconds. Phase synchronization of a system of pulse-coupled oscillators is a generic phenomenon which also covers synchronization of clocks in a distributed system using unidirectional pulses or messages. Existing models for synchronizing firefly agents in
distributed systems are based on random networks. Existing results show that agents can synchronize their clocks over the open Internet to around 30 milliseconds. However, 30 milliseconds is too coarse to build many useful applications on.

1.2 Approach

The approach in this thesis is to investigate the use of gossiping and aggregation algorithms to build a latency-aware peer-to-peer topology. Algorithms from the literature on firefly synchronization are used to synchronize clocks, in original or modified form. Existing literature on firefly synchronization and pulse-coupled oscillators was studied and an appropriate algorithm selected for further development. The selected algorithm is then modified to synchronize the logical clocks running over the pulses.

Neighbour selection aims to pick the best possible subset of the network to communicate with efficiently while keeping performance high. In short, this is achieved by reducing communication with unreliable, low-performance nodes of the system to a bare minimum. Latency is one measure of lower performance, but latency alone does not reduce performance. The variance of latency, which is found to be correlated with latency itself, is a major contributor of error in the calculation of the round-trip time between nodes, which is essential to clock synchronization, along with the errors that might be inherent in the nodes' hardware clocks. As mentioned above, it is desirable to have a gradient in the overlay topology, with better-performing nearby neighbours more densely connected than the nodes far away on the edges, so that most of the network converges as fast and as closely as possible to the system time. The overlay topology runs over an actual topology, which can either be user-generated, for the purpose of understanding the behaviour of the synchronization protocol under different layouts, or built using data from a real-world scenario.

The different models are integrated in the implementation, and the various defined parameters are tuned to get a measure of the optimal values for a particular model or algorithm.

1.3 Report Structure

Chapter 2 provides background information about clock synchronization and discusses the related work done in the area of pulse-coupled oscillator models and network coordinate protocols which were analysed for this thesis. It specifically describes the related work used as a basis for this thesis work.

Chapter 3 contains details about the approach followed in working on the problem statement. It describes the alternative models considered and implemented, and
the reasons for selecting a particular approach. Information about the environment and tools used for the implementation is also included in that chapter.

Chapter 4 provides the results of the simulations based on the different models developed. The results are explained and evaluated by comparing them across the different static topologies and between the random and gradient overlay cases, for the various parameters defined in the models.

Chapter 5 concludes the thesis by discussing the evaluation of the results from the previous chapter and reflects on the significance of latency-aware gradient overlay topologies for clock synchronization in decentralized systems. The scope of future work is also discussed there, where possible improvements to the thesis work are detailed along with the identified areas for further development.
Chapter 2
Background and Related Work

This chapter covers the background knowledge required for the thesis, describes the established models in the field, and discusses related work based on these models.

2.1 Clock Synchronization

Many algorithms have been developed over the years to synchronize clocks over a network. Clock synchronization can be classified as centralized or decentralized, depending on whether or not the system has a central source of time. It can also be classified as internal or external, based on whether the network sources its time from an external clock.

Centralized Clock Synchronization

Centralized clock synchronization refers to the category of protocols in which one or more nodes in a system are selected or elected to be grandmasters, which keep time for the system and act as a reference for the rest of the system. The adjustments to individual clocks can be calculated at the time servers or at the clients, depending on the algorithm used.

The Network Time Protocol, or NTP [20], is the most commonly used time synchronization method over the Internet. Since version 3 of NTP, an intersection algorithm [21], a modified version of Marzullo's algorithm [18], is used to synchronize time within a distributed system of erroneous NTP servers.

The Precision Time Protocol, or PTP [6], is used to synchronize clocks on a computer network and is defined in the IEEE 1588 standard. It uses a distributed selection algorithm called the Best Master Clock (BMC) algorithm, which selects a master clock based on the quality of the clocks, among other parameters.
Decentralized Clock Synchronization

In decentralized clock synchronization, the system of nodes does not have a grounded sense of time. The clock synchronization has local significance within the system and may or may not map to a commonly used external time reference such as UTC. General synchronization methods in this category rely on the sequence of events and on causality to obtain a sense of time. A couple of algorithms used for decentralized clock synchronization are mentioned below.

Lamport Timestamps  Leslie Lamport proposed a mechanism in [16] which uses the happened-before (→) ordering to numerically calculate a logical clock. This logical clock is specific to the process being used for event generation; hence there can be multiple logical clocks specific to different disconnected processes in a distributed system.

Vector Clocks  Vector timestamps based on a partial ordering of events were proposed by Fidge in [10]. In this method a local logical clock is maintained by each node or process in a vector, and the local clock is incremented before the vector is sent on events. The local vector logical clocks are updated to the highest of the timestamps in the received vector, while the node's own logical clock is added to the vector if it is not already present.

Internal Clock Synchronization

The methods which do not rely on external time sources, for example in intranets, can be classified under internal clock synchronization. The Berkeley and Cristian algorithms are two methods optimized for intranets.

Berkeley Algorithm  The Berkeley algorithm proposed in [12] elects a master in the system, which periodically receives time updates from the slaves and, after estimating the RTT, sends back the positive or negative adjustments to be applied at the slave end.

Cristian's Algorithm  Cristian's algorithm [4] is a probabilistic method to synchronize clocks over low-latency networks, where the RTT is small compared to the required precision.

External Clock Synchronization

This classification is used for systems in which a single node or a group of nodes communicates with an external time source, for example UTC, for reference. Examples of this are the Network Time Protocol and GPS clock synchronization.
GPS Clock Synchronization  This is one of the most accurate methods of time synchronization currently available. The set of 24 navigation satellites operated by the United States Department of Defense can be used to synchronize clocks connected to GPS receiver stations to GPS time, to the order of 50 ns. GPS time is offset by a few seconds from TAI.

2.2 Synchronization of Pulse-Coupled Oscillators

Synchronization of pulse-coupled oscillators is one of the approaches to synchronizing clocks in a decentralized system. The first major breakthrough in the modelling of a group of pulse-coupled oscillators was the work done by Art Winfree in [25]. Yoshiki Kuramoto improved upon the Winfree model by simplifying the approach to the coupling interaction in [15], enabling an analytical solution of the model. One of the biological systems studied in the synchronization of pulse-coupled biological oscillators is a group of fireflies. Multiple models have been proposed for firefly synchronization, notably the Mirollo-Strogatz firefly model developed by Rennie Mirollo and Steven Strogatz in [22], after which Ermentrout proposed an adaptive model in [8].

2.2.1 Winfree Model

Before Winfree, the models used to solve an N-body problem of mutually coupled second-order oscillators involved the solution of complex 2N-th order non-linear differential equations. Winfree approached the problem by restricting the scope to weak coupling interactions.

Winfree modelled the oscillators as a group of nodes oscillating together with different initial phases as well as different natural oscillating frequencies. All nodes in this system possess identical influence and sensitivity over the whole cycle, through which they influence, and are influenced by, the rest of the pulse-coupled oscillators. Both influence and sensitivity are functions of the phase of an oscillating node. Equation 2.1 represents the Winfree model.

\dot{\theta}_i = \omega_i + \left( \sum_{j=1}^{N} X(\theta_j) \right) Z(\theta_i) \qquad (2.1)

where θ_i is the phase of the oscillator itself, ω_i is the natural frequency of the oscillator, i varies from 1 to N, θ_j is the phase of a remote oscillator, X is the influence function and Z is the sensitivity function. It is assumed that the influence and sensitivity of each oscillator in the system are the same.

Winfree's approach to the problem was beneficial in obtaining qualitative rather than quantitative results. Kuramoto was influenced by the work of Winfree and went on to propose improved models for pulse-coupled oscillator systems.
2.2.2 Kuramoto Model

Kuramoto assumed, like Winfree, that each oscillator is coupled to the collective rhythm generated by the whole population. He improved upon the Winfree model by simplifying the way the influence of the coupled oscillators affects the phase and frequency of an oscillator in the system. He defined the model in terms of the coherence between the phases or frequencies of the oscillators, the coupling strength between the oscillators, and the mean phase of the oscillator system.

\dot{\theta}_i = \omega_i + K r \sin(\psi - \theta_i) \qquad (2.2)

where θ_i is the phase of the oscillator itself, ω_i is the natural frequency of the oscillator, and i varies from 1 to N. The mean phase ψ and coherence r are the mean-field quantities of the system and, along with the coupling strength K, they influence the frequency of the oscillators to move closer towards the mean phase. Coupling strength and coherence are directly proportional and have a positive feedback between them: if the coherence increases, the coupling strength increases, which in turn increases the coherence, and so on.

2.2.3 Mirollo-Strogatz Model

Mirollo and Strogatz developed a mathematical model to explain the synchronization of fireflies. They describe a phase response curve (PRC), which determines the extent of the shift applied to the phase of the local firefly's flash cycle, based on the current local phase, when a flash is received from a peer. The variable x, which is a function of the phase φ, is continuous because the function, or PRC, is continuous.

x = f(\phi), \qquad f : [0, 1] \to [0, 1], \quad f(0) = 0, \; f(1) = 1, \quad f' > 0, \; f'' < 0

The variable x can be considered a voltage which, when it reaches 1, triggers a flash and falls back to zero. The voltage then rises towards one again according to the node's natural frequency, and whenever a flash is received, the voltage x is incremented by a small value ε.

x' = \min(x + \varepsilon, 1), \qquad \phi' = f^{-1}(x') \qquad (2.3)

where x' and φ' are the new values of the voltage and phase after receiving a flash. This model assumes that all nodes have the same initial frequency and that it does not vary from the natural frequency over the course of time. The model is also easily affected by message delays and drops.
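As a small illustration of the Mirollo-Strogatz update rule in equation 2.3, the sketch below advances a fully connected population of identical oscillators in discrete phase steps. The concave curve f(φ) = √φ (f(0) = 0, f(1) = 1, f' > 0, f'' < 0) is only an example choice, and the population size, coupling strength and step size are made-up values; none of this is taken from the thesis implementation.

import java.util.Random;

// Minimal simulation of the Mirollo-Strogatz update rule (equation 2.3)
// on a fully connected population of identical oscillators.
public class MirolloStrogatzSketch {
    static final int N = 10;            // number of oscillators
    static final double EPS = 0.05;     // coupling strength (epsilon)
    static final double DT = 0.001;     // phase advance per simulation step

    static double f(double phi)      { return Math.sqrt(phi); }   // x = f(phi), example concave PRC
    static double fInverse(double x) { return x * x; }            // phi = f^-1(x)

    public static void main(String[] args) {
        Random rnd = new Random(42);
        double[] phase = new double[N];
        for (int i = 0; i < N; i++) phase[i] = rnd.nextDouble();   // random initial phases

        for (int step = 0; step < 200000; step++) {
            for (int i = 0; i < N; i++) phase[i] += DT;            // identical natural frequencies

            for (int i = 0; i < N; i++) {
                if (phase[i] >= 1.0) {                             // oscillator i flashes and resets
                    phase[i] = 0.0;
                    for (int j = 0; j < N; j++) {
                        if (j == i) continue;
                        // Receiver j jumps up by epsilon on the voltage curve (equation 2.3).
                        double x = Math.min(f(phase[j]) + EPS, 1.0);
                        // If pushed to threshold it is absorbed and resets too; the induced
                        // flash is not re-propagated in the same step (a simplification).
                        phase[j] = (x >= 1.0) ? 0.0 : fInverse(x);
                    }
                }
            }
        }
        for (int i = 0; i < N; i++) System.out.printf("oscillator %d phase %.4f%n", i, phase[i]);
    }
}

After a few hundred cycles the printed phases should be tightly clustered, which is the absorption behaviour the model predicts for all-to-all coupling.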
2.2.4 Ermentrout Model

Ermentrout proposed an adaptive model for the synchronization of flashes between fireflies, specifically the species Pteroptyx malaccae, which exhibits a very high order of synchronization and was also used by the Mirollo-Strogatz model. The adaptive nature of the model refers to taking into account naturally differing biological entities, and hence different natural flashing frequencies, which the Mirollo-Strogatz model assumes to be the same. The fireflies can also shift their frequencies up or down, within constraints, to bring their flashes in sync with the system frequency and phase.

The model can be represented mathematically as the calculation of a new frequency for the node, as shown in equation 2.4, whenever it receives a flash from one of its peers.

\omega' = \omega + \varepsilon(\Omega - \omega) + g^{+}(\phi)(\Omega_l - \omega) + g^{-}(\phi)(\Omega_u - \omega) \qquad (2.4)

g^{+}(\phi) = \max\left( \frac{\sin 2\pi\phi}{2\pi},\, 0 \right), \qquad g^{-}(\phi) = -\min\left( \frac{\sin 2\pi\phi}{2\pi},\, 0 \right)

where Ω is the natural frequency of the node, Ω_l and Ω_u are the lower and upper limits of the frequency the node can flash at, and ω is the actual frequency of the node. The phase variable φ is not updated by the algorithm as in the Mirollo-Strogatz model, but it determines whether the new frequency is shifted up towards Ω_u (when a flash is received in the second half of the local cycle) or down towards Ω_l (when a flash is received in the first half of the local cycle). A small code sketch of this frequency update is given at the end of Section 2.2.

2.2.5 Heartbeat Synchronization using Firefly Models

Work done by Babaoglu et al. in [2] synchronizes local heartbeats on each node of a distributed system, which helps applications that rely on cycle-based execution to have common cycle boundaries with the rest of the system. They analysed four synchronization protocols, including the Mirollo-Strogatz and Ermentrout models, and chose the Ermentrout model for its better performance while making fewer assumptions about the physical properties of the nodes.

2.2.6 Clock Synchronization using a Modified Kuramoto Model

In his doctoral thesis [24], Scipioni implements a modified version of the Kuramoto model for pulse-coupled synchronization of oscillators to achieve internal clock synchronization. He uses techniques to obtain clock estimates and converges towards the target clock value using convergence functions over the clock estimates.
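As referenced in subsection 2.2.4, the following is a small sketch of the frequency update of equation 2.4. The numeric bounds and ε below are example values, not parameters taken from the thesis or from [2].

// Sketch of the Ermentrout frequency update (equation 2.4).
public class ErmentroutUpdateSketch {
    static final double OMEGA_NATURAL = 1.0;   // Omega: natural frequency (example value)
    static final double OMEGA_LOWER   = 0.9;   // Omega_l
    static final double OMEGA_UPPER   = 1.1;   // Omega_u
    static final double EPSILON       = 0.01;

    // g+(phi) = max(sin(2*pi*phi) / (2*pi), 0)
    static double gPlus(double phi)  { return Math.max(Math.sin(2 * Math.PI * phi) / (2 * Math.PI), 0.0); }

    // g-(phi) = -min(sin(2*pi*phi) / (2*pi), 0)
    static double gMinus(double phi) { return -Math.min(Math.sin(2 * Math.PI * phi) / (2 * Math.PI), 0.0); }

    // New frequency after a flash is received at local phase phi, given the current frequency omega.
    static double onFlashReceived(double omega, double phi) {
        return omega
                + EPSILON * (OMEGA_NATURAL - omega)      // drift back towards the natural frequency
                + gPlus(phi)  * (OMEGA_LOWER - omega)    // flash in the first half-cycle: shift down
                + gMinus(phi) * (OMEGA_UPPER - omega);   // flash in the second half-cycle: shift up
    }

    public static void main(String[] args) {
        double omega = 1.0;
        // A flash received at phase 0.75 (second half of the cycle) nudges the frequency up.
        System.out.println(onFlashReceived(omega, 0.75));
        // A flash received at phase 0.25 (first half of the cycle) nudges it down.
        System.out.println(onFlashReceived(omega, 0.25));
    }
}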
2.3 Gradient Clock Synchronization

Fan and Lynch worked on the problem of gradient clock synchronization in [9], using a distributed system of nodes with hardware clocks, in which the nodes adjust their logical clocks based on the hardware clock and the exchanged messages. The uncertainty in message delivery between any two nodes is treated as the distance between them. Nodes closer to each other should be more tightly synchronized than nodes far away, which helps reduce the error probability in calculating the clock adjustment.

2.4 Latency-based Network Coordinates

To create a reliable and up-to-date latency-aware overlay on a network, we need a dynamic protocol which continuously adjusts the network coordinates according to the latest latency state. Some of the notable efforts in the area of coordinate protocols are described below, starting with Vivaldi [5].

2.4.1 Vivaldi

The Vivaldi algorithm, introduced by Dabek et al. in [5], creates a Euclidean coordinate system for the nodes of a distributed system, in which the internode distances reflect the latencies between them. This is accomplished by modelling the connections between nodes as springs: the forces the nodes exert on each other depend on the RTT, and the system coordinates are dynamically updated based on the deviations measured in the RTT. The system coordinates also have an additional latency dimension, a height, for the local access link. A code sketch of this style of update is given at the end of Section 2.4.

2.4.2 Pyxida

Ledlie et al. [17] proposed Pyxida, an algorithm which is an extension of Vivaldi. Pyxida additionally keeps a time-limited cache of nodes and their measured RTTs, against which any new measurement for a node is compared, so that outlying RTT measurements are ignored. This helps the latency estimates stay closer to the true value.

2.4.3 Htrae

In [1], Agarwal and Lorch proposed the Htrae mechanism for latency coordinate generation. It also builds on Vivaldi, introducing a better initial state for the protocol by mapping IP addresses to geographical locations using GPS coordinates. This helps minimize the changes in the internode latency coordinate distances and hence reduces the convergence time.
2.4.4 Hierarchical Vivaldi

Elser et al. [7] have worked on a modified version of the Vivaldi protocol in which the embedding error of peer nodes is predicted in order to optimize the selection of peers. This method is targeted at improving the coordinate, and hence latency, predictions.
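As referenced in subsection 2.4.1, the sketch below shows a minimal Vivaldi-style spring relaxation in two dimensions, without the height term; the damping constant and sample values are illustrative only, and this is not the implementation used by any of the systems above.

// Sketch of a single Vivaldi-style coordinate update: each RTT sample pulls the
// local coordinate along the "spring" between the two nodes.
public class VivaldiSketch {
    double x, y;                       // local coordinate estimate
    static final double DELTA = 0.25;  // timestep / damping factor (example value)

    // Adjust the local coordinate from one latency sample to a peer.
    void update(double peerX, double peerY, double measuredRttMs) {
        double dx = x - peerX, dy = y - peerY;
        double dist = Math.sqrt(dx * dx + dy * dy);
        double error = measuredRttMs - dist;           // positive: we sit too close in the map
        // Unit vector pointing from the peer towards us (random direction if we coincide).
        double ux, uy;
        if (dist > 1e-9) { ux = dx / dist; uy = dy / dist; }
        else             { double a = Math.random() * 2 * Math.PI; ux = Math.cos(a); uy = Math.sin(a); }
        // Move along the spring direction in proportion to the prediction error.
        x += DELTA * error * ux;
        y += DELTA * error * uy;
    }

    public static void main(String[] args) {
        VivaldiSketch self = new VivaldiSketch();
        // Repeated samples to a peer assumed to sit at (10, 0) with a 25 ms RTT.
        for (int i = 0; i < 50; i++) self.update(10.0, 0.0, 25.0);
        System.out.printf("estimated coordinate: (%.2f, %.2f)%n", self.x, self.y);
    }
}

After the loop, the distance between the estimated coordinate and the peer should be close to the 25 ms sample, which is the behaviour the spring model is meant to converge to.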
Chapter 3
Model

In this chapter, the models influenced by the literature study, or newly developed, are discussed along with their pros and cons and the reasons for their consideration. Firefly synchronization models were selected from previous work for modification, as creating a new firefly synchronization model was out of scope for this thesis. Algorithms for logical clock synchronization were designed according to the observed behaviour of the synchronization. Static topologies were generated initially for the development of the firefly synchronization model, and the latencies were made dependent on the static locations. Different parameters were modelled for the degree distribution and neighbour preference.

3.1 Selection of the Base Firefly Model

The models identified and studied for use in this thesis are the Kuramoto model for pulse-coupled oscillator systems, the Mirollo-Strogatz firefly synchronization model [22], and the Ermentrout model [8] for firefly synchronization.

The Kuramoto model for pulse-coupled oscillators is an elegant model for explaining the synchronization between nodes which are mutually coupled to every other node in the system. In its original form, this model cannot be used directly for modelling firefly synchronization, as a typical firefly network graph is not fully connected; moreover, practical applications of the model do not require the nodes to be connected in a full mesh, and such a connection might not even be possible in some cases. There are variations of the Kuramoto model which implement limited-view connectivity for synchronization between nodes, but they were not utilized as part of this thesis.

As discussed in subsection 2.2.4 and in [2], the Ermentrout model is an extension of the Mirollo-Strogatz model which removes the simplifying assumption that all nodes in the system have the same initial cycle length. It gives more practical and realistic data for applications like wireless sensor networks, where there is a possibility of variable clock skew in the nodes and an indeterminate initial state
of the system, or dynamic joins and leaves.

Babaoglu et al. worked on heartbeat synchronization of overlay networks using the Mirollo-Strogatz and Ermentrout firefly models. This work is used as a starting point for the protocol and algorithm development in this thesis. The portion of the pulse-coupled synchronization which was already implemented as part of their work is explained in Section 3.2.

3.2 Pre-implemented Logic for Firefly Synchronization

A part of the logic and implementation for the clock synchronization already existed in the firefly heartbeat synchronization protocol developed in [2]. The firefly protocols, based on the Mirollo-Strogatz and Ermentrout models, were implemented on the PeerSim environment and synchronized the pulses generated by the nodes. This thesis builds on that implementation and introduces synchronization of timestamps to synchronize the logical clock which runs on the nodes. Multiple hooks were created in the protocol to read the latency coordinate information from different sources: static allocation using a data set, or a coordinate protocol.

3.3 Latency Coordinate System

Creating a latency-aware synchronization protocol requires a continuous sense of the latency between each peer and the node itself. This can be achieved effectively by assigning coordinates representing the internode latencies relative to the origin of a latency space. This latency space is n-dimensional, plus one extra dimension for local access latencies. An estimate of the internode latency, l, is made by adding the sum of the local access latencies of the peer and self nodes to the Euclidean distance in the remaining n dimensions, as in equation 3.1.

s = (s_0, s_1, s_2, \ldots, s_n), \qquad p = (p_0, p_1, p_2, \ldots, p_n)

l = (s_0 + p_0) + \sqrt{ \sum_{i=1}^{n} (p_i - s_i)^2 } \qquad (3.1)

where s and p are the self and peer latency coordinates respectively, s_0 and p_0 are the local access latencies, and the rest are coordinates in the n-dimensional latency space. The latency coordinates are allocated and updated using different methods and provided to the firefly synchronization protocol stack. These methods are discussed in the following subsections.
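Before turning to those methods, here is a small illustration of the latency estimate of equation 3.1; the coordinate values in the example are made up.

// Sketch of the internode latency estimate from equation 3.1: the sum of the two
// local access latencies (coordinate 0) plus the Euclidean distance over the
// remaining n dimensions.
public class LatencyEstimateSketch {

    // s and p are (n+1)-dimensional latency coordinates; index 0 is the access latency.
    static double estimateLatency(double[] s, double[] p) {
        if (s.length != p.length || s.length < 2)
            throw new IllegalArgumentException("coordinates must have the same length >= 2");
        double sumSq = 0.0;
        for (int i = 1; i < s.length; i++) {          // skip index 0, the access-latency "height"
            double d = p[i] - s[i];
            sumSq += d * d;
        }
        return (s[0] + p[0]) + Math.sqrt(sumSq);
    }

    public static void main(String[] args) {
        double[] self = {2.0, 10.0, -3.0};            // 2 ms access latency, 2-D position
        double[] peer = {5.0, 14.0,  0.0};            // 5 ms access latency, 2-D position
        System.out.printf("estimated latency: %.2f ms%n", estimateLatency(self, peer));
    }
}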
3.3.1 Static Allocation

Static allocation of coordinates is one of the methods for providing latency information to the system. A major advantage of such a method is that it reduces the overhead on the system of calculating the latencies itself, but it prevents the system from having an estimate close to the current state of the internode latencies.

Static allocation of latency coordinates was done using two methods: establishing a direct relation to a manual static mapping of location coordinates, and using data sets available from previous research on latency information from live production networks. In the case of static allocation, a Gaussian jitter was added to the Euclidean distances between latency coordinates when generating the internode latency, so that the data is closer to a real-life scenario in which latencies keep varying over the network. The jitter value also depends on the Euclidean distance, with its maximum value directly proportional to the distance, since an internode latency is understood to vary more when the latency is high.

Mapping to Static Location Coordinates

All nodes are assigned static two-dimensional location coordinates to emulate a geographical distribution. The different statically allocated topologies are as follows.

Single Gaussian Cluster  A Gaussian cluster of nodes is generated by distributing the nodes radially from the centre of the cluster in a Gaussian distribution. This is the simplest geographical topology used for simulating the clock synchronization in this project. A sample Gaussian cluster is plotted in Figure 3.1.

Doughnut Cluster  This geographical topology was generated to simulate a sparsely populated core of a cluster of nodes, relative to a cluster with a Gaussian distribution. The generated topology is shown in Figure 3.2.

Multiple Gaussian Clusters  Multiple Gaussian clusters are used to simulate a real-life distribution of groups of closely connected nodes over a wide geographical area. Coordinate protocols generate a similar coordinate system, but with higher dimensions. The purpose here is to determine whether the degree distribution and neighbour preference would work over multiple clusters. Two of the distributions used are shown in Figures 3.3 and 3.4.

Allocation using Data Sets

Another approach to statically allocating latencies is to read latency data from publicly available data sets like the MIT King data set and the Meridian King data set. These data sets have latency measurements taken over a period of time from live
networks, for example between DNS servers, using the King method [11]. The latency values from these data sets closely match real-world conditions and can be used for the simulation of the firefly clock synchronization.

Figure 3.1. Single Gaussian Node Cluster
Figure 3.2. Doughnut Cluster
Figure 3.3. Gaussian Node Cluster Pair
Figure 3.4. Multiple Gaussian Node Clusters

3.3.2 Coordinate Protocol

Latency coordinate protocols are useful for measuring the current internode latencies in the system and updating the n-dimensional logical coordinates in the latency space. For this thesis, Pyxida [17] was used as the coordinate protocol, as it is tried and tested in the Azureus BitTorrent application for looking up the best peers available on the Internet. It was initially planned to use the coordinate protocol service running on PlanetLab [3], mentioned on the Pyxida project's homepage, but that service was found to be unavailable. In light of the unavailability of the service, it was decided to create a layer between PeerSim's King data set access classes and Pyxida, to simulate the actual round-trip time (RTT) measurements for the simulation and to use the latency coordinate information from Pyxida to create the latency-aware overlay using preferred neighbours. This is intended to be useful for implementing the Pyxida service on a network like PlanetLab as part of future work.

3.4 Synchronization of Logical Clock

In the case of heartbeat synchronization, the objective is to achieve phase synchronization of the nodes in a distributed system. The nodes of the system do not communicate cycle IDs or timestamps, and hence heartbeat synchronization cannot be used for clock synchronization as it is. Without the communication of logical clock timestamps or IDs, a single heartbeat falls across two or more logical clock timestamps, depending on the initial distribution of flashes. The synchronization of timestamps was therefore done by passing the local logical clock IDs along with the flashes or pulses, and updating the logical clocks at the receiver nodes using different algorithms, in order to evaluate the performance of each method. These algorithms are stated in Algorithms 3.1, 3.2, 3.3 and 3.4.

3.4.1 Overlay Network

The network overlays are created using T-Man [13] running over the Newscast [14] peer sampling service to generate different topologies, including the random and gradient networks. In the base code that was used, the heartbeat synchronization ran directly on the Newscast peer sampling protocol. For the generation of overlay topologies, T-Man was introduced between the Firefly protocol and the Newscast protocol.
Algorithm 3.1 Logical Clock Synchronization: Highest Received ID v1
Passive Thread
1: while receive pulse do
2:   if received ID > self ID then
3:     self ID ← received ID
4:   end if
5: end while
Active Thread
1: send pulse
2: increment self ID

Algorithm 3.2 Logical Clock Synchronization: Highest Received ID v2
Passive Thread
1: while receive pulse do
2:   if received ID > self ID then
3:     self ID ← received ID
4:   end if
5: end while
Active Thread
1: increment self ID
2: send pulse

Algorithm 3.3 Logical Clock Synchronization: Average Sync Number Received
Passive Thread
1: while receive pulse do
2:   if received ID > self ID then
3:     self ID ← received ID
4:   end if
5: end while
Active Thread
1: send pulse
2: increment self ID
Algorithm 3.4 Logical Clock Synchronization: Guarded Listening Period
Passive Thread
1: while receive pulse do
2:   if time since last pulse < lower listening period limit then
3:     if received ID > self ID then
4:       self ID ← received ID
5:     end if
6:   else
7:     if received ID > self ID + 1 then
8:       self ID ← received ID − 1
9:     end if
10:  end if
11: end while
Active Thread
1: send pulse
2: increment self ID

Figure 3.5. Protocol stack running on each node
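The sketch below implements the receive-side rule of Algorithm 3.4 in Java. The interpretation of "time since last pulse" as the time since the node's own last emission, the threshold value, and all names are assumptions made for illustration; only the ID-update logic follows the algorithm as listed.

// Sketch of Algorithm 3.4 (guarded listening period).
public class GuardedListeningSketch {
    long selfId = 0;                              // local logical clock / sync number
    long lastPulseSentAt = 0;                     // local time of our last emitted pulse, in ms
    static final long LISTENING_LIMIT_MS = 200;   // lower listening period limit (example value)

    // Called whenever a pulse carrying the sender's logical clock ID arrives.
    void onPulse(long receivedId, long nowMs) {
        if (nowMs - lastPulseSentAt < LISTENING_LIMIT_MS) {
            // Early in the cycle: adopt any strictly higher ID.
            if (receivedId > selfId) selfId = receivedId;
        } else {
            // Late in the cycle: the sender may already have incremented for the next
            // cycle, so only catch up to receivedId - 1 when we lag by more than one.
            if (receivedId > selfId + 1) selfId = receivedId - 1;
        }
    }

    // Called when the local timer fires: returns the ID to stamp on the outgoing pulse.
    long onFlash(long nowMs) {
        lastPulseSentAt = nowMs;
        long idToSend = selfId;   // "send pulse" carries the current ID ...
        selfId++;                 // ... then "increment self ID"
        return idToSend;
    }
}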
Random Topology

In the case of the random topology, T-Man simply passes on the nodes received from Newscast and does not modify the buffer, as no specific topology needs to be built. The degree of a node, or view size, is determined by a configuration parameter passed to the system.

Gradient Topology

To generate the gradient topologies, T-Man processes its buffer of nodes received from the peer sampling service, Newscast, and passes on a number of nodes to the Firefly protocol layer. The degree of a node, i.e. the number of neighbours of the node, is determined by Equation 3.2, with the maximum possible degree of a node being d_max.

d = d_{min} + (d_{max} - d_{min}) \times \frac{r_{drop}}{r} \qquad (3.2)

The degree distribution depends on the node's distance from the topology's centre, determined either by symmetrical static allocation of coordinates or by a coordinate protocol, as well as on the following parameters: the minimum degree allowed for a node, d_min; the maximum degree allowed for a node, d_max; and the distance from the coordinate centre at which the degree drops below that of the highly connected core, r_drop. These can be modified to influence the shape of the degree distribution, and hence the quality and speed of clock synchronization. Beyond the drop radius r_drop, the degree of a node is inversely proportional to its distance r from the coordinate centre.

3.4.2 Neighbour Preference Function

To implement a bias towards preferred neighbours, a neighbour preference function is used, which can be tuned by varying different weight parameters.

Random Topology

In the case of the random topology, T-Man just passes on the nodes received from Newscast to the Firefly protocol, as preferred neighbours are not required.

Gradient Topology

Every cycle, T-Man receives a node buffer from Newscast and sends a subset of the buffer to the Firefly protocol, with the size of the subset equal to the node's degree as determined by the degree distribution. The content of the node buffer sent to the Firefly protocol is determined by a neighbour preference function, given by Equation 3.3.
p = \begin{cases} \dfrac{\alpha}{r_{rel}} + \gamma \cdot r & \text{if } r_{rel} > 0 \\[4pt] \dfrac{\beta}{|r_{rel}|} & \text{if } r_{rel} < 0 \end{cases} \qquad (3.3)

The preference value p is computed differently depending on whether the relative distance between the nodes, r_rel, is positive or negative. The relative distance is calculated as the peer's distance from the centre of the n-dimensional latency coordinate system minus the self distance. Hence, if r_rel is positive, the peer lies further away from the coordinate centre than self, and if it is negative the peer lies between the centre and self. The higher the preference value p, the more the peer is preferred.

One objective of the preference function is to give a higher preference value to peers which are close on the latency coordinate system, so as to reduce the error in estimating the round-trip time and hence minimize the error in the phase correction calculated at the node. To achieve this, the preference value is set higher for nodes close to each other in the latency space. The weight of this preference is α when the peer node is further away from the latency space origin than self, and β when the peer node is towards the origin. Having different weights for the two cases helps in gaining knowledge about the relative effect of sending pulses towards the outer latency space as compared to sending pulses towards the core of the network, which is essential because the firefly protocols are inherently one-sided in communication.

Additional information is obtained by measuring the effect of the distance r of the peer node from the latency space centre when preferring to send pulses towards the outer edges of the network. This information is useful in determining the advantage of having the core of the network highly connected, as compared to having no bias based on peer distance from the origin. The effect is measured by giving a weight γ to the distance of the peer node from the origin.

These weight parameters, α, β and γ, are varied during simulations, and an optimal proportion of weights is determined by comparing the convergence time and synchronization window in the different cases.
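To make equations 3.2 and 3.3 concrete, the sketch below evaluates both. The parameter values are examples, the names are not taken from the thesis code, the clamp to d_max inside the core radius is an interpretation of the "highly connected core", and the handling of r_rel = 0 is an assumption that the equation leaves open.

// Sketch of the degree distribution (equation 3.2) and the neighbour
// preference value (equation 3.3) used to build the gradient overlay.
public class GradientOverlaySketch {
    static final int    D_MIN  = 10,  D_MAX = 100;   // example degree bounds
    static final double R_DROP = 0.1;                // example core radius
    static final double ALPHA  = 100, BETA = 10, GAMMA = 10;

    // Equation 3.2: node degree as a function of its distance r from the coordinate centre.
    static int degree(double r) {
        if (r <= R_DROP) return D_MAX;                        // highly connected core (clamped)
        double d = D_MIN + (D_MAX - D_MIN) * (R_DROP / r);    // decays with 1/r outside the core
        return (int) Math.round(d);
    }

    // Equation 3.3: preference value of a peer, given rRel (peer distance from the origin
    // minus self distance) and the peer's own distance r from the origin. Higher is preferred.
    static double preference(double rRel, double r) {
        if (rRel > 0) return ALPHA / rRel + GAMMA * r;   // peer lies further out than self
        if (rRel < 0) return BETA / Math.abs(rRel);      // peer lies between self and the centre
        return Double.MAX_VALUE;                         // same shell: treated as maximally preferred (assumption)
    }

    public static void main(String[] args) {
        System.out.println("degree at r=0.05: " + degree(0.05));   // inside the core -> D_MAX
        System.out.println("degree at r=1.0:  " + degree(1.0));
        System.out.println("preference of a slightly further peer: " + preference(0.1, 1.0));
        System.out.println("preference of a much further peer:     " + preference(2.0, 3.0));
    }
}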
3.5 Implementation

This section describes the implementation of the developed model, as explained in the previous sections of this chapter. The implementation was done using Java version 1.6.0. This platform was chosen because it is efficient for running algorithm simulations and because P2P simulation frameworks are available for Java. PeerSim was used as the simulation platform, as it is extremely scalable and dynamic, and the base code used for the firefly heartbeat synchronization protocol already ran on this framework. Newscast was used as the peer sampling protocol on top of the PeerSim framework, as it was already used under the base firefly synchronization protocol. T-Man is the topology management protocol introduced between the Newscast and Firefly protocols to enable overlay creation. It was chosen for its highly efficient and fast topology creation and its ease of integration with the Newscast peer sampling protocol.

As acknowledged in Section 3.2, the firefly protocol, used along with the different models like the Mirollo-Strogatz and Ermentrout models, is leveraged. Its implementation is explained briefly here. The class Firefly is used to control the properties of a firefly node and can be configured through the PeerSim configuration file. Depending on the required behaviour, species parameters are specified in the file, and the Firefly class refers to the Species class, which contains the properties of the different firefly species. It is also possible to provide the properties manually in the configuration.

The main protocol which runs on each firefly is the FireflyProtocol. It implements PeerSim's event-driven EDProtocol interface, as it is driven by events. The events FireflyProtocol listens for are the Timer and Pulse events. The next phase and frequency of pulse generation are determined by the AdaptiveModel, which is the Ermentrout synchronization model, according to the Pulse events received by the protocol from other firefly nodes. The Timer is set based on the next phase and frequency calculated by the model. The Timer event, when received, triggers the flashing of a Pulse to the neighbouring nodes in the view.

The view size of a node is determined according to the overlay type used. As explained in Section 3.4.1, the view size in the random network is limited by a configuration parameter passed to PeerSim through the configuration file or the command line, whereas the gradient overlay view sizes differ per node and are calculated by Equation 3.2. The upper and lower bounds for the view size are passed to the PeerSim simulator in the same way. In the case of gradient overlay topologies, the neighbours are compared using the PeerNodeComparator, and the ones with the highest preference value, calculated by the neighbour preference function in Equation 3.3, are selected by the T-Man protocol for inclusion in the Buffer used by the FireflyProtocol to form its view. The size of the buffer depends on the view size of the particular node. FireflyProtocol also uses the PeerNodeComparator to update its cache of best known neighbours from the buffers received cyclically from TMan.

In both random and gradient topologies, Newscast samples the peers of the system and passes a buffer of nodes to TMan. To keep the node identifier space common between Newscast, TMan and FireflyProtocol, all of them are initialised together by NetworkInit using the same node identifier space. NetworkInit is also responsible for initializing Topology with the type and parameters configured through PeerSim. In the case of random topologies, all nodes have the same configured degree or view size, which is set in Topology, whereas in the case of gradient topologies the minimum and maximum view sizes are configured.

EmissionObserver and NetworkObserver are the two observer classes in the system. EmissionObserver is responsible for keeping synchronization statistics for the flashes sent in the system. NetworkObserver keeps track of the message traffic generated in the system due to the flashes.
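To illustrate the event-driven structure just described, the skeleton below shows roughly how such a protocol plugs into PeerSim. It assumes PeerSim's EDProtocol interface and the EDSimulator.add(...) scheduling call are available on the classpath; the event classes, field names and the fixed delay are invented placeholders rather than the thesis classes.

import peersim.core.Node;
import peersim.edsim.EDProtocol;
import peersim.edsim.EDSimulator;

// Skeleton of an event-driven PeerSim protocol in the style described above.
public class FireflySkeleton implements EDProtocol {

    static class TimerEvent {}                       // local timer fired: time to flash
    static class PulseEvent { long senderId; }       // a flash received from a neighbour

    private long nextDelay = 1000;                   // ms until the next flash (placeholder)

    public void processEvent(Node node, int pid, Object event) {
        if (event instanceof TimerEvent) {
            // Flash: send a PulseEvent to every neighbour in the current view (omitted here),
            // then schedule the next timer according to the phase/frequency model.
            EDSimulator.add(nextDelay, new TimerEvent(), node, pid);
        } else if (event instanceof PulseEvent) {
            // A neighbour flashed: let the adaptive (Ermentrout) model adjust the next
            // phase/frequency, and update the logical clock ID (Algorithms 3.1-3.4).
        }
    }

    public Object clone() {
        try { return super.clone(); }                // EDProtocol extends Cloneable via Protocol
        catch (CloneNotSupportedException e) { throw new AssertionError(e); }
    }
}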
PeerSim has a set of transport classes to support passing information between the different nodes of the system and to apply the appropriate delay to messages according to the type of transport selected. To incorporate data sets generated using the King method, like the MIT King and Meridian King data sets, E2ETransport is available. E2ETransport internally uses the E2ENetwork, which obtains the latency information between nodes from the data set files through the KingParser.

Figure 3.6. Class structure in gradient overlay topologies using latencies from King data sets
3.5. IMPLEMENTATION 25 Figure 3.7. Class structure in random topologies using latencies from King data sets Figure 3.8. Class structure in gradient overlay topologies using latencies mapped to location
Figure 3.9. Class structure in random topologies using latencies mapped to location
Chapter 4
Results and Evaluation

This chapter presents the results of the simulations from the project and evaluates them in detail, based on the different aspects defined in the models of Chapter 3. The size of the network is 1000 nodes, or fireflies. The natural cycle period of the fireflies used in the simulations is one second, and the remaining parameters of the fireflies are set according to the default firefly defined in the Species class. Each set of simulations was run 10 times and the data was averaged over the runs. The evaluation is broadly done between the different kinds of topologies used, between the random and gradient overlays, and within the random and gradient overlays based on the various modelled parameters.

Figure 4.1 shows the nature of synchronization which occurs in a suitably configured synchronization model: the initial chaos settles down into a cyclic order. The vertical lines in the figure are not actually synchronized to perfection, and a zoomed-in view of one of the synchronized cycles is plotted in Figure 4.2.

The measurement parameters are mainly the standard deviation of the flash times within a single sync attempt sharing the same timestamp, and the emission window size, which is the difference between the maximum and minimum flash emission times within a single sync attempt with the same timestamp. Hence, the lower the standard deviation and the smaller the emission window, the better the performance.
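To make the two measures concrete, the sketch below computes them for the flash times of one sync attempt; the sample values are invented.

// Sketch of the two measurements: for the flash times (in ms) that share one
// timestamp in a sync attempt, the emission window is max - min, and the
// standard deviation is taken over the same set of times.
public class SyncWindowMetricsSketch {

    static double emissionWindow(double[] t) {
        double min = t[0], max = t[0];
        for (double v : t) { min = Math.min(min, v); max = Math.max(max, v); }
        return max - min;
    }

    static double standardDeviation(double[] t) {
        double sum = 0;
        for (double v : t) sum += v;
        double mean = sum / t.length;
        double sq = 0;
        for (double v : t) sq += (v - mean) * (v - mean);
        return Math.sqrt(sq / t.length);
    }

    public static void main(String[] args) {
        double[] flashes = {29251.2, 29252.9, 29253.1, 29254.0, 29256.4}; // one sync attempt
        System.out.printf("emission window: %.2f ms%n", emissionWindow(flashes));
        System.out.printf("standard deviation: %.2f ms%n", standardDeviation(flashes));
    }
}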
4.1 Performance on Generated Static Topologies

Static topology experiments are performed to compare the performance of the developed clock synchronization protocol over the different modelled topologies and to determine for which topologies the protocol works best. These experiments were also performed on random overlay systems, to study their behaviour on the topologies and, at the end, to compare the relative performance per topology. Ten experiments are performed for each unique set of parameters, and the data is averaged over the ten runs before comparison.

Figure 4.1. Synchronization of Flashes (node vs. time in µs)
Figure 4.2. Flashes in the 30th Synchronization Window for Figure 4.1 (node vs. time in ms)
Both gradient and random overlay topologies were created on the generated static topologies: single, two and five Gaussian clusters, and a doughnut cluster. The number of messages sent is used as an equalizer between the gradient and random overlay cases when setting the degree distribution and the other overlay parameters. The random case has a uniform degree of 45 for all its nodes, whereas the degree distribution and neighbour preference parameters for the gradient case are listed in Table 4.1.

Table 4.1. Gradient Overlay Parameters for Static Topology Simulations

Parameter    Value
alpha        100
beta         10
gamma        10
core radius  0.1
max degree   100
min degree   10

4.1.1 Single Gaussian Cluster

The degree distribution over the single Gaussian cluster follows the parameters listed in Table 4.1 and is visualized in Figure 4.3. The x-axis depicts the distance of a node from the latency coordinate centre and the y-axis depicts the degree of the node; each point on the plot represents a node in the system.

Figure 4.3. Degree Distribution of a Single Gaussian Cluster (gradient and random overlays)
The results of this simulation, along with the simulations for the other topologies, are shown in Figures 4.7 and 4.8. Figure 4.7 plots the standard deviation of the actual flash times within a single timestamp for a sync attempt on the y-axis against the sync attempt on the x-axis. For the single Gaussian cluster, the performance was better than for the multiple Gaussian clusters as well as the doughnut cluster, within the same overlay type, in terms of both a smaller emission window and a smaller standard deviation after convergence, as seen in Figures 4.7 and 4.8. The standard deviation for the gradient overlay Gaussian cluster is lower than that of the random overlay Gaussian cluster. The convergence time of the standard deviation for the gradient overlay Gaussian cluster is also lower than for the random Gaussian cluster. The emission window size is bigger, and the convergence of the window size faster, in the random overlay Gaussian cluster than in the gradient case.

4.1.2 Performance on Multiple Gaussian Clusters

Two cases of multiple Gaussian clusters were handled, with two and five clusters. In the case of two Gaussian clusters the centre of the coordinate system lies outside the clusters, while in the case of five clusters one of the clusters' centres coincides with the coordinate centre.

Two Gaussian Clusters

The degree distribution over two Gaussian clusters follows the parameters listed in Table 4.1 and is visualized in Figure 4.4. The x-axis depicts the distance of a node from the latency coordinate centre and the y-axis depicts the degree of the node; each point on the plot represents a node in the system.

The results of this simulation, along with the simulations for the other topologies, are shown in Figures 4.7 and 4.8, as explained in subsection 4.1.1. For two Gaussian clusters, the performance is not better than for the single Gaussian or doughnut cluster, with a high standard deviation after convergence and a large window size; but relative to each other, the gradient Gaussian cluster pair has a lower standard deviation than the random case. The convergence and window size are also better in the gradient case than in the random topology.

Five Gaussian Clusters

The degree distribution over five Gaussian clusters follows the parameters listed in Table 4.1 and is visualized in Figure 4.5. The x-axis depicts the distance of a node from the latency coordinate centre and the y-axis depicts the degree of the node; each point on the plot represents a node in the system.

The results of this simulation, along with the simulations for the other topologies, are shown in Figures 4.7 and 4.8, as explained in subsection 4.1.1. The standard deviation and window size of the synchronization in this case are bigger than for the single Gaussian cluster. The window size is smaller in the case of the random overlay, but the standard deviation for the gradient overlay is lower than in the random case. The convergence