On Steering Swarms - CS, Technion
On Steering Swarms*

Ariel Barel¹ [0000-0003-3275-4264], Rotem Manor² [0000-0002-2504-1509], and Alfred M. Bruckstein³

Technion - Israel Institute of Technology, Technion City, Haifa 32000, Israel
arielbarel@gmail.com
http://www.cs.technion.ac.il

Abstract. The main contribution of this paper is a novel method allowing an external observer/controller to steer and guide swarms of identical and indistinguishable agents, in spite of the agents' lack of information on absolute location and orientation. Importantly, this is done via simple global broadcast signals, based on the observed average swarm location, with no need to send control signals to any specific agent in the swarm.

Keywords: Steering · Guiding · Control · Multi-Agent · Decentralized.

1 Introduction

This paper deals with steering multi-agent systems, based on decentralized gathering laws, using an external broadcast control signal. Agents move according to local information provided by their sensors. The agents are assumed to be identical and indistinguishable, memoryless (oblivious), with no explicit communication between them. The agents do not share a common frame of reference, i.e. agents are not equipped with either GPS systems or compasses. By assumption, agents sense the distance and/or bearing to their neighbours, within a finite or infinite range of visibility. An external observer controller continuously monitors the swarm's location and broadcasts the same control signal to all agents, based on the centroid of the agents' constellation. We present a simple yet practical method to steer the swarm and guide it to a given destination.

Note that unlike the simple agents that are anonymous, unaware of their position, lack memory, and do not use explicit communication to maintain the swarm cohesion, the external controller does need the ability to continuously monitor the trajectory of the swarm location.
Due to these capabilities, the controller is able to influence the movement of the swarm with a very simple global control signal, broadcast simultaneously to all agents.

The inspiration for this control method came from the following observation: some gathering algorithms, while ensuring the convergence of agents to a bounded area, do not imply that the centroid of the agents' locations remains stationary in the plane. In fact, some gathering algorithms exhibit random-walk-like behaviour of the centroid of the agents' constellation after gathering. The method presented herein for steering the swarm to a target point exploits the movements of the system's center of gravity due to the agents' compliance with the distributed convergence algorithm.

* This research was partly supported by Technion Autonomous Systems Program (TASP).
Technion - Computer Science Department - Technical Report CIS-2018-01 - 2018

Fig. 1. System of four agents with a typical gathering algorithm in discrete time. The left graph shows the trend of the radius of the enclosing circle of the system. At the right, the random-walk-like behaviour of the centroid of the agents' constellation after gathering is apparent.

2 How to control a single drunkard

We first describe the basic idea in conjunction with a single agent performing a random walk in the plane, and then extend the discussion to multi agent systems carrying out various cohesion ensuring gathering algorithms.

Assume a drunkard is moving in the plane in the following random way: at discrete times $k = 1, 2, 3, \ldots$ the drunkard selects a new destination for time $k + 1$. The destination location $\tilde{p}(k+1)$ is randomly and homogeneously distributed in a unit disc centered at its current position $p(k)$, so that $\tilde{p}(k+1) = p(k) + \tilde{\Delta}(k)$, where $\tilde{\Delta}(k)$ is a random vector uniformly distributed in a unit disc. After selecting $\tilde{p}(k+1)$ the agent starts going there from $p(k)$ along a straight path.

By monitoring his motion, one can steer him in any direction with the following control rule: if the projection of his current movement is in the required direction, allow the drunkard to finish his step. Otherwise, stop him after a fraction $\mu < 1$ of the unit time interval, by broadcasting (shouting) a startling "stop!" signal. This process will cause the drunkard to perform a biased walk, making, in expectation, bigger steps in the desired direction. To bring the drunkard toward a region near a precise target point in the plane, one may define the desired direction to always point from the current location of the drunkard to the goal.
Assume first, for simplicity, that the desired direction is fixed. Let $p(k)$ be the current position of the agent and let $d \in \mathbb{R}^2$ be a unit vector in the direction in which we require the agent to move. Denote by $\tilde{\Delta}(k)$ the planned travel vector of the agent for the current time period $[k, k+1)$, from $p(k)$, its position at time $k$, to a homogeneously distributed random point in a unit disc centered at $p(k)$, and by $\Delta(k)$ its actual travel vector. The relation between $\tilde{\Delta}(k)$ and $\Delta(k)$ is as follows: at time $k$ the agent starts traveling from its current position $p(k)$ to its planned position $\tilde{p}(k+1) = p(k) + \tilde{\Delta}(k)$ at a piecewise constant velocity equal to $\tilde{\Delta}(k)/1$. If $\tilde{\Delta}(k)^T d < 0$, the external controller stops the agent at a fraction $\mu$ of the time-step, i.e. $\Delta(k) = \mu\tilde{\Delta}(k)$; otherwise the controller does not interrupt its motion during the current time period, hence $\Delta(k) = \tilde{\Delta}(k)$. Therefore we have

$$p(k+1) = p(k) + c(k)\tilde{\Delta}(k), \qquad c(k) = \begin{cases} \mu & \tilde{\Delta}(k)^T d < 0 \\ 1 & \text{o.w.} \end{cases} \tag{1}$$

where $\tilde{\Delta}(k)$ is a vector from $p(k)$ to the homogeneously distributed random point in a unit disc centered at $p(k)$.

We clearly have that $\tilde{\Delta}(k)$ is a random variable with a rotationally symmetric distribution, hence the expectation of the planned travel of an agent is zero: $E\{\tilde{\Delta}(k) \mid p(k)\} = 0$.

Without loss of generality, we relate to a coordinate system whose $x$ axis is in the required motion direction $d$, i.e. $p(k) = (x(k), y(k))$, $\tilde{\Delta}(k) = (\tilde{\Delta}x(k), \tilde{\Delta}y(k))$, and $\Delta(k) = (\Delta x(k), \Delta y(k))$. By symmetry of the random distribution function we have that

$$E\{\tilde{\Delta}x(k)\} = 0 \tag{2}$$

The required direction of movement $d$ is towards the positive $x$ axis, i.e. to the right. Clearly, by the symmetry of the distribution function, the probabilities that the drunkard's planned step points right and left are the same and equal 0.5. Hence, the expected actual travel of the agent, given the external controller's (possible) interruptions, is (omitting the time index $(k)$ for simplicity):

$$E\{\Delta x\} = 0.5\,E\{\Delta x \mid \tilde{\Delta}x \ge 0\} + 0.5\,E\{\Delta x \mid \tilde{\Delta}x < 0\} = 0.5(1-\mu)\,E\{\tilde{\Delta}x \mid \tilde{\Delta}x \ge 0\} \tag{3}$$

which is strictly positive for every $\mu < 1$.

In order to guide an agent to a target point, the controller can set the required direction at each time-step, from the current position of the agent to the target point.
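The fixed-direction rule (1) and the drift predicted by (3) can be checked numerically. The following is a minimal Python sketch, not the authors' simulation code: the function names (`sample_unit_disc`, `controlled_step`, `mean_drift`) are hypothetical, and the closed-form value $E\{\tilde{\Delta}x \mid \tilde{\Delta}x \ge 0\} = 4/(3\pi)$ used for comparison is our own evaluation for a step drawn uniformly from the unit disc, not a figure stated in the paper.

```python
import math
import random


def sample_unit_disc(rng):
    """Draw a point uniformly from the unit disc by rejection sampling."""
    while True:
        x, y = rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0)
        if x * x + y * y <= 1.0:
            return x, y


def controlled_step(planned, d, mu):
    """Rule (1): keep the planned step, or cut it to a fraction mu
    when its projection on the unit vector d is negative."""
    c = mu if planned[0] * d[0] + planned[1] * d[1] < 0 else 1.0
    return c * planned[0], c * planned[1]


def mean_drift(mu, steps, seed=0):
    """Average actual displacement along d = (1, 0) over many steps."""
    rng = random.Random(seed)
    d = (1.0, 0.0)
    total = 0.0
    for _ in range(steps):
        total += controlled_step(sample_unit_disc(rng), d, mu)[0]
    return total / steps


if __name__ == "__main__":
    mu = 0.1
    # Assumption (our own evaluation): for the uniform unit disc,
    # E{planned x | planned x >= 0} = 4 / (3 * pi).
    predicted = 0.5 * (1.0 - mu) * 4.0 / (3.0 * math.pi)
    print("simulated drift:", mean_drift(mu, 200_000))
    print("value from (3): ", predicted)
```

With $\mu = 1$ the controller never shortens a step and the estimated drift should hover around zero, which is a quick way to see that the bias comes entirely from the "stop" signal.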
Assume without loss of generality that the goal is to reach the point $(0, 0)$ in the plane. Let $p(k)$ be the current position of the agent, so that $d(k) = -\frac{p(k)}{\|p(k)\|}$ is a unit vector from the agent's current position to the target. The dynamics based on the required direction $d(k)$ is, therefore

$$p(k+1) = p(k) + c(k)\tilde{\Delta}(k), \qquad c(k) = \frac{1+\mu}{2} + \frac{1-\mu}{2}\,\mathrm{sgn}\!\left(\tilde{\Delta}(k)^T \frac{-p(k)}{\|p(k)\|}\right) \tag{4}$$

Let us find the expected squared distance of the agent from the target at time $k+1$ given $p(k)$, i.e. $E\{\|p(k+1)\|^2 \mid p(k)\}$.
By the law of cosines in a triangle (see Figure 2), we have that $\|p(k+1)\|^2 = \|p(k)\|^2 + 2c(k)\tilde{\Delta}(k)^T p(k) + c^2(k)\|\tilde{\Delta}(k)\|^2$, hence

$$E\{\|p(k+1)\|^2 \mid p(k)\} = E\{\|p(k)\|^2 \mid p(k)\} + 2E\{c(k)\tilde{\Delta}(k)^T p(k) \mid p(k)\} + E\{c^2(k)\|\tilde{\Delta}(k)\|^2 \mid p(k)\} \tag{5}$$

Fig. 2. Illustration of some values in equation (5). Case (a): the projection direction is opposite to that of $d(k)$. Case (b): the projection is in the direction of $d(k)$.

From this we obtain that:

$$E\{\|p(k+1)\|^2 \mid p(k)\} = \|p(k)\|^2 - A\left(\frac{1-\mu}{2}\right)\|p(k)\| + (1+\mu^2)B \tag{6}$$

where $A = E\left\{\frac{\tilde{\Delta}(k)^T p(k)}{\|p(k)\|}\,\mathrm{sgn}\left\{\frac{\tilde{\Delta}(k)^T p(k)}{\|p(k)\|}\right\}\right\}$ is positive and depends only on the direction vector $d(k) = -\frac{p(k)}{\|p(k)\|}$, and for a rotationally symmetric $\tilde{\Delta}(k)$ it is independent of $d(k)$ (and of $p(k)$, of course), and $B = E\{\|\tilde{\Delta}(k)\|^2\}$ is positive and obviously independent of $p(k)$. Hence for a rotationally symmetric selection of $\tilde{\Delta}(k)$, at every step we have that $A$ and $B$ are constants depending on the particular distribution of $\tilde{\Delta}(k)$ (i.e. the displacement with regard to the current location of the agent, or the system center of gravity as discussed next in the multi agent case). From this result it follows that

$$E\{E\{\|p(k+1)\|^2 \mid p(k)\}\} = E\{\|p(k)\|^2\} - A\left(\frac{1-\mu}{2}\right)E\{\|p(k)\|\} + (1+\mu^2)B \tag{7}$$

Define $M(k) \triangleq E\{\|p(k)\|^2\}$. After some manipulations we get that, for $\delta$ defined as the decrease rate of $M(k)$, the process of decrease will necessarily stop before

$$K = \frac{D^2(0) - \left(\dfrac{B(1+\mu^2)+\delta}{A\left(\frac{1-\mu}{2}\right)}\right)^2}{\delta} \tag{8}$$

steps.

Simulated results of $k$ vs. $\delta$ for some different initial values of $D(0)$ are presented in Figure 3, while the graph of Equation (8) is plotted in Figure 4. These results are compatible with the upper bound on $k$ presented in Equation (8), as the simulated time to reach the target always falls below the calculated upper bound.

Fig. 3. Plot of $k$ vs. $\delta$ by simulation, for different initial $D(0)$ values from 10 to 100, for an evenly distributed random jump to a unit circle, $\mu = 0.1$. Number of runs: 10,000.

Fig. 4. Plot of $k$ vs. $\delta$ by Equation (8) for different initial $D(0)$ values from 10 to 100, for an evenly distributed random jump to a unit circle, and $\mu = 0.1$.
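To make the target-guidance rule (4) and the constants $A$ and $B$ of equation (6) concrete, here is a small Python sketch under stated assumptions. It is not the simulation behind Figures 3 to 5: the capture radius of 1.0, the step cap `max_steps`, and all function names are hypothetical choices made for illustration. The reference values $A = 4/(3\pi)$ and $B = 1/2$ are our own evaluation for a step drawn uniformly from the unit disc.

```python
import math
import random


def sample_unit_disc(rng):
    """Draw a point uniformly from the unit disc by rejection sampling."""
    while True:
        x, y = rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0)
        if x * x + y * y <= 1.0:
            return x, y


def steps_to_target(start_distance, mu, rng, capture_radius=1.0,
                    max_steps=100_000):
    """Guide one agent toward the origin with rule (4) and count the steps
    until it is within capture_radius (a hypothetical stopping criterion)."""
    px, py = float(start_distance), 0.0
    for k in range(max_steps):
        dist = math.hypot(px, py)
        if dist <= capture_radius:
            return k
        dx, dy = -px / dist, -py / dist          # d(k) = -p(k) / ||p(k)||
        sx, sy = sample_unit_disc(rng)           # planned step
        c = mu if sx * dx + sy * dy < 0 else 1.0  # "stop" signal or not
        px, py = px + c * sx, py + c * sy
    return max_steps


def estimate_A_B(samples, rng):
    """Monte Carlo estimates of A = E|planned step projected on a unit
    vector| and B = E||planned step||^2 for the uniform unit disc."""
    a_sum = b_sum = 0.0
    for _ in range(samples):
        sx, sy = sample_unit_disc(rng)
        a_sum += abs(sx)              # projection on (1, 0); any unit vector works
        b_sum += sx * sx + sy * sy
    return a_sum / samples, b_sum / samples


if __name__ == "__main__":
    rng = random.Random(1)
    runs = [steps_to_target(10.0, 0.1, rng) for _ in range(200)]
    print("mean steps from distance 10:", sum(runs) / len(runs))
    print("A, B estimates:", estimate_A_B(100_000, rng))
```

The Monte Carlo estimate of $A$ uses the fixed direction $(1, 0)$, which is legitimate only because the step distribution is rotationally symmetric, as the text above requires.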
Figure 5 shows a typical simulation result for a single agent with uniformly distributed noise inside a unit circle, guided to a desired point.

Fig. 5. An agent with uniformly distributed jump inside a unit disc, with external control signal to a point.

3 Controlling multi agent systems - the idea

Let us adapt this steering method to a multi agent system. Suppose there is a multi agent system which converges. The lack of a global orientation of the agents prevents the observer from simply broadcasting the desired direction of movement as suggested by Azuma et al. [15] and others, since the agents are unable to obey global-direction-based commands. Additionally, recall that our agents are anonymous and indistinguishable, hence an external observer wishing to lead the system in a required direction cannot steer individual agents separately by transmitting control commands to each one of them.

We show here that an external observer can lead a multi agent system in a required direction (while the agents also converge to a bounded region), by only sensing the motion of the system's centroid. This information represents for the external controller the location of the group, and it is feasible to measure or estimate in real life multi agent scenarios, especially for large numbers of agents, such as swarms of drones.

Let $p_{cm}(t) = \frac{1}{n}\sum_{i=1}^{n} p_i(t)$ be the system's centroid. The velocity of the centroid is the average of the agents' velocities, $\dot{p}_{cm}(t) = \frac{1}{n}\sum_{i=1}^{n} \dot{p}_i(t)$, and we have that while all agent velocities are constant the centroid velocity is constant as well. We assume that during each time interval $k = 1, 2, 3, \ldots$ each agent's velocity is constant, therefore we have that $\hat{\dot{p}}_{cm}(t)$, the direction of the centroid movement, is piecewise constant (i.e. it does not change during a time interval, hence the centroid moves in straight lines). Similar to our discussion in Section 2, here the external controller tracks the motion of the centroid of the system.
If the projection of its movement is on the required direction ($\tilde{\Delta}_{cm}(k)^T d \ge 0$), it allows all the agents to finish their planned travels. Otherwise, it stops them all after a fraction $\mu$ of the time-step, i.e. when they complete a fraction $\mu$ of their planned travel. We discuss in detail different types of such systems, and bound the expected "velocity" of the swarm's centroid due to this control mechanism.
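The broadcast decision described above can be sketched in a few lines of Python. This is an illustrative sketch only: the names `planned_centroid_step`, `broadcast_gain` and `apply_broadcast` are hypothetical, and the code computes the centroid's planned step directly from the agents' planned steps, standing in for a controller that in practice would infer it by observing the centroid's initial motion.

```python
def planned_centroid_step(planned_steps):
    """Average of the agents' planned displacement vectors."""
    n = len(planned_steps)
    return (sum(s[0] for s in planned_steps) / n,
            sum(s[1] for s in planned_steps) / n)


def broadcast_gain(planned_steps, d, mu):
    """Common factor applied to every agent's step: mu (a "stop" signal)
    when the centroid's planned step projects negatively on d, else 1."""
    cx, cy = planned_centroid_step(planned_steps)
    return mu if cx * d[0] + cy * d[1] < 0 else 1.0


def apply_broadcast(positions, planned_steps, d, mu):
    """Every agent receives the same signal; none is addressed individually."""
    c = broadcast_gain(planned_steps, d, mu)
    return [(p[0] + c * s[0], p[1] + c * s[1])
            for p, s in zip(positions, planned_steps)]


if __name__ == "__main__":
    positions = [(0.0, 0.0), (1.0, 0.0)]
    planned = [(-1.0, 0.0), (0.5, 0.0)]   # centroid would move left
    print(apply_broadcast(positions, planned, (1.0, 0.0), 0.5))
```

Note that the second agent in the usage example is also slowed down even though its own step points in the desired direction: the signal depends only on the centroid, which is what lets the controller ignore agent identities.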
3.1 Controlling a system with infinite visibility and full sensing

Recall the simplest linear multi agent gathering process called $S_2$, see [13], in discrete time. In this system each agent $i$ moves according to the decentralized dynamic law:

$$p_i(k+1) = p_i(k) - \sigma\sum_{j=1}^{n}\left(p_i(k) - p_j(k)\right) \tag{9}$$

where $0 < \sigma < \frac{2}{n}$ is a constant gain factor, i.e. at each time-step, each agent jumps proportionally to the sum of relative position vectors to all the other agents. As proved by Gazi and Passino [4], since the dynamics of (9) is governed by an antisymmetric pairwise interaction function, the average position of the agents is invariant, i.e.

$$\bar{p} = \frac{1}{n}\sum_{i=1}^{n} p_i(k) = \text{const.} \tag{10}$$

Hence, for an arbitrary initial constellation, if $0 < \sigma < \frac{2}{n}$ all the agents of system $S_2$ asymptotically converge to the average location of the initial constellation. In order to remain focused on the subject of this article, let us assume that $0 < \sigma < \frac{1}{n}$ so that overshoot phenomena do not occur, i.e. agents will not pass through the centroid of the system to the opposite side.

To steer the system $S_2$ in some desired direction, we would like to bias the motion of the system centroid by measuring its trend; however, by (10) the centroid of the original system $S_2$ does not move. Let us therefore assume some additive "noise" that breaks symmetry and causes the center of the system to move. Hence we modify system $S_2$ above by adding some randomness to the agents' motion. Each agent, in addition to obeying the distributed control law (9), moves to a randomly selected point at each time step:

$$p_i(k+1) = p_i(k) - \sigma\sum_{j=1}^{n}\left(p_i(k) - p_j(k)\right) + \tilde{\Delta}_i(k) \tag{11}$$

where $\tilde{\Delta}_i(k)$ is a randomly selected point, homogeneously distributed in a unit disc (i.e. $\tilde{\Delta}_i(k)$ is chosen by each agent as in the single agent case discussed above).
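The centroid invariance (10) of the noiseless law (9), and the way the added noise in (11) moves the centroid, can be illustrated with the following Python sketch. It is a minimal illustration under our own naming (`s2_step`, `centroid` are hypothetical), using the identity $\sum_j (p_i - p_j) = n\,p_i - \sum_j p_j$ to evaluate the sum.

```python
def centroid(positions):
    """Average position of the agents."""
    n = len(positions)
    return (sum(p[0] for p in positions) / n,
            sum(p[1] for p in positions) / n)


def s2_step(positions, sigma, noise=None):
    """One discrete-time step of law (9); if noise is given (one vector
    per agent), it is added to each agent's step as in law (11)."""
    n = len(positions)
    sum_x = sum(p[0] for p in positions)
    sum_y = sum(p[1] for p in positions)
    new_positions = []
    for i, (x, y) in enumerate(positions):
        # sum_j (p_i - p_j) = n * p_i - sum_j p_j
        step_x = -sigma * (n * x - sum_x)
        step_y = -sigma * (n * y - sum_y)
        if noise is not None:
            step_x += noise[i][0]
            step_y += noise[i][1]
        new_positions.append((x + step_x, y + step_y))
    return new_positions


if __name__ == "__main__":
    square = [(0.0, 0.0), (4.0, 0.0), (4.0, 4.0), (0.0, 4.0)]
    sigma = 0.2  # below 1/n = 0.25, so no overshoot
    print(centroid(square), centroid(s2_step(square, sigma)))
    kick = [(0.4, 0.0), (0.0, 0.0), (0.0, 0.0), (0.0, 0.0)]
    print(centroid(s2_step(square, sigma, kick)))
```

In the noiseless case each agent's offset from the centroid is scaled by $(1 - \sigma n)$ per step, while with noise the centroid shifts by exactly the average of the noise vectors.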
Here too, at time $k$ the agents start traveling from their existing positions $p_i(k)$ towards their next planned positions $\tilde{p}_i(k+1)$ at piecewise constant velocities equal to their distance from it, $\left[-\sigma\sum_{j=1}^{n}\left(p_i(k) - p_j(k)\right) + \tilde{\Delta}_i(k)\right]/1$, so that if an external controller does not intervene, all the agents arrive at their destinations simultaneously at time $k+1$. Hence we may denote the planned step of the centroid to be

$$\tilde{\Delta}_{cm}(k) = \bar{\tilde{p}}(k+1) - \bar{p}(k) = \frac{1}{n}\sum_{i=1}^{n}\tilde{\Delta}_i(k) \tag{12}$$

The control mechanism for system (11) is:
$$p_i(k+1) = p_i(k) + c(k)\left[-\sigma\sum_{j=1}^{n}\left(p_i(k) - p_j(k)\right) + \tilde{\Delta}_i(k)\right], \qquad c(k) = \begin{cases} \mu & \tilde{\Delta}_{cm}(k)^T d < 0 \\ 1 & \text{o.w.} \end{cases} \tag{13}$$

where $c(k)$ represents the optional "stop" signal received simultaneously at fraction $\mu$ of the time-step by all agents, $\tilde{\Delta}_{cm}(k) = \frac{1}{n}\sum_{i=1}^{n}\tilde{\Delta}_i(k)$ is the planned travel of the centroid of the agents, and $d$ is the required direction of movement of the system.

The second central moment is the variance. Since the projection on $x$ of the second moment of a disc of radius $r$ is $\frac{1}{4}\pi r^4$, we have that $Var\{\Delta x_1\} = \frac{1}{4}\pi r^4 / \pi r^2 = \frac{1}{4}$ for the unit disc ($r = 1$). It follows that

$$E\{\Delta x_{cm}\} \ge 0.5(1-\mu)\frac{1}{8n}$$

i.e. the bound on the expected step of the centroid is inversely proportional to the number of agents.

For guiding this system to a goal point, the observer controller should set the desired direction at every time interval so that $d(k)$ is a unit vector from the centroid of the system to the goal point. Figure 6 presents simulation results of system $S_2$ with full visibility and complete sensing, with an evenly distributed noise jump to a unit disc for each agent, as presented in equation (13).

Fig. 6. System $S_2$ with noise, and guiding signal to a target. In this simulation, 10 agents traveled to a target point at a distance of 10 units, in 780 time-steps.

3.2 Controlling a system with limited visibility and bearing only sensing

In this system we assume that the agents are able to sense the direction to their neighbours (i.e. bearing only sensing), so that their "knowledge" about neighbours is partial, and their motions are determined by the set of unit vectors pointing from their current location to their neighbours. The neighbours are defined for each agent $i$ at time-step $k$ as the set of agents located within a given
visibility range $V$ from its position $p_i(k)$, so that agent $j$ is currently a neighbour of agent $i$ if $j \in N_i(k)$, i.e. if $\|p_j(k) - p_i(k)\| \le V$. Gordon et al. [8] [9] suggested a distributed motion law for gathering the agents, based on regions where the agents are allowed to move, while preventing them from losing visibility to their neighbours. Recall that according to [9], an agent $i$ may move only into an allowable region $AR_i(k)$. Manor et al. [16] modified Gordon's motion law, and proved that the new law gathers the agents of the system to a disc with a radius equal to the agents' maximal step size $\sigma$ within a finite expected number of time steps, and that as time tends to infinity the distribution of the agents' average position converges in probability to the distribution of a random-walk. In the sequel we shall use that random-walk behavior of the swarm's centroid (which is, in this case, not a system invariant) to control the swarm, without violating the paradigm of agents unaware of their location or orientation in a global frame of reference.

First, a new allowable smaller but symmetric region $ar_i(k)$ is defined, which is contained in $AR_i(k)$ above. Second, $U_i^+(k)$ and $U_i^-(k)$ are defined as the unit vectors pointing from the position of agent $i$ to the current extremal left and right agents defining the current minimal angular-sector anchored at $p_i(k)$ and containing all its neighbours.

$$ar_i(k) = D_{\sigma/2}\left(p_i(k) + \frac{\sigma}{2}U_i^-\right) \cap D_{\sigma/2}\left(p_i(k) + \frac{\sigma}{2}U_i^+\right) \tag{14}$$

where $\sigma < V/2$ (see Figure 7). Since it was proved that if all agents take steps into their original allowable regions ($AR_i(k)$) they all maintain visibility to their neighbours, the same result obviously holds with the new "sub" allowable regions (since $ar_i(k) \subseteq AR_i(k)$).

Fig. 7. The dashed area created by the intersection of the circles of diameter $V$ is a general allowable region $AR_i(k)$, and the thick-line delineated shape created by the intersection of the circles of diameter $\sigma < V/2$ is the modified allowable region $ar_i(k)$ given by (14).

The dynamic law also allows some agents to occasionally "sleep" (in this paper the meaning of "sleep" will be "stay put") with probability $0 < \delta < 1$, so that

$$p_i(k+1) = \begin{cases} p_i(k) & \psi_i(k) \ge \pi \text{ or } \chi_i(k) = 0 \\ \text{a point in } ar_i(k) & \text{o.w.} \end{cases}, \qquad \chi_i(k) = \begin{cases} 1 & \text{w.p. } \delta \\ 0 & \text{w.p. } 1-\delta \end{cases} \tag{15}$$

Manor et al. [16] show that if the agents of the system jump to uniformly distributed random points in their (modified) allowable regions, they gather to a disc of radius $\sigma$, and prove that the centroid of the agents' constellation performs a random motion, and its location distribution converges in probability to the distribution of a random-walk as $k$ tends to infinity.

As in Section 3.1, we slightly change the dynamics timing from discrete time dynamics (where agents "appear" in new chosen positions at each time-step) to piecewise continuous dynamics (where agents continuously move towards those destinations) as follows: at time $k$ all agents simultaneously choose their next desired positions $\tilde{p}_i(k+1)$ as defined in (15). Denote by $\tilde{\Delta}_i(k) = \tilde{p}_i(k+1) - p_i(k)$ the planned travel of agent $i$, and by $\Delta_i(k)$ its actual travel. Each (active) agent continuously moves towards its planned destination at a piecewise constant velocity equal to $\tilde{\Delta}_i(k)/1$, hence, without external intervention, all agents reach their planned destinations simultaneously. In order to control the system and bias it towards a required direction $d$, an external controller tracks the trend of the centroid of the system, and if the projection of this trend on the required direction is negative, it stops all (moving) agents at fraction $\mu$ of their current move; otherwise it does not intervene in the dynamics of the current time-step.

$$\begin{aligned}
p_i(k+1) &= \begin{cases} p_i(k) & \psi_i(k) \ge \pi \text{ or } \chi_i(k) = 0 \\ p_i(k) + c(k)\tilde{\Delta}_i(k) & \text{o.w.} \end{cases} \\
\chi_i(k) &= \begin{cases} 1 & \text{w.p. } \delta \\ 0 & \text{w.p. } 1-\delta \end{cases} \\
c(k) &= \begin{cases} \mu & \tilde{\Delta}_{cm}(k)^T d < 0 \\ 1 & \text{o.w.} \end{cases} \\
\tilde{\Delta}_i(k) &= \text{vector from } p_i(k) \text{ to a random point in } ar_i(k)
\end{aligned} \tag{16}$$

where $\tilde{\Delta}_{cm}(k) = \frac{1}{n}\sum_{i=1}^{n}\tilde{\Delta}_i(k)$ is the planned jump of the centroid of the system, and $d$ is a unit vector in the required moving direction of the system. It is proved in [16] that the original model, given no external control, satisfies $E\{\Delta_{cm}(k)\} = 0$, and mathematical developments of the equations yield that

$$E\{\Delta x_{cm}\} \ge 0.25(1-\mu)\frac{1}{n^2}\,Var^* \tag{17}$$

$$Var^* = \delta^2\left(\frac{\sigma}{2}\right)^2 \frac{1 - \cos^4\left(\frac{\pi-\psi^*}{2}\right)}{\frac{\pi-\psi^*}{2} - \frac{1}{2}\sin(\pi-\psi^*)}$$
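The modified allowable region (14) lends itself to a short membership test and a rejection sampler. The Python sketch below reads $D_{\sigma/2}(c)$ as the closed disc of radius $\sigma/2$ centered at $c$, which matches the "circles of diameter $\sigma$" in the caption of Figure 7; this reading, the function names, the `max_tries` cap and the stay-put fallback are all our assumptions rather than details taken from [16].

```python
import math
import random


def in_disc(point, center, radius):
    """True if point lies in the closed disc of the given center and radius."""
    return math.hypot(point[0] - center[0], point[1] - center[1]) <= radius


def in_allowable_region(point, p, u_minus, u_plus, sigma):
    """Membership test for ar_i(k) in (14): the intersection of two discs of
    radius sigma/2 centered at p + (sigma/2) U^- and p + (sigma/2) U^+."""
    r = sigma / 2.0
    c_minus = (p[0] + r * u_minus[0], p[1] + r * u_minus[1])
    c_plus = (p[0] + r * u_plus[0], p[1] + r * u_plus[1])
    return in_disc(point, c_minus, r) and in_disc(point, c_plus, r)


def sample_allowable_point(p, u_minus, u_plus, sigma, rng, max_tries=10_000):
    """Uniform point in ar_i(k) by rejection: draw uniformly from the disc
    around the U^- center and keep the draw if it is also in the other disc."""
    r = sigma / 2.0
    cx, cy = p[0] + r * u_minus[0], p[1] + r * u_minus[1]
    for _ in range(max_tries):
        rho = r * math.sqrt(rng.random())        # sqrt gives uniform area density
        theta = rng.uniform(0.0, 2.0 * math.pi)
        q = (cx + rho * math.cos(theta), cy + rho * math.sin(theta))
        if in_allowable_region(q, p, u_minus, u_plus, sigma):
            return q
    return p  # assumption: stay put when the region is too thin to hit


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical agent whose extremal neighbours are seen 90 degrees apart.
    print(sample_allowable_point((1.0, 2.0), (1.0, 0.0), (0.0, 1.0), 0.5, rng))
```

As the angle between $U_i^-$ and $U_i^+$ approaches $\pi$ the two discs only touch at $p_i(k)$ and the acceptance rate drops to zero, which is consistent with the rule in (15) that an agent with $\psi_i(k) \ge \pi$ stays put.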
Figure 8 presents simulation results of this system. The system gathers and moves to a target point (presented as concentric circles), and the trace of the travel of the system's centroid is plotted.

Fig. 8. System 3.2, and guiding signal to a target. In this simulation, 10 agents traveled to a target point at a distance of 10 units, in 950 time-steps.

4 Conclusions

A method has been introduced here that allows an external observer to control a multi agent system and guide it to a requested destination even when the agents are very primitive. According to our paradigm all the agents are identical (anonymous), therefore the external observer cannot send a separate command to each agent; instead, it controls the swarm by broadcasting the same command simultaneously to all agents. The method is presented and tested for different cases: the control of a single moving agent performing a random-walk, steering of a system with infinite visibility and relative distance and bearing measurement, and control of a system with partial information (limited visibility and bearing only measurement). In the multi agent systems we proved convergence to an area bounded by a given radius, and in all cases we provided a probabilistic bound on the movement rate of the swarm in the requested direction.

A necessary condition for the functioning of the system is constant motion of the centroid of the system. We always have that the centroid dynamics is due to the convolution of symmetric variables (e.g. i.i.d. jumps to a disc around the agents without external interrupts), hence the expected value of the centroid jump equals zero, and the probability to jump in the "right" direction $P_R(k)$ equals 0.5 due to the symmetry. Therefore, in order to bound the traveling speed of such a system under the novel control method presented here, one should examine the variance of the single agent move.
The traveling speed of the system is bounded away from zero by a constant, dependent on the nonzero variance of the single agents, the number of agents in the system, and the value of $\mu$.
References

1. Ichiro Suzuki and Masafumi Yamashita. Distributed anonymous mobile robots: Formation of geometric patterns. SIAM Journal on Computing, 28(4):1347-1363, 1999.
2. Hideki Ando, Yoshinobu Oasa, Ichiro Suzuki, and Masafumi Yamashita. Distributed memoryless point convergence algorithm for mobile robots with limited visibility. Robotics and Automation, IEEE Transactions on, 15(5):818-828, 1999.
3. A. Jadbabaie, Jie Lin, and A. S. Morse. Coordination of groups of mobile autonomous agents using nearest neighbor rules. Automatic Control, IEEE Transactions on, 48(6):988-1001, 2003.
4. Veysel Gazi and Kevin M. Passino. Stability analysis of swarms. IEEE Transactions on Automatic Control, 48:692-697, 2003.
5. Reza Olfati-Saber. Flocking for multi-agent dynamic systems: Algorithms and theory. Automatic Control, IEEE Transactions on, 51(3):401-420, 2006.
6. Meng Ji and Magnus B. Egerstedt. Distributed coordination control of multi-agent systems while preserving connectedness. Robotics, IEEE Transactions on, 23(4):693-703, Aug 2007.
7. Reza Olfati-Saber, J. Alex Fax, and Richard M. Murray. Consensus and cooperation in networked multi-agent systems. Proceedings of the IEEE, 95(1):215-233, 2007.
8. Noam Gordon, Israel A. Wagner, and Alfred M. Bruckstein. Gathering multiple robotic a(ge)nts with limited sensing capabilities. In Ant Colony Optimization and Swarm Intelligence, volume 3172 of Lecture Notes in Computer Science, pages 142-153. Springer, 2004.
9. Noam Gordon, Israel A. Wagner, and Alfred M. Bruckstein. A randomized gathering algorithm for multiple robots with limited sensing capabilities. In Proc. of MARS 2005 workshop at ICINCO Barcelona, 2005.
10. Noam Gordon, Yotam Elor, and Alfred M. Bruckstein. Gathering multiple robotic agents with crude distance sensing capabilities. In Ant Colony Optimization and Swarm Intelligence, volume 5217 of Lecture Notes in Computer Science, pages 72-83. Springer Berlin Heidelberg, 2008.
11. Levi-Itzhak Bellaiche and Alfred Bruckstein. Continuous time gathering of agents with limited visibility and bearing-only sensing. Technical report, CIS Technical Report, TASP, 2015.
12. Rotem Manor and Alfred Bruckstein. Chase your farthest neighbour: a simple gathering algorithm for anonymous, oblivious and non-communicating agents. Technical Report CIS-2016-1, Technion CIS Technical Report, TASP, 2016.
13. Ariel Barel, Rotem Manor, and Alfred Bruckstein. Come together: Multi-agent geometric consensus (gathering, rendezvous, clustering, aggregation). Technical Report CIS-2016-3, Technion CIS Technical Report, TASP, 2016.
14. Ariel Barel, Rotem Manor, and Alfred Bruckstein. Probabilistic Gathering Of Agents With Simple Sensors. Technical Report CIS-2017-04, Technion CIS Technical Report, TASP, 2017.
15. S. Azuma, R. Yoshimura, and T. Sugie. Broadcast control of multi-agent systems. Automatica, 49:2307-2316, 2013.
16. Rotem Manor and Alfred Bruckstein. Discrete Time Gathering of Agents with Bearing Only and Limited Visibility Range Sensors. Technical Report CIS-2017-1, Technion CIS Technical Report, TASP, 2017.