Impact of the stochastic features of the Push-Sum protocol on the variance of its convergence rate

Autor: M. Kenyeres, J. Kenyeres, V. Škorpil <kenyeres(at)phd.feec.vutbr.cz>, Pracoviště: Vysoké učení technické v Brně, Téma: Aplikace, sítě a služby, Vydáno dne: 01. 10. 2015

The paper presents the analysis of how the stochastic features of the Push-sum protocol affects its convergence rate and the behavior of the parameter estimation.


Impact of the stochastic features of the Push-Sum protocol on the variance of convergence rate

In this paper, we analyze the effect of the stochastic features of the Push-sum protocol on the variance of its convergence rate. We have provided the theoretical introduction and executed the experiments on 10 randomly generated topologies (the experiments were repeated 1000 times for each topology). The second experiment consists of the analysis of the behavior of the parameter estimation. We analysed the gained results and the theoretical conclusions have been derived.

Keywords: distributed computing; distributed signal processing; Push-sum protocol; gossip-based aggregation algorithms


Introduction

This paper is devoted to the Push-sum protocol, which is classified as a gossip-based aggregation algorithm. We have analyzed the effect of the stochastic features of this protocol on the variance of its convergence rate. The first part summarized the general knowledge about distributed computing. The second part introduces the set of the gossip-based aggregation algorithms. The last theoretical part deals with the Push-sum protocol, which was analyzed within the practical part. The practical part consists of the experiments executed on 10 different randomly generated topologies (the experiment was repeated 1000 times for every topology). The obtained results have been depicted and analyzed. At the end, we have analyzed the behavior of the estimation.

Distributed computing

Distributed computing is classified as the computer science and describes the systems executing distributed algorithms[1]. These systems consist of spatially distributed entities whose goal is to communicate with one another in order to fulfill a specific functionality as the whole. The mutual communicate among them is realized by message passing. We distinguish between two types of message passing[1]:

Distributed systems can be described by several specific features. Among the most important features of distributed system, we can list the following ones[1]:

Gossip-based aggregation algorithms

There are many categories into which distributed algorithms can be divided. In this paper, we have focused our attention on so-called gossip-based aggregation algorithms. The set of gossip-based aggregation algorithms is a type of computer-to-computer communication, inspired by spreading of information between people via gossips in a social environment. Nowadays, a communication solution utilizing gossip-based manner of information interchange is often preferred in systems with specific characteristics such as a complex structure, distributed character of a computation process, an extensive area that a distributed system covers etc. According to [2], the set of the gossip algorithms can be characterized by three main attributes:

Protocol Push-sum

The protocol Push-sum is classified as a gossip-based aggregation algorithm formed by an iterative pair-wise distribution of aggregated values within a network. Its mail goal is to compute sums or averages of the values of all the entities in a distributed system. The Push-sum protocol is considered to be a highly resistant protocol. During the whole process, each entity stores two values: the value of the averaged quantity and the weight. The value of the averaged quantity is set to the initial value of the entity at the beginning of the whole process. The initial weight equals 1 for all the entities in a distributed system. The protocol [3] is executed as follows: During every iteration, each entity chooses uniformly at random one of its neighbors. This neighbor is sent the pair: the half of the value of the averaged quantity and the half of the weight. This information is also stored in the inner memory of the sending entity. Afterward, all the entities compute the estimation of the average by calculating the ratio of sums of these two parameters. This procedure is repeated until the system reaches the consensus, i.e. the difference of the maximal and minimum value within a system is less than 0,00015. According to the authors of [4], the correctness of the Push-sum protocol can be verified in terms of the fundamental property defined as the mass conservation. It says that the global sum of all the estimates in a system is constant in every iteration. The condition also says that the sum of all the values of the averaged quantity is same during the whole process. The same condition is valid also for weights.

Experiments

Within the practical part, we have focused on examining the effect of stochastic features of the Push-sum protocol on the rate of the protocol. It means that we have examined how a random choice of a receiver of a message affects the overall number of the iterations necessary for the protocol to be completed. We generated 10 topologies with the generator presented in [5] and depicted all the obtained results in Fig. 1. The systems consist of 50 entities and their topologies can be classified as dense. Since the Push-sum protocol is a converging algorithm executed in an iterative manner, it was necessary to define the convergence event to indicate the convergence. We used the already defined event with the difference less than 0,00015, which ensures a high precision of the computation process. The vertical axis in Fig. 1 represents a particular network. The horizontal axis contains the number of iterations necessary for a system to converge. We executed 1000 repetitions of the same experiment for each topology. From the graph, we can see that the more slow the protocol is for a particular topology, the more spread the obtained data tends to be; therefore, the variance is much larger in these topologies. We used the black diamond to label the average value counted from the 1000 repetitions for one topology. The empty circles have been used to label the number of iterations necessary for a system to converge (there may be more than one result with this value). These circles for different topologies are distinguished from each other by using a different color .

push_sum_01

Fig. 1 The variance of the convergence rate

In the next experiment, we analyzed the behavior of the parameter estimation in a distributed system. We depicted the curves for all 50 entities of one topology in Fig. 2. The black dash line indicates the average value counted from all the initial ones. It is the value to which all the entities converge.

push_sum_02

Fig. 2 The behavior of the Push-sum protocol

Contribution

This paper provides the analysis of how the stochastic features of the Push-sum protocol affects the convergence rate. We executed 10 000 experiments (1000 repetitions x 10 topologies) and depicted the obtained results. From them, we can see the stochastic features of the Push-sum protocol affect much more significantly distributed systems whose convergence rate is slower. At the end, we showed the behavior of the parameter estimation.

Acknowledgment

Research described in this paper was financed by the National Sustainability Program under grant LO1401. For the research, infrastructure of the SIX Center was used.

Literatura

[1] Kenyeres, M. (2015). Optimalization of Distributed Classification of the Convergence Event. In Proceedings of the 21st Conference STUDENT EEICT 2015. Vysoké ucení technické v Brne, Fakulta elektrotechniky a komunikacních technologií.
[2] Lin, M. J., Marzullo, K., & Masini, S. (2000). Gossip versus deterministically constrained flooding on small networks. In Distributed Computing (pp. 253-267). Springer Berlin Heidelberg.
[3] Kempe, D., Dobra, A., & Gehrke, J. (2003, October). Gossip-based computation of aggregate information. In Foundations of Computer Science, 2003. Proceedings. 44th Annual IEEE Symposium on (pp. 482-491). IEEE.
[4] Jesus, P., Baquero, C., & Almeida, P. S. (2009, January). Fault-tolerant aggregation by flow updating. In Distributed Applications and Interoperable Systems (pp. 73-86). Springer Berlin Heidelberg.
[5] Kenyeres, J., Kenyeres, M., Rupp, M., & Farkas, P. (2013). Connectivity-Based Self-Localization in WSNs. Radioengineering, 22(3).