Path Similarity Evaluation using Bloom Filters

The performance of several Internet applications often relies on the ability to measure path similarity between different participants. In particular, the performance of content distribution networks mainly relies on awareness of the topology information of content sources. It is commonly accepted nowadays that, in order to ensure either path redundancy or efficient content replication, topological similarities between sources are evaluated by exchanging raw traceroute data and by a hop-by-hop comparison of the IP topology observed from the sources to several hundred or thousands of destinations. In this paper, based on real data we collected, we argue that path similarity comparisons between different Internet entities can be much simplified by using lossy coding techniques, such as Bloom filters, to exchange compressed topology information. The technique we introduce to evaluate path similarity provides both scalability and data confidentiality while maintaining a high level of accuracy. In addition, we demonstrate that our technique is scalable, as it requires only a small amount of active probing and does not depend on the set of targets.
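This record gives no implementation details, so the following is only a minimal sketch of the general idea in the abstract: each source encodes the hop addresses it observes into a Bloom filter, and two sources compare filters instead of raw traceroute data. The filter size, the number of hash functions, the SHA-256-based hashing, the `bit_similarity` measure (a Jaccard index over set bits), and the helper `encode_paths` are all assumptions made for illustration, not the authors' method. The IP addresses come from the reserved documentation ranges.

```python
import hashlib


class BloomFilter:
    """Small Bloom filter backed by a Python int used as a bit vector."""

    def __init__(self, num_bits=1024, num_hashes=4):
        # Both parameters are illustrative assumptions.
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0

    def _positions(self, item):
        # Derive num_hashes bit positions by salting SHA-256 with an index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits |= 1 << pos

    def __contains__(self, item):
        # May return false positives, never false negatives.
        return all((self.bits >> pos) & 1 for pos in self._positions(item))


def encode_paths(hops):
    """Hypothetical helper: encode observed hop addresses into one filter."""
    bf = BloomFilter()
    for hop in hops:
        bf.add(hop)
    return bf


def bit_similarity(a, b):
    """Assumed similarity proxy: Jaccard index over the set bits of two filters."""
    union = bin(a.bits | b.bits).count("1")
    if union == 0:
        return 0.0
    return bin(a.bits & b.bits).count("1") / union


# Sources A and B share three of four hops; source C shares none with A.
source_a = encode_paths(["192.0.2.1", "192.0.2.9", "198.51.100.4", "203.0.113.7"])
source_b = encode_paths(["192.0.2.1", "192.0.2.9", "198.51.100.8", "203.0.113.7"])
source_c = encode_paths(["198.51.100.20", "198.51.100.21", "203.0.113.30", "203.0.113.31"])

print(bit_similarity(source_a, source_b))
print(bit_similarity(source_a, source_c))
```

Only the bit vectors need to be exchanged in this sketch, so the hop addresses themselves are not sent in the clear, which is one way to read the confidentiality claim in the abstract. Because Bloom filters are lossy, the score is an estimate: hash collisions can inflate the apparent overlap.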

Disciplines :

Computer science

Author, co-author :

Donnet, Benoît ; Université Catholique de Louvain - UCL > ICTEAM > INL

Gueye, Bamba

Kaafar, Mohamed Ali

Language :

English

Title :

Path Similarity Evaluation using Bloom Filters

Publication date :

February 2012

Journal title :

Computer Networks

ISSN :

1389-1286

eISSN :

1872-7069

Publisher :

Elsevier Science, Amsterdam, Netherlands

Volume :

Issue :

Pages :

858-869

Peer reviewed :

Peer Reviewed verified by ORBi

Available on ORBi :

since 23 January 2012

Statistics

Number of views

166 (15 by ULiège)

Number of downloads

1645 (6 by ULiège)

