
This is the webpage companion of the paper "A detailed study of the attachment strategies of new autonomous systems in the AS connectivity graph" by I. Daubechies, K. Drakakis and T. Khovanova, Internet Mathematics, Vol. 2 (2005-06), pp. 185-245. The research for this paper was funded mainly by NSF grant DMS-9872890, with additional partial support from AFOSR and DARPA.
This webpage contains the different data tables used for the paper. All of these are derived from the tables at NLANR. For our work, we needed information from this huge dataset, distilled in various different ways. From informal discussions with other researchers, we believe that these distillations, in the formats we provide, may be useful to others as well. This is why we make them publicly available here. Please feel free to use them; we would appreciate, if you use them in published work, that you include a reference to this webpage.
Ingrid Daubechies, Konstantinos Drakakis, Tanya Khovanova
The data we used were retrieved from NLANR http://moat.nlanr.net/AS/. In order to keep the present website functional, even if the NLANR-site became unavailable, we also copied all the data that we used; they can be retrieved here, in their original format.
Every one of these datafiles is labeled as pairs.yyyymmdd.number, where number is the original label of the file in the NLANR dataset, and where yyyymmdd is the date corresponding to the file. Each datafile is a list with two AS numbers on each line, separated by a space, indicating that on that day, the two corresponding ASes were connected.
We merged the original data week by week, with consecutive weeks starting on consecutive Sundays. We also "cleaned" the data as follows: we removed AS numbers larger than 27,000 (as well as the corresponding links), and we removed "self-connections" (in which the pair of AS numbers lists the same number in first and second place). In the original data, one week in the period we studied (more precisely, the second week of 1999) stands out by having only fragmentary data; to compensate for this missing week, we copied the preceding week. These weekly files can be found here.
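The cleaning step described above can be sketched in Python as follows. This is an illustration only, not the code used for the paper: the function name clean_pairs and the in-memory representation (an iterable of integer pairs) are assumptions, and the threshold is applied literally as "larger than 27,000".

```python
MAX_AS = 27000  # threshold quoted in the text; the exact boundary handling is an assumption


def clean_pairs(pairs, max_as=MAX_AS):
    """Return the set of links that survive the two cleaning rules.

    pairs: iterable of (as_a, as_b) integer tuples read from one or more snapshot files.
    """
    cleaned = set()
    for a, b in pairs:
        if a == b:
            continue  # self-connection: same AS number in first and second place
        if a > max_as or b > max_as:
            continue  # link touches an AS number above the threshold
        cleaned.add((a, b))
    return cleaned
```

Using a set also removes duplicate pairs, which is convenient when several snapshots from the same week are merged.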
Every weekly file is labeled Weekxxx-yyyyWzz.txt. The three digits x before the dash number the consecutive weeks starting with 000 and ending with 108 (inclusive). The four digits y after the dash label the year (1997, 1998 or 1999); the two digits z (after the W) number the week within its year.
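As an illustration, a weekly file name can be split into its three fields with a regular expression. The helper name parse_week_filename is hypothetical, and the file name in the usage example is made up to show the format; it is not taken from the actual file list.

```python
import re

# Week<3-digit running index>-<4-digit year>W<2-digit week of year>.txt
WEEK_FILE = re.compile(r"^Week(\d{3})-(\d{4})W(\d{2})\.txt$")


def parse_week_filename(name):
    """Return (running_week_index, year, week_of_year) as integers."""
    match = WEEK_FILE.match(name)
    if match is None:
        raise ValueError(f"not a weekly file name: {name!r}")
    index, year, week = (int(group) for group in match.groups())
    return index, year, week


# Made-up name, for illustration of the format only:
print(parse_week_filename("Week078-1999W01.txt"))  # (78, 1999, 1)
```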
Each file is a plain text file, in which each line lists two AS numbers, separated by a space; every pair represents a link in the AS-connectivity graph summary for that week. The links are sorted: first in increasing order of the first AS number on each line; then, for pairs that have the same first AS number, in increasing order of the second AS number. Every link is listed twice: once as the pair AS#1 AS#2, and a second time as AS#2 AS#1.
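A short sketch of how lines in this layout could be produced from a set of links: each link is emitted in both directions and the pairs are sorted numerically. The function name weekly_lines is an assumption made for this example.

```python
def weekly_lines(links):
    """Return the text lines of a weekly file for the given links.

    links: iterable of (as_a, as_b) integer tuples, in any order and direction.
    """
    both_directions = set()
    for a, b in links:
        both_directions.add((a, b))
        both_directions.add((b, a))
    # Sorting tuples orders by the first AS number, then by the second.
    return [f"{a} {b}" for a, b in sorted(both_directions)]
```

Note that the sort is on integers, not on strings, so that AS 9 precedes AS 10.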
The weekly files were put together by merging snapshot files from the original data; the Merge Summary lists the files that were picked (randomly) to create the weekly graphs.
For each week, we list all the AS numbers listed in the weekly summary of that week. To see the list of these weekly AS-lists, click here.
From this list, you can reach any of the weekly AS-list-files; the naming convention is very simple: for week 078 (out of the weeks 000 to 108) the file is named ASList-Week078.txt. Each file is a simple text file, in which each line lists an AS-number present in the graph that week; the AS numbers are listed in increasing order.
A summary that keeps track of only the total number of active ASes each week is given in the file NumberOfASesByWeek.txt; this file has one entry on each line, giving the total number of ASes in the graph for the consecutive weeks from 000 to 108 (in that order). When we plot the total number of ASes in a week versus the week number, we obtain the following plot:

From the weekly graphs we can determine the degree of each AS present in a given week, i.e. the total number of other ASes to which it is linked that week. These degrees are listed in files that can be found here: Degrees of ASes.
Again the naming convention is simple: file Degrees-Week078.txt contains the degrees of the ASes in week 78. Each file is a simple text file, in which each line has two entries, separated by space: the first entry is the AS number, the second its degree that week; the AS numbers are listed in increasing order.
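Because every link appears in both directions in a weekly file, the degree of an AS equals the number of distinct second entries on the lines that start with that AS. A minimal sketch, with the hypothetical function name degrees_from_lines:

```python
from collections import defaultdict


def degrees_from_lines(lines):
    """Map each AS number to its degree, given the lines of one weekly file."""
    neighbours = defaultdict(set)
    for line in lines:
        parts = line.split()
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        a, b = int(parts[0]), int(parts[1])
        neighbours[a].add(b)
    # Sorted by AS number, matching the order used in the degree files.
    return {asn: len(nbrs) for asn, nbrs in sorted(neighbours.items())}
```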
The information of all these weekly files is also contained in the Degree evolution file (5.66MB). This file contains exactly 27,000 lines; the entries on the N-th line list the degrees of the AS with number N-1 (from 0 to 26,999) for each of the weeks from 000 to 108, separated by spaces. If AS number 2 (for instance) is not present in our weekly summary graph for week number 5 (for instance), then the corresponding entry in the file is zero. If a particular AS number below 27,000 is not present at all in the graph, then all the entries on that line are zero. (For example, the first line in the file, corresponding, as described above, to the non-existing AS number 0, consists entirely of zero entries, separated by spaces.)
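The row layout can be illustrated with a small reader. The function names are hypothetical, and the sketch assumes the file has already been read into a list of lines.

```python
def degree_history(lines, asn):
    """Return the list of weekly degrees of AS number `asn`.

    The N-th line (counting from 1) describes AS number N-1, so with a
    0-based Python list the row index is simply `asn`.
    """
    return [int(entry) for entry in lines[asn].split()]


def weeks_present(history):
    """Week indices in which the AS has non-zero degree, i.e. is in the graph."""
    return [week for week, degree in enumerate(history) if degree > 0]
```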
For each week we give the Degree Distributions here.
The degree distribution files are again named in a simple way: each week gets its own file; the file for week number 78, e.g., is named DegreeDist-Week078.txt. In each of these text files every line has two entries separated by space; the first entry is the degree, the second the number of vertices with that degree. Only degrees for which there is at least one vertex with that degree, are listed; in other words, the second entry on each line is always greater than zero. The lines are listed by increasing degree.
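A degree distribution in this format follows directly from a mapping of AS numbers to degrees. In the sketch below, the function name degree_distribution and the dictionary input are assumptions.

```python
from collections import Counter


def degree_distribution(degrees):
    """Return sorted (degree, number_of_ASes) pairs.

    degrees: dict mapping AS number -> degree for one week.
    Degrees that no AS has never appear, so every count is at least 1.
    """
    counts = Counter(degrees.values())
    return sorted(counts.items())
```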
A summary of these degree distributions is given here, in table format. The first column is the week number, the second column is the highest degree that week, the third column is the average degree that week, the fourth column is the median degree that week, and the last column is the average degree for ASes with degrees more than 16.
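The four statistics in a row of this table could be computed as in the sketch below. The function name and the returned dictionary keys are assumptions; the threshold 16 is taken from the text and applied as a strict inequality.

```python
from statistics import mean, median


def summarize_degrees(degrees, threshold=16):
    """Summary statistics for the degrees of all ASes present in one week.

    degrees: non-empty iterable of positive integers.
    """
    values = list(degrees)
    high = [d for d in values if d > threshold]
    return {
        "max": max(values),
        "mean": mean(values),
        "median": median(values),
        # Undefined if no AS exceeds the threshold that week.
        "mean_above_threshold": mean(high) if high else None,
    }
```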
The file MaximumDegreeDistribution.txt describes how many ASes there are with a given maximum degree. The maximum degree is calculated over the whole period. The first number on each line is a maximum degree, the second number is how many ASes there are with this maximum degree.
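Given the per-AS degree histories, the maximum degree distribution can be sketched as follows. The function name is hypothetical, and skipping ASes that never appear (maximum degree zero) is an assumption about how the file was built.

```python
from collections import Counter


def maximum_degree_distribution(histories):
    """Return sorted (maximum_degree, number_of_ASes) pairs.

    histories: iterable of per-AS lists of weekly degrees over the whole period.
    """
    maxima = (max(row) for row in histories if row)
    counts = Counter(m for m in maxima if m > 0)  # assumption: absent ASes are skipped
    return sorted(counts.items())
```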
The file Appearing and disappearing ASs by week shows, for each week, how many ASs appear and how many disappear. The same file in table form is here. The picture below is a graphical representation of this file.

Here is the summary of the number of appearing and disappearing ASs. And this file summarizes appearing and disappearing behavior by the maximum degree.
The plot below shows the number of new ASs per week by type for the first 100 weeks. In this picture "new" means appearing for the first time.

Note that the classification of the nodes into T1, T2, or T3 should be taken seriously only up to week 70 or so; for ASs emerging near the end of the RV data set, we could not as accurately decide on their Kismet.
In the next picture below "new" means not present the week before:
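The two meanings of "new" used for these plots (appearing for the first time, and not present the week before) differ only in which earlier weeks an AS is compared against. A minimal sketch, assuming the weekly AS lists have been loaded as Python sets; the function name is hypothetical and the classification by type is not modelled:

```python
def new_as_counts(weekly_sets):
    """Count new ASes per week under both definitions.

    weekly_sets: list of sets of AS numbers, one set per week, in week order.
    Returns (first_time_counts, absent_last_week_counts).
    """
    seen = set()      # every AS observed in any earlier week
    previous = set()  # ASes observed in the immediately preceding week
    first_time, absent_last_week = [], []
    for current in weekly_sets:
        first_time.append(len(current - seen))
        absent_last_week.append(len(current - previous))
        seen |= current
        previous = current
    return first_time, absent_last_week
```

In this sketch every AS in the very first week counts as new under both definitions, simply because there is no earlier week to compare with.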

This file shows total degree gain and total degree loss by week. That is, for every week all increases in degree are summed up, and all decreases in degree are summed up as well.
This file summarizes how large the total change in degree can be for a single AS. Results are grouped. It also has a summary for new ASs.
Based on the evolution patterns of their degree, we distinguish three different types of ASes.
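One way to compute a week's total gain and total loss from two consecutive degree tables is sketched below. The function name is hypothetical, and treating an AS that is absent in one of the two weeks as having degree zero there is an assumption; the file may handle appearing and disappearing ASes differently.

```python
def degree_gain_loss(previous, current):
    """Return (total_gain, total_loss) between two consecutive weeks.

    previous, current: dicts mapping AS number -> degree.
    """
    gain = 0
    loss = 0
    for asn in set(previous) | set(current):
        delta = current.get(asn, 0) - previous.get(asn, 0)  # assumption: absent = degree 0
        if delta > 0:
            gain += delta
        else:
            loss -= delta  # delta <= 0, so this adds its magnitude
    return gain, loss
```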

but for some the fluctuations are more frequent and serious, to the point of looking random. See the example of AS number 2 below:





The log-log plot of the degree distribution for our dataset is shown below:

The same plot for the last week only:

The DegConnTo and DegConnFrom series of files describe, for each week, the new ASes that appear for the first time, their degrees, and the degrees of the ASes they connect to. In the file DegConnTo the first number is the AS number to which a new connection is made and the second number is the degree of this AS (to which the connection is made). Notice that if the initial degree of a new AS is m, there are m consecutive lines corresponding to it in this file. In the file DegConnFrom the first number corresponds to the AS number of the new AS in question, the second number corresponds to its degree and the third number is for internal purposes.
The file ConnectionDegrees.txt describes the degrees of the new connections that occurred between weeks 0 and 89. See the explanation in the paper of why we removed the last 18 weeks from this calculation. Each line in this file gives the degree of the new AS at the moment of appearance, then the degree of the AS to which it connects, and then how many such connections there are.
Here is the file DegreeGrowth.txt, which describes the degree growth of new ASes over 18 weeks. (It counts every new AS that appears at least 18 weeks before the end of the data.) Each line has three numbers: the first number is the degree upon appearance, the second number is the degree 18 weeks later, and the third number shows how many such ASes there are.
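The three-column layout of ConnectionDegrees.txt corresponds to a count over pairs (degree of the new AS, degree of the AS it connects to). The sketch below shows such a count for a single week. It is an illustration under stated assumptions, not the procedure of the paper: the function name is hypothetical, and both degrees are measured in the week of appearance, with the new links included.

```python
from collections import Counter


def connection_degree_counts(adjacency, seen_before):
    """Count new connections by (new AS degree, target AS degree).

    adjacency: dict mapping AS number -> set of neighbours in the current week.
    seen_before: set of AS numbers present in any earlier week.
    """
    counts = Counter()
    for asn, neighbours in adjacency.items():
        if asn in seen_before:
            continue  # not a first-time appearance
        for other in neighbours:
            # Assumption: degrees are taken from the same week's graph.
            counts[(len(neighbours), len(adjacency[other]))] += 1
    return counts
```

Summing such counters over the weeks of interest would give one line per distinct pair of degrees.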
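A count of this kind can be derived from per-AS degree histories, as in the following sketch. The function name is hypothetical. Two assumptions are made: an AS already present in the first week is not treated as new, and an AS that appears too late to be followed for the full horizon is skipped.

```python
from collections import Counter


def degree_growth(histories, horizon=18):
    """Count ASes by (degree at first appearance, degree `horizon` weeks later).

    histories: iterable of per-AS lists of weekly degrees (0 = absent).
    """
    counts = Counter()
    for row in histories:
        first = next((week for week, d in enumerate(row) if d > 0), None)
        if first is None or first == 0:
            continue  # never present, or present from the start (assumed not new)
        if first + horizon >= len(row):
            continue  # not enough weeks left to observe the growth
        counts[(row[first], row[first + horizon])] += 1
    return counts
```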
The plot below describes the empirical attachment probability:

Here is the same plot binned:

Here is the log-log plot of the Empirical Attachment Probability as a function of degree binned geometrically, with ratio square root of 2. The figure below shows the result for both the generous and the cautious definition of a "new" AS (see paper). Each error bar shows the first and third quartile for the degree-bin, with the mean in the middle, for the generous definition. The resulting curves are virtually identical:
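Geometric binning with ratio square root of 2 assigns a degree d to the bin with index k such that sqrt(2)^k <= d < sqrt(2)^(k+1). The sketch below assumes the bin edges start at 1 and averages the values in each bin without weights; both choices, and the function names, are assumptions made for this example rather than details taken from the paper.

```python
import math


def geometric_bin(degree):
    """Index k of the bin [sqrt(2)**k, sqrt(2)**(k+1)) containing `degree` (degree >= 1)."""
    # log base sqrt(2) of d equals 2 * log2(d).
    return math.floor(2 * math.log2(degree))


def bin_means(values_by_degree):
    """Average a per-degree quantity over geometric degree bins.

    values_by_degree: dict mapping degree -> value (e.g. an empirical probability).
    Returns a dict mapping bin index -> unweighted mean of the values in that bin.
    """
    grouped = {}
    for degree, value in values_by_degree.items():
        grouped.setdefault(geometric_bin(degree), []).append(value)
    return {k: sum(vals) / len(vals) for k, vals in sorted(grouped.items())}
```

With this ratio, some bins at small degrees contain no integer at all (for instance the bin between sqrt(2) and 2), so they are simply missing from the result.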

Here is the same graph as before except it is a function of the maximum degree of the connectee AS.

This plot represents the empirical attachment probability for a single given week, in this case week 2.

Here we would like to study how the empirical attachment probability changes with time. We take the period of 100 weeks and separate it into two periods of 50 weeks each. Then we calculate the attachment probability in each period (the results are binned). The log-log plot of the attachment probability as a function of degree is below:

Here is a similar plot corresponding to the division of the whole 100-week period into three periods:

Notice how the line for the later period is similar to the line for the earlier period but is moved to the right. This effect corresponds to the fact that the degrees of individual ASes, as well as the highest degree for the whole set, are growing. The two pictures below show the same plots as above, but with the plot lines merged to compare their shapes:

