Routing SIG, Thursday February 26, 4:00 pm-5:30 pm PHILIP SMITH: OK, we should make a start. Welcome to the APNIC 17 Routing SIG. Just some administration and announcements before we start proceedings properly - first housekeeping announcement is use the onsite notice board for update announcements. Please check that. There is a Jabber chat client service available, web casting and, of course, the live transcripts. Please let colleagues know that these services are actually available for you. The APRICOT closing event is this evening. You need a yellow event ticket to get on the bus so this is the yellow event ticket. If you don't have one, please go to the Secretariat. I think there are about a dozen or so tickets left. The buses leave at 6.00 pm from the hotel lobby, OK, so please be there 6.00 pm, not much later. The APNIC Members' Meeting is tomorrow morning. Registration starts at 8:30 and you need to collect your new name badge. Your APRICOT badge will not work, OK? The meeting will be held in this room, which is Unity 1, just in case you're not sure which room you're in. OK, so that was the housekeeping. This, as I was saying, is the Routing SIG. It's chaired by Randy Bush and myself. If you need to speak to us or want to speak to us, the e-mail address is there on the screen. You can join the mailing list through the APNIC website. If you're going to ask questions of our speakers, please use the microphones. Don't shout out from the middle of the room. There are two stationary mics there and, there are two roving mics who will come and find you if you don't want to walk to the stationary mic. Please speak slowly and clearly. If you're going to ask a question, announce your name and affiliation just for the benefit of the stenographers. OK, our agenda is very short. Well, we have only one item on our agenda, I should say. Unfortunately, Phillip Harris, our other speaker has had to go back home for personal reasons. 
So we'll postpone that part of the content until a future date. So we have one speaker who is Geoff Huston, who will be talking basically about allocation versus announcement, comparison of the RIR IPv4 allocation records with global routing announcements and other things. GEOFF HUSTON: We seem to have a fair deal of time so Phil and I thought that maybe it would be useful to go back through some motivational material first and actually have a look at the entire space of what we're talking about which is global routing on the Internet. So the first of these slides - lots of cute colours, lots of cute points. It's a bit like television, always compressed. There's actually a description of the Internet routing table for the last 10 years from 1994 through to 2004. And the first thing we're looking at here is the number of individual entries that exist in the routing table. And, in some ways, that's a kind of metric of the size of the Internet and the pace at which it's growing. There are a number of interesting things here, some of which are only partially obvious, but I'll go through them. Right at the back here in 1994, there was a suspicion we were running out of size, that the number of routing table entries in the Internet was going to grow very dramatically and life was going to get very crappy in a couple of years. You can't see much before there but it's actually quite a neat exponential growth curve. It was at that point that classless inter-domain routing (CIDR) was pushed on to the world and a push was made to introduce it. From the March IETF onwards in that year, you notice a slowdown through 94 in the growth of the routing table as operators move from advertising /8s, /16s and /24s into aggregating that and reducing the number of advertisements. And what was exponential growth up until around 1998 turned into linear growth of the routing table which is not a bad level. Around there, I actually managed to find the second series of data, so this data here comes from the Netherlands. 
I flicked over to Australia and there are now two things, one from Australia and one from the Netherlands, tracking through that growth. If the Internet routing table is an indicator of the growth of the Internet, then it's reasonable to say that the Internet boom started in March 1999 and what was a relatively gentle linear growth in aggregated routes turned into a mad splash of fragmentary routes from 1999 until 2001. And over that period, the routing table more than doubled, close to tripled, in size. The problem that we had at the time was that that pace was far faster than Moore's Law. Moore's Law says that the number of transistors in silicon will double over 18 months and the amount of computation you can put in silicon doubles every 18 months. The worry that we had about here was that, if that trend continued, the Internet would get faster or bigger than we could make silicon for within a few years. And there was certainly some concern around then that we were going to get to the point where the size of the Internet would exceed the capability of silicon to actually manage it. Around there two things happened - one - a recognition within the community that we really did have a problem. And secondly - the end of the Internet boom. Between the two, 2001 was certainly a somewhat different year. The other thing that happened in 2001 is one of the more remarkable data sources came online. This is a very large router operated at the University of Oregon that takes around 40 separate BGP feeds and allows you to split them out so now, instead of just tracking two BGP tables, you actually see that I'm tracking around 40. Interestingly, not everyone sees the same Internet. This view up here is actually internal to Telstra's autonomous system. And what you find is there are a large number of routes that are purely internal and they're very fine-grained - /29s, /30s and /32s and there are quite some thousands of them. Most ISPs would have the same thing. 
Internally, you have a lot of detail, externally, you aggregate and the bulk of the aggregation is there. So most folk see within about 7,000 or 8,000 routes of each other. Down here is interesting. I think the major point is actually Verio. There are a couple of others. A number of major transit ISPs do do prefix-length filtering and what they do is they publish a set of /8 prefixes and they publish the minimum prefix size they're prepared to carry and they're running a somewhat smaller routing table. Interestingly, it's the same growth level, the same growth trend, but they're down by around 10,000 routes. Let's have a look at more of this because it kind of gets interesting. This is the amount of address space that we're consuming in the routing table. Currently, some 2 billion or a little under /2 addresses have actually been allocated, about 1.8 billion. At the moment, 1.3 billion are advertised and the other half a billion addresses are dark, they're not advertised globally on the Internet. This is not a 10-year curve. It starts at 2000 so it's just the last four years. Interestingly, the growth went on linearly until early 2002. There was a leveling of the amount of address space. Whatever was going on there, whether NATs were very popular in 2002, were they? Flavour of the month? Or something else that was more social and economic. It's actually hard to believe because there were a lot of DSL rollouts at that point. For whatever reason, the growth rate was a lot lower across 2002. By 2003 we're back on a more gentle curve and then, just at the end of 2003, we're back into where we were again. It would be interesting - and if someone wants to do it - to actually look at snapshots of the routing tables to understand why we're getting these rather massive shifts in the growth of the address space but, nevertheless, yes, there are some shifts. How big are the prefixes we're advertising? 
Same four-year period - 9,000 /32s, 10,000, 11,000, 12,000 - over the last four years the average prefix length has certainly come down a little bit and we're now steady at around 10,000 entries. 10,000 is a /19, /20 average as I recall, I think it's about a /19. Although, in the last few months, it's come down again. There are a couple of more finer-grained prefixes - a large number actually - altering that average prefix length that's being advertised. The other thing that's evident in the BGP table is that there's a lot of noise, noise where the information doesn't add anything. It's quite common to see a large advertisement, maybe a /16. And then smaller advertisements within that same space - /20s, /24s, etc. So we have 130,000 entries in the routing table and, these days, around 65,000 of them, or half of the routing table, adds no new information. And, in general, the reason why people do this is often to do with local resilience and local traffic engineering. Unfortunately, it's very difficult to scope that. It must be very difficult to scope that kind of local advertisement because no-one seems to be able to do it successfully. And a huge amount of the global table, half of it, appears to be driven by what appears to be an expression of local policy - probably not the best thing to happen. And, as you see, it grows and continues to grow. That's the percentage - 50% of the routing table, 50% pretty steady over the last few years. How much address space? If half the routing table is more specifics, if half of the load in routers is more specifics, you'd expect, I suppose naively, that half of the address space would be treated this way? No. One sixth of the address space, in fact, slightly less, around 11% of the address space, is over half of the routing table. So the local policies we're talking about are actually local policies on very, very small prefixes predominantly /24s, 256 entries. 
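The "more specifics" count described here can be sketched in a few lines of Python. This is only a minimal illustration, assuming that a prefix "adds no new information" whenever a shorter covering prefix is present in the same table; it ignores origin AS and AS path, which a fuller analysis would need to compare. The function names and the four-entry sample table are hypothetical and are not taken from the talk's data.

```python
import ipaddress


def covered_more_specifics(table):
    """Return the set of prefixes that sit inside another prefix in the table."""
    nets = {ipaddress.ip_network(p) for p in table}
    covered = set()
    for net in nets:
        # Walk up through every shorter prefix length looking for a covering entry.
        for length in range(net.prefixlen - 1, -1, -1):
            if net.supernet(new_prefix=length) in nets:
                covered.add(net)
                break
    return covered


def address_count(nets):
    """Count the distinct addresses spanned by a collection of prefixes."""
    return sum(n.num_addresses for n in ipaddress.collapse_addresses(nets))


# Hypothetical four-entry table (private address space, for illustration only).
table = ["10.0.0.0/16", "10.0.1.0/24", "10.0.2.0/24", "10.9.0.0/24"]
all_nets = [ipaddress.ip_network(p) for p in table]
covered = covered_more_specifics(table)

entry_share = len(covered) / len(all_nets)
address_share = address_count(covered) / address_count(all_nets)
print(f"more specifics: {entry_share:.0%} of entries, {address_share:.1%} of addresses")
```

In this toy table the two /24s inside the /16 make up half of the entries but well under 1% of the address space, which is the same shape of imbalance the talk describes at a much larger scale.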
So that what's happening is that the mice, the little things, are actually dominating the routing globe. If you get rid of those small specifics and look at the prefixes that actually add information, across that four-year period, it's actually quite a consistent curve so that, where we saw all these bumps and anomalies, I suspect that a lot of it is due to traffic engineering and multihoming in a local context and, globally, the growth of the Internet over the last four years has been remarkably even and, interestingly, in routing terms, not in socioeconomic terms, but in routing terms, relatively linear rather than exponential. Oh, one last thing. The number of Autonomous Systems, the number of separate entities that push stuff into the routing table, the number of unique Autonomous Systems - the last five years - that's the end of the boom. I was only measuring from one point there but it's quite a strong curve. Around the middle of 2001, you see quite a pronounced knee and now we're in a post-boom build-out, very linear and here's where I started turning on looking at Route Views. Not only do all of the neighbours of Route Views see a slightly different set of advertisements, they also see a different number of Autonomous Systems and the folk doing prefix filtering actually don't see up to about 1000 Autonomous Systems. It's not clear to me that they actually don't see their routes. I believe that, in most cases, even when you prefix filter, what you cloud out is fine detail. You can still get your packets there anyway. Perhaps that's an area where more work can be done. That's an overview of the routing table and the basic message at this particular point is that the growth is no longer exponential and very hard, the growth is actually more gentle and linear. Let's move on to one other report before I get on to the presentation and this is a resource that you may want to look at for your own autonomous system. It's a thing called cidr-report.org. 
What does it do? It tries to look at which particular originating ASs are generating prefixes that could, in theory, be dampened down into being local rather than global. And here are the autonomous system numbers, the number of networks that are actually advertising and the number that could be removed if true aggregation was happening. And you find all kinds of interesting folk there with all kinds of aggregation possibilities listed through. The other thing I try and look at is who is adding and who is removing routes every week? So last week, something that looks very much like being in South America added another 14 prefixes into the global routing table and all these other folk added, you know, seven, six, five, four, three, two, etc. Quite a few. Busy week. The folk who increased their number of entries into the routing table, the number of folk who aggregated or decreased and so on. Leading me into my talk is one more section of the report. I'm interested to understand how many people receive an allocation from an RIR and advertise it as that allocation and how many people receive an allocation from an RIR and fragment it and advertise more specific routes. Because I'm wondering if our policies, which implicitly suggest that what you get allocated is what should be advertised, are actually working or not. So there's a section on the report here that actually looks at who is advertising fragments of an initial RIR allocation and how many of those things are actually fragments. And, as you find, there are a fair few interesting folk there and I'm sure you can read it as well as I can, some from this region, some from America, some from closer to home. They're actually advertising more specifics of the RIR allocation. So that neatly leads me into the presentation that I'd like to do this afternoon talk being a comparison of what the registries allocate versus what comes up in the global routing table. 
As I pointed out, a number of ISPs introduced prefix-length filters on the routes they accepted to try and dampen down some of this noise. Although not everyone does it, a fair few ISPs do some form or other of prefix-length filtering. We've had a number of accidents over many years where a large block leaks all the specifics, operator error whatever, and sometimes that brings down other people's routers. So prefix-length filtering is actually defensive and many folk do it. And the filters are typically based on the RIR allocation units. So if, out of a particular block, the minimum allocation is a /20, then you might find that the prefix filter that has been put in for that particular /8 is set to a /20 and, if people fragment and separately advertise little bits, it doesn't spread across the entire Internet, the prefix filters catch it at various points. So the implication of all of this is that there is, I believe, a relatively widespread view out there that what the RIR allocates is what we should see in the routing table as a single aggregate. And that the intention of those folk who fragment advertisements is generally things about local multihoming, local resilience, spreading your load between two upstreams and a certain amount of traffic engineering and, in theory, those fine-grained fragments should be scoped, no export, explicit communities that try and limit their promulgation. In theory these fragments shouldn't leak around the world. The kind of question that I had in my mind is how good is the assumption that what you get allocated is what gets advertised? And, if that assumption was good or bad, have things been changing recently, are we getting better or worse at actually tending to the health of the routing system? So the methodology is relatively simple. 
If you wander through the RIRs' data repositories you'll see delegated files which are effectively a log of the allocations they perform and their size and you can take that log of prefixes and sizes and compare it to a dump of the BGP table, which is what I did. So the first thing to look at is the last 13 odd months, across 2003 and what I found was 4,500 allocations which is actually, I think, a higher number than it should be and I think some of the early historical transfers from ARIN to RIPE got redated as they came through. So I suspect the real number is closer to 1,000 but I don't know. I can only take the data that I have. So, with that in mind, we press on. A quarter or a little less than a quarter of those allocations aren't in the routing table. Interesting. In the last year, a quarter of the things that got allocated aren't in the routing table. You'd say, "Maybe that's all those RIPE early allocations." It's not. Even in APNIC, LACNIC and ARIN, where 2003 allocations really happened in 2003, even though the allocation happened, it hasn't been advertised. However, 3,600 are. Of those 3,600 registry transactions, the routing table has 11,000 entries so, on the whole, everyone takes an allocation and puts two more specific fragments on it. Naughty people. I looked at it again because I actually don't think that's the case. So I looked a little more at those 3,600 allocations and found that most of them were advertised precisely. If they were allocated a /16, they advertised a /16. So the bulk of it, 8,000 advertisements, actually comes from 1,000 allocations, roughly. So 80% of the allocations are good but 20%, one fifth, are bad and the fragmentation rate is around 6.6. So it appears that most of us are doing the right thing globally and a fifth of us are actually splashing a lot of fragments out to the Internet. That's a large and busy table. Here's what people got, here's what they advertised. 
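The comparison described above can be sketched roughly as follows. This is a minimal illustration and not the speaker's actual tooling: it assumes the pipe-separated layout of the RIR delegated statistics files (registry, country code, type, start address, address count, date, status), and the sample records, the sample BGP prefixes and the function names are all invented for the example. Announcements that are less specific than the allocation are simply ignored here.

```python
import ipaddress

# Invented sample data in the delegated-file layout (private space, illustration only).
SAMPLE_DELEGATED = """\
apnic|AU|ipv4|10.0.0.0|65536|20030110|allocated
apnic|JP|ipv4|10.1.0.0|4096|20030212|allocated
apnic|NZ|ipv4|10.2.0.0|8192|20030305|allocated
"""
SAMPLE_BGP = ["10.0.0.0/16", "10.1.0.0/20", "10.1.0.0/24", "10.1.1.0/24", "10.1.2.0/24"]


def parse_delegated(text):
    """Turn IPv4 allocation records into a list of prefixes."""
    allocations = []
    for line in text.splitlines():
        fields = line.split("|")
        # Skip comments, header and summary lines, and non-IPv4 records.
        if len(fields) < 7 or fields[2] != "ipv4" or fields[6] not in ("allocated", "assigned"):
            continue
        start = ipaddress.ip_address(fields[3])
        end = start + int(fields[4]) - 1
        # The count is not always a power of two, so one record may yield several prefixes.
        allocations.extend(ipaddress.summarize_address_range(start, end))
    return allocations


def classify(allocations, announced):
    """Label each allocation as unannounced, exact, or fragmented."""
    routes = [ipaddress.ip_network(p) for p in announced]
    result = {}
    for alloc in allocations:
        inside = [r for r in routes if r.subnet_of(alloc)]
        if not inside:
            result[alloc] = ("unannounced", 0)
        elif inside == [alloc]:
            result[alloc] = ("exact", 1)
        else:
            result[alloc] = ("fragmented", len(inside))
    return result


def summarise(result):
    """Count the categories and the average routes per fragmented allocation."""
    announced = [v for v in result.values() if v[0] != "unannounced"]
    fragmented = [v for v in announced if v[0] == "fragmented"]
    rate = sum(n for _, n in fragmented) / len(fragmented) if fragmented else 0.0
    return {
        "allocations": len(result),
        "announced": len(announced),
        "fragmented": len(fragmented),
        "routes_per_fragmented_allocation": rate,
    }


summary = summarise(classify(parse_delegated(SAMPLE_DELEGATED), SAMPLE_BGP))
print(summary)
```

On real data the linear scan over every route for every allocation would be slow; a prefix trie or a table sorted by address would be the usual replacement, but the classification logic would stay the same.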
The predominant thing is to take /15s, /16s, /17s and even /20s and slice and dice them into /24s. Someone must have a textbook out there that is remarkably well read. I don't know why /24s but that's it. Most of it is just slicing and dicing into /24s. Not good data, not a healthy look. I thought maybe if I limit myself to the last one-and-a-half months, are we getting any cleverer. In looking at the last one-and-a-half months, 520 allocations, 217 aren't announced yet. They're very recent so that's probably about right. They've just got their block, it may be some time before we see it. 303 are announced and 576 routing advertisements. So rather than a multiplier of three, it's a multiplier of two. Getting a bit better. Are we? Oddly enough, the folk who are reading that particular textbook from the 1970s are still reading it and still doing it because, even in the last six weeks or so, 78% of the allocations are correct but 22% are fragmented and it's 4.6 that are fragmented. It's the same proportion who are doing exactly the same now. How many are in this room? 20. Four of you are doing it. In the same table, what's the most popular fragmentation point - /24s, although I noticed some /16s, which are actually quite large allocations. These folk should know what they're doing - fragmented into /21s and globally announce them. I'm kind of interested now that nothing seems to be changing. That doesn't tally. We must be getting better at the job. So rather than just looking at a few years previously, I looked at the entire table, the big picture for the last 20 years. Lots of numbers, aren't they fun. Pretty graph. What I've tried to track is the allocation versus the advertisement. People were allocated /16s and advertised them as /16s but also love advertising to /24s and you see that the most common way of fragments space is a /24 and there's also fragmentation around the /19, /20. Local traffic engineering. So 80% of the advertisements are 'as is'. 
Has that always been the case? Around 80% of the advertisements are 'as is'. If we did this exercise in 1989, it would actually be 90% of the advertisements would be 'as is'. We got a lot worse until 1994 when we really got bad. Between 89 and 94, people were accurate. Through the boom, through to 2001, the routing space also fragmented. The boom was driven by, if you will, fragmentation in there. And, since then, we've been slowly getting cleverer again. So it's not all bad news. Have a look at the number of fragments versus the number of allocations. Is actually now very indicative of the problem space. Between around '97 and 2001, we were excessively fragmenting the space. Since then, we've actually been getting a whole lot better in managing to actually advertise what we receive as an aggregate. And last, and not least, look at the number of allocations, the proportion that are advertised as fragments. Again, you see that same curve that, since around 2001 when we peaked at around 50%, we're now getting better. This is good. So you see that spot where the boom occurred, the number of routing table entries just went straight up. That was actually fragmentation of the space rather than raw demand of addresses. What we actually saw was a number of small businesses come up, grab addresses from wherever they could, fragment and advertise them out to anywhere they wanted. And, at the end of the boom, we actually managed to restore what we might call 'business as usual', although at a slightly higher level. So, as I see, the reason why the routing table tends to have unbounded growth is, when the RIR allocations aren't matching the natural tendency of the industry to operate. That, when the allocations are too big, the address space gets fragmented down into smaller entities. Since late 2000, oddly enough, the level of fragmentation has dropped. So what do people do? They take an allocated block and slice and dice it into /24s. That's what the textbook said. 
But the textbook was written before BGP had NOEXPORT in it as far as I can tell. If you want to do it locally, you don't want the world to see it. Around one fifth of the operators out there, the ASs, do an awful lot of fragmentation and everyone else actually manages to do aggregated advertisements. When the RIRs started allocating /21, /22s and /23s, that will actually match the end point. You don't actually see further fragmentation of those smaller blocks. Those smaller blocks appear to match quite precisely the end point of the business. And, yes, since around 2000, we have been getting better at it and the BGP table is certainly better behaved which is, I suppose, good. Questions. Oh, and thank you. RANDY BUSH: Randy Bush, IIJ. Do you believe that most of the fragmenting announcements - if I filtered strongly, my packets would get to the destination. GEOFF HUSTON: Yes, I believe it and, yes, I can show you data for it, yes. Fragmentations are not without covering aggregates on the whole that, when you actually look at particular Autonomous Systems, what you see is the original /18 and then maybe a /19 and, just to be on the safe side, a /20 and maybe some /24s as well all inside the same block. So prefix filtering certainly allows you, I believe, to see precisely the same Internet as much as anyone does see the same Internet and is a reasonable practice to try and limit the amount of noise you are seeing in terms of updates and so on of those prefixes. RANDY BUSH: And load on router. GEOFF HUSTON: And load, yes. So, yes, prefix filtering is probably a reasonable thing to do to actually reduce routing load and increase, if you will, the efficiency of your overall system. Interestingly, a number of you folk are actually fragmenting right now, one in five. These online reports will actually tell you who are. Type in your own AS and you will see. And it's probably worth doing. 
Because it's actually quite easy to cooperate with your upstreams and reduce the scope of the fragment. It's quite OK to take a feed from provider A and provider B and advertise more specifics to each to balance your incoming load. But, quite frankly, the rest of the world isn't interested and perhaps you should be giving them a community - maybe your upstream has a community you can tag the routes with to say, "Look, suppress this from here on through." Not interested in having it promulgated further, it's a local problem. If you are a transit provider, you should be offering your customers communities that allow them to say, "This is a more specific. Don't on-advertise. There are covering aggregates. It doesn't matter." So, on both the customer and the transit side, there are things you can do right now, simple things around communities, that actually limit this noise. Half of the routing table is this kind of noise. So the more we do in this, the more headroom we get in the routing system and the fewer spurious fine-grained updates go all the way round the world being amplified by BGP. This is probably a good thing. PHILIP SMITH: Have you contacted any of the people who are announcing these /24s to ask them why? Are they leaking their BGP on purpose or do they really believe that we still live in the old Internet or what? GEOFF HUSTON: Personally, no, Philip. I am aware that a number of individuals around the world have made various efforts at contact with varying degrees of success but, sometimes, when you contact them, you see a change. Other times when you contact these folk, evidently, the answer is either no answer or the answer is, "It's really complicated. It's so complicated we fix this. Wow, this is complicated" and that's the end of the conversation. I'm actually not sure. But this fragmentation has been around for a long time and those efforts to try and fix it at that grass roots level haven't been as good as they could be I suppose. 
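The community-based scoping suggested in this exchange can be modelled very simply. The sketch below is a toy illustration in Python, not router configuration: it assumes the well-known NO_EXPORT community from RFC 1997 (65535:65281) as the tag, and the Route class, the export function and the sample prefixes are hypothetical names made up for the example. An upstream could equally define its own community value for the same purpose.

```python
from dataclasses import dataclass

# Well-known NO_EXPORT community (0xFFFFFF01) from RFC 1997.
NO_EXPORT = (65535, 65281)


@dataclass(frozen=True)
class Route:
    prefix: str
    communities: frozenset = frozenset()


def export_to_external_peers(routes):
    """Keep only the routes that may be advertised beyond this AS."""
    return [r for r in routes if NO_EXPORT not in r.communities]


# Hypothetical customer announcements received by one upstream:
# the covering aggregate is untagged, the traffic-engineering fragments are tagged.
received = [
    Route("10.4.0.0/18"),
    Route("10.4.0.0/20", frozenset({NO_EXPORT})),
    Route("10.4.16.0/20", frozenset({NO_EXPORT})),
]

exported = export_to_external_peers(received)
print([r.prefix for r in exported])
```

The upstream still holds all three routes and can use the /20s to steer traffic toward the customer, while only the /18 aggregate is passed on to the rest of the world.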
RANDY BUSH: I might as well confess. Of course, I was the one who had the filtered policies and, in fact, we could reach everybody, etc and going round begging them to stop polluting was, ineffective. I wrote a large telco in the Pacific and he told me he was doing it for traffic engineering reasons and so on and so forth and so it was easier just to filter. GEOFF HUSTON: And, in some ways, filtering the noise out yourself gives you back some of your router space. It knocks off some of the updates in the processing load. It gives you back some of your routers and you don't lose connectivity by filtering, as far as I can tell. All these fine-grained factors are actually covered by aggregates. PHILIP SMITH: A question for the audience. Are there any operators here? Do you do any filtering of these fragments that are being announced for example like Randy has been doing. If so, do you want to comment? ANDY LINTON: Yes, we do and yes, we filter. PHILIP SMITH: No other questions, comments. PB PATIL: We are a no-export community. We are three. We have three ISPs. We export for traffic engineering purposes. PAUL WILSON: Hi. Paul from APNIC. We've heard quite a few times that routing tables have grown quickly but they are by no means too big for modern routers and modern routers could cope with routing tables of much larger size at least in static terms and, furthermore, that what causes load on modern routers these days - I can see Randy preparing - what causes load is the dynamism in the routing system, the number of updates which are - which need to be handled and so forth. Geoff, you provide a static view of how big the routing table is but you made reference to routes leaking out and trafficking the world and being amplified. I'm wondering if there is work being done in terms of the volume of announcements, the size of announcements and withdrawals and that sort of dynamic nature. 
RANDY BUSH: Yes, research has been done in that area and smaller - longer prefixes churn more. Also, just because giant routers, for which I must pay $500,000 can handle it doesn't mean that that's what I want to employ in a small, multihomed POP, right? So, you know, there's - consideration often makes sense in this world. PAUL WILSON: The routing table while, during the dotcom boom, it may have exceeded Moore's Law, on average, over the last 10 years, it's been well below I believe. So where are we really in terms of what we can, what we can handle versus the size of the table that we're being asked to handle? GEOFF HUSTON: I've been a keen student of some of Randy's presentations and what I see is that the interaction of various instances of BGP often turns a single announcement or withdrawal into a flood of announcements further away. I have not seen a comprehensive analysis of logs of updates and withdrawals in major transit spots although I suspect that, if you looked at something like the RIS, the RIPE Routing Information Service, you'd find a wealth of data. The frightening thing is, it is so much data, it's actually quite difficult to process. What we're trying to understand in terms of scale is that you can't take a snapshot of logs. You've actually got to take a time series and do very, very deep analysis if you're trying to analyse the whole. So far, we've been unable to - I haven't seen any good papers in this subject. We haven't been able to do that and the reason why I'm a keen student of the work Randy and his associates have been doing is because they're doing experiments on single prefixes and trying to understand the amplification factor. So we have this feeling that, as it grows, there's a multiplier in terms of the total load that the system has to carry and we're becoming aware, by those experiments, as to the multiplication factor. The next piece of work, which is a lot of work, is to understand the total load on the system as a whole. 
The suspicion is that these fine-grained prefixes which are intended to be traffic engineering local load-balancing are actually less stable than the aggregates and that's by and large what we're seeing, although it needs a little bit more analysis and that tends to suggest that the higher degree of fragmentation, the worse position we're all in. PAUL WILSON: Thanks. I've got one more question which is in the area of training and education. Geoff, you gave some suggestions earlier about how ISPs might avoid, lessen their impact on the routing table, tread more lightly on the routing table, if you like. How strongly would you word those suggestions and are your suggestions strong enough for them to be incorporated by the APNIC Secretariat into the sort of training that we're doing quite regularly around the region. Should we be promoting better routing practice in the training efforts that we're undertaking? GEOFF HUSTON: Yes. Simple communities can solve a whole heap of these issues much faster, I think, than almost anything else. But, having operators understand how they can apply communities to dampen down more specifics doesn't appear to be obvious. Here's an example where you see an original allocation of a /17 sliced and diced into /20s all from the same AS. So there's no new information about routing going on here. There's no new policy going on. And the rest of the world actually doesn't need to see any of that stuff. It's not new information. And, if the upstream at slowly AS cooperated on communities and dampened down this, we'd have a much smaller routing table with a much reduced overhead in terms of superfluous load. PHILIP SMITH: Are there any other questions or comments that anybody has at all? GEOFF HUSTON: I have one question and I think David O'Leary is here and I'd like to direct it to you. 
At the time when we were facing large problems in the late '90s and we had a routing table of around 100,000 entries, there was this general suspicion that, if we ever grew, at that point in time, to around 500,000 entries, we'd have meltdown. But it seemed to be that the margin of error in the routing equipment of the day was around 500,000 odd entries. If you were going to make an approximation as to what current technology is able to do, is it possible to give an estimate of how big a table could be? DAVID O'LEARY: I guess this is an 'it depends' question, right? How many entries can we put into a T640 versus what can run in the real world and I don't have that number for you. I guess - well, I haven't thought about that specifically, I haven't looked at that, we haven't done those tests in, kind of, real-world Internet because I think one of the dynamics we're seeing now is with other services being turned out in boxes, the edge devices are seeing more routes than the core devices, possibly, and there is a lot of other local stuff, VPNs and so on. So there's kind of higher demand there and, what's the biggest we've seen? I don't know, but certainly a lot bigger, a lot bigger than where we are in today's Internet. GEOFF HUSTON: Are we talking millions rather than hundreds of thousands? Let me prompt you. DAVID O'LEARY: Yeah, absolutely, yeah. I think one of the things that's happened is we have some competition now. RANDY BUSH: Is that storage or proportional to return? DAVID O'LEARY: You know, that's a good question. We don't see boxes running at 100% utilisation on CPUs. If you do, tell us and we'll try to figure out what's wrong. We don't see BGP as being what's hurting large routers. Now, I mean, that's not to say there aren't, you know… as Randy said earlier, there are smaller routers out there on the edges that are being challenged but, you know, I'm the wrong guy to talk to about small routers, pretty much. I do have a question though. 
One is a kind of an observation and you can talk about this offline if you want, but you said that, prior to '89, I think it was, I don't remember exactly what date, you said there was a 90% correlation between the allocations and the advertisements and I'm confused as to why it's not 100% correlation in that. I'm not sure how you configure on a router prior to 1992 in terms of how to, you know, summarise routes and de-aggregate routes. You get allocated class B and advertise that. GEOFF HUSTON: This is as a result of my methodology. What I don't have is routing snapshots for every day going back 30 years. I took today's routing snapshot and compared it to the historical allocation and what I find is that a lot of the old historical allocations are still in today's routing table unaltered. And it's only at around 1988, the allocations that happened around then, that they actually started to get fragmented. Now, whether they were fragged in '94, in 1998 or yesterday, I can't tell. But the original allocation is largely intact, which is actually quite strange and it's only the ones that happened in the late '80s that you start to see this fragmentation happening. DAVID O'LEARY: Maybe it's strange, maybe it's not. At least any customer, end user, I've talked to - if you ask for some address space back or ask them to renumber, that's always a challenge. Especially the people who bought their class Bs in, whatever, 1986 - there aren't a lot of them volunteering to give back space. GEOFF HUSTON: Right but, sort of, since then, in the mid '90s, you see that fragmentation. DAVID O'LEARY: I think all these dynamics here and one of the reasons why this is hard to figure out, what's going on, is because there are so many factors interacting. I know there are some interested people in this room but getting local exchange points, do we think that's actually making a difference in terms of visibility of advertisements?
Does the world of Asia Pacific or Australia or something look different in Poland when an exchange point is involved in Indonesia? I mean, in theory, it should, right? To you does it? GEOFF HUSTON: I have no data to answer that. DAVID O'LEARY: Yeah, I didn't expect but it would be interesting to try to figure that out, if it does actually really - in terms of the exchange point, people think of those kinds of things. Again, there's so many factors here that drive all these numbers. It's hard to tell which ones are going to have the most impact - it's a lot more work. GEOFF HUSTON: From my view in looking at this, the observation that they're actually consistently covered by aggregates tends to suggest that the specifics are designed for specific response locally. And, to my mind, the first-up answer is, bilateral use of communities between transits and upstreams will get rid of a huge amount of load and noise from the routing system and get rid of these fragments, because they just don't appear, logically, to have global [inaudible]. Otherwise, they wouldn't be covered by aggregates anyway. PHILIP SMITH: OK. Thank you very much. RANDY BUSH: I just want to be specific and say thanks because you're about the only one doing research at this level and this stuff is critical to operations. GEOFF HUSTON: Thank you. PHILIP SMITH: Are there any other questions at all? Any other comments? Anybody got any other things they want to discuss or bring up? OSAMA DOSARY: Actually, this is relevant to Randy's question - to remove specifics, longer prefixes, what would be a good breaking - /21, /18? GEOFF HUSTON: I mean, the assumption that's being made in a lot of places here is that the allocation suits the business and the business should be able to advertise that as an aggregate and that, when you slice and dice into more specifics, it's predominantly because you want to alter incoming traffic through various policies and paths. 
Now, if that assumption that what you get allocated is what you advertise is wrong, then, when we start talking about minimum allocation sizes and allocation windows, maybe it's a good time to expose whatever conditions you have that that assumption [inaudible] doesn't hold but the predominant idea in the routing - I'll use the word philosophy here - is that an autonomous system is also a single unit of policy and you should be able to, from a global perspective, be seen as a single unit, even if your local interactions are more finely grained. I suppose the answer is we shouldn't be fragmenting at all if you're doing it right. OSAMA DOSARY: OK, we might be on the file you mentioned. This has to do with where we're located. We are a subsidiary of a larger telecom and we need to allocate more addresses and they have their own peering arrangements that are separate from us, right, and we are not allowed to request an allocation for them. We had to break up our own allocation for them. RANDY BUSH: I want to address your first question. When IIJ was a filtering Nazi, we filtered on the minimum allocations with the prefixes being allocated by the registry. So we knew for each - the registry is documented, you can find them on the websites - for each part of the address spaces what was the longest prefix they allocated and we filtered on that boundary and, for the traditional class B space, we filtered on /16 and it worked. OSAMA DOSARY: Are you talking about reachability? RANDY BUSH: Yes. OSAMA DOSARY: OK, how do you know it worked? RANDY BUSH: Because for hundreds of thousands of customers, we had no complaints, "I can't get to X". We also studied the unfiltered routing table and showed that, essentially, for any prefix, it was covered by a shorter prefix with similar pathing. OSAMA DOSARY: So, as a recommendation to filter incoming routes, we can just, you know, just filter on /16. Is that your recommendation? RANDY BUSH: For Bs. For the B space /16, right?
For the A space, pretty much a /8 except now that the proportions of the A space are allocated to the registries for slicing and dicing into longer prefixes, you have to treat each /8 differently. 60 /8 you want to do on a - oh, some registry person tell me, /23 or /22 or something. 206, I think, you can do on a /19. Right? The registries have, each registry who has a block has a page that says for what range, what prefix is the longest prefix they allow for. So the filter is, you know, 50 entries, but you're not filtering packets, you're filtering route announcements so you're not going to load your router down. Am I making sense? OSAMA DOSARY: So you'd have to break it down, depending on the range, like, for A, you just do it as if it was A class, and for B as if it was a B class. But we have a couple of customers that actually are - they gave us from an A-class, so, if we did that kind of filtering, how would that affect the traffic? RANDY BUSH: All they have to do is get it to you and you can slice and dice it to your customers. All I have to do is get the ball into your park. The fact that you slice it up, that's your problem. PHILIP SMITH: OK. Do we have any more contributions? No, nothing at all. OK, well I would like to thank Geoff Huston for a very, very interesting presentation, full of amazing information, as usual. And thank you to all the contributors who gave very enlightening and interesting discussion afterwards as well. It was quite useful. So we're finished about 25 minutes early. Hopefully, that will give you enough time to get ready for the APRICOT closing social this evening. Just a reminder - the buses leave at 6.00 pm from the reception - or the lobby, the front of the hotel. And remember you need your yellow ticket to get on the bus. Otherwise, you don't get there. OK? Thank you very much everybody for coming and we will see you at the next APNIC meeting. APPLAUSE
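The filtering approach Randy Bush describes in the discussion above, accepting a route announcement only if its prefix is no longer than the longest prefix the registry allocates from that block, can be sketched roughly as follows. This is a minimal illustration in Python, not the configuration IIJ used. The table `MAX_LEN_BY_BLOCK`, the function `accept_announcement` and the fallback `DEFAULT_MAX_LEN` are hypothetical names, and the boundary values are placeholders loosely based on the figures mentioned in the session (/16 for the traditional class B space, /19 for 206/8, and an uncertain /22 or /23 for 60/8). Real boundaries would have to come from each registry's published pages.

```python
import ipaddress

# Hypothetical table: (address block, longest prefix length accepted).
# Values are illustrative only; take real ones from each registry's
# published minimum-allocation documentation.
MAX_LEN_BY_BLOCK = [
    (ipaddress.ip_network("128.0.0.0/2"), 16),  # traditional class B space
    (ipaddress.ip_network("206.0.0.0/8"), 19),
    (ipaddress.ip_network("60.0.0.0/8"), 22),   # speaker was unsure: /22 or /23
]

# Assumed fallback for blocks not listed above (not from the session).
DEFAULT_MAX_LEN = 24


def accept_announcement(prefix: str) -> bool:
    """Return True if the announced prefix is not longer than the
    longest prefix allowed for the block it falls in."""
    net = ipaddress.ip_network(prefix, strict=True)
    for block, max_len in MAX_LEN_BY_BLOCK:
        if net.subnet_of(block):
            return net.prefixlen <= max_len
    return net.prefixlen <= DEFAULT_MAX_LEN


if __name__ == "__main__":
    for candidate in ["130.10.0.0/16", "130.10.16.0/20", "206.1.32.0/20"]:
        verdict = "accept" if accept_announcement(candidate) else "reject"
        print(candidate, verdict)
```

As noted in the discussion, a table like this filters route announcements rather than packets, so a few dozen entries place little load on the router. It relies on the observation that the rejected more specifics are generally covered by a shorter aggregate.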