______________________________________________________________________

DRAFT TRANSCRIPT
Routing SIG
Thursday 2 September 2004
4.00pm
______________________________________________________________________

PHILIP SMITH: Good afternoon, everyone. I think we should make a start to this just to try to keep time for once, I guess. Welcome to the routing SIG. It's the final event of this part - the SIG part - of the conference. Just some administration before we get started with the presentations. Just a reminder, Randy and I chair this special interest group. You can reach us at this address: sig-routing-chair@apnic.net. The mailing list is sig-routing@lists.apnic.net. We don't have any matters arising from the previous routing SIG meeting, so we can pretty much get straight into the presentations. We've got three presentations today, plus I suppose Geoff Huston's 'BGP - The Movie' to finish off the SIG at the end of the afternoon. Gert Doering is not here, obviously, so I'll be doing his presentation, hopefully, on his behalf. Otherwise we will make a start with Tim, who will be talking about BGP wedgies.

TIMOTHY GRIFFIN: OK, thanks. This talk is really an informational talk about a class of BGP anomalies, and it's more of just a description, so that particular operators might recognise the situation if it ever occurs in their network. But in general, it's just - you know, BGP is this beast that has evolved, and we occasionally have to deal with some of the anomalies that are caused by its lack of design. This talk describes one class of anomalies. So the class of anomalies I've given the term BGP wedgie - it describes bad policy interactions that cannot be debugged. That's a little bit strong, because as we'll see, if operators talk to one another they may be able to straighten out the situation. I have a formal definition here of what a wedgie is. First of all, when you look at the BGP policies, they make sense locally. I mean, if you look within any particular autonomous system, talk to the operators, they will explain their policies; they make sense. The interaction of the policies globally, however, may allow for multiple stable routings. Now, it's interesting that quite a few BGP gurus are not aware of the fact that you can actually get very distinct stable routings with the same BGP policies. That is, in general you don't have a unique routing associated with a set of BGP policies. In practice we often do, but the protocol certainly doesn't guarantee it. So you either get one unique routing or you get multiple routings, and which routing is chosen is sort of non-deterministic. You can also have zero solutions or zero routings, and that's when we have protocol divergence. I'm sure some of you have heard of the MED oscillation problem. That arises when there is no stable routing that satisfies the routing policies, so the protocol just exchanges messages forever. So we have policies that make sense locally. They interact globally to give multiple solutions. Then, the important thing is that some of those solutions may be consistent with the intended policies and some are not. The problem is when a stable routing is installed and it's not intended. Well, the only way to kick the system back is through some sort of manual intervention. OK? And just these three conditions I'll call a 3/4 wedgie. This is not quite a full wedgie yet.
What really makes a full wedgie is when the policies are distributed across multiple domains and no one group of network operators has enough information to debug the system when it falls into an unintended solution. So that's a crucial thing. Now, exactly what that means is pretty vague, I'll admit, but you get the idea. Let me give you an example. This example was abstracted away from a much more complicated situation that arose at a large service provider in North America that I used to work for. So I've simplified it here quite dramatically. When you think about this example, you've got to think about a much more complex situation where people are trying to figure out what is going on and trying to sort out, is it this, is it that, is it the other thing. So I've done all that simplification for you. You don't have to worry about that. Here we have four autonomous systems and the customer-provider relationships that I've indicated there. So, for example, autonomous system 3 here is the provider of autonomous system 2, which is the customer of AS 3. The two top-level ASes have a peering relationship. So we have a customer that wants to implement a backup link to autonomous system 2 and a primary link to autonomous system 4. What it's going to do is implement that by sending a depref-me community up to autonomous system 2. These are fairly common now. Autonomous systems usually publish their communities to their customers, and customers can do things like control the scope of their routing announcements or the preferences of their routing by sending the appropriate community value. Many sober network engineers resist the temptation to implement these, but the marketing people insist, so they implement them. So the customer here in AS 1 is going to use that to implement the backup community, and AS 2 is going to implement that in a route map. Essentially, when it gets a route with that community, it's going to assign a local preference that's below that of its upstream providers' routes. So here is the intended routing in that scenario. That is, the AS 2-AS 1 link is intended only for backup purposes, if the AS 4-AS 1 link goes down. The intended routing is that traffic should flow from AS 2 through AS 3 and over to 4 and down to 1. Does that make sense? OK. But there is another solution to that set of routing policies, and I'll call this one the unintended solution - the unintended routing, I'm sorry. That is, AS 3 says, "Hey, I love customer routes. I love them more than peer routes. And if AS 2 gives me a route and I have a choice between AS 2's route and AS 4's route, well, I'll take AS 2's route." Once AS 3 has that route in its hands, it's never going to tell AS 2 about AS 4's route, so AS 2 has a route that has a depref-me community on it, but it's the best route it has, so it's stuck with it. So if you look at the intended solution there, the reason that's not happening is because AS 3 only hears about one route to AS 1. It hears about the one through its peer. It doesn't hear about the route through its customer. So which one of these gets installed? Which solution, which routing, gets installed really depends on the non-deterministic exchange of routing messages and various kinds of things. If you install the intended routing, you can very easily kick it over to the unintended routing just by bouncing the BGP session between AS 1 and AS 4.
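To make the two stable outcomes concrete, here is a minimal sketch - not from the talk - that brute-forces the stable routings of this four-AS example. The topology, candidate paths, preference ranks and the valley-free export rule are all toy assumptions hard-coded in the Gao-Rexford style:

    from itertools import product

    # Provider -> customers in the toy topology (AS 1 originates the prefix).
    CUSTOMERS = {2: [1], 3: [2], 4: [1]}

    # Candidate AS paths to AS 1's prefix, ranked best-first.  AS 2 honours
    # the depref-me community: its direct customer route (2 1) ranks below
    # the route via its provider, AS 3.  AS 3 prefers customer over peer.
    CANDIDATES = {
        2: [(2, 3, 4, 1), (2, 1)],
        3: [(3, 2, 1), (3, 4, 1)],
        4: [(4, 1)],
    }

    def exports(path, frm, to):
        # Valley-free export: customer-learned routes go to everyone;
        # peer- and provider-learned routes go to customers only.
        return path[1] in CUSTOMERS.get(frm, []) or to in CUSTOMERS.get(frm, [])

    def is_stable(choice):
        # A routing is stable when every AS is already using its most
        # preferred route among those its neighbours actually export to it.
        for asn, path in choice.items():
            avail = [(asn,) + choice[n] for n in choice
                     if asn not in choice[n]
                     and (asn,) + choice[n] in CANDIDATES[asn]
                     and exports(choice[n], n, asn)]
            if (asn, 1) in CANDIDATES[asn]:      # direct route from AS 1
                avail.append((asn, 1))
            if not avail or min(avail, key=CANDIDATES[asn].index) != path:
                return False
        return True

    for combo in product(*(CANDIDATES[a] for a in (2, 3, 4))):
        if is_stable(dict(zip((2, 3, 4), combo))):
            print("stable:", dict(zip((2, 3, 4), combo)))
    # Prints exactly two routings: the intended one (AS 2 via its provider,
    # AS 3) and the unintended one (AS 2 stuck on the depreffed direct route).

Under these assumptions the enumeration finds both of the routings Tim describes, and exhaustive search of this kind is also the flavour of the configuration-checking tools he mentions later in the talk.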
So if it just goes down temporarily or goes down as a failure, when it comes back up you'll find yourself in the unintended solution, the unintended routing, and you will stay there until you manually fix the problem. So the unintended routing is reachable just from this failure mode. The other thing here I want to say is that the intended routing would be the unique solution, the unique routing, if AS 2 translated its depref-me communities into depref-me communities for its provider, AS 3. In that case - if we look at the unintended routing - that wouldn't be a stable routing, because AS 3 would say, "Hey, I have this backup route and I have a peer route. I prefer the peer route." But I know of no service provider that does that kind of translation. You can just imagine the mess. For example, AS 2 perhaps has 13 upstream providers, all with different sets of communities. Well, maybe some of them don't even support communities in this way - such as Deutsche Telekom. So this is a real problem. It does happen in practice, and one of the difficulties I think I want to get across here is that people just don't understand what's happening. When I saw this happen at AT&T - whoops! - it was basically, "Well, it's a bug in Cisco, it has to be a bug in Cisco." Right? But it's not. It's a bug, if you will, in the protocol itself. And I think as people start using more and more complex routing policies, with more and more interdomain signalling using communities and more and more fancy routing, I think we're going to see more and more of this. So how do we fix it manually? Well, suppose we're sitting here in the unintended solution. I can bring the session down, or filter out the prefixes involved on that session, whatever, and then bring the session back up, and I kick the system back into the intended solution. Now, you've got to have network operators that understand what's going on here. And usually, if they do realise they have to reset the session again, they're most likely going to think, "Well, I just cleared a bug. There must be some bug, that I reset the session." But it's important to realise this is just inherent in the beast. So it requires manual intervention, which is not good for a dynamic routing protocol. It can be done in AS 1 or AS 2. That's why I'm calling it a 3/4 wedgie. It's not quite a full wedgie, because one group of autonomous system administrators could figure this out. Question back there.

RAM NARULA: I don't understand - in the chart on the previous page, on the left side, if the backup fails it would go into the unintended? If the primary link fails?

TIMOTHY GRIFFIN: If the primary link fails - well, let's go back to the intended solution back here. If the primary link fails, then we'll have traffic here. When it's brought back up, we'll end up in this situation. So the customer will call up the help desk at AS 2 and say, "Hey, I fixed my primary but you're still sending me traffic. Why are you doing this? Stop it."

RAM NARULA: Is it due to the cache?

TIMOTHY GRIFFIN: No. It's due to the fact that routing policies are very expressive. We're not constrained to shortest-path routing. There is no guarantee that there is a stable routing. There's no guarantee it will be unique if there is one.

RAM NARULA: If there is a prepend, won't it come back up?

TIMOTHY GRIFFIN: That works differently. In this case, it won't work at all, because AS 2 is likely to prefer customer routes, no matter how long the prepend is.
That's precisely why they're going to use communities in this case to implement the backup route.

RAM NARULA: Thank you.

TIMOTHY GRIFFIN: It's not good in a dynamic routing protocol to have to actually intervene. The situation could get a lot worse. I'm going to make this example symmetric: instead of backup and primary, I'm going to do load balancing. So AS 1 here has prefixes P1 and P2, and it's going to use this link as a primary for prefix P2 but a backup for prefix P1. Then it's going to do exactly the opposite on that link. It's going to implement this with communities sent up to AS 2 and AS 5. You see the problem. The problem is any kind of kicking the system back for P1 is going to cause you to get wedged for P2, and vice versa. So remember, in that example, if I just bring the primary down and bring it back up, I'm going to get stuck in an unintended solution. This is one of those things where I actually have to bring down both sessions simultaneously and then bring them back up, or filter out both prefixes from those sessions and then bring them back up. I've talked to a lot of network operators. I don't know many who can handle this situation. Imagine in particular that the left-hand link is in New York and the right-hand link is in Tokyo, and you've got to coordinate this between different groups within your network - within your engineering organisation. Randy, question?

RANDY BUSH: It gets even worse. Imagine the engineers actually finding out what was going on, calling the network operations centre and telling them they have to bring down a working link.

GEOFF HUSTON: It doesn't happen!

RANDY BUSH: Yeah, we need a laugh this afternoon.

TIMOTHY GRIFFIN: The case where I actually saw this happen first was exactly in a load-balancing case, where there were many other prefixes coming from the customer, not just the ones that were causing problems. Let's go on to a full wedgie example. What I've done here is just cooked up an example where I've tried to make a little delta change to the example we've already seen, so it doesn't get too complicated. I've just added this autonomous system that has a peering relationship with AS 2. It also is a customer of AS 3. And I have now two backup links and one primary. So it's not that much different than the example that I just showed you. I have just added a little bit of complexity. Again, AS 1 sends up depref-me communities to AS 2 and AS 5. AS 2 implements it in the way that I said before, except now we have to make sure that it prefers that route less than any route from peer 5. Now, here's where things get a bit funny. So AS 5 implements the depref-me community. It says, "Well, I'll put your preference between that of a peer and a provider." That is: when I depref you using this community, I'll prefer peer routes but not provider routes. So it's just a matter of where you are in the pecking order with this depref-me. You see a lot of this going on. There is no consistent implementation of these communities between providers. So that's the way that's implemented. Now, here is the intended routing. Again, it's just like we had before, except this guy's in there going up to the customer. So, boom, unintended routing - again, just like before. This guy's coming down through its customer. This guy is just hopping over to a peer. But it looks a lot like it looked before. Now the problem is - what's different about this example is that the recovery is not going to be simple.
So suppose that I'm in the unintended solution here and I take that link down between AS 2 and AS 1. Well, I'll bounce over to this other routing, where suddenly I have traffic coming in from AS 5. So remember, on the left-hand side I'm a customer; I say to my provider, AS 2, "Hey, you're sending me traffic. My primary came back up. Will you stop it." So we decided to reset the session. Guess what? We bounce over - the other provider is suddenly sending me traffic from upstream. Why are they doing that? And so you bring the link back up. So if you bring the AS 2-AS 1 link up and down, you'll bounce back and forth between these two unintended routings. What you have to do here is reset the AS 2-AS 1 and AS 5-AS 1 sessions simultaneously. So in other words, you've got to get cooperation from two service providers, and one of the service providers - AS 5 in this case - is going to say, "That route isn't even my best route. You're asking me to reset that session, and I'm not even learning a best route on that session. How could that possibly be? Go away. Don't bother me. I have other fires to attend to. Stop annoying me." Suppose you convince them to do the right thing: you can bring both sessions down and then bring them back up, and you're back to your intended solution. Now, I've given this example using communities and customer-provider relationships, but I don't mean to imply that that's the only place these could occur. So I've just sort of invented an ISP here - or it could be a corporate internet, for that matter. By the way, well-kept secret: many large corporations use BGP as an IGP, so this problem may appear there as well. Actually, it's more likely to appear in a situation like that, because you're less constrained by things like customer-provider relationships. Suppose here this is my corporation split into five autonomous systems. So I'm just going to say here they are again. It's the same example, although perhaps now I'm implementing those policies using - maybe I'm using MEDs, maybe I'm using communities, maybe I'm using other sorts of hard-wired preference values. The point here is that the same kind of problem can arise in sort of traffic engineering within one happy family of autonomous systems. So what do we do about this? Well, I think the first thing to do is be aware that there is a problem, that it can be difficult to debug, and that it has the characteristics that I outlined. It's a bit like the MED oscillation problem. It's nice to know it can exist, so when you're seeing strange things in your network you can say, "Hey, perhaps this is a MED oscillation problem." It could be something else, but at least you have a list of things you can check. Here it's a little bit different, because - it's in the very definition of it - it involves knowing things that your neighbours know that you don't know. So it does imply that you have to talk to your neighbours a little more than perhaps is currently happening. The other thing is that interdomain communities need to be thought out very carefully. I actually think it would be a good idea to standardise some of these depreffing communities so we could actually talk about consistently implementing them across autonomous systems. One of the big problems here is that for these very commonly used depref-me communities, there is no common way that they're implemented, and the differences between implementations can cause these problems.
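As a small illustration of that inconsistency - with invented community and local-preference values, not any provider's actual ones - here is how the same depref-me community can land in a different place in each provider's pecking order, AS 2 putting it below providers and AS 5 between peers and providers:

    # Hypothetical depref-me community; real providers each publish their own.
    DEPREF = "65000:80"

    # Each provider's local-preference ladder (higher wins in BGP).
    LADDERS = {
        "AS2": {"customer": 120, "peer": 110, "provider": 100, "depref": 90},
        "AS5": {"customer": 120, "peer": 110, "depref": 105, "provider": 100},
    }

    def local_pref(provider, relationship, communities):
        ladder = LADDERS[provider]
        if relationship == "customer" and DEPREF in communities:
            return ladder["depref"]
        return ladder[relationship]

    for p in sorted(LADDERS):
        lp = local_pref(p, "customer", {DEPREF})
        print(p, "depreffed customer route gets local-pref", lp,
              "- beats peers:", lp > LADDERS[p]["peer"],
              "- beats providers:", lp > LADDERS[p]["provider"])
    # AS2: below both peers and providers.  AS5: below peers, still above
    # providers.  Standardising the relative position, as suggested above,
    # would remove exactly this ambiguity.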
Finally, there may be some way, for a small set of configurations - let's say the configuration files within a small handful of autonomous systems, half a dozen, a dozen - there may be techniques to actually automatically discover the possibility of potential BGP wedgies there, and that's something I've been thinking about and just started working on. In theory it's very difficult, because you essentially have to enumerate all possible paths in the system, but for smallish networks that's actually do-able in a reasonable amount of time. So perhaps there's some help from tools here. The other thing is, I'd be interested to hear from operators that may have encountered problems like this but maybe didn't have a framework to put that problem in. So I'd be happy to hear about comments. Questions?

RANDY BUSH: Geoff, are you considering a task force draft on a single set of communities for pref, so translation upwards is unnecessary?

GEOFF HUSTON: You're asking me this in the context of the GROW working group?

RANDY BUSH: The Internet vendor task force.

GEOFF HUSTON: There is a draft attempting to standardise some number of communities in BGP, but the use of associated values with those communities is not being standardised. So you still get the issue of - well, rather than saying depref me in 10 different ways, you can say depref me one way, but trying to standardise the actual preference values is actually not on the agenda right now, Randy, no, and there are no drafts around in that area. So nothing is being done at the level you were suggesting, Tim, that it would need to be standardised at.

TIMOTHY GRIFFIN: Perhaps what's needed is not so much the values themselves but the relative values of different classes of routes.

RANDY BUSH: No, because then I wouldn't have to translate.

TIMOTHY GRIFFIN: I mean to say that these different - I know at AT&T you can depref yourself below a peer or below a provider, and what are those classes and how are they - you don't have to actually say 100 or 90 or 70 for a preference. You can just say, relative to this other class, I'm lower.

GEOFF HUSTON: And that is not on the standardisation agenda right now, but certainly could be. There's no work at that level yet.

TIMOTHY GRIFFIN: Another question.

SHARIL TARMIZI: It probably is a dumb question - I'm probably the only non-techie in this room. But have you known of any incidents where people have used these kinds of situations as a competitive or anti-competitive measure, because you do not want them to route the traffic in a particular way because you do not want your competitor to get more?

TIMOTHY GRIFFIN: I don't, but I know of providers that provide these communities because they're forced to, because their competitors will provide them. And that's the only competitive situation I can think of here.

GEOFF HUSTON: This is one of these instances of precisely what are you trying to achieve, and how. BGP as it was originally set up had this really coarse metric around AS path lengths, and the system basically preferred the most specific prefix with the shortest AS path length, and that was the end of it. Then folks said: not good enough. I want to bias my incoming traffic based around things other than raw AS connectivity. So we got into AS path prepending wars and AS poisoning wars, where you were trying to manipulate the path for traffic engineering outcomes. Then along came the use of communities to try and increase the language in BGP, to allow you to do a whole bunch of things beyond simply the AS path.
Now you've got this highly expressive language but no standardised way of speaking it, and then you push this out to the community. So where we got to with the depref-me stuff was: normally ISPs want to minimise their spend, so you normally prefer customers over peers over transits. But to attract customer money - in other words, maximise your income - you generally have to give them the greatest amount of flexibility you can as a provider. So all of a sudden you expose your internal widgets to those customers, saying, you can direct me to do anything you want; this is my advantage as an upstream. All of a sudden now you're getting this rich, expressive language being used in a transitive way. You're getting highly complex community mappings coming through, because the communities exposed to you aren't the communities my upstreams may use - and then, you're saying, strange things may happen.

TIMOTHY GRIFFIN: That's right.

GEOFF HUSTON: Cool.

PHILIP SMITH: Thank you, Tim. Next up we have Randy, who will be talking about happy packets and other things.

RANDY BUSH: As you can see, Tim is a co-conspirator on this, and I can't keep my dates straight. So much for that. There's a central question which we're asking, which is: what is the relationship between control plane - that is, routing - instability and the data plane - that is, forwarding of payload packets? And the related question is: is the quantity of BGP updates good or bad? Who wants to see zero BGP updates? That's static routing. We know that's not going to work. Because we frequently hear comments, and read in the press and so on and so forth, that Internet routing is fragile, Internet routing is collapsing, it's going to hell, BGP is broken or is not working well. On Wednesday there was a bad routing day on the Internet - just look at my graph. Or: if we change the routing protocol it will improve routing. And we often measure routing dynamics in some fashion - like the number of updates - and say that some measurement is better or worse than another. And we are also told that a lot of BGP updates is bad, as in instability, and there are too many BGP updates, or BGP must be broken. In fact, I would suggest that BGP announcements are like white blood cells. Their presence signals a problem. When you have a high white blood cell count, it's telling you the body is producing these things to fight an infection. But the white blood cells are not the infection. They're part of the cure, not the problem. So when your routing announcements tell you something's happened, they're just trying to help the packets get around it. This is a log base 10 scale. We see this, and we get told this is a major problem - the Internet had a very bad day - because they saw a lot of prefix announcements. In fact, these people are measuring big events. I'm going after some detailed single-announcement events. This isn't too well connected with this graph. But we're seeing things like that. I would say instead: let's talk about routing quality and what is good routing. How can we say this measurement shows routing is better in A than B unless we have a metric? And it's not the number of prefixes or speed of convergence et cetera that are the measure of routing quality. We contend that the measure of routing quality is how well it controls the network so that the users' packets - the payload - reach the destination. I'm an operator. Did I deliver the packets to the customer or not?
So if the users' packets are happy, the routing system and the other components in the network are doing their job. So we call these happy packets. We have well-known metrics for happy packets - delay, drop, jitter and reordering. We have easy ways to measure, so we set out to measure how the control plane is correlated to events on the data plane. When I say that we don't care if there are a lot of BGP announcements if the data gets there, I would like to qualify that a little: if there were so many announcements that the routers were getting loaded by the control plane, then there's a problem. But as long as the BGP chatter stays below Moore's law as the Internet scales - in other words, if the BGP chitchat does not more than double every 1.5 years - then the hardware is going to keep up with it. So we decided to conduct an experiment. We have a BGP beacon - I'll tell you what it is in a minute - and we stream packets to it from all over the Internet, and we record the packets while BGP changes. Let me give you some details. A BGP beacon is a prefix that is announced and withdrawn at known times. Here's a BGP beacon announcing this prefix to the global Internet. It's a dual-homed beacon. It announces the prefix for two hours, it withdraws it for two hours; it announces, it withdraws. But because it's dual-homed, it's got two providers - provider one and provider two. Here is the announcement to provider 1. At 2:00 in the morning it changes to both. At 4:00 in the morning it goes back to 1, et cetera, et cetera. Think of it as simulating a dual-homed enterprise losing one of its links, and the link comes back up, or the other link goes down, et cetera. So we're simulating real events on the Internet. And we stream data from something called PlanetLab. It's about 370 nodes, and you can run experiments on all these nodes. We stream the data from all those PlanetLab nodes towards the BGP beacon. As the beacon changes its announcements, we capture those data streams and we look to see what happens to the payload data, the actual customer data. Here we have the announcements at time 0. This is delay on this axis, and drop and reordering on this axis. So we have no drop, no need for reordering and no change in delay. Well, when we went from dual-homed and we lost ispA - well, in fact, nothing happened, because this customer always preferred ispB. Since we didn't drop ispB, the packets are travelling the same path. This will be half of all the measured cases on average. Here we have the same thing - we're going from dual-homed, we drop B. So the packets get dropped, the delay goes in here, and I think even some scattered reorders happen. This is for about 30 seconds, I think. We can expand it and see what happens - no, it's about 45 seconds - oh, but it starts 15 seconds after the event. And we see this gap where the packets actually don't get there. So the customer loses a link, and for 30 seconds packets don't get there. We also see some interesting cases where, in the gap, there are these islands where things are working. In fact, if you look here, they're working most of the time, though there is some drop and some jitter. But packets are getting there a lot of the time. Why there are these islands of instability we don't know yet. We suspect intermediate autonomous systems are converging. How's that for smoke! Here's a bunch of them - just looking at the delay for a bunch of different sources.
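As a sketch of what "happy packets" means in practice - my illustration, not the project's actual measurement code - here is how the four metrics named above could be computed from one probe stream towards the beacon:

    def happy_packet_metrics(probes):
        """probes: (seq, sent_at, received_at) tuples in send order;
        received_at is None for a dropped packet, times in seconds."""
        got = [(s, tx, rx) for s, tx, rx in probes if rx is not None]
        drop = 1 - len(got) / len(probes)
        delays = [rx - tx for _, tx, rx in got]
        delay = sum(delays) / len(delays) if delays else None
        # Jitter as mean absolute delay variation between consecutive probes.
        jitter = (sum(abs(a - b) for a, b in zip(delays, delays[1:]))
                  / (len(delays) - 1)) if len(delays) > 1 else 0.0
        # Reordering: arrivals whose sequence number goes backwards.
        by_arrival = sorted(got, key=lambda p: p[2])
        reordered = sum(1 for a, b in zip(by_arrival, by_arrival[1:])
                        if b[0] < a[0])
        return {"drop": drop, "delay": delay,
                "jitter": jitter, "reordered": reordered}

    # Invented toy trace around a beacon transition: one loss, one late,
    # reordered packet.
    trace = [(1, 0.0, 0.030), (2, 0.1, 0.131), (3, 0.2, None),
             (4, 0.3, 0.345), (5, 0.4, 0.430), (6, 0.5, 0.620),
             (7, 0.6, 0.615)]
    print(happy_packet_metrics(trace))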
The delay is pretty noisy in the first place. The event happens - you know, so this guy gets a lot worse on the new path, which it moves over to in a few seconds. This is a close-up of that - not very interesting. Here's another anomalous situation. Things were slowly getting better all the time. So probably the source node was sick in some fashion and carrying congestion, and the congestion may have been going down. But notice that, you know, this guy gets very noisy after. So his old path was good. His new path is worse. Then we have endless of these graphs. And we decided what we want to do is see if there is a relationship between the amount of loss of data packets and the amount of BGP announcements. In other words, is BGP noise correlated with the data loss? So this is the CDF of the number of updates and the duration of packet loss during a transition from both to A - in other words, these. It doesn't look very exciting. ispA is lost - about the same story. Here it is going from just ispB to both - in other words, A is recovering. So on and so forth. So we're sort of plotting duration of BGP announcements - how long the noise went on - and packet loss. If there was a correlation, we'd expect to see a line somewhere like this. We don't. Same thing for the BGP updates - this was on a transition to A. Here's one on a transition to B. Same uninteresting news. Here's one with BGP duration again, and aggregated - it's all beacon events. Again, no patterns on the diagonal, which is what we were looking for. So it's very hard to say that what we see is a correlation between the number or duration of BGP announcements and customer data loss. There was that anomalous guy whose stuff bounced. We went after it. It's some sort of radio link that he's transferring to. There he is a little closer. Here's the nonsense - just anomalies we've seen. There's another way of looking at it. The sites prefer B, so we drop B. We go from AB to A. There's Germany - during the routing change, packet loss gets worse. Russia - packet loss gets much worse. Toronto. Australia - it's better when they lose their provider. They're upside down there anyway - who knows. MIT, Berkeley, Poland, Texas. I don't know what you do down there, Geoff. OK, same story for transferring the other way - in other words, they prefer B, so we recover B. Prefer B - go to B only. So if they prefer B and you go to B, why are these guys seeing anything in that routing change? It goes on like this. Here is a correlation between loss rate and AS hop count. It doesn't look really exciting. The next one I think does, yes - router hop count and duration of the BGP announcements. Now this figures: the more routers in the chain, the longer it's going to take to converge. OK, that one is about what we expect. So this seems to say that distant sites experience more loss. There is a correlation between a site's routing preference and the type of transition. In other words, if they prefer A they'll have higher loss when they go from AB to B. If they lose their preferred provider, they're in trouble. The correlation between loss rate and AS or router hop count is very weak. At some sites the loss rate during normal periods is higher than during routing change. Some references, et cetera. I just note sponsors: University of Oregon, National Science Foundation, and so on. Now I'm going to subject you to just one more presentation that I'm sneaking in on you that you didn't expect. I made a presentation, I think, at the last one - again by the same nefarious crew.
I did it a year ago - so it's two meetings back, in Seoul. I told you about BGP beacons, and we have one large international ISP - we have all their announcements for a single-homed BGP beacon. And we discovered that when we made an announcement, things looked simple, even if it was multi-homed. But two hops away, in Chicago, we saw route oscillation due to MEDs, so for one announcement we saw four, and in some circumstances in Chicago, for a single announcement going from null to putting up ispA and ispB, we saw 41 events. So we started working with that ISP and looking at that data, and here are the different colours - they're different routers - and look at all that noise. 25 things for one event, et cetera, et cetera. This is the event of going from - the beacon just turns up and says, announced by B. So they made a simple change and got improvement. The change they made was just - if you remember the talk from a year ago, we believed the problem was in part due to multiple vendors and their different buffering over time before they'll announce a route. So they pushed a little delay into one of the vendors and greatly reduced the noise. And no serious delay. They barely touched the router and got this. So we can help you. Questions? Answers?

RAM NARULA: Are these issues with IPv6?

RANDY BUSH: No idea.

GEORGE MICHAELSON: I think it's a reality check statement to get back to the data plane and talk about the good book. It is astonishingly obvious - a simple truth will always be the case. But there are also interesting analogies in the biological sciences. People are very fond of saying Darwinism, evolution, is maximally efficient. But that's completely untrue. There are lots of inefficiencies in natural evolution. It only has to be good enough; it doesn't have to be perfect. Of course it's better to have good routing, but good enough is good. I think it's a good context to remind people of.

RANDY BUSH: I don't have to run faster than the lion. I only have to run faster than my friend.

PHILIP SMITH: Thank you very much to Randy. Now it is my turn to pretend to be Gert Doering.

(Thanks hosts and sponsors)

Just to remind you about the onsite noticeboard: if there's any up-to-date information, it will all be available on there. The opening plenary is available in the archive already, on the APNIC 18 meeting web site. The MyAPNIC demo is on all day at the help desk area just outside this room.

RANDY BUSH: And it's really good.

PHILIP SMITH: And it is really good! The help desk is also available - come and have a chat with the APNIC hostmasters when you're available. If you haven't had a look at the MyAPNIC demo, please go and have a look at it. Take your opportunity to go and speak with the APNIC hostmasters, indeed any of the APNIC staff. It's quite rare to see them out in this part of the world, so take that opportunity that you've been offered. Gert Doering couldn't make it from Germany. He did want to come, escaping the German autumn to come down to what I suppose we'd call partial summer here. He's done the IPv6 routing table report for probably the last three or four years, mostly at the routing working group or maybe the IPv6 working group at the RIPE meetings - I don't remember which one, but I've seen most of his presentations. I'll give this on his behalf. Please direct questions towards him; his email address will be up at the end of the slides.
General overview: have a look at some of the numbers, some pictures, graphs, trends, things that should be there, some conclusions and some recommendations. The presentation is on the APNIC web site. He's been updating it every day over the last few days. I've given the latest one to APNIC, so hopefully it's actually up there; failing that, you can get it from that URL. As of 30 August - in other words, three days ago - there were 457 AS numbers visible. Compare that to 421 a few months back, so steady growth in AS numbers is happening. Of those, 281 are origin-only, 164 are origin plus providing transit, and 12 are transit-only ASes. As a side note, those of you who see my routing report will see that I quote some of the statistics, so Gert has been trying to copy the same format so you can compare between v4 and v6. v6 has a bit to go. There is a mixture of RIR and 6Bone space being announced. 300 ASes originate one RIR prefix. 44 announce just 6Bone prefixes; 46 announce both 6Bone and RIR space. 27 ASes originate two RIR prefixes; 10 announce both /32s and /35s. Those of you working in v6 space will remember that the original allocation was a /35, and those who had that got a free upgrade to a /32. Some are still in transition between the two sizes. 28 ASes are announcing more than that. The largest one is an announcement of 56 prefixes, which is quite amazing. 14 ASes still announce a prefix as both a /32 and a /35; all these paths are observed from AS 5539, which is, I believe, SpaceNet in Germany. So why are people announcing two prefixes? The first one is the 6Bone-to-RIR migration: some people started off in the experimental IPv6 Internet, so they got this 3ffe address space. When, I suppose, real IPv6 address space appeared, these people migrated, or are starting to migrate, over to the 2001::/16 address block that's used for that. So the example he has there is Cisco: we announced both our address blocks and are transitioning out of the 6Bone space. The next one - the migration from /35 to /32: some people are announcing the /35 they originally received from the regional Internet registries, and they have started announcing the /32 as well, the free upgrade so to speak. Then we have experiments and/or leaks, possibly. This example is originated by AS 17382: they're sending out the /32, but they're also announcing two subnets of this, two /48s. The next one - the multi-uplink or multihoming experiments. There you see three /48s appearing. The final one - mergers and acquisitions - or something. Or even different business units of the same company: AS 3303 is announcing two /32 address blocks. So if we look at the number of prefixes received: of /16s we see one, and that is the 6to4 address block, which I guess we would expect. There is a single /20, which is coming out of the RIR address space. There are 38 /24s coming from the 6Bone address space. We see a single /27, and 38 /28s, also from the 6Bone address space. The 6Bone started off handing out /24s for the experimental network, then downsized that to /28s, which is why we see the two address space sizes from 6Bone blocks. As for /32s, we see 363 from RIR space - those are the allocations that the registries are making. We also see 31 out of the 6Bone address block. Of /33s we see two; of /35s we see 40. /36s to /39s, nothing at all, but we see three /40s, three address blocks from /41 to /45, and 101 /48s. A /48 is, I suppose, the minimum address space that's assigned to an end site. Then, smaller than that, /52s to /60s, nothing at all.
A single /64, then nothing from /65 to /128. The /64 prefix is basically an anycast prefix, so we expect to see multiple origin ASes - many autonomous systems, service providers and others, will provide this facility. If you look at the list there, Gert sees about ten or twelve 6to4 prefix blocks being announced, from ten or twelve autonomous systems. There has been some research done on non-publicly-visible 6to4 relays; David Malone reckons there must be around 34. They could well be there. There are some more-specific prefixes from the 6to4 block being announced; that's as predicted by RFC 3056. If we look at the routing table growth over the last 36 months, you see a steady increase like that. One or two obvious jumps, here and there, which I'll talk about in a minute, as well as a big hole here - he doesn't know what caused that, probably a breakage in the feed. So it's a steady growth. Comparing the RIR address block to the 6Bone prefixes over the last 36 months: the 6Bone is the pink or red graph there. It was incrementing slowly. It reached a peak about this point and has been decrementing since - we see this steady decline there. Whereas the RIR space is showing quite a steady increase. Over the last four months, among the notable events, we had a leakage here from AS 17382, which started announcing 19 /48s; for good measure, a couple of months later they announced another 30, just to make the table bigger. I would guess somebody's been looking at the slides, because yesterday they pulled it out again. So the 50-odd /48s disappeared from the IPv6 routing table. Apart from that, there's nothing really spectacular happening. Comparing /35s to /32s: the number of /35s, if you look at the graph, was rising quite steadily; then the new size was released, so to speak, and the number of /35s started dropping steadily. It's still hanging in there, around 40 to 45 announcements, whereas the /32 size has been showing a fairly steady increase. There are still quite a few /35s visible, and I guess the move is on to remind the remaining people to stop announcing those in favour of the /32s. Some numbers: 684 LIR blocks allocated out of the 2001::/16 address space. ARIN have 120, APNIC 169, RIPE 385, LACNIC 10, as of the end of last month. That's a bit of an increase compared with four months ago - sorry, three months ago - where it was 595. You see the steady growth. The RIPE region has got the lion's share. Some of the root servers have IPv6 addresses already; some are visible on root-servers.org, some are registered in BGP. 382 visible, 684 allocated - there's a bit of a gulf there. There are some very large allocations seen; those are listed: NL-Benelux, Vectantnet in Japan, and Ataconet in Austria. Looking at the graphics by regional Internet registry region: the RIPE region has the black graph here - you see the steady growth in the RIPE region announcements. 6Bone, you see the general decline there. The red line is APNIC - you see the jumps from the leakage of AS 17382. At the bottom, ARIN - not quite the bottom - then LACNIC right on the floor. Take the right graph and look at the specifics by country there. You can see again the general growth. I guess Germany's announcing the most prefixes. There's an AS 1654. But the most noticeable one of course is an EU-wide provider - it's my former employer. They leaked a huge amount of prefixes back at the start of the year, so you see that spike there. If we look at the APNIC region, Japan is announcing far more prefixes than anybody else - you see it way up here.
If you look down, there are some other ones. Gert said to me privately that maybe this was a bid by Korea to catch up with Japan in the announcement stakes. The remaining countries are around about 110 prefixes. If we look at the allocated space versus the routed space - to explain this graph, this is the number of prefixes that are routed or allocated, so the blue is what's routed and the light grey is what's been allocated. These are the actual /24 blocks out of the 2001::/16 space. The first two, the /23, went to APNIC, the next two to ARIN, and so forth. Again, you can see what's been allocated versus what's actually been routed - so about 50% there. Interesting observations - he calls it ghost busting. There are some interesting prefixes we see floating around. This one's probably the most interesting. A ghost is basically caused by a BGP withdrawal bug. There's still quite a lot of old - I daresay development and buggy - software that's still being used in the IPv6 network. These paths stay mostly unchanged for weeks and weeks on end. You can kind of track it through the network, and you still see routers hanging on to paths that they don't hear any more. This is the most noticeable one; it's been around for a long, long, long time. You sometimes get accidental hijacks, where people have finger trouble or maybe they don't understand hexadecimal. This was one: a /32 originated by AS 3292 was accidentally originated by AS 29657. That was fixed pretty quickly. It's caused by a static route and people redistributing that into BGP. Some more interesting observations - leaking of Martians: network 1000::/8, then some sub-prefixes of that. Gert reckons it's caused by some problem within probably an - network of some sort - and buggy software. It's been the only documented leak since way back in late 2002 - about this time in 2002. So in that sense it's pretty good going. But it does show there's potential for improving BGP filters. Fourth one - weird AS path leaks. I suppose Gert put this one in because it shows this one transited AS 5539, which is the one Gert manages, and he's quite careful with the prefixes that he permits through his network. The ghost buster flagged this one as a ghost. The problem was really caused by unlimited prefix distribution - in other words, a leaf-node AS offering a full BGP feed to both upstreams, and the upstreams accepting it. In the early days of the 6Bone, pretty much everybody peered with everybody else, and the 6Bone mesh, until quite recently, was quite a mess. There's been quite a lot of effort to try and improve some of this, so the edge of the network doesn't try and provide transit back into the core and so forth. Fifth one - invalid AS numbers. We see this in v4 land as well. AS 4555 - I don't know who owns it, but it comes from the exchange point AS block - announcing a private AS to the IPv6 Internet, seemingly transiting MIT's AS number as well. So maybe it's a problem with MIT, or maybe it's a problem with whoever's running that particular router at AS 4555. Private AS numbers should not be announced worldwide. So there's only really this one left. So, news: the 6Bone - in other words, the 3ffe::/16 address space - is going away. It finishes on 6 June 2006. Private and unallocated AS numbers seem to be out of control, but the ghost routes appear to be under control. The early IPv6 network is starting to deteriorate again - there are quite a number of unsolicited full transit links.
But people are actually looking closer at it - they're looking at traceroutes, making some effort to try and fix things. The overall structure is reckoned to be improving quite a bit, going towards production quality - in other words, the v6 path is no worse than the v4 path. If you've been following the v6 progress over the last three years or so, you'll have seen really bizarre situations where the best path from one country in Europe to another country in Europe would be via Australia or Japan or something really weird. Quite a lot of people, especially in the core of the 6Bone, have made a bit of effort to try and get rid of some of this highly suboptimal routing. Also, the US region is catching up on allocations but still lagging far behind on actually advertising routes. Where to from here? There needs to be more work on filtering recommendations, more work on routing best-current-practice recommendations, at the RIPE routing working group or whoever else would like to do the work. There's still much cleanup to be done: bad tunnels, filters, unsolicited transit relations. I'm probably as guilty as anybody else, because I do run a 6Bone Cisco node and I've got lots of tidying work to do there as well. Bug your upstream providers to offer native IPv6 upstream. Keep an eye on traceroutes to find out which ways packets are travelling and try and get rid of the stupid paths - in other words, paths crossing oceans to go across the street - and consider de-peering non-useful peers. Is it useful peering with someone on the other side of the world? Talk to your peers and help them fix stuff. Speak to the actual IP person, person to person. IPv6 routing recommendations - the MIPP project says: no peerings over bad tunnels; get rid of high RTTs and third parties; apply incoming prefix filters to peers; filter private ASes and overly long paths; don't give unrestricted IPv6 transit to peers unless asked to. I tend to find most people we set up tunnels to will only give their actual prefixes, which is much better than what was happening a few years back, where you just got the lot. Try not to take IPv6 transit from too many upstreams, and avoid taking a single upstream over an intercontinental tunnel. Those are the references. The ghost route hunter is quite an interesting web site - if you're running a v6 node, you could join the ghost route hunter if you're interested. The Merit 6Bone routing report. The list of IPv6 blocks allocated by the RIRs is listed there, and so forth. Questions should go to Gert at Gert@space.net. I will try and answer some of the questions.

GEORGE MICHAELSON: I have about six questions, so it might be better to defer to Randy to get one out of the way.

RANDY BUSH: Could you go backwards two slides, please. I work for a large Japanese IPv6 ISP, and on "no peerings over bad tunnels" - we operationally define bad tunnels as "all tunnels". If you're trying to run a real network, don't peer over tunnels.

PHILIP SMITH: Get your upstream to provide you with native IPv6.

GEORGE MICHAELSON: These are not all questions; these are different, disaggregated, unrelated observations. The 6to4 count - seeing ten to 15, and probing and discovering 42 - that's an amazingly useful information point for those of us in the midstream of looking at deployment of 6to4 reverse - hint, hint, Doug. The total amount of activity here at this time in this network is low. So if we're considering risk-reward issues here, it's ten.
However, set against that, I recently decided to go and think about Teredo a bit more, which is not 6to4 but is ubiquitous - it's an "in every SP2 release" type of technology. I think it could be very interesting if there were a way to measure Teredo distinctly, because I think it is currently lurking in 3ffe space, and so there is potentially an opportunity for us to find something out about this and get some better sense of what is going on in that space. That was an observation. The tunnelling observation - I think there is a question here for people interested in mobile IP. I had tunnel vision - I've talked about people saying we should run the entire network as an overlay, because the real net could provide addresses, there is no address exhaustion - these kinds of pint-of-beer discussions. But the observation from the real world is that when you run tunnels, tunnelled IP - like in a VPN tunnel - is significantly worse. This has got to be thought about for what people will experience when we start to do layered IP behaviours on a global scale. Geoff is shaking his head, but I think there is a real-world experience here that needs to be thought about. Before he comes and rebuts, there was another one that mattered for me. What was it - that's right. There are going to be some significantly large assignments and allocations breaking the /20 horizon by a country mile. It would be interesting if you made some measure of the disaggregation they have to do to get under the radar rather than over the radar. Because they get the /20 or whatever it is - what do they have to announce to get through people's locked-in "reject this, it's junk" prefix filters? This will be interesting. The last one - not all /35s are historical. If you have a /32 but you want to disaggregate it, you can't announce what you really want to - the /48s - because people filter. But the /35 gives you three bits of wiggle room. You have eight virtual subnets of disaggregation that anyone can do for free. I would argue that is enormously useful, and we should not seek to say you must only announce your /32. We should of course say put it in a routing registry, but that wiggle room is useful. I use it.

GEOFF HUSTON: I was going to rebut one thing, which is about tunnels, and refer back to a presentation made by Ken Duro at an IEPG meeting in March this year, where he cross-correlated the performance of v6 through tunnels against the underlying v4 and found the RTT was basically the same as v4. He was pointing out that a fair few of the tunnels he was seeing were remarkably cleanly engineered and didn't go off doing crazy routing. But I'm doing a secondhand report. I think the issue is - go look it up.

GEORGE MICHAELSON: OK. Fair comment.

PHILIP SMITH: Thank you, Geoff, for that, and George for your questions and input. If Gert is watching this - or he'll see the transcript afterwards - I think it's all really useful feedback for him. I'm sure he'll put it in his updated report, which those of you who are going to the Manchester RIPE meeting will probably see there. So, any other questions about this presentation at all, the v6 status? If not, we will move on to the final presentation - a bit of fun, hopefully. 'BGP - The Movie', directed by Geoff Huston.

GEOFF HUSTON: (Pause while presentation uploads) I should give credit where it's due: a lot of the heavy lifting is actually being done by George Michaelson, so it's work we have done together here. This is a quick status report on BGP, then I'll just get into the movie as quickly as I can.
There's the routing table - the picture since 1994, snapshots taken every hour. What you see at the very left is the class C explosion, then you see CIDR-isation; between '98 and 2000 you see the boom; between 2001 and about 2002 you see the correction with the crash. Now you're seeing growth again. This is from all of the Route Views peers, at various levels. A single view - it's fairly obvious what's going on. The last 12 months in the IPv4 routing space: something good happened in the middle of April, something bad in the middle of May. There have been some step functions as large amounts of routes appeared and disappeared from the table. The table is kicking around at 142,000 routes - someone just announced another thousand a few days ago, thank you very much. This is the address span of the entire table. Much harder to figure out what's going on. Not everyone sees the same amount of address space in BGP. Most of the network is connected, but some of the network isn't. I'm showing what's announced into Route Views. There's a lot of qualification - (Question from Randy) This is total reachability. So I included all the aggregates, the lot. So if you're low, you're missing some address space that other folk are seeing. Really strange graph, actually. Let's take a single view. There are flapping /8s. This has gone down to about 3, from 4. One of them is, I think, InterOp's old ShowNet, but I don't know what the other one is - that's the banding that's going on. The boom and bust happened a little bit later here. You saw strong linear growth to mid-2001. It has been growing again since early 2003 quite consistently. Our burn rate is still on average around 4 /8s - in other words, 60 million /32s - per year; that's the current rate of growth there. Fragmentation of the routing table - on the left is percentage. So most folks see that around 50% of their routing table is actually more specifics of covering aggregates. How we got there was, I think, a bit interesting. There's some more data pushed up here. Most of the fragmentation of the address space happened during the Internet boom, from 1998 till 2001, but it never went away. In other words, once bad things happen in the routing table, they never go away. The routing table seems to never forget and never get any better. What appeared during the boom was some sort of rather strange behaviour that sits there inside the routing table and rots. Some folk do strong prefix filtering, some do weaker, but about half of the information in your routing table is covered by other entries. (Next slide) There was strong exponential growth until around 2001. There's a variation of around 500 AS numbers - people do see, in terms of AS numbers, a different type of reachability. Lots of AS numbers, so you'd think the network would be getting longer and stringier. It's not. As it grows it gets denser, so the average AS path length, with all those ASes being added, is still much the same. So the topology of the network is one where at any particular AS diameter there are more paths in there and more ASes, which is interesting when you consider that with routing protocols. Some very naive work about aggregation - naive because it makes sweeping assumptions. We're currently carrying a little over 140,000 routes. If you preserve prepended AS paths and simply knock things together - and ignore the fact there might be - you could take it down to a little under a hundred thousand routes. That's a savage guess. It's probably not as good as that.
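A toy version of that naive aggregation pass - my sketch under similarly sweeping assumptions, not Geoff's actual method - merges sibling prefixes that share an origin AS and drops more-specifics covered by a same-origin aggregate; prefixes and origins here are invented:

    import ipaddress

    def aggregate(table):
        """table: dict of prefix string -> origin AS."""
        nets = {ipaddress.ip_network(p): o for p, o in table.items()}
        changed = True
        while changed:
            changed = False
            for net, origin in list(nets.items()):
                if net not in nets:
                    continue                      # already merged away
                parent = net.supernet()
                sibling = next(s for s in parent.subnets() if s != net)
                if nets.get(sibling) == origin:   # merge sibling pair
                    del nets[net], nets[sibling]
                    nets[parent] = origin
                    changed = True
                elif any(o == origin and other.supernet_of(net)
                         for other, o in nets.items() if other != net):
                    del nets[net]                 # covered more-specific
                    changed = True
        return {str(n): o for n, o in nets.items()}

    table = {"10.0.0.0/24": 65001, "10.0.1.0/24": 65001,  # siblings -> /23
             "10.0.2.0/23": 65001,                        # then /23s -> /22
             "192.0.2.0/25": 65002, "192.0.2.128/25": 65003}
    print(aggregate(table))
    # {'10.0.0.0/22': 65001, '192.0.2.0/25': 65002, '192.0.2.128/25': 65003}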
But it does indicate that a lot of the fragmentation that happens in the network is perhaps not due to traffic engineering but due to some other artefact, and some is due to traffic engineering. If you do stricter and stricter forms of aggregation - so the line below that is stripping out the prepending; then I take out the AS path altogether and go, if the origin's the same, knock it out; then I look at chequerboarding. There's a savage practice out there of taking an allocation and announcing a subcollection of specifics that are not actually connected. So if I actually aggregate across holes, across RIR allocations, you get better theoretical aggregation. Then I really push hard in terms of prefix length filtering. If you try to squash things down, the actual information content in the routing table is around 43,000 routes without traffic shaping - in other words, just straight reachability. If you include traffic shaping - the sort of "my traffic goes down this path, not that path" - the information content appears to be around 100,000 entries.

GEORGE MICHAELSON: Geoff, the slopes of those lower lines are relatively consistent, but all are lower than the top one. It's not just that there's less of them; their growth rates appear to be consistently lower.

GEOFF HUSTON: In other words, the amount of absolute fragmentation is growing.

GEORGE MICHAELSON: But slowly.

GEOFF HUSTON: But slowly. From Gert Doering you saw the IPv6 routing table. Here is the same as Gert saw: 17,000 autonomous systems in IPv4, rather fewer in IPv6 - the latest census I see is around 160. The aggregation potential: v6 has strong aggregation there, and currently the community's small enough to be strong about aggregation. There's actually very little fragmentation happening there, although on a relative scale it's probably about the same. Now I'll get to the movies, 'cause I know time is short. We've tried to figure out where we've got to, and why, and how. So this is the first of these slides. There is a consistent colour pattern. This is a map of IPv4, in /8s - each column is a single bar of 16.7 million addresses. Five colours. If IANA still holds that space and it is usable at some time in the future in the routing table - so it's unicast space - it's coloured yellow. From 85 through to 127, the top of the old B space, and I believe there's 223/8 kicking over there. If the RIRs have been allocated the block and it's still sitting in the RIR pool, it's coloured red. So this is space that's about to be used by the RIRs. Old space that got returned to the RIRs - when I say "got returned", there's no delegation record in the RIRs - so in some of this old class B space you'll see the red as well. If the IETF has reserved space due to a standards action - the top space, class D and E; network 10, network 14, network 0, network 128 - you'll see that in blue. And probably, if you look really, really hard - although I think there's a bug in the program - there should be some blue around there in the old class B space for RFC 1918.

GEORGE MICHAELSON: It shows up in the movie later on.

GEOFF HUSTON: The rest of the space, green and light green, is space that's been deployed, assigned. It's out there. But not all of it's routed. Looking at the routing information, the dark green is space that I see in the routing system itself. The light green is space that's out there but is not being routed - it's legitimately been handed out somewhere, but I can't see it in the routing table.
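Here is a small sketch of that colouring rule - mine, reduced to /8 granularity with invented sample data, not the actual movie code:

    def colour(status, routed):
        """status: who holds the /8; routed: any of it seen in BGP."""
        if status == "iana":
            return "yellow"        # still in the IANA pool
        if status == "rir-pool":
            return "red"           # with an RIR, not yet delegated onward
        if status == "ietf":
            return "blue"          # reserved by standards action
        return "dark green" if routed else "light green"  # delegated space

    SAMPLE = {                     # /8 -> (status, seen in Route Views?)
        10: ("ietf", False),       # RFC 1918
        18: ("delegated", True),   # an old class A that is routed
        25: ("delegated", False),  # delegated long ago, never routed
        58: ("rir-pool", False),
        97: ("iana", False),
    }
    for block, (status, routed) in sorted(SAMPLE.items()):
        print(f"{block}/8: {colour(status, routed)}")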
Here are some of the old historic /8s that got allocated directly by IANA - a whole lot of them are light green. Here's the old class B space - a whole lot that's light green. Even the older C space has a fair deal of light green. The more recently managed space - the space that's actively managed rather than dormantly ignored - is quite efficiently used.

Here's a similar exercise for the autonomous system numbers. We're over halfway through the current 16-bit space of around 65,000. That's your private AS number block. Red is what the RIRs have, including stuff that got reclaimed. In the early part, most of this space is allocated. That's what I see in the routing table; that's what I see unrouted. ASes seem to behave a bit like radioactive decay - half-lives - that is, over a period of time half of them disappear, and that seems to be continuous. (There's a toy sketch of that decay below.)

Here's the IPv6 table. If it really was the whole IPv6 table this wouldn't be here - there'd be a very skinny stripe here and one there. So I've taken the only two bits of active space I can see, which are the /16 block 2001 - that's the left-hand side - and the /16 block 3FFE, which is the right-hand side. What you see with 3FFE is that it's all been allocated out from IANA. 6Bone's got it, but only this much has been handed out and is routed. I can't find a 6Bone registry to colour that any other way, so I've just assumed the whole lot is out there somewhere. Because their routing unit is basically a /24, it's either all or half - 3FFE was dealing in really large blocks. Here's the original set of /23s, right. Here are the original /35 allocations out there - they really were tiny, but a fair deal of it was routed, so in that space there the actual light green is quite small. These are the more recent allocations you've just heard about, coming in from a number of the larger providers on the back of their v4 legacy. So that space there, for example, has hit the delegation files. I believe that one there has come out from IANA and will be delegated at some point soon, so I expect that to change colour. There's another one where, again, the allocations are just happening.

So the movie is trying to say: how did we get to that state? Because that state is about a day or so old. What we've done is comb back through all the data we can find and push it all together. Unfortunately we don't have routing data that goes back that far - the routing data only goes back to 1997. But the delegation data you can clean up. The IANA files are a bit crappy, but if you take them together with the ARIN data, which is really quite accurate, and rewrite some of the crappy IANA data, you can go back to '83 and start developing a reasonable picture. This has now got an image for every day; it's just running through. Let me explain a bit - we're trying to run a tachometer on this to see how fast things are happening: AS numbers at the bottom, IP numbers at the top. When you see an allocation you'll see one of these numbers start to flick. That's pre-BGP, so that is nonsense. It'll take a while. At the moment - 1984, 1986 - /8s are being allocated. Around 1987 the NSF project started up, in the B space rather than the As. A whole lot of universities, firstly in North America and then elsewhere, started heading into the B space. Then RIPE wanted space to open up into Europe, and you started to see the B space really move. Although there's activity in the C space too, /24s are so tiny you can hardly see them moving. This is really when we started to worry about the class B space.
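A toy model of that "radioactive decay" remark, just to show the arithmetic; the five-year half-life and the starting population are made-up numbers, not measurements from the talk.

```python
# Constant half-life decay: of the ASes present at time zero, half are
# gone after one half-life, three quarters after two, and so on.
half_life_years = 5.0      # assumed, for illustration only
initial_ases = 17_000      # roughly the 2004 IPv4 AS count

def surviving(years):
    return initial_ases * 0.5 ** (years / half_life_years)

for y in (0, 5, 10, 15):
    print(f"after {y:2d} years: ~{surviving(y):,.0f} of the original ASes remain")
```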
GEOFF HUSTON: If you notice the amount of space coming out of the class B space at the time, it was enormous - the delegations were really quite solid. There was also some A space moving at around that time as well. I think the Japanese got net 45 around then; you should see it coming in just around there. The B space is moving very, very hard. Now we're starting to do a little bit more BGP. At this point the network was still hierarchical, coming out of the US, and there wasn't an awful lot of AS growth - you're seeing slow growth. 1993 was when we really started to recognise there was a problem here with class Bs and started to think: what other ways of doing this are there? At this point, early '94, there's very strong pressure to get out of the Bs and start using the C space in a better way. You find now the C space is moving quite quickly, because we're now giving out CIDR blocks, and the B space is relatively static. We're about to get to the point where routing takes over. CIDR is moving on pretty hard; we're doing it in routing. Now the AS numbers are showing very strong growth. What's happening now is that, rather than the network getting massively bigger in terms of its address space, it's getting massively bigger in terms of its routing capacity and diversity. Routing has come in now. The B space has only been half used - what got allocated years ago was not getting routed, and what got allocated in the A space is really not getting routed. When you don't look after address space, it goes to the dogs. That's what we're seeing here. Over there the RIR system is allocating at the edge; the RIR system is allocating here. Here's the AS space - quite a lot happening here, huge amounts. The network is getting more complex faster than it's getting bigger in address space, and there's the half-life of decay - when ASes get old, they disappear. The boundary is moving so fast it's amazing. 2003. This is the recovery phase. The As are opening up in the IP space, but the AS numbers are growing really quickly. Occasionally Route Views misses it. There you go - that was quick. Hopefully quick enough. Any questions? I don't know where we got these credits from, but they're pretty amazing. There is a movie on IPv6. Nothing much happens - the allocations are a couple of large blocks and that's about it. It's probably not worth playing in this forum, but there is a comparable movie with the data that we have, yeah.

TIMOTHY GRIFFIN: Looks like we need larger AS numbers.

GEOFF HUSTON: Yeah, that picture of the AS numbers is pretty bad. I did some analysis about two years ago thinking we had three or four years before we had to worry. The AS number space is still pretty solid. Four-byte ASes are the way to go (there's a sketch of the 32-bit arithmetic below). If it's going to take some time for vendors to implement, then we've got to do something about recovery. If vendors come in with kit quickly over the next two years and give us some time to test and deploy, we don't have to worry - the recovery stuff is irrelevant if you go to 32 bits. This is the fastest growth point. In the IPv4 space the growth rates aren't as bad. As I've shown in other work, the growth into these areas here is quite steady - you could say it's NATs, a whole bunch of things - but we've got well over one decade, two, possibly three, growing into these spaces. That's not the same picture with AS numbers; that's much faster. And if you look at the recovery opportunities, that's just amazing. There's so much space sitting there in unmanaged allocated space.
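For the 32-bit arithmetic mentioned above, here's a small sketch using the "asdot" convention (high 16 bits, a dot, the low 16 bits) that was later standardised for 4-byte AS numbers; the example AS values are arbitrary.

```python
# 16-bit versus 32-bit AS number space, plus asdot rendering.
OLD_SPACE = 2 ** 16    # 65,536 values
NEW_SPACE = 2 ** 32    # ~4.3 billion values

def to_asdot(asn):
    """Render a 32-bit AS number as asdot; plain integer if it fits in 16 bits."""
    high, low = divmod(asn, 65536)
    return str(asn) if high == 0 else f"{high}.{low}"

def from_asdot(s):
    """Parse asdot or asplain notation back to an integer."""
    if "." in s:
        high, low = (int(x) for x in s.split("."))
        return high * 65536 + low
    return int(s)

print(f"16-bit space: {OLD_SPACE:,}; 32-bit space: {NEW_SPACE:,}")
print(to_asdot(65546))      # -> 1.10
print(from_asdot("1.10"))   # -> 65546
```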
GEOFF HUSTON: If we used that unmanaged space as well, I think you'd buy about another ten years out of this space - if you actively managed the As rather than shutting your eyes and hoping they went away. Any other questions, comments? Thank you.

PHILIP SMITH: Thanks very much, Geoff.

(APPLAUSE)

That's all we have for the routing SIG for this APNIC meeting. I'd like to thank all the speakers - Tim, Randy, Geoff and, in his absence of course, Gert Doering. Thank you for coming. May I remind you of the mailing list - you can go to the web page there on the screen. If you're not a member, please join. If you are a member, please contribute. If you would like to contribute a presentation to the next meeting of the routing SIG, that will be during APRICOT in Kyoto, Japan, at the end of February 2005. If it's not in your diary, please put it in. If you haven't made travel plans, please think about doing that also. See you all in about six months. The BGP signing BOF is in about 15 minutes. Thank you for coming.

(End of online sessions for Thursday)

Time: 5.45pm