______________________________________________________________________

DRAFT TRANSCRIPT

Routing SIG

Wednesday 23 February 2005

2.00pm

______________________________________________________________________

PHILIP SMITH

We probably should make a start. I would like to welcome you all to APRICOT 2005. This is APNIC 19, the Routing SIG session. Just some administration issues before we start - the chairs of this special interest group are myself, Philip Smith, and Randy Bush. You can reach us if you have any need to send us email (refers to web addresses on a slide). Hopefully we are useful and contributing appropriately to discussions about routing in the Asia Pacific region.

Before I dive into the agenda, I just want to go through some general administration. I want to remind you all that there is an onsite notice board. Please check that. There is a jabber chat room which you can participate in. There's also live text streaming. I'd like to remind both speakers and everyone else who wishes to participate in the session to speak nice and slowly and clearly so that our two helpers here can understand what we're saying and convert the words into writing on the screen for you. I'd also ask anybody who wants to ask a question or make a comment to please not shout out from the room, but come up to the microphone, state your name and then ask your question. There will also be a roving microphone. I think I'm using the wrong microphone at the moment! (Laughs) But I'll hand this over - John will be going around with the mike.

OK. So to the agenda for today. The session is in two halves. We've got quite a large number of presentations, so APNIC kindly agreed to give us two 90-minute sessions, which we're appreciative of. In the first session we'll be starting off with a presentation from Brian Skeen, then we'll get stuck into BGP security, which is one of the hot topics of the Internet at the moment. We'll have two speakers, Russ Housley and Steve Kent, talking about that.
Then in the second session after the break, more reports into the routing system. We'll discuss those more after the break but there is a sample of what's coming in the second session this afternoon (refers to slide), so hopefully you'll stay for the entire afternoon and help us out with this routing special interest group.

OK. So getting back to this session. For the first presentation I'd like to invite Brian to come up and tell us about global IP network mobility using BGP.

BRIAN SKEEN

Everybody hearing me OK? (Pause) OK. Good afternoon. My name's Brian Skeen. I work in the network engineering department for Connexion by Boeing. It is a business unit of the Boeing Company. We provide broadband data services to numerous commercial, government and private customers. I'm going to talk today about how we leverage BGP to provide a solution for global network mobility, and talk a little bit about some of the challenges and some of the considerations that we looked at in doing so.

A quick outline - I'll give you some background on who Connexion is and what we do. I'm going to touch on the current mobility standards and how they may or may not meet the needs that we have from a network mobility standpoint. I'll talk about some of the unique challenges and service considerations we have: moving at a high rate of speed, 506 miles per hour, covering a large geographic region on a daily basis. So some inherent challenges are there. And why BGP? What does it offer, how do we leverage it in our solution, and of course questions.

Connexion, as I mentioned, is a broadband data services provider. At the bottom there you can see a sampling of our current customer base. Service launched in May of this last year aboard a Lufthansa flight. They were our launch customer. We have been expanding routes to some of those other customers and others not listed there. That's ongoing on a daily basis. Service is available on both Boeing and Airbus aircraft, contrary to some belief.
But it's dependent on the airline and what aircraft they want to use to cover certain flight routes. It's pretty evenly split.

What do we provide? Internet access. VPN support. It's like an 802.11 wireless hotspot you find in a book store or airport or something of that nature. Later this year we'll be introducing television aboard some Singapore Airlines flights, and that will expand to other providers. To the airlines, you can read what we have there, but what we needed was a system that was robust and reliable. Aircraft systems have to have a lifespan of about ten years. Taking an aircraft out of service for maintenance or for a retrofit is cost prohibitive from an airline perspective. Primarily we use existing 802.11 wireless technology on board. The reasons for that are probably fairly obvious: reduced weight, reduced power, and again you don't have to take the plane out of service for a retrofit.

Quick picture of what this looks like. You have an antenna profile mounted on the top. We use existing geosynchronous satellites; we lease numerous transponders from different providers around the world. There is a data transceiver router function on board that provides both the on and off routing, the interconnection with the satellite subsystems and also the interface to the wireless access points. There can be up to 7 of them, wireless access points that is, depending on the configuration and the size of the plane, those sorts of things.

Quick look at the system architecture. There is a core network unit which provides the cabin distribution system interface. It's the physical connectivity for the wireless access points, and it provides DNS, DHCP, things of that sort. On the ground we currently have four operational ground stations located throughout the world, and a fifth is being brought online. That will be done in April. Each of those ties back to a main data centre on the west coast. They each have a connection to the Internet. I'll talk about that more later.
To give you an idea of our current service region: this is constantly expanding to cover customer flight routes. This is showing the existing ground stations, one here in Japan north of us, one in Moscow, Russia, another in Leuk, Switzerland, and one in the US just outside Denver. The fifth one being brought online is on Vancouver Island in British Columbia. That may be at the end of April. One thing to note - all of these ground stations are maintained as separate BGP autonomous systems within our network.

Just a little bit about the current mobility standards and how they may or may not be working for us. The current open standard for IP mobility targets host mobility, relies on some level of mobility support within the protocol stack and also relies heavily on tunnelling back to the home network. In a small geographic area, where you have a more confined space and you don't have a platform that's moving outside that region frequently or potentially spending 50% of its time out of that region, that traditional model may work well. Our network is highly mobile. It moves over a large geographic region throughout the course of the day and back again, and it spends about 50% of the time outside of its home region. So you're looking at a potential backhaul. Some of the network mobility work, the recent NEMO basic support protocol, starts to address some of our concerns, although it deals with IPv6 primarily at this time, and again we're talking here about IPv4, and it still relies heavily on IP tunnelling.

What were some of our challenges? I've mentioned the platform-specific challenges and the traditional network mobility model. You could have up to a few hundred within a network depending on the size and configuration of the aircraft. To give you an idea of the scope - a typical flight between Europe and Asia could potentially touch about three different ground stations and up to four transponders.
More likely it's probably two ground stations and two to three transponders, but there are some routes that do fall into that category. That leads to what mobility tries to address in the first place, which is a similar user experience - we have frequent changes in point of attachment to the Internet. A transparent experience and a seamless situation for the users is desired.

If you're looking at the standard mobility implementation, again, I mentioned the fact that you have increased latency and the capacity required of the land links. This gives you a high level idea of what we're talking about there. In this case, where you have a statically homed plane, say in Europe, but it's operating out of this region currently and it wants to reach a web site in this region, you can get an idea of the latency that would be involved there. Those are some references based on some testing we've done. You can't get away from the satellite delay. That's roughly 550 milliseconds, something of that order. You start to get into some back and forth across the ocean situations that drive up the latency time. So you can approach 3 seconds in some cases.

What we set out to do was to find a better way to do that. We tried to leverage some existing technologies, what we had available to us, to find a better path, reduce that latency, reduce the requirements on our land links and hopefully improve the overall reliability. If we reduce latency and the time it takes to move back and forth across the ocean, the network should be more reliable and be a better user experience. We also wanted to leverage existing technologies. We didn't want to have special requirements that we had to take to our providers. We didn't want to have any special requirements or situations that they would have to deal with. That basically led us to a conclusion that we need to follow the geography of the plane. And how do we do that? Leverage BGP.
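(Illustration, not part of the spoken record.) The satellite delay referred to above can be estimated from geometry alone. The sketch below is a best-case calculation under stated assumptions: the station is taken to be directly beneath a geostationary satellite, processing and queuing delay are ignored, and the function names are hypothetical. Real slant ranges and equipment delay push the figure higher.

```python
# Best-case propagation delay over a geostationary satellite hop.
GEO_ALTITUDE_KM = 35_786            # altitude of geostationary orbit above the equator
SPEED_OF_LIGHT_KM_S = 299_792.458   # speed of light in vacuum


def propagation_delay_ms(legs: int) -> float:
    """Delay for `legs` ground<->satellite legs, station directly below the satellite."""
    return legs * GEO_ALTITUDE_KM / SPEED_OF_LIGHT_KM_S * 1000


def one_way_ms() -> float:
    # aircraft -> satellite -> ground station: two legs
    return propagation_delay_ms(2)


def round_trip_ms() -> float:
    # request up and down, reply up and down: four legs
    return propagation_delay_ms(4)


if __name__ == "__main__":
    print(f"one way: {one_way_ms():.0f} ms, round trip: {round_trip_ms():.0f} ms")
```

Any terrestrial backhaul across an ocean is added on top of this floor, which is why the choice of ground station matters for the end-to-end figure.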
At this point we leveraged the fact that it's supported natively, everywhere on the Internet, that we have the ability to control the routes that we use within the Internet as a whole and not just within our network, and that it allows us to selectively advertise those aircraft routes as they move.

Going back to the other example of where that leaves us. Now we're talking about a situation where a plane is dynamically homed to a particular gateway when it's in that region. You can see the latency savings in doing that. You don't get away from the satellite latency, but if you try and get to a web site in Asia, then you're directly there. You can see the change there.

Just a quick diagram of how this works. This is a representation of a few of the gateways; all have ties to the Internet as well as back to primary data centres in the US. Essentially what happens is, as the plane moves into certain regions, we will make a selective route advertisement from that ground station and all passenger traffic will be sent directly to the Internet from that location, to and from. As the plane transitions and moves, let's say to the European region, we do a selective withdrawal of that route and will readvertise it out of the European region. Traffic then goes in and out through the providers in that region. Each ground station is only advertising the routes for the aircraft that it's currently serving.

Just a quick network diagram to show you what the ground station network looks like. It's only necessary to retain dual ISPs. We have a set of routers here that are primarily responsible for the dynamic injection and withdrawal. Those route servers have a back-end tie to some of our satellite subsystems and allow us to very dynamically, and soon to be fully dynamically, move those routes around.

Just to talk a little bit about some of the challenges. I think we're probably all aware of these. The first is /24 network propagation. We do use a /24 public address block on each aircraft today.
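(Illustration, not part of the spoken record.) The selective withdrawal and readvertisement described above can be sketched as a small state change on per-ground-station route servers. This is a toy model, not the Connexion implementation: the `RouteServer` class, the `handoff` function, the AS numbers (taken from the documentation range) and the 192.0.2.0/24 prefix are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class RouteServer:
    """Hypothetical route server at one ground station (one AS per station)."""
    station: str
    asn: int
    announced: set = field(default_factory=set)

    def announce(self, prefix: str) -> tuple:
        self.announced.add(prefix)
        return ("announce", self.station, self.asn, prefix)

    def withdraw(self, prefix: str) -> tuple:
        self.announced.discard(prefix)
        return ("withdraw", self.station, self.asn, prefix)


def handoff(prefix: str, old, new: RouteServer) -> list:
    """Move an aircraft prefix to a new ground station.

    Returns the BGP actions in order: withdraw at the old station
    (if it was announcing the prefix), then announce at the new one.
    """
    actions = []
    if old is not None and prefix in old.announced:
        actions.append(old.withdraw(prefix))
    actions.append(new.announce(prefix))
    return actions


if __name__ == "__main__":
    japan = RouteServer("Japan", 64496)
    moscow = RouteServer("Moscow", 64497)
    print(handoff("192.0.2.0/24", None, japan))    # aircraft enters service
    print(handoff("192.0.2.0/24", japan, moscow))  # satellite handoff trigger
```

In this model the trigger for `handoff` would come from the satellite subsystem, so each station ends up announcing only the prefixes of aircraft it is serving.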
There are concerns within the Internet about the growing number of routes in the default-free zone, and obviously this will be part of it. We're aware of that. We've investigated this as we went through how we would leverage BGP and what solution we'd take. Basically we found, in discussing this with our providers and in our testing, that these routes are being allowed, they're not being filtered or aggregated for the most part, and we haven't seen any operational issues at this point from using this /24. Filtering is possible of course, and we realise that. In the event that a provider somewhere in a deep corner of the Internet did filter or aggregate them, we also advertise an aggregate block for all our public address ranges, and we advertise that from a central location on the west coast. So it would provide a path back to our network, backhauled, if we needed that.

A couple of other challenges - a common question we get is BGP convergence versus the satellite handoff. Are they complementary or do you have one longer than the other? The short answer is yes, they are complementary. You're looking at about a minute's worth of time for a satellite handoff to occur. In some cases that's just a little bit less. Our testing has shown that the BGP propagation and convergence does occur within that time. We have not really had operational issues as far as the satellite coming up in two-way communications and not having the BGP routes converged and ready to route properly.

Prefix churn is another one. You're changing these routes on a daily basis, but in normal circumstances you're really only talking about a route change for a particular flight once, say, every 12 hours. You're talking about transoceanic flights and, based on the satellite coverage map, probably somewhere in the order of 10 to 12 hours, so you're really not approaching that threshold.
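(Illustration, not part of the spoken record.) The aggregate safety net described above works because routers prefer the most specific matching prefix. The sketch below shows that longest-prefix match using Python's standard `ipaddress` module. The prefixes (private address space) and next-hop labels are hypothetical stand-ins, and `best_route` is an illustrative helper, not any router's API.

```python
import ipaddress


def best_route(rib: dict, destination: str):
    """Longest-prefix match: return (network, next_hop) for the most
    specific route covering `destination`, or None if nothing covers it."""
    addr = ipaddress.ip_address(destination)
    matches = [(net, hop) for net, hop in rib.items() if addr in net]
    if not matches:
        return None
    return max(matches, key=lambda m: m[0].prefixlen)


# Hypothetical address plan: one aggregate, one per-aircraft /24 inside it.
AGGREGATE = ipaddress.ip_network("10.20.0.0/16")
AIRCRAFT = ipaddress.ip_network("10.20.30.0/24")

# A network that accepted the /24 sends traffic to the serving ground station.
full_view = {AGGREGATE: "west-coast data centre", AIRCRAFT: "Moscow ground station"}
# A network that filtered the /24 still reaches the aircraft via the aggregate.
filtered_view = {AGGREGATE: "west-coast data centre"}

if __name__ == "__main__":
    print(best_route(full_view, "10.20.30.7"))
    print(best_route(filtered_view, "10.20.30.7"))
```

The second case is the backhaul path: reachable, but with the extra latency of going through the central location.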
As a percentage of the total churn, we kind of figured that our part in that was probably somewhere around 1/10th of 1%. Don't quote me on that - but fairly small. Prefixes could have an inconsistent origin based on where they are advertised, and it changes. No operational issues with that either. We haven't seen that in any case.

A quick idea of what this looks like. This was taken from a Lufthansa flight in November from Tokyo to a point in Europe. We used some of the BGP data modelling tools, BGPlay in this instance; we'll hear about that later. It's provided by the Routeviews project and we use it to observe the behaviour during the satellite handoffs. It's basically a collection of routers that collect realtime BGP session data and show you an animated view of how your prefix is treated. Here is a screen shot of it. It just shows AS numbers; there are a few of them there. It shows you that currently the prefix here, 216.65, is one of our ranges, one of the commercial aircraft. It is being advertised out of the Ibaraki ground station. This happens from the prefix situation. Here you are observing the fact that it's been withdrawn by the route servers in Ibaraki and is being relearned via the Moscow ground station. The small lines, thin lines, imply an implicit route change; the wider lines that flash up are more of a route withdrawal, more of an explicit change. Each of the end points here on the map shows a different AS number and its view of this particular prefix from where it stands. So that's just a redraw of the map there to show you how this particular prefix is being worked.

SPEAKER FROM THE FLOOR

How long does that take?

BRIAN SKEEN

40 seconds. Then you would see it's converged; it's going through our two ISP providers in Moscow. I have another one that would show the Moscow to Leuk handoff. Again very similar. We'll just show it converges through the two ISP providers there. Another common question that goes back to the churn.
Prefix dampening, again, we have not observed in normal operational circumstances. We did some pretty substantial testing on it as well as talking to the providers. Part of that testing showed that after about 5 changes in a short amount of time you will potentially see some dampening. We have the safety net of the aggregate being advertised, which allows a backhaul through our network. A handoff within a ground station would not propagate a new advertisement to the Internet. That's mostly internal to the ground station.

A couple of things we're looking at: dynamic prefix advertisement. Today it's a semi-static assignment to the aircraft. Shortly it will be more dynamic. Regionalisation of the space, if we have a particular aircraft or a group of aircraft operating primarily in a region. IPv6: we are working on that internally with our customers, realising the benefits that provides. Connexion does have IPv6 space so that's something we're working towards.

In conclusion - a couple of points there on how we're using BGP and where it's working for us currently. It's only suitable for /24 and larger networks so far. Any questions?

PHILIP SMITH

If you have questions can you please use the mike and state your name.

RANDY BUSH

Randy Bush. I have two questions regarding the implications of the fact that you're announcing a prefix from one AS then moving to another. 1. What does that mean in terms of policy as we know it today? And 2. How are you going to handle that when something like sBGP is...

BRIAN SKEEN

In terms of policy today, we did consider that, and looked at what it would mean to have an advertisement from two different spots at once. We talked to the providers and tried to understand from them what that would mean from an operational perspective. Having not really seen any operational issues behind that, we haven't delved into, I guess, how we would handle that.
Nor what changes we would need to make to how we treat those routes, given that we haven't seen any operational issues. The sBGP issue we are working on now. It's kind of under development so I don't have any specific answers for how exactly we're going to deal with that.

SPEAKER FROM THE FLOOR

Steve Kent, BBN. There is a mechanism in sBGP for the holder of an address prefix to authorise an AS to advertise it as the originator, and you just have to have those address blocks authorised for each of the ground stations, then change them over time. Because you know what they are, I don't see a problem at all.

BRIAN SKEEN

Thank you.

GEORGE MICHAELSON

George Michaelson from APNIC. I'm interested in whether you had FCC and FAA compliance issues in terms of process signoff, and whether one possible direction is that there will be flight process centric activity, or whether this is seen by people like the FAA as purely an information service, not something that could become critical for routing or flight management or an aspect of the plane rather than entertainment? That's the first question.

BRIAN SKEEN

I would say a little of both. We had input on both aspects. Right now it's viewed as an informational service. It's kind of an extension to an in-flight entertainment system. We are looking at some e-enabled initiatives that provide some functionality or services to the airlines themselves and to the aircraft, and those deal with real-time data, both from a security standpoint and just a flight ops standpoint, airport information, those sorts of things. When you get into that, yes, you start getting into FAA and FCC issues and conversations.

GEORGE MICHAELSON

We shouldn't expect to see aerofoils wiggling as a result of this?

BRIAN SKEEN

The joke is you move the mouse and the plane banks right or left - nothing like that.

GEORGE MICHAELSON

This is in some ways an observation on something Randy's commented on.
Randy at different times has talked about beaconing and how single announce/withdraw events can demonstrate this amazing cascade of activity in the global network, in the global default-free zone. I think your diagram showed that. You had something that was a handoff event, that in one view is internal, and it had this amazing explosion of transactions in the global DFZ. Your decision to avoid the dog-leg of a single attachment to the network appears to have a consequence for global visibility of route changes that do seem to be about your IGP. The observation is that the dynamism here is in the global net. Is that really what you want?

BRIAN SKEEN

That's an interesting question. I see where you're going with it.

GEORGE MICHAELSON

Maybe one for beer!

BRIAN SKEEN

It might be one for off-line. Interesting comments. There are a number of reasons outside of just the technical ones for why we do this - of course there are business reasons and all the cost reasons that go along with that. So there were considerations taken of how we do it and what we really want to do. I'd be happy to talk to you offline about this.

OSAMA DOSARY

My question is about what method you are using to perform the BGP session handoff, or multiple BGP sessions handoff?

BRIAN SKEEN

We use the route server capability. That's a combination of some open source with proprietary customisation code that we use. It has a back-end tie to the satellite subsystem and allows a trigger, in other words, to signal that change, that handoff.

OSAMA DOSARY

Is it triggered by BGP itself or do you have like an open session that's idle until you move it -

BRIAN SKEEN

It's driven by BGP.

OSAMA DOSARY

I'm guessing there's a router on the plane and this has one session or multiple sessions that are idle until it moves into another footprint?

BRIAN SKEEN

BGP does not extend over to the plane. It's all done from the ground; you have ground-based satellite components that are communicating.
They're in control of what's happening with the aircraft and they're the ones sending the trigger to the ground-based devices to signal this change.

OSAMA DOSARY

Thank you.

PHILIP SMITH

OK. I think in the interests of time we should probably move on to the next presentation. So thanks very much for that, Brian. It was a fascinating presentation. Thanks for the questions as well. Our next speaker is Russ Housley; he'll be talking about BGP security.

RUSS HOUSLEY

Good afternoon. I'm Russ Housley. I have my own consulting company called Vigil Security. I was asked to come here and talk about BGP security. I understand I'll be followed by Steve Kent, who will talk about Secure BGP. I want to provide a little motivation in the introduction. I want to follow this with what I believe is necessary to have a solution to BGP security. Finally, a quick summary of what has gone on in the IETF as steps towards BGP security. As you'll see, my view of that is - not enough.

BGP, as everyone involved in the routing system knows, is a critical component of the routing infrastructure for the Internet. It's the basis for all inter-ISP routing. Sadly, we all know that it's highly vulnerable to human configuration error. In addition to being fragile in the sense that humans make mistakes, those same vulnerabilities can be exploited by attackers. We see commonplace configuration errors. I think the one that I remember the most happened shortly before I became Security Area Director. My predecessor told the story of a new ISP coming online in Florida. They finally got their first trunk to the Internet backbone and their brand new router. They configured it, set it up, they were able to send pings, so they thought they'd done a great job. So they set off to go have a beer and celebrate. The problem was that they had misconfigured their router and were advertising that they were the route for MIT's network, so all of the traffic that was supposed to go to MIT was suddenly going to Florida.
They didn't do anything malicious, but they were sucking all of MIT's traffic to Florida and preventing the people at MIT from communicating. This is the kind of thing that I would expect BGP security to prevent. We've also seen BGP purposefully and maliciously attacked, and I think we're going to see more and more of the same. Spammers, for example, take advantage of BGP vulnerabilities every day.

I'd like to see a comprehensive solution to BGP security. A solution is not something that can be achieved simply. It requires buy-in from many, many parties involved in routing: the vendors, ISPs, subscribers. That's one of the aspects that makes this a particularly thorny problem - we need to have all of these parties involved to come up with a solution that everybody can accept and employ. Which is not going to happen quickly, I would argue. Not only will the development of the solution take time, I believe the deployment of the solution will take time as well.

BGP is used by people for several different things. My view is that when focusing on what part of the problem we are trying to solve, we need to focus on the things that affect your neighbours. The internal uses of BGP, one presumes, are a local business matter, but the ones that affect your neighbours, regardless of your architecture, are the ones that I am mostly concerned about. A misconfiguration error within one autonomous system that affects all of its neighbours is the kind of thing we need to worry about. A misconfiguration that only affects yourself, well, that's your problem and not one that I, at least in the beginning, want to attempt to address.

This is a simplified view of what an UPDATE message of BGP is about. One part is withdrawing routes, which we just heard about: the ground station the aircraft was leaving issues a withdrawal. The other is to advertise prefixes and the paths associated with those prefixes.
These two can be piggy-backed together into one message, as long as the routes that are being withdrawn are not the same as the ones that are being readvertised or changed.

If you look at a simplified view of the processing associated with these UPDATE messages, the first thing that happens is that the adjacency routing information base is updated. Anyone who runs a network knows there is filtering applied at this stage, which is necessary in the way networks are run today but is not part of the BGP specification itself. Then the information from those adjacency RIBs is put together by the routing algorithm, so at this point it is determined whether a particular UPDATE is going to affect the local routing information base or not. Many changes that neighbours make will not affect which paths are going to be used, in which case the message doesn't affect the behaviour of transit traffic. Then you apply to that the local rules that deal with the way the business is being run. Then, if the UPDATE is going to affect transit traffic, the local RIB is updated. Then finally you make a decision as to whether you want to share that information with neighbours or not. That is basically a local decision, and there are two kinds of sharing that can be dealt with. One is you can share and say, this is only for you and not to be passed on to others. So that is the no-export version. The other is, without the no-export, you send it to your neighbours and from there things are propagated on, if they in turn choose to do so.

My understanding of the specifications is that each AS along the path is assumed to have been authorised by the AS that precedes it in the path to advertise those prefixes.
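(Illustration, not part of the spoken record.) The processing steps described above can be sketched as a short pipeline. This is a deliberately simplified model under stated assumptions: all names are hypothetical, shortest AS path stands in for the local business policy, and `no_export` is modelled as a plain flag meaning "do not pass this on".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    prefix: str
    as_path: tuple          # nearest AS first, origin AS last
    no_export: bool = False


def process_update(route, neighbour, adj_rib_in, loc_rib, import_filter):
    """Apply one advertisement; return True if the local RIB changed,
    i.e. if the UPDATE may affect transit traffic."""
    # Operator filtering: common practice, not part of the BGP specification.
    if not import_filter(route):
        return False
    # Record what this neighbour told us in its adjacency RIB.
    adj_rib_in.setdefault(neighbour, {})[route.prefix] = route
    # Decision step across all neighbours (shortest path as a stand-in policy).
    candidates = [rib[route.prefix] for rib in adj_rib_in.values()
                  if route.prefix in rib]
    best = min(candidates, key=lambda r: len(r.as_path))
    # Many updates do not change the chosen path at all.
    if loc_rib.get(route.prefix) == best:
        return False
    loc_rib[route.prefix] = best
    return True


def routes_to_export(loc_rib, local_as):
    """Routes we may share with neighbours: skip no-export, prepend our AS."""
    return [Route(r.prefix, (local_as,) + r.as_path)
            for r in loc_rib.values() if not r.no_export]
```

A usage sketch: feed two advertisements for the same prefix from neighbours "A" and "B" into `process_update`, and only the one that changes the best path returns True. Withdrawals are left out to keep the example short.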
So, that means that if you're going to see this long path of AS numbers, each one in there is willing to share that information at least with the recipient of that UPDATE message, and no further if no-export is set. There is an assumption that the first AS number in that path is authorised to advertise the prefixes by the holder of those prefixes. A route may be withdrawn only by the neighbour AS that advertised it. If any of these assumptions are violated then BGP becomes even more fragile and vulnerable to even more forms of attack, so you need to examine these assumptions, and if they don't match the way people are using the system today then we need to figure out what we're going to do. But this is my analysis of the situation today.

The notion of a best route is primarily a business decision. It represents the decisions of ISPs as to who is going to be the one to hand over which traffic, and we're seeing more and more work being done in the traffic engineering area. Most recently the IESG has been approving a lot of MIBs associated with that. Different routes may be advertised to different neighbours based on those local policies. Looking back at the UPDATE messages, it means different neighbours can receive different sets of those UPDATE messages, and that's how you enforce those local policies. It often leads to asymmetric routes. Private peerings mean that not all routes are visible to all the parties on the Internet.

With that kind of foundation I want to talk about how to proceed towards BGP security. So what does an attacker want that a BGP security solution would prevent? The attacker may want to degrade service, either in one particular part of the Internet or the Internet as a whole. They can do that through any protocol that is going to affect the CPU of the router. BGP is just one of those; SNMP is another.
But we want to make sure BGP doesn't become the mechanism by which it's easy to mount a denial-of-service attack on a router. As you start adding security mechanisms such as digital signatures that require more processing, we need to be careful to make sure that they're done in such a way that things that have problems with them can be discarded early in the process as opposed to hanging around.

An attacker may also want to reroute subscriber traffic, perhaps for passive eavesdropping or perhaps for active wire tapping. These kinds of things would allow them to examine the subscriber's traffic and then just pass it on, if they want to listen in. They could modify the traffic and pass it on, if they are trying to be especially malicious, or maybe they just want to delete certain traffic - just the traffic with particular host addresses, for example. If you can get into the routing system you may want to masquerade as particular traffic - perhaps just the traffic from the DNS server associated with a particular organisation.

As I've already said, the BGP architecture is such that it makes it highly vulnerable to human errors and malicious attacks: attacks against the links, against the routers themselves and against the management as well. The implementations themselves are susceptible to denial-of-service attacks. Some routers are relatively easy to crash if you know the right message to send. If they don't crash they spend a lot of time dealing with that traffic, which prevents them from handling other traffic that is likely to be more important. So what we see is filters being applied by router operators in order to protect themselves against this kind of traffic and sometimes configuration errors. These are the filters we were talking about.
There's no standard way to create these filters; different router vendors do it differently. Some of them are extremely difficult to handle, they're very time consuming for the person who tries to structure them correctly, and thus they are often difficult to get just right. About the time you get it right, the situation changes and that's when you have to edit the filter.

So, why is this a particular problem today? Are people really exploiting BGP? There's some DARPA-sponsored research that discovered misconfigurations affecting about 1% of the routing table. That's a concern with the human entry aspect of this. We know that BGP attack tools have been developed and demonstrated at hacker conferences, so just by going to those conferences you can learn about that. The question is, are those tools really being used in the live Internet? The answer is yes. We see it all the time in terms of ISP routers getting attacked. Then the compromised routers become the launch point of BGP-based attacks against other routers. Basically, enable passwords are very valuable.

I talked earlier about how spammers are using address space that has been allocated but is not currently in use. They send BGP UPDATE messages, say with a /24 prefix, and create a chunk of that address space basically anywhere they want on the Internet. They send a couple of gigabytes of spam, then withdraw the route. So they've created a little subnet wherever they want, spewed their stuff and then evaporated. All of the outraged phone calls go to the people who own the address space that was advertised, which has nothing to do with who actually mounted the attack.

We've also seen BGP-based attacks that advertise the chunk of address space associated with the DNS root servers. If you can be viewed as the DNS root server, you can associate any host name you want with any IP address you want further down the tree. And you can even do this selectively based on what the source address for the query was.
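(Illustration, not part of the spoken record.) The attacks above share one feature: the AS originating the prefix was never authorised by the address holder. The sketch below shows the kind of origin check that would catch this, assuming a trustworthy registry of authorised origins exists. The registry contents, AS numbers and `origin_authorised` function are hypothetical, and this is not the sBGP mechanism itself, only the idea behind checking the origin.

```python
import ipaddress

# Hypothetical registry: address block -> AS numbers the holder has
# authorised to originate it. A block may list several origin ASes.
AUTHORISED_ORIGINS = {
    ipaddress.ip_network("198.51.100.0/24"): {64500},
    ipaddress.ip_network("10.20.0.0/16"): {64496, 64497},
}


def origin_authorised(prefix: str, as_path: tuple) -> bool:
    """Check the origin AS (last element of as_path) against the registry.

    Returns False for unregistered space as well, so an announcement
    carved out of unused or unknown address space is rejected.
    """
    net = ipaddress.ip_network(prefix)
    origin = as_path[-1]
    for block, origins in AUTHORISED_ORIGINS.items():
        if net.subnet_of(block):
            return origin in origins
    return False


if __name__ == "__main__":
    print(origin_authorised("198.51.100.0/24", (64499, 64500)))  # holder's AS
    print(origin_authorised("198.51.100.0/24", (64499, 64511)))  # someone else
```

Listing more than one origin per block is what would let a prefix legitimately move between a known set of ASes over time. A check like this says nothing about whether the rest of the path is genuine.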
This leads to what I believe are BGP security requirements for solutions. What I believe - we need to get away from a point where we are relying on personal relationships between people that operate various ISPs. It's not that those relationships are bad, it's not that they aren't going to continue, I believe we need a technology based solution going forward that's going to scale as the Internet continues to grow. It's quite possible that some ISPs will not be trustworthy. We need to be prepared to deal with that situation. We've certainly seen that even those trusted people make mistakes so it would be nice if a system aided in the detection of the mistakes they have made rather than propagating them. So we need to have a solution that has the appropriate binding to the way BGP works. Elements of the security system need to exhibit the same dynamic behaviour as BGP and yet be static in the parts where BGP is static. We need to make sure that the processing requirements and memory of the solution scale, at the same time I'm not sure that it's possible to accommodate every piece of installed base equipment, some additional processing and some additional memory are certainly going to be required to implement a security solution. It's vital, I believe, that we accommodate the incremental deployment of a solution. Principle of least privilege is something that the security professionals have viewed for a long, long time. Basically each system element should only be granted the permissions necessary to perform their piece in the overall system. 
That basically flies in the face of the whole concept of UNIX root. If there is an all-powerful person then that's the person who you are going to pass the money to when you want something done, whereas if each person only has the privileges and capabilities to do the things that are part of their job - or, in this case, each component of the Internet routing system only has the privileges that are appropriate for its role in the system - then that allows you to put a fence around the amount of damage that a compromise of that component can do. This is one of the cornerstones of information assurance. You want to apply that cornerstone concept to BGP, so that a security failure or benign error by one ISP or one subscriber does not propagate beyond that party's sphere of influence. Any security strategy for BGP should incorporate a fire break approach so that security failures and errors don't fall right down like dominoes. The dynamic versus static aspect is that you need to realise that some things have local significance, some things have global significance, some things are slow in terms of change to the system and other things are rapid. For example, something that's slow and of local significance is installing a new link: it takes a lot of time to order it, put it in and all that stuff. Operational staff rollover is another. At the other extreme, something that has global significance and happens very rapidly is a route change, which is part of why I asked the question of the previous speaker: when the aeroplane moved from one ground station to the area of another one, he said that basically all of the other ASes within the system noticed that within 40 seconds. So it had global significance and happened very quickly. There's two aspects to improving BGP security. One has to do with implementations today. And the other has to do with architecture. Some implementation improvements are certainly possible. But I don't believe that that is the whole story.
Just changing and improving particular implementations from particular vendors is not of itself going to solve this whole problem. It may improve the denial of service countermeasures within a particular router but it's not going to secure the entire system. Architectural changes are going to be necessary to do that. Yet the two are both going to need to be done to make the system secure and robust. For every UPDATE that a router receives, it needs to be able to verify that the holder of that prefix is an authorised origin, and subsequent recipients of that information, the ones down the line in the path, need to be able to look at it and confirm that it came from an authorised place. So they need to be able to detect and reject unauthorised routes, irrespective of whether they came from misconfiguration or from malicious behaviour. Failing to do this, a BGP speaker will be vulnerable to attacks that result in misrouting of traffic one way or another. Based on that one bold statement I think you can derive that the important parts are the verification of prefix holders, then binding each BGP router to the ASes that it represents, then performing authentication of UPDATEs and withdrawals. Incremental deployment is, I think, vital to the way the Internet works today. We do not have a flag day. Yet we need to make sure that we provide a secure environment in adjacent ASes. I don't believe we know how to improve the case where we have an AS that has implemented whatever security system we come up with, routed through an AS that has not yet, and then on to one that has adopted that solution. With that one in the middle, I don't think we can do anything; that AS in between will be able to mount all of the attacks that we witness today. However, two adjacent ones that have implemented security solutions should be able to cooperate and then add on other partners and so on, and kind of grow out.
I want to talk about two activities that are going on in the IETF. First is a working group in the routing area where security requirements are being handled. They've got one document in the RFC Editor queue, and three documents progressing in terms of vulnerability analysis and security requirements. So far they have not done any work on protocol development. This working group will not do that. The idea is that once the requirements for security in routing protocols are identified, additional working groups will be set up to develop protocols. Within the PKIX working group, RFC 3779 has been developed for including information about IP prefixes and AS identifiers in certificates. That is the first building block of the solution; it's going to allow a prefix holder to have a certificate and then make that authentication available, which will in turn allow recipients to make authorisation decisions. So while we don't yet have a distribution mechanism or protocol to use these things, I do think we can take advantage of this one piece in order to help us solve that problem where MIT's traffic was going to Florida. We can at least know who owns which chunks of the address space, even if we have to distribute those in the near term through some repository as opposed to the automated routing protocols. My personal opinion is we shouldn't wait for the entire solution to be put out in RFCs; we need to take the pieces that are available and start doing what we can with them, and as we climb this mountain, the further up we get, our view is probably going to change about what those RFCs that aren't yet done ought to say anyway. Questions?

GEOFF HUSTON
Russ, I'm kind of interested in what role the RIRs play in this kind of area and what are the injection points that are required in this.
I heard what you were saying about autonomous systems and being able to understand who has that autonomous system when they're injecting routes into the network as an originating AS, and the role of address space - what prefixes they are injecting and where they are coming from. What, in your view, is the role here for RIRs and, more specifically, what is their role in being a trust point in your model of where securing BGP is heading?

RUSS HOUSLEY
I think there's two ways that things might go forward. I have a personal preference, but I can live with either one. Maybe there's a third one, I just don't know. The RIRs are the ones that are passing out the chunks of address space, so I think they are the ones that need to issue these certificates; they're the ones who know who the holder of that chunk of address space is. So they are a key component of this being done properly. The question is - what's the root of this certification hierarchy? My personal preference is that it will be IANA: IANA issues huge blocks to each of the RIRs and the RIRs in turn dole them out to others. The reason I like that approach is it's the simplest trust point to implement in a router, because it only has one root. The alternative is that you view the RIRs as peers and let all of the personalities, I guess is the word, sort it out in terms of making sure there's no overlap between the chunks that they can hand out. Then you teach the routers about trust anchors associated with each of the RIRs as peers. That requires a little bit more memory, but may actually map to the way things are being deployed in the real world today. So I can live with either one, but I'm kind of focusing on "let's do it with the least memory we can".

GEOFF HUSTON
Thanks, I'd like to drill a little further down one direction. I can understand the concept of an IANA root as being the derivation of where these certificates come from.
My follow-up question is about the certificate structure and the world as we know it. The RIRs hand out both addresses and autonomous system numbers to entities. It's pretty clear that when an address gets handed out it might originate from an entity that wasn't the precise recipient, because of subdelegations and so on. And the other thing is that the autonomous system number is more likely to be originated from the entity that the RIR handed it to. It's more about anchor points of injection.

RUSS HOUSLEY
That gets complicated with mergers and acquisitions. There is no simplistic model that matches the real world, OK. I think we need to embrace that complexity in terms of the certificate structure.

RANDY BUSH
Let's make it short out of consideration to the next speaker. Just carve him up and let's be done with it!

STEVE KENT, BBN
I think there are two points to the question. RIRs I tend to think of as natural certification authorities in this process, but ISPs also wind up being certification authorities when they further hand out those prefixes to downstream providers or multihomed subscribers, whomever. It's just continuing that simple tree structure down as many layers as is necessary in pushing prefixes out to holders. The other direction is a prefix holder authorising a given AS to originate that prefix, which might or might not be the same one. That's a different structure; that's what Russ was alluding to on his slide when he said we need these other digitally signed things to allow the prefix holder to indicate which AS or ASes should be authorised to originate. That's a separate data structure.

GEOFF HUSTON
Thank you.

RUSS HOUSLEY
One can do that with attribute certificates but they don't quite fit. Maybe we don't want to go into all that today! You do need a data structure that says who are the current ones allowed to advertise this prefix.

GEOFF HUSTON
Thank you.

PHILIP SMITH
Thank you for the questions.

STEPHEN KENT
Thank you.
What I will describe is a particular technical approach to achieving the sorts of goals that Russ Housley alluded to in his presentation. Fortunately, because of clever scheduling on the part of our leaders here, Russ has done the kind of background that I would normally have to do in terms of, you know, what BGP is about and what the appropriate security requirements are if you start with the BGP specifications and work down from those, so I'll just focus on the technology that has been developed, which is a candidate technology for doing this. In the terms Russ provided, s-BGP is an architectural approach. It doesn't say anything about bugs in your implementation per se, but it has the opportunity to protect you from implementation errors in other people's ASs. It is an extension of BGP. It uses a particular standard facility in BGP to carry additional information that's needed about paths in update messages, and it has an infrastructure component as well, so it requires some infrastructure as well as some additional information to be pushed along with the advertisements as they are sent. It also implies that routers will do some additional processing in creating updates and in accepting and processing them, in order to achieve security. One thing that it avoids is something that Russ noted, which is any notion of transitive trust. Basically, this is something that follows the principle of least privilege. The idea is that each autonomous system, and the routers that represent it as border routers, should believe only what can be proved to them through the mechanisms that s-BGP provides. So it's not a question of, "Oh, the guys here know what they're doing. I'll take their updates, process them and everything will be OK." That's how we get into trouble today. In designing s-BGP, we tried to design it so that the mechanisms that are involved scale in the same fashion as BGP itself, including the sort of things that Russ alluded to in terms of dynamics.
Some of the information that you need to verify for routing security in BGP changes fairly slowly. Other parts change very quickly. So we have different ways of disseminating data in s-BGP that try to track whether the data, in plain old BGP, changes slowly or quickly. S-BGP has several components. It makes use of IPsec to provide point-to-point security for communications between routers. This is a significant improvement over the previous -- (APNIC staff confer with Steve Kent; Steve switches microphone on) So, IPsec is used to provide point-to-point security. It's an alternative to what we have today in terms of the MD5 mechanism, which would make a cryptographer cringe today and which leaves the use of the current mechanism especially vulnerable. There is a requirement for a PKI, a Public Key Infrastructure, and that has the features that Russ was talking about earlier - that is, an infrastructure that attests to which organisational entities hold which prefixes and which autonomous system numbers. And then there is the notion of attestations, which come in two flavours. Address attestations are digitally signed chunks of data that are used to represent which AS or ASs are authorised to originate prefixes. These are of course signed by the prefix-holders themselves, and that's fairly static, just like the PKI is. And then there are route attestations, which are the dynamic authorisation mechanisms that show, as you go from one AS to another, that each AS along a path has authorised the next one to advertise a given prefix or prefixes. Those are the background pieces. On an ongoing basis, s-BGP calls for routers to generate these additional pieces of data, these route attestations, to go along with each update and then correspondingly to validate them as they come in. The use of IPsec here is fairly straightforward, really a replacement for the TCP MD5 checksum option. I won't spend any more time on it.
You don't have to use this on every link. If a pair of ASs decides the link between them is secure, that's a local decision. The rest of the world won't know one way or the other. You can choose not to do it. Certainly, to the extent that we use the TCP MD5 checksum today, this is a preferable technology. It's more secure in every possible way. This is the sort of allocation diagram that we were talking about a few minutes ago. I would like to start with the IANA and work my way down to regional registries and then, as appropriate, to national or local registries, all the way to ISPs and subscriber organisations, recognising that there are a lot of paths to get from the root all the way to the leaf. (Refers to slide) In yellow, off to the side, we recognise that there is a legacy allocation here of chunks of address space that were handed out directly to entities who became ISPs or subscriber organisations, and that has to be grandfathered back into the system as well. That's just the reality. It's not difficult from a conceptual standpoint, but the book-keeping will take some work. AS number allocations are more straightforward, because you don't move through successive layers of delegation in those. Again, these are the simplest possible structures for a PKI. Most of the world would love to have something this simple, because we're not talking about creating new organisations and asking people to trust them. We're asking the organisations that already hand out chunks of address space and AS numbers to merely sign digital certificates attesting to what they're already doing. You had to trust them before; they were the only game in town for wherever you happened to be. So we're not asking you to trust anybody new here.
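The top-down allocation tree described here can be sketched in a few lines. This is only an illustration under stated assumptions: the holder names, the prefixes (documentation ranges, not real allocations) and the one-prefix-per-holder record layout are all hypothetical, and certificate signatures are left out entirely. It shows only the book-keeping rule that each delegated block must sit inside the block held by whoever delegated it, all the way up to a single root.

```python
import ipaddress

# Hypothetical delegation records: (holder, prefix, issuer).
# Names and prefixes are illustrative only; issuer None marks the root.
DELEGATIONS = [
    ("IANA", "0.0.0.0/0", None),
    ("RIR-A", "192.0.0.0/8", "IANA"),
    ("ISP-1", "192.0.2.0/24", "RIR-A"),
    ("SUBSCRIBER-X", "192.0.2.128/25", "ISP-1"),
    ("ROGUE", "198.51.100.0/24", "ISP-1"),  # lies outside ISP-1's block
]


def build_index(records):
    """Map each holder to (network, issuer) for quick lookup."""
    return {holder: (ipaddress.ip_network(prefix), issuer)
            for holder, prefix, issuer in records}


def chain_is_valid(holder, index):
    """Walk from a holder up to the root, checking containment at each step."""
    seen = set()
    while holder in index and holder not in seen:
        seen.add(holder)
        prefix, issuer = index[holder]
        if issuer is None:
            return True  # reached the single root of the tree
        if issuer not in index:
            return False  # issuer is unknown, so the chain is broken
        parent_prefix, _ = index[issuer]
        if not prefix.subnet_of(parent_prefix):
            return False  # child block is not inside the issuer's block
        holder = issuer
    return False  # unknown holder, or a loop in the records


INDEX = build_index(DELEGATIONS)
for name in ("SUBSCRIBER-X", "ROGUE"):
    print(name, chain_is_valid(name, INDEX))
```

A real deployment would carry these resources in certificates and verify the signatures at each level as well; the containment walk is the part that mirrors the existing allocation hierarchy.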
One property of this PKI, which is different from most of the PKIs people deal with - and I say this from some experience as the co-chair of the PKIX working group in the IETF - is that we're not handing these out to identify an organisation explicitly. The names in the certificates, using standard certificates, really aren't that important. We're using the certificate format because it's a standard, it's an easy thing to do. There's lots of software for processing it. What's important is that the certificate is binding this prefix or this AS number to an entity who has a private key and so, if they want to prove to you that they are that entity - whatever their name is after bankruptcies, mergers etc, etc - the important point is that they have the private key and can sign something which can be verified with the corresponding public key in the certificate. So this is a PKI for authorisation, not identification. That's different to what most people do but perfectly legitimate from a PKI perspective. It all works with existing technology. The text on this slide reiterates this notion of a simple top-down tree-structured approach. You could, as Russ Housley said, have all the RIRs be peers of one another and do something that we in the PKI business refer to as 'cross-certification', or have multiple roots. All this is possible. The one different thing is that, in the s-BGP model, we never require the routers to actually verify any of the certificates. We assume that's something the router does not want to have to deal with. What we do is have network operations people do this for all the routers they manage and then pass the answers to the routers, so the routers just get the secured contents from the certificates and the attestations. They never have to do certificate verification themselves. That's an awfully long process.
Now, as I mentioned, there are two flavours of attestations. The address attestations address the question that Geoff Huston raised a few minutes earlier, which is: if I'm a prefix-holder, how do I indicate which AS I authorise to originate the prefix? We need a digitally signed something to do it. We have a particular format for this in s-BGP but, as Russ Housley said, there are lots of options for how one could do this. It's not that critical since this is, again, a slowly changing offline thing. We don't change instantaneously which AS is going to originate your prefix and thus cause traffic to flow to you. You have to pay money to do this since there are details to be worked out. The route attestations are the way each AS, through its border routers, explicitly authorises other ASs to advertise a route that it is originating or passing on, that it's gotten from other ASs. So, what you wind up with in route attestations is essentially a nested set or a chain of signatures, which allows any router that receives an update in this form to verify that it really did come via the path that it says it followed through that sequence of Autonomous Systems. The formats that we developed for these are common. The address attestation on the bottom just lists a set of prefixes and then the origin AS. The route attestation is different only because the prefixes now deal with a sequence of ASs terminating in the origin AS. So it's the top one here, the route attestation, that's put into an update as a transitive optional attribute to be carried through the system. The bottom one, the address attestation, as I said, you really have a lot of flexibility in how you choose to format. We just chose a common format for the two of them here to make life a little bit easier.
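The nested chain of signatures can be illustrated with a small sketch. Everything concrete in it is an assumption made for illustration: the `RouteAttestation` fields, the per-AS keys and the AS numbers are hypothetical, and HMAC with a shared secret stands in for the public-key signatures that the PKI described in the talk would provide, purely so the example runs with the standard library. The path is listed origin first, which is the reverse of the order in a BGP AS_PATH.

```python
import hashlib
import hmac
from dataclasses import dataclass, replace

# Stand-in keys, one per AS. HMAC is NOT what s-BGP specifies; it is used
# here only so the sketch needs no third-party crypto library.
KEYS = {65001: b"secret-65001", 65002: b"secret-65002", 65003: b"secret-65003"}


@dataclass(frozen=True)
class RouteAttestation:
    signer: int       # AS issuing this attestation
    prefixes: tuple   # prefixes the attestation covers
    path: tuple       # ASs so far, origin first, ending with the next AS
    signature: bytes


def _sign(signer, prefixes, path, inner_signature):
    """Sign the content together with the previous signature (nesting)."""
    message = repr((prefixes, path)).encode() + inner_signature
    return hmac.new(KEYS[signer], message, hashlib.sha256).digest()


def attest(signer, prefixes, next_as, chain):
    """Append an attestation in which `signer` authorises `next_as`."""
    prefixes = tuple(prefixes)
    path = tuple(ra.signer for ra in chain) + (signer, next_as)
    inner = chain[-1].signature if chain else b""
    signature = _sign(signer, prefixes, path, inner)
    return chain + [RouteAttestation(signer, prefixes, path, signature)]


def verify_chain(chain, receiver):
    """Check every link, and that the last link names `receiver`."""
    if not chain:
        return False
    inner, signers, target = b"", [], None
    for ra in chain:
        if ra.signer not in KEYS:
            return False
        if signers and ra.signer != target:
            return False  # signer was not the AS authorised by the previous link
        signers.append(ra.signer)
        if ra.path[:-1] != tuple(signers) or ra.prefixes != chain[0].prefixes:
            return False
        expected = _sign(ra.signer, ra.prefixes, ra.path, inner)
        if not hmac.compare_digest(expected, ra.signature):
            return False
        inner, target = ra.signature, ra.path[-1]
    return target == receiver


PREFIXES = ("192.0.2.0/24",)
chain = attest(65001, PREFIXES, 65002, [])      # origin authorises AS 65002
chain = attest(65002, PREFIXES, 65003, chain)   # AS 65002 authorises AS 65003
forged = [replace(chain[0], prefixes=("198.51.100.0/24",))] + chain[1:]
```

Because each signature covers the one before it, altering any earlier link (as in `forged`) invalidates the chain, and an update handed to an AS other than the one named in the last link fails the final check.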
A router receives an update and has been configured to know that its neighbour is another autonomous system that's implementing s-BGP. As Russ Housley said, you need incremental deployment facilities. You need to configure a router with a flag that says "This neighbour does s-BGP, this one doesn't." That's the simple thing it has to know so that, when it gets updates from a router that is a neighbour doing s-BGP, it knows to look for this additional information and to process it. It goes through some processing and makes sure that the AS number of the router that received the update is already in the route attestation. This is slightly different to what you normally see in an update. Normally, I put my AS number in it. You continue to do that. There's no basic change to how BGP operates. But the route attestation puts your AS number in it as I hand it to you. That way we know that you were the intended recipient. That links all of the route attestations together. A router would verify the signatures that appear in the chain. On average, these paths are 3.5 or 4 hops long. That gives you an idea of the number of signatures you're looking at. You also need to see that the origin AS is consistent with the address attestation information that has been distributed. We wouldn't distribute that in line in updates because it's relatively static, so there's no need to carry that chunk of data in every update as it goes around the network. Housekeeping - it's nice to talk about things like this but there is the question of where the data comes from, where it goes and how it gets there. We split data into two categories in s-BGP. The data that changes fairly slowly, such as who owns which autonomous system numbers, who holds which prefixes and who's been authorised to originate which prefixes, is distributed through a repository system.
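The receive-side decisions just described (a per-neighbour flag, the intended-recipient check and the origin check against address attestations) might look roughly like the sketch below. The names `SBGP_NEIGHBOURS`, `ADDRESS_ATTESTATIONS` and `classify_update`, the AS numbers and the three outcome strings are all hypothetical. Signature verification is deliberately left out, and what to do with updates from neighbours that do not run s-BGP is treated as local policy.

```python
import ipaddress

LOCAL_AS = 65003

# Hypothetical configuration: which neighbours are flagged as doing s-BGP.
SBGP_NEIGHBOURS = {65002}

# Hypothetical address attestations, assumed to have been validated offline
# and pushed to the router: prefix -> ASs authorised to originate it.
ADDRESS_ATTESTATIONS = {
    ipaddress.ip_network("192.0.2.0/24"): {65001},
}


def origin_authorised(prefix, origin_as):
    """True if an address attestation lists origin_as for exactly this prefix."""
    network = ipaddress.ip_network(prefix)
    return origin_as in ADDRESS_ATTESTATIONS.get(network, set())


def classify_update(neighbour_as, prefix, as_path, attested_recipient=None):
    """Classify an update; as_path lists the origin AS last, as in BGP."""
    if neighbour_as not in SBGP_NEIGHBOURS:
        return "unverified"  # neighbour does not do s-BGP: local policy decides
    if attested_recipient != LOCAL_AS:
        return "reject"      # route attestation was not addressed to us
    if not as_path or not origin_authorised(prefix, as_path[-1]):
        return "reject"      # origin AS not covered by an address attestation
    return "verified"


print(classify_update(65002, "192.0.2.0/24", [65002, 65001], 65003))
```

Exact-prefix matching is a simplifying assumption; how more-specific prefixes should be treated is not covered in the talk.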
The one thing that you do need to maintain very responsive dynamics for is which route is currently being advertised, because that can change quite quickly, as the first speaker of the session pointed out. That's why the route attestations are pushed in line with the updates themselves. The idea behind repositories is that they would hold all the certificates that deal with all the chunks of address space and the certificates that deal with all the autonomous system number allocations. If you issue certificates, you have to issue certificate revocation lists - where you say you didn't mean to do that - and the address attestations as well. All this data needs to go through repositories. Now one of the open questions is - who should be operating repositories? There should be some number of them because we want robustness. They should be loosely synchronised but, if we have too many of them, we create a new problem, which is finding the repositories so they can talk to each other and provide synchronisation. We have to achieve the appropriate balance there and that is still, I think, an open question. Note that routers don't go to repositories to fetch anything. Network operators go to repositories to upload and download data. When we designed s-BGP - admittedly the folks designing it are not operations people - we figured that about once a day would be sufficient for uploading and downloading data, recognising that the allocation processes are not instantaneous and that changing who my ISP is is not instantaneous. If that turns out to be a reasonable time frame, we're not talking about high availability. It needs to be there. If you don't have new stuff, remember the routers aren't looking at it, the operations people are, so they pull down the data, they process it and push it out to routers. If they're unable to get it, they can go to the old data they have; they can live with that if they wish to.
(Refers to slide) This is a diagram that tries to put everything together and suffers the usual problems of a diagram that tries to put everything together. Let's say we have a few repositories - I fit two on the slide in purple - and a registry, say a regional registry, here. So two ISPs, shown here, interact with the registry to get their certificates identifying them as holders of chunks of address space and as owners of autonomous system numbers. They push their certificates up to these repositories, they push address attestations up to the repositories and then they download everything. We didn't try to get fancy when we did the repository design. We just said, "Take it all". We're talking about tens of megabytes, maybe 40 or 50 megabytes of data. Many of us download more than that in e-mail every day. It's not that hard. Then they process what's been downloaded and push the extracted data to the routers in their autonomous system or systems that execute s-BGP, and then those routers exchange updates enhanced with the s-BGP route attestations with their neighbours. So those are the kinds of flows of data we're talking about. No system is perfect. Even if one did s-BGP, there are residual vulnerabilities. These are the sorts of problems that really result from the limitations of BGP itself. BGP doesn't time stamp or serial number updates and so we have a problem knowing when somebody, for instance, does a withdrawal and then they re-advertise etc. We have trouble keeping sequence with that. There is some ability to do a better job of this with s-BGP by putting basically time to live fields or their equivalents into the route attestations, but that's about it. It's relatively coarse. So it's not perfect. Those are the kinds of residual problems that we see with the technology. Now, it's fair for people to say, "How much of this is just something that you talked about and wrote half a dozen papers about over the last few years?" We have implemented all of this.
We used the MRT code as a basis for this. We augmented it to do s-BGP, including basic policy controls for incremental deployment - that means the ability to say "this neighbour does it and this neighbour doesn't do it." We did the housekeeping part, so we set up a set of tools that's a mini registration authority or certification authority, to issue certificates to the subscribers to whom you've done sub-allocations, to download and upload the data to repositories, and then to verify everything that comes down through the repository, process it and produce the extract before you ship it out to your routers. We built a repository. Frankly, this was the weakest part, but it has a nice feature in that nobody has to manage a bunch of access controls. It's all internal, based on the certificate structure. There's a CA, a fairly high-assurance implementation using something called SELinux, and we use that along with other hardware to do high-assurance issuance of these. So, s-BGP is a proposal on how to improve BGP security. It has impacts - registries and ISPs have to become certification authorities in this limited context, but it's exactly what they do today; it's just adding another step to these processes. The biggest problem here is getting routers to be able to do this. If we were talking about normal operation and we weren't being flooded with lots of updates due to some major worm or virus, most routers could probably handle the digital signature processing load with the existing hardware. But what they couldn't handle is the space - not in routing tables but in memory for these digitally signed things, for the route attestations - because my laptop has more memory on it than most of the routers out there. That's just a fact of life.
So, if one's willing to assume another set of routers coming out with more memory, and we want to throw in an encrypter to ensure plenty of processing horsepower for the crypto operations, then this is technically feasible. It does not require faster-than-light capability or anything like that. It's a doable thing. It requires the basic processing I have on my laptop although, of course, it is a nice laptop. At this point, any questions? Oh, in answer to the first question, this is an osprey. This is a raptor that you find in North America that eats fish more or less exclusively. It grabs a fish out of the water, goes to a telephone pole and basically eats the fish. He was looking at me taking this picture, figuring, "No, he weighs too much for me to be able to pick him up." That's why I was able to take the picture safely.

ANDREI ROBACHEVSKY
With incremental deployment, I assume that attestation may not be complete.

STEPHEN KENT
I think the thing that's reasonable to expect people to be able to manage is being able to deploy in contiguous ASs. As you do that, you get benefits for the subscribers and operators of the contiguous ASs. They form islands which, when they connect, increase the scope of connection. A precursor to doing that is that you need enough infrastructure, starting with IANA and working all the way down, so that the folks who choose to deploy have the certificates they need to do it, so that's fair. I think, because of the impediments in terms of current router hardware, especially in regard to memory, the thing one would want to do first is put the infrastructure in place: put the PKI in place and the address attestation infrastructure in place. You could distribute that information and use it to help build better tables for filtering purposes. You know, you could do that years before you got the routers doing the rest of what we talk about. That would be a good incremental approach.

ANDREI ROBACHEVSKY
There is no - if it's not complete, I drop the announcement.

STEPHEN KENT
If you couldn't find an address attestation matching an originated prefix in the route - and you got the route from a neighbour who was claiming to originate it - then you'd be very unhappy. If they got it from somebody further along, well, then, that's the shortcoming of not having it pushed, you know, back to the edges. So, if a given AS is doing this, an ISP is doing it, you would expect them to be pushing this back to their subscribers and, of course, many of their subscribers who aren't running BGP, well, they don't even ever know what's happening. It's done on their behalf. For other subscribers, those subscribers could choose, in an offline fashion, to sign an attestation saying, "He's originating my address space." How far back you push it depends on the exact circumstances.

RANDY BUSH
Andrei, drop me a note and I'll forward you a reference to a paper which describes a modification to s-BGP. There's a number of modifications, most of which I didn't like, but one allows very fragmented incremental deployment.

GEOFF HUSTON
It's just a follow-up question. You would conceivably see, if I understand the answer correctly, being able to move into an s-BGP environment with address attestation as the first step here.

STEPHEN KENT
Yes.

GEOFF HUSTON
OK.

STEPHEN KENT
Whether we choose to do s-BGP or not, I think something like the PKI for address allocation, and preferably ASs, and the attestations are a common framework that anybody should want, and then where you go from there is still debatable.

GEOFF HUSTON
Thank you.

ANDREI ROBACHEVSKY
Another question, related to PKI structure. For s-BGP it's not identification, right. But, from the RIR perspective, we certify our members. So apparently, there must be different PKI structures co-existing.

STEPHEN KENT
There don't have to be.
If you choose to make the subject names in the certificates be useful for your identification purposes as a regional or a global or national registry, that's fine. S-BGP doesn't care. But it doesn't preclude using the same certificate for both purposes. The thing s-BGP cares about is that whoever has the private key corresponding to the public key in the certificate, no matter what their name is, has just signed an address attestation, for instance, and you can verify it. So you can have one PKI to solve both of your requirements. It's just s-BGP wouldn't take advantage of that other aspect of it. But you don't have to create parallel ones, no. If you go to the trouble of doing it once, don't do it twice. RANDY BUSH You were right the first time. PHILIP SMITH Thank you very much for the questions and thank you very much Steve for your presentation. Thank you for coming. Thanks to you all for coming to this first session of the Routing SIG. I've put up the agenda for the second session, which starts at four o'clock, so I hope to see you all after the coffee break. Thank you. PHILIP SMITH My watch says that it's 4 o'clock so I think we probably want to make a start, please. (Pause) OK. So welcome to the second session of the APNIC Routing SIG. I have the agenda on the screen. This session we're going to concentrate more on the actual routing system itself. Varieties of reports on projects and various activities on the Internet. The first presentation is Operational experience at OCN/NTT. Route views project update, routing table and BGP movie update, a look at the BGP storms and Internet health during the big worm attacks and the results of anycast stability experiment. Before I invite Tomoya Yoshida up, just a reminder if you want to ask questions just use a microphone, either come to the two at the front of the room or request the roving microphone. Remember to mention your name just to help the stenographers so we all know who you are. 
I should remind people to look at the online notice board and also the jabber chat rooms which are also available for this session. I'd like to invite up Tomoya Yoshida from OCN. TOMOYA YOSHIDA Thank you. My name is Tomoya Yoshida, I'm from NTT Communications. OCN is one of the biggest ISPs in Japan. Today I would like to present about some operational routing experience in NTT/OCN. This is our history, we started our OCN service in 1996. We started the name OCN Economy. This service was very cheap. (Refers throughout to slides) /28, /29 to our dedicated users, we configured at each router and distribute. When the OSPF external route reached around 20,000, OSPF convergence time needed more and more. This was our problem. Then we tried many things. The one is to separate the OSPF domain. This was very complicated and the operation was very difficult. Then we changed the routing from OSPF to BGP. We used iBGP route for Internet route. Then the iBGP route is growing very fast because we changed the routing information from OSPF to BGP. So after that, we used a route reflector from the top to the bottom, the hierarchy is to the bottom currently being used. Also we had an address problem. That was at the time we couldn't get enough address space from JPNIC. So in currently IPv6 policy is easy to assign router space, but at the time this was very difficult to get enough address. As a result it was also difficult to allocate the route. You can see the changes of the backbone topology. The left one is in 1996. Those routers are connecting to one switch in 1996. Then after that, we divided the OSPF area, 1998. We used the switch, of the - I think this switch is good technology. We have many clusters but some clusters does not need BGP for route. So we distribute the route for the needed cluster. Some cluster does not need BGP route so we distribute the needed cluster that BGP put out. This is topology in 1999. The area is in Kyoto. Kyoto was connected to Osaka. 
This area the measure is what was connecting to one Tokyo Port. The worst area was connecting to only one Osaka port. This topology was - if Osaka is down, so all the traffic goes down so we changed to this topology so we called a square topology. In Kyoto it is connecting now both Osaka one and Osaka 2 ports. Japan is an island, a long distance island. So most of this area is connected to Tokyo, the west area is connecting to Osaka. This is just bandwidth. Most Japanese ISP use OSPF. I think this is a very historical case. We divided OSPF area. The router has many segments so we divided this router is - we separate the functions. For BGP, a route reflector hierarchy we use. And distribute for needed cluster. From here, this is our experience about some specific things. This is a BGP prefix limitation. You know that both Cisco and Juniper have a limited function. But those implementations are different. You can see that if you have a route from peer. Then go to the local RIB. In Juniper's case they think by using this and Cisco use the local RIB. So we have many - this is very confusing. Just now I requested to Juniper to implement it by using the local RIB. They said that 7.4. (Referring to slides) This is next hop self/redistribution. If you forget next-hop-self at the eXchange border route and not redistributed to your backbone the IX segment around /24. In Japan there are 3 major IX, In Japan, 3 major IXs is announcing around /20 the part of the IX's segment IP like /24, so when some ISP forget the next-hop-self and not redistribute those segment to IGP, traffic will go to the IX's AS. In Japan, one or two times we can see this problem. So nobody, the IX segment is not routable but in Japan every major 3 IX is routable. This is LSA refresh experience. Cisco is 30 minutes. Juniper is 50 minutes. This is OK (Indicates screen) but sometimes there's no - in this case there are only 3 parts. We found this situation. 
We changed the time from 50 minutes to 30 minutes. In some cases you cannot see some paths like this (Indicates screen). This is just operational information, route cache is very useful. Currently almost vendor is implemented route refresh capability. Also IETF is discussing many capabilities internationally. But I think the soft reconfiguration inbound is very useful. When you set a new peer you set firstly, low priority to this new peer, but if you receive more specifics this is the best path, so firstly, we receive. Firstly check the route, not receiving any route, only monitor the route from peer by using cache then receive. This cache is very, very useful. This is also route flapping experience. This line is one flap. (Referring to screen) After 50 minutes more than one flap occurs, so if flap occurs, so Cisco's routing will be over 1500, and in Juniper, over 3,000. So Juniper, suppress is around this 3,000. So only Juniper router is suppressed, Cisco is not suppressed. This is very confusing for us. In this case only Cisco is used and Juniper is not used. So this is a bit complicated. Lastly, the routing hijack. We have around/10 IP blocks. Sometimes our prefix is hijacked. When we are hijacked - we announce more specific space, for example if someone hijacked /20 so we announce two /21s route to the Internet, but when the hijack is /24 this is very difficult because we announce /25, two /25s route to the Internet but many peers does not receive /25s so we also announce /24 and also we announce two /25s. Then we need BGP origin validation security mechanism. Lastly, we need TTL hack security mechanism for many vendors. Prefix limitation by using LOC-RIB for Juniper. MAC accounting for 10 G. Traffic is another line so we cannot. MAC accounting is very important for 10 G. Feasible path reverse path forwarding for uRPF. Strict mode is dangerous. Loose mode is just loose. BGP inactive reason for Cisco is coming - Cisco implemented for CRS-1, I heard. 
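(Illustration added to the transcript, not part of the spoken session. It is a minimal sketch of the counter-announcement described above: when a prefix is hijacked, announce its two more-specific halves, unless they would be longer than peers commonly accept. The function name, the example prefixes and the /24 cut-off are assumptions made for this sketch; the cut-off reflects the speaker's remark that many peers do not receive /25s.)

```python
import ipaddress

def counter_announcements(hijacked, max_len=24):
    """Return the two more-specific halves of a hijacked prefix as strings.

    Returns an empty list when the halves would be longer than max_len,
    because (per the talk) many peers filter such long prefixes.
    max_len=24 is an assumed filtering boundary, not a standard.
    """
    net = ipaddress.ip_network(hijacked)
    if net.prefixlen + 1 > max_len:
        return []
    # prefixlen_diff=1 splits the network into exactly two halves.
    return [str(half) for half in net.subnets(prefixlen_diff=1)]

# Hypothetical prefixes, used only for illustration.
print(counter_announcements("10.16.0.0/20"))    # the /20 case: two /21s
print(counter_announcements("203.0.113.0/24"))  # the /24 case: nothing usable
```

(In the /24 case the sketch returns nothing, which is the difficulty the speaker describes: more specifics alone do not help, so other measures such as contacting the source are needed.)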
So our operational additional information is very useful for you. Lastly, dynamic filtering. Just idea. If you receive the BGP route with this community, (for example, 4713:777 attribute), the route which in scope of this community will be rejected automatically. I think this is very useful for filtering for your PA. When some key address is added for us, so we change the router, just my idea, but this is very important and very useful, I think. That is just my idea. Lastly, our backbone. (Referring to screen) If you want more information visit the site on the slide. Thank you very much. PHILIP SMITH Thank you very much. Are there any questions at all? No? PETER SCHOENMAKER Why do you say - PHILIP SMITH Microphone, please. Please state your name. PETER SCHOENMAKER You said that unicast RPF is dangerous in strict mode. Are you just talking about BGP only or can you explain why you say it's dangerous? TOMOYA YOSHIDA We just think the best path to the left way, but the traffic coming another way. So this router is going this way. So if you configure - PETER SCHOENMAKER You're just talking about BGP connections? General experiences with static customers traffic can only go one way. TOMOYA YOSHIDA I mean that at the gateway router. The customer is only just one, this is very useful, but in some cases, dangerous. PHILIP SMITH Any more questions? No, thank you very much. PETER SCHOENMAKER Is your NOC having a hard time contacting upstream providers when - hijacks your address space to get them to stop advertising it? TOMOYA YOSHIDA Yeah, we firstly announced more space then we contact the source. PETER SCHOENMAKER Our general experience is most upstream providers are fairly responsive very quickly to hijacked address space and that it only takes maybe a phone call and a short period of time. 
TOMOYA YOSHIDA We also have a system, so this is agent, distributor agent in some areas, this agent cannot hijack, so after the hijack occurs, they send an email and we know the hijacker. PHILIP SMITH Thank you very much for your presentation. Joel. You're up. JOEL JAEGGLI I'm going to talk about what the Routeviews project is, some history, current efforts, new and future efforts, projects we're working on and how you can participate. What is Routeviews? Generically speaking, Routeviews is eight routers that collect realtime information about the global routing system through BGP sessions. It's a repository of historical data on the state of the global routing system that currently goes back to 1997. So, in some respects, it's one of the deepest sets of data we have that's been continuously collected. It's both an operational and research tool. A little history - the original Routeviews router goes back to 1995. It started as a purely operational tool. At the University of Oregon, originally we had one external provider, that was with Westnet. But, when our connectivity became a little more complex, it became, you know, necessary to see how the rest of the world saw our routes, particularly when we started making configuration changes. Looking Glasses, in 1995, were still in the future so, the best thing we could do as far as we could see was to get a BGP feed that was the whole table from outside our network. Randy Bush, who was then in RAINET was generous enough to give us the feed from MAE-WEST and, at the time, if you think about it, the big network operators considered this data to be extremely proprietary so the fact that he was willing to do that was, you know, of incredible utility to a relatively small organisation. We made that data publicly accessible and people began using it and, in the process, contributed more views. 
Periodically, people would collect information about the state of the whole routing system by telnetting into the router and doing a show ip bgp at intervals. In 1997 that was formalised by NLANR/MOAT. They began collecting it on a once-a-day basis. People then started to use the data for really interesting applications. One of them was Skitter, which allows you to sort of visualise what the interconnectivity between ASs looked like. And so, in this particular plot, which is from 2000, the most connected ASs are in the middle and then the least-connected ASs radiate to the outside. To jump ahead, Routeviews continued to percolate along and people added more views and we got to the point where we had about 50 peers on the router. That was in about mid-2000. It became obvious to us that, for Routeviews to continue to be interesting and relevant, that new things had to happen. In mid-2000, we began to get serious scalability problems, because we had a router that was fairly modern at the time, but was accepting 50-plus multi-hop BGP feeds and had around 5,000 interactive logins a day, a lot for Routeviews to support. At the same time, it was becoming obvious that researchers and some operators had needs that were not being met by any of the existing data collection methods that were available. So we began a new project in March of 2001, we began actually collecting show ip bgp dumps ourselves at two-hour intervals, so the data became more fine-grained at that point in terms of the whole table. We also decided to deploy a new service, which was tentatively named route-views2 and that sort of stuck. Route-views2 is a Zebra bgpd running on Linux. Initially, that was not a complete success, in part because no-one had actually used Zebra for the particular application that we wanted to use it for - which was taking 50 BGP multi-hop peering sessions. They'd actually tried to use it as a router, you know? 
And, it couldn't, so it couldn't really handle 60 peers and the cli itself was slow enough that it proved to be unusable. So we punted on actually replacing Routeviews with that and set it up as a separate service. So we continue to run the original Routeviews service to today, including the 2-hour dumps. An upgrade to Routeviews performed at the same time proved to make that a heck of a lot more useable and it continues to be functional to this day. It has about 70 peers at this point. But the thing we really wanted out of Zebra was better collection methods because, with the router, there was basically two ways to get updates on routing events out of the thing - turn on debug and export all of that data to some other host that would then log it, which had fairly serious performance implications and, given the already sad state of the router, seemed unlikely to be successful. Whereas, with the Zebra, we could actually put it directly to a locally attached disk on a two-hour basis and we could actually log updates to a file as they happened. So, basically, every 15 minutes, the route-views2 box rotates off its current log file and starts a new one. So that means updates can become available almost immediately and there's, you know, rather significantly more fine-grained information about what's going on with the state of routing than there was when you either had to log in and use the router yourself and collect that information manually or intuit what was going on from the two-hour dumps. At the same time, there was a lot of data all over the place. Hans-Werner's data was sitting in San Diego. Bill Woodcock at pch.net had data from the route-views router and we had some sitting on our machine. We created routeviews.org, which was then and continues to be the existing log for all the Routeviews data that we can aggregate. 
As high-resolution data became available from route-views2, two more problems became apparent, one of which we solved and the other there is an ongoing effort in the IETF to address. One is that MRT had a one-second resolution. With route-views2, that's not actually a huge issue because we have all these eBGP multi-hop sessions with routers all over the world so your ability to make very fine-grained assertions is somewhat limited anyway because some of those routers are up to 150, 200, 300 milliseconds away. The other shortcoming that was pointed out by researchers who were drilling into the data and it was obvious to some more than others, was that there are artefacts in the data that are the product of the multi-hop BGP sessions themselves rather than actual BGP events. Basically, like having a large number of different UNIX platforms, routers have TCP stacks of varying quality and, when you put a network in between it of indeterminate length or whose routing may be affected or when you load a CPU up on the router, you may get events that are the product of the TCP session either being reset or becoming really slow or having slow-start sort of artefacts that cause poor performance that are not directly related to routing events and so those artefacts do show up in the data. A new effort was started to place Routeviews routers directly on Internet Exchange fabrics so that we could take single-hop BGP sessions. The first of those new routers went online in July 2003 at dix-ie thanks to the WIDE Project. Akira Kato was kind enough to host that box for us and continues to do so up to this day. Route-views.wide was followed by route-views.isc located at the PAIX in Palo Alto in California USA. Route-views.linx is located in London and route-views.eqix located in Ashburn. Early in 2003, we actually got our first Routeviews employee. For the most part, up until this time, Routeviews has been by volunteers, myself included. 
John Heasley is with us on a one-year sabbatical from Verio. Route-views6 was a new box we created which is located at the University of Oregon, takes eBGP multi-hop IPv6 feeds. So, since May 2003, we have been collecting the same information that we collected at v4 for IPv6. Fall 2004, it became apparent that we would need to support tcp-md5 because of excitement with Cisco and Juniper routers that occurred at approximately that time. It actually took kind of a while to deploy and it is still somewhat hackish. At the moment, our Zebra collectors can only initiate tcp-md5 BGP sessions. They cannot actually receive them. You know, we do now support rfc2385 but, in general, we prefer not to do it if we can. Current efforts - John Heasley has finished up his time. Mike Witt has joined the Routeviews project to take over some of those activities. Continued operation of the Routeviews collectors and archives maintains or requires a bunch of my time. In effect, at this point, the Routeviews project is a globally distributed ISP with no links. Right? We have machines, you know, in POPs all over the world and that is a significant maintenance headache. I haven't actually seen or touched the box in Otemachi since it was installed, yet it's gone through two upgrades and had some disks replaced. So it is, you know, a significant undertaking. Some tool development has occurred in the recent past. Not all of it by us but we hope that it's useful. Two interesting applications that we've deployed are BGPlay, which you saw an example of in the Boeing demo and which I will demonstrate here in a moment, and IP to ASN DNS zones, which is another tool that we thought would be useful for, you know, drilling in the Routeviews data and it turns out a lot of people like to use it for spam filtering. It's actually taken off and has a life of its own. 
Collaboration with researchers - most of the funding we have now is actually to support research efforts in additional data collection so, without researchers, there's really less of a reason for Routeviews to exist in the large state that it is at this point. It's beyond the ability of the University of Oregon to support by itself. So BGPlay is a Java application which displays animated graphs of the routing activity of a certain prefix within a specified time interval. Its graphical nature makes it much easier to use and understand how BGP updates are affecting a particular AS rather than by, you know, looking in the updates themselves. The BGPlay database stores 10 days' worth of data provided by the Routeviews project. We are working on actually increasing the amount stored. It is a significant data set. BGPlay was actually written by the Computer Networks Research Group at Roma Tre University. So we can't really take any credit for it other than it uses our data and that we host one of the current applications. DNS IP to ASN/ASPATH - we have two queryable subdomains of TXT records in routeviews.org. Asn.routeviews.org resolves a reversed IPv4 address or prefix to the origin AS, prefix and prefix length of the best route as seen by route-views2. So, I mean, basically, you know, for any given IP address, you can put it in, as in the example (refers to slide) and what it will do is give you the route block that it came from and the autonomous system number. ASN/ASPATH does the same thing but resolves to the full AS path. The zone files are reconstituted twice daily and are available for download from archives.routeviews.org. We don't allow transfer from them because the smallest of the two is 179 megs. Current efforts - these are our current peer counts (refers to slide). Route-views is holding steady at about 70 peers. We have stopped pretty much taking additional peers to Routeviews. We still take them to route-views2. 
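(Illustration added to the transcript, not part of the spoken session. It is a minimal sketch of how a query name for the asn.routeviews.org zone mentioned above might be built from an IPv4 address. The function name is hypothetical, and the octet-reversed layout, in the style of in-addr.arpa, is an assumption based on the speaker's phrase "a reversed IPv4 address". The sketch only builds the name; the TXT lookup itself would need a DNS resolver library and is not shown.)

```python
import ipaddress

def routeviews_query_name(addr, zone="asn.routeviews.org"):
    """Build the DNS name to query for an IPv4 address.

    Assumes the zone expects octets in reversed order, as in-addr.arpa does.
    Raises ValueError (from ipaddress) if addr is not a valid IPv4 address.
    """
    ip = ipaddress.IPv4Address(addr)
    reversed_octets = ".".join(reversed(str(ip).split(".")))
    return reversed_octets + "." + zone

# 192.0.2.1 is a documentation address, used here only as an example.
print(routeviews_query_name("192.0.2.1"))
```

(A TXT query for the resulting name would then, per the talk, return the origin AS together with the prefix and prefix length of the best route seen by route-views2.)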
We would really like people to send us views at exchange points. We would like to deploy additional regional collectors, something in the order of three to five, if you're interested in hosting one, that would be cool. We are soliciting more input on tool development and we have a few projects in the works ourselves, including queryable databases. We're interested in providing local computing resources and storage for researchers. So, how can you participate? Well, a lot of people just use Routeviews. That's one way to participate. Bring a view to Routeviews. We're really looking for single-hop views at the IXes but we'll still take multi-hop views. If you have v6, we'd sure as heck like help. Send mail to help@routeviews.org with that. We will not announce or do anything with those routes other than sink them into the machine. Host a collector. We're looking to build out three to five more. It as an operational tool. This is who we are (Refers to slide) That's us. There's a bibliography at the end which will point you to the various pieces I have covered (refers to slide) If you want to take a look at BGPlay here real quick. What you can do is take an arbitrary prefix. I'll use one that I happen to know fairly well. (Types it into the BGPlay query form) Plug in a time interval up to 10 days so we'll go back to the 13th. So we've got a layout. We can see that this is AS-3582 here in the middle. That's the University of Oregon. We can see our immediate upstream for the neuroproject with the School of Engineering. We can see one of our other upstreams, Williams Communications. If we put it in play, you can watch paths move around for 10 days. You can see, I mean, this is pretty stable for the most part. This is kind of the boring target, which is the way we like it. RANDY BUSH Can you wrap up please, Joel. JOEL JAEGGLI Yes, this is it. Any questions. Thanks for your time. PHILIP SMITH Thanks very much, Joel. Geoff, would you like to... 
GEOFF HUSTON Hi. My name is Geoff Huston. I'm with APNIC. What I'll be showing you now is actually application of that Routeviews data. So this is actually a very quick status report on the state of the interdomain routing system as seen by Routeviews. Have we got a laser pointer there? I'll show you a whole sequence of graphs and a little movie. (Refers to slide) This is since 1994, looking at the number of entries in the BGP routing table. Routeviews came on line in late '97. This is 0, this is 200,000 entries. Here is the Internet boom. It stopped at around 2001. A bit of a sort of Internet burst there, but now more recently, we're back into a steady growth pattern which is, yet again, accelerating in the number of entries. That was very sharply an exponential growth pattern. This is a growth pattern which one could model at some kind of increasing rate, probably exponential but to a lesser degree. If we take out the noise and look at one particular AS and its growth, you actually see the boom and bust pretty clearly. So around the year 2001 and around 100,000 entries, the number of entries actually stopped growing for around 12 months and then we all decided life is cool, the Internet is wonderful and back we go again. More recently, and that is 2003 up until a couple of days ago, for some reason, in the last half of last year, things started growing more quickly and, indeed, just around December and January, someone was actually being very active here. There were bursts of disaggregation. But the last month or so has actually seen a whole bunch of new routing entries. I shouldn't say 'new'. Much of them are more specific. Are still leaking. The number of routing entries is not the entire dimension of the routing system. The other way to look at it is how much address space is actually routed. There are 4 billion /32s in IPv4 and this is a small range from 800 million up to 1.4 billion. 
So, around a quarter, a little over a quarter of address space is actually advertised in BGP. 1997 up until a few days ago and what I'm trying to find here is curves and distributions. Are we consuming address space more quickly? Again, an Internet boom, not as obvious, a bit of correction - 2001/2002 - and more recently, since mid-2003, all of a sudden, we're starting to see rollouts again, which is a dramatic increase in the amount of address space in the routing table. Viewed from a single AS - same data, just one stream - what's all this noise? Someone flaps a /8. Why you would turn on and off 17 million /32s multiple times a day beats me with a stick but, you know, people do. More recently, they stopped, since mid-2004. You can see they've accelerated, coming back again. Are we getting better at aggregation? No. Are we getting any worse? No. Certainly, the boom saw a lot of more specifics getting into the routing table. But, by the end of 2001, the percentage of more specifics stopped at around 55%. So half the table, a little over half, actually doesn't need to be there. Since then, the routing table has grown and, if everyone was simply doing the right thing, the percentage of aggregates, more specifics, would actually drop. But it's not. There's still the same amount of folk who get address space - typically a /20 - and all of a sudden produce an advertisement of a /20 and, just for good effect, about eight or so /24s so we can fill up our routing table faster. So the 55% hasn't really changed much. No better, no worse. How many ASs are there? Internet boom, very quickly from 3,000 Autonomous Systems up to 12,000 Autonomous Systems in the routing table and then, after the boom, there wasn't really a crash but the trend line is quite different. What used to be a very prominent exponential growth rate, which would have seen us run out of addresses by approximately 2007/2008, is now into a more linear trend. 
I haven't done forecasts on that yet, but certainly further out. Currently around 18,800 autonomous system numbers. Steve, I think it was, said the average length of AS paths, if you remove all the AS path stuffing, is around three or four. He's dead set correct and, actually hasn't altered much since 1997. Here's the AS path length as seen by every Routeviews peer from 1997 to a few days ago. Most of the Routeviews peers see their average AS path length at around 3.8 or so and, interestingly, it's converging. More entries, but same length. So, if you have a metric about the interconnection densities, you're actually seeing that BGP is becoming more connected. As we grow, the diameter is not expanding. The diameter of the network is actually contracting. There's a big black hole somewhere in the middle of the network and if you said AS 701, you'd probably still be pretty close to it. The AS paths do tend at the moment to continue to converge in the number of players that's growing. If we were better, how better could we be? This is since 2001 the size of the BGP routing table - 100,000 up to 158,000 entries. If we started to remove the more specifics, the folk can have precisely the same prepended AS path, 105,000 entries. Strip out prepending - 104,000. Say, "If it comes from that area but is more specific, drop it out" - 90,000. Now say some folk checkerboard - given a range of addresses, they'll advertise a /24, then a blank and then a /24 and blank. If I covered an aggregate over the holes, can I do better? Yes, you can. Around 45,000 entries is around the true information load in the routing system. The number of entries that add something new in terms of reachability. Up here is actually traffic engineering. The overload of traffic engineering is approximately 100,000 routes or two-thirds of the routing table. What about v6? The red line is v6. Routeviews started collecting in 2003. The blue line is the 6bone and the green line is what the RIRs are allocating. 
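(Illustration added to the transcript, not part of the spoken session. It is a minimal sketch of the kind of table reduction described above: drop more specifics that are already covered by another announced prefix, then merge adjacent prefixes. It is a simplification and not the speaker's method: it ignores AS paths and origins, which the speaker's steps take into account, and it does not cover "checkerboard" holes with an aggregate. The function name and the sample prefixes are hypothetical, and IPv4-only input is assumed.)

```python
import ipaddress

def reduce_table(prefixes):
    """Return (after_dropping_covered, after_merging) as lists of strings."""
    nets = sorted(
        {ipaddress.ip_network(p) for p in prefixes},
        key=lambda n: (n.prefixlen, int(n.network_address)),
    )
    kept = []
    for net in nets:
        # Shorter prefixes come first, so any covering prefix is already kept.
        if not any(net.subnet_of(k) for k in kept):
            kept.append(net)
    # collapse_addresses merges adjacent networks but never spans gaps.
    merged = list(ipaddress.collapse_addresses(kept))
    return [str(n) for n in kept], [str(n) for n in merged]

# Hypothetical table: a /20 with two leaked /24s, plus two adjacent /24s.
table = ["10.0.0.0/20", "10.0.1.0/24", "10.0.5.0/24",
         "10.1.0.0/24", "10.1.1.0/24"]
kept, merged = reduce_table(table)
print(len(table), len(kept), len(merged))
print(merged)
```

(On this toy table the five entries reduce to three and then to two, the same direction of effect as the reductions quoted in the talk, though the real analysis also compares AS paths.)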
A phenomenal 400 entries and today we're up to a more phenomenal 700 entries. The activity in the v6 routing table is growing, albeit very slowly. How much address space? This is a logarithmic graph. It's not this slow. That's a /24, /22, /20 and so on. The total v6 table is about a /17, a little bit over that. It used to be that the 6bone was the major contributor with a number of very large allocations done to major telcos over the last year that are announced in the table. How many people are playing in v6? A little bit over 500. I actually searched for the number of Autonomous Systems that only originate in v6. Whichever of you out there are, there are only I think 11 at the moment. So, in general, people are doing both and there are actually 11 originators which are v6 only. Growing, as you see, quite slowly. Aggregation potential. In general, the policies around v6 have actually been at this particular point - that's very early days - largely what we're seeing is good aggregation. One particular entity not very far from here but not in this country did decide to disaggregate into /48s in July last year - thank you very much - and they've persisted in doing that with a bit of experimentation up until early January. They decided that the experiment was over and they've re-aggregated. So in general, the thing is packed almost as tight as you can get. Which brings me now to one of the more interesting applications of this Routeviews data, that, if you take all of the Routeviews data and assemble it, you can actually make a movie of what's happened, which I have done. So, as this runs, we'll actually just quickly go through what you're seeing. This is the IPv4 space. Here is 16, 32, 128, net 192. Here is the top space, the old class E and class D. That's the entire space. What I've done is just use the /8s. Originally in 1993, IANA was simply allocating Class As, class Bs, class Cs. 
The red vertical bar is actually where it gets allocated into what is going to become an RIR. In 1985, we haven't even invented the acronym. If you can just see it, the green bits which slowly occupy the red bars are when the RIRs pass the address out. So it's the actual allocation. Down the bottom is the autonomous system space, so AS numbers from 0 up to 65,000 and there you see a couple of meters indicating some sort of spurt rates at how fast we're going through. We're up to 1986/1987 and the network was actually classful. The space is given out to various folk, including universities. If you notice, most of the activity was in the class B space. There was a lot of activity too up around 192 but each allocation is so damn small that at this scale of the entire address space, you can hardly see it. WIDE started doing it around 1986, it might have been JUNET then. You're now seeing phenomenal activity in the class B space, a really phenomenal amount of activity. So much so that, by about now, it became obvious that, if we're going to persist we've run out of Bs 10 years ago. These were just running riot and, by 1990, 1991, the academic and research bodies were running through and getting - you see how fast those green lines are going. Not only were they getting through it, but the registry folks, at that time, network solutions were working extraordinarily effectively. Down here, we started using BGP and what you actually see is BGP is slowly growing in terms of the autonomous system numbers. But, you know, the address space is just romping through. This became a problem. And what you actually see is that, at this point, we're all going to meetings talking about how we were going to stop ourselves crashing through either a routing or address space explosion. So the Bs are now getting very, very full, as these green lines of allocation. 
And, by late '93, early '94, it was time to actually devise a different mechanism and we started building classless into the main routing. With classless interdomain routing, we started occupying that top space there. And you notice now the activity there is slowing down and over here the activity is rising. Also, the number of players - remember 1995? Most of you might. The number of players is starting to increase. The stuff is getting commercial. The number of Autonomous Systems in play was really starting to move. So this thing is growing very quickly, whereas CIDR is putting some brake on the amount of consumption of address space. So now we've seen a lot of the old /8s being used, a lot of the class Bs and now most of the activity is up there in CIDR. So the next bit to introduce which happens in 1997 as you heard is that Routeviews started collecting true rate routing data. What is allocated is not what's routed. What's routed is actually a subset so what I've done is overlay on this what's actually routed. Most of the old class As aren't in the routing table. A huge amount of space is just lying there, moribund is the word that springs to my mind. I can't see most of them and there's a flapping /8. Why people do this I don't know but Routeviews is seeing this appear and disappear. The old class B space, half of it is being hoarded. Where the RIRs are actually active in this point with CIDR-based allocations, what you're seeing is that the majority of it is routed. The light green is unrouted. The dark green is what I'm seeing. At the ASs too, we're seeing relatively slow growth in the AS space. The ASs are actually just romping through. And the other thing is no-one likes an old AS. When an AS is about three years old, you throw it away and get a new one. These old ones, 10,000, 5,000 or so, are heading off to the great AS graveyard in the sky. We're interested in the big numbers. 
Now we're round about an 80% advertising rate on recent autonomous system numbers. The old ones - now they're occupied by the old 64 /8. Those policies that got introduced about, if you get it, you advertise it, is actually largely there but what is actually being allocated appears in the routing table. Routeviews does collapse from time to time. When the thing goes all light green, it's a gateway anomaly. We're back to recent history, September 2004. Old AS numbers are still heading to the AS number graveyard in the sky. Everyone wants a new one. Where we were a week or two ago is where we are here. The B space is the space of a large amount of unused resource. The old A space is a large amount of unused resource. There is actually quite a deal of life in that table, more than one would expect by even looking at allocations, as to what we're actually using in the public network. And even in the autonomous systems space, there's a little bit under half left, which would actually take some time to consume at this current rate and a huge amount of space to find the AS graveyard in the sky and grab some back. Questions? OK, thank you. APPLAUSE PHILIP SMITH Thank you very much, Geoff. RANDY BUSH Due to time constraints, I'm going to skip the first two of my talks - the first one of my two talks. I'm Randy Bush from IIJ. Please note the word 'early'. That's a very important word here. The graduate student is sending us information as we sit at the podium. OK - oops! That's not enjoyable. (Bottom line of slide not appearing) Anybody have any ideas how to recover the bottom line? Please tell us. OK. Why do we do this? In a NANOG meeting, Revising Mark Koshers did a presentation where they essentially said - do not (Terry Manderson makes some adjustments to projector) and they didn't! TERRY MANDERSON Do you have a different resolution? RANDY BUSH Yeah, I could probably do lots of stuff but there's only so much time that's worth playing with it, you know? This looks a bit better. 
So where they said, if you follow the URL, on foils 27 to 29, they said the jitter was so much, don't run anycast with stateful transport. Anycast is not sufficiently reliable to keep a long-term connection. But for almost a decade, there have been reports of successful delivery of stateful services over anycast. Something's wrong. Was their measurement from an abnormal vantage point or are there other things happening? So we set up a little experiment and we sent out an e-mail to the mailing list with a little test program and volunteers at hundreds of hosts around the world ran this multiday experiment for about a month. And every two seconds from each of them, it probed the anycast root servers by doing a dig at that root server - that X is, you know, a CM - the anycast ones - to find out which of the specific instances of the anycast root server it hit so, for instance, if it went after i.root-servers.net, it would get back either Stockholm or New York or whatever. So it knew which one it hit. And it did it with both UDP and TCP and the results were collected on a central server and they're being analysed as we speak. And the results from a particular host look like this - a time stamp, whether it was UDP or TCP, which root server and which instance. So you can see we have Palo Alto, etc, etc. I want to warn you - this is not about the reliability of root servers, OK? They were probably doing just fine. It's about things we don't know about BGP and IGP routing. OK? This is about routing. The effects you are about to see are probably caused by eBGP between ISPs, iBGP within an ISP and the IGP within an ISP. OK? This is the view from one AS. A block scale across the bottom. This says about 37 switches, changes of which one the probe hit, occurred for root server I every, oh, I don't know, 40 seconds or 50 seconds. Here's root server F etc. 
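(The switch counting described here can be sketched in a few lines of Python. This is only an illustrative sketch, not the analysis code used in the experiment: the record layout of time stamp, protocol, root server and instance follows the description above, but the function name `count_switches`, the tuple format and the sample values are all assumptions made for illustration.)

```python
from collections import defaultdict

def count_switches(records, protocol="udp"):
    """Return {root_server: [seconds since the previous switch, ...]}.

    records is an iterable of (timestamp, protocol, root_server, instance)
    tuples. This layout is a hypothetical stand-in for the probe log.
    """
    last_seen = {}             # root -> (current instance, time it was first seen)
    gaps = defaultdict(list)   # root -> time elapsed before each switch
    for ts, proto, root, instance in sorted(records):
        if proto != protocol:
            continue
        if root not in last_seen:
            last_seen[root] = (instance, ts)
            continue
        prev_instance, since = last_seen[root]
        if instance != prev_instance:
            # The probe hit a different anycast instance: that is one switch.
            gaps[root].append(ts - since)
            last_seen[root] = (instance, ts)
    return dict(gaps)

# Made-up sample: one probe every two seconds, instance names invented.
sample = [
    (0, "udp", "i", "stockholm"),
    (2, "udp", "i", "stockholm"),
    (4, "udp", "i", "newyork"),
    (6, "udp", "i", "newyork"),
    (8, "udp", "i", "stockholm"),
    (0, "udp", "f", "paloalto"),
    (2, "udp", "f", "paloalto"),
    (4, "tcp", "i", "stockholm"),
]
switches = count_switches(sample)
```

(A root server that never changes instance does not appear in the result at all, which corresponds to the stable case. The length of each list is the number of switches and the values are how long each instance lasted before the switch.)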
So this is the frequency of -- What you're seeing there is 35 switches at a frequency of less than a minute reaching root server I. This is what we would like to see. It always gets the same one. Routing was rock stable. Remember all these measurements from these different ASs were taken during the same time period, so the differences we are seeing are the differences of how a particular AS experiences routing. Here's a bad one. Notice it sees the same - what's nice is it sees the same things, generally, across all the different anycast root servers. So, you know, the K-Root server, the RIPE root server, it's getting 70 switches every about 10 seconds. No - it got 70 switches within 10 seconds of the last switch. (TEXT MISSING HERE) PETER SCHOENMAKER You get some oscillation in your traffic engineering. RANDY BUSH Let me insert some optimism. I'm going to go through this very quickly because we're almost out of time. This is the only presentation - this one makes clear BGP noise is not a good predictor of bad packet delivery. OK. Many of you will see on the first experiment we conducted which said - I will do this quickly, especially for those people who have a hard time with my accent. What's the relationship between the control plane, i.e. routing, and the data plane, packet delivery? The fact that there are a lot of BGP updates, is it good or bad? OK. People say Internet routing is fragile, collapsing, BGP is broken, Day X was a bad routing day, et cetera. How do we measure routing? We're told a lot of BGP updates is Internet instability. There are too many BGP updates so BGP must be broken. OK. I don't think so. I think BGP announcements could be like white blood cells. That they show that there is a disease, but they are actually fighting it. They're part of the cure not the problem. And that routing quality, what's good routing is how can we say it's good unless we have metrics. 
I don't think we should assume that the number of prefixes, the speed or completeness of convergence are measures of routing quality. The goal of routing is to get the customers' packets there. Reliably. So I think we should measure whether the users' packets reach their destinations. If the users' packets reach their destinations we jokingly call them happy packets and the routing system must be working. And there are very well known and rigorous metrics for measuring packet performance. They are delay, drop, jitter and reordering, very formal ways of measuring. We set out to measure the control plane quality by measuring the data plane. We did this other experiment that I reported previously but many of you didn't see it where we had a BGP beacon making announcements out to the Internet and withdrawals. And hundreds of nodes streaming data towards there and we measured delay, drop and jitter. But that's an artificial experiment, not a real event, OK. But we measured the performance, we found no significant correlation between the number or time of updates and data performance, but this was artificial and it didn't check real events on the real Internet. We said, what are the real events? People talk about the Code Red event, the Nimda event and the Slammer event. The messages mean the Internet was horrible that day. We got the Routeviews data, thank you, Joel, for the control plane. A number of BGP announcements and their frequency. We got the data plane, thank you Andre and the rest of the RIPE folk who are here for the RIPE test traffic measurement project which has many nodes scattered round the world but unfortunately mostly concentrated in Europe, we need more in Asia and the States. These boxes all send packets to each other. They use that formal measurement of delay, drop and jitter. And those data were made available kindly by RIPE to us, for the periods of Code Red, Nimda and Slammer. 
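(One simple way to put a number on "no significant correlation" between update counts and data performance is a Pearson correlation over matching time intervals. The sketch below is an assumption about how such a check could look, not the method used in the study; the function name `pearson` and both sample series are invented for illustration.)

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two series of equal length, at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("a constant series has no defined correlation")
    return cov / sqrt(var_x * var_y)

# Made-up per-interval values: BGP update counts with one large spike,
# and mean one-way delay in milliseconds for the same intervals.
updates_per_interval = [100, 120, 5000, 110, 130]
mean_delay_ms = [40.0, 41.0, 40.5, 39.5, 40.0]

r = pearson(updates_per_interval, mean_delay_ms)
```

(A value of r near zero would say the update spike did not move the delay, while a value near 1 would say the two rose together. The same function could be applied to drop or jitter series.)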
When we look at, for instance, Code Red, this is when it occurred, OK, we see the big spike in BGP updates and we see no significant change in delay of packet delivery. BGP, the routing system got hit, something hit it. But it worked and the customers' packets got delivered. Similarly, Nimda, big spike. By the way this line is at 1.96 standard deviations which is a 95th percentile, so 95th percentile confident - you know. BGP went to hell. Packets were happy. Slammer not. Slammer, BGP updates, mean delays. So packets were delayed, OK. Because of Slammer. Let's look a little at why. If we look at Code Red these are time series analysis, these are months here. So we're matching Code Red, Nimda on one slide and we see the delay in blue, OK, and we see no - that occurred before Code Red, it doesn't count. Something else made a radically significant effect on delay packet delivery but it wasn't Code Red and it wasn't Nimda. Packet loss, same story. The number of routing changes - aha, we see them! So that's Code Red and Nimda. But here's Slammer where we did see delay associated with Slammer. But we didn't see packets lost. So the BGP event, the event that caused the routing changes probably caused less efficient routing, the packets took a less efficient path, but they got there. The same thing - here's the routing changes, masses of significant routing changes, OK. So watching BGP update count or frequency is easy but it's not a good predictor. Measure performance directly, please. Here's how to measure them. It would be nice to have more test traffic boxes scattered through Asia and the States so we can. And thanks to - this was done by the way with Matt at University of Adelaide. Whoo! 2 minutes late. Questions? Thank you. APPLAUSE PHILIP SMITH Thank you very much, Randy. If there are no other questions then this is the end of the routing SIG. I would like to thank all the speakers very much for giving their time to give us presentations. 
If any of them haven't already done so you can please give your slides to APNIC so they can go on the APNIC web site. I would like to thank you all very much for attending. I think these two sessions of the APNIC routing SIG have been successful so I hope we can repeat this when we next meet in six months wherever that may be. So thanks all very much and we'll see you soon! RANDY BUSH That was my question, by the way - do people like the expanded routing SIG agenda? Or would you rather it be short? PHILIP SMITH Everybody stayed so it must've been alright. Thanks very much. APPLAUSE