______________________________________________________________________

				  DRAFT TRANSCRIPT

				     Routing SIG

			    Wednesday 23 February 2005

					  2.00pm

______________________________________________________________________


PHILIP SMITH

We probably should make a start to this. I would like to welcome
you all to APRICOT 2005. This is APNIC 19. The Routing SIG session.

Just some administration issues before we start  - the chairs of
this special interest group are myself, Philip Smith, and Randy
Bush. You can reach us if you have any need to send us email (refers
to web addresses on a slide)

Hopefully useful and contributing appropriately to discussions about
routing in the Asia Pacific region.

Before I dive into the agenda, I just want to go through some general
administration. I want to remind you all that there is an onsite
notice board. Please check that. There is a jabber chat room which
you can participate in. There's also live text streaming. I'd like
to remind both speakers and everyone else who wishes to participate
in the session to speak nice and slowly and clearly so that our two
helpers here can understand what we're saying and convert the words
into writing on the screen for you.

I'd also ask anybody who wants to ask a question or make a comment
to please not shout out from the room, but come up to the microphone,
state your name and then ask your question.

There will also be a roving microphone. I think I'm using the wrong
microphone at the moment! (Laughs) but I'll hand this over to the
- John will be going around with the mike.

OK. So to the agenda for today. The session is in two halves. We've
got quite a large number of presentations so APNIC kindly agreed
to give us two 90-minute sessions, which we're appreciative of. In the
first session we'll be starting off with a presentation from Brian
Skeen, then we'll get stuck into BGP security, which is one of the hot
topics of the Internet at the moment. We'll have two speakers, Russ
Housley and Steve Kent, talking about that.

Then in the second session after the break, more reports into the
routing system. We'll discuss those more after the break but there
is a sample of what's coming in the second session this afternoon
(Refers to slide) so hopefully you'll stay for the entire afternoon
and help us out with this routing special interest group.


OK. So getting back to this session. First presentation I'd like
to invite Brian to come up and tell us about global IP network
mobility using BGP.


BRIAN SKEEN

Everybody hearing me OK?

(Pause)

OK. Good afternoon. My name's Brian Skeen. I work in the network
engineering department for Connexion by Boeing. It is a business
unit of the Boeing Company. We provide broadband data services to
numerous commercial, government, private customers. I'm going to
talk today about how we leverage BGP to provide a solution for
global network mobility, talk a little bit about some of the
challenges and some of the considerations that we looked at in doing
so. A quick outline  - give you some background on who Connexion
is, what we do, I'm gonna touch on the current mobility standards,
how they may or may not meet the needs that we have from a network
mobility standpoint. Talk about some of the unique challenges and
service considerations we have.

Moving at a high rate of speed, 506 miles per hour covering a large
geographic region on a daily basis. So some inherent challenges are
there, and why BGP? What does it offer, how do we leverage it in
our solution and of course questions.

Connexion, as I mentioned, we are a broadband data services provider.
At the bottom there you can see a sampling of our current customer
base. Service launched in May of this last year aboard a Lufthansa
flight. They were our launch customer. We have been expanding routes
to some of those other customers and others not listed there. That's
ongoing on a daily basis. Service is available on both Boeing and
Airbus aircraft, contrary to some belief. But it's dependent on the
airline and what aircraft they want to use to cover certain flight
routes. It's pretty evenly split.

What do we provide? Internet access. VPN support. It's like an
802.11 wireless hotspot you find in a book store or airport or
something of that nature. Later this year we'll be introducing
television aboard some Singapore Airlines flights that will expand
to other providers. To the airlines, you can read what we have there
but what we needed was a system that was robust and reliable.
Aircraft systems have to have a lifespan of about ten years. Taking
an aircraft out of service for maintenance or for a retrofit is
cost prohibitive from an airline perspective.

Primarily we use existing 802.11 wireless technology on board. The
reasons for that are probably fairly obvious: reduced weight, reduced
power, and again you don't have to take the plane out of service for
a retrofit.

Quick picture of what this looks like. You have an antenna profile
mounted on the top. We use existing geosynchronous satellites, we
lease numerous transponders from different providers around the
world. There is a data transceiver router function on board that
provides both the on and off routing, interconnection with the
satellite subsystems and also the interface to the wireless access
points. There can be up to 7 of them, wireless access points, that
is, depending on the configuration and the size of plane, those
sorts of things.

Quick look at the system architecture, talked  -

There is a core network unit which provides the cabin distribution
system interface. It's physical connectivity for the wireless access
points, provides the DNS, DHCP, things of that sort.

On the ground we currently have four operational ground stations
located throughout the world. A fifth is being brought online. That
will be done in April.

Each of those ties back to a main data centre on the west
coast. They each have a connection to the Internet. I'll talk about
that more later.

Give you an idea of our current service region. This is constantly
expanding to cover customer flight routes. This is showing the
existing ground stations, one here in Japan north of us. One in
Moscow, Russia, another in Switzerland, Leuk and one in the US just
outside Denver. The fifth one being brought online is on Vancouver
Island in British Columbia. That may be at the end of April. One
thing to note  - all of these ground stations are maintained as
separate BGP autonomous systems within our network.

Just a little bit about the current mobility standards and how they
may or may not be working for us. The current open standard for IP
mobility targets host mobility, relies on some level of mobility
support within a protocol stack and also relies heavily on tunneling
back to a home agent.

In a small geographic area where you have a more confined space, where
you don't have a platform that's moving outside that region frequently
or potentially spending 50% of its time out of that region, that
traditional model may work well. Our network is highly
mobile, it moves over a large geographic region throughout the course
of the day and back again and it spends about 50% of the time outside
of its home region. So you're looking at a potential backhaul.

Some of the network, NEMO basic support protocol recently  - that
starts to address some of our concerns although it deals with IPv6
primarily at this time and again talking here about IPv4 and still
relies heavily on IP tunneling.

What were some of our challenges? I've mentioned the platform-specific
challenges and the traditional network mobility model.

You could have up to a few hundred within a network depending on
the size and configuration of the aircraft. To give you an idea of
the scope  - a potential  - typical flight between Europe and Asia
could touch about 3 different ground stations and up to four
transponders. More likely probably two ground stations and two to
three transponders but there are some routes that do fall into that
category. That leads to what mobility tries to address in the first
place which is a similar user experience  - we have frequent changes,
point of attachment to the Internet. A transparent experience and
a seamless situation for the users is desired.

If you're looking at the standard mobility implementation, again,
I mentioned the fact that you have increased latency and the capacity
of the land service. This gives you a high level idea of what we're
talking about there. In this case, where you have a statically
homed plane, say in Europe, but it's operating out of this region
currently and it wants to reach a web site in this region, you can
get an idea of what some of the latency involved there would be.
Those are some references based on some testing we've done. You
can't get away from the satellite delay. That's roughly 550
milliseconds, something of that order. You start to get into some
back and forth across the ocean situations that drive up the
latency time. So you can approach 3 seconds in some cases.

What we set out to do was to find a better way to do that. Try to
leverage some existing technologies, what we had available to us,
to find a better path, reduce that latency, reduce the requirements
on our land links and hopefully improve the overall reliability.
If we reduce latency and  - in the time it takes to move back and
forth across the ocean the network should be more reliable and be
a better user experience. We also wanted to leverage existing
technologies. We didn't want to have special requirements that we
had to go to our providers. We didn't want to have any special
requirements or situations that they would have to deal with. That
basically led us to a conclusion that we need to follow the geography
of the plane. And how do we do that?

Leverage BGP. At this point we leveraged the fact that it's supported
natively, everywhere on the Internet, that we have the ability to
control the routes that we use within the Internet as a whole and
not just within our network, and that it allows us to selectively
advertise those aircraft routes as they move.

Going back to the other example of where that leaves us. Now we're
talking about a situation where a plane is dynamically homed to a
particular gateway when it's in that region. You can see the latency
savings in doing that. You don't get away from the satellite latency
but you try and get a web site directly in Asia then you're there.
You can see the change there.

Just a quick diagram of how this works. A representation of a few
of the gateways, all have ties to the Internet as well as back to
primary data centres in the US. Essentially what happens is as the
plane moves into certain regions, we will make a selective route
advertisement from that ground station and all passenger traffic
will be sent directly to the Internet from that location, to and
from.

As the plane transitions and moves, let's say to the European region,
we do a selective withdrawal of that route and will readvertise it
out of the European region. Traffic goes in and out of the providers in
that region. Each ground station is only serving or advertising the
routes for the aircraft that it's currently serving.
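
The selective withdrawal and readvertisement described here can be
illustrated with a minimal Python sketch. It is not Connexion's
implementation: the class names, the AS numbers (taken from the range
reserved for documentation) and the prefix 192.0.2.0/24 are all
assumptions made for the example, and no real BGP session is involved.

```python
from dataclasses import dataclass, field


@dataclass
class GroundStation:
    """One ground station, modelled as its own BGP autonomous system."""
    name: str
    asn: int  # hypothetical AS number from the documentation range
    advertised: set = field(default_factory=set)


class MobilityController:
    """Tracks which ground station currently originates each aircraft prefix."""

    def __init__(self, stations):
        self.stations = {s.name: s for s in stations}
        self.serving = {}  # aircraft prefix -> name of the serving station

    def handoff(self, prefix, new_station):
        """Withdraw the prefix at the old station, then advertise at the new one."""
        events = []
        old = self.serving.get(prefix)
        if old == new_station:
            return events  # same station: nothing changes towards the Internet
        if old is not None:
            self.stations[old].advertised.discard(prefix)
            events.append(("withdraw", old, prefix))
        self.stations[new_station].advertised.add(prefix)
        self.serving[prefix] = new_station
        events.append(("advertise", new_station, prefix))
        return events


stations = [GroundStation("Ibaraki", 64496), GroundStation("Moscow", 64497)]
controller = MobilityController(stations)
first = controller.handoff("192.0.2.0/24", "Ibaraki")
second = controller.handoff("192.0.2.0/24", "Moscow")
```

After the second call only the Moscow station holds the prefix, which
mirrors the idea that each ground station advertises only the aircraft
it is currently serving.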

Just a quick network diagram to show you what the ground station
network looks like. Only necessary to retain dual ISP. We have a
set of routers here that are primarily responsible for the dynamic
injection and withdrawal. Those route servers have a back-end tie
to some of our satellite subsystems and allow us to very dynamically,
and soon to be fully dynamically, move those routes around.

Just talk a little bit about some of the challenges. I think we're
probably all aware of these. The first being a /24 network propagation.
We do use /24 public address block on each aircraft today. There
are concerns within the Internet about the growing number of routes in
the default-free zone and obviously this will be part of it. We're
aware of that. We've investigated this as we went through how we
would leverage BGP and what solution we'd take. Basically we found
in discussing this with our providers and in our testing that these
routes are being allowed, they're not being filtered or aggregated
for the most part and we haven't seen any operational issues at
this point of using this /24. It is possible of course and we realise
that. In the event that something like that did happen from a
provider somewhere in the deep corner of the Internet, that they'd
get aggregated then we also advertise an aggregate block for all
our public address ranges and we advertise that from a central
location on the west coast. So it would provide a path back to our
network backhauled if we needed to accomplish that.
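
The safety net works because routers prefer the most specific matching
prefix. The following Python sketch illustrates that rule using the
standard `ipaddress` module. The private address blocks and the origin
labels are assumptions for the example, not the real address ranges.

```python
import ipaddress


def best_route(destination, routes):
    """Return the most specific (network, origin) covering destination, or None."""
    addr = ipaddress.ip_address(destination)
    matches = [(net, origin) for net, origin in routes.items() if addr in net]
    if not matches:
        return None
    # Longest prefix wins, as in ordinary IP forwarding.
    return max(matches, key=lambda item: item[0].prefixlen)


# Hypothetical blocks: one covering aggregate and one aircraft /24 inside it.
aggregate = ipaddress.ip_network("10.20.0.0/16")
aircraft = ipaddress.ip_network("10.20.30.0/24")

# A network that sees both routes sends traffic to the regional ground station.
full_view = {aggregate: "west coast data centre", aircraft: "regional ground station"}

# A network that filtered the /24 still reaches the aircraft via the aggregate.
filtered_view = {aggregate: "west coast data centre"}

direct = best_route("10.20.30.7", full_view)
backhauled = best_route("10.20.30.7", filtered_view)
```

In the filtered view the traffic still arrives, only by the longer
backhauled path through the central location.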

A couple of other challenges  - common question we get  - BGP
convergence versus the satellite handoff. Are they complementary
or do you have one longer than the other? The short answer is yes,
they are complementary. You're looking at about a minute's worth of
time for satellite handoff to occur. In some cases that's just a
little bit less. Our testing has shown that the BGP propagation
convergence does occur within that time. We have not really had
operational issues as far as the satellite coming up in two-way
communications and not having the BGP routes converged and ready
to route properly.

Prefix churn is another one. You're changing these routes on a daily
basis, but under normal circumstances you're really only talking about
a route change for a particular flight once, say, every 12
hours. You're talking about transoceanic flights, and based on the
satellite coverage map it's probably somewhere in the order of 10 to 12
hours, so you're really not approaching that threshold. As a
percentage of the total churn we kind of figured that our part in
that was probably somewhere around 1/10th of 1%. Don't quote me on
that  - but fairly small. Prefixes could have an inconsistent origin
based on where they were advertised, and it changes. No operational
issues with that either. We haven't seen that in any case.

A quick idea of what this looks like. This was taken from a Lufthansa
flight in November from Tokyo to a point in Europe. We used some
of the BGP data modelling tools, BGPlay in this instance, we'll
hear about that later. It's provided by the Routeviews project and
we use it to observe the behaviour during the satellite handoffs.
It's basically a collection of routers that collect realtime BGP
session data and show you an animated view of how your prefix is
treated.

Here is a screen shot of it. Just shows AS numbers, there are a
few of them there. It shows you that currently the prefix here,
216, 65 is one of our ranges, one of the commercial aircraft. It
is being advertised out of the Ibaraki ground station.

This happens from the prefix situation. Here you are observing the
fact that it's been withdrawn by the route servers in Ibaraki and
being relearned via the Moscow ground station.

The small lines, thin lines, imply an implicit route change,
the wider lines that flash up are more of a route withdrawal, more
of an explicit change.

Each of the end points here on the map show different AS numbers
and their view of this particular prefix from where they stand. So
that's just a redraw of the map there to show you how  - this
particular prefix is being worked.


SPEAKER FROM THE FLOOR

How long does that take?


BRIAN SKEEN

40 seconds.

Then you would see it's converged, it's going through our two ISP
providers in Moscow. I have another one that would show the Moscow
to Leuk handoff. Again very similar. We'll just show it converges
through the two ISP providers there.

Another common question that goes back to the churn. The prefix and
the dampening, again, we have not observed that in normal operational
circumstances. We did some pretty substantial testing on it as well
as talking to the providers. Part of that testing showed that after
about 5 changes in a short amount of time you will potentially see some
dampening of that. We have a safety net: the aggregate being
advertised allows a backhaul through our network.
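
As a rough illustration of why occasional handoffs stay clear of
dampening, the sketch below models the usual penalty-and-decay scheme
in Python. The penalty per change, suppress limit and half-life are
illustrative assumptions, not values reported in the talk, and the
number of changes needed to trigger suppression depends entirely on
what a given provider configures.

```python
def flaps_until_suppressed(flap_times, penalty_per_flap=1000.0,
                           suppress_limit=2000.0, half_life=900.0):
    """Return the 1-based index of the flap that triggers suppression, or None.

    flap_times are seconds, in increasing order. The accumulated penalty
    decays exponentially between flaps with the given half-life.
    """
    penalty = 0.0
    last = None
    for count, t in enumerate(flap_times, start=1):
        if last is not None:
            # Exponential decay since the previous flap.
            penalty *= 0.5 ** ((t - last) / half_life)
        penalty += penalty_per_flap
        last = t
        if penalty >= suppress_limit:
            return count
    return None


# Changes one minute apart accumulate penalty faster than it decays.
rapid = flaps_until_suppressed([0, 60, 120, 180])

# One change every 12 hours decays almost completely between changes.
twelve_hourly = flaps_until_suppressed([0, 43200, 86400])
```

With these assumed parameters a route changing once every 12 hours
never accumulates enough penalty to be suppressed.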

Handoff within a ground station would not propagate a new advertisement
to the Internet. That's mostly to the ground station.

A couple of things we're looking at: dynamic prefix advertisement.
It's a semi-static assignment to the aircraft. Shortly it will be
more dynamic. Regionalisation of the space, if we have a particular
aircraft or a group of aircraft operating primarily in a region.
IPv6, we are working on that internally and with our customers, realising
the benefits that provides. Connexion does have IPv6 space so that's
something we're working towards.

In conclusion  - a couple of points there of how we're using BGP
and where it's working for us currently. Only suitable for /24 and
larger networks so far.

Any questions?


PHILIP SMITH

If you have questions can you please use the mike and state your
name.


RANDY BUSH

Randy Bush. I have two questions regarding the implications of the
fact that you're announcing a prefix from one AS then moving to
another. 1. What does that mean in terms of policy as we know it
today? And 2. How are you going to handle that when something like
sBGP is...


BRIAN SKEEN

In terms of policy today we did consider that, look at what that
would mean from having an advertisement from two different spots
at once. We talked to the providers and tried to understand from
them what that would mean from an operational perspective. Having
not really seen any operational issues behind that we haven't delved
into I guess how it would be to handle that. What changes we need
to make as far as how we would treat those routes based on how we
haven't seen anything in the operational issues.

The sBGP issue we are working on now. Kind of under development
so I don't have any specific answers for how exactly we're going
to deal with that.


SPEAKER FROM THE FLOOR

Steve Kent, BBN.

There is a mechanism in sBGP for the holder of an address prefix
to authorise an AS to advertise it as the originator and you
just have to have those address blocks authorised for each of the
ground stations then change them over time. Because you know what
they are, I don't see a problem at all.
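
The idea of authorising several origin ASes for one prefix can be
sketched as a simple lookup table in Python. This shows only the data
model: real sBGP relies on digitally signed attestations, which are
omitted here, and the class name, prefix and AS numbers are
hypothetical.

```python
import ipaddress


class OriginAuthorisations:
    """Records which ASes a prefix holder has authorised as originators."""

    def __init__(self):
        self._allowed = {}  # network -> set of authorised AS numbers

    def authorise(self, prefix, asn):
        self._allowed.setdefault(ipaddress.ip_network(prefix), set()).add(asn)

    def revoke(self, prefix, asn):
        self._allowed.get(ipaddress.ip_network(prefix), set()).discard(asn)

    def is_valid_origin(self, prefix, asn):
        return asn in self._allowed.get(ipaddress.ip_network(prefix), set())


auth = OriginAuthorisations()
# The holder authorises every ground station AS in advance (numbers assumed).
for station_asn in (64496, 64497, 64498):
    auth.authorise("192.0.2.0/24", station_asn)

known_station = auth.is_valid_origin("192.0.2.0/24", 64497)
unknown_as = auth.is_valid_origin("192.0.2.0/24", 64511)
```

Because the set of ground stations is known ahead of time, the prefix
can move between them without any origin looking unauthorised.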


BRIAN SKEEN

Thank you.


GEORGE MICHAELSON

George Michaelson from APNIC. I'm interested if you had FCC and
FAA compliance issues in terms of process signoff and whether one
possible direction is that there will be flight process centric
activity or whether this is seen by people like the FAA as purely
an information services, not something that could become critical
for routing or flight management or an aspect of plane rather than
entertainment? That's the first question.


BRIAN SKEEN

I would say a little of both. We had input on both aspects. Right
now it's viewed as an informational service. It's kind of an extension
to a flight entertainment system. We are looking at some e-enabled
initiatives that provide some functionality or services to the
airlines themselves and to the aircraft and those deal with real
time-type data both from a security stand point, just a flight op
stand point, airport information those sorts of things. When you
get into that, yes, you start getting into FAA and FCC issues and
conversations.


GEORGE MICHAELSON

We shouldn't expect to see aerofoils wiggling as a result of this?


BRIAN SKEEN

The joke is you move the mouse and the plane banks right or left
- nothing like that.


GEORGE MICHAELSON

This is in some ways an observation of something Randy's commented
on.

Randy at different times has talked about beaconing and how single
announced withdraw events can demonstrate this amazing cascade of
activity in the global network in the global default-free zone. I
think your diagram showed that. You had something that was a handoff
event that in one critique is internal and it had this amazing
explosion of transactions in the global DFZ. Your decision to avoid
the dog-leg of a single attachment to the network appears to have
a consequence for global visibility of route changes that
seem to be about your IGP. The observation is that the dynamism here
is in the global net. Is that really what you want?


BRIAN SKEEN

That's an interesting question. I see where you're going with it.


GEORGE MICHAELSON

Maybe one for beer!


BRIAN SKEEN

It might be one for off-line. Interesting comments. There are a
number of outside - of just a technical reason why we do  - of
course there are business reasons and all the cost reasons that go
along with that. So there were considerations taken of how we do
it, what we really want to do. I'd be happy to talk to you offline
about this.


OSAMA DOSARY

My question is about what method are you using to perform the BGP
session handoff or multiple BGP sessions handoff?


BRIAN SKEEN

We use the route server capability. That's a combination of some
open source with proprietary customisation code that we use. It has
a back-end to the satellite subsystem and allows a trigger, in other
words, to signal that change, that handoff.


OSAMA DOSARY

Is it triggered by BGP itself or do you have like an open session
that's idle until you move it  -


BRIAN SKEEN

It's driven by BGP.


OSAMA DOSARY

I'm guessing there's a router on the plane and this has one session
or multiple sessions that are idle until it moves into another
footprint?


BRIAN SKEEN

BGP does not extend over to the plane. It's all done from the ground.
You have ground-based satellite components that are communicating.
They're in control of what's happening with the aircraft and they're
the ones sending the trigger to ground-based devices to signal this
change.


OSAMA DOSARY

Thank you.


PHILIP SMITH

OK. I think for the interests of time we should probably move on
to the next presentation. So thanks very much for that, Brian. It
was a fascinating presentation. Thanks for the questions as well.

Our next speaker is Russ Housley, he'll be talking about BGP security.


RUSS HOUSLEY

Good afternoon. I'm Russ Housley. I have my own consulting company
called Vigil Security.

I was asked to come here and talk about BGP security. I understand
I'll be followed by Steve Kent to talk about secure BGP. I want
to provide a little motivation in the introduction. I want to follow
this with what I believe is necessary to have a solution to BGP
security. Finally, a quick summary of what has gone on in the IETF
as steps towards BGP security. As you'll see, my view of that is
- not enough.

BGP, as everyone in the routing system knows, is a critical component
of the routing infrastructure for the Internet. It's the basis for
all inter-ISP routing. Sadly, we all know that it's highly
vulnerable to human configuration error. In addition to
being fragile in the sense that humans make mistakes, those same
vulnerabilities can be exploited by attackers. We see commonplace
configuration errors. I think the one that I remember the most
happened shortly before I became Security Area Director. My predecessor
told the story of a new ISP coming online in Florida. They finally
got their first trunk to the Internet backbone and their brand new
router, they configured it, set it up, they were able to send pings,
so they thought they'd done a great job. So they set off to go have
a beer and celebrate. The problem was that they had misconfigured
their router and were advertising that they were the route for MIT's
network, so all of the traffic that was supposed to go to MIT was
suddenly going to Florida. They didn't do anything malicious, but
they were sucking all of MIT's traffic to Florida and preventing the
people from MIT from communicating.

This kind of thing is the kind of thing that I would expect BGP
security to prevent.

We've also seen BGP purposefully and maliciously attacked and I
think we're going to see more and more of the same. Spammers, for
example, take advantage of BGP vulnerabilities every day.

I'd like to see a comprehensive solution to BGP security.

A solution is not something that can be achieved simply. It requires
buy-in from many, many parties involved in routing. The vendors,
ISPs, subscribers, so that's one of the aspects that makes this
a particularly thorny problem  - we need to have all of these parties
involved to come up with a solution that everybody can accept and
employ. Which is not going to happen quickly, I would argue. Not
only will the development of the solution take time, I believe the
deployment of the solution will be one that takes time as well.

BGP is used by people for several different things. My view is that
when focusing on what part of the problem are we trying to solve
we need to focus on the things that affect your neighbours. The
internal uses of BGP one presumes are a local business matter but
the ones that affect your neighbours, regardless of your architecture,
are the ones that I am mostly concerned about. A misconfiguration
error within one autonomous system that affects all of its neighbours
is the kind of thing we need to worry about. A misconfiguration
that only affects yourself, well, that's your problem and not one
that I, at least in the beginning, want to attempt to address.

This is a simplified view of what an UPDATE message of BGP is about.
There is withdrawing routes which we just heard about. The one the
message was leaving issues a withdrawal.

The other is to advertise prefixes and the paths associated with
those prefixes. These two can be piggy-backed together into one
message, as long as the routes that are being withdrawn are not
the same as the ones that are being readvertised or changed.
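
The simplified UPDATE structure just described can be written down as
a small Python data class. This is an illustrative model only, not the
wire format: the field names are assumptions, and real UPDATE messages
carry further path attributes that are left out here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Update:
    """Simplified BGP UPDATE: withdrawn prefixes plus announced prefixes and a path."""
    withdrawn: frozenset = frozenset()
    announced: frozenset = frozenset()
    as_path: tuple = ()

    def __post_init__(self):
        # The same prefix must not be withdrawn and announced in one message.
        overlap = self.withdrawn & self.announced
        if overlap:
            raise ValueError(f"prefix both withdrawn and announced: {sorted(overlap)}")
        # Announced prefixes need a path to go with them.
        if self.announced and not self.as_path:
            raise ValueError("announced prefixes require an AS path")


# A withdrawal and an announcement for different prefixes share one message.
combined = Update(
    withdrawn=frozenset({"198.51.100.0/24"}),
    announced=frozenset({"192.0.2.0/24"}),
    as_path=(64496, 64500),
)
```

Building an `Update` that withdraws and announces the same prefix
raises `ValueError`, which reflects the piggy-backing restriction.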

If you look at a simplified view of the processing associated with
these UPDATE messages, the first thing that is dealt with is that the
adjacency routing information base, the Adj-RIB, is updated. From anyone  -
anyone who runs a network knows there is filtering applied at this
stage which is necessary in the way networks are run today, but is
not part of the BGP specification itself. So then the information
from those adjacency RIBs is put together in the routing algorithm so
for a particular UPDATE at this point it is determined whether it's going
to affect the local routing information base or not. Many changes
that neighbours make will not affect which paths are going to be
used, in which case the message doesn't affect the behaviour of
transit traffic. Then you apply to that the local rules, which have
to do with the way the business aspect is being run.

Then finally, that view of the local  - if the UPDATE is going to
affect transit traffic then the local RIB is updated.

Then finally you make a decision as to whether you want to share
that information with neighbours or not. That is basically a local
decision as to whether you want to share that information or not
and there's two kinds of sharing that can be dealt with. One is you
can share and say, this is only for you and not to be passed on to
others. So that is the no-export version. The other is without the
no-export you send it to your neighbours and there things are
propagated on, if they in turn choose to do so.
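
The processing steps above can be sketched in Python as follows. This
is a deliberately reduced model under stated assumptions: the function
names are invented for the example, the shortest AS path stands in for
the real decision process and local business rules, and no-export is
treated simply as "do not pass on to any neighbour".

```python
def process_update(neighbour, prefix, as_path, adj_rib_in, loc_rib, import_filter):
    """Apply one announcement; return True if the local RIB changed."""
    # Operator filtering: common practice, but not part of the BGP specification.
    if not import_filter(prefix, as_path):
        return False
    # Record what this neighbour told us for this prefix.
    adj_rib_in.setdefault(prefix, {})[neighbour] = as_path
    # Stand-in decision process: prefer the shortest AS path.
    best = min(adj_rib_in[prefix].values(), key=len)
    if loc_rib.get(prefix) == best:
        return False  # many updates do not change the path in use
    loc_rib[prefix] = best
    return True


def export_targets(neighbours, received_from, no_export):
    """Decide which neighbours hear about a route that changed locally."""
    if no_export:
        return []  # "only for you, not to be passed on"
    return [n for n in neighbours if n != received_from]


def max_24(prefix, as_path):
    """Hypothetical import filter: reject anything more specific than a /24."""
    return int(prefix.split("/")[1]) <= 24


adj_rib_in, loc_rib = {}, {}
changed_first = process_update("A", "192.0.2.0/24", (64496, 64500),
                               adj_rib_in, loc_rib, max_24)
changed_longer = process_update("B", "192.0.2.0/24", (64497, 64498, 64500),
                                adj_rib_in, loc_rib, max_24)
changed_filtered = process_update("B", "192.0.2.128/25", (64497,),
                                  adj_rib_in, loc_rib, max_24)
```

Only the first announcement changes the local RIB: the second offers a
longer path and the third is dropped by the filter before it is stored.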

My understanding of the specifications is that each AS along the
path is assumed to have been authorised by the AS that preceded
it in the path to advertise those prefixes. So, that means that if
you're going to see this long path of AS numbers, each one in
there is willing to share that information at least with the recipient
of that UPDATE message, and no further if no-export is set.

There is an assumption that the first AS number in that  - in that
path is authorised to advertise the prefixes by the holder of those
prefixes. A route may be withdrawn only by the neighbour AS
that advertised it. So if any of these assumptions are violated
then BGP becomes even more fragile and vulnerable to even more forms
of attack, so you need to examine these assumptions and if they
don't match the way people are using the system today then we need
to figure out what we're going to do, but this is my analysis of
the situation today.

The notion of a best route is not  - is primarily a business decision.
It represents the decisions of ISPs as to who is going to be the
one to hand over which traffic and we're seeing more and more work
being done in the traffic engineering area. Most recently the IESG
has been approving a lot of MIBs being associated.

Different routes may be allocated to different neighbours based on
those local policies.

Looking back at the UPDATE messages it means different neighbours
can receive different sets of those UPDATE messages and that's how
you enforce those local policies. It often leads to asymmetric
routes.

Private peerings make it a situation that not all routes are visible
to all the parties on the Internet.

With that kind of foundation I want to talk about how to proceed
towards BGP security. So what does an attacker want that a BGP
security solution would prevent? The attacker may want to degrade
service, either in one particular part of the Internet or the
Internet as a whole. If they do  - they can do that by any protocol
that is going to affect the CPU of the router. BGP is just one of
those. SNMP is another. But we want to make sure BGP doesn't become
the mechanism by which it's easy to mount a denial-of-service
attack on a router. As you start adding security mechanisms such
as digital signatures that require more processing, we need to be
careful to make sure that they're done in such a way that things
that have problems with them can be discarded early in the process
as opposed to hanging around.

An attacker may also want to reroute subscriber traffic, perhaps
for passive eavesdropping or perhaps for active wire tapping. These
kinds of things would allow them to examine the subscriber's traffic,
then they can just pass it on if they want to listen in. They could
modify the traffic and pass it on, if they are trying to be especially
malicious, or maybe they just want to delete certain traffic. Just
the ones with particular host addresses, for example.

If you can get into the routing system you want to masquerade as
particular traffic. Perhaps just the traffic from the DNS server
associated with a particular organisation.

As I've already said, the BGP architecture is already such that it
makes it highly vulnerable to human errors and malicious attacks.
Attacks against the links, against the routers themselves and the
management as well. The implementations themselves are susceptible
to denial-of-service attacks. Some routers are relatively easy to crash if
you know the right message to send. If they don't crash they spend
a lot of time dealing with that traffic, which prevents
them from handling other traffic that is likely to be more important.
So what we see is those filters that are being applied by router
operators in order to protect themselves against this kind of traffic
and sometimes configuration errors. These are the filters we were
talking about.

The creation of these filters, there's no standard way to do it,
different router vendors handle it differently. Some of them are
extremely difficult to handle, they're very time consuming in terms of
the person who tries to structure them correctly, and thus they often
are difficult to get just right. About the time you get it right the
situation changes and that's when you edit the filter.

So, why is this a particular problem today? Are people really
exploiting BGP? There's some DARPA-sponsored research that discovered
- affecting about 1% of all the routing table. That's a concern
with the human error aspect of this. We know that BGP attack tools
have been developed and demonstrated at hacker conferences, so just
by having people go to those conferences you can learn about that.

The question is, are those tools really being used in the live
Internet? The answer is yes. We see it all the time in terms of ISP
routers that get attacked. Then the compromised routers become the
launch point of BGP-based attacks against other routers. Basically,
enable passwords are very valuable. I talked earlier about how
spammers are using address space that has been allocated but is
not currently in use. They send BGP UPDATE messages, say with a /24
prefix, and create a chunk of that address space basically anywhere
they want on the Internet. Send a couple of gigabytes of spam, then
withdraw the route. So they've created a little subnet wherever
they want, spew their stuff and then evaporate. All of the outraged
phone calls go to the people who own the address space that they
advertised, who have nothing to do with who actually mounted the
attack.

We've also seen BGP-based attacks that advertise the chunk of address
space associated with the DNS root servers. If you can be viewed
as the DNS root server you can associate any host name you want
with any IP address you want further down the tree. And you can even
do this selectively based on what the source address for the query
was.

This leads to what I believe are BGP security requirements for
solutions.

I believe we need to get away from a point where we are relying on
personal relationships between the people that operate various ISPs.
It's not that those relationships are bad, it's not that they aren't
going to continue, but I believe we need a technology-based solution
going forward that's going to scale as the Internet continues to
grow.

It's quite possible that some ISPs will not be trustworthy. We need
to be prepared to deal with that situation. We've certainly seen
that even those trusted people make mistakes so it would be nice
if a system aided in the detection of the mistakes they have made
rather than propagating them.

So we need to have a solution that has the appropriate binding to
the way BGP works. Elements of the security system need to exhibit
the same dynamic behaviour as BGP and yet be static in the parts
where BGP is static.

We need to make sure that the processing requirements and memory
of the solution scale, at the same time I'm not sure that it's
possible to accommodate every piece of installed base equipment,
some additional processing and some additional memory are certainly
going to be required to implement a security solution.

It's vital, I believe, that we accommodate the incremental deployment
of a solution.

The principle of least privilege is something that security
professionals have valued for a long, long time. Basically each
system element should only be granted the permissions necessary to
perform its piece in the overall system.

That basically flies in the face of the whole concept of UNIX root.
If there is an all-powerful person then that's the person who you
are going to pass the money to when you want something done, whereas
if each person only has the privileges and capabilities to do the
things that are part of their job, or in this case each component
of the Internet routing system only has the privileges that are
appropriate for its role in the system, then that allows you to
put a bound around the amount of damage that compromising that
component can do.

This is one of the cornerstones of information assurance. You want
to apply that cornerstone concept to BGP. So the security failure
or benign error by one ISP or one subscriber does not propagate
beyond that person's sphere of influence.

Any security strategy for BGP should incorporate a fire break
approach so that the security failures and errors don't fall right
down like dominoes.

The dynamic versus static aspect is that you need to realise that
some things have local significance, some things have global
significance, some things are slow in terms of change to the system
and other things are rapid. For example, something that's slow and
of local significance is installing a new link. It takes a lot
of time to order it, put it in and all that stuff. The operational
staff rollover is another thing.

The other extreme  - something that has global significance, it
happens very rapidly, is a route change which is part of why I asked
the question of the previous speaker, when the aeroplane moved from
one ground station to the area of another one, he said that basically
all of the other ASes within the system noticed that within 40
seconds. So it had global significance and happened very quickly.

There's two aspects to improving BGP security. One has to do with
implementations today. And the other has to do with architecture.
Some implementation improvements are certainly possible. But I don't
believe that that is the whole story. Just changing and improving
particular implementations from particular vendors is not of itself
going to solve this whole problem. It may improve the denial of
service countermeasures within a particular router but it's not
going to secure the entire system. Architectural changes are going
to be necessary to do that. The two are both going to need to be
done to make the system secure and robust.

For every UPDATE that a router receives, it needs to be able to
verify that the holder of that prefix is an authorised origin, and
subsequent recipients of that information, the ones down the line
in the path, need to be able to look at it and confirm that it came
from an authorised place. So they need to be able to detect and
reject unauthorised routes, irrespective of whether they came from
misconfiguration or from malicious behaviour. Failing to do this,
a BGP speaker will be vulnerable to attacks that result in misrouting
of traffic one way or another.
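
(Illustrative sketch, not part of the spoken presentation. It shows
the shape of the origin check described above: given a table of which
ASes are authorised to originate which address blocks, reject an
UPDATE whose origin AS is not listed. The table, the function name
and the rule that more-specific prefixes inside an authorised block
are accepted are all assumptions made for illustration; the prefixes
and AS numbers are documentation and private-use values.)

```python
import ipaddress

# Hypothetical table of authorised origins: address block -> set of AS numbers.
AUTHORISED_ORIGINS = {
    ipaddress.ip_network("192.0.2.0/24"): {64500},
    ipaddress.ip_network("198.51.100.0/24"): {64501, 64502},
}


def origin_is_authorised(prefix: str, as_path: list[int]) -> bool:
    """Return True if the origin AS of the path may originate the prefix."""
    if not as_path:
        return False
    net = ipaddress.ip_network(prefix)
    origin = as_path[-1]  # each AS prepends itself, so the origin is last
    # Assumption: a prefix is acceptable if it lies inside an authorised
    # block and the origin AS is listed for that block.
    return any(
        net.version == block.version
        and net.subnet_of(block)
        and origin in origins
        for block, origins in AUTHORISED_ORIGINS.items()
    )
```

(A router or an offline filter generator could treat a False result
as "detect and reject", whether the cause was a typing mistake or an
attack.)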

Based on that one bold statement I think you can derive that the
important part is the verification of ownership and prefix holders,
then binding that BGP router to the ASes that it represents then
performing authentication of UPDATES and withdrawals.

Incremental deployment is, I think, vital to the way the Internet
works today. We do not have a flag day. Yet we need to make sure
that we provide a secure environment in adjacent ASes. I don't
believe we know how to improve things where we have an AS that has
implemented whatever security system we come up with, routed through
an AS that has not yet, and then to one that has adopted that
solution. With that one in the middle I don't think we can do
anything; that AS in between will be able to mount all of the
attacks that we witness today. However, two adjacent ones that have
implemented security solutions should be able to cooperate and then
add on other partners and so on and kind of grow out.

I want to just talk about two activities that are going
on in the IETF. First is a working group in the routing area where
security requirements are being handled. They've got one document
in the RFC Editor queue, and three documents progressing in terms of
vulnerability analysis and security requirements. So far they have not
done any work on protocol development. This working group will not
do that. The idea is once the requirements for security in routing
protocols are identified, additional working groups will be set up
to develop protocols. Within the PKIX working group RFC 3779 has
been developed for including information about IP prefixes and AS
identifiers in certificates. That is the first building block of the
solution that's going to allow a prefix holder to have a certificate
and then make that authentication available, which will in turn allow
recipients to make authorisations.  So while we don't yet have a
distribution mechanism or protocol to use these things I do think
we can take advantage of this one piece in order to help us solve
that problem where MIT's traffic was going to Florida. We can at
least know who owns which chunks of the address space even if we
have to in the near term distribute those through some repository
as opposed to the automated routing protocols.

My personal opinion is we shouldn't wait for the entire solution
to be put out in RFCs but we need to take the pieces that are
available and start doing what we can with them, and as we climb
this mountain the further up we get our view is probably going to
change about what those RFCs that aren't yet done ought to say
anyway.

Questions?


GEOFF HUSTON

Russ, I'm kind of interested in what role the RIRs play in this kind
of area and what are the injection points that are required in this.
I heard what you were saying about autonomous systems and being
able to understand who has that autonomous system when they're
injecting routes into the network as an originating AS, and the role
of address space, what prefixes they are injecting, where they are
coming from. What, in your view, is the role here for RIRs and more
specifically, what is their role in being a trust point in your
model of where secure BGP is heading?


RUSS HOUSLEY

I think there's two ways that things might go forward. I have a
personal preference, but I can live with either one. Maybe there's
a third one, I just don't know. The RIRs are the ones that are
passing out the chunks of address space so they I think are the
ones that need to issue these certificates, they're the ones who
know who the holder of that chunk of address space is. So
they are a key component of this being done properly. The question
is - what's the root of this certification hierarchy? My personal
preference is that it will be IANA. IANA issues huge blocks
to each of the RIRs and the RIRs in turn dole it out to
others.

The reason I like that approach is it's the simplest trust point
to implement in a router because it only has one root.

The alternative is that you view the RIRs as peers and let all of
the personalities, I guess is the word, sort it out, in terms of
making sure there's no overlap between chunks that they can hand
out.

Then teach the routers about trust anchors associated with each of
the RIRs as peers, so that requires a little bit more memory, but
may actually map to the way things are actually being deployed in
the real world today.  So I can live with either one but I'm kind
of focusing on the let's do it with the least memory we can.


GEOFF HUSTON

Thanks, I'd like to drill a little further down one direction. I
can understand the concept of an IANA root as being the derivation
of where these certificates come from. My follow-up question is
about the certificate structure and the world as we know it. The
RIRs hand out both addresses and autonomous systems to entities.
It's pretty clear that when an address gets handed out it might
originate from an entity that wasn't the precise recipient of the
subdelegations and so on. And the other thing is the autonomous
system is more likely to be originated from the entity that the RIR
handed it to. It's more about anchor points of injection.


RUSS HOUSLEY

That gets complicated with mergers and acquisitions. There is no
simplistic model that matches the real world, OK. I think we need
to embrace that complexity in terms of the certificate structure.


RANDY BUSH

Let's make it short out of consideration to the next speaker. Just
carve him up and let's be done with it!


STEVE KENT, BBN

I think there are two parts to the question. RIRs I tend
to think of as natural certification authorities in this process,
but ISPs also wind up being certification authorities when they
further hand out those prefixes to downstream providers or multihomed
providers, whomever. It's just continuing that simple tree structure
down as many layers as is necessary in pushing prefixes out to
holders. The other direction is a prefix holder authorising a given
AS to originate that prefix, which might or might not be the same
one. That's a different structure. That's what Russ was alluding
to on his slide, to say we need these other digitally signed things
to allow the prefix holder to indicate which AS or ASes should be
authorised to originate.

That's a separate data structure.


GEOFF HUSTON

Thank you.


RUSS HOUSLEY

One can do that with attribute signatures but they don't quite fit.
Maybe we don't want to go into all that today! You do need a data
structure that says who is currently allowed to advertise
this prefix.


GEOFF HUSTON

Thank you.


PHILIP SMITH

Thank you for the questions.


STEPHEN KENT

Thank you. What I will describe is a particular technical approach
to achieving the sorts of goals that Russ Housley alluded to in his
presentation. Fortunately, because of clever scheduling on the part
of our leaders here, Russ has done the kind of background that I
would normally have to do in terms of, you know, what's BGP about
and what are appropriate security requirements if you start with BGP
specifications and work down from those, so I'll just focus on the
technology that has been developed, which is a candidate technology
for doing this.

In the terms Russ provided, s-BGP is an
architectural approach. It doesn't say anything about bugs in your
implementation per se, but it has the opportunity to protect you
from implementation errors in other people's ASs. It is an extension
of BGP. It uses a particular standard facility in BGP to carry
additional information that's needed about paths in update messages
and it has an infrastructure component as well, so it requires some
infrastructure as well as some additional information to be pushed
along with the advertisements as they are sent. It also implies that
routers will do some additional processing in creating updates and
in accepting them, in processing them, in order to achieve security.
One thing that it avoids is something that Russ noted, which is any
notion of transitive trust.

Basically, this is something that follows the principle of least
privilege. The idea is that each autonomous system, and the routers
that represent it as border routers, should believe only what can be
proved to them through the mechanisms that s-BGP develops. So it's
not a question of, "Oh, the guys here know what they're doing. I'll
take their updates, process them and everything will be OK." That's
how we get into trouble today.

In designing s-BGP, we tried to design it so that the mechanisms that
are involved scale in the same fashion as BGP itself, including the
sort of things that Russ alluded to in terms of dynamics. Some of
the information that you need to verify for routing security in BGP
changes fairly slowly. Other parts change very quickly. So we have
different ways of disseminating data in s-BGP that try to track the
data that, in plain old BGP either changes slowly or quickly.

S-BGP has several components. It makes use of IPsec to provide
point-to-point security for communications between routers.
This is a significant improvement over the previous --

(APNIC staff confer with Steve Kent; Steve switches microphone on)

So, IPsec is used to provide point-to-point security. It's an
alternative to what we have today in terms of TCP MD5, which would
make a cryptographer cringe today and which leaves the current
mechanism especially vulnerable.

There is a requirement for a PKI, a Public Key Infrastructure, and
that has the features that Russ was talking about earlier - that
is, an infrastructure that attests to which organisational entities
have which prefixes and which autonomous system numbers. And then
the notion of attestations, which come in two flavours.  They are
digitally signed chunks of data, that are used to represent either
which AS or ASs are authorised to originate prefixes - and these
are of course signed by the prefix-holders themselves - and that's
fairly static, just like the PKI is. And then route attestations,
which are the dynamic authorisation mechanisms that show, as you go
from one AS to another, that each AS along a path has authorised
the next one to advertise a given prefix or prefixes.

Those are the background pieces. On an ongoing basis s-BGP calls for
routers to generate these additional pieces of data, these route
attestations, to go along with each update and then correspondingly
to validate them as they come in. The use of IPsec here is fairly
straightforward, really a replacement for the TCP MD5 checksum
option. I won't spend any more time on it. You don't have to use
this on every link. If a pair of ASs decides the link between them
is secure, that's a local decision. The rest of the world won't
know one way or the other. You can choose not to do it. Certainly,
to the extent that we use the TCP MD5 checksum today, this is a
preferable technology. It's more secure in every possible way.

This is the sort of allocation diagram that we were talking about
a few minutes ago. I would like to start with the IANA and work my
way down to regional registries and then, as appropriate, to national
or local registries, all the way to ISPs and subscriber organisations,
recognising that there are a lot of paths to get from the root all
the way to the leaf. (Refers to slide) In yellow, off to the side,
we recognise that there is a legacy allocation here of chunks of
address space that were handed out directly to entities who became
ISPs or subscriber organisations and that has to be grandfathered
back into the system as well. That's just the reality. It's not,
from a conceptual standpoint, difficult, but the book-keeping will
take some work. AS number allocations are more straightforward,
because you don't move through successive layers of delegation in
those.

Again, these are the simplest possible structures for a PKI. Most
of the world would love to have something this simple, because we're
not talking about creating new organisations and asking people to
trust them. We're asking the organisations that already hand out
chunks of address space and AS numbers to merely sign digital
certificates attesting to what they're already doing. You had to
trust them before; they were the only game in town for wherever
you happened to be. So we're not asking you to trust anybody new
here.

One property of this PKI, which is different from most of the PKIs
people deal with - and I say this from some experience as the
co-chair of the PKIX working group in the IETF - is that we're not
handing these out to identify an organisation explicitly. The names
in the certificates, using standard certificates, really aren't
that important. We're using the certificate format because it's a
standard, it's an easy thing to do. There's lots of software for
processing it. What's important is that the certificate is binding
this prefix or this AS number to an entity who has a private key
and so, if they want to prove to you that they are that entity -
whatever their name is after bankruptcies, mergers etc, etc - the
important point is that they have the private key and can sign
something which can be verified with the corresponding public key
in the certificate.

So this is a PKI for authorisation, not identification. That's
different to what most people do but perfectly legitimate from a
PKI perspective. It all works with existing technology.
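
(Illustrative sketch, not part of the spoken presentation. It models
the top-down allocation tree as certificates that each list the
address blocks they cover, and checks that every certificate's blocks
fall inside its issuer's blocks, all the way to the root. The class
and function names are hypothetical, the subject names are only
labels, and signature checking is deliberately left out; a real
system would also verify each certificate's signature.)

```python
import ipaddress
from dataclasses import dataclass
from typing import Optional


@dataclass
class ResourceCert:
    subject: str                      # informational only; not used for checks
    prefixes: list                    # ipaddress network objects this cert covers
    issuer: Optional["ResourceCert"] = None  # None marks the root


def covered(net, parent_nets) -> bool:
    """True if net lies inside at least one of the issuer's blocks."""
    return any(
        net.version == p.version and net.subnet_of(p) for p in parent_nets
    )


def chain_is_consistent(cert: ResourceCert) -> bool:
    """Walk up to the root, checking resource containment at each step."""
    while cert.issuer is not None:
        if not all(covered(n, cert.issuer.prefixes) for n in cert.prefixes):
            return False
        cert = cert.issuer
    return True


# Hypothetical three-level allocation using documentation address space.
root = ResourceCert("root", [ipaddress.ip_network("198.51.100.0/24")])
registry = ResourceCert("registry", [ipaddress.ip_network("198.51.100.0/25")], root)
isp = ResourceCert("isp", [ipaddress.ip_network("198.51.100.64/26")], registry)
rogue = ResourceCert("rogue", [ipaddress.ip_network("203.0.113.0/24")], registry)
```

(Note that nothing in the check looks at the subject name, which is
the sense in which this is a PKI for authorisation rather than
identification.)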

The text on this slide reiterates this notion of a simple top-down
tree-structured approach. You could, as Russ Housley said, have all
the RIRs be peers of one another and do something that we in the
PKI business refer to as 'cross-certification' or having multiple
roots. All this is possible. The one different thing is that, in
the s-BGP model, we never require the routers to actually verify
any of the certificates. We assume that's something the router does
not want to have to deal with. What we do is have network operations
people do this all for the routers they manage and then pass the
answers to the routers so the routers just get the secure contents
from the certificates and the attestations. They never have to do
certificate verification themselves.

That's an awfully long process.

Now, as I mentioned, there are two flavours of attestations. The
address attestations address the question that Geoff Huston raised
a few minutes earlier, which is, if I'm a prefix-holder, how do I
indicate which AS I authorise to originate the prefix? We need a
digitally signed something to do it. We have a particular format
for this in s-BGP but, as Russ Housley said, there are lots of
options for how one could do this. It's not that critical since
this is, again, a slowly changing offline thing. We don't change
instantaneously which AS is going to originate your prefix and thus
cause traffic to flow to you. You have to pay money to do this and
there are details to be worked out. The route attestations are the
way each AS, through its border routers, explicitly authorises other
ASs to advertise a route that it is originating or passing on, that
it's got from other ASs.

So, what you wind up with route attestations is having essentially
a nested set or a chain of signatures which allow any router that
receives an update in this form to verify that it really did come
via the path that it says it followed through that sequence of
Autonomous Systems. The formats that we developed for these are
common. The address attestation on the bottom just lists a set of
prefixes and then the origin AS. The route attestation is different
only because the prefixes now deal with a sequence of ASs terminating
in the origin AS. So it's the top one here, the route attestation,
that's what's put into an update as a transitive optional attribute
to be carried through the system. The bottom one, the address
attestation, as I said, you really have a lot of flexibility in how
you choose to format it.  We just chose a common format for the two
of them here to make life a little bit easier.

When a router receives an update, it has been configured to know
whether its neighbour is another autonomous system that's implementing
s-BGP. As Russ Housley said, you need incremental deployment
facilities. You need to configure a router with a flag that says
"This neighbour does s-BGP, this one doesn't." That's the simple
thing it has to know so that, when it gets updates from a router
that is a neighbour doing s-BGP, it knows to look for this additional
information and to process it. And it goes through some processing
and makes sure that, in the route attestation, the AS number
for the router that received it is already in the route attestation.
This is slightly different to what you normally see in an update.
Normally, I put my AS number in it.

You continue to do that. There's no basic change to how BGP operates.
But the route attestation puts your AS number in it as I hand it
to you. That way we know that you were the intended recipient. That
links all of the route attestations together by following that.

A router would verify the signatures that appear in the chain. On
average, these paths are 3.5 or 4 hops long. That would give you an
idea of the number of signatures you're looking at. You also need
to see that the origin AS is consistent with the address attestation
information that has been distributed. We wouldn't distribute
that in line in updates because it's relatively static, so there's
no need to carry that chunk of data in every update as it goes
around the network.
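
(Illustrative sketch, not part of the spoken presentation and not the
s-BGP wire format. It shows the chaining idea: each AS signs the
prefix together with the path so far plus the AS it is handing the
route to, and the receiver checks one signature per hop. As a loud
assumption, HMAC with a per-AS secret stands in for the public-key
signatures that would be verified against certificates; all names,
keys and AS numbers are hypothetical.)

```python
import hashlib
import hmac

# Stand-in signing keys, one per AS. Real attestations would use
# public-key signatures; HMAC is used here only to stay in the stdlib.
KEYS = {64500: b"k-64500", 64501: b"k-64501", 64502: b"k-64502"}


def sign(signer, prefix, covered_path):
    """Sign the prefix plus the path, including the intended next AS."""
    msg = (prefix + "|" + " ".join(str(a) for a in covered_path)).encode()
    return hmac.new(KEYS[signer], msg, hashlib.sha256).digest()


def attest(update, signer, next_as):
    """Signer prepends itself to the path and authorises next_as."""
    path = [signer] + update["path"]
    sig = sign(signer, update["prefix"], [next_as] + path)
    return {
        "prefix": update["prefix"],
        "path": path,  # newest AS first, origin AS last
        "attestations": update["attestations"] + [(signer, sig)],
    }


def verify(update, my_as):
    """Check one attestation per hop, origin first, ending with my_as."""
    path, atts = update["path"], update["attestations"]
    if not path or len(atts) != len(path):
        return False
    full = [my_as] + path
    n = len(path)
    for i, (signer, sig) in enumerate(atts):
        if signer != full[n - i] or signer not in KEYS:
            return False
        expected = sign(signer, update["prefix"], full[n - i - 1:])
        if not hmac.compare_digest(sig, expected):
            return False
    return True


origin = {"prefix": "192.0.2.0/24", "path": [], "attestations": []}
u1 = attest(origin, 64500, 64501)   # AS 64500 originates, hands to 64501
u2 = attest(u1, 64501, 64502)       # AS 64501 passes it on to 64502
```

(The number of signatures checked equals the path length, so a path
of about four hops means about four verifications per update. Because
each signature covers the next AS, an update handed to a different AS
than the one named does not verify.)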

Housekeeping - it's nice to talk about things like this but there
is the question of where the data comes from, where it goes, how
it gets there. We split data into two categories in s-BGP. There is
the data that changes fairly slowly, such as who owns which autonomous
system numbers, who holds which prefixes and who's been authorised to
originate which prefixes, and that information is distributed through
a repository system. The one thing that you do need to maintain
very responsive dynamics for is which route is currently being
advertised, because that can change quite quickly, as the first
speaker of the session pointed out. That's where the route
attestations are pushed in line with the updates themselves.

The idea behind repositories is that they would hold all certificates
that deal with all the chunks of address space and certificates
that deal with all the autonomous system number allocations. If you
issue certificates, you have to issue certificate revocation lists
- where you say you didn't mean to do that - and the address
attestations as well. All this data needs to go through repositories.
Now one of the open questions is - who should be operating repositories?
There should be some number of them because we want robustness.
They should be loosely synchronised but, if we have too many of
them, we create a new problem which is finding the repositories so
they can talk to each other and provide synchronisation. We have
to achieve the appropriate balance there and that is still I think
an open question.

Note that routers don't go to repositories to fetch anything. Network
operators go to repositories to upload and download data. When we
designed s-BGP - admittedly the folks designing it are not operations
people - we figured that about once a day would be sufficient for
uploading and downloading data, recognising that the allocation
processes are not instantaneous and the changing of who my ISP is is
not instantaneous. If that turns out to be a reasonable time frame,
we're not talking about high availability. It needs to be there.
If you don't have new stuff, remember the routers aren't looking
at it, the operations people are, so they pull down the data, they
process it and push it out to routers. If they're unable to get it,
they can go to the old data they have, they can live with that if
they wish to.


(Refers to slide) This is a diagram that tries to put everything
together and suffers the usual problems of a diagram that tries to
put everything together. Let's say we have a few repositories - I
fit two on the slide in purple - and a registry, say a regional
registry here. So two ISPs, shown here, interact with the registry
to get their certificates identifying them as holders of chunks of
address space and as owners of autonomous system numbers. They push
their certificates up to these repositories, they push address
attestations up to the repositories and then they download everything.

I didn't try to get fancy when we did the repository design. We
just said, "Take it all". We're talking about tens of megabytes,
maybe 40 or 50 megabytes of data. Many of us download more than
that in e-mail every day. It's not that hard.

Then they process what's been downloaded and push the extracted
data to the routers in their autonomous system or systems that
execute s-BGP and then those routers exchange updates enhanced with
the s-BGP route attestations with their neighbours. So those are
the kinds of flows of data we're talking about.
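
(Illustrative sketch, not part of the spoken presentation. It shows
the offline step described above: the operator takes the downloaded
address attestations, drops any that rely on a revoked certificate,
and reduces the rest to a small prefix-to-origin table to push to the
routers. The record fields and function name are hypothetical, and
signature verification is assumed to have happened already.)

```python
from collections import defaultdict


def build_extract(attestations, revoked_serials):
    """Reduce verified address attestations to {prefix: {origin AS, ...}}."""
    table = defaultdict(set)
    for att in attestations:
        # Skip attestations whose signing certificate has been revoked.
        if att["cert_serial"] in revoked_serials:
            continue
        for prefix in att["prefixes"]:
            table[prefix].add(att["origin_as"])
    return dict(table)


# Hypothetical download: three attestations, one from a revoked certificate.
downloaded = [
    {"cert_serial": 1, "prefixes": ["192.0.2.0/24"], "origin_as": 64500},
    {"cert_serial": 2, "prefixes": ["198.51.100.0/24"], "origin_as": 64501},
    {"cert_serial": 1, "prefixes": ["192.0.2.0/24"], "origin_as": 64502},
]
extract = build_extract(downloaded, revoked_serials={2})
```

(The routers only ever see the extract, which is why they never have
to process certificates themselves.)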

No system is perfect. Even if one did s-BGP, there are residual
vulnerabilities. There are the sorts of problems that really result
from the limitations of BGP itself. BGP doesn't time stamp or serial
number updates and so we have a problem knowing when somebody, for
instance, does a withdrawal and then they re-advertise etc. We have
trouble keeping sequence with that. There is some ability to do a
better job of this with s-BGP by putting basically time to live
fields or their equivalents into the route attestations but that's
about it. It's relatively coarse. So it's not perfect. So those are
the kinds of residual problems that we see with the technology.

Now, it's fair for people to say, "How much of this is just something
that you talked about, wrote half a dozen papers about over the
last few years?" We have implemented all of this. We used the MRT
code as a basis for this. We augmented it to do s-BGP, including
basic policy controls for incremental deployment - that means the
ability to say "this neighbour does it and this neighbour doesn't
do it." We did the housekeeping part, so we set up a set of tools
that's a mini registration authority and certification authority,
basically, to issue certificates to your subscribers to whom you've
done sub-allocations and to download and upload the data to
repositories and then to verify everything that comes down through
the repository, process it and produce the extract before you ship
it out to your routers. We built a repository. Frankly, this was the
weakest part of the effort but it has a nice feature, which is that
nobody has to manage a bunch of access controls.

It's all internal, based on the certificate structure. There's a CA,
a fairly high-assurance implementation, using something called
SELinux, very high in assurance, and we use that along with other
hardware to do high-assurance issuance of these.

So, s-BGP is a proposal on how to improve BGP security. It has
impacts - registries and ISPs have to become certification authorities
in this limited context but it's exactly doing what they do today,
it's just adding another step to these processes. The biggest
problem here is getting routers to be able to do this and, if we
were talking about normal operation and we weren't being flooded
with lots of updates due to some major worm or virus, most routers
could probably handle the digital signature processing load with
the existing hardware. But what they couldn't handle is the space
in the routing - in memory, not in routing tables but in memory for
these digitally signed things, for the route attestations, because
my laptop has more memory on it than most of the routers out there.
That's just a fact of life.

So, if one's willing to assume another set of routers coming out
with more memory, and we want to throw in an encrypter to ensure
plenty of processing horsepower for the crypto operations, then
this is technically feasible. It does not require faster-than-light
capability or anything like that. It's a doable thing. It requires
the basic processing I have on my laptop although, of course, it
is a nice laptop. At this point, any questions? Oh, in answer to
the first question, this is an osprey. This is a raptor that you
find in North America that eats fish more or less exclusively. It
grabs a fish out of the water, goes to a telephone pole and basically
eats the fish. He was looking at me taking this picture, figuring,
"No, he weighs too much for me to be able to pick him up."

That's why I was able to take the picture safely.


ANDREI ROBACHEVSKY

With incremental deployment, I assume that attestation may not be
complete.


STEPHEN KENT

I think the thing that's reasonable to expect people to be able to
manage is being able to deploy in contiguous ASs. As you do that, you
get benefits for the subscribers and operators of the contiguous
ASs. They form islands which, when they connect, increase the scope
of connection. A precursor to doing that is you need enough
infrastructure to start with IANA and work all the way down so that
the folks who choose to deploy have the certificates they need to
do it, so that's fair. I think, because of the impediments in terms
of current router hardware, especially in regard to memory, the
thing one would want to do first is put the infrastructure in
place, put the PKI in place and the address attestation
infrastructure in place. You could distribute that information and
use it to help build better tables for filtering purposes.

You know, you could do that years before you got the routers doing
the rest of what we talk about. That would be a good incremental
approach.


ANDREI ROBACHEVSKY

There is no - if it's not complete, I drop the announcement.


STEPHEN KENT

If you couldn't find an address attestation matching an originated
prefix in the route - and you got the route from a neighbour who
was claiming to originate it, then you'd be very unhappy. If they
got it from somebody further along, well, then, that's the shortcoming
of not having it, you know, back to the edges so, if a given AS is
doing this, an ISP is doing it, you would expect them to be pushing
this back to their subscribers and, of course, many of their
subscribers who aren't running BGP, well, they don't even ever know
what's happening. It's not on their behalf. For other subscribers,
those subscribers could choose, in an offline fashion, to sign an
attestation saying, "He's originating my address space." How far
back you push it depends on the exact circumstances.


RANDY BUSH

Andrei, drop me a note and I'll forward you a reference to a paper,
which describes a modification to s-BGP. There's a number of
modifications, most of which I didn't like, but one allows very
fragmented incremental deployment.


GEOFF HUSTON

It's just a follow-up question. If I understand the answer
correctly, you would conceivably see us being able to move into an
s-BGP environment with address attestation as the first step here.


STEPHEN KENT

Yes.


GEOFF HUSTON

OK.


STEPHEN KENT

Whether we choose to do s-BGP or not, I think something like the
PKI for address allocation and preferably ASs, the attestations are
a common framework that anybody should want and then where you go
from there is still debatable.


GEOFF HUSTON

Thank you.


ANDREI ROBACHEVSKY

Another question related to PKI structure. For s-BGP it's not
identification, right. But, from RIR perspective, we certify our
numbers. So apparently, there must be different PKI structures
co-existing.


STEPHEN KENT

There don't have to be. If you choose to make the subject names in
the certificates be useful for your identification purposes as a
regional or a global or national registry, that's fine. S-BGP doesn't
care. But it doesn't preclude using the same certificate for both
purposes. The thing s-BGP cares about is that whoever has the private
key corresponding to the public key in the certificate, no matter
what their name is, has just signed an address attestation, for
instance, and you can verify it. So you can have one PKI to
solve both of your requirements. It's just s-BGP wouldn't take
advantage of that other aspect of it. But you don't have to create
parallel ones, no. If you go to the trouble of doing it once, don't
do it twice.


RANDY BUSH

You were right the first time.


PHILIP SMITH

Thank you very much for the questions and thank you very much Steve
for your presentation. Thank you for coming. Thanks to you all for
coming to this first session of the Routing SIG. I've put up the
agenda for the second session, which starts at four o'clock, so I
hope to see you all after the coffee break. Thank you.


PHILIP SMITH

My watch says that it's 4 o'clock so I think we probably want to
make a start, please.

(Pause)

OK. So welcome to the second session of the APNIC Routing SIG. I
have the agenda on the screen. This session we're going to concentrate
more on the actual routing system itself: a variety of reports on
projects and various activities on the Internet. The first presentation
is operational experience at OCN/NTT, followed by a Routeviews project
update, a routing table and BGP movie update, a look at the BGP storms
and Internet health during the big worm attacks, and the results of
an anycast stability experiment. Before I invite Tomoya Yoshida up,
just a reminder if you want to ask questions just use a microphone,
either come to the two at the front of the room or request the
roving microphone. Remember to mention your name just to help the
stenographers so we all know who you are. I should remind people
to look at the online notice board and also the jabber chat rooms
which are also available for this session.

I'd like to invite up Tomoya Yoshida from OCN.


TOMOYA YOSHIDA

Thank you. My name is Tomoya Yoshida, I'm from NTT Communications.
OCN is one of the biggest ISPs in Japan. Today I would like to
present some operational routing experience at NTT/OCN.

This is our history, we started our OCN service in 1996. We started
the name OCN Economy. This service was very cheap.

(Refers throughout to slides)

/28, /29 to our dedicated users, we configured at each router and
distribute.

When the OSPF external route reached around 20,000, OSPF convergence
took more and more time. This was our problem. Then we tried many
things. The one is to separate the OSPF domain. This was very
complicated and the operation was very difficult. Then we changed
the routing from OSPF to BGP. We used iBGP route for Internet route.
Then the iBGP route is growing very fast because we changed the
routing information from OSPF to BGP. So after that, we used a route
reflector from the top to the bottom, the hierarchy is to the bottom
currently being used.

Also we had an address problem. That was at the time we couldn't
get enough address space from JPNIC. So in currently IPv6 policy
is easy to assign router space, but at the time this was very
difficult to get enough address. As a result it was also difficult
to allocate the route.

You can see the changes of the backbone topology. The left one is
in 1996. Those routers are connecting to one switch in 1996. Then
after that, we divided the OSPF area, 1998. We used the switch, of
the  - I think this switch is good technology.

We have many clusters but some clusters does not need BGP for route.
So we distribute the route for the needed cluster. Some cluster
does not need BGP route so we distribute the needed cluster that
BGP put out.

This is topology in 1999. The area is in Kyoto. Kyoto was connected
to Osaka. This area the measure is what was connecting to one Tokyo
Port. The worst area was connecting to only one Osaka port.

This topology was  - if Osaka is down, so all the traffic goes down
so we changed to this topology so we called a square topology. In
Kyoto it is connecting now both Osaka one and Osaka 2 ports. Japan
is an island, a long distance island. So most of this area is
connected to Tokyo, the west area is connecting to Osaka.

This is just bandwidth.

Most Japanese ISP use OSPF. I think this is a very historical case.
We divided OSPF area.

The router has many segments so we divided this router is  - we
separate the functions. For BGP, a route reflector hierarchy we
use. And distribute for needed cluster.

From here, this is our experience about some specific things. This
is a BGP prefix limitation. You know that both Cisco and Juniper
have a limited function. But those implementations are different.
You can see that if you have a route from peer. Then go to the local
RIB. In Juniper's case they think by using this and Cisco use the
local RIB. So we have many  - this is very confusing.

Just now I requested to Juniper to implement it by using the local
RIB. They said that 7.4. (Referring to slides)

This is next hop self/redistribution.

If you forget next-hop-self at the exchange border router and do not
redistribute the IX segment, around /24, to your backbone, there is
a problem. In Japan there are 3 major IXs, and they are announcing
around /20 covering the IX's segment IP like /24, so when some ISP
forgets the next-hop-self and does not redistribute those segments to
IGP, traffic will go to the IX's AS. In Japan, one or two times we
have seen this problem. So normally the IX segment is not routable
but in Japan every one of the 3 major IXs is routable.

This is LSA refresh experience.

Cisco is 30 minutes. Juniper is 50 minutes.

This is OK (Indicates screen) but sometimes there's no  - in this
case there are only 3 parts.

We found this situation. We changed the time from 50 minutes to 30
minutes. In some cases you cannot see some paths like this (Indicates
screen).

This is just operational information, route cache is very useful.
Currently almost every vendor has implemented route refresh capability.
Also IETF is discussing many capabilities internationally. But I
think the soft reconfiguration inbound is very useful. When you set
a new peer you set firstly, low priority to this new peer, but if
you receive more specifics this is the best path, so firstly, we
receive. Firstly check the route, not receiving any route, only
monitor the route from peer by using cache then receive.

This cache is very, very useful.

This is also route flapping experience.

This line is one flap. (Referring to screen)

After 50 minutes more than one flap occurs, so if flap occurs, so
Cisco's routing will be over 1500, and in Juniper, over 3,000. So
Juniper, suppress is around this 3,000. So only Juniper router is
suppressed, Cisco is not suppressed. This is very confusing for us.
In this case only Cisco is used and Juniper is not used. So this
is a bit complicated.
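
As a rough illustration of why two routers can disagree about the same
flapping route, the following Python sketch models flap damping as a
penalty that is added per flap and decays exponentially. It was not
part of the talk. The per-flap penalty of 1000, the 15-minute half-life
and the suppress limits used in the comments are placeholder values
chosen for the example, not a statement of either vendor's defaults.

```python
def decay(penalty, minutes, half_life=15.0):
    """Exponentially decay a penalty over the given number of minutes."""
    return penalty * 0.5 ** (minutes / half_life)

def penalty_after(flap_times, per_flap, half_life=15.0):
    """Penalty just after the last flap; flap_times in minutes, ascending."""
    penalty = 0.0
    last = None
    for t in flap_times:
        if last is not None:
            penalty = decay(penalty, t - last, half_life)
        penalty += per_flap
        last = t
    return penalty

def suppressed(penalty, suppress_limit):
    """A route is suppressed once its penalty reaches the limit (assumed >=)."""
    return penalty >= suppress_limit

# Same flaps, different per-flap penalty or suppress limit (assumed
# values), and the two boxes can reach different decisions.
```

With these assumed numbers, two flaps 15 minutes apart leave a penalty
of 1500, which is below a limit of 2000 on one box while a box that
charges more per flap may already be at its limit.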

Lastly, the routing hijack. We have around /10 IP blocks. Sometimes
our prefix is hijacked. When we are hijacked, we announce more
specifics; for example, if someone hijacks a /20, we announce two /21
routes to the Internet. But when the hijack is a /24 this is very
difficult, because we would announce two /25 routes to the Internet
but many peers do not receive /25s, so we announce the /24 and also
two /25s. So we need a BGP origin validation security mechanism.
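
The counter-announcement described here can be sketched in a few lines
of Python. This is an illustration added to the transcript, not the
speaker's tooling. The function names are made up, and the
24-bit cut-off in likely_accepted is an assumption standing in for
"many peers do not receive /25s".

```python
import ipaddress

def more_specifics(hijacked_prefix):
    """Split a hijacked prefix into its two halves (one bit longer)."""
    net = ipaddress.ip_network(hijacked_prefix)
    return [str(n) for n in net.subnets(prefixlen_diff=1)]

def likely_accepted(prefix, max_len=24):
    """Assumed peer policy: prefixes longer than max_len are filtered."""
    return ipaddress.ip_network(prefix).prefixlen <= max_len
```

For a hijacked /20 the two /21s win on longest match. For a hijacked
/24 the two /25s are the only more specifics available, and under the
assumed filter they do not propagate, which is the difficulty the
speaker describes.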

Lastly, we need TTL hack security mechanism for many vendors. Prefix
limitation by using LOC-RIB for Juniper. MAC accounting for 10 G.
Traffic is another line so we cannot. MAC accounting is very
important for 10 G.

Feasible path reverse path forwarding for uRPF. Strict mode is
dangerous. Loose mode is just loose.
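
The difference between the two uRPF modes can be sketched as follows.
This Python fragment is an illustration only and not from the slides.
The forwarding table, interface names and function names are
hypothetical. Strict mode accepts a packet only if the best route back
to its source points out of the interface it arrived on. Loose mode
only asks that some route to the source exists.

```python
import ipaddress

# Hypothetical forwarding table: prefix -> interface of the best path.
FIB = {
    ipaddress.ip_network("198.51.100.0/24"): "ge-0/0/0",
    ipaddress.ip_network("203.0.113.0/24"): "ge-0/0/1",
}

def best_interface(src):
    """Longest-match lookup of the interface used to reach src."""
    addr = ipaddress.ip_address(src)
    matches = [n for n in FIB if addr in n]
    if not matches:
        return None
    return FIB[max(matches, key=lambda n: n.prefixlen)]

def urpf_accepts(src, in_interface, mode):
    """Return True if a packet from src arriving on in_interface passes."""
    best = best_interface(src)
    if best is None:
        return False            # no route at all: both modes drop
    if mode == "loose":
        return True             # any route to the source is enough
    return best == in_interface  # strict: must arrive on the best-path interface
```

With asymmetric routing, legitimate traffic from 198.51.100.0/24 that
arrives on ge-0/0/1 is dropped in strict mode, which is the danger
referred to. In loose mode it passes, but so does most spoofed traffic.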

BGP inactive reason for Cisco is coming  - Cisco implemented for
CRS-1, I heard. So our operational additional information is very
useful for you.

Lastly, dynamic filtering. Just idea.

If you receive the BGP route with this community, (for example,
4713:777 attribute), the route which in scope of this community
will be rejected automatically. I think this is very useful for
filtering for your PA. When some key address is added for us, so
we change the router, just my idea, but this is very important and
very useful, I think. That is just my idea.
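
One possible reading of this idea, sketched in Python purely as an
illustration: a route tagged with the community acts as a signal, and
any other route that falls inside the tagged prefix is rejected. The
community value 4713:777 is the example from the talk. The route
dictionaries, the function names and the choice not to install the
signalling route itself are all assumptions.

```python
import ipaddress

FILTER_COMMUNITY = "4713:777"   # example value quoted in the talk

def build_reject_list(routes):
    """Prefixes announced with the filter community define the scope."""
    return [ipaddress.ip_network(r["prefix"]) for r in routes
            if FILTER_COMMUNITY in r["communities"]]

def filter_routes(routes):
    """Return the prefixes that survive the community-driven filter."""
    reject = build_reject_list(routes)
    accepted = []
    for r in routes:
        if FILTER_COMMUNITY in r["communities"]:
            continue   # assumption: the signalling route is not installed
        net = ipaddress.ip_network(r["prefix"])
        if any(net.subnet_of(block) for block in reject):
            continue   # inside the scope of a tagged prefix: reject
        accepted.append(r["prefix"])
    return accepted
```

The attraction is that the filter follows the announcements, so adding
a new address block does not require touching every router's
configuration by hand.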

Lastly, our backbone. (Referring to screen)

If you want more information visit the site on the slide.

Thank you very much.


PHILIP SMITH

Thank you very much. Are there any questions at all?

No?


PETER SCHOENMAKER

Why do you say -


PHILIP SMITH

Microphone, please. Please state your name.


PETER SCHOENMAKER

You said that unicast RPF is dangerous in strict mode. Are you just
talking about BGP only or can you explain why you say it's dangerous?


TOMOYA YOSHIDA

We just think the best path to the left way, but the traffic coming
another way. So this router is going this way. So if you configure
-


PETER SCHOENMAKER

You're just talking about BGP connections? General experiences with
static customers traffic can only go one way.


TOMOYA YOSHIDA

I mean that at the gateway router. The customer is only just one,
this is very useful, but in some cases, dangerous.


PHILIP SMITH

Any more questions? No, thank you very much.


PETER SCHOENMAKER

Is your NOC having a hard time contacting upstream providers when
someone hijacks your address space to get them to stop advertising it?


TOMOYA YOSHIDA

Yeah, we firstly announced more space then we contact the source.


PETER SCHOENMAKER

Our general experience is most upstream providers are fairly
responsive very quickly to hijacked address space and that it only
takes maybe a phone call and a short period of time.


TOMOYA YOSHIDA

We also have a system, so this is agent, distributor agent in some
areas, this agent cannot hijack, so after the hijack occurs, they
send an email and we know the hijacker.


PHILIP SMITH

Thank you very much for your presentation. Joel. You're up.


JOEL JAEGGLI

I'm going to talk about what the Routeviews project is, some history,
current efforts, new and future efforts, projects we're working on
and how you can participate. What is Routeviews? Generically speaking,
Routeviews is eight routers that collect realtime information about
the global routing system through BGP sessions. It's a repository
of historical data on the state of the global routing system that
currently goes back to 1997. So, in some respects, it's one of the
deepest sets of data we have that's been continuously collected.
It's both an operational and research tool. A little history - the
original Routeviews router goes back to 1995. It started as a purely
operational tool. At the University of Oregon, originally we had
one external provider, that was with Westnet.

But, when our connectivity became a little more complex, it became,
you know, necessary to see how the rest of the world saw our routes,
particularly when we started making configuration changes.

Looking Glasses, in 1995, were still in the future so, the best
thing we could do as far as we could see was to get a BGP feed that
was the whole table from outside our network.

Randy Bush, who was then in RAINET was generous enough to give us
the feed from MAE-WEST and, at the time, if you think about it, the
big network operators considered this data to be extremely proprietary
so the fact that he was willing to do that was, you know, of
incredible utility to a relatively small organisation. We made that
data publicly accessible and people began using it and, in the
process, contributed more views. Periodically, people would collect
information about the state of the whole routing system by telnetting
into the router and doing a show ip bgp at intervals. In 1997 that
was formalised by NLANR/MOAT. They began collecting it on a once-a-day
basis. People then started to use the data for really interesting
applications.

One of them was Skitter, which allows you to sort of visualise what
the interconnectivity between ASs looked like. And so, in this
particular plot, which is from 2000, the most connected ASs are in
the middle and then the least-connected ASs radiate to the outside.

To jump ahead, Routeviews continued to percolate along and people
added more views and we got to the point where we had about 50 peers
on the router. That was in about mid-2000. It became obvious to us
that, for Routeviews to continue to be interesting and relevant,
that new things had to happen. In mid-2000, we began to get serious
scalability problems, because we had a router that was fairly modern
at the time, but was accepting 50-plus multi-hop BGP feeds and had
around 5,000 interactive logins a day, a lot for Routeviews to
support. At the same time, it was becoming obvious that researchers
and some operators had needs that were not being met by any of the
existing data collection methods that were available. So we began
a new project in March of 2001: we began actually collecting show
ip bgp dumps ourselves at two-hour intervals, so the data became
more fine-grained at that point in terms of the whole table.
We also decided to deploy a new service, which was tentatively named
route-views2 and that sort of stuck. Route-views2 is a Zebra
bgpd running on Linux. Initially, that was not a complete success,
in part because no-one had actually used Zebra for the particular
application that we wanted to use it for - which was taking 50 BGP
multi-hop peering sessions. They'd actually tried to use it as a
router, you know? And, it couldn't, so it couldn't really handle
60 peers and the cli itself was slow enough that it proved to be
unusable.

So we punted on actually replacing Routeviews with that and set it
up as a separate service. So we continue to run the original
Routeviews service to today, including the 2-hour dumps. An upgrade
to Routeviews performed at the same time proved to make that a heck
of a lot more useable and it continues to be functional to this
day. It has about 70 peers at this point. But the thing we really
wanted out of Zebra was better collection methods because, with the
router, there was basically two ways to get updates on routing
events out of the thing - turn on debug and export all of that data
to some other host that would then log it, which had fairly serious
performance implications and, given the already sad state of the
router, seemed unlikely to be successful.

Whereas, with the Zebra, we could actually put it directly to a
locally attached disk on a two-hour basis and we could actually log
updates to a file as they happened. So, basically, every 15 minutes,
the route-views2 box rotates off its current log file and starts a
new one. So that means updates can become available almost immediately
and there's, you know, rather significantly more fine-grained
information about what's going on with the state of routing than
there was when you either had to log in and use the router yourself
and collect that information manually or intuit what was going on
from the two-hour dumps.
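
The 15-minute rotation can be pictured with a small Python sketch,
added here for illustration. It maps the time of an update to the
start of the 15-minute window whose log file would contain it. The
file naming pattern is hypothetical and is not claimed to be the one
used by the archive.

```python
from datetime import datetime, timezone

ROTATE_SECONDS = 15 * 60   # rotation interval mentioned in the talk

def bucket_start(ts):
    """Start (UTC) of the 15-minute window containing Unix timestamp ts."""
    start = ts - (ts % ROTATE_SECONDS)
    return datetime.fromtimestamp(start, tz=timezone.utc)

def log_name(ts):
    """Hypothetical file name for the window that holds this update."""
    return bucket_start(ts).strftime("updates.%Y%m%d.%H%M")
```

An update seen at 14:07:30 UTC would land in the file for the window
that starts at 14:00, and that file is closed at 14:15.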

At the same time, there was a lot of data all over the place.
Hans-Werner's data was sitting in San Diego. Bill Woodcock at pch.net
had data from the route-views router and we had some sitting on our
machine. We created routeviews.org, which was then and continues
to be the existing log for all the Routeviews data that we can
aggregate.

As high-resolution data became available from route-views2, two more
problems became apparent, one of which we solved and the other there
is an ongoing effort in the IETF to address.

One is that MRT had a one-second resolution. With route-views2,
that's not actually a huge issue because we have all these eBGP
multi-hop sessions with routers all over the world so your ability
to make very fine-grained assertions is somewhat limited anyway
because some of those routers are up to 150, 200, 300 milliseconds
away. The other shortcoming that was pointed out by researchers who
were drilling into the data and it was obvious to some more than
others, was that there are artefacts in the data that are the product
of the multi-hop BGP sessions themselves rather than actual BGP
events.

Basically, like having a large number of different UNIX platforms,
routers have TCP stacks of varying quality and, when you put a
network in between it of indeterminate length or whose routing may
be affected or when you load a CPU up on the router, you may get
events that are the product of the TCP session either being reset
or becoming really slow or having slow-start sort of artefacts that
cause poor performance that are not directly related to routing
events and so those artefacts do show up in the data.

A new effort was started to place Routeviews routers directly on
Internet Exchange fabrics so that we could take single-hop BGP
sessions. The first of those new routers went online in July 2003
at dix-ie thanks to the WIDE Project. Akira Kato was kind enough
to host that box for us and continues to do so up to this day.

Route-views.wide was followed by route-views.isc located at the PAIX
in Palo Alto in California USA. Route-views.linx is located in
London and route-views.eqix is located in Ashburn. Early in 2003, we
actually got our first Routeviews employee. For the most part, up
until this time, Routeviews has been run by volunteers, myself included.
John Heasley is with us on a one-year sabbatical from Verio.
Route-views6 was a new box we created which is located at the
University of Oregon, takes eBGP multi-hop IPv6 feeds. So, since
May 2003, we have been collecting the same information that we
collected at v4 for IPv6. Fall 2004, it became apparent that we
would need to support tcp-md5 because of excitement with Cisco and
Juniper routers that occurred at approximately that time. It actually
took kind of a while to deploy and it is still somewhat hackish.

At the moment, our Zebra collectors can only initiate tcp-md5 BGP
sessions. They cannot actually receive them. You know, we do now
support rfc2385 but, in general, we prefer not to do it if we can.

Current efforts - John Heasley has finished up his time. Mike Witt
has joined the Routeviews project to take over some of those
activities. Continued operation of the Routeviews collectors and
archives maintains or requires a bunch of my time. In effect, at
this point, the Routeviews project is a globally distributed ISP
with no links. Right? We have machines, you know, in POPs all over
the world and that is a significant maintenance headache. I haven't
actually seen or touched the box in Otemachi since it was installed,
yet it's gone through two upgrades and had some disks replaced. So
it is, you know, a significant undertaking. Some tool development
has occurred in the recent past. Not all of it by us but we hope
that it's useful. Two interesting applications that we've deployed
are BGPlay, which you saw an example of in the Boeing demo and which
I will demonstrate here in a moment, and IP to ASN DNS zones, which
is another tool that we thought would be useful for, you know,
drilling in the Routeviews data and it turns out a lot of people
like to use it for spam filtering. It's actually taken off and has
a life of its own.

Collaboration with researchers - most of the funding we have now
is actually to support research efforts in additional data collection
so, without researchers, there's really less of a reason for
Routeviews to exist in the large state that it is at this point.
It's beyond the ability of the University of Oregon to support by
itself.

So BGPlay is a Java application which displays animated graphs of
the routing activity of a certain prefix within a specified time
interval. Its graphical nature makes it much easier to use and
understand how BGP updates are affecting a particular AS rather
than by, you know, looking in the updates themselves. The BGPlay
database stores 10 days worth of data provided by the Routeviews
project. We are working on actually increasing the amount stored.
It is a significant data set. BGPlay was actually written by the
Computer Networks Research Group at Roma Tre University. So we can't
really take any credit for it other than it uses our data and that
we host one of the current applications.

DNS IP to ASN/ASPATH - we have two queryable subdomains of
TXT records in routeviews.org. Asn.routeviews.org resolves a reversed
IPv4 address or prefix to the origin AS, prefix and prefix length
of the best route as seen by route-views2. So, I mean, basically,
you know, for any given IP address, you can put it in, as in the
example (refers to slide) and what it will do is give you the route
block that it came from and the autonomous system number.
Aspath.routeviews.org does the same thing but resolves to the full
AS path. The zone files
are reconstituted twice daily and are available for download from
archives.routeviews.org. We don't allow transfer from them because
the smallest of the two is 179 megs.
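
For readers who want to try the lookup, this Python sketch builds the
query name for a given IPv4 address by reversing its octets, as
described above. It is an added illustration. The function name is
made up, and the sketch deliberately stops at constructing the name;
it sends no DNS query and makes no claim about the exact layout of the
TXT answer.

```python
import ipaddress

def asn_query_name(ip, zone="asn.routeviews.org"):
    """Build the TXT query name for an IPv4 address in the given zone."""
    addr = ipaddress.IPv4Address(ip)   # raises ValueError on bad input
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone}"
```

The resulting name can be handed to any resolver as a TXT query, and
the zone argument can be swapped to query the second subdomain.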

Current efforts - these are our current peer counts (refers to
slide). Route-views is holding steady at about 70 peers. We have
stopped pretty much taking additional peers to Routeviews. We still
take them to route-views2. We would really like people to send us
views at exchange points.

We would like to deploy additional regional collectors, something
in the order of three to five, if you're interested in hosting one,
that would be cool. We are soliciting more input on tool development
and we have a few projects in the works ourselves, including queryable
databases. We're interested in providing local computing resources
and storage for researchers. So, how can you participate? Well, a
lot of people just use Routeviews. That's one way to participate.
Bring a view to Routeviews. We're really looking for single-hop
views at the IXes but we'll still take multi-hop views. If you
have v6, we'd sure as heck like help. Send mail to
help@routeviews.org with that. We will not announce or do anything
with those routes other than sink them into the machine.

Host a collector. We're looking to build out three to five more.
Use it as an operational tool. This is who we are.

(Refers to slide) That's us. There's a bibliography at the end
which will point you to the various pieces I have covered (refers
to slide). If you want to take a look at BGPlay here real quick.
What you can do is take an arbitrary prefix. I'll use one that I
happen to know fairly well. (Types it into the BGPlay query form)
Plug in a time interval up to 10 days so we'll go back to the 13th.

So we've got a layout. We can see that this is AS 3582 here in the
middle. That's the University of Oregon. We can see our immediate
upstream for the NERO project with the School of Engineering. We
can see one of our other upstreams, Williams Communications. If we
put it in play, you can watch paths move around for 10 days. You
can see, I mean, this is pretty stable for the most part. This is
kind of the boring target, which is the way we like it.


RANDY BUSH

Can you wrap up please, Joel.


JOEL JAEGGLI

Yes, this is it. Any questions. Thanks for your time.


PHILIP SMITH

Thanks very much, Joel. Geoff, would you like to...


GEOFF HUSTON

Hi. My name is Geoff Huston. I'm with APNIC. What I'll be showing
you now is actually application of that Routeviews data. So this
is actually a very quick status report on the state of the interdomain
routing system as seen by Routeviews. Have we got a laser pointer
there? I'll show you a whole sequence of graphs and a little movie.

(Refers to slide) This is since 1994, looking at the number of
entries in the BGP routing table. Routeviews came on line in late
'97. This is 0, this is 200,000 entries. Here is the Internet boom.
It stopped at around 2001. A bit of a sort of Internet burst there,
but now more recently, we're back into a steady growth pattern which
is, yet again, accelerating in the number of entries. That was very
sharply an exponential growth pattern. This is a growth pattern
which one could model at some kind of increasing rate, probably
exponential but to a lesser degree.

If we take out the noise and look at one particular AS and its growth,
you actually see the boom and bust pretty clearly. So around the
year 2001 and around 100,000 entries, the number of entries actually
stopped growing for around 12 months and then we all decided life
is cool, the Internet is wonderful and back we go again. More
recently, and that is 2003 up until a couple of days ago, for some
reason, in the last half of last year, things started growing more
quickly and, indeed, just around December and January, someone was
actually being very active here. There were bursts of disaggregation.
But the last month or so has actually seen a whole bunch of new
routing entries. I shouldn't say 'new'. Many of them are more specifics
that are still leaking. The number of routing entries is not the entire
dimension of the routing system.

The other way to look at it is how much address space is actually
routed. There are 4 billion /32s in IPv4 and this is a small range
from 800 million up to 1.4 billion. So, around a quarter, a little
over a quarter of address space is actually advertised in BGP. 1997
up until a few days ago and what I'm trying to find here is curves
and distributions. Are we consuming address space more quickly?
Again, an Internet boom, not as obvious, a bit of correction -
2001/2002 - and more recently, since mid-2003, all of a sudden,
we're starting to see rollouts again, which is a dramatic increase
in the amount of address space in the routing table. Viewed from a
single AS - same data, just one stream - what's all this noise?
Someone flaps a /8. Why you would turn on and off 17 million /32s
multiple times a day beats me with a stick but, you know, people
do.
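
The arithmetic behind these figures is easy to check. The following
Python lines are an added illustration. They use only the round
numbers quoted in the talk, and the helper name share is made up.

```python
TOTAL_V4 = 2 ** 32          # all /32s in IPv4, about 4.29 billion
SLASH8 = 2 ** (32 - 8)      # /32s inside one /8, about 17 million

def share(addresses):
    """Fraction of the whole IPv4 space covered by this many addresses."""
    return addresses / TOTAL_V4

low = share(800_000_000)     # bottom of the graph's range
high = share(1_400_000_000)  # top of the graph's range
```

On those round figures the graph spans roughly 19% to 33% of the
space, and flapping a single /8 moves about 0.4% of all IPv4 addresses
in and out of the table each time.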

More recently, they stopped, since mid-2004. You can see they've
accelerated, coming back again. Are we getting better at aggregation?
No. Are we getting any worse? No. Certainly, the boom saw a lot of
more specifics getting into the routing table. But, by the end of
2001, the percentage of more specifics stopped at around 55%. So
half the table, a little over half, actually doesn't need to be
there. Since then, the routing table has grown and, if everyone was
simply doing the right thing, the percentage of aggregates, more
specifics, would actually drop. But it's not. There's still the
same amount of folk who get address space - typically a /20 - and
all of a sudden produce an advertisement of a /20 and, just for
good effect, about eight or so /24s so we can fill up our routing
table faster.
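
How a "percentage of more specifics" can be computed from a table is
sketched below in Python. This is an added illustration and not the
method used for the graphs. It counts a prefix as a more specific
whenever some other prefix in the table covers it, and it ignores
origin AS and AS path. The quadratic loop is fine for a toy table but
not for a full one.

```python
import ipaddress

def more_specific_share(prefixes):
    """Fraction of prefixes that are covered by another prefix in the list."""
    nets = [ipaddress.ip_network(p) for p in prefixes]
    covered = 0
    for n in nets:
        if any(n != other and n.subnet_of(other) for other in nets):
            covered += 1
    return covered / len(nets)

# The pattern from the talk: one /20 plus eight /24s carved out of it.
table = ["10.0.0.0/20"] + [f"10.0.{i}.0/24" for i in range(8)]
```

For that example table, eight of the nine entries are more specifics
of the /20.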

So the 55% hasn't really changed much. No better, no worse. How
many ASs are there? Internet boom, very quickly from 3,000
Autonomous Systems up to 12,000 Autonomous Systems in the routing
table and then, after the boom, there wasn't really a crash but the
trend line is quite different. What used to be a very prominent
exponential growth rate, which would have seen us run out of addresses
by approximately 2007/2008, is now into a more linear trend. I
haven't done forecasts on that yet, but certainly further out.
Currently around 18,800 autonomous system numbers.

Steve, I think it was, said the average length of AS paths, if you
remove all the AS path stuffing, is around three or four. He's dead
set correct and it actually hasn't altered much since 1997. Here's
the AS path length as seen by every peer AS from 1997 to a few
days ago. Most of the Routeviews peers see their average AS path
length at around 3.8 or so and, interestingly, it's converging.
More entries, but same length. So, if you have a metric about the
interconnection densities, you're actually seeing that BGP is
becoming more connected. As we grow, the diameter is not expanding.
The diameter of the network is actually contracting. There's a big
black hole somewhere in the middle of the network and if you said
AS 701, you'd probably still be pretty close to it.
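
Removing AS path stuffing before averaging can be sketched like this
in Python. It is an added illustration with made-up function names,
and it treats any run of the same AS number as prepending.

```python
def strip_prepending(as_path):
    """Collapse consecutive repeats of the same AS number."""
    out = []
    for asn in as_path:
        if not out or out[-1] != asn:
            out.append(asn)
    return out

def mean_path_length(paths):
    """Average AS path length after prepending has been removed."""
    lengths = [len(strip_prepending(p)) for p in paths]
    return sum(lengths) / len(lengths)
```

Run over every path a peer sees, this gives the per-peer average that
the graph plots over time.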

The AS paths do tend at the moment to continue to converge in the
number of players that's growing.

If we were better, how better could we be? This is since 2001 the
size of the BGP routing table - 100,000 up to 158,000 entries. If
we started to remove the more specifics that have precisely
the same prepended AS path, 105,000 entries. Strip out prepending
- 104,000. Say, "If it comes from that area but is more specific,
drop it out" - 90,000. Now say some folk checkerboard - given a
range of addresses, they'll advertise a /24, then a blank and then
a /24 and blank. If I covered an aggregate over the holes, can I
do better? Yes, you can. Around 45,000 entries is around the true
information load in the routing system. The number of entries that
add something new in terms of reachability. Up here is actually
traffic engineering.
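
A much simplified version of this exercise can be tried with Python's
standard ipaddress module. The sketch is an added illustration. It
aggregates on address space alone, so it drops covered more specifics
and merges adjacent blocks, but unlike the steps described it pays no
attention to AS paths or origin and it does not fill holes in a
checkerboard.

```python
import ipaddress

def aggregate(prefixes):
    """Collapse a list of IPv4 prefixes to the minimal covering set."""
    nets = [ipaddress.ip_network(p) for p in prefixes]
    return [str(n) for n in ipaddress.collapse_addresses(nets)]
```

A /20 announced together with some of its own /24s collapses back to
the single /20, and two adjacent /25s merge into one /24. A /24, a
gap and another /24 stay as two entries, which is why covering the
holes is a separate step.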

The overload of traffic engineering is approximately 100,000 routes
or two-thirds of the routing table. What about v6? The red line is
v6. Routeviews started collecting in 2003. The blue line is the
6bone and the green line is what the RIRs are allocating. A phenomenal
400 entries and today we're up to a more phenomenal 700 entries.
The activity in the v6 routing table is growing, albeit very slowly.

How much address space? This is a logarithmic graph. It's not this
slow. That's a /24, /22, /20 and so on. The total v6 table is about
a /17, a little bit over that. It used to be that the 6bone was
the major contributor, with a number of very large allocations done
to major telcos over the last year that are announced in the table.

How many people are playing in v6? A little bit over 500. I actually
searched for the number of Autonomous Systems that only originate
in v6. Whichever of you out there are, there are only I think 11
at the moment. So, in general, people are doing both and there are
actually 11 originators which are v6 only. Growing, as you see, quite
slowly.

Aggregation potential. In general, the policies around v6 have
actually been at this particular point - that's very early days -
largely what we're seeing is good aggregation. One particular entity
not very far from here but not in this country did decide to
disaggregate into /48s in July last year - thank you very much -
and they've persisted in doing that with a bit of experimentation
up until early January. They decided that the experiment was over
and they've re-aggregated. So in general, the thing is packed almost
as tight as you can get. Which brings me now to one of the more
interesting applications of this Routeviews data, that, if you take
all of the Routeviews data and assemble it, you can actually make
a movie of what's happened, which I have done. So, as this runs,
we'll actually just quickly go through what you're seeing.

This is the IPv4 space. Here is 16, 32, 128, net 192. Here is the
top space, the old class E and class D. That's the entire space.
What I've done is just use the /8s. Originally in 1993, IANA was
simply allocating class As, class Bs, class Cs. The red vertical bar
is actually where it gets allocated into what is going to become
an RIR. In 1985, we hadn't even invented the acronym. If you can
just see it, the green bits which slowly occupy the red bars are
when the RIRs pass the address out. So it's the actual allocation.

Down the bottom is the autonomous system space, so AS numbers from
0 up to 65,000 and there you see a couple of meters indicating some
sort of spurt rates at how fast we're going through. We're up to
1986/1987 and the network was actually classful. The space is given
out to various folk, including universities. If you notice, most
of the activity was in the class B space. There was a lot of activity
too up around 192 but each allocation is so damn small that, at this
scale of the entire address space, you can hardly see it.

WIDE started doing it around 1986, it might have been JUNET then.

You're now seeing phenomenal activity in the class B space, a really
phenomenal amount of activity. So much so that, by about now, it
became obvious that, if we were going to persist, we'd have run out
of Bs 10 years ago. These were just running riot and, by 1990, 1991,
the academic and research bodies were running through and getting -
you see how fast those green lines are going. Not only were they
getting through it, but the registry folks, at that time Network
Solutions, were working extraordinarily effectively. Down here, we
started
using BGP and what you actually see is BGP is slowly growing in
terms of the autonomous system numbers. But, you know, the address
space is just romping through.

This became a problem. And what you actually see is that, at this
point, we're all going to meetings talking about how we were going
to stop ourselves crashing through either a routing or address space
explosion. So the Bs are now getting very, very full, as these green
lines of allocation. And, by late '93, early '94, it was time to
actually devise a different mechanism and we started building
classless into the main routing. With classless interdomain routing,
we started occupying that top space there. And you notice now the
activity there is slowing down and over here the activity is rising.
Also, the number of players - remember 1995? Most of you might. The
number of players are starting to increase. The stuff is getting
commercial. The number of Autonomous Systems in play was really
starting to move.

So this thing is growing very quickly, whereas CIDR is putting some
brake on the amount of consumption of address space.

So now we've seen a lot of the old /8s being used, a lot of the
class Bs and now most of the activity is up there in CIDR. So the
next bit to introduce which happens in 1997 as you heard is that
Routeviews started collecting true rate routing data. What is
allocated is not what's routed. What's routed is actually a subset
so what I've done is overlay on this what's actually routed. Most
of the old Class As aren't in the routing table. A huge amount of
space is just lying there, moribund is the word that springs to my
mind. I can't see most of them and there's a flapping /8. Why people
do this I don't know but Routeviews is seeing this appear and
disappear. The old class B space, half of it is being hoarded. Where
the RIRs are actually active in this point with CIDR-based allocations,
what you're seeing is that the majority of it is routed.
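
The "allocated is not what's routed" comparison can be sketched as the fraction of allocated space that is covered by routed prefixes. This is an illustrative calculation, not the method behind the movie; the function name and the sample prefixes are assumptions, and a non-empty allocation list is assumed.

```python
import ipaddress

def routed_fraction(allocated, routed):
    """Fraction of the allocated address space covered by routed prefixes."""
    alloc = list(ipaddress.collapse_addresses(
        ipaddress.ip_network(p) for p in allocated))
    seen = list(ipaddress.collapse_addresses(
        ipaddress.ip_network(p) for p in routed))
    covered = 0
    for a in alloc:
        for r in seen:
            # Two CIDR blocks either nest or are disjoint, so when they
            # overlap the overlap is simply the smaller block.
            if a.overlaps(r):
                covered += min(a.num_addresses, r.num_addresses)
    total = sum(a.num_addresses for a in alloc)
    return covered / total

# Hypothetical data: half of one allocated /24 is routed, plus a routed
# prefix that was never in the allocated set.
allocated = ["192.0.2.0/24", "198.51.100.0/24"]
routed = ["192.0.2.0/25", "203.0.113.0/24"]
print(routed_fraction(allocated, routed))  # 0.25
```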

The light green is unrouted. The dark green is what I'm seeing. At
the ASs too, we're seeing relatively slow growth in the AS space.
The ASs are actually just romping through. And the other thing is
no-one likes an old AS. When an AS is about three years old, you
throw it away and get a new one. These old ones, 10,000, 5,000 or
so, are heading off to the great AS graveyard in the sky. We're
interested in the big numbers.

Now we're round about an 80% advertising rate on recent autonomous
system numbers. The old ones - now they're occupied by the old 64
/8. Those policies that got introduced about, if you get it, you
advertise it, is actually largely there but what is actually being
allocated appears in the routing table. Routeviews does collapse
from time to time. When the thing goes all light green, it's a
gateway anomaly. We're back to recent history, September 2004. Old
AS numbers are still heading to the AS number graveyard in the sky.
Everyone wants a new one.

Where we were a week or two ago is where we are here. The B space
is the space of a large amount of unused resource. The old A space
is a large amount of unused resource. There is actually quite a
deal of life in that table, more than one would expect by even
looking at allocations, as to what we're actually using in the
public network. And even in the autonomous systems space, there's
a little bit under half left, which would actually take some time
to consume at this current rate and a huge amount of space to find
the AS graveyard in the sky and grab some back. Questions? OK, thank
you.


APPLAUSE


PHILIP SMITH

Thank you very much, Geoff.


RANDY BUSH

Due to time constraints, I'm going to skip the first two of my talks
- the first one of my two talks.

I'm Randy Bush from IIJ. Please note the word 'early'. That's a
very important word here. The graduate student is sending us
information as we sit at the podium.

OK - oops!  That's not enjoyable. (Bottom line of slide not appearing)
Anybody have any ideas how to recover the bottom line? Please tell
us. OK. Why do we do this? In a NANOG meeting, Verisign's Mark Kosters
did a presentation where they essentially said - do not (Terry
Manderson makes some adjustments to projector) and they didn't!


TERRY MANDERSON

Do you have a different resolution?


RANDY BUSH

Yeah, I could probably do lots of stuff but there's only so much
time that's worth playing with it, you know? This looks a bit better.
So where they said, if you follow the URL, on foils 27 to 29, they
said the jitter was so much, don't run anycast with stateful
transport. Anycast is not sufficiently reliable to keep a long-term
connection.

But for almost a decade, there have been reports of successful
delivery of stateful services over anycast. Something's wrong. Was
their measurement from an abnormal vantage point or are there other
things happening?

So we set up a little experiment and we sent out an e-mail to the
mailing list with a little test program and volunteers at hundreds
of hosts around the world ran this multiday experiment for about a
month. And every two seconds from each of them, it probed the anycast
root servers by doing a dig at that root server - that X is, you
know, a CM - the anycast ones - to find out which of the specific
instances of the anycast root server it hit so, for instance, if
it went after i.root-servers.net, it would get back either Stockholm
or New York or whatever. So it knew which one it hit. And it did
it with both UDP and TCP and the results were collected on a central
server and they're being analysed as we speak.

And the results from a particular host look like this - a time
stamp, whether it was UDP or TCP, which root server and which
instance. So you can see we have Palo  Alto, etc, etc.
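
To make the record layout concrete, here is a minimal sketch (not the experiment's actual code) that parses such records and counts how often consecutive probes landed on a different instance. The whitespace-separated field order, the timestamps and the instance labels are all assumptions made for the example.

```python
from collections import namedtuple

Probe = namedtuple("Probe", "timestamp proto root instance")

def parse_probe(line):
    """Parse one record: time stamp, UDP/TCP, root server, instance."""
    ts, proto, root, instance = line.split()
    return Probe(int(ts), proto, root, instance)

def count_switches(probes):
    """Count, per (root, proto), how often consecutive probes hit a
    different anycast instance than the probe before."""
    last = {}
    switches = {}
    for p in sorted(probes, key=lambda p: p.timestamp):
        key = (p.root, p.proto)
        if key in last and last[key] != p.instance:
            switches[key] = switches.get(key, 0) + 1
        last[key] = p.instance
    return switches

# Hypothetical log from one host, one probe every two seconds.
log = """\
1000 UDP F paloalto
1002 UDP F paloalto
1004 UDP F sanfrancisco
1006 UDP F paloalto
1000 TCP I stockholm
1002 TCP I stockholm
"""
probes = [parse_probe(line) for line in log.splitlines()]
print(count_switches(probes))  # {('F', 'UDP'): 2}
```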

I want to warn you - this is not about the reliability of root
servers, OK? They were probably doing just fine. It's about things
we don't know about BGP and IGP routing. OK? This is about routing.
The effects you are about to see are probably caused by eBGP between
ISPs, iBGP within an ISP and the IGP within an ISP. OK?

This is the view from one AS. A block scale across the bottom. This
says about 37 switches, changes of which one the probe hit, occurred
for root server I every, oh, I don't know, 40 seconds or 50 seconds.
Here's root server F etc. So this is the frequency of --

What you're seeing there is 35 switches at a frequency of less than
a minute reaching root server I. This is what we would like to
see. It always gets the same one. Routing was rock stable. Remember
all these measurements from these different ASs were taken during
the same time period, so the differences we are seeing are the
differences of how a particular AS experiences routing. Here's a
bad one. Notice it sees the same - what's nice is it sees the same
things, generally, across all the different anycast root servers.
So, you know, the K-Root server, the RIPE root server, it's getting
70 switches every about 10 seconds. No - it got 70 switches within
10 seconds of the last switch.
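
A count like "70 switches within 10 seconds of the last switch" corresponds to a histogram of the time between consecutive switches. The sketch below is one way to build it and is not taken from the experiment; the function name, the 10-second bucket size and the sample data are assumptions.

```python
from collections import Counter

def switch_interval_histogram(samples, bucket=10):
    """samples: time-ordered (timestamp, instance) pairs for one root
    server seen from one vantage point. Returns a Counter mapping the
    start of each bucket-second bin to the number of switches that
    happened that long after the previous switch."""
    hist = Counter()
    prev_instance = None
    last_switch = None
    for ts, instance in samples:
        if prev_instance is not None and instance != prev_instance:
            # The first switch has no earlier switch to measure from.
            if last_switch is not None:
                gap = ts - last_switch
                hist[(gap // bucket) * bucket] += 1
            last_switch = ts
        prev_instance = instance
    return hist

# Hypothetical samples: switches at t=2, t=4 and t=30.
samples = [(0, "a"), (2, "b"), (4, "a"), (6, "a"), (30, "b")]
print(dict(switch_interval_histogram(samples)))  # {0: 1, 20: 1}
```

A stable vantage point produces an empty histogram; a flapping one piles up counts in the lowest bin.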


(TEXT MISSING HERE)


PETER SCHOENMAKER

You get some oscillation in your traffic engineering.


RANDY BUSH

Let me insert some optimism. I'm going to go through this very
quickly because we're almost out of time. This is the only presentation
this one makes clear BGP noise is not a good predictor of bad packet
delivery. OK. Many of you will see on the first experiment we
conducted which said  - I will do this quickly, especially for those
people who have a hard time with my accent. What's the relationship
between the control plane, i.e. routing, and the data plane, packet
delivery? The fact that there are a lot of BGP updates, is it good
or bad? OK. People say Internet routing is fragile, collapsing, BGP
is broken, Day X was a bad routing day, et cetera. How do we measure
routing? We're told a lot of BGP updates is Internet instability.
There are too many BGP updates so BGP must be broken.  OK.

I don't think so. I think BGP announcements could be like white
blood cells. That they show that there is a disease, but they are
actually fighting it. They're part of the cure not the problem.

And that routing quality, what's good routing is how can we say
it's good unless we have metrics. I don't think we should assume
that the number of prefixes, the speed or completeness of convergence
are measures of routing quality. The goal of routing is to get the
customers' packets there. Reliably. So I think we should measure
whether the users' packets reach their destinations. If the users'
packets reach their destinations we jokingly call them happy packets
and the routing system must be working. And there are very well
known and rigorous metrics for measuring packet performance. They
are delay, drop, jitter and reordering, very formal ways of measuring.
We set out to measure the control plane quality by measuring the
data plane.
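
As a rough illustration of the four data-plane metrics named here (this is not the measurement code used in the study), the sketch below computes them from send and receive timestamps. The input layout is an assumption, at least one packet is assumed to arrive, and jitter is taken as the mean absolute change in delay between consecutive arrivals, which is only one of several definitions in use.

```python
def packet_metrics(sent, received):
    """sent: {seq: send_time}; received: list of (seq, recv_time) in
    arrival order. Returns (mean_delay, drop_rate, jitter, reordered)."""
    delays = [recv - sent[seq] for seq, recv in received]
    mean_delay = sum(delays) / len(delays)
    drop_rate = 1 - len(received) / len(sent)
    # Jitter: mean absolute change in delay between consecutive arrivals.
    diffs = [abs(b - a) for a, b in zip(delays, delays[1:])]
    jitter = sum(diffs) / len(diffs) if diffs else 0.0
    # A packet counts as reordered if it arrives after a packet with a
    # higher sequence number.
    reordered = 0
    highest = -1
    for seq, _recv in received:
        if seq < highest:
            reordered += 1
        else:
            highest = seq
    return mean_delay, drop_rate, jitter, reordered

# Hypothetical run: four packets sent, packet 3 lost, packet 1 late.
sent = {0: 0.0, 1: 1.0, 2: 2.0, 3: 3.0}
received = [(0, 0.5), (2, 2.5), (1, 2.75)]
print(packet_metrics(sent, received))
```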

We did this other experiment that I reported previously but many
of you didn't see it where we had a BGP beacon making announcements
out to the Internet and withdrawals. And hundreds of nodes streaming
data towards there and we measured delay, drop and jitter. But
that's an artificial experiment, not a real event, OK. But we
measured the performance, we found no significant correlation between
the number or time of updates and data performance, but this was
artificial and it didn't check real events on the real Internet.
We said, what are the real events? People talk about the Code Red
event, the Nimda event and the Slammer event.

The messages mean the Internet was horrible that day. We got the
Routeviews data, thank you, Joel, for the control plane: the number
of BGP announcements and their frequency. We got the data plane,
thank you Andre and the rest of the RIPE folk who are here, for the
RIPE test traffic measurement project which has many nodes scattered
round the world but unfortunately mostly concentrated in Europe,
we need more in Asia and the States. These boxes all send packets
to each other. They use that formal measurement of delay, drop and
jitter. And those data were made available kindly by RIPE to us,
for the periods of Code Red, Nimda and Slammer.

When we look at, for instance, Code Red, this is when it occurred,
OK, we see the big spike in BGP updates and we see no significant
change in delay of packet delivery.

BGP, the routing system got hit, something hit it. But it worked
and the customers' packets got delivered.

Similarly, Nimda, big spike. By the way this line is at 1.96
standard deviations which is a 95th percentile, so 95th percentile
confident - you know. BGP went to hell. Packets were happy.
Slammer not. Slammer, BGP updates, mean delays. So packets were
delayed, OK. Because of Slammer.
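
The 1.96 standard deviation line can be sketched as a simple threshold test. Under a normal-distribution assumption, 1.96 standard deviations either side of the mean bounds a two-sided 95% interval. The code below is an illustration only; the function name and the sample update counts are invented, and it uses the population standard deviation of the series itself.

```python
import statistics

def spikes(values, k=1.96):
    """Indexes of samples more than k standard deviations above the mean."""
    mean = statistics.mean(values)
    sd = statistics.pstdev(values)
    threshold = mean + k * sd
    return [i for i, v in enumerate(values) if v > threshold]

# Hypothetical BGP update counts per interval, with one obvious event.
updates = [100, 110, 90, 105, 95, 100, 900, 100]
print(spikes(updates))  # [6]
```

Running the same test on the delay series for the same intervals shows whether the packets noticed the event at all.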

Let's look a little at why. If we look at Code Red these are time
series analysis, these are months here. So we're matching Code
Red, Nimda on one slide and we see the delay in blue, OK, and we
see no - that occurred before Code Red, it doesn't count. Something
else made a radically significant effect on delay of packet delivery
but it wasn't Code Red and it wasn't Nimda. Packet loss, same
story. The number of routing changes - aha, we see them! So that's
Code Red and Nimda. But here's Slammer where we did see delay
associated with Slammer. But we didn't see packets lost. So the BGP
event, the event that caused the routing changes probably caused
less efficient routing, the packets took a less efficient path, but
they got there. The same thing - here's the routing changes, masses
of significant routing changes, OK. So watching BGP update count or
frequency is easy but it's not a good predictor.
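
One simple way to check whether update counts predict packet delay is a correlation coefficient over matched time intervals. The sketch below computes Pearson's r by hand; it is not the analysis used in the study, the two series are invented, and both are assumed to be non-constant and of equal length.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)

# Hypothetical per-interval series: a large update spike in the third
# interval while mean delay barely moves.
bgp_updates = [100, 120, 5000, 110, 90]
mean_delay_ms = [40.0, 41.0, 40.5, 39.5, 40.0]
print(round(pearson(bgp_updates, mean_delay_ms), 2))
```

A value near zero says the update count tells you little about delay in that window; a value near one would support using updates as a predictor.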

Measure performance directly, please. Here's how to measure them.
It would be nice to have more test traffic boxes scattered through
Asia and the States so we can.

And thanks to  - this was done by the way with Matt at University
of Adelaide. Whoo! 2 minutes late. Questions?

Thank you.


APPLAUSE


PHILIP SMITH

Thank you very much, Randy. If there are no other questions then
this is the end of the routing SIG. I would like to thank all the
speakers very much for giving their time to give us presentations.
If any of them haven't already done so you can please give your
slides to APNIC so they can go on the APNIC web site.

I would like to thank you all very much for attending. I think these
two sessions of the APNIC routing SIG have been successful so I hope
we can repeat this when we next meet in six months wherever that
may be. So thanks all very much and we'll see you soon!


RANDY BUSH

That was my question, by the way  - do people like the expanded
routing SIG agenda? Or would you rather it be short?


PHILIP SMITH

Everybody stayed so it must've been alright. Thanks very much.


APPLAUSE