APNIC Home

SANOG Plenary

Wednesday 5 September 2007 0900-1100

RS PERHAR:

Can I request everyone to settle down.

Good morning, everybody. On the dais, Paul Wilson, Philip, Hideo Ishii, Gaurab, Vijay and Kurtis, and dear delegates, it gives me great pleasure to welcome you to India and to the SANOG and APNIC meet, on behalf of the Internet Service Providers Association of India.

I remember, it was just a short time ago, probably this year in January, when I went to Sri Lanka to invite SANOG delegates to come to India for SANOG 10, and a little while later, I called to Bali and did the same job of inviting the delegates to come for the APNIC Open Policy Meeting. Well, a few months have passed. A fair amount of work was put in. And we are very happy that you all took so much time and came for this Open Policy Meeting.

As you know, in the next couple of days, you'll be busy with the conferences, the technical conferences and the plenary sessions. Also, I believe there is an issue on which you would be voting regarding the fee structure. We are very - in India, we are very democratic in nature. We love to vote. We love to discuss. So I'm sure all of you will take part in this voting process and help APNIC decide on the future course of action.

This event could not have been successful without the support of our sponsors. The main sponsor has been NIXI. They have helped us a lot, not only in terms of money, but in terms of assistance and help. Google, Afilias and Reliance Communications have also pitched in as the gold sponsors. In the silver tier, we have VSNL, Tulip, Juniper and Cisco. We have also been provided sponsorship by F-Secure, Force 10 and Guavus. Certain other sponsors have been Autonomica, JPNIC, CNNIC, NIDA, KRNIC and TWNIC. We have also been supported by the Internet Society, the Asia Pacific Network Resource Centre and APIA. I would also at this point in time like to thank Spectranet and Aircel. They are the people who have provided us with the bandwidth for the last seven days and will keep providing it while these sessions are on. A fair amount of work has been done by both these companies. We also wish to thank VSNL and VSNL International for providing the IPv6 connectivity.

Events such as this do require a fair amount of investment. We are very happy and very thankful to our sponsors, who have helped in meeting the expenses for this event. Remember that, unless these sponsors pitch in, it is very difficult for fellowships to be provided for people who can't afford to come and learn. After all, basically, what are SANOG and APNIC all about? About spreading knowledge and information. We are very thankful to them.

In the next couple of days, the plenary sessions will be on. Detailed discussions will be on. We wish you all the best. We have tried our best to make your stay as comfortable as possible. Still, if you feel that there is anything which we can do for the better, please inform my team, led by Puneet. I wish you a great time for the next two days. Thank you.

APPLAUSE

I now hand you over to Gaurab, who will guide this session further.

GAURAB RAJ UPADHAYA:

Thank you, Colonel Perhar. Colonel Perhar is the secretary of the ISP Association of India. Thank you very much for hosting us here. As he said, a lot of work has been put in by all of his team members. So before we go on to the agenda, I'd just like to give a slight overview of what this meeting is about, because a lot of you must have heard about SANOG and also heard about APNIC and wondered how come these two events are combined together here.

So since last year - and what is APOPS? That is probably the big question, because a lot of people probably have not heard about APOPS. APOPS is the bigger version of SANOG, I would say, the Asia Pacific Operators Forum. And SANOG is more for South Asia, so we have a combined meeting. And, you know, the content of both these meetings is the same. We decided to merge them instead of doing our own thing in parallel here in Delhi. So I think it has worked out pretty well. And we can talk about SANOG and APOPS things later. But now I'd like to invite Paul Wilson to say a few words from APNIC.

PAUL WILSON:

Yes, good morning and welcome, a warm welcome to the 24th APNIC Open Policy Meeting, combined with the 10th SANOG meeting. It's the first time, in fact, that the APNIC Open Policy Meeting has been held in South Asia at all. It's the first time that we've had a joint official meeting with SANOG. So I hope it's not the last in either category, but it's very good to be here and have those two milestones. And I'd like to thank SANOG very much for working with us on this meeting, and ISPAI as well, of course, for supporting the meeting so very well. So far, the preparations and arrangements have been really very good and I'm sure it will be a smooth and enjoyable week for the rest of the time here. So thanks also to the sponsors of the meeting - we've heard a few names, but in particular, as I said, ISPAI and NIXI and Reliance, who are up in the top sponsorship tier and have contributed greatly to this meeting.

I'm here in New Delhi, not for the first time, but I'm here with quite a contingent of APNIC staff who are looking after the meeting, in particular working closely with the SANOG staff. The APNIC staff are here also to run a variety of services on behalf of the APNIC members, so we have hostmaster staff running hostmaster consultations, and we have a number of technical staff who are looking after the webcast, the stenocaptioning, the archiving and the onsite website, which is available as well for the latest information about the meeting. We've also got, later in the week, the APNIC Member Meeting, in particular for APNIC members, on Friday; there'll be discussions about the policy proceedings of the week, there'll be reports from APNIC, and there'll also be an important vote on the APNIC fee structure, so all APNIC members who are here are strongly encouraged to come along and participate in the day on Friday, as well as everything else in the meantime, of course. In particular, the vote is still under way online - I think - for a little while today. But, if not, then there's the opportunity to vote at the meeting on Friday.

So I'd like to welcome you again to New Delhi. I'd like to encourage you to give your encouragement to the speakers for today and for the rest of the meeting, by moving down, if possible, towards the front of the room and signalling that you are listening intently and attentively to the presentations which are being made. The tables at the back of the room are being provided for those who really can't tear themselves away from e-mail and who are happy to try and multitask, but I know how difficult that is, so I'd really encourage you to make the best of actually being here in New Delhi and to move down and occupy the front of the room so that we can have a really good atmosphere and encourage our speakers to give us their best as well.

So, with that, I'd like to hand back to the SANOG opening plenary and say thanks again for being here and I really look forward very much to seeing you all through the rest of the week and I hope you all have an excellent time here in New Delhi. Thanks.

APPLAUSE

GAURAB RAJ UPADHAYA:

Thank you, Paul.

Before we actually go on to the speakers, I'd like to introduce my co-chairs today - Philip Smith and Hideo Ishii. They're the co-chairs of APOPS and I'm, of course, the chair of SANOG.

So let's get to work now.

Vijay. Vijay Gill works for Google and he's a manager in network engineering. He used to work with AOL before, and before that with a lot of other Internet companies, and has more than 15 or 16 years working on Internet architecture, network design and so on and so forth. So he's going to talk about peering and network design in the 10-gig global world. Vijay from Google.

Peering and network design in the 10g global world

VIJAY GILL:

OK, so, I'm Vijay Gill. I'm manager of production network engineering for Google. Can you hear me, Randy?

RANDY BUSH:

A little louder.

VIJAY GILL:

OK. I'm Vijay Gill, from Google. And I will talk about the lessons of failure, the harvest of pain, because pretty much in every job I've had, we have ignored these principles and we have failed miserably. This includes companies that went bankrupt in the middle of the telecom boom. Ignore these at your peril. We have been through these. So these should be good lessons for everybody else.

So our Google legal folks made me put this in. This is a part of common practice, not necessarily representing Google, blah, blah, blah.

OK, introduction, fundamental constraints, differentiators, design principles and conclusions.

So normally when you build up a network, this is how it goes. You go to your vendor and say, "Oi! Run me up a network." And the vendor says, "Sure," because they have a large number of people on staff whose job is to sell equipment, and so what will happen is: blank slide, profit - mostly for the vendor, not so much for you.

So, the good old days really were very horrible old days. For example, when I was at UUNET, in one of our design documents, written by a well-known Internet luminary, the words 'frame relay interim crutch' were used when we were designing the UUNET backbone, and let me tell you, when the words 'interim crutch' are used in a design doc describing your production backbone, life is not good. We had 3Com cards with 2KB of packet memory and an MTU of 1,500 bytes. You can imagine what kind of packet loss we used to experience on our net edges, thanks to MFS. Router upgrades - for example, today people want to upgrade IOS: OK, we test it in the lab, we put it out in the network, and then we roll it out. Those days, it was so bad that we had to actually get the engineers who were compiling and writing the packet-forwarding code online with us on the phone, start the upgrade, discover a whole new set of problems, have them fix and recompile the code on the phone, ship it to us via FTP and then deploy the next revision of the code.

Routing updates - whenever we had a link down and the network propagated the routing update around the backbone, routers would stop forwarding packets. When I was at UUNET, this was actually peculiar to the green-money, blue-money architecture of the US providers. Since we were acquired by WorldCom, our internal circuits were not counted as revenue for the business units providing OC 12 and OC 48 circuits, so what they did was, since we were internal blue money, they would build our circuits for the backbone on the protect side of OC 48 and OC 12 rings, so whenever there was a switch hit or a fibre cut, the paying customers would of course switch over to the protect side and our circuits would just go away.

DEC GIGAswitches. LightStream switches that took 40 minutes to boot. Time-to-market line cards that were ostensibly OC 48 but were actually closer to OC 24 or OC 26 in actual terms of packet forwarding. All those problems are no longer with us. Pretty much any idiot - and I stand here claiming to be one - can stand up a backbone today just by going to a vendor and saying, "Stand me up a backbone." You buy a bunch of routers and circuits and you turn it on. It is a solved problem.

The methodology of how to construct and run a backbone is now a business case. Go to a vendor and ask for a white paper on how to start up a backbone and they will give it to you. This is trivial.

However, the distilled experience I have reduces the entire science and art of network construction down to the following four constraints:

You're constrained by fibre topology, you're constrained by forwarding capacity, you're constrained by the power and space - and this is actually a bigger constraint than it looks here - and you're constrained by the flow of money, the business case. Everything else, no matter what people write in white papers, no matter what people tell you in books, fundamentally comes down to these following four constraints.

And here is the killer - everybody has the same constraints. It is not that Level 3, for example, can buy routers that are faster than yours or cheaper than yours. It is not that they have a speed of light that is faster than the speed of light that you have. It is not that their line cards have more RAM. The question is, in this world, how do you differentiate yourself?

And the first part is latency. Latency matters. Especially as more people move to the web 2.0 world, where more and more applications live in the cloud, latency - the time to get results back - is now a key differentiator. There is a direct revenue correlation with latency.

Cost. Of course, the fundamental problem - how cheap can you sell me service? Ideally for free, if not as close to zero as possible. Notice I did not use the word 'price', because, as we have discovered, selling stuff below cost and trying to make it up in volume, is a joke, not a business plan.

Open networks - and I will go into more detail later.

And the final portion - rich connectivity. How well you're connected is a competitive differentiator in today's world.

And the hardest problem to solve, which nobody can solve for you except yourself, is that your OSS and network management system, your provisioning systems, your auditing databases, these are key differentiators going forward in today's network world.

So, latency. I'm sure people realise that there is a whole slew of content distribution networks standing up, because latency matters. What we do now in many networks is dollar-based forwarding: user-facing traffic is forwarded on different, lower-latency paths than traffic that is not user-facing. And when I say this, it is actually peculiar to people that have a combination of user-facing traffic and internal data that they ship around, so bulk data, for example data centre copies, tends to go on the longer-latency paths. Anything that is user-facing tends to go on the very short latency paths.

And, frankly, some traffic is more important than others - revenue peering traffic is far more important than a backend index push, for example.

One of the things we've realised is that traffic topology changes very dynamically, especially for a company that has a large number of software engineers and a large amount of data centre space, because what happens is the flash route phenomenon, where suddenly one data centre is active and the next thing you know, they have shifted 500 gigabits from one data centre to another data centre because somebody typed in a command incorrectly. This happens more often than we would like to admit.

And of course the key part is that the transaction latency is on the human visibility threshold, in the sense that no matter how quickly your backbone reacts, the key is the user experience, and the user experience is, "When I click on something, can I get a result back right away?" So the control loop for forwarding on your backbone is extremely short. It is in the order of four milliseconds, which is the normal human response time when things start timing out.

So here is an example (refers to slide). User-facing traffic, we will run on the blue path. Anything that is not user-facing, we will run on the gold path, which is the 9-millisecond, 9-millisecond, 10 and 7-millisecond path. And we can do this by using class-based forwarding on all modern vendors. You pick which class of user traffic you want to send over the short links and then you turn on pre-emption, so make the user-facing stuff high priority and bulk copies low priority and then let the network route itself. This pre-supposes that your IGP costing follows latency measurements on the circuits.
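To make the routing idea concrete, here is a minimal sketch in Python of per-class, latency-based path selection. The topology, latency figures and class names are hypothetical (not from the talk), and a real network would implement this with IGP metrics, class-based forwarding and pre-emption on the routers rather than in application code; the bulk case below is just a toy stand-in for pre-empting bulk traffic off the short links.

    import heapq

    # Hypothetical topology: per-link latency in milliseconds (invented numbers).
    LINKS = {
        ("A", "B"): 4, ("B", "D"): 3,                   # short, user-facing path
        ("A", "C"): 9, ("C", "E"): 9, ("E", "D"): 10,   # longer path for bulk copies
    }

    def adjacency(links):
        adj = {}
        for (u, v), ms in links.items():
            adj.setdefault(u, []).append((v, ms))
            adj.setdefault(v, []).append((u, ms))
        return adj

    def lowest_latency_path(links, src, dst):
        """Plain Dijkstra over link latencies; returns (total_ms, path) or None."""
        adj = adjacency(links)
        heap, seen = [(0, src, [src])], set()
        while heap:
            cost, node, path = heapq.heappop(heap)
            if node == dst:
                return cost, path
            if node in seen:
                continue
            seen.add(node)
            for nxt, ms in adj.get(node, []):
                if nxt not in seen:
                    heapq.heappush(heap, (cost + ms, nxt, path + [nxt]))
        return None

    def path_for_class(links, src, dst, traffic_class):
        # User-facing traffic always takes the lowest-latency path; bulk traffic
        # is pushed onto whatever remains once the short-path links are excluded.
        best = lowest_latency_path(links, src, dst)
        if traffic_class == "user-facing":
            return best
        on_short_path = set(best[1])
        longer = {k: v for k, v in links.items()
                  if not (k[0] in on_short_path and k[1] in on_short_path)}
        return lowest_latency_path(longer, src, dst) or best

    print(path_for_class(LINKS, "A", "D", "user-facing"))  # (7, ['A', 'B', 'D'])
    print(path_for_class(LINKS, "A", "D", "bulk"))         # (28, ['A', 'C', 'E', 'D'])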

So, second portion - sorry, I was like, my laptop has gone insane, sorry - so, here's a quote from Sean Doran back in the day that said, "People that survive will be able to build a network at the lowest cost commensurate with their SLA." This is a very simple sentence. It is amazing how many people lose sight of this basic fact. You are here not to engineer a network because it is cool. You are here to engineer a network that meets your SLA guarantees at the lowest possible cost. Seems simple but, like all simple things, it's very hard to grasp, especially for management.

What Sean did forget to add was that this only works in a free market. If you have a distorted market, well, there is no competition, or there is distorted competition, in which case, ignore this advice, because you're good to go.

So what have we learned trying to make networks cheaper and that still meet our SLA?

So the first thing we have to drum into management is that running a business based on a model that predicts 50% or more margin, like the good old days, is no longer valid. Think of trying to run a business at 10% margin on revenue. These are margins that you experience in brick-and-mortar stores like Wal-Mart or other retail. You are in a retail business. Your margins have to just barely scrape above your cost. And this goes directly back to the fact that everybody can buy the same equipment that you can buy, so your cost basis across all your competitors on a capital level is pretty much the same.

The other thing we've learned is that the killer app is bandwidth. Give lots of bandwidth at a cost-plus model and you will do OK. People don't want ever-higher services. Most people just want large amounts of decent bandwidth and they will put their own utilisation on the edges and do what they want with it. This is a lesson that several companies I previously worked at forgot, and then we were bitten and we failed. And, of course, there is an entire section on OSS and NMS.

The other thing is that, if you have a failing business, dumping more money into the failing business - like, for example, under the US Chapter 11 laws - does not necessarily make that business suddenly viable again. And there is a very good quote from Warren Buffett, I believe, up here on the slides for you to read later.

So, what are the key components of cost? The number of boxes is a large component, because you have to touch these boxes, you have to upgrade these boxes; whenever we have to do an upgrade on a box that faces a customer, you have to make sure that you get the customer notification out. Your people have to go talk to the box, bring it down, manage customers, bring it back up. It is expensive. The more policy you put on the box, the smarter your people have to be and the more obtuse knowledge they have to have when touching a particular box to prevent an SLA violation.

Think about the 3am on-call. It's 3am, your network has melted down. You pick up the phone and call an engineer. Do you think an engineer at 3am - if your engineers are anything like ours, they're mostly drunk at 3am and will not be able to do anything. Make sure that your network is simple and understandable by normal people.

Every group has a bunch of superstars who would love to design what they consider to be an exhibition of their skill. Fire those people, because those people are 2% or 3% of your total networking base. The vast majority of your people are people like you and me, and when we get up at 3am and look at a huge morass of, like, 500-stanza route maps, I'm like, "I can't deal with this." I just go back to sleep. And, of course, one of the issues is that even though our routers are very good today, if you add idiotic policy and huge amounts of tables, you will have packet-forwarding problems. This has happened.

So that was the cost basis. Now, open networks. There is an entire body of work called 'real options'. I will not bore you by going into the details of it here, but basically the distilled knowledge from the network application of real options is the following:

Market uncertainty is high. Just because you make a new product does not mean that the product will sell and, as we have found out, for example, with people like YouTube, just because you have something that looks idiotic at the surface does not mean that it will not gain traction and become extremely popular.

The thing is you do not understand what makes things popular and not. The strangest things become popular and the sure-fire, 'this product will never fail, I guarantee my job on it' will almost always flop badly. There are whole bunches of sales VPs walking around the Virginia area chanting this mantra - "But it was a sure shot" - and they're out of jobs now.

So basically what we are trying to do here is make your network great, on a cost-plus basis, at transporting bits. We are in a commodity business. Have no illusions about that.

What can you do? Many people in the commodity business are making lots of money. For example, oil is a commodity business and yet there are certain large oil companies who are making record profits. You can make money in the commodity business. Just don't try to jazz it up by pretending that it is not a commodity business.

So make your network operate on a cost-plus basis, make it open. Let other people utilise the network for innovation at a low cost, because, if it fails, who cares? I'm just out 50,000 or 10,000 dollars, my idea failed, I'll just go try again. The cost of experimentation in today's world is very low.

So if you have a cost-plus network and you foster a virtuous cycle - for example, the bandwidth utilisation of these video-sharing sites has exceeded anything I've ever seen and if you are operating on a cost-plus basis, you will be able to leverage that into higher profits for yourself. Again, fundamental truth but apparently most people don't seem to get it.

So this is - at this point, typically the salespeople come running in and shout, "We want to climb up the value chain." And that is fine. You can climb up the value chain, just realise that your network is not suited to climbing up the value chain, because you need to decide what business you are in. Are you in the business of providing transport or are you in the business of climbing up the value chain? Because one scales linearly with the number of people providing the service, so you have people like IBM Global Services, Accenture, the consulting companies, who are up the value chain, but their cost basis also rises because, if they take on a large amount of contracts, they will need to hire a large number of people.

If you are on the network side providing a commodity business, what you want to do - since the marginal cost of providing the core is zero, or tends to zero - is provide as many bits as you can on the lowest cost basis possible, which means ensuring that you have the lowest SG&A, which is the human capital cost in your network. Normally human capital cost is about one-third to one-fourth of your cost basis on most modern networks.

So realise that transport and services are completely disjoint businesses, with completely disjoint pressures on growth and completely disjoint pressures on cost. You cannot push them together and 'climb up the value chain' while trying to provide what is fundamentally commodity transport and trying to layer services on top of it. You will get killed. Many people are learning this lesson now, because the laser-like focus of the low-cost transport providers like Cogent has put a large amount of hurt on people that were 'climbing up the value chain' with ever more hyped-up services.

And the other thing is rich connectivity. This comes under the rubric of peering. As for the Tier-1 club, chances are nobody will ever get into the Tier-1 club again. It's pretty much over. What you want to do is not use Tier-1 status as the differentiator but use having rich connectivity around the Tier-1s as a differentiator.

So this is a nice quote by Joi Ito, which says that people always seem to overvalue content. What people really want is connectivity, because then they can foster their own content among people, among their group of friends. People don't want pre-packaged content that you will deliver to them. People want to connect to people and talk. Voice is still the killer app on mobile, for example, no matter how much push the telecom people put on data. People just want to pick up the phone and talk and therefore connectivity dominates content in mobile. This is true for the Internet, even though it doesn't appear to be that way, because e-mail, for example, is connectivity. Most people would not give up e-mail before giving up something else, because connectivity is the key.

So what does that have to do with networks? The Internet is a network of networks, so basically how well you connect with other networks that comprise the Internet is a competitive differentiating factor. So consider the real value of your network - your network alone has some value, right? Say I have a network in India and I can connect people in Mumbai to people in Chennai. That's good for people who are in Mumbai or Chennai. However, if somebody is in the US and wants to talk to somebody in Mumbai, the fact that you now interconnect with networks that are in the US and in the Middle East gives that particular person greater value from your network than if that person could connect only to cities in India. So it's a virtuous cycle of connectivity, and you can look up Metcalfe's law and Reed's law, which state that connectivity in and of itself is a large value for a network.
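For reference, the two laws he cites are usually stated as follows (this is the standard textbook form, not something taken from the slides): Metcalfe's law values a network of n users in proportion to the number of possible pairwise connections, n(n-1)/2, which grows roughly as n^2; Reed's law counts the possible sub-groups that can form, 2^n - n - 1, which grows roughly as 2^n. Either way, the value grows much faster than the number of attached users, which is the point being made about rich interconnection.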

So, I hate to use this, but 'peering' is heavily overloaded. When people say 'peering', what they really mean is settlement-free interconnect. This is a losing battle - I've been trying to fight this battle for four years and it just doesn't work - so, anyway, I'm still going to bring it up. When people say 'peering', use the term 'SFI, settlement-free interconnect'. It will make me happy, and you want me to be happy. The other part is that settlement-free interconnection is a business relationship; it is not a technical problem. Most ISPs see the approach to SFI as a technical issue. It is not. This leads back to the point I made, which is: follow the money. Interconnection is a business problem. Treat it as a business development issue, not a technological interconnection issue. You will have much more success.

So this is a slide that basically explains peering versus transit. If you are buying transit from ISP A, they will send your routes to the other networks, the peers that they interconnect with. However, as you notice, ISP B has an interconnection with ISP A, and ISP B also interconnects with ISP C. But ISP B will not send the route it learned from ISP A to ISP C, because you will not transit routes that you learned over a peering connection. Once you understand this, you understand why you have to buy transit, and why it is important to interconnect with as many people as you possibly can, to avoid partitioning your network and to make sure that people have full connectivity.
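A minimal sketch of the export rule being described, in Python; the relationship labels are the standard customer/peer/provider ones, and real BGP policy is of course expressed in router configuration rather than application code. The rule is: routes learned from a customer are announced to everyone; routes learned from a peer or a transit provider are announced only to customers.

    # Relationships are from the exporting ISP's point of view:
    #   "customer" pays us, "peer" is settlement-free interconnect, "provider" is who we pay.
    def should_export(learned_from, export_to):
        """Only customer-learned routes go up or sideways; everything goes down to customers."""
        if learned_from == "customer":
            return True                    # customer routes are announced to everyone
        return export_to == "customer"     # peer/provider routes only go to customers

    # Hypothetical scenario matching the slide: you buy transit from ISP A,
    # ISP A peers with ISP B, and ISP B also peers with ISP C.
    print(should_export("customer", "peer"))   # True:  A announces your route to its peer B
    print(should_export("peer", "peer"))       # False: B will not re-announce A's route to C
    print(should_export("peer", "customer"))   # True:  B's own customers still see the route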

So Bill Woodcock coined the donut model. This is the model everybody in the future will fall into. You have the central Tier-1 core. You are in the green (on the slide). What you will do is peer around the Tier-1 core with all the other customers, the ISP customers, of the Tier-1s, and so increase your peering edge. This will allow you to reduce cost, because you're no longer paying the Tier-1 providers for transit to other Tier-2 ISPs - you will connect up with them directly - and it will reduce latency for your customers. And peering connectivity is a differentiating factor, because there are several RFPs that come out every month that ask for your settlement-free interconnect details. People are getting savvy now; they want to know how well you're connected with other networks, because all their traffic is not necessarily going to be on your network. They will need to talk with other networks as well.

So this is the connectivity end state. There will always be some transit. Most people will be buying. That's the transit at the top. The darker black lines on the right-hand side - that is your peering connectivity. Get as much of that as you can because your cost on the yellow will reduce directly in proportion to how much peering you manage to get on the side and then, of course, you get money from people that you're selling transit to, your customers. So improve your connectivity, make life better for your customers, enable more people to use the net, you will gain, everybody will gain and, of course, attend peering forums.

This is the hardest lesson to learn. Every network I have worked at, this was the biggest single point of pain that we had in the network. The key fundamental is that the database is the one source of truth. Not the network. This is almost completely opposed to how things work in most networks. In most networks, what's on the router is the source of truth. This is wrong. Not that that doesn't work for you, but, if you want to grow, if you want to reduce your cost basis, you have to move from network-authoritative thinking to a database-authoritative thought model. What this does is allow for faster provisioning, and provisioning is a key differentiator. If you cannot turn up a customer in five minutes, in ten minutes, from the time the service is delivered or they plug in physically, you are losing revenue.
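As an illustration of database-authoritative provisioning, here is a minimal sketch; the schema, the customer record and the template are all hypothetical, and the rendered stanza is pseudo-configuration rather than any real vendor syntax. The point is that turning up a customer becomes a database insert plus a regeneration, and the running configuration can always be diffed against what the database says it should be.

    import ipaddress

    # Hypothetical source of truth; in real life this is a proper database, not a dict.
    CUSTOMERS = [
        {"name": "example-isp", "router": "edge1", "port": "ge-0/0/1",
         "vlan": 120, "ipv4": "192.0.2.0/30", "asn": 64500},
    ]

    # Pseudo-config template (made up for illustration, not a real vendor syntax).
    TEMPLATE = (
        "interface {port}.{vlan}\n"
        " description CUST:{name}\n"
        " ip address {ipv4}\n"
        "router bgp local\n"
        " neighbor {neighbor} remote-as {asn}\n"
    )

    def render(customer):
        # Derive the customer-side neighbour address from the assigned /30.
        net = ipaddress.ip_network(customer["ipv4"])
        neighbor = str(list(net.hosts())[1])
        return TEMPLATE.format(neighbor=neighbor, **customer)

    def configs_for(router):
        """Regenerate every customer stanza for one router straight from the database."""
        return "\n".join(render(c) for c in CUSTOMERS if c["router"] == router)

    print(configs_for("edge1"))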

It allows you to have a consistent network. This goes directly back to the 3am conference call. If every router in your network looks different and you wake up in the morning and you're half drunk and you go look at the network to see what's wrong, you will have a far longer troubleshooting time horizon than if the network looks pretty much exactly the same and you can say, "Oh, this particular peer is down, or this particular iBGP session is down, why is that?" rather than not even knowing what route mesh or group your particular router is supposed to be in.

It allows you to audit the network. This is much more - how should I put it? - important than you think. Because we've had persistent loops knocking customers out for days at a time because we forgot to fully mesh iBGP on a new router that we turned up. Basic, simple things like that. And of course the customer is very irate, and we lost many customers because of those kinds of routing loops. All these kinds of things combined give a better quality of service to the end-user, and that is basically what people want: a network that works, and not to get treated badly. They want something that works consistently all the time, for a consistent experience. This will enable you to give it to them. And, of course, this is a competitive advantage because you are reducing your SLA pay-outs due to misconfigurations.
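The iBGP full-mesh omission he mentions is exactly the kind of thing a database-driven audit catches before it takes customers down. A minimal sketch, with an invented inventory of routers and their configured iBGP neighbours:

    # Hypothetical inventory: router -> configured iBGP neighbours (invented names).
    IBGP_SESSIONS = {
        "core1": {"core2", "core3"},
        "core2": {"core1", "core3"},
        "core3": {"core1", "core2"},
        "core4": {"core1"},            # newly turned-up router, mesh incomplete
    }

    def audit_full_mesh(sessions):
        """Report every missing session in a full-mesh iBGP design."""
        routers = set(sessions)
        missing = []
        for router, neighbours in sessions.items():
            for other in sorted(routers - {router} - set(neighbours)):
                missing.append((router, other))
        return missing

    for a, b in audit_full_mesh(IBGP_SESSIONS):
        print(f"missing iBGP session: {a} -> {b}")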

So all that, distilled down, comes down to the following design principles. Complete the network design life cycle in-house; do not outsource, even though it may look tempting. If you are outsourcing the network design, you are a marketing company, not a network operator. It is fine to be a marketing company - many people who are marketing companies are doing extremely well - they're just not in the business of providing networks, and any person that buys a network service from them will get what they deserve. I have yet to see an ISP flourish in a free market - and the key word here is a free market - that does not hold this as core DNA. This is the core concept of any large ISP: you must be able to design, implement and operate the network in-house. Do not outsource, not to vendors, not to consultants.

So basically, the rest is simple - diversity of components and paths; there should be no single points of failure. This is harder than it looks, especially on the edge. Try to keep a simple routing topology. The BGP-free core - I will go into that in more detail later. And of course have survivability reviews every quarter: get a bunch of your sharp engineers together and say, "I'm taking out this particular node, I'm taking out this particular fibre, what happens to the network?" This, of course, assumes that you have a good NMS and OSS that collects traffic flow stats from your network so you actually know what your network is doing. This is also surprisingly hard for most people to figure out. Most people do not know what the network is doing on a systemic basis. They know how links are loaded. That is trivial. MRTG will tell you that. But do you know the edge-to-edge paths, so that if I take something down in the middle of the network, you know how traffic will reroute around that particular failed component? The answer in almost all cases is, "I have no idea." This is bad.
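A survivability review like this can be mocked up very simply once the topology lives in a model rather than only on the routers. A minimal sketch with an invented four-node topology; a real review would of course also replay measured traffic matrices from the NMS across the failed topology.

    from collections import deque

    # Hypothetical backbone topology as an adjacency map (invented node names).
    TOPOLOGY = {
        "del": {"bom", "maa"},
        "bom": {"del", "maa", "sin"},
        "maa": {"del", "bom", "sin"},
        "sin": {"bom", "maa"},
    }

    def reachable(topology, start, failed):
        """BFS reachability with the failed nodes removed from the graph."""
        if start in failed:
            return set()
        seen, queue = {start}, deque([start])
        while queue:
            node = queue.popleft()
            for nxt in topology[node] - failed - seen:
                seen.add(nxt)
                queue.append(nxt)
        return seen

    def survivability_report(topology):
        """For each single-node failure, list which surviving nodes get partitioned."""
        nodes = set(topology)
        for failed in sorted(nodes):
            survivors = nodes - {failed}
            start = min(survivors)
            cut_off = survivors - reachable(topology, start, {failed})
            print(f"fail {failed}: partitioned = {sorted(cut_off) or 'none'}")

    survivability_report(TOPOLOGY)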

So invest in your engineers. This is the second part of having the ability to architect, design and operate in-house. The key part is not in the metal. It is not the fibre. Everyone has the same metal. Pretty much everybody has the same fibre. What will differentiate you is having smart engineers, a good operations team and a good tools team - and tools are far more important than network engineers, and I say this as a network engineer.

So try to keep it as standards-based as possible, because you will change your vendors, you will change your network, you will change what the network looks like. If it is standards-based, the chances are your transition will be - I wouldn't say painless, but less painful than if you're running something non-standard as your core IGP. When you transition away, it will not be good.

Try to be consistent. The bean counters love trying to optimise the network down to the finest cent. They will say, "OK, this particular node or this particular PoP, we will only put one router in and put it in the full iBGP mesh and, hey, we've saved, like, $50,000." Which is good, but that makes the network inconsistent, which is directly opposed to running an overall low-cost network. What you have managed to do is die the death of a thousand cuts. You've saved $50,000 here and $40,000 there in capital, because capital shows up directly on the bottom line. What you have done is increase your op-ex cost overall on an operational basis, and op-ex is harder to see. Capital items have line items you can see - router, $50,000; line card, $100,000. But the six engineers needed to maintain a 24/7 rotation so you can operate this network do not show up on the capital line items, so accountants tend to optimise for stuff that they can see, not stuff they can't see, and then the answer is: our network engineering department is too expensive, why not outsource it to a vendor? Done.

So try not to be too innovative, because if you try to be too innovative you will bleed - and this comes from a person who has been on the cutting edge, and I have the bleeding to prove it. Automation. Make sure that your stuff is consistent and you can write standard MoPs - methods of procedure - for almost anything you do, because you can take those MoPs and give them to new people and they are more productive. On an average network, it takes three to six months for a new engineer to be productive. That is six months of stranded SG&A that you're not getting any use from, and since the average employee lifetime is 18 months, you only have 12 productive months for an engineer before they quit and go somewhere else. What you want to do is minimise the time to operational efficiency, so make sure the network is consistent and you can push these basic tasks down as quickly as possible so you can get your engineers working and productive as soon as possible.

So, when to touch the network. This again goes back to simplicity. Try to touch the network based on simple, large metrics. They will almost always be good enough for what you're trying to do. The more over-optimisation you do, the more brittle your network becomes, and the more brittle your network becomes, the higher your overall operating cost will go.

So here are the three most important lessons. Invest in good software tools developers. If you have the choice of hiring a smart network engineer or a really sharp software developer who will write your network automation for you, hire the network automation developer. This is the fundamental truth of everything that I have learned. If you forget everything else the second I'm done talking, remember this - hire software people to automate your network for you. Engineers are prima donnas; they're loud, they're expensive, they're hard to manage - I know, as an engineer myself. Software people will return more value to your organisation in the long run than any network engineer you hire.

Invest heavily in processes and automation and ruthlessly stamp out network-authoritative thinking. The database is the master copy of what your network should look like. You should be in the state that, if somebody comes in and deletes your entire network, you type in a set of commands and you can regenerate the entire network from scratch. This is not idle speculation. I have worked at companies where you could do this. At UUNET, Bill Barnes wrote a script, a set of programs, that could recreate the entire configuration of the network - the global, largest ISP in the world at the time, and probably still is in terms of reach and customers - right down to the iBGP, the LSP mesh, the traffic engineering parameters, the Cisco and Juniper configs for the entire network, in 20 minutes. Verio - I think Randy is here, yes he is - Verio has done the exact same thing.

It just requires a very strong force of will to actually make this happen, because engineers are lazy. They don't want to put the effort up front into doing this and this is hard - make no mistake. So engineers will say, "I don't want to do this. It prevents me from doing my job." They are lying. And I speak this as an engineer. I have been through the trenches. I know exactly what they're talking about. Invest in process. Invest in automation. Stamp out network authoritative thinking. You should be - the end goal should be that you can regenerate your network from database at any given point and nobody would notice.

Multiple backbones - this started back in 2000 - the idea of having a multi-service core. Some people say, "I don't want to share a common control plane with the Internet backbone," and they are probably correct. So what people talk about is building application-aware backbones and stacking them on top of a single packet transport. It looks extremely good on paper. It looks even better on PowerPoint when you're giving it to your senior management. There are nice coloured pictures, lines going in to the multi-service core and lines going back out. I know. I have drawn some of them. This is the typical spiel that you use to sell it to management: we have a requirement. The only workable part of the spiel is the requirement not to have a common control plane. Everything else is: you can use cheap boxes to build a transport fabric, the MPLS switches.

We can overlay multiple networks on to the same common packet transport core and we will remove expensive equipment from the core and replace it with cheap MPLS switches. This will, of course, fail miserably, because what happens is that the cheap packet transport box will never come to fruition, simply because the entire set of logic about removing the expensive forwarding bits and switching on MPLS is a lie: no matter what you do, somebody will want to use that box as IP transport, so you will be stuck with paying for all the longest-prefix-match logic, all the DRAM and SRAM for forwarding look-ups anyway. You've managed to add yet another layer of expense into your network cost basis.

However, one thing that people are doing now - and this is, again, one of the changes caused by Cogent - is that they are now building underlay backbones using cheap equipment, Ethernet switches, and using that as the 'cheap backbone', and the reason is that the cheap backbone is good enough for almost all your customers and those customers comprise all of your traffic. So there are certain large ISPs in the US who built a very high-touch, very good-looking, very multi-service core and are now buying equipment by the truckload and standing up a 10-gigabit core as an underlay because, hey, Cogent is killing them by selling service, raw IP, really cheap. They are based in Denver, if you want to investigate more afterwards.

So, in the real world, IP traffic will dominate. So design your network for cheap IP transport. Design for opaque bits - this goes directly against packet inspection, IMS and additional complexity. Treat your bits as opaque. You don't care what they are. Forward on the headers, call it done. Operate on a cost-plus basis. You will make money when the rising tide of IP traffic raises your boat. And walled gardens will fail in a competitive market, because as soon as you try to add value, people will say, "I don't want you to add value for me. I want to add my own value. Shut up and give me the cheapest IP transport you possibly can." And they will go to somebody else.

That was it.

Questions and answers?

SPEAKER FROM THE FLOOR:

I'm a project manager with Cisco Systems. That was great, Vijay. I enjoyed it. You talked about open network policies, and in that you said that service providers should build their networks based on a cost-plus model and consider their business as a commodity market. Has any service provider looked into that and built a network on that basis, or is it still in the initial stages of your discussion?

VIJAY GILL:

Well - the question was basically: has any service provider built a network on a cost-plus basis? The answer is, before the vendors ran amok, pretty much every network was on a cost-plus basis. Rick Adams built a small ISP in Virginia called UUNET on a cost-plus basis, and until I left, it was still operated on a cost-plus basis. Much smarter people than me - Randy, Rick Adams, everybody - had the same idea. Build a network that can scale independently of the number of people needed to operate the network. So what that means is that you turn up customers, you turn up bandwidth, without having to increase your cost basis except for the capital portion of your spend. The capital portion of your spend will fall off your books in three years. Everything else after that is pure gravy.

So make it as simple as possible, build it on a cost-plus basis, so ISPs like UUNET have built networks on a cost-plus basis. Our network, Google, for example, is on a cost-plus basis because we leave the innovation to our customers, which are basically the software developers running stuff in the data centres. All we do is just provide an opaque pipe that transports data centre bits from one end to another data centre or takes a bit from a customer back to the data centre and vice versa. That is all we do. That is the business we are in and we have grown our bandwidth by orders of magnitude without increasing our staff by similar amounts.

RANDY BUSH:

Randy Bush, IIJ. Great preso. Two things.

As you know, I've always been an advocate of open peering - and excuse me for using the word 'peering'; I will not let the marketing people chase me around the dictionary, but we mean the same thing. As we are approaching the v4 free pool run-out, and as we are anticipating issues with essentially driving the routing table size up, besides the routing table churn - and we do not have any great magic secrets about reducing churn - richer topology will mean more load on routers, and so there is a trade-off and a danger we have to be aware of for the smaller ISPs when we construct a richer topology, because we're going to drive their capital cost up to acquire the routers that can do it, and we'll drive the capital cost up for the multihomed enterprise, who also wants to be in the default-free zone. And that's a trade-off that this group, particularly in the address allocation and v4/v6 NAT transition environment, needs to be very conscious of, and I think it's important. I still very strongly support the open peering idea and argue for it in my own company, etc. But just be aware that, like everything else, there are trade-offs and that's one of them.

Secondly, I just want to heavily underscore the silliness of the multi-service core on MPLS. As you know, I've ranted against it for some years. But the purpose of it, really, isn't cost reduction. The purpose of it is so that the IP network people can cannibalise the frame relay and ATM people, so they can own the network politically. Frame relay and ATM are antique and silly things. They have one nice attribute, they're profitable, and what you do when you cannibalise them to put them in the multi-service core, is take the operating profit and turn it into capital expense to the vendors and now you know why the vendors market this very heavily.

VIJAY GILL:

Yes. Thank you, Randy.

I think that's it. Thank you very much for giving me the opportunity to talk to you and I will be available for questions. I will be here through Friday.

APPLAUSE

GAURAB RAJ UPADHAYA:

Thank you, Vijay. Next, we have Kurt Lindqvist from Netnod, Netnod/Autonomica or whatever else he is involved with.

Kurt is going to talk about the architecture of the Internet as critical national infrastructure.

While Kurt is setting up his equipment, I have a couple of announcements to make.

Please use the microphone if you are asking questions. There are four microphones in the room. This session is being audiocast and broadcast, so make sure that you state your name.

Morning tea and afternoon tea is outside the meeting room. Lunch is at Tennis Plaza, which is outside this hall, as you exit the doors on your right.

There's a MyAPNIC and policy flash demo all day at the APNIC services lounge Helpdesk.

There is an APNIC 24 feedback form on the APNIC 24 website. If you give your feedback - and, you know, there is going to be a lucky draw on the feedback, and the prize is an MP3 player - so you might as well give your feedback on the website.

We also have a call for papers for lightning talks now available. If you want to speak up for ten minutes or less in the lightning talks, they will be held tomorrow evening or late afternoon, you can talk to me or Philip Smith or Hideo Ishii here. Or send an e-mail.

The CFP has already been sent to those who registered for the conference before last week. Or you can go to the local website at conference.sanog.org, where you will find the call for papers as well as the slides from the tutorials, workshops and the conference.

And, of course, in the evening today, we have colour and culture of India reception at the Tennis Plaza, which is, again, as you exit the ballroom on the right, and there is a big covered area where we have dance, music and whatever. The reception starts at 8 pm.

Thank you.

Next is Kurt Lindqvist.

The architecture of the Internet as critical national infrastructure - Kurt Lindqvist

KURT LINDQVIST:

Thank you, Gaurab.

So, before we start, let me first say that I'm especially honoured to be here today, because I was also the keynote speaker at SANOG 1, and it's been fun to follow it over the years and see it develop into this SANOG 10 conference that we have here today.

So I want to talk about the Internet as critical national infrastructure. And while this is maybe not going to have as many direct operational points as Vijay's talk, a lot of the thoughts in here came from a study we conducted in Sweden. The original baseline of the discussion in the study was: if you had a reliable infrastructure that could be used by some sort of national means or national consensus, what would be the benefit or added value for service providers or for you as end-users? The thinking behind that was that end-users generally tend to be the ones that benefit from a stable network, no matter what services you're trying to achieve.

And that's basically what I'm going to try and base this on.

First of all, this Internet thing - why do we care if it works or not? An observation that's pretty easy to make is that actually you shouldn't care too much whether this Internet works or not. In the study, one of the things we wanted to see was what dependencies were there, and we split it between working systems and working procedures and processes.

One memorable event was at the largest hospital in Sweden, in Stockholm, where we had all the heads of the departments there, and a lot of the systems tend to have external dependencies in terms of DNS, funnily enough, which basically means that if they didn't have internal working DNS, they couldn't run some of the lab systems and some of the support systems they have internal to the hospital, and a lot of the doctors and heads of departments saw this as a problem. Except one. He was sitting there laughing at his colleagues and I finally had to ask him why. It turns out he runs the emergency ward. He says, "When somebody is dying in front of you, you don't really care if you have the completely accurate data in front of you or not; you do what you can." That's the important lesson to keep in mind. When it comes to some of the dependencies in the network, you need to split this between processes and systems. Your processes should work, even if the support systems don't. That's the real benefit.

Anyway, that said, a lot of the Internet that we use today has turned out to be - you know, we have dependencies in our day-to-day systems, we have business-critical functions, and some of us make our living based on these things working. If they don't work, we don't make as much of a living, if any at all. This means that there are some systems that you need to pay attention to and care for, to make sure that they work. And, to come back to something Vijay said in his talk, if you bought an SLA of a certain level, you expect to get that SLA. If you don't, you go somewhere else, and that means there is a dependence on having a reliable infrastructure, at least as reliable as what you sold.

And some points that are worth keeping in mind as we go along: why was the Internet successful? How come we have these systems depending on the Internet? Isn't that sort of a silly thing? But - and this is my highly personal belief - the Internet turned out very successful probably for a number of reasons, but most of it is that very few of the services that we take for granted today were invented by someone in a central network to be sold to us. If we look at some of the telecom products on the market, most of these came from some sort of user innovation - whatever a user is, some sort of enterprise in this case - coming from the edge and being sold on the network. And we have SMS, file sharing, peer-to-peer networks, Skype and things that we sell as a service - a lot of these products and these markets were actually, you know, discouraged by the providers.

When SMS first took off in Europe, it was seen as a nuisance by the providers and they tried to price it out of the market to make people stop using it, until they realised that people saw the feature as very useful. On the other hand, a lot of the products that the telcos invented hoping that we would buy them, and make a margin out of, have been complete and utter disasters. MMS, sending pictures, was priced so high that no-one used it; video calls in 3G - all of these things were designed by the telcos in the utilitarian view that people will use it because we tell them it's cool. It turns out it wasn't that cool, so people didn't use it. Whereas the features that are becoming very popular, like YouTube, are something that sprang out of the users of the network and not the centralised control of the telcos.

And the ability to create new products and new features has been part of the reason why this network has become so successful.

Also in the early days, maybe more so than now, a lot of the networks formed - and 'networks' here is a soft term, not meaning the physical infrastructure and the cables in the ground; I mean as much the connections that allow people to connect, to get to know new people, the social networks, the contacts that allow you to form new enterprises, new ideas and new directions - and this interaction again led to some of these proceedings and processes being moved from the physical world into this virtual world of the Internet.

One of the examples in this study that we did in Sweden: it turned out that a lot of the contacts that people used to have by going to the coffee machine or having a discussion on the phone had been replaced by very real-time interaction through chat channels - you know, IM or MSN Messenger or whatever they're called - and a lot of this discussion moved because it was now much more direct and much more realtime, and added realtime value in the sense that people were able to resolve issues or answer questions much, much quicker than they used to. And again this created a very invisible form of dependency on these systems actually working, because people apparently no longer have the ability to go to the coffee machine if it doesn't work. Funnily enough.

Another thing was that the Internet, when it first sprang up, had a much more cooperative environment than I think the traditional services we're used to, peering being one of these examples. For the first time, you had free peering, people exchanging traffic for the Internet to actually achieve this connectivity and this provision of services to the end-users, while the telcos had these very rigid systems of interconnect based on tariffs - and there was no lack of trying by the telcos to impose this on the Internet as well, and luckily they failed.

(Pauses)

Every now and then I have to make sure the transcriber is keeping up with me.

The ability to allow the end-users to create new functions and systems was a critical component and I believe that this, again, led to this trust towards the network. You had people who had no previous technological knowledge or experience and they put a lot of trust in this, and it's fun to see how easily people take to this and start using these services without questioning it. Most people just believed that these would work because everybody else uses it, and it created an implicit trust around the network, and that's a trust that we have to care and cater for quite well. And the reason why I say that is because this trust has kind of led to this infrastructure being a critical component of day-to-day society. How critical and important it is depends on where you are and the level of usage of the Internet and computers in various nations and societies, but there is still some implicit trust there.

And one example of this is the now highly publicised attack on Estonia. Estonia is an interesting example because, when it gained independence in the early '90s, they had a very, very fast jump-up process to build government systems and government interaction with their citizens, and they had to provide not only government but every service - banks, post offices, movie theatres, etc. They all had to create some sort of interactive system towards the users and their citizens, basically in no time.

And they literally skipped 150 years of bureaucracy that the rest of us had to go through. They never had the legacy of paper; they never had this legacy of having an entire citizenship whose livelihood depended on bureaucracy actually working. They could skip this step and go straight to the streamlined process where most of these services can be provided online, electronically. This gave them a huge competitive advantage over the rest of the world, because they have a much lower cost of providing these services.

But the interesting thing was they created an implicit trust that the end-users put into this network and these services. And when the attacks happened on Estonia, one of the large implications wasn't so much the fact that things were down but the worries were that this trust was being hurt, that people no longer could access their banking, people could no longer buy their cinema tickets, etc, etc.

And part of this problem is an architectural problem with the Internet. When I go to a bank, I expect that the guy who sits behind the cash register or behind the desk won't try to rob me. When I go to this online bank, I'm not so sure. That's part of the implicit trust. There are safeguards that will protect me against this, if they are being used, and most online banks are very good at using this. There was a case of an Indian bank the other day, but this implicit trust, for some reason, doesn't seem to translate into government. I don't know why.

When we did this study in Sweden, out of 460 government agencies, there were only 360 with a website, funny in itself, but out of the 360, the ones providing a signed webpage had self-signed certificates, so I have no idea if the government agency was who I wanted to talk to. It could have been an imposter, anyone pretending to be this government agency. And some of the information being shared is fairly critical. In the last election we had a year ago, the realtime results of the election provided to the newspapers and news agencies were delivered over the Internet on a completely unsecured, unsigned webpage, and basically anyone could have hijacked the route or injected data into this web server and no-one could have told. You could have altered the election result, anything. Impossible to know. Interesting enough.
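For anyone wanting to repeat this kind of survey, here is a minimal sketch of the sort of check involved: it simply asks whether a site's certificate chain validates against a standard trust store, which a self-signed certificate will not. This is an illustration added for this transcript, not the study's actual tooling, and the hostname used is a hypothetical placeholder.

```python
# A minimal sketch (not the study's actual tooling) of checking whether a
# site presents a certificate that a standard trust store accepts. A
# self-signed certificate fails this check, which is exactly why a visitor
# cannot tell whether they reached the real agency.
# "example.gov.se" is a hypothetical hostname, not an agency from the study.

import socket
import ssl


def certificate_verifies(hostname: str, port: int = 443) -> bool:
    """Return True if the TLS chain validates against the system CA bundle."""
    context = ssl.create_default_context()  # verifies chain and hostname
    try:
        with socket.create_connection((hostname, port), timeout=5) as sock:
            with context.wrap_socket(sock, server_hostname=hostname):
                return True
    except ssl.SSLCertVerificationError:
        return False  # self-signed or otherwise untrusted certificate


if __name__ == "__main__":
    print(certificate_verifies("example.gov.se"))
```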

But, again, you know, people have this implicit trust, and unless we get very much better at protecting this trust, we might end up having it hurt permanently.

And one of the things we also got from this study was that there is this belief that the Internet is just there for non-critical things, that there are no external dependencies or very few external dependencies. But what we found was that, as people are getting a lot more versed in the Internet, they are moving a lot of their implicit or explicit communication channels and dependencies over to the Internet, which is just one of these things. There is a long list of things we found that were no longer being handled anywhere other than on the Internet.

An interesting example is that the Swedish Government released their Budget, which is a fairly unexciting event by most accounts. It just happens that they released the actual document, not the webcast, on the Government website like any other press release. The Government website crashed. Just interest in the national budget of Sweden was enough to take it down within hours. I don't think it's a large-scale thing of the population wanting to know what they're using the money for; it was just a few people wanting to get this file. And it crashed. Now, that's interesting because I have no idea how these people got the budget before - they probably got it mailed to them and waited for a few days, I have no idea - but this system wasn't built to scale even for a fairly moderate interest by citizens in its information. And this, again, is a sign of how traditional means of communicating with governments - I'm just using governments as an example because that's what we studied, but I think it would be true for any form of use - when you move this over to the Internet, implicitly you have to take into account that the shift is happening and make sure you build systems that work in order to cater for this trust.

Critical can mean a lot of things and I'm going to discuss this, and realise that it can be critical as in crisis management, which is basically the informational systems used in terms of, um, providing information in a crisis, either to the people handling the crisis or information being provided to the citizens about what is happening around them. One interesting thing - there are two interesting observations, again - from the study was that we noticed in a large-scale exercise that was conducted, the crisis management exercise was for government agencies and the plot was a terrorist attack on a dotcom, and it was a fairly small exercise with just a few hundred people participating. The only problem was that the phone number list for the people participating was put on a public website, on the website that was supposed to be used in terms of crisis management. That website crashed. That was the public-facing website, instead of having a dedicated system for this. Again, they didn't really realise the scale. What made it worse was that, when it crashed, this was publicised, which led to an increasing number of people going to a website that didn't work. People just wanted to verify that it didn't work.

And the other thing is that, when it comes to crisis management, it's also been noticed with recent news-breaking events - the disaster that hit a large part of Asia, with all the casualties among Swedish tourists and people from other parts of the world - that one of the major sources of information turned out to be the web. People no longer waited to get the morning paper in the mail or listen to radio or TV news. People went to the news websites online, and not only the news sites; they wanted believable information, so they went to the Swedish foreign department website, which subsequently crashed. Again, a lot of this information, these dependencies and this trust, has been moved to the Internet and people take this as the authority, and part of the point is that the Government and service providers, in order to cater for this trust, need to make sure that they can handle these one-time events, these news-breaking events, on a grand scale.

I should slow down a bit. OK.

And, again, providing information can be a network issue, a real issue, both under stress and not under stress. This budget release was certainly not a crisis situation - well, I'm a taxpayer, so... - but anyway, it was still an event where the government was expected to provide information to the end-users, to the citizens. The interesting thing from this study was that it turns out that most businesses that had this as a business-critical dependency are much better at doing this than governments or other end-users. Online banks are probably some of the best in the world at delivering this service. A lot of the others - the Googles, the AOLs, the CNNs - are also fairly good at delivering this service. The people we trust the most, the ones who should be the best at doing this, are actually fairly bad at it. They can't even be bothered to buy the service. And that was a point we picked up on quite a bit, because it's a fairly well-known and trivial thing to do and I don't really understand why people can't be bothered to do it. One of the things we discovered when we did this study of 360 government agencies was that we had expected to provide a list of things they could do to improve how they communicated with the rest of the world. To our shock and horror, we realised that most of these agencies you could only reach by pure luck. The DNS setups were like something out of the '80s, if they were there at all. It was just purely shocking.

My favourite was the one that had a single IP address that handled DNS, mail, web and everything else. It was also the firewall.

But that aside - to try and bring this back to why we're here and what you guys can do and what should be done - the first thing we identified is that I think it is important that some government agency has a clear mandate to be in charge in terms of a crisis and to cater and plan for this. And the reason I say that is because, if an attack happens like the one in Estonia, there has to be someone who detects the attack and who takes action. It's no use if there are 15 agencies who detect the same attack and act on it independently; it does nothing. Estonia was actually interesting because they had a fairly coordinated network: all the government agencies are run off a government-run ISP that provides the services for the Estonian Government, and they had their own CERT group who were fairly knowledgeable and active, and they detected the attack and acted on it. In a lot of other countries, this is a much more uncoordinated thing.

This can be everything from a route hijack of a government prefix to redirecting e-mails going into the government. I, as a citizen, expect that if I send a mail to a government agency, it reaches the government, someone reads it and acts on it, I get a reference number back and it is handled according to the process. But as this works on the Internet today, there is no way to verify whether it reached the government, and the government has no idea either. There is no way to know if someone else read your e-mail. This is a matter of pinpointing responsibility and holding someone accountable for doing this.

And, again, how can I as a citizen trust the information provided to me by the government? How do I know that this website belongs to the government? Why are the pages not signed? Where can I find a certificate and verify it? And how do I know, again, that my e-mails end up with the real government? Why don't they provide signed keys? If I want to send an encrypted e-mail to the government, why isn't there a PGP public key so that I can do that? This is simple, even for a large-scale government. They just don't do it.

And another thing is that, while governments tend not to care too much about this, when I gave this presentation for a large company in Sweden, there was a guy who stood up and said he was responsible for the external-facing part of the Swedish defence network, and he said, "We have made a decision that our external-facing parts are non-critical." I said, "That's fine. I understand that, but all 360 government agencies can't make the same decision. One of you has to decide you're critical and will always be there." A lot of citizens abroad, instead of listening to Swedish radio or the international broadcasts on medium-wave or short-wave radio, have started looking up this information on the Internet. And the Internet has therefore become a fairly important link to citizens abroad and to people living abroad who depend on services provided by the Swedish Government.

And in terms of a big national crisis, if something were to happen inside the country, providing this information to relatives and people abroad is, of course, becoming more and more critical.

Another interesting study we did was to look at what happens if there is an actual crisis: how long would the infrastructure last as a stand-alone infrastructure? Now, the way we build networks today, interconnected and fairly meshed, means the chances of them becoming completely isolated are fairly low to non-existent. Having said that, the interesting thing from the Estonia attacks was that they were so severe that Estonia decided to cut the links to the rest of the world. This was not because the physical infrastructure was threatened in any way; it was in order to keep critical parts of the national infrastructure working that they decided to sever ties to the rest of the world. If you have to do that, you have to realise that the network inside the country has to be self-sustaining enough, and know how long you can keep that up. It turns out you can do it for fairly long. Our study showed that when we did it in Sweden, this was mostly right by pure luck and coincidence rather than because of coordinated planning, but all this effort into making sure you can run in isolation is something of no interest to commercial players. They don't have the interest or the will or the money to spend to build a highly robust infrastructure, because there are very few people willing to pay for it, except maybe taxpayers.

And another thing we started to discuss is that it's very hard to provide information to citizens in realtime in terms of a crisis. But your citizens have come to expect that they get this data in realtime. They want to see it as it becomes available. How do you do that? Maybe the Government should run a crisis server for them. Maybe there should be a way for the Government to provide data at all times to the end-users, and this data has to be distributed in a topology that's independent of a centralised system, so you can push the data out and the system continues to operate independently.

One of the ideas we played with was: why can't the governments take control and hijack popular sites? Why don't they hijack the most popular news site in Sweden and publish the data there, or give them the data to publish, signed somewhere? And again the question is, what is the scale? Well, we did the number crunching: there are 3.5 million households in Sweden and you can assume that, if this is a real crisis, all of these 3.5 million households are clicking the 'reload' button, because that's what humans do - "It might have changed, it might be something new and I want to be the first one to know it." It means that whatever the system is, it has to take 3.5 million packets per second at all times. I don't think it would get quite there, but that's the scale you have to design for, and households in this case means the number of households that have an Internet connection. Do the numbers. It's a scarily high number, actually.
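To put a rough number on that scale, here is a back-of-the-envelope sketch. The 3.5 million households figure is from the talk; the reload interval and page size are assumptions added purely for illustration.

```python
# Back-of-the-envelope version of the reload scenario. The 3.5 million
# households figure comes from the talk; the reload interval and page size
# are assumptions added purely for illustration.

households = 3_500_000           # Swedish households with an Internet connection
reload_interval_s = 1.0          # assume everyone hits 'reload' once per second
page_size_bytes = 100 * 1024     # assume a 100 KB status page

requests_per_second = households / reload_interval_s
bits_per_second = requests_per_second * page_size_bytes * 8

print(f"{requests_per_second:,.0f} requests per second")
print(f"about {bits_per_second / 1e12:.1f} Tbit/s of outbound traffic")
```

Under those assumptions that is roughly 3.5 million requests per second and close to 3 Tbit/s of outbound traffic, which is indeed a scarily high number for a single government web server.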

The information should be signed, as I said before, in all cases, especially from the Government. An important thing is that these systems and these networks should be built so that they have this high reliability in normal operations. This is not something you want to have to activate in terms of a crisis. This is how it should be built from the beginning, from day one. That said, it's fine if operators plan, in the event of something severe, to call in an extra shift or to call in more engineers. That's one thing. But the actual operations themselves - the network topology, the internal systems of the country - should be built so that they actually run this way in normal operations and in stand-alone operation. And part of this is that you really want to have a working DNS.

Now, the DNS is the most overanalysed and overhyped and oversecured system in the world. It's not that hard to get it right. Just don't put all the servers in the same subnet and you're doing all right. If you can put them with different providers, even better. If you can make sure all the providers can reach them, you're doing great. And one of the surprising things was that, again, because these government agencies were fairly uncoordinated, it turned out that over 60% of all the government agencies had all their DNS servers, all their web servers and all their Internet connections with the same provider, the cheapest one. If you just do the planning, you're doing fine. Again, signed e-mails and data - I can't say this often enough, but I really want to have this as a service.
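For the curious, here is a minimal sketch of the "same subnet" check, added as an illustration. It assumes you already have a zone's nameserver hostnames (for example from its NS records) and groups them by /24 as a crude proxy for "same subnet, probably same provider"; the hostnames in the example are hypothetical.

```python
# A rough sketch of the "don't put all your nameservers in one subnet" check.
# It assumes you already have the zone's nameserver hostnames (for example
# from its NS records); the hostnames below are hypothetical. Grouping by
# /24 is a crude proxy for "same subnet, probably same provider".

import socket
from collections import defaultdict


def group_by_slash24(nameservers):
    groups = defaultdict(list)
    for ns in nameservers:
        try:
            ip = socket.gethostbyname(ns)        # IPv4 only, for brevity
        except socket.gaierror:
            print(f"could not resolve {ns}")
            continue
        prefix = ".".join(ip.split(".")[:3])     # first three octets = /24
        groups[prefix].append((ns, ip))
    return groups


def check_diversity(nameservers):
    groups = group_by_slash24(nameservers)
    if len(groups) < 2:
        print("All nameservers sit in one /24 - one outage takes them all out.")
    else:
        print(f"Nameservers spread over {len(groups)} different /24s.")
    for prefix, members in groups.items():
        print(f"  {prefix}.0/24: {members}")


if __name__ == "__main__":
    check_diversity(["ns1.example.se", "ns2.example.se", "ns3.example.net"])
```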

The harder part to achieve, and the harder part from a regulatory point of view to do something about, is to make sure the exchange of traffic inside the country works. Building Internet exchanges is a great step forward. Having a neutral, independent Internet Exchange where Internet providers can exchange traffic at any time in an open and fair manner is a good thing. Now, I don't believe in regulatory environments for the actual settlements or the settlement agreements - that's for the providers to sort out - but having the infrastructure for this in operation, having this in place, is of great value. I used to work for a company called KPNQwest, which was the largest provider in Europe. When the September 11 attacks happened, there were a number of carriers whose transatlantic links failed or were stopped, and in the end, two providers had spare capacity running across the Atlantic. We were one of them and we shifted traffic over the Atlantic for free.

We couldn't have done this unless there was a free and open point where we could interconnect with these networks very easily and quickly. This was all done in an afternoon and a night, in close cooperation with the exchanges, but having this infrastructure in place, with enough spare capacity that people can exchange traffic, is very critical - from an operational point of view for the providers, and from a national perspective. These IXs should have no external dependencies. Rootservers - I'm sure they're all fine - I don't really believe that, but let's say they're fine - but in terms of a crisis, in an urgent situation, you want to have this as simple as possible and you want to be able to do this on a per-peer basis. You want to be able to establish sessions with the providers quickly, without dependencies on an external party like the IX operator.

Again, the top-level domains are part of this overanalysed DNS system, but these TLDs are important for the national infrastructure, and being able to reach them in terms of a crisis is highly important. So locating these at the IXs, having them able to peer with anyone and making sure they peer with all the providers is very important. If you can put the DNS servers for all the government agencies, and even some others, there as well, even better. Make sure there is room for everyone and there is no policy or regulatory framework blocking access to these servers.

All this is actually the easy part. The hard part is to make sure that the end-users can get to it. We can oversecure and overdo all these centralised systems, the servers and the information to be distributed, but unless the end-users can actually get to it, it's completely pointless.

Now, securing the last mile for everyone and making sure everyone's Internet connection always works is impossible, but you can make an effort to make it work for as many people as possible, and having this information available across the network topology will help. Basically, if one provider goes down, we don't want 60% of the government agencies to disappear with it just because it was the cheapest one. You want to make sure that the information switches over to the other providers in the meantime. You don't know why a provider goes away. Providers have gone away because of stupid operational mistakes inside their own networks. People have shot themselves in the foot with fairly large networks.

So making a decision of how these systems are distributed and how they are located is fairly important.

Another argument to be made - and this was actually made in Sweden - was that, you know, this is fine in a highly connected environment, like Estonia or western Europe - well, I wouldn't call western Europe highly connected, but let's assume it is - and if you say, "This doesn't apply to me because we don't have this implicit dependency on the Internet in my country," I don't think that's true. You will have it sooner or later, and it means you will still have the implicit trust. I have yet to see a country where the implicit trust in the system isn't there, and having these systems work in a normal operational environment, open, free, fair and accessible to all in this neutral model, is a fairly strategic advantage. So building these networks this way, with highly dense interconnectivity, will provide a better service for the end-users. In Sweden, there are actually five exchanges in the country, which is not very large by Asian standards. And the reason for that is partly resilience for the business, but also to make sure that we have a fairly connected network environment so the end-users can have low latency between them. We try to keep traffic regional, and all the providers work with this and demand it in their peering agreements.

That again is the strategic advantage of these systems.

So having this developed infrastructure - developed doesn't necessarily mean universally available, but rather a reliable and trustable infrastructure - is, I believe, a fairly strategic advantage and important from a national point of view.

Take the Koreans and the Japanese. Their governments have a number of initiatives - the Japanese have the IPv6 initiatives - and all these initiatives have been a stimulus for the network operators. I'm sure we could have a long discussion about how open and fair all these programs are. The European Union also has these stimulus programs; most tend to give money to the incumbents to keep them alive, but it has injected money into the market and led to innovation, which is important from a strategic point of view.

Another thing is interconnectivity. The non-visible slide on the right here is from a presentation I gave in 2001. You can't really see what's on the slide, but it's a map of Europe. When I started running my first ISP in 1993, and around the 1997, 1998 time frame when I started at Qwest, we had a figure that 80% of all the traffic we generated went to the US and 20% stayed local. Part of the reason was we couldn't get traffic between the countries; the infrastructure wasn't there. We started BDIX in 1992 and 1993 in Stockholm, so we could exchange traffic within the country. In the rest of Europe, this was a much newer innovation. Over the years, a lot of local-language content became available, because that was part of the business cases that people were building, but I believe part of this was also facilitated, especially in Europe, by the emergence of exchange points across Europe, and by 2001 you could see this clearly, because KPNQwest had fibre rings that happened to span the language regions of Europe.

We had one ring around Germany, Switzerland and Austria, and most of the traffic generated in these countries stayed on that ring and never left it, because they share the same culture and language, and they saw the emergence of local content, so this traffic stayed in the region. One of the prime examples is my home country, Finland, which had the highest rate of local traffic - I think it was around 80% or 90%. I believe it's a simple thing: English knowledge in Finland was very low at that time, so the US content wasn't that interesting, while the Finnish-language content was very interesting. And Finland doesn't share a language with any other country; there is nothing like it. In the other Scandinavian countries - Sweden, Norway, Denmark - you saw a lot of traffic between the countries, like you saw between Germany, Switzerland and Austria. So the ability to have a highly meshed infrastructure enabled localised content.

Last slide. So we thought about this from a national level and, as I said at the start, also from a business case perspective. It doesn't matter if you are a citizen or a customer, this trust in and dependency on the information is still there. There's been a fairly popular line of reasoning saying, "Doesn't political process X help me?" and "My country needs a root server," or whatever. There is certainly a political dimension to this that I won't go into, and I won't value or judge those comments, but, you know, we can tear up ICANN all we want, we can take over the IETF, we can do whatever we want from a process point of view: unless I have an infrastructure in place before that, one that works, this is all pointless. There is no UN agency in the world that can replace a TLD nameserver. It won't happen. Fixing this and making sure you have a working national infrastructure first is probably of the utmost importance. Everything else will then follow. And unless we can regain the trust in the network by having this working infrastructure, again, the UN agencies won't help you. I'm sorry. That was it.

Questions.

Almost on time. I wasn't too fast. OK.

Thank you.

GAURAB RAJ UPADHAYA:

Thank you.

APPLAUSE

You can talk to Vijay and Kurtis during the break; they'll be here for the next few days, up to Friday.

Before we close the session, I would like to invite Colonel Perhar back up here to give some gifts out to our speakers.

RS PERHAR:

Vijay, thanks a lot.

APPLAUSE

RS PERHAR:

Kurtis, thanks a lot.

GAURAB RAJ UPADHAYA:

Thank you, Colonel Perhar. We'll take a 15-minute break. Coffee is outside.

And, for the rest of the day, you've probably got the printed program out, we will just continue with that. There seems to be a bit of confusion about the last session today. So the way it will work is the third session will start off and then, towards the end, we'll convert it into a BoF with the last presentation from Barry.

So, having said that, thank you, everyone, for coming to the plenary here and we'll see you in the next session.

Also - special announcement. Speakers in the next session, Abhishek, Devdas, Samit and Barry, please come and talk to Philip now. Coffee break.

RS PERHAR:

You can have tea also.

GAURAB RAJ UPADHAYA:

Tea also. Not just coffee. And cookies.

(End of session)