Carrier Class NATs Considered Harmful

In the decade-or-longer transition to IPv6, NAT is inescapable.  This is a direct consequence of the design of a new protocol which is 100% incompatible on the wire; there is not the slightest compatibility mode.

We may not like this, and I certainly do not, but NATs are inevitable, get over it.

Alain Durand (Comcast) and Miyakawa Shin (NTT) have a common problem.  They say that their networks’ customer edges are so large that, even giving each CPE a small bit of IPv4 space for the ‘front’ of a NAT, they need two to five /8s of IPv4 space.  They are in a terrible trade-off space.  They either request many public IPv4 /8s from the internet registries (not likely with the exhaustion of the IPv4 IANA free pool on the horizon) or do something more complex.
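
To get a feel for the arithmetic behind that estimate, here is a minimal sketch. The CPE count and the one-address-per-CPE figure are hypothetical inputs chosen purely for illustration, not numbers from Comcast or NTT. The only hard fact is that a /8 holds 2^24 addresses.

```python
import math

ADDRESSES_PER_SLASH8 = 2 ** 24  # a /8 leaves 24 host bits: 16,777,216 addresses


def slash8s_needed(cpe_count: int, addresses_per_cpe: int = 1) -> int:
    """Whole /8 blocks needed to give every CPE its public IPv4 addresses."""
    total = cpe_count * addresses_per_cpe
    return math.ceil(total / ADDRESSES_PER_SLASH8)


# Hypothetical example: 40 million CPEs, one public IPv4 address each.
print(slash8s_needed(40_000_000))  # 3
```

Real allocations would need more than this, since subnetting and growth headroom keep utilization well below 100%.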

The approach they are testing is being called Carrier Class NAT.  It is essentially one or more IPv4 NATs in the core of their networks and various tunneling and translation techniques.  If the CPE is dual stack, traffic where source and destination are both IPv6 would not have to be NATted.

We can contrast this to, for example, NAT-PT on the CPE, which would probably scale to the needs of even a large non-consumer backbone.  But, as we noted above, Alain and Shin would need way too much IPv4 space just to front the NAT-PT front ends for their large consumer networks.

I and a few other researchers have taken up a desperate search for alternatives.  The reasons are simple.

‘Carrier class’ is a euphemism for centralized.  More semantics move to the core of the network.  This is bad in and of itself.  Net-heads call it ‘telco-think’ because it is the telco model of smarts in the core, as opposed to the internet model of a simple, just-forward-packets core and smart edges.

With the smarts at the edges, e.g. NAT-PT, I can easily field new protocols between consenting end-points by just tweaking the NATs at the consenting CPEs, even adding ALGs if needed.

With NAT in the core, if a customer wants to field a new application protocol which requires cooperation from the NAT, they get to beg help from Comcast and NTT and all other users of carrier class NATs.  This is the ultimate horror the NAT-haters fear, and they are not all that wrong.

You don’t build an internet walled garden at the edges, you build it by restricting the core.  Comcast has recently received a lot of bad press for just this, though I know that Alain is very far from those responsible.  But be assured that the developer/deployer of new applications will be talking to Comcast’s walled-garden-loving lawyers, not Alain.

It means that all new application protocols have to go through the carrier lawyers to be allowed to be handled by the NATs in their core.  So, if someone wants to deploy a new application, they can talk to Comcast’s and NTT’s lawyers or do it over HTTP, pick your poison.

And remember that, as IPv6 deploys, if we want to have one internet, i.e. IPv4 nodes talking freely with IPv6 nodes, then translation must be done somewhere.  The challenge is whether someone can figure out a scheme where it is done for these large networks at the customer edge, not in the core.

A Lesson in BGP Visibility

While Olaf and I were looking for other things, we stumbled on a revealing sub-experimental result.

  • We announced a /25 prefix via BGP from our research routers at the Westin in Seattle
  • Sprint did not listen to it
  • Verio/NTT did, and said they propagated it to their customers but not their peers
  • We looked at BGP feeds from RouteViews, RIS, and 700+ other BGP feeds
  • BGP data said the /25 reached 15 ASs
  • From a prefix in the source /25, we probed IP addresses in over 20,000 ASs
  • 1023 ASs replied
  • I.e. RV et alia showed a shockingly small fraction of the real AS topology, less than 1.5%
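
The percentage in the last bullet follows directly from the two counts above it. As a quick check, using only the numbers in the list:

```python
visible_in_bgp = 15        # ASs in which the /25 appeared in the BGP feeds
replied_to_probes = 1023   # ASs that answered probes sourced from the /25

# Share of the ASs demonstrably reaching the /25 that the BGP feeds revealed.
fraction = visible_in_bgp / replied_to_probes
print(f"{fraction:.2%}")  # 1.47%
```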

Interestingly, the AS path length shown in the 15 ASs visible in BGP was 3, while pingability and the BGP path length for a /20 was the normal >4. See the diagram.

IPv6 Talks at Amazon

Tom Killalea had invited me to speak on IPv6 at Amazon. Due to ferry silliness, I could not make it for lunch, but got there at 12:30 for a 13:00 talk. There were a lot of folk in a fairly large conference room, with a couple of dozen having to stand. I did a version of the IPv6 Operational Reality talk and threw in a modified version of my slides from the IP Addressing and Economics conference earlier in the week. It was interesting to have an audience of geeks from my home culture; they got the humor. They seemed actually interested, and some were toying with how to deploy IPv6 in various parts of Amazon. I managed to finish in exactly one hour, gossiped a bit with Tom, and headed for my car.

Signing the IRR, a Contrary View

Robert Kisteleki proposed to use the RPKI to sign most of the IRR. I took the opposite view in the following rough proposal. Geoff Huston and I will be writing up my design in the next week.

Date: Mon, 03 Mar 2008 21:53:30 -0800
From: Randy Bush <randy@psg.com>
To: Robert Kisteleki <robert@ripe.net>
Cc: Resource Cert List <rescert@apnic.net>
Subject: Re: [Rescert] RPSL+RPKI proposals

robert,

i take a somewhat different view.

though i was hacking before ed codd, my mommy trained me to be
extremely wary when the same information is in two places.

but more important, i have a slightly different goal set.  i would ask
what we need to do in order to make the rpki helpful to isps in the
current task of configuring routing filters, but with more assurance of
correctness?

for this we do not need signed route: objects in the irr, as we have
roas and merely need to invert them, just as we do in the irr software,
to form the set of prefixes which each asn _may_ announce.

what we do not have in the rpki, which is in the irr, is the inter-asn
topology.  while josh and jrex would gather it from route-views or ris,
i am willing to stick one toe in the irr cesspool and sign the aut-num:,
probably in a fashion similar to the one you suggest.

but doing more is producing redundant data, transferring trust to a weak
sibling whose long-term survival is not required, and trying to make a
sow's ear into a silk purse when we are not in the silk purse business
anyway.

when we have s-bgp (or whatever), the irr will be completely IRRelevant
<tm>.  i see no need to try to touch it any more than we absolutely
needed to now.

randy
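
The ROA inversion mentioned in the message above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the design referred to above: the tuple layout for a ROA, the function names `invert_roas` and `may_announce`, and the sample data (documentation prefixes and private-use ASNs) are all hypothetical, and no signature validation is shown.

```python
from collections import defaultdict
from ipaddress import ip_network

# Hypothetical, already-validated ROA records: (prefix, max length, origin ASN).
# Documentation prefixes and private-use ASNs, not real routing data.
ROAS = [
    ("192.0.2.0/24", 24, 64500),
    ("198.51.100.0/24", 24, 64500),
    ("203.0.113.0/24", 24, 64501),
]


def invert_roas(roas):
    """Map each origin ASN to the set of (prefix, max_length) it may announce."""
    allowed = defaultdict(set)
    for prefix, max_length, asn in roas:
        allowed[asn].add((ip_network(prefix), max_length))
    return dict(allowed)


def may_announce(allowed, asn, announced):
    """True if some ROA for this ASN covers the announced prefix."""
    route = ip_network(announced)
    return any(
        route.version == net.version      # subnet_of() rejects mixed v4/v6
        and route.subnet_of(net)
        and route.prefixlen <= max_length
        for net, max_length in allowed.get(asn, ())
    )


allowed = invert_roas(ROAS)
print(may_announce(allowed, 64500, "192.0.2.0/24"))  # True
print(may_announce(allowed, 64501, "192.0.2.0/24"))  # False
```

The per-ASN sets are the raw material for a routing filter. The inter-ASN topology still has to come from somewhere else, which is the point of the message.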

Cisco Address/Economics Conference

My presentation today at the Cisco IP-Economics Conference.

FBI Report on Cisco Clones

Here is an interesting FBI report on Cisco clones which they consider to be dangerous.

Emacs on MacOS

The native Emacs seemed not to have a meta key.  While this might be fine for amateurs, it is definitely not fine for hard-core Emacs users. The lack of meta seemed to be the case irrespective of whether it was under MacOS or under X11.  I kept tweaking my .Xmodmap to no avail.

The general advice on the net was to install 22.1 using MacPorts.  So I went down that path.  MacPorts installed, though it required XTools, which was a bit of a pig.  Then I built 22.1.  It did not seem to support the meta key!

A co-worker who was also at the APNIC meeting in Taipei said to just install Carbon Emacs.  So I Googled it and installed it, and it worked under X11 and native MacOS.

Of course, now I want to rip out MacTools etc., but am not sure how to do so safely.

First Cut at the !J NetCannery Configuration Analyzer

On the Sunday before NANOG, 17 February, an old acquaintance, Tom Pusateri, stopped me and told me he had a start-up doing a new network device management application, and asked whether I wanted to be in the alpha test crew. As Tom had done such a great job on the Juniper configuration system, and done the netconf XML stuff in the IETF, I could not resist. Unfortunately, it was only going to run on the Macintosh, at least initially. So Monday, Joel drove me to the Apple Store a few miles away, and I got a 15″ MacBook Pro to run Tom’s software.

I spent much of my spare time on Monday and Monday evening learning how to deal with the Mac. Why did they have to ‘add value’ to FreeBSD? I did manage to get enough tools installed that I could survive. A gang of Internet2 folk at NANOG were very helpful, as was Joel as usual.

Tuesday, I installed NetCannery, Tom’s application, and Tom got me started. Think of RANCID with an analytic back end. But it was clearly early in the development cycle.

The config fetcher has open source front ends for common devices, e.g. Cisco, Juniper, etc., so you can add strange new devices. And it is smart about bastion hosts, where you need to log into a control host to get to the device. But it did not have a bulk loader, which will be very necessary for any non-trivial network. When I suggested this, Tom understood immediately and promised it for the near future.

I managed to fetch from a Cisco 7206 and some 2511s, but failed on a 1750 with VoIP. It worked for Juniper M5s, but not for a Procket 8801 or for HP ProCurve or SMC switches. Of course, Tom will fix that.

It lacked ways to group devices, e.g. North America, New York, PoP X, and type of device, e.g. infrastructure, backbone, customer-facing, etc. When I pointed this out, Tom and his friends discussed and came back with the idea of ‘smart’ folders. Not being a recent Mac person, I am not sure I understand the metaphor. So we’ll see how that turns out.

IPv6 Conversion of Westin Servers

This web page tells the story of my converting a bunch of servers at the Westin to dual stack.
