It’s been a long while since I’ve blogged about this topic.
Probably too long: IXLeeds, which inspired me to write Pt 1, is now a fully-fledged IX. It’s no longer just a couple of networks plugged into a switch in a co-lo (all IXPs have to start somewhere!). It has formed a company, with directors, and has about 12 active participants connected to its switch. Hurrah!
So, trying to pick up where I left off: in this post, I’m going to talk about shared fate with respect to Internet Exchanges.
What do I mean by shared fate?
- Organisational – multiple IXPs, on which peers depend, controlled by a single or small number of organisations.
- Technology – large numbers of IXPs based on the same technology from the same vendors.
- Architecture – a distributed IXP over a large geographic area (making a big flat L2 domain).
- IX Participant Connectivity – this can take many forms, but I’m specifically referring to the rather dubious practice of connecting a single (or very small number of) BGP-speaking peering routers to multiple Internet Exchanges, often using long-haul Ethernet transport services.
Organisational shared fate doesn’t seem that bad on the surface. It’s convenient for the IX participant to connect to multiple IXPs in multiple locations with a single contract, paying a single invoice. It may even help keep costs down: no duplication of back-office helps reduce opex, as does a single NOC being able to run multiple IXs, plus the larger operator is able to use its buying power to get better deals on IX equipment, helping to lower capex.
The flip side is that the lack of organisational diversity exposes multiple IXs to the failure of the operator as a whole, which would take all of its IXs with it. It also leaves them open to service impact from a widespread process failure, because IXs under common ownership are likely to use the same processes. If those processes are broken in one place, they are likely to be broken everywhere.
A darker side could be the monopolisation of an IXP marketplace by a single dominant operator. As well as reducing choice, the lack of competition may drive prices up (at least as far as they can bear against the cost of IP transit, which in some cases is already lower than the cost of peering!), and lead to the adoption of common technology throughout those IXPs.
A regional IXP can more easily focus on the needs of its participants as decisions are made locally, and may actually serve to drive local co-operation, competition and growth.
Shared fate in Technology can be down to a number of influencing factors. There’s the organisational issue highlighted above – it’s likely a single IX operator would choose to deploy equipment from the same supplier in multiple sites, if it suited them.
There’s also the dominance of vendors in the particular marketplace: if several neighbouring IXPs, otherwise unrelated, all happened to use identical hardware from the same vendor, there’s a strong chance that a bug or weakness in that equipment could manifest itself simultaneously in multiple locations. Those are the very locations on which ISPs depend for redundancy, and they can’t afford multiple concurrent failures.
To avoid this, it looks like it’s important that the market for IXP hardware is not dominated by a single supplier, or single underlying set of technologies.
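As a rough illustration of how a participant might audit for the organisational and technology shared fate described above, here is a minimal Python sketch. Everything in it is an assumption for the sake of the example: the IXP names, operators and vendors are invented, and the `shared_fate_groups` helper is hypothetical, not part of any real tool.

```python
from collections import defaultdict

# Hypothetical inventory of the IXPs a network connects to.
# All names, operators and vendors are invented for illustration.
IXPS = [
    {"name": "IX-North", "operator": "OpCo-1", "vendor": "VendorA"},
    {"name": "IX-South", "operator": "OpCo-1", "vendor": "VendorA"},
    {"name": "IX-Regional", "operator": "LocalCoop", "vendor": "VendorB"},
]


def shared_fate_groups(ixps, attribute):
    """Return {value: [IXP names]} for attribute values shared by 2+ IXPs."""
    groups = defaultdict(list)
    for ixp in ixps:
        groups[ixp[attribute]].append(ixp["name"])
    # Only groups with more than one member represent shared fate.
    return {value: names for value, names in groups.items() if len(names) > 1}


for attribute in ("operator", "vendor"):
    print(attribute, shared_fate_groups(IXPS, attribute))
```

With this made-up inventory, both the operator and the vendor checks flag IX-North and IX-South together, while the regional exchange stands alone on both counts. A real audit would need the operator and vendor data from somewhere, which is often the hard part.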
However, the equipment marketplace for the scale of gear needed for a very large national ISP has been dominated by a small number of players for some time.
Regional IXPs, by nature of being smaller, are able to bring some diversity into that gene pool, even if only a relatively small proportion of networks stand to benefit from it.
There have long been goals to join IXPs together, or build a distributed wide-area exchange. Remember the DGIX? The architectural challenges are pretty immense. However, the growth of Ethernet in the long-haul has increased the number of exchanges which are visible outside of a particular area. Only last week, AMS-IX announced a partnership to provide a “virtual” AMS-IX PoP in Manchester, while a number of regional IXPs have used this technology to spread their sphere of operation. Along the way this is changing the nature of their business, and they are losing their regional emphasis. In some cases it’s been done in order to survive. It might even be the lesser evil in the long run.
It sounds great, doesn’t it? I get to peer in city A, even though all my infrastructure is in city B, and I don’t even have to build anything into city A. It does get you access to some peers in city A that you couldn’t see before, sure. But it doesn’t do as much to improve the diversity of connectivity as actually putting some network and equipment in city A.
The Achilles’ heel of this scenario is that most IXPs are implemented as some sort of flat layer 2 domain (even if it’s done as a VPLS instance running over an MPLS/IP substrate), so a nasty layer 2 broadcast or unknown-unicast storm can spill between geographic areas.
No longer contained by separate regional LANs, these sorts of events can still cause havoc on the occasions they do happen, even with policing of these types of traffic and effective IXP participant hygiene. That havoc is no longer limited to peering routers in city A, but reaches peering routers spread all over the world. Why should a guy in country X, thousands of miles away, be affected when trying to reach something in country X, because of a failure of something far, far away in City A in country Y? This guy is not going to understand the why. He just wants it to work. “This internet is supposed to be failure tolerant, no?” is his expectation.
But there are whole hosts of folk doing this now, and the Internet isn’t grinding to a halt, right? While it does rely heavily on folk having clue and no broken equipment, fortunately the large majority of folk doing this have lots of clue, and equipment breakages are rare.
Side note: In some ways, this is nothing new. The CIX was interconnected to the PacBell SMDS Cloud way back when.
But it makes designing redundancy and resilience into one’s networks and peering agreements so much harder, because you’ve got to look beyond the obvious “We’re peering in City A”, and worry about where your peer’s router actually is, because it may not be in City A like yours.
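To make that concrete, here is a small Python sketch of the sort of check you would end up doing. It assumes you have somehow recorded where each peer’s router physically sits, which is exactly the information that is hard to come by. The session data, the field names and both helper functions are hypothetical; the AS numbers come from the range reserved for documentation.

```python
from collections import defaultdict

# Hypothetical peering sessions. "peer_router_city" is where the peer's
# router physically sits, which may differ from the IX's own city.
SESSIONS = [
    {"peer": "AS64500", "ix": "IX-A", "ix_city": "City A", "peer_router_city": "City A"},
    {"peer": "AS64500", "ix": "IX-B", "ix_city": "City B", "peer_router_city": "City A"},
    {"peer": "AS64501", "ix": "IX-A", "ix_city": "City A", "peer_router_city": "City A"},
    {"peer": "AS64501", "ix": "IX-B", "ix_city": "City B", "peer_router_city": "City B"},
]


def remote_sessions(sessions):
    """Sessions where the peer's router is not in the IX's own city."""
    return [
        (s["peer"], s["ix"])
        for s in sessions
        if s["peer_router_city"] != s["ix_city"]
    ]


def peers_sharing_one_site(sessions):
    """Peers reached over 2+ IXs whose router is in one city for all of them."""
    ixs = defaultdict(set)
    sites = defaultdict(set)
    for s in sessions:
        ixs[s["peer"]].add(s["ix"])
        sites[s["peer"]].add(s["peer_router_city"])
    return sorted(p for p in ixs if len(ixs[p]) > 1 and len(sites[p]) == 1)


print(remote_sessions(SESSIONS))
print(peers_sharing_one_site(SESSIONS))
```

In this invented data, AS64500 looks like it offers two diverse sessions, one at each IX, but both land in City A, so a single failure there takes out both.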
My point here is that distinctive regional peering actually stands to help improve fault-tolerance for the networks connected to the regional IX.
This starts to bring me on to IX Participant Connectivity, which I think justifies a whole article to itself, so that’s what I’ll do next time! I’ll try not to leave it so long.