IX Scotland – Why might it work this time?

Yesterday the BBC ran this news item about the launch of a new Internet Exchange in Edinburgh – IX Scotland. This is the latest in an emerging trend of local IXPs developing in the UK, such as IX Leeds and IX Manchester.

There has been some suggestion that this is the first Internet Exchange in Scotland; however, those people have short memories. There have been two (or three, depending on how you look at it) attempts at getting a working IXP in Edinburgh in the past 15 years, all of which ultimately failed.

So, why should IX Scotland be any different from its predecessors? Continue reading “IX Scotland – Why might it work this time?”

My recent talk at INEX – Video

Or, I never thought of myself as a narcissist but…

Thanks to the folks at HEAnet, here’s a link to the video of the talk “It’s peering, Jim…” that I gave at the recent INEX meeting in Dublin, where I discuss topics such as changes in the US peering community thanks to Open-IX and try to untangle what people mean when they say “Regional Peering”.

The talk lasts around 20-25 minutes and I was really pleased to get around 15 minutes of questions at the end of it.

I also provide some fairly pragmatic advice to those seeking to start an IX in Northern Ireland during the questions. 🙂


What’s meant by Regional Peering and the case for Peering NI

Last week, I was over in Dublin having been invited to give a talk by my gracious hosts at the Irish Internet Exchange Point, INEX. I asked what sort of thing they might like me to talk about. We agreed that I’d talk about various trends in global peering, mainly because the INEX meeting audience don’t do massive amounts of peering outside of the island of Ireland.

(If you need to understand the difference between the UK, Great Britain, Northern Ireland, the Republic of Ireland and the island of Ireland, this video will be a massive help. Thanks CGP Grey.)


One of the discussions we had was about what is meant by “Regional” when talking about Internet Exchange Points. In the UK, we generally mean exchanges which are outside of London, such as IX Leeds. When a “Regional IXP” is discussed in Africa, it actually means a “super-national” IXP which possibly interconnects several countries across a region.

Why do the communities in these areas want IXPs that span national boundaries?

The main reason: latency.

There is a lot of suboptimal routing. Traffic being exchanged between adjacent countries on the same continent can end up making a long “trombone-shaped” trip to Europe and back. This has a negative effect on the user experience and on the local internet economy.

Round-trip times from RIPE Atlas probes in Southern African countries to a destination in South Africa

As you can see above, traffic from the test probes in Kenya and Angola, along with the Maldives and the Seychelles, is likely being routed to Europe for interconnection rather than being handled more locally, if the round-trip time is an indication of the route taken. The probes in Botswana, Zambia and Tanzania do somewhat better, and are definitely staying on the same continent. The African example is one of the obvious ones. Let’s look at something a bit closer to home…

Regional peering within Northern Ireland, and between Northern Ireland and the Republic of Ireland

There is already a well-established exchange point in Dublin, INEX, with a good number of national and international members. Discussions are taking place between Internet companies in Northern Ireland (which, remember, is part of the UK) about their need for a more local place to exchange traffic, likely in Belfast. The current belief is that a large amount of the traffic between sources and sinks in Northern Ireland goes via London or Amsterdam.

Firstly, how does traffic get between Great Britain (and by inference, most of the rest of Europe) and Northern Ireland? This is what Telegeography say:

Submarine Cables UK to NI
RIPE Atlas Probes in Northern Ireland

So, I thought I’d do some RIPE Atlas measurements.

This isn’t meant to be an exhaustive analysis. More just exploring some existing theories and perceptions.

The first trick is to identify probes in Northern Ireland. From the RIPE PoV, these are all indicated as part of the UK (go and watch the video again if you didn’t get it the first time), so I can’t select them by country.

Fortunately, probe owners have to set their probe’s location. There is a certain amount of trust placed in them – there’s nothing stopping me saying my probe is somewhere else – but most probe owners are responsible techy types. The RIPE Atlas people also put the probe locations onto a coverage map.

I also needed some targets. Probes can’t ping each other (well, they can, if you know their IP address and they’re not behind some NAT or firewall). The Atlas project provides a number of targets, known as “anchors”, and nodes in the NLnog ring can also act as targets. There’s an Atlas anchor in Dublin, but it couldn’t take any more measurements, so it wasn’t suitable as a target; however, HEAnet (the Irish R&E network) and Amazon (yep, the folks that sell books and whatnot) have NLnog ring nodes in Dublin.

We also needed targets in Northern Ireland that seemed to answer ICMP relatively unmolested, and I chose DNS servers at Tibus and Atlas/Bytel, both of whom are ISPs in the North. The final things to add were “controls”, so I chose a friend’s NLnog ring box which I know is hosted in London, and two other UK-based Atlas probes, the one I have on my network at home, and one on Paul Thornton’s network in Sussex. These effectively provided known UK-Ireland and UK-NI latencies to the targets, and a known NI-London latency for the probes in NI.
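Since Northern Irish probes can’t be selected by country code, the practical approach is to pick them out by position. Here’s a minimal sketch of that idea, assuming a list of probe records with `latitude`/`longitude` fields – the real RIPE Atlas probe data is richer than this, and the bounding box values are purely illustrative:

```python
# Sketch: select probes in Northern Ireland by bounding box, since from
# the RIPE point of view they are all simply "UK" probes. Probe records
# and coordinates here are hypothetical, for illustration only.

# Rough bounding box around Northern Ireland (illustrative values)
NI_BOX = {"lat_min": 54.0, "lat_max": 55.4, "lon_min": -8.2, "lon_max": -5.4}

def in_ni(probe):
    """Return True if a probe's reported position falls inside the box."""
    return (NI_BOX["lat_min"] <= probe["latitude"] <= NI_BOX["lat_max"]
            and NI_BOX["lon_min"] <= probe["longitude"] <= NI_BOX["lon_max"])

probes = [
    {"id": 1, "latitude": 54.6, "longitude": -5.9},   # Belfast-ish
    {"id": 2, "latitude": 51.5, "longitude": -0.1},   # London
    {"id": 3, "latitude": 54.9, "longitude": -7.3},   # Derry-ish
]

ni_probes = [p["id"] for p in probes if in_ni(p)]   # → [1, 3]
```

Of course, this relies entirely on probe owners setting their locations honestly, as noted above.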

So, let’s look at round-trip time from Northern Ireland to the NLnog ring node in London:

ICMP RTT NI Probes to nuqe.net NLnog ring server

So, we can see there are some variations, no doubt based on last-mile access technology. In particular, the node shown here with the 54ms RTT (just north of Belfast) consistently scored a high RTT to all test destinations. Anyway, this gives us an idea of NI–London RTT; the fastest is 15ms.

We can therefore make a reasonable assumption that if traffic were to go from Belfast to London and back to Ireland again, a 30ms RTT would be the best one could expect.

(For the interested, the two “control” test probes in the UK had latencies of 5ms and 8ms to the London target.)

Now, take a look at the RTT from Northern Ireland to the node at HEAnet in Dublin:

ICMP RTT all NI probes to HEAnet NLnog ring node, Dublin

Only two of the probes in Northern Ireland have <10ms RTTs to the target in Dublin. All other probes have a greater RTT.

It is not unreasonable to assume, given that some have a >30ms RTT to Dublin, or show an RTT to Dublin more than 15ms above their RTT to London, that this traffic is routing via London.
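The reasoning above can be boiled down to a crude classifier: a best-case 15ms NI–London RTT implies that a London “trombone” adds around 30ms to the round trip, so any probe at or above that, or showing a large gap between its Dublin and London RTTs, is probably not staying local. A sketch, with illustrative figures rather than the actual measurement data:

```python
# Sketch of the trombone heuristic: if a probe's RTT to Dublin exceeds
# its RTT to London by a wide margin, or tops the ~30ms round trip a
# London detour implies, the traffic is probably not staying local.
# RTT figures below are illustrative, not the real measurement results.

LONDON_TROMBONE_MS = 30   # best-case NI -> London -> Ireland round trip

def likely_via_london(rtt_dublin_ms, rtt_london_ms):
    """Crude classifier: does the Dublin RTT suggest routing via London?"""
    return (rtt_dublin_ms >= LONDON_TROMBONE_MS
            or rtt_dublin_ms - rtt_london_ms > 15)

samples = {
    "probe_a": (8, 16),    # sub-10ms to Dublin: direct-looking
    "probe_b": (34, 15),   # >30ms to Dublin: almost certainly via London
}

verdicts = {name: likely_via_london(dub, lon)
            for name, (dub, lon) in samples.items()}
```

It’s only a heuristic – RTT can’t prove a path – but it matches the traceroute-style intuition used throughout these measurements.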

Of the two probes which show a <10ms RTT to HEAnet in Dublin, their upstream networks (AS43599 and AS31641) are directly connected to INEX.

Of the others, some of the host ASNs are connected to INEX, but the RTT suggests an indirect route, possibly via the UK mainland.

The tests were also run against another target in Dublin, on the Amazon network, and show broadly similar results:

ICMP RTT all NI probes to Amazon NLnog ring node, Dublin

Again, the same two probes show <10ms RTT to Dublin. All others show >30ms. Doesn’t seem to matter if you’re a commercial or an academic network.

Finally, let’s look at round-trip times within Northern Ireland.

Here’s the test to a nameserver on the Tibus network:

ICMP RTT all NI Probes to Tibus Nameserver

Again, the same two probes report sub-10ms latency. I’d surmise that they are either routing via INEX, downstream of the same transit provider in Belfast, or privately interconnected in Belfast. At least two of the other nodes seem to route via the UK mainland.

To check this result, the same tests performed toward a nameserver on the Atlas/Bytel network:

ICMP RTT all NI probes to Atlas/Bytel Nameserver

Obviously, one of our probes is on-net, with a 1ms RTT!

Of the others, we’re definitely looking at “trombone routing” of the traffic, in most cases back to the UK mainland.

This may not be entirely surprising, as I’m told that BT don’t provide a 21CN interconnect node in Northern Ireland, so traffic on BT wholesale access products will “trombone” through the mainland in any case.

So, what’s really needed in Northern Ireland?

We’ve shown that if networks are willing to buy capacity to Dublin, they can happily exchange traffic at INEX and keep the latency down. An obvious concern some may have, especially for NI-to-NI traffic, is the export of traffic from one jurisdiction to another, particularly in light of recent revelations about systemic monitoring.

The utility of an IX in Northern Ireland could be hampered by the lack of BT 21CN interconnect capability: for all intents and purposes, it may as well be in Glasgow, the nearest interconnect, because traffic will still make two trips across the Irish Sea if one end or the other is on a BT wholesale pipe. (At worst, it could be four trips if both ends are!)

If the goal is to foster internet growth (e.g. “production” of bandwidth) in Northern Ireland, where is it going to come from?

Are Northern Irish interests better served by connecting to the mature interconnect community in Dublin?

Is a BT 21CN interconnect in Belfast essential for growth, or can NI operators build around it?

Should INEX put a switch in Belfast? If they do, should it be backhauled to the larger community in Dublin? Or is that somehow overstepping the remit of an exchange point? 

AMS-IX: Green Light to Incorporate US entity

Members of the Dutch Amsterdam Internet Exchange have given the organisation a green light to incorporate a US entity in order to engage with the Open IX initiative and have the ability to run an exchange in the US while minimising risk to the Dutch association and the Dutch operating company.

This completes the announcements from the big 3 European exchanges (LINX, AMS-IX and DECIX) to operate interconnection services in the US, with the first to make an overt move being LINX, who are in the process of establishing an operation in Northern Virginia. DECIX issued a press release last week that they plan to enter the New York market, and now AMS-IX have a member endorsement to make a move.

There have been concerns amongst the Dutch technical community, who have long held AMS-IX in high regard, that establishing operations in the US will leave AMS-IX as a whole vulnerable to the sort of systemic monitoring that has been revealed in the press in past weeks. While this is partly why the AMS-IX company proposed a separate legal entity, in order to hold the US operations at arm’s length, is it enough for some of the Dutch community? It seems not. In this message the Dutch R&E network SURFnet argue that the whole thing was rushed and might not be in the best interests of the community, and say they voted against the move.

It has been noted that members of the Open IX community, including members of the Open IX Board, were openly calling for AMS-IX members to vote “YES”, and suggesting they also “go out and get 5 other votes”.

What do people think about that? Given that an IX that affiliates to Open IX will have to pay Open IX membership dues, was it right of them to appear to lobby AMS-IX members?

What do people think about the establishment of the separate legal entity? Will this be enough?

Has this done lasting damage to the standing of AMS-IX in the Dutch networking community? Does this matter, or has AMS-IX grown so large that such goodwill doesn’t matter anymore?

On the bigger question, is this sort of thing damaging in the long term to the EU peering community? Does the growth into different countries with different cultures threaten to dilute the member-based ethos that defines a lot of EU exchanges? Or is that just another management challenge for the IX operator to solve?

Might Equinix, who have so far not directly competed with the established EU exchanges, decide they are taking the gloves off and start their own European IX operations in a turf war?

Interesting times.

The Network Engineering “Skills Gap”

Talking to colleagues in the industry, there’s anecdotal evidence that they are having trouble finding suitable candidates for mid-level Network Engineering roles. They have vacancies which have gone unfilled for some time for want of the right people, or ask where they can go to find good generalists that have a grasp of the whole ecosystem rather than some small corner of it.

Basically, a “skills gap” seems to have opened up in the industry, whereby there are some good all-rounders at a fairly senior level, but trying to find an individual with a few years’ experience and a good grounding in IP networking, system administration (and maybe a bit of coding/scripting), network services (such as DNS) and basic security is very difficult.

Instead, candidates have become siloed, from the basic “network guy/systems guy” split to vendor, technology and service specific skills.

This is even more concerning given the overall trend in the industry toward increasing automation of networking infrastructure deployment and management and a tendency to integrate and coalesce with the service infrastructure such as the data centre and the things in it (such as servers, storage, etc.) – “the data centre as the computer”.

This doesn’t work when there are black and white divisions between the “network guy” and the “server guy” and their specific knowledge.

So, how did we get where we are? Firstly, off down a side-track into some self-indulgence…

I consider myself to be one of the more “all round” guys, although I’ve definitely got more of a lean toward physical networking infrastructure as a result of the roles I’ve had and the direction these took me in.

I come from a generation of engineers who joined the industry during the mid-90’s, when the Internet started to move from the preserve of researchers, academics, and the hardcore geeks, to becoming a more frequently used tool of communication.

Starting out as an Internet user at University (remember NCSA Mosaic and Netscape 0.9?), I got myself a modem and a dialup connection, initially for use when I was back home during the holidays and away from the University’s computing facilities, all thanks to Demon Internet and their “tenner a month” philosophy that meant even poor students like me could afford it. Back then, to get online via dialup, you had to have some grasp of what was going on under the skin, so you could work out what had gone wrong when things didn’t work. Demonites will have “fond” memories of KA9Q, or the motley collection of things which allowed you to connect using Windows. Back then, TCP/IP stacks were not standard!

So, out I came from University, and fell into a job in the ISP industry.

Back then, you tended to start at the bottom, working in “support”, which in some respects was your apprenticeship in “the Internet”, learning along the way and touching almost all areas – dialup, hosting, leased lines, ISDN, mail, nntp, Unix sysadmin, etc.

Also, the customers you were talking to were either fellow techies running the IT infrastructure in a business customer, or fellow geeks that were home users. They tended to have the same inquisitiveness that attracted you to the industry, and were on some level a peer.

Those with ambition, skill or natural flair soon found themselves climbing the greasy pole, moving up into more senior roles, handling escalations, or transferring into the systems team that maintained the network and servers. My own natural skill was in networking, and that’s where I ended up. But that didn’t mean I forgot how to work on a Unix command line. Those skills came in useful when building the instrumentation which helped me run the network. I could set up stats collection and monitoring without having to ask someone else to do it for me, which meant I wasn’t beholden to their priorities.

Many of my industry peers date from this period of rapid growth of the Internet.

Where did it start going wrong?

There are a few sources. Like a fire, which needs several conditions to exist before it will burn, I think a number of things have come together to create the situation that exists today.

My first theory is that the growth in outsourcing and offshoring of entry-level roles during the boom years largely cut off this “apprenticeship” route into the industry. There just weren’t enough support-tech jobs left in the countries which now have demand for the engineers those techs might have become.

Coupled with that is the transition of the support level jobs from inquisitive fault-finding and diagnosis to a flowchart-led “reboot/reinstall”, “is it plugged in?” de-skilled operation that seemed to primarily exist for the frustrated to yell at when things didn’t work.

People with half a clue, who had the ability to grow into good all-round engineers, might not have wanted these jobs even if they still existed locally and they were interested in joining the industry, because the roles had turned into being verbal punchbags for the rude and technically challenged. (This had already started to some extent in the mid-90s.)

Obviously, the people in these roles by the 2000s weren’t on a fast track to network engineering careers, they were call-centre staff.

My second theory is that vendor specific certification caused a silo mentality to develop. As the all-round apprenticeship of helpdesk work evaporated, did people look to certification to help them get jobs and progress their careers? I suspect this is the case, as there was a growth in the number of various certifications being offered by networking equipment vendors.

This isn’t a criticism of vendor certification per se; it has its place when it’s put in the context of a network engineer’s general knowledge. But when vendor certification is the majority of that engineer’s knowledge, what this leaves is a person who is good on paper, but can’t cope with being taken off the map, and tends to have difficulty with heterogeneous networking environments.

The other problem sometimes encountered is that people have done enough training to understand the theory, but they haven’t been exposed to enough real-world examples to get their head around the practice. Some have been taught the network equivalent of flying a Boeing 747 or Airbus A380 on its extensive automation, without understanding the basics (and fun) of stick-and-rudder flying in a little Cessna.

They haven’t got the experience that being in a “learning on the job” environment brings, and can’t always rationalise why things didn’t work out the way they expected.

The third theory is that there was a divergence of the network from the systems attached to it. During the 2000s, it started to become too much work for the same guys to know everything, and so where there used to be a group of all-rounders, there ended up being “server guys” and “network guys”. The network guys often didn’t know how to write scripts or understand basic system administration.

Finally, it seems we made networking about as glamorous as plumbing. Young folk wanted to go where the cool stuff is, and so fell into Web 2.0 companies and app development, rather than following a career in unblocking virtual drainpipes.

How do we fix it?

There’s no mistaking that this needs to be fixed. The network needs good all-round engineers to be able to deliver what’s going to be asked of it in the coming years.

People wonder why technologies such as IPv6, RPKI and DNSSEC are slow to deploy. I strongly believe that this skills gap is just one reason.

We’ve all heard the term “DevOps”, and whether or not we like it (it can provoke holy wars), it is an embodiment of the well-rounded skill set that a lot of network operators are now looking for.

Convergence of the network and server environment is growing too. I know Software Defined Networking is often used as a buzzword, but there’s a growing need for people that can understand the interactions, and be able to apply their knowledge to the software-based tools which will be at the heart of such network deployments.

There’s no silver bullet though.

Back in the 2000s, my former employer, LINX, became so concerned about the lack of good network engineering talent, and woeful vendor-specific training, that it launched the LINX Accredited Internet Technician programme, working with a training partner to build and deliver a series of platform-agnostic courses which built good all-round Network Engineering skills and how to apply these in the field. These courses are still delivered today through the training partner (SNT), while the syllabus is reviewed and updated to ensure its continuing relevance.

IPv6 pioneers HE.net offer a number of online courses in programming languages which are useful to the Network Engineer, in addition to their IPv6 certification programme.

There is also an effort called OpsSchool, which is building a comprehensive syllabus of things Operations Engineers need to know – trying to replicate the solid grounding in technology and techniques that would previously have been picked up on the job in a helpdesk role, but for the current environment.

We’ve also got attempts to build the inquisitiveness in younger people with projects such as the Raspberry Pi, while venues such as hackspaces and “hacker camps” such as OHM, CCC and EMF exist as venues to exchange knowledge with like-minded folk and maybe learn something new.

We will need to cut our existing network and systems people a bit of slack, and let them embark on their own learning curves to fill the gaps in their knowledge, recognise that their job has changed around them, and make sure they are properly supported.

The fact is that we’re likely to be in this position for a few years yet…

Anti-spoofing filters, BCP38, IETF SAVI and your network

I was invited to present at the recent IX Leeds open meeting, as “someone neutral”, on the topic of BCP38 – largely in relation to the effects of not deploying it, not just on the wider Internet, but on your IP networking business (if you have one) and on the networks you interconnect with.

I basically broke the topic down:

Introduction: I started by introducing the problem in respect of the attack (“that nearly broke the Internet”) on the CloudFlare hosted Spamhaus website in March 2013.

What and how: Quick overview of address spoofing and how a backscatter amplification attack works.

What you should do: BCP38, uRPF, etc., and what you need to do, and what to ask your suppliers.

Why you should care: Yes, it benefits others, but you have costs in terms of bandwidth and abuse/security response too.

The bleeding edge: IETF SAVI working group.

It wasn’t meant to be a technical how-to, but a non-partisan awareness raiser, as the IX Leeds meeting audiences aren’t full of “usual suspects” but people who are less likely to have been exposed to this.

It’s important to get people doing source address filtering and validation, both themselves, and asking their suppliers for it where it’s appropriate.

Here’s the slide deck (.pdf) if you’re interested.

Why a little thing called BCP38 should be followed

A couple of weeks ago, there was a DDoS attack billed as “the biggest attack to date” which nearly broke the Internet (even if that hasn’t been proved).

If you’ve been holidaying in splendid isolation, an anti-spam group and a Dutch hosting outfit had a fallout, resulting in some cyber-floods, catching hosting provider CloudFlare in the middle.

The mode of the attack was such that it used two vulnerabilities in systems attached to the internet:

  • Open DNS Resolvers – “directory” servers which were poorly managed, and would answer any query directed to them, regardless of its origin.
    • Ordinarily, a properly configured DNS resolver will only answer queries from its defined subscriber base.
  • The ability of a system to send traffic to the internet with an IP address other than the one configured.
    • Normally, an application will use whichever address is configured on the interface, but it is possible to send with another address – commonly used for testing, research or debugging.

The Open Resolver issue has already been well documented with respect to this particular attack.

However, there’s not been that much noise about spoofed source addresses, and how ISPs could apply a thing called BCP 38 to combat this.

For the attack to work properly, what was needed was an army of compromised “zombie” computers under the control of miscreants, able to send traffic onto the Internet with a source address other than their own, plus the Open Resolvers.

Packets get sent from the compromised “zombie army” to the open resolvers, but not with the real source IP addresses, instead using the source address of the victim(s).

The responses therefore don’t return to the zombies, but all to the victim addresses.

It’s like sending letters with someone else’s address as a reply address. You don’t care that you don’t get the reply, you want the reply to go to the victim.

Filtering according to BCP 38 would stop the “spoofing” – the ability to use a source IP address other than one belonging to the network the computer is actually attached to. BCP 38 indicates the application of IP address filters or a check that an appropriate “reverse path” exists, which only admits traffic from expected source IP addresses.
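The decision a BCP 38 filter makes is simple: only forward packets whose source address belongs to the networks legitimately behind that point. In practice this is a router ACL or a uRPF check, but the logic can be sketched in a few lines – the prefixes below are hypothetical documentation ranges, not anyone’s real allocation:

```python
# Minimal sketch of a BCP 38-style source-address check. A real
# deployment would be an ACL or uRPF check on the edge router; this
# just illustrates the decision being made. Prefixes are hypothetical.
import ipaddress

# Prefixes legitimately downstream of this edge
DOWNSTREAM = [ipaddress.ip_network("192.0.2.0/24"),
              ipaddress.ip_network("198.51.100.0/25")]

def permit_source(src_ip):
    """Forward only packets whose source lies in a downstream prefix."""
    addr = ipaddress.ip_address(src_ip)
    return any(addr in net for net in DOWNSTREAM)

permit_source("192.0.2.45")      # on-net source: forwarded
permit_source("203.0.113.9")     # spoofed/off-net source: dropped
```

Note how short the permit list is at the customer edge – which is precisely why, as discussed below, that’s the right place to filter.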

BCP stands for “Best Current Practice” – so if it’s “Best” and “Current” why are enough ISPs not doing it to allow for an attack as big as this one?

The belief seems to be that applying BCP 38 is “hard” (or potentially too expensive based on actual benefit) for ISPs to do. It certainly might be hard to apply BCP 38 filters in some places, especially as you get closer to the “centre” of the Internet – the lists would be very big, and possibly a challenge to maintain, even with the necessary automation.

However, if that’s where people are looking to apply BCP 38 – at the point where ISPs interconnect, or where ISPs attach multi-homed customers – then they are almost certainly looking in the wrong place. If you filter there, if you’ve any attack traffic from customers in your network, you’ve already carried it across your network. If you’ve got Open Resolvers in your network, you’ve already delivered the attack traffic to the intermediate point in the attack.

The place where BCP 38 type filtering is best implemented is close to the downstream customer edge – in the “stub” networks – such as access networks, hosting networks, etc. This is because the network operator should know exactly which source IP addresses it should be expecting at that level in the network – it doesn’t need to be as granular as per-DSL customer or per-hosting customer, but at least don’t allow traffic to pass from “off net” source addresses.

I actually implement BCP 38 myself on my home DSL router. It’s configured so it will only forward packets to the Internet from the addresses which are downstream of the router. I suspect my own ISP does the “right thing”, and I know that I’ve got servers elsewhere in the world where the hosting company does apply BCP 38, but it can’t be universal. We know that from the “success” of the recent attack.

Right now, the situation is that many networks don’t seem to implement BCP 38. But if enough networks started to implement BCP 38 filtering, the ones who didn’t would be in the minority, and this would allow peer pressure to be brought to bear on them to “do the right thing”.

Sure, it may be a case of the good guys closing one door, only for the bad guys to open another, but any step which increases the difficulty for the bad guys can’t be a bad thing, right?

We should have a discussion on this at UKNOF 25 next week, and I dare say at many other upcoming Internet Operations and Security forums.