The Network Engineering “Skills Gap”

Talking to colleagues in the industry, there’s anecdotal evidence that they are having trouble finding suitable candidates for mid-level Network Engineering roles. They have vacancies which have gone unfilled for some time for want of the right people, or ask where they can go to find good generalists that have a grasp of the whole ecosystem rather than some small corner of it.

Basically, a “skills gap” seems to have opened up in the industry, whereby there are some good all-rounders at a fairly senior level, but finding an individual with a few years’ experience and a good grounding in IP networking, system administration (and maybe a bit of coding/scripting), network services (such as DNS) and basic security is very difficult.

Instead, candidates have become siloed, from the basic “network guy/systems guy” split to vendor, technology and service specific skills.

This is even more concerning given the overall trend in the industry toward increasing automation of networking infrastructure deployment and management and a tendency to integrate and coalesce with the service infrastructure such as the data centre and the things in it (such as servers, storage, etc.) – “the data centre as the computer”.

This doesn’t work when there are black and white divisions between the “network guy” and the “server guy” and their specific knowledge.

So, how did we get where we are? Firstly, off down a side-track into some self-indulgence…

I consider myself to be one of the more “all round” guys, although I’ve definitely got more of a lean toward physical networking infrastructure as a result of the roles I’ve had and the direction these took me in.

I come from a generation of engineers who joined the industry during the mid-90s, when the Internet started to move from the preserve of researchers, academics, and the hardcore geeks, to becoming a more frequently used tool of communication.

Starting out as an Internet user at University (remember NCSA Mosaic and Netscape 0.9?) I got myself a modem and a dialup connection, initially for use when I was back home during the holidays and away from the University’s computing facilities, all thanks to Demon Internet and their “tenner a month” philosophy that meant even poor students like me could afford it. Back then, to get online via dialup, you had to have some grasp of what was going on under the skin when you went online, so you could work out what had gone wrong when things didn’t work. Demonites will have “fond” memories of KA9Q, or the motley collection of things which allowed you to connect using Windows. Back then, TCP/IP stacks were not standard!

So, out I came from University, and fell into a job in the ISP industry.

Back then, you tended to start at the bottom, working in “support”, which in some respects was your apprenticeship in “the Internet”, learning along the way, and touching almost all areas – dialup, hosting, leased lines, ISDN, mail, NNTP, Unix sysadmin, etc.

Also, the customers you were talking to were either fellow techies running the IT infrastructure at a business customer, or fellow geeks who were home users. They tended to have the same inquisitiveness that attracted you to the industry, and were on some level a peer.

Those with ambition, skill or natural flair soon found themselves climbing the greasy pole, moving up into more senior roles, handling escalations, or transferring into the systems team that maintained the network and servers. My own natural skill was in networking, and that’s where I ended up. But that didn’t mean I forgot how to work on a Unix command line. Those skills came in useful when building the instrumentation which helped me run the network. I could set up stats collection and monitoring without having to ask someone else to do it for me, which meant I wasn’t beholden to their priorities.
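
To give a flavour of the sort of quick, self-service instrumentation I mean (the device names, log file and ping flags below are purely illustrative), a few lines of Python dropped into cron can record reachability and how long each probe took, without anyone else’s priorities getting in the way:

```python
#!/usr/bin/env python3
"""A minimal reachability logger - a sketch, not production monitoring.

The device names, log file name and ping flags are illustrative assumptions;
the "-W 2" timeout flag is the Linux form of ping, so adjust for your platform.
Run it from cron and graph the CSV later with whatever you like.
"""
import csv
import subprocess
import time
from datetime import datetime, timezone

DEVICES = ["core1.example.net", "core2.example.net", "edge1.example.net"]
LOGFILE = "reachability.csv"


def probe(host):
    """Send a single ping and return (reachable, seconds_the_probe_took)."""
    started = time.monotonic()
    result = subprocess.run(
        ["ping", "-c", "1", "-W", "2", host],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0, time.monotonic() - started


with open(LOGFILE, "a", newline="") as logfile:
    writer = csv.writer(logfile)
    for device in DEVICES:
        up, elapsed = probe(device)
        writer.writerow([
            datetime.now(timezone.utc).isoformat(),
            device,
            "up" if up else "down",
            f"{elapsed:.3f}",
        ])
```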

Many of my industry peers date from this period of rapid growth of the Internet.

Where did it start going wrong?

There are a few sources. Like a fire, which needs a number of conditions to exist before it will burn, I think a number of things have come together to create the situation that exists today.

My first theory is that the growth in outsourcing and offshoring of entry-level roles during the boom years largely cut off this “apprenticeship” route into the industry. There just weren’t enough support-tech jobs left in the countries which now have the demand for the engineers most of those support techs might have become.

Coupled with that is the transition of the support level jobs from inquisitive fault-finding and diagnosis to a flowchart-led “reboot/reinstall”, “is it plugged in?” de-skilled operation that seemed to primarily exist for the frustrated to yell at when things didn’t work.

People with half a clue, who had the ability to grow into good all-round engineers, might not have wanted these jobs even if they still existed locally and they were interested in joining the industry, because the role had become that of verbal punchbag for the rude and technically challenged. (This had already started to some extent in the mid-90s.)

Obviously, the people in these roles by the 2000s weren’t on a fast track to network engineering careers; they were call-centre staff.

My second theory is that vendor specific certification caused a silo mentality to develop. As the all-round apprenticeship of helpdesk work evaporated, did people look to certification to help them get jobs and progress their careers? I suspect this is the case, as there was a growth in the number of various certifications being offered by networking equipment vendors.

This isn’t a criticism of vendor certification per se; it has its place when it’s put in the context of a network engineer’s general knowledge. But when the vendor certification is the majority of that engineer’s knowledge, what you’re left with is a person who is good on paper, but can’t cope with being taken off the map, and tends to have difficulty with heterogeneous networking environments.

The other problem sometimes encountered is that people have done enough training to understand the theory, but they haven’t been exposed to enough real-world examples to get their heads around the practice. Some have been taught the network equivalent of how to fly a Boeing 747 or Airbus A380 on its extensive automation, without understanding the basics (and fun) of flying stick-and-rudder in a little Cessna.

They haven’t got the experience that being in a “learning on the job” environment brings, and can’t always rationalise why things didn’t work out the way they expected.

The third theory is that there was a divergence of the network from the systems attached to it. During the 2000s, it started to become too much work for the same guys to know everything, and so where there used to be a group of all-rounders, there ended up being “server guys” and “network guys”. The network guys often didn’t know how to write scripts or understand basic system administration.

Finally, it seems we made networking about as glamorous as plumbing. Young folk wanted to go where the cool stuff is, and so fell into Web 2.0 companies and app development, rather than following a career in unblocking virtual drainpipes.

How do we fix it?

There’s no mistaking that this needs to be fixed. The network needs good all-round engineers to be able to deliver what’s going to be asked of it in the coming years.

People wonder why technologies such as IPv6, RPKI and DNSSEC are slow to deploy. I strongly believe that this skills gap is just one reason.

We’ve all heard the term “DevOps”, and whether or not we like it (it can provoke holy wars), it is an embodiment of the well-rounded skill set that a lot of network operators are now looking for.

Convergence of the network and server environment is growing too. I know Software Defined Networking is often used as a buzzword, but there’s a growing need for people that can understand the interactions, and be able to apply their knowledge to the software-based tools which will be at the heart of such network deployments.
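
To make that a little more concrete, here’s a small sketch of the kind of crossover task involved (the template, interface names and addressing are made up for illustration): treating device configuration as data plus a template, rather than something typed by hand, is exactly the sort of skill that sits between the “network guy” and the “server guy”:

```python
#!/usr/bin/env python3
"""Sketch: render per-device interface config from structured data.

The config syntax, interface names and addressing plan are illustrative only;
the point is treating configuration as data plus a template, which is the
sort of crossover skill the DevOps/SDN shift asks of network engineers.
"""
from string import Template

INTERFACE_TEMPLATE = Template(
    "interface $ifname\n"
    " description $description\n"
    " ip address $v4addr\n"
    " ipv6 address $v6addr\n"
    "!"
)

# One dict per interface; in practice this might come from a YAML file or an IPAM.
interfaces = [
    {"ifname": "GigabitEthernet0/1", "description": "uplink to core1",
     "v4addr": "192.0.2.1 255.255.255.252", "v6addr": "2001:db8:0:1::1/64"},
    {"ifname": "GigabitEthernet0/2", "description": "uplink to core2",
     "v4addr": "192.0.2.5 255.255.255.252", "v6addr": "2001:db8:0:2::1/64"},
]

for intf in interfaces:
    print(INTERFACE_TEMPLATE.substitute(intf))
```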

There’s no silver bullet though.

Back in the 2000s, my former employer, LINX, became so concerned about the lack of good network engineering talent, and woeful vendor-specific training, that it launched the LINX Accredited Internet Technician programme, working with a training partner to build and deliver a series of platform-agnostic courses which taught good all-round Network Engineering skills and how to apply these in the field. These courses are still delivered today through the training partner (SNT), while the syllabus is reviewed and updated to ensure its continuing relevance.

IPv6 pioneers HE.net offer a number of online courses in programming languages which are useful to the Network Engineer, in addition to their IPv6 certification programme.

There is also an effort called OpsSchool, which is building a comprehensive syllabus of the things Operations Engineers need to know – trying to replicate, for the current environment, the solid grounding in technology and techniques that would previously have been picked up on the job while working in a helpdesk role.

We’ve also got attempts to build inquisitiveness in younger people with projects such as the Raspberry Pi, while hackspaces and “hacker camps” such as OHM, CCC and EMF offer places to exchange knowledge with like-minded folk and maybe learn something new.

We will need to cut our existing network and systems people a bit of slack, and let them embark on their own learning curves to fill the gaps in their knowledge, recognise that their job has changed around them, and make sure they are properly supported.

The fact is that we’re likely to be in this position for a few years yet…

Is the Internet facing a “perfect storm”?

The Internet has become a massive part of our everyday lives. If you walk down a British high street, you can’t fail to notice people staring into their phones rather than looking where they are going! I did see a comment on TV this week that you have a 1-in-10 chance of tripping and falling over when walking along looking at your phone and messaging…

There are massive pushes for faster access in countries which already have widespread Internet adoption, both over fixed infrastructure (such as FTTC and FTTH) and wireless (LTE, aka 4G), which at times isn’t without controversy. In the UK, the incumbent, BT, is commonly (and sometimes unfairly) criticised for trying to sweat more and more out of its copper last-mile infrastructure (the wires that go into people’s homes), while not doing enough to “future-proof” and enable remote areas by investing in fibre. There have also been problems over the UK regulator’s decision to allow one mobile phone network to get a head start on its competitors in offering LTE/4G service, using existing allocated radio frequencies (a process known as “spectrum refarming”).

Why do people care? Because the Internet helps foster growth and can reduce the costs of doing business, and it’s why developing countries are working desperately hard to drive Internet adoption, along the way having to manage the threats of “interfering” actors who either don’t fully understand it or fear change.

However, a bigger threat could be facing the Internet, and it’s coming from multiple angles, technical and non-technical. A perfect storm?

  • IPv4 Resource Exhaustion
    • The existing addressing (numbering) scheme used by the Internet is running out
    • A secondary market for “spare” IPv4 resources is developing; IPv4 addresses will acquire a monetary value, driven by the lack of IPv6 deployment
  • Slow IPv6 Adoption
  • Increasing Regulatory attention
    • On a national level, such as the French Regulator, ARCEP, wishing to collect details on all interconnects in France or involving French entities
    • On a regional level, such as ETNO pushing for regulation of interconnect through use of QoS – nicely de-constructed by my learned colleague Geoff Huston – possibly an attempt to retroactively fix a broken business model?
    • On a global level through the ITU, who, having disregarded the Internet as “something for academics” and not relevant to public communications back in 1988, now want to update the International Telecommunication Regulations to extend them to cover who “controls the Internet” and how.

All of these things threaten some of the basic foundations of the Internet we have today:

  • The Internet is “open” – anyone can connect, it’s agnostic to the data which is run over it, and this allows people to innovate
  • The Internet is “transparent” – managed using a bottom-up process of policy making and protocol development which is open to all
  • The Internet is “cheap” – relatively speaking, Internet service is inexpensive

These challenges facing the Internet combine to break all of the above.

Close the system off, drive costs up, and make development and co-ordination an invite-only closed shop in which it’s expensive to participate.

Time and effort, and investing a little money (in deploying IPv6, in some regulatory efforts, and in checking your business model is still valid), are the main things which will head off this approaching storm.

Adopting IPv6 should just be a (stay in) business decision. It’s something operational and technical that a business is in control of.

But the regulatory aspect is tougher, unless you are big enough to be able to afford your own lobbyists. Fortunately, if you live in the UK, it’s not reached “write to your MP” time, not just yet. The UK’s position remains one of “light touch” regulation, largely letting the industry regulate itself through market forces, and this is being advocated to the ITU. There are also some very bright, talented and respected people trying to get the message through that it’s economically advantageous not to make the Internet a closed, top-down operated system.

Nevertheless, the challenges remain very much real. We live in interesting times.

Recent IPv4 Depletion Events

Those of you who follow these things can’t have missed that the RIPE NCC got down to its last /8 of unallocated IPv4 space last week.

They even made a cake to celebrate…

Photo (and cake?) by Rumy Spratley-Kanis

This means the RIPE NCC are down to their last 16 million IPv4 addresses, and they can’t get another big block allocated to them, because there aren’t any more to give out.
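
For anyone wondering where the “16 million” comes from, it’s just the 24 host bits left over in a /8 of the 32-bit IPv4 address space:

```python
# A /8 leaves 32 - 8 = 24 bits of address space.
print(2 ** (32 - 8))  # 16777216, i.e. roughly 16.8 million addresses
```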

Comcast Residential IPv6 Deployment Pilot

Comcast, long active in the IPv6 arena, have announced that they will be doing a native residential IPv6 deployment in Pleasanton, CA, on the edge of the San Francisco Bay Area, which will be a dual-stacked, native v4/v6 deployment with no NAT.

This is a much needed move to try and break the deadlock that seems to have been holding back wide scale v6 deployment in mass market broadband providers. Apart from isolated islands of activity such as XS4ALL‘s pioneering work in the Netherlands, v6 deployment has largely been available only as an option from providers focused on the tech savvy user (such as A&A in the UK).

Sure, it’s a limited trial, and initially aimed at single devices only (i.e. one device connected directly to the cable modem), but it’s a start, and there’s plans to expand this as experience is gained.

Read these good blog articles from Comcast’s John Brzozowski and Jason Livingood about the deployment and its aims.

Just let IPv4 run out. It’s over. Just get on with it.

So, I’m currently at the RIPE 63 meeting in Vienna. Obviously, one of the ongoing hot topics here is IPv4 depletion, with discussion at times consisting of a) the transition away from IPv4 to IPv6 via various transition mechanisms, and b) how to make the pitiful amount of IPv4 addressing that’s left last as long as possible.

One of the things often said about (b) is that it shouldn’t be done to death: IPv4 should just be allowed to run out, we should get over it, and we should deploy IPv6. However, (b) is exactly the behaviour to be expected when dealing with the exhaustion of a finite resource.

There are similarities and parallels to be drawn between IPv4 runout and IPv6 adoption on the one hand, and fossil fuel depletion and the movement to alternative energy technologies on the other. The early adopters and the laggards. The hoarders and speculators. The evangelists and the naysayers.

So, for a minute, don’t think about oil and gas resources being depleted; that’s way in the future. We’re facing one of the first examples of the exhaustion of a finite resource on which businesses and economies depend.

If the IPv4 depletion and IPv6 (slow) adoption situation is a dry run of what might actually happen when something like oil runs out, then we should be worried, because we can’t just rely on carrier grade NAT to save us.

Farewell IANA Free Pool…

Or, with apologies to Rolf Harris, “Can you guess what it is yet?”…

The NRO are inviting us to a webcast of a “special announcement” tomorrow. I wonder what it could be?

Might it be the end of the IANA IPv4 free pool? Or could it be that a few more /8s have been found down the back of the sofa? The latter is very unlikely.

We’re probably looking at a ceremonial doling out of the remaining /8s to the various RIRs.

While it may look a bit profligate to fly a load of RIR folk to Miami, it’s probably a necessary media stunt, as implementers and vendors have been sodding around, sat on their hands, for long enough, to the point that many folks’ home broadband routers and systems won’t do IPv6 and therefore can’t support a dual-stacked (v4 and v6 enabled) environment.

(The Real) Geoff Huston has, as usual, produced an interesting graph:

Per RIR IPv4 depletion to /8 and probability of when it's likely to happen

So we should find APNIC moving to activate their “final /8 policy” first (assuming Cyclone Yasi doesn’t try to finish the job the Queensland floods started in Brisbane). The idea is that allocations from the final /8 will be issued in smaller blocks, and only one allocation can be made to each APNIC member LIR from that last /8, to try to give some level of fairness to the running out.
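
As a back-of-envelope illustration of how far a single /8 stretches under that kind of policy (the /22 allocation size here is an assumption for illustration; check the actual RIR policy text for the real figure):

```python
# How many "one per LIR" allocations fit in a final /8, assuming each
# allocation is a /22 (an assumption for illustration only).
addresses_in_slash8 = 2 ** (32 - 8)    # 16,777,216
addresses_in_slash22 = 2 ** (32 - 22)  # 1,024
print(addresses_in_slash8 // addresses_in_slash22)  # 16384 allocations
```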

Anyway, it will certainly be interesting to keep an eye on these graphs!

Fortunately (in some respects), the runout in the RIPE NCC region looks to still be about 12 months away. Still doesn’t give me the warm fuzzies.

Folk need to start using IPv6, and debugging what’s wrong with it to stand any chance of being ready.

World IPv6 Day

ISOC have announced the date for World IPv6 Day – mark it in your calendars now – 8th June 2011.

This will be the day that you will need extra support resources available to deal with the potential for brokenness which will ensue from folks with poor v6 implementations or incomplete v6 connectivity.

It may look a bit drastic to submit the Internet at large to this “experiment”, but I think this is an important and sensible move.

Folks such as Marco Hogewoning and Martin Levy (among many, many others, these are the first two who spring to mind if you mention IPv6 to me) have been demonstrating the various gotchas or downright brokenness which exists out there for some time now. Sadly, while the enlightened have no problem listening, some have been sat with their heads in the sand. The bad thing is that these are often the people who most need to listen – software and system vendors.

So, get the popcorn ready, be ready for the unexpected, and watch what happens.

In the meantime, you might want to check how IPv6-ready your network is… not just how much it claims to be.
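
As a rough sketch of what “checking” might involve (the host name and port below are just examples, so substitute your own services), the snippet asks the resolver for AAAA records and then tries an IPv6 TCP connection, which separates “no v6 address published” from “v6 published but unreachable”:

```python
#!/usr/bin/env python3
"""Rough IPv6 reachability check - a sketch, not a test suite.

Looks up AAAA records for a host and attempts a TCP connection over IPv6.
The target host and port are examples; point it at services you care about.
"""
import socket

HOST = "www.example.com"   # substitute your own service
PORT = 80

try:
    results = socket.getaddrinfo(HOST, PORT, socket.AF_INET6, socket.SOCK_STREAM)
except socket.gaierror:
    results = []

if not results:
    print(f"{HOST}: no AAAA records - not reachable over IPv6")
else:
    addr = results[0][4]  # (address, port, flowinfo, scope_id) for IPv6
    print(f"{HOST}: AAAA found, trying {addr[0]} ...")
    try:
        with socket.create_connection((addr[0], PORT), timeout=5):
            print(f"{HOST}: IPv6 TCP connection to port {PORT} succeeded")
    except OSError as exc:
        print(f"{HOST}: IPv6 published but connection failed: {exc}")
```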

Best of all, the N(ew/A)NOG 52 meeting will be happening in Denver the following week, so we get to have a discussion of the aftermath in near real time. I’d love to link to it, but there’s nowhere to go.

At least it’s a Wednesday – no-one’s Friday is being ruined. 🙂