The importance of better WiFi – in a 1 Gbps world

We still think the understanding of multi-AP WiFi is in the dark ages, so we are going to try to sum up the articles we have written in the past few months, and then we promise to leave the subject alone for a while. We spend so much time on it only because WiFi will eventually be the main weak point in delivery for the entire visual entertainment industry.

Ask an ISP what keeps it awake at night and it is likely you will get two separate answers – low Net Promoter Scores and WiFi – and these can often be part and parcel of the same problem. If all video is to go over broadband soon, we need to fix this.

Winners and losers in internet connectivity across the US and Western Europe over the next five years will come down to at least one operator in each country managing to survive the shift to all video being delivered online with its Net Promoter Score (NPS) intact. A Net Promoter Score is simply a survey of what customers would say about your products to other potential customers.

The answer to this problem is of course better WiFi, but there is currently a massive quasi-religious argument about what makes “better” WiFi. There are the hardware adherents who say it is the 802.11ax standard, there are cellular players who will say it should be replaced by 5G, but the clear winner in delivering “pain-free” WiFi to operators is the process of delivering to more than one WiFi Access Point, where those APs work together to solve WiFi problems, using some form of cloud control and local optimization of performance.

Actively managed WiFi with 2 or more APs in each home achieves roughly the PHY performance of the previous generation of WiFi. So while 802.11n claimed to bring a PHY rate of 600 Mbps to a home, it tended to bring a guaranteed rate across 90% of any given home of about 45 Mbps. Using a multi-AP system with next-generation 802.11ac chips, we have witnessed single-connection speeds above 600 Mbps (the 802.11n PHY rate), with total throughput to multiple devices around 85% of PHY – so over 1 Gbps.

A single AX router in a home of around 2,000 square feet is likely to give you 165 Mbps to all parts of the home, nothing better. Add 2 other APs and get them to work together, and the performance is off the charts.

The problem comes that there are at least two separate motives, perhaps 3 or 4, for trying to install a system of multiple, linked and cloud managed WiFi access points in ISPs’ broadband homes.

The first need of each operator is to stop the phone ringing with people reporting WiFi faults that usually cannot be resolved. The second is to cope with the sudden massive upswell in the number of devices and in the delivery of video: the total throughput needs of each home are going through the roof. In particular, there is a large Quality of Service problem if anything stops the largest smart TV in the home from getting its fair share of bandwidth for a shared TV experience. That is just a disaster.

Over the past five years we have interviewed countless vendors and operators about this issue and it comes out in many “slightly different” ways. Some say they want to use less resource on customer care – simply to save money, others say they would kill for an NPS of 40 or 50 (it goes from -100 to +100).
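For readers who have not met the metric, the -100 to +100 range falls straight out of how NPS is usually calculated: the percentage of promoters (ratings of 9 or 10) minus the percentage of detractors (ratings of 0 to 6). A minimal sketch, with an invented sample of ratings:

```python
def net_promoter_score(ratings):
    """Return NPS for a list of 0-10 'would you recommend us?' ratings."""
    if not ratings:
        raise ValueError("need at least one rating")
    promoters = sum(1 for r in ratings if r >= 9)   # ratings of 9 or 10
    detractors = sum(1 for r in ratings if r <= 6)  # ratings of 0 to 6
    # Passives (7 or 8) count in the total but in neither group.
    return 100.0 * (promoters - detractors) / len(ratings)

# Invented survey sample, purely for illustration.
sample = [10, 9, 9, 8, 7, 6, 3, 10, 9, 5]
print(net_promoter_score(sample))  # 5 promoters, 3 detractors out of 10 -> 20.0
```

An operator where every customer is a detractor scores -100, and one where every customer is a promoter scores +100, which is why an NPS of 40 or 50 is something to envy.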

Some vendors focus almost entirely on what is happening to the WiFi streams as they flow around the home. Look at an analytics package, and you can see 600 Mbps streams maintained for hours at a time, but precise streaming numbers will vary with any changes in WiFi activity.

There are a handful of techniques at the local level – you can use a router resident algorithm to ensure that all devices have fair access to spectrum, and this can be far easier to achieve if you have Multi-User MIMO in your device, so that each AP can talk to more than one device at a time. But you can also tell a device it is connected to the wrong Access Point and either abandon that connection, so that it connects to the right one, or slow it down and under-resource it so that it “steers” to the right one by itself.

Manufacturers like Apple resist anyone telling them that their algorithms for picking an Access Point are wrong, and Apple devices will, for instance, blacklist an Access Point that behaves in a way they see as “unpredictable.” So you have to treat Apple devices carefully and remember their preferences.

The most important thing is to use both the 2.4 GHz and 5 GHz bands and to move devices between them, because in some parts of the world 2.4 GHz suffers massive interference from the proliferation of other devices in this spectrum, and yet it remains the default setting on most phones. Over time, as LAA and other services arrive in which a cellular service uses 5 GHz spectrum, this may also become true of 5 GHz, which is why the industry is constantly looking for new unlicensed spectrum for WiFi.

So when you move a device from one AP to another, is your algorithm based on which AP has the strongest signal, which of them is closest, which of them has the least work to do, or on the entire system’s health? A phone connected to such a system cannot know much about the entire health of the WiFi system in that particular home, but the local router can; while the local home cannot know much about the interference in WiFi across an entire MDU (Multi-Dwelling Unit) – but a cloud managed system can know plenty about it, because it may be managing much of the bandwidth in that MDU.
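The gap between “strongest signal” and “whole-system health” can be pictured with a small scoring sketch. Everything here is an illustrative assumption rather than any vendor’s algorithm: the names `APView`, `score` and `choose_ap`, the weights, and the switching margin are all invented.

```python
from dataclasses import dataclass

@dataclass
class APView:
    name: str
    rssi_dbm: float      # signal strength this AP sees from the client
    airtime_busy: float  # fraction of airtime already in use, 0.0 to 1.0

def score(ap, w_signal=1.0, w_load=40.0):
    # Higher is better. RSSI is negative dBm, so -50 beats -75.
    # The weights are arbitrary; a real system would tune them.
    return w_signal * ap.rssi_dbm - w_load * ap.airtime_busy

def choose_ap(current, candidates, margin=5.0):
    """Return the name of the AP the client should be on."""
    best = max(candidates, key=score)
    cur = next((ap for ap in candidates if ap.name == current), None)
    if cur is None:
        return best.name
    # Only move if the improvement is worth the disruption of a switch.
    if best.name != current and score(best) - score(cur) > margin:
        return best.name
    return current

aps = [
    APView("living_room", rssi_dbm=-50.0, airtime_busy=0.9),  # close but busy
    APView("hallway", rssi_dbm=-62.0, airtime_busy=0.1),      # further, idle
]
print(choose_ap("living_room", aps))  # hallway
```

A phone judging on signal strength alone would stay on the living room AP. Only something that can see both APs’ load, such as the local router or the cloud, has the inputs to prefer the hallway.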

This is where the different religious emphasis comes from. We have met people who say one particular approach is dumb, simply because their experience is with an environment which has only 4 concurrent WiFi devices operating per home, or because they have maximum bandwidth to the home of 100 Mbps. We have heard people say there is no need to do “steering,” when they don’t currently have analytics installed to tell them what is happening in real time to all the WiFi-carried processes in a home. We have heard others say that steering should only be reconsidered every ten seconds, otherwise you spend too much time changing your AP and nothing ever gets done. Others spend time looking for “interference” from neighboring WiFi setups and jump the entire system into different WiFi channels. Again, that is something you can do to excess and end up with poor performance, but you need to be able to do it.
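The “every ten seconds” position amounts to a rate limit on steering. A minimal sketch of that idea follows; the class name `SteeringLimiter` and its interface are hypothetical, and time is passed in explicitly so the behavior is easy to follow.

```python
class SteeringLimiter:
    """Allow at most one steer per client every min_interval_s seconds."""

    def __init__(self, min_interval_s=10.0):
        self.min_interval_s = min_interval_s
        self._last_steer = {}  # client id -> time of its last permitted steer

    def may_steer(self, client_id, now_s):
        last = self._last_steer.get(client_id)
        if last is not None and now_s - last < self.min_interval_s:
            return False  # too soon; leave the client where it is
        self._last_steer[client_id] = now_s
        return True

limiter = SteeringLimiter(min_interval_s=10.0)
print(limiter.may_steer("tv", now_s=0.0))   # True: first steer is allowed
print(limiter.may_steer("tv", now_s=4.0))   # False: only 4 s have passed
print(limiter.may_steer("tv", now_s=10.0))  # True: the interval has elapsed
```

The interval is the whole argument in one number: set it too low and clients thrash between APs, set it too high and a device that has walked to another room waits too long for relief.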

In all such cases there are reasons for people saying these things and it only really goes to show that sometimes decisions need to be made on the ground, in the home, and other times a better decision will be made once you have referenced “millions” of examples of outcomes which are held in the cloud. And sometimes priorities of one operator are totally different from another, as their contracts are, or their support functions.

Some operators don’t care which approach is taken, but simply insist on being able to “see” what is going on, so their customer care can “take care” of it while on a call. Others point out that this simply proliferates customer care, and ask whether it can be delegated to an AI that makes a fresh policy and solves problems as they arise, so the operator never sees them.

It makes a massive difference if your average home has 10 WiFi devices or 4 or 16, and if they are all using video at once or never use video. If you offer an option to connect the TV to the router via an Ethernet cable and most people take it, killing off your biggest WiFi workload in the process, your network will behave differently again. But not everyone can rely on installing new wires.

A lot of the outcomes of this end up on customer forums. We were pointed at one AT&T customer forum recently and it told us that the AirTies system simply doesn’t work. On investigation we found out that a rogue reseller decided he could make a killing on AirTies devices and sold them to homes as if they were a retail proposition – working alongside routers which did not have AirTies mesh inside. Selling a system of multiple APs into a home that has no cloud management and no master system in-home was tantamount to sabotage. And yet it was the first thing one rival emailed us.

This has come about because consumers in the US are used to ignoring what their operators tell them to use, then going to the shop and buying a “faster” router.

This is a bit like you going to the shop and buying your own base station to work with an AT&T phone and finding it would not connect. A reseller can see that a $35 node is cheap, compared to a $200 router, and buys a handful and resells them for $69, without understanding they will not work.

And yet when we spoke to Plume a few weeks back, it cited this as “proving” that the AirTies system does not work. As we said, it is a “quasi-religious war,” which really means it is a fake war. An operator has to partner with someone that understands and can solve its problems, and these differences should not be used to “delay” the shift to multi-AP environments. Multi-AP is essentially the ONLY way that 1 Gbps broadband can be spread around a home: anyone launching 1 Gbps broadband and NOT adopting multi-AP will struggle to make a success of it and should not even consider it.

But on closer inspection there are more similarities between Plume and AirTies than differences. Both are hybrids – with some intelligence in the home, and some in the cloud. Our opinion is that AirTies is better in the home, and Plume is better in the cloud, or at least that used to be the case. Both have upgrades.

They also differ slightly in policy: in how particular devices are managed, and in the extent to which a local agent runs the show in a single home.

But we have also talked to operators who have bought Google WiFi, because it gives them a chance to regain control of the home gateway from a dozen different retail offerings. It has absolutely no cloud management features built in, but those operators were happy to build them themselves using Google APIs.

One of Plume’s strategic moves was to place its local communications to the cloud into open source in a move called OpenSync, and this has been adopted by Comcast, Liberty Global, Bell Canada and Samsung. This has the potential to become a 50 million footprint, but although this tiny piece of software may have already been downloaded millions of times, open source dominance in 50 million homes is not the same as someone paying to install your system in 15 million. But it is still a ringing endorsement and a likely indicator of success.

Other players still, such as ASSIA and SoftAtHome, have slightly different trajectories again for WiFi. ASSIA wants to control all of your existing WiFi, and not stick to multi-AP systems. This is important in that 802.11n devices had both bands operational, and simply being able to switch from one band to another, or trigger a DFS (Dynamic Frequency Selection) and jump to a less congested channel, is still a powerful tool in home networks even if they only have one WiFi Access Point. SoftAtHome is on the same learning curve, working with Western European operators to suit its solution to their highly specific problems.

Let’s take a simple problem which WiFi people call the “collision” domain. What this means is that if data has to go through more than one AP, the first AP receives the content and then has to forward it on to the next AP or the end device over the same channel, and this is said to “halve” the speed of the network.

There are a number of ways of addressing this – only having one hop from router to extender minimizes this; but also having a 3-way WiFi chip, which uses one 5 GHz channel for backhaul, and another for talking to devices, has been Qualcomm’s chip approach and AP maker Eero’s preferred architecture. Another idea has been to send to a second AP over 5 GHz but talk to the end device using 2.4 GHz, which means both can talk at the same time and there is no delay. AirTies uses 5 GHz for “backhaul” but has a fallback to Powerline or MoCA or Ethernet if the backhaul gets clogged up. So its devices can listen and talk at the same time.
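The “halving” claim, and why a separate backhaul channel avoids it, can be checked with a little arithmetic. This is a simplified model under stated assumptions: airtime is the only constraint, hops on the same channel take strict turns, and protocol overhead is ignored. The function names are invented for the illustration.

```python
def shared_channel_throughput(link_rates_mbps):
    """End-to-end rate when every hop takes turns on the same channel.

    Each bit spends airtime on every hop, so the per-bit times add up.
    """
    return 1.0 / sum(1.0 / rate for rate in link_rates_mbps)

def separate_channel_throughput(link_rates_mbps):
    """End-to-end rate when each hop has its own channel.

    Hops can transmit at the same time, so the slowest link sets the pace.
    """
    return min(link_rates_mbps)

# Router -> extender -> device, both links at 600 Mbps.
hops = [600.0, 600.0]
print(round(shared_channel_throughput(hops)))    # 300: the "halving"
print(round(separate_channel_throughput(hops)))  # 600: backhaul on its own channel
```

Under the same assumptions a second same-channel hop would cut the rate to a third, which is one reason to keep the number of hops from router to extender to one.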

All of these have drawbacks – if there is no mesh, and it is all wireless, there is only one route to each device and if it gets blocked with a bad apple or sticky client, it falls apart. Using two 5 GHz channels makes it more likely a neighbor will interfere with your WiFi and it is expensive, and 2.4 GHz is cluttered with non-WiFi devices. Powerline is generally not considered fast enough to backhaul 1 Gbps although Wave 2 devices come close. Where you have MoCA on coax, the TV is usually served with that, and so 1 Gbps WiFi is less important. Having all this go back to the cloud and give a “considered” decision for every instance is not workable either, because it is too slow, so having a hybrid where a “live” policy works in the home, backed up by further data held in the cloud, is where all this is headed.
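The hybrid split can be sketched as a local agent that answers immediately from whatever policy it currently holds, while the cloud refreshes that policy in the background. The class `LocalAgent`, the policy key `min_rssi_dbm` and the threshold values are hypothetical, chosen only to show the shape of the idea.

```python
class LocalAgent:
    """In-home agent: decides on the spot, using thresholds the cloud can update."""

    def __init__(self):
        # Built-in default, used until the cloud pushes something better.
        self.policy = {"min_rssi_dbm": -70.0}

    def apply_cloud_policy(self, update):
        # Called whenever a new policy arrives; never blocks a live decision.
        self.policy.update(update)

    def should_steer(self, client_rssi_dbm):
        # The "live" decision: no round trip to the cloud is needed.
        return client_rssi_dbm < self.policy["min_rssi_dbm"]

agent = LocalAgent()
print(agent.should_steer(-75.0))  # True under the default threshold

# Later the cloud, having looked at many homes, relaxes the threshold.
agent.apply_cloud_policy({"min_rssi_dbm": -80.0})
print(agent.should_steer(-75.0))  # False under the updated policy
```

If the cloud link goes down, the agent simply keeps using the last policy it received.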

In the end all of this will be dynamic and each decision will be taken in the cloud, based on “best practice” and AI looking at the best outcomes for every combination. But right now each of the more progressive players is sticking to its particular religion, and for the most part this is due to the requirements of its particular operator customers.

Standards body the WiFi Alliance has, with EasyMesh, begun the standards-making process, but all of this will not be standardized for at least another decade, if that, and it will be different for different connecting devices, and different by time of day, day of week and week of year, based on what everyone else (the neighbors) does habitually at those times.

You might maintain multiple connections between APs, including Ethernet, PLC or MoCA as well as Triband WiFi. And this can be a mesh or not a mesh.

If a process that is playing out on a phone is happy with 30 Mbps, your system should know that and not try to optimize if it is already doing the job. Other clients may need 600 Mbps or 1 Gbps streams in future.
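In code, “not optimizing what already works” is a one-line check of measured throughput against what the client actually needs. The function name and the 20% headroom factor are assumptions made for the sketch.

```python
def needs_optimizing(required_mbps, measured_mbps, headroom=1.2):
    """True only if the client is getting less than its need plus some headroom."""
    return measured_mbps < required_mbps * headroom

# A phone that is happy with 30 Mbps and is getting 80: leave it alone.
print(needs_optimizing(required_mbps=30.0, measured_mbps=80.0))    # False

# A client that needs 600 Mbps but is only getting 450: worth acting on.
print(needs_optimizing(required_mbps=600.0, measured_mbps=450.0))  # True
```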

And these solutions will evolve as hardware evolves. The routing function may move, mesh may be better understood, routers may become virtual, and therefore cheaper. There may be a different answer if you already have the content in the home, like a Blu-ray player, rather than streaming it over the internet.

Things like beamforming change the formulas all the time. An AP with beamforming or Multi-User MIMO (the two are closely related), after it has connected to a device, will null radio emissions in any other direction when it is talking to one or more designated clients. If these clients move further away or meet interference, the beam makes it feel like the device is closer to the AP, and it sustains a “strong” signal. If that device began connecting all over again, it might be that for the health of the overall airtime, or simply due to proximity and signal strength, it would choose to connect to a different AP. But for the time that beamforming is operating, the device cannot know this. The device has insufficient information to make that decision, and so has any AP that is not connected to all the other APs and sharing information.

Beamforming works better the more antennas a device has, so this is a moving target. This is made even more complex if a Smart TV, which has 4 antennas, is connecting using a 4 x 4 beamforming signal – running at the upper reaches of the MCS (Modulation and Coding Scheme) – where each PHY transmits at a “best” speed and then gracefully degrades to a slower modulation scheme. An iPhone right next to an AP which is struggling to deliver a UHD video stream to a big screen would not want to source another video stream from this AP, but the iPhone asking for it knows nothing of the problems of the AP. So the system may decide to recalculate this every ten seconds, or every minute or once a day – and only one of these frequencies is optimal for a particular use case.

Cloud versus local is a complex problem with a hybrid being the best solution. There are many parameters to be looked at, ranging from integration with network operations centers to response times for user experience. Each piece of functionality needs to be looked at individually and an appropriate decision needs to be made as to where best to run it, in the cloud or locally on the APs.

From a user experience perspective, we all know that immediate response is the user expectation. The consumer will not be happy with a service that goes from good performance to bad when he walks to another room or changes what he is viewing from an OTT channel to a live channel, and several minutes later it turns into good performance again.

The cloud is quite fast these days with minimal delay, so theoretically a steering decision could be made every second in the cloud. But if you tried this 24 hours a day, compute costs would skyrocket and you would hit limits of scale. Which is why one religion is “steering should not happen very often,” while another is “respond locally in real time”.
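A back-of-envelope count shows why per-second cloud steering runs into scale limits. The deployment size and devices-per-home figures below are illustrative assumptions, and the sketch counts decisions only; it says nothing about the cost of each one.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400

def decisions_per_day(homes, clients_per_home, interval_s):
    """Number of steering decisions per day if every client is re-evaluated
    once every interval_s seconds, around the clock."""
    return homes * clients_per_home * SECONDS_PER_DAY // interval_s

# Assumed deployment: 15 million homes with 10 WiFi devices each.
every_second = decisions_per_day(15_000_000, 10, interval_s=1)
every_ten_seconds = decisions_per_day(15_000_000, 10, interval_s=10)

print(f"{every_second:,}")       # 12,960,000,000,000
print(f"{every_ten_seconds:,}")  # 1,296,000,000,000
```

Stretching the interval from one second to ten removes 90% of the work, and pushing the real-time part down to the home removes it from the cloud altogether, which is the trade-off behind the two “religions.”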

And you have to remember that dumb things happen in the real world. If your help desk is genuinely not in the loop, and cannot “see” into the performance of each home, then when a customer has turned off an AP to plug the hairdryer into that socket, that customer ends up with an engineering visit. With analytics in the cloud and delivered in some summary format to the help desk, they can fix such problems over the phone.

Integration with corporate systems, call centers, CRMs, billing and marketing is far easier in the cloud via API-level integration than by adding a feature to the firmware and making it available via TR-069. New feature development, deployment and testing in the field are also hugely faster in the cloud.

All of which informs the research we did earlier in the year in our sister service Rethink TV, which predicted that there will be 332 million Smart WiFi systems (Multi-AP) by 2023 globally. It’s a start.