What Network Managers Should Know About AI and Machine Learning

It's 2024. We obviously had to do an AI episode of the pod.

And for that, we welcome our guest Michael Wynston, Director of Network & Security Architecture at Fiserv.

Michael is the first esteemed member of TeleGeography Explains the Internet's four-timers club. Indeed, as I'm sure you've guessed, he's back on the show for the fourth time. And this time around he's here to help us better understand how AI is developing as a network management tool.

You can preview our chat below or scroll to the bottom to listen to the whole conversation.

Greg Bryan: Today we're talking about something that's been on everybody's mind. Nerds like us have been probably thinking about AI for a very long time, but it's hit the zeitgeist in the past couple of years.

Maybe a critical mass of folks are starting to see: what can this do for me? And we won't get into whether large language models are truly AI or not; I'll leave that for some other nerdy conversations. But what I wanted to focus on with you—because you have been thinking about and even starting to implement some of this—is the real implications of AI/ML for managing networks, right?

So, I should say this, Fiserv is probably a perfect example of another buzzword that is out there a lot nowadays, like FinTech, right?

Michael Wynston: Yep.

Greg: So Michael, I brought you on to explain to us how we can actually expect to see AI play out in terms of network management.

But I thought before we get there, let's start with—I think as you've alluded to before—there's already a history of AI and automation in network management.

So let's start with the roots of that and where you see that kind of nascent growth coming from.

Michael: So one of the things is—actually a project I worked on going back 25 plus years—was when I was working as a network architect at Merrill Lynch, a company that's no longer around. Well, actually, it's still around, but now part of Bank of America.

Anyway, we were looking to implement a platform called Smarts. I'm not sure how many people out in the audience remember this going back that far. It was actually the first time I was exposed to it, and I was exposed to it again when I was at a large pharmaceutical company.

Smarts was a platform that was designed to correlate application to infrastructure so that you could understand the impact on your applications when you had infrastructure failures or outages.

And the way that this would always work is you would build an application and infrastructure map. Back then, we were using SNMP to go and pull information from the network devices. And then we were using SNMP and other technologies.

And the problem was, back then, for application platforms, most of those systems were proprietary to pull, again, information about that particular device.

And then Smarts would try to map together the applications that it saw running on the host. And then from there, the application and infrastructure folks would work together to build models based on how an application behaved. Because although we could find that there was maybe a web server running on port 80 on this host, and that that host was connected to this switch, it didn't have the intelligence to then know, well, it has to go through this firewall, or there's this load balancer in front of it. Or if I lose this piece of the application, here's the standby piece.
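As a rough illustration of the kind of application and infrastructure map described here, the sketch below models dependencies as a small graph and walks it to find what is affected when one device fails. This is not how Smarts worked internally; the node names, the DEPENDS_ON structure, and the impacted_by function are all hypothetical and only stand in for the hand-built models Michael describes.

```python
from collections import defaultdict, deque

# Hypothetical, hand-maintained dependency map: each key depends on the
# items in its list. Every name here is made up for illustration.
DEPENDS_ON = {
    "web-app": ["web-server-1"],
    "web-server-1": ["load-balancer-1", "switch-a"],
    "load-balancer-1": ["firewall-1"],
    "firewall-1": ["switch-a"],
    "switch-a": [],
}


def impacted_by(failed, depends_on):
    """Return every node that directly or indirectly depends on `failed`."""
    # Invert the map so we can walk upward from a failed device
    # toward the applications that sit on top of it.
    dependents = defaultdict(set)
    for node, deps in depends_on.items():
        for dep in deps:
            dependents[dep].add(node)

    seen = set()
    queue = deque([failed])
    while queue:
        current = queue.popleft()
        for parent in dependents[current]:
            if parent not in seen:  # guards against loops in the map
                seen.add(parent)
                queue.append(parent)
    return seen


if __name__ == "__main__":
    print(sorted(impacted_by("firewall-1", DEPENDS_ON)))
```

The answer is only as good as the map: if a firewall or load balancer is missing from DEPENDS_ON, the walk cannot see it, which is the gap that had to be filled in by hand.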

So because we didn't have that kind of technology around to dynamically build those relationship maps, all of that had to be done manually.

And what would happen was, you'd bring in a whole bunch of contractors to do that, to build it all manually. And it would work for a week, maybe. And the reason it only worked for a week is, as I mentioned earlier, infrastructure is organic. Infrastructure is constantly changing. Every time you plug in a new endpoint, every time you add a new router, you add a new switch, you add a new VPC, you add a new VNet. See, I'm adding cloud terms in there as well because that counts too.

Every time you do something like that, your infrastructure changes.

Greg: Yes, indeed.

Michael: And because of this wonderful thing we use called dynamic routing, there is very much the butterfly effect, where you add a VNet somewhere in Azure, and something over in a data center in Asia Pacific falls over, or the host suddenly can't get to where it could get to before.

And those kinds of relationships are very, very complicated, especially in large enterprise environments.
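One simplified way to picture that ripple effect is a shortest-path calculation that changes its answer when a new link appears. The toy model below is an assumption-heavy sketch, not a description of any real routing protocol or of Fiserv's network: the node names, link costs, and the shortest_path helper are hypothetical, and real dynamic routing involves policies, areas, and timers that are ignored here.

```python
import heapq


def shortest_path(links, src, dst):
    """Dijkstra over an undirected graph given as {(a, b): cost}."""
    graph = {}
    for (a, b), cost in links.items():
        graph.setdefault(a, []).append((b, cost))
        graph.setdefault(b, []).append((a, cost))

    heap = [(0, src, [src])]
    visited = set()
    while heap:
        cost, node, path = heapq.heappop(heap)
        if node == dst:
            return cost, path
        if node in visited:
            continue
        visited.add(node)
        for nxt, link_cost in graph.get(node, []):
            if nxt not in visited:
                heapq.heappush(heap, (cost + link_cost, nxt, path + [nxt]))
    return None  # no route between src and dst


# Hypothetical topology before the change.
links = {
    ("host", "dc-apac"): 10,
    ("dc-apac", "core"): 10,
    ("core", "service"): 10,
}
before = shortest_path(links, "host", "service")

# Someone adds a VNet with cheaper links; nothing near the host was touched.
links[("dc-apac", "azure-vnet")] = 5
links[("azure-vnet", "service")] = 5
after = shortest_path(links, "host", "service")

if __name__ == "__main__":
    print("before:", before)
    print("after: ", after)
```

In this toy case the host's traffic quietly moves onto the new VNet path even though the change was made somewhere else, which is the sort of side effect a static, hand-built map would not capture.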

Now, there have been more current tools like BigPanda and Moogsoft that have also tried to take this correlation on. But again, a lot of that correlation, a lot of those business rules, take a lot of work to maintain and have to be done by humans. And the challenge is then prioritizing that work for that human.

Greg: Right.

Michael: Sometimes it falls to the bottom. Sometimes it's at the top. Usually it's only at the top when you realize you haven't been taking care of it and something fell over and nobody knew or something happened and nobody understands why the impact was the way it was.

So that's kind of the history of where we are hopeful that AI—or artificial intelligence—and machine learning can help us in an operational way. And that's what we're looking at right now.

Greg: Yeah, that makes a lot of sense. Maybe it's a clunky metaphor—but with other AI, it's developed with us.

So the one that I like to think of is driver assistance. There's types one through four in terms of automated driving. I've not yet had the chance to get into like a Waymo or something, where it's like fully automated. But I have a newer car where it steers a little bit for me and I have adaptive cruise control. You're kind of talking about that.

Listen to the full episode below.

From This Episode:

Greg Bryan

Greg is Senior Manager, Enterprise Research at TeleGeography. He's spent the last decade and a half at TeleGeography developing many of our pricing products and reports about enterprise networks. He is a frequent speaker at conferences about corporate wide area networks and enterprise telecom services. He also hosts our podcast, TeleGeography Explains the Internet.
