The Continuing Evolution of Forward Networks – Networking Field Day 34

Check out those backdrops!

Digital Twin

I was very fortunate to be a delegate at Networking Field Day 13, all the way back in 2016. It was a milestone event for Forward, marking their official move out of “stealth” as a Silicon Valley startup. Their initial presentation was impressive, and the Forward Networks platform offered something I had not seen before: an accurate digital copy of your network that you could query to understand paths and flows, and use to test proposed changes to prevent misconfiguration.

Fast forward to Networking Field Day 34, and Forward Networks are presenting again, this time to highlight the maturity of their product, and how they are integrating AI and LLMs to enable natural language queries and enhanced visibility into your network digital twin.

At its core, the Forward Networks platform remains a full digital twin of your environment, allowing you to search, query, verify, and predict how traffic is behaving, and will behave, in your environment. They are vendor-agnostic, meaning you can easily have a mix of Cisco, Juniper, HPE-Aruba, Arista, etc. and still leverage the power of the platform. A simple local agent crawls your network with SSH credentials (or via API for devices that don’t support SSH) and builds a snapshot of your network, which you can then import into the tool and begin working with.
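Forward’s collector is of course far more sophisticated than this, but as a rough sketch of the idea (the device list, hostnames, and credentials below are hypothetical placeholders), pulling state from a mixed fleet over SSH with a library like Netmiko looks something like this:

```python
# Rough sketch of SSH-based state collection, not Forward's actual collector.
# Device details and credentials here are hypothetical placeholders.
from netmiko import ConnectHandler

DEVICES = [
    {"device_type": "cisco_ios", "host": "core-sw1.example.net"},
    {"device_type": "arista_eos", "host": "spine1.example.net"},
]

def collect_snapshot(devices, username, password):
    """Gather config and routing state from each device into a dict."""
    snapshot = {}
    for dev in devices:
        conn = ConnectHandler(**dev, username=username, password=password)
        snapshot[dev["host"]] = {
            "config": conn.send_command("show running-config"),
            "routes": conn.send_command("show ip route"),
        }
        conn.disconnect()
    return snapshot

if __name__ == "__main__":
    snap = collect_snapshot(DEVICES, "readonly", "s3cret")
    print({host: len(data["config"]) for host, data in snap.items()})
```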

Having evolved quite a lot since 2016, Forward Networks now includes integrations with the three major cloud providers, and security tie-ins to platforms like Rapid7 and Tenable to identify CVEs that may impact your network devices. Now they have taken the next step, integrating generative AI through AI Assist as part of the Network Query Engine (NQE).

AI Assist uses an LLM to generate queries against the network model. These queries can be saved to your own repository for later use, or you can use a number of pre-packaged queries out of the box. The reverse is also true: Summary Assist can analyze a query and provide a plain-language summary of what it is doing.
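Forward hasn’t published how AI Assist is wired up internally, but the general “natural language in, query out” pattern is easy to picture. A minimal sketch, assuming the OpenAI Python SDK and a placeholder model name and prompt (none of this is Forward’s actual implementation or query language):

```python
# Toy "natural language in, query out" sketch using an LLM API.
# This is NOT how Forward implements AI Assist; the model name, prompt, and
# target query language are placeholders for illustration only.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

SYSTEM_PROMPT = (
    "You translate plain-English questions about a network into queries "
    "for a network query language. Return only the query text."
)

def nl_to_query(question: str) -> str:
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # hypothetical model choice
        messages=[
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": question},
        ],
    )
    return response.choices[0].message.content

print(nl_to_query("Which devices still allow telnet on any interface?"))
```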

Proving the Negatives

If you’ve been in networking for any length of time, you know the feeling of having to “defend” the network, because it’s the first thing that gets blamed when something isn’t working. We’re constantly having to prove a negative, which is sometimes hard to do. It can involve a lot of jumping around your network in the CLI, pinging and checking routes, doing packet captures, and so on, and there’s no easy way to translate these methods into a simple-to-understand view of your network and where the traffic is or is not going.

The Forward Networks platform provides a simple, easy to understand analysis and view of traffic flow across your network in a 100% mathematically accurate carbon copy. Queries can be copied and shared, so now you can send a link to your Dev team and show them that, despite their initial assessment with no troubleshooting or factual information, it is *not* the network.

Continuing Forward

The team at Forward Networks continue to evolve and strengthen their platform, and the integration of an LLM through AI Assist and the Network Query Engine is a perfect fit. In an era where everyone is trying to shoehorn AI into their product, whether or not it makes sense to, this is an excellent example of what is still a very immature technology being put to good use.

If you want to learn more, and check out a customer testimonial around automation and cost savings from one of Forward Networks’ biggest customers, you can watch the recordings from the presentations here.

She’s a Keeper! Keeper Security presents at Security Field Day 7.

I never thought I’d leave you, 1Password…

Password Managers aren’t a new technology. Arguably the Excel 97 spreadsheet in your corporate file share labelled DefinitelyNotPasswords.xls was an early form of Password Manager (but only if it was password-protected). The architecture behind them hasn’t really evolved much over the years; they’ve just moved away from being flat files of all your most coveted secrets, protected by yet another code word/phrase, into nicely presented applications, or cloud solutions that simply scramble your bucket of passwords with some fancy encryption algorithm.

There are different storage models, each with their own pros and cons. Storing the passwords locally means they aren’t sitting in the cloud on someone else’s infrastructure, but that also means if your device is compromised, those passwords could be compromised as well. Store the passwords in the cloud, and well, they’re in the cloud, on hardware you don’t own or manage, and what happens when you lose communication to that infrastructure? You might not be able to update your vault. What happens in case of a breach? Has all of your data been exposed? There are certainly examples of this happening, as well as vulnerabilities in local versions of various password managers.

Ultimately there is no silver bullet, no perfect solution that is going to be completely impervious to a code vulnerability, a hack, or poor infrastructure security. Still, having a properly organized password manager, regardless of which model you choose, is better than not having one at all.

This week at Security Field Day 7, I was introduced to Keeper. I’d heard of them before but hadn’t really dug into their product very much as I had chosen and settled into a password manager quite a few years ago, and truthfully had no complaints, and no reason to look elsewhere. I’ve been a 1Password user for several years, and even brought 1Password to my employer where we’ve adopted the enterprise version.

What stood out the most for me in the presentation from Keeper was that it seemed to be the first password manager purpose-built from the ground up to be an enterprise-grade security tool. Many of today’s popular subscription-based password managers, 1Password included, evolved from a free product aimed at consumers. That doesn’t mean they’re not secure, but features were developed with the consumer in mind first, not the enterprise. Some of the more enterprise-y features they have now may seem to have been tacked on as an afterthought, or simply to check off a box that might get the product adopted into the enterprise.

Zero-Knowledge and Zero-Trust

Keeper has taken the security of customer data very seriously, as they should, but their discussion around encryption and the methods used to protect and store password and secret data was next-level. Keeper has absolutely no knowledge of a user’s master password or stored passwords/secrets, as the keys to encrypt and decrypt this data are only stored on the user’s device. The data in your vault isn’t protected with a single key pair either: every single record in the vault is encrypted with its own keys. Those keys are then wrapped in another key if contained in a shared folder.
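To make the key-wrapping idea concrete, here is a minimal sketch of the general pattern (a unique key per record, wrapped by a folder key) using the Python cryptography library. This illustrates the concept only; it is not Keeper’s code or key hierarchy:

```python
# Minimal sketch of per-record keys wrapped by a folder key.
# This illustrates the general pattern only; it is not Keeper's implementation.
import os
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

def encrypt_record(plaintext: bytes, folder_key: bytes) -> dict:
    record_key = AESGCM.generate_key(bit_length=256)   # unique key per record
    nonce = os.urandom(12)
    ciphertext = AESGCM(record_key).encrypt(nonce, plaintext, None)

    # Wrap (encrypt) the record key with the folder key so the record can be shared.
    wrap_nonce = os.urandom(12)
    wrapped_key = AESGCM(folder_key).encrypt(wrap_nonce, record_key, None)
    return {"nonce": nonce, "ciphertext": ciphertext,
            "wrap_nonce": wrap_nonce, "wrapped_key": wrapped_key}

def decrypt_record(record: dict, folder_key: bytes) -> bytes:
    record_key = AESGCM(folder_key).decrypt(record["wrap_nonce"],
                                            record["wrapped_key"], None)
    return AESGCM(record_key).decrypt(record["nonce"],
                                      record["ciphertext"], None)

folder_key = AESGCM.generate_key(bit_length=256)
rec = encrypt_record(b"hunter2", folder_key)
assert decrypt_record(rec, folder_key) == b"hunter2"
```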

Their credentials are solid, and they are the only FIPS 140-2 validated password manager that I am aware of.

If you’re really into the nerdy side of encryption, check out all the details here.

“Keeper is the most secure, certified, tested and audited password security platform in the world. We are the only SOC2 and ISO27001 certified password management solution in the industry and Privacy Shield Compliant with the U.S. Department of Commerce’s EU-U.S. Privacy Shield program, meeting the European Commission’s Directive on Data Protection.”

Authentication and 2FA

The list of features they support is exhaustive, from SSO support via SAML 2.0 with any identity provider you can think of, to biometric support that includes Windows Hello, Touch ID, Face ID, and Android biometrics. All of these options feature their Zero-Knowledge model that completely protects your information in flight during the authentication process.

Two-Factor Authentication enforcement is available, along with Role-Based Access Control. Keeper supports all popular 2FA methods with your authenticator of choice, including Google Authenticator, Microsoft Authenticator, Duo, RSA, or FIDO2 keys like Yubikey. They even have their own integration with wearable technology like Apple Watch and Android Wear devices through KeeperDNA.

Even better, you can just use Keeper for all your 2FA codes and stop having to use 3-4 different apps for your TOTP/OTP supported logins.
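This works because TOTP is an open standard (RFC 6238), so any compliant client produces the same codes from the same seed. A quick illustration with the pyotp library (the seed shown is a made-up example):

```python
# TOTP (RFC 6238) is an open standard, which is why Keeper, Google
# Authenticator, Duo, etc. all generate the same codes from the same seed.
import pyotp

# The base32 seed below is a made-up example, not a real secret.
totp = pyotp.TOTP("JBSWY3DPEHPK3PXP")

print(totp.now())               # current 6-digit code
print(totp.verify(totp.now()))  # True: the code is valid for this time window
```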

Unfortunately, they also support SMS for 2FA, which I’d personally like to see more products completely remove as an option. This is likely my only complaint about Keeper but I understand it’s an option some people insist on using.

BreachWatch

With credential stuffing and spraying attacks on the rise, it is vital to be able to check your passwords against known breached credentials. Keeper offers a feature called BreachWatch that checks vault information against breach data found on the Dark Web, and alerts you to change a password if there is a match.
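Keeper hasn’t published exactly how BreachWatch performs its matching, but the general technique is well understood. For example, Have I Been Pwned’s range API lets you check a password without ever sending it, using k-anonymity on the first five characters of its SHA-1 hash; a small sketch:

```python
# General idea of breached-password checking (not BreachWatch's internals):
# a k-anonymity lookup against the Have I Been Pwned range API.
import hashlib
import requests

def times_breached(password: str) -> int:
    sha1 = hashlib.sha1(password.encode()).hexdigest().upper()
    prefix, suffix = sha1[:5], sha1[5:]
    # Only the 5-character hash prefix ever leaves your machine.
    resp = requests.get(f"https://api.pwnedpasswords.com/range/{prefix}", timeout=10)
    resp.raise_for_status()
    for line in resp.text.splitlines():
        candidate, count = line.split(":")
        if candidate == suffix:
            return int(count)
    return 0

print(times_breached("Password123"))  # spoiler: it's a big number
```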

Hard to Switch, but…

The list of features Keeper offers goes on. If you can think of it, they likely already have it, and if not, they’re working on it.

If you take your credential security seriously, you’re using a password manager, and 2FA wherever you can. Once you’ve gotten yourself into a particular product like this, it can be a daunting task to switch to a new one. I myself have over 1200 items in my 1Password vaults, and thus far I’ve had no compelling reason to think about migrating my secrets to another platform. Until now.

Keeper’s presentation at Security Field Day 7 truly has me considering signing up for a trial at the very least.

Check out their presentation over at Tech Field Day.

Rapid Incident Response with PathSolutions Security Operations Manager – Security Field Day 3

As I delve further and further into “all things security” along my career path, it has become clear to me that one of the key skills a good Security Professional must have is the ability to filter out noise, and focus on identifying the critical pieces of information that require action, while safely being able to ignore the rest. Modern security tools, whether they are Firewalls, IDS/IPS, Proxies, Web Application Firewalls, Content Filters, etc. all collect, report, and alert on a lot of information. It can be overwhelming, and this is especially true for smaller, flatter IT teams that perhaps don’t have a dedicated Security Operations Center (SOC), or even an actual Security Team. Quite often, the “Security Team” is one person, and that person may also fill the role of Network Administrator, Server Administrator, or any number of other roles that some larger IT teams might have distributed across several individuals.

In these situations, having a tool or process that can consolidate and help with filtering and focusing the important data is key to avoiding information paralysis – the idea of having too much information to really be able to act on any of it in a meaningful way. This is SIEM – Security Information and Event Management. Now, I’ve found SIEM can be used interchangeably as a noun when referring to a specific tool that performs this function, or as a verb when describing the act of processing the data from multiple sources into actionable information. In either case, the end result is what matters most – the ability to gather data from multiple sources and render it down to something useful and actionable.

This week at Security Field Day 3, I was fortunate to participate in a fantastic conversation with PathSolutions CTO Tim Titus, as he presented TotalView Security Operations Manager and its capabilities as a SecOps tool that can greatly improve awareness and response time to security events within your network.

60 Second Decisions

Investigating alerts can be tedious and can take up a lot of time, only to find out in many cases that the alert was benign and doesn’t require intervention. TotalView Security Operations Manager is a security orchestration, automation, and response (SOAR) product designed to optimize event response, reduce time wasted on false positives, and provide a faster path to quarantine and remediation.

Immediately upon an indication of suspicious activity, the Security Operations Manager dashboard provides almost instant details for the potentially compromised asset: the switch and port it is connected to, what it is (operating system, manufacturer), who is logged into it, what security groups/access they have, what Indicators of Compromise (IoC) are detected, what destination(s) this asset is talking to on or outside the network, and whether any of those locations could be a known malicious or C&C (Command and Control) destination. With this information presented, the option to quickly quarantine the asset is offered, and it is as simple as shutting down the switch port with the click of a button. All of this information is sourced natively, with no installed agents and no need for SPAN ports or network taps. It is all done through NetFlow, SNMP, and WMI (Windows Management Instrumentation).
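Security Operations Manager hides all of this behind a button, but to make the quarantine step concrete: administratively shutting the suspect port is ultimately a couple of config lines on the switch. A rough sketch with Netmiko (not PathSolutions’ code; host, credentials, and interface names are placeholders):

```python
# Rough sketch of the "quarantine" action: administratively shut the suspect
# port. Not PathSolutions' code; host, credentials, and interface are placeholders.
from netmiko import ConnectHandler

def quarantine_port(switch_ip, interface, username, password, enable_secret):
    conn = ConnectHandler(device_type="cisco_ios", host=switch_ip,
                          username=username, password=password,
                          secret=enable_secret)
    conn.enable()
    output = conn.send_config_set([f"interface {interface}", "shutdown"])
    conn.save_config()
    conn.disconnect()
    return output

print(quarantine_port("10.0.10.2", "GigabitEthernet1/0/14",
                      "netops", "s3cret", "s3cret"))
```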

In roughly 60 seconds, enough information is presented to enable you to make a swift, informed decision on what action to take, saving countless minutes or hours of correlating information from disparate tools or infrastructure in order to determine if there is in fact a problem. Should this end-user workstation suddenly start talking to a known bad IP in North Korea? Probably not! Shut it down.


TotalView Security Operations Manager doesn’t stop there, and Tim walked us through an in-depth demo of their solution.

Device Vulnerability Reporting

It would be almost too easy to insert a Buzz Lightyear meme captioned with “Vulnerabilities. Vulnerabilities everywhere…” because it’s true. Just a few days ago (as of this writing), Microsoft’s Patch Tuesday saw the release of 111 fixes for various vulnerabilities, the third largest in Microsoft’s history. Keeping up with patches and software updates for any size of network can be a daunting task, and more often than not, there is simply not enough time to patch absolutely everything. We must pick and choose what gets patched by evaluating risk and triaging updates based on highest risk or exposure.


TotalView Security Operations Manager is able to provide constant monitoring of all of your network assets for operating system or device vulnerabilities by referencing the NIST Vulnerability Database (NVD) every 24 hours, identifying those with a known vulnerability, and allowing you to dig deeper into the CVE to assist with risk assessment.
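The NVD data is public, so you can get a feel for what this kind of check involves. A hedged sketch against the NVD REST API (the CPE name below is just an example; this is not PathSolutions’ implementation):

```python
# Sketch of checking a platform against the NIST NVD; not PathSolutions' code.
# The CPE name below is an example; a real inventory maps device OS versions
# to CPE identifiers first. Note the public API is rate-limited without a key.
import requests

NVD_URL = "https://services.nvd.nist.gov/rest/json/cves/2.0"

def cves_for(cpe_name):
    resp = requests.get(NVD_URL, params={"cpeName": cpe_name}, timeout=30)
    resp.raise_for_status()
    for vuln in resp.json().get("vulnerabilities", []):
        cve = vuln["cve"]
        yield cve["id"], cve["descriptions"][0]["value"][:80]

for cve_id, summary in cves_for("cpe:2.3:o:cisco:ios:12.4:*:*:*:*:*:*:*"):
    print(cve_id, summary)
```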

Communications Monitoring and Geographic Risk Profiling

Do you know which of your devices are talking to each other? Do you know where in the world your devices are sending data? These are both questions that can sometimes be difficult to answer without some baseline understanding of all of the traffic across your network. With Communications Policy Monitoring and Alerting, Security Operations Manager is able to trigger an alert when a device starts communicating with another device that it shouldn’t be talking to, based on policies you define.
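Conceptually, a communications policy check boils down to comparing observed flows against an allow list. A toy sketch (the policy and flows are made-up examples, not how PathSolutions stores them):

```python
# Toy version of a communications policy check: flag flows between hosts that
# have no business talking to each other. Policy and flows are made-up examples.
ALLOWED = {
    ("workstation", "app-server"),
    ("app-server", "db-server"),
}

observed_flows = [
    ("workstation", "app-server"),
    ("workstation", "db-server"),   # violates the defined policy
]

for src, dst in observed_flows:
    if (src, dst) not in ALLOWED:
        print(f"ALERT: {src} -> {dst} is outside the communications policy")
```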

Geographic Risk Profiling looks at where your devices are communicating globally, presented in an easy-to-understand map view, quickly showing if and when you may have an asset sending data somewhere it shouldn’t. The Chord View within the dashboard breaks out the number of flows by country, which presents a nice quick visual, giving you an idea of the percentage of your data flowing to appropriate vs. questionable destinations.
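The rollup behind a view like that is simple to picture: tag each flow with a country and count. A small sketch (the flow records and their GeoIP tags are stand-ins for real data sources):

```python
# Sketch of the "flows by country" rollup behind a chord-style view.
# The flow records and their GeoIP tags are stand-ins for real data sources.
from collections import Counter

flow_countries = ["US", "US", "CA", "US", "KP", "DE", "US"]  # one entry per flow

counts = Counter(flow_countries)
total = sum(counts.values())
for country, n in counts.most_common():
    print(f"{country}: {n} flows ({100 * n / total:.0f}%)")
```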


New Device Discovery and Interrogation

Not everyone has a full Network Access Control (NAC) system in place. Let’s be honest, they’re not simple to set up, and can often be responsible for locking out legitimate devices from the network at inconvenient times. Without NAC, network operators are often blind to new devices being connected. With Security Operations Manager, in the event that new devices are connected, they are discovered and interrogated to find out what they are and what they are communicating with. This gives tremendous flexibility to monitor random items being connected, and makes it simple to decide how they should be treated.


Rapid Deployment

Touting a 30-minute deployment with only a single 80MB Windows VM required, this seems too good to be true, right? Maybe. There are some dependencies here that, if not already in place, will require some groundwork to get all of the right information flowing to the tool. As Tim mentions, there are no requirements for agents to be installed, or taps; all of the data is sourced natively via SNMP, NetFlow, and WMI. This means all you need to provide the Security Operations Manager VM is SNMP access to all of your routers, switches, firewalls, gateways, etc., as well as access to the NetFlow data, and WMI credentials for your Windows environment. Setting all of that up, if it’s not already in place, will take some planning and time. It’s especially important to ensure that SNMP is set up correctly, and securely. The ability of Security Operations Manager to gather 100% of the data from your network relies on you having correctly configured and prepared 100% of your devices for these protocols.
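A quick pre-flight check of SNMP reachability across your device list can save some surprises before you point any collector at the network. A minimal sketch with pysnmp, assuming SNMPv2c and placeholder addresses (use SNMPv3 where you can):

```python
# Pre-flight check: can we actually read SNMP from each device? Addresses and
# the community string are placeholders; prefer SNMPv3 where you can.
from pysnmp.hlapi import (getCmd, SnmpEngine, CommunityData, UdpTransportTarget,
                          ContextData, ObjectType, ObjectIdentity)

def snmp_sysname(host, community="public"):
    error_indication, error_status, _, var_binds = next(getCmd(
        SnmpEngine(),
        CommunityData(community, mpModel=1),                  # SNMPv2c
        UdpTransportTarget((host, 161), timeout=2, retries=1),
        ContextData(),
        ObjectType(ObjectIdentity("SNMPv2-MIB", "sysName", 0)),
    ))
    if error_indication or error_status:
        return None
    return str(var_binds[0][1])

for device in ["10.0.0.1", "10.0.0.2"]:
    name = snmp_sysname(device)
    print(f"{device}: {name if name else 'no SNMP response'}")
```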

Final Thoughts

Every so often I will come away from a product presentation and really feel like it’s a product that was meant for me, or other folks who find themselves on smaller teams but still managing decent-sized infrastructure. IT teams tend to run slim, and the prevalence of automation, and the need for it have justified some of the lower staffing ratios seen throughout the industry. Less so in large enterprise, but in mid-size or smaller enterprise networks, tools like Security Operations Manager help reduce the noise, and expedite decision making when it comes to monitoring and identifying problematic or compromised devices within the network.

PathSolutions have evolved what began as a tool for network administrators, with added insights for voice/telecom administrators, into a product that now takes all of the data they were already collecting from your infrastructure and boils it down to something quickly parsed and understood by security administrators. Even better if you happen to fill all three of those roles on your infrastructure team.

It’s surprisingly simple, lightweight, and very quick to get up and running. I’m looking forward to diving deeper into Security Operations Manager’s sandbox myself, and invite you to do the same.

Check out the full presentation and demo from Security Field Day 3.

Also, feel free to take a look at the PathSolutions Sandbox to try it yourself.

Nerdy Bullets

– All written in C/C++
– Backend storage is SQLite
– 13 months data retention (default) – but can be scaled up or down based on specific needs
– Data cleanup is done via SQL scripts, and can be customized based on your retention needs
– API integration with some firewall vendors (Palo Alto, as an example) to poll detailed data where SNMP is lacking
– Integrated NMAP to scan devices on the fly
– IP geolocation db updated every 24 hours
– Flow support – NetFlow, sFlow, and (soon) JFlow
– Security intelligence feeds from Firehall

Cisco Catalyst Wifi, Take Two

On November 13th, Cisco announced their next-generation wireless platform with the release of the Catalyst 9800 Series Wireless Controller.

You read that right, the next WLC platform from Cisco is running on Catalyst and expands Cisco’s DNA-Center architecture into the wireless space.

The Catalyst 9800 controllers come in a variety of form factors. The option for a standalone hardware controller is still here with the 9800-40 and 9800-80, or the 9800 series can be run as a VM in a private or public cloud. A third option is now to run embedded wireless on the Catalyst 9k series switches.

Embedded wireless controllers on Catalyst switches…that sounds familiar, doesn’t it?

Cisco made a similar move a few years ago with an architecture called Converged Access. This embedded the wireless controller functionality into IOS XE on the 3650 and 3850 access switches. For various reasons, it did not live up to expectations, and Cisco killed it in IOS XE Everest 16.5.1a in late 2017.

Cisco and Aironet

Cisco acquired Aironet Wireless Communications in 1999 for $799M. Since then, Cisco wireless access points have generally been referred to as “Aironet” products by name. This includes the software that runs on the wireless controllers and access points, AireOS.

AireOS came from Cisco’s acquisition of Airespace in 2005. Airespace were the developers of the AP/Controller model and the Lightweight Access Point Protocol (LWAPP), which was the precursor to CAPWAP.

(Credit to Jake Snyder for correcting me on the origins of AireOS)

Whatever AireOS version is running on your wireless controller is the same version you have on your access points. Cisco has developed the platform into what it is today, and very little of the original AireOS remains.

With this iteration, or rather re-invention, of the Wireless Controller, Cisco have highlighted three key improvements over their previous wireless software.

Always-On

Controller redundancy is always critical to prevent downtime in the event of failure. Here, Cisco are touting stateful switchover with an active/standby model in which client state is maintained on the standby controller, offering no downtime for clients in the event of a failure.

Patches and minor software updates no longer change the base image of the controller. Updates can be done without client downtime. Patches for specific AP models can be applied without affecting the base image or other access point models via per-AP device packs. These are installed on the controller and then pushed only to the AP models they are for.

New AP models can also be joined to the controller without impact to the overall base image with the AP device packs, allowing new hardware to join an existing environment without a major upgrade.

Citing “no disruption” base image/version upgrades, the new 9800 controllers can be updated independently of the access points, whereas previously the software version running on the controller and the access points was coupled. Upgrades were done on the controller and then pushed to the access points. More often than not, this resulted in interruption to clients on affected access points, some rebooting of the controller and APs was inevitable, and quite often there were orphaned access points that never quite upgraded properly or failed to rejoin the controller.

Cisco have made many improvements to the upgrade process over the years, including staged firmware upgrades; however, in large wireless deployments, firmware upgrades would not generally be considered zero-downtime.

With the new controller architecture using an RF-based intelligent rolling upgrade process, Cisco has aimed to eliminate some of these issues. During the upgrade process, the standby or secondary controller is first upgraded to the new image. You can then specify a percentage of access points you would like upgraded at once (5%-25%), and the controller determines which APs should be upgraded using AP neighbor information and the number of clients on each AP. APs with no clients are upgraded first. Clients on access points that are about to be upgraded are steered toward neighboring access points in order to prevent interruption in service.
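Cisco hasn’t published the exact selection logic, but the batching step as described is easy to sketch conceptually: pick a percentage of APs per round, idle APs first. This simplification ignores the RF neighbor awareness that the real process uses:

```python
# Conceptual sketch of the rolling-upgrade batching described above: pick up to
# N% of APs per round, idle APs first. This is not Cisco's actual algorithm and
# ignores the RF neighbor awareness the real process uses.
def next_upgrade_batch(aps, percent):
    """aps maps AP name -> current client count; percent is 5-25."""
    batch_size = max(1, len(aps) * percent // 100)
    ordered = sorted(aps, key=lambda name: aps[name])   # least-loaded first
    return ordered[:batch_size]

access_points = {"ap-lobby": 0, "ap-cafe": 12, "ap-floor2": 3, "ap-floor3": 0}
print(next_upgrade_batch(access_points, 25))   # -> ['ap-lobby']
```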

The idea of steering clients to other access points, or to 5GHz radios instead of 2.4GHz radios, isn’t new, and because I’m not a wireless expert I won’t comment on exactly how it’s done, but it is my understanding that it is difficult to guarantee that the client will “listen” to the steering mechanism. I feel that even with this intelligent RF logic behind the upgrade process, some clients will inevitably experience a loss of connectivity during the upgrade.

Once the access point is upgraded, it then joins the already-upgraded controller, and resumes servicing clients.

After all access points are joined to the upgraded controller, the primary controller begins its upgrade process.

Secure

Encrypted Traffic Analytics was first announced as part of the Catalyst 9K switch launch, and uses advanced analytics and Cisco Stealthwatch to detect malware in encrypted flows, without the need for SSL decryption. ETA is now available for wireless traffic on the 9800 platform, if deployed in a centralized model, meaning all wireless traffic is tunneled back to the controller.

This is a great feature considering the only other option for gaining visibility into encrypted traffic is usually some form of sketchy certificate man-in-the-middle voodoo. In many situations this works okay for corporate domain-joined machines, since you control the certificate trusts, but if you provide wireless to any BYOD devices or to the general public in any way, this often results in people not using your wireless because of certificate issues.

Deploy Anywhere

Cisco is offering a lot of flexibility in deployment options for this new wireless controller.

Branch offices can look at the embedded software controller on Catalyst 9K switches for up to 200 APs, and 4K clients.

Edit: Since the original publication of this post, I’ve clarified that the option to run the 9800 controller on a Catalyst 9K switch is only available as an SD-Access Fabric Mode deployment option. SD-Access requires DNA Center. This is an expensive proposition for what could truly have been a fantastic option for small/medium branch office deployments.

Private or public cloud options are available on KVM, VMware, and Cisco ENCS, and will be available on AWS. These options support 1000, 3000, or 6000 APs and 10K, 32K, or 64K clients respectively. The AWS public cloud option only supports FlexConnect deployment models, which makes sense, as tunneling all client traffic back to your controller in this case would get expensive quickly.

Physical appliance options include the 9800-40 at 2000 APs, 32K clients and 40Gbps (4x10Gbps interfaces), as well as the 9800-80 at 6000 APs, 64K clients, and 80Gbps (8x10Gbps interfaces). The 9800-80 also has a modular option which allows for GigE, 10GigE, 40GigE, and 100GigE uplinks.

Each of these options has identical setup, configuration, management, and features.

Lessons Learned?

Overall, the presentation of this new wireless platform seems solid. Cisco have acknowledged the problems with Converged Access, and seem to have checked off all of the missing boxes from that first attempt. Feature parity was a big one, and Cisco insists that all features will be matched up to existing controller software version 8.8 (the current version is 8.5 at the time of this post), so that would give Cisco and their customers quite a bit of time to flesh out the new architecture.

Now, AireOS isn’t going to disappear suddenly. Cisco have said that they are going to continue to develop and support the existing line of controllers and AireOS software until they can be sure that this new architecture has been successfully adopted by their customers. Customers who previously bought into Converged Access may not be lining up to be the first to try out the new platform, but the popularity of the Catalyst 9K switches should provide a good foundation for the embedded controller to gain a foothold.

You can check out Cisco’s presentation at Networking Field Day 19 here:


Free Cisco CCNA Lab Guide

I’ve just had a look at the free Cisco CCNA Lab Guide from Neil Anderson at the Flackbox blog. The eBook contains complete configuration lab exercises and solutions that help prepare you for and pass the Cisco CCNA Routing and Switching exam (200-125). It’s also useful as a configuration reference for Cisco routers and switches even if you’re not interested in taking the exam.

The eBook contains 350 pages with 25 complete lab exercises along with solutions which cover everything on the latest 200-125 CCNA and 100-105 and 200-105 ICND exams. The lab exercises can be run completely for free on your laptop – no additional equipment is necessary.

The guide contains full instructions on how to install the software and also download links for the lab start-up files, so you can immediately get into the hands on practice that will help you learn the material and pass the exam.

Below is the list of the 25 lab exercises included in the eBook:

  • The IOS Operating System
  • The Life of a Packet
  • The Cisco Troubleshooting Methodology
  • Cisco Router and Switch Basics
  • Cisco Device Management
  • Routing Fundamentals
  • Dynamic Routing Protocols
  • Connectivity Troubleshooting
  • RIP Routing Information Protocol
  • EIGRP Enhanced Interior Gateway Routing Protocol
  • OSPF Open Shortest Path First
  • VLANs and Inter-VLAN Routing
  • DHCP Dynamic Host Configuration Protocol
  • HSRP Hot Standby Router Protocol
  • STP Spanning Tree Protocol
  • EtherChannel
  • Port Security
  • ACL Access Control Lists
  • NAT Network Address Translation
  • IPv6 Addressing
  • IPv6 Routing
  • WAN Wide Area Networks
  • BGP Border Gateway Protocol
  • Cisco Device Security
  • Network Device Management

The different lab exercises help you explore Cisco IOS operating system Command Line Interface (CLI) navigation. Each has a guided walkthrough of the IOS command line interface and exercises that will familiarise you with Cisco IOS configuration. The labs are presented in two parts – first the lab exercise and then the detailed answer key.

Neil wanted the guide to be completely free and as simple to use as possible, so it uses the free software GNS3 and Packet Tracer for all the exercises. GNS3 is the best software for routing labs, while Packet Tracer is the best for switching labs.

The downloadable start-up files load in either GNS3 or Packet Tracer so you can get up and running with the labs immediately. But if you have your own physical lab, you can refer to the topology diagrams and use them as instructions on cabling it up.

The guide also contains troubleshooting tips that will further expand your networking knowledge. These are explained in a logical manner to give you a systematic way of troubleshooting issues as they arise.

You can download the guide for free at https://www.flackbox.com/cisco-ccna-lab-guide to take your networking skills up a notch and further your career.

Forward Thinkers, Forward Networks.

Maintenance windows. Let’s be honest, they suck. If you ask any network admin, they will likely tell you that midnight maintenance windows are their least favorite part of the job. They are a necessity due to the very nature of what we do – build, operate, and maintain large, complex networks – because any changes that are made can have far-reaching, and often unpredictable, impact. Impact to production systems that we must avoid whenever possible. So, we schedule downtime and amp up our caffeine intake for an evening of changes and testing whatever we may have broken.

No matter how meticulous you are in your planning, no matter how well you know the subtle intricacies of your environment, something, somewhere is going to go wrong. Even if you are one of the lucky few to have a lab environment in which to test changes, it’s often not even close to the scale of your actual network.

But, what if you had a completely accurate, full-scale model of your network, and could test those changes without having to risk your production network? A break/fix playground that would allow you to vet any changes you needed to make, which would in turn, allow you the peace of mind of shorter, smoother maintenance windows, or perhaps (GASP!) no maintenance windows at all?

Go ahead, break it.

That’s what Forward Networks’ co-founders David Erickson and Brandon Heller want you to do within their Forward Platform, as they bring about a new product category they call Network Assurance:

“Reducing the complexity of networks while eliminating the human error, misconfiguration, and policy violations that lead to outages.”

At Network Field Day 13, only a few days after Forward Networks came out of stealth, we had the privilege of hearing, for the first time, exactly who and what Forward Networks was, and how their product would “accelerate an industry-wide transition toward networks with greater flexibility, agility, and automation, driven by a new generation of network control software.”

David Erickson, CEO and co-founder, spoke to how they have recognized that modern networks are complex, made up of hundreds if not thousands of devices, are often heterogeneous, and can contain millions of lines of configuration, rules, and policy. The tools we have to manage these networks are outdated (ping, traceroute, SNMP, etc.) and the time spent as a network admin going through the configuration of these devices looking for problems is overwhelming at times. As a result, a significant portion of outages in today’s networks are caused by simple human error, which has far-reaching impact to business, and brand.

This is not a simulation or emulated model of your network, but a full-scale replica, in software, that you can use to review, verify, and test against without risk to production systems. Their algorithm, they claim, traces through every port in your network to determine where every possible packet could go within the network as it is presently configured. The “all packet”.
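Forward’s actual computation is far more sophisticated than anything I could sketch (it models complete header spaces across every forwarding and filtering rule), but the flavor of model-based path analysis can be shown with a toy lookup over a hypothetical forwarding table:

```python
# Toy flavor of model-based path analysis: where would a packet to a given
# destination go, based only on a (hypothetical) forwarding table model?
# Forward's real engine reasons about complete header spaces; this is a sketch.
from ipaddress import ip_address, ip_network

# Hypothetical model: device -> list of (prefix, next_hop_device or "deliver")
FIB = {
    "edge-fw":  [("0.0.0.0/0", "core-rtr")],
    "core-rtr": [("10.1.0.0/16", "dc-sw"), ("0.0.0.0/0", "edge-fw")],
    "dc-sw":    [("10.1.10.0/24", "deliver")],
}

def trace(start, dst_ip, max_hops=8):
    """Return (device path, delivered?) for a packet to dst_ip from start."""
    path, device = [start], start
    for _ in range(max_hops):
        matches = [(p, nh) for p, nh in FIB.get(device, [])
                   if ip_address(dst_ip) in ip_network(p)]
        if not matches:
            return path, False
        # Longest-prefix match, as a real FIB lookup would do.
        _, next_hop = max(matches, key=lambda m: ip_network(m[0]).prefixlen)
        if next_hop == "deliver":
            return path, True
        path.append(next_hop)
        device = next_hop
    return path, False            # too many hops: probably a loop

print(trace("edge-fw", "10.1.10.25"))   # (['edge-fw', 'core-rtr', 'dc-sw'], True)
print(trace("edge-fw", "192.0.2.1"))    # bounces between edge-fw and core-rtr
```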

Applications

The three applications that were demonstrated for us were Search, Verify, and Predict.

Search – think “Google” for your network. Search devices and behavior within an interactive topology.

Verify – See if your network is doing what you think it should be doing. All policy is applied with some intent: is your intent being met?

Predict – When you identify the need for a change, how can you be sure the change you make will work? How do you know that change won’t break something else? Test your proposed changes against the copy of your network and see exactly what the impacts will be.

Forward Search

Brandon Heller offered an in-depth demo of these tools, beginning with Search. Looking at a visual overview of the demo network, he was able to query in very simple terms for specific traffic, in this case traffic from the Internet to his web servers. In a split second, Search zoomed in on a subset of the network topology, showing exactly where this traffic would flow. Diving further into the results, each device would then show the rules or configuration that allowed this traffic across the device, in an intuitive step-through menu that traced the specified path through the entire network and highlighted the relevant configuration or code.

This was all done in a few seconds, on a heterogeneous topology of Juniper, Arista, and Cisco devices.

Normally, tracing the path through the network would require a network admin, with knowledge of each of those vendors, to manually test with tools like ping and traceroute, and also comb through each configuration device-by-device along the path he or she thought was the correct one, in order to verify the traffic was flowing properly.

The response time on the queries was snappy, and Brandon explained this was because, like a search engine, everything about the network was indexed ahead of time, making queries almost instantaneous.

Forward Verify

It’s one thing to understand how your network should behave, and another to be able to test and confirm this behavior. Forward Verify has two ways of doing this. The first is a library of predefined checks that identify common configuration errors: things like duplex consistency that are fairly common, yet easy-to-miss, misconfigurations.

The second is network-specific policy checks. Here once again, a simple, intuitive query verified that bidirectional traffic to and from the Internet could reach the web servers via HTTP and SSH.

When there is a failure, a link is provided which allows you to drill down into the pertinent devices and their configuration and see where your policy check is failing.

Forward Predict

When a problem is identified or a change to the network configuration is necessary, Forward Predict is the final tool in the suite, and in my opinion, the most important one, as it allows you to test a change against your modeled network to see what impact it will have. This is huge, as typically changes are planned, implemented and then tested in a production environment in a change or maintenance window.

Forward Predict, while it may not eliminate the need for proper planning and implementation, allows you to build and test configuration changes in what is essentially a fully duplicated sandbox model of your exact environment. This is going to make those change windows a lot less painful as you already know what the outcome will be, rather than troubleshooting problems that weren’t anticipated when the changes were planned.

Moving “Forward”

A common sentiment among NFD delegates during this presentation was that Forward Networks’ product did some amazing things, however we wondered if there was an opportunity here to move this product one step further and have it actually implement or make the changes to the network, after the changes have been vetted by Forward Predict.

Forward Adjust, perhaps?

Understandably, this is going to involve a lot of testing, especially in light of the fact that Forward is completely vendor-neutral and touts the ability to work with complex, mixed environments. Making changes in those types of environments adds a lot of responsibility to this platform, and with that comes risk. Risk that most engineers might be a little skeptical to entrust to a single platform.

Time will tell, and I look forward to hearing more about Forward Networks’ development over the upcoming months, and see where the Network Assurance platform takes us.

Check out the entire presentation over at Tech Field Day, including a fantastic demonstration from Behram Mistree on how Forward Verify can help mitigate and diagnose outages in complex, highly resilient networks.



Netool – Pocket Sized Network Tester and Analyzer

As network engineers/analysts/administrators, we’re always looking to add to our list of tools. Whether these are pieces of software, tidbits of script, or physical tools, anything that helps us in the performance of our day to day work is something we tend to hang on to and use again and again. More often than not these tools are manifested out of a need to make a specific task more efficient, or less mundane, especially if you don’t have a junior analyst around to give all that work to.

One such task is identifying or tracing a switch port. In a perfect world, all of the network drops in a building would have accurate labels that never fade or fall off, that correspond precisely to the switch and port that they connect to, and cables that never get arbitrarily moved between ports, and of course accurate port descriptions on the switches themselves. In this world, there’s no need for any kind of tool to trace a cable or drop, is there?

Sadly that perfect world rarely exists. Even in new construction the natural entropy of networks ensures that wall jack labels, punch panels, and switch ports all become muddled, and more often than not one or more of those pieces is incorrect. This leads to a need to verify and ensure the information you have is accurate, and the need for another tool.

Now, cable tracing isn’t new, and tools for tracing cables have been around for a very long time. Often these come in the form of a probe and tone set, where one device is connected to the cable and sends a tone along the wires, which can then be traced with a probe that listens for that tone. One simply waves the magic wand around all of the cables and waits for the one that provides tone – that must be the right cable! Well, not so fast, as crosstalk sometimes causes that tone to carry onto several other cables in a bundle, and the tone you hear might not be the “real” one. That aside, it’s a tedious, manual practice, and can waste a lot of time if you have to repeat the task with several ports.

Companies like Fluke Networks have, over the years, developed some very nice tools for cable testing and verification. Many of these can be fairly expensive, however, and perhaps outside the budget of an independent network consultant or other IT professional.


Enter Netool. This Indiegogo campaign touts the “World’s smallest network analyzer, testing and mapping tool”. When I came across this product on my Twitter feed I was very interested in learning more. This tiny tool will connect to and analyze a data port or cable, and provide switch and network information to your smartphone. It can display information gleaned from protocols such as CDP (Cisco Discovery Protocol) and LLDP (Link Layer Discovery Protocol), including switch port, VLAN, switch hostname, and IP information. It will also test for DHCP services and display a leased IP as well as the default gateway.
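You can approximate the discovery part of this on a laptop, since a single LLDP or CDP frame from the switch usually names the chassis, the port, and the VLAN. A rough sketch with Scapy (requires root, the interface name is a placeholder, and this obviously isn’t Netool’s firmware):

```python
# Rough equivalent of what a port-mapping tool listens for: a single LLDP or
# CDP frame usually names the switch, the port, and the VLAN. Requires root;
# the interface name is a placeholder. This is obviously not Netool's firmware.
from scapy.all import load_contrib, sniff

load_contrib("lldp")   # teach Scapy to decode LLDP TLVs
load_contrib("cdp")    # and Cisco Discovery Protocol

# LLDP uses EtherType 0x88cc; CDP is sent to multicast 01:00:0c:cc:cc:cc.
DISCOVERY_FILTER = "ether proto 0x88cc or ether dst 01:00:0c:cc:cc:cc"

def on_frame(pkt):
    pkt.show()   # dump the decoded fields: chassis/system name, port ID, VLAN, etc.

# Switches typically advertise every 30-60 seconds, so wait up to two minutes.
sniff(iface="eth0", filter=DISCOVERY_FILTER, prn=on_frame, count=1, timeout=120)
```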

Check out their campaign video:

According to the list of campaign perks, one of these will cost $160 USD + shipping, or a special early bird price of $130 USD + shipping. There is also currently a Beta Tester perk that will get one of these in your hands before anyone else for only $99 USD + shipping.

Compared to a Fluke Networks LinkSprinter 100 at $215.99 USD (as listed on CDW), this seems like a great deal. As far as I can tell, the only features the Netool lacks in comparison to the entry-level LinkSprinter are PoE detection, and perhaps support for additional protocols such as EDP (Extreme Discovery Protocol) or BDP (Brocade Discovery Protocol). All of these could perhaps come via a software update in the future.

I would encourage any of you in the market for a lightweight, hand-held, network testing and port mapping tool to check out the Netool web site, and consider a contribution to their Indiegogo campaign if this device is something you could see being part of your toolkit.

Hyperscale Networking for the Masses

In my career I’ve typically been responsible for plumbing together networks for branch, campus, and (very) small enterprise networks that had datacenters that were defined by single-digit rack numbers. So, when I’m reading or watching news about datacenter networking I often have a difficult time putting this into perspective, especially when the topic is focused on warehouse and football field sized datacenters. This might explain why I have not spent a lot of time working with or learning about Software Defined Networking, because it seems to me that SDN is a solution to a problem of scale, and scale isn’t something I’ve had to deal with.

As networks grow, management of configuration and policy eventually becomes unwieldy and increasingly difficult to keep consistent. Having to log into 100, 200, even 1000 devices to make a change is cumbersome, and so we as networkers seek to automate this process in some way. There have been applications and tools developed over the years that leverage existing management protocols like SNMP and others to provide a single-pane view for managing changes to your network, but once again these don’t scale to the size and scope that we’re talking about with SDN.

Taken to the extreme, SDN and Open Networking have allowed companies like Facebook and Google to actually define and design their own data center infrastructure, using merchant silicon. The argument here being that Moore’s Law is coming to an end. Commodity hardware is catching up to or has caught up to custom built silicon and the premium that many were willing to pay for these custom ASICs is no longer required in order to stay on the cutting edge of data networking.

Amin Vahdat, Fellow and Technical Lead for Networking at Google spoke about this at the Open Networking Summit earlier this year, and contributed to a paper on Google’s Datacenter Network for Sigcomm ’15. In both presentations, Amin outlines how Google has, over the course of the last 7-8 years, achieved 1.3 Pbps of bisection bandwidth in their current datacenters with their home-grown Jupiter platform. I would encourage you to check out both the video and the paper to learn more.

ONS 2015 Keynote – Amin Vahdat
Jupiter Rising: A Decade of Clos Topologies and Centralized Control in Google’s Datacenter Network

This application of SDN is dramatic. Few organizations have the ability, or need, to develop their own SDN solution for their own use. So how can this same scale-out model be applied to these other, smaller datacenters?

Recently I was invited to attend Networking Field Day 10 in San Jose, and we had an opportunity to visit Big Switch Networks. Rob Sherwood, CTO for Big Switch, spoke about some of the same principles around SDN, citing the Facebook and Google examples, and explained that there was “a tacit assertion that the incumbent vendors, the products that they build do not work for companies at this scale.”

Their solution? Big Cloud Fabric, designed to offer hyperscale-style networking to any enterprise. It is designed around the same three principles seen in Google’s infrastructure:

1) Merchant Silicon
2) Centralized Control
3) Clos Topology

Operating on 1U white-box/bare-metal switches running Switch Light OS, the leaf-spine topology is managed through the Big Cloud Fabric Controller. Several deployment options exist, including integration with OpenStack and VMware, and based on the current Broadcom chip being used, the fabric can scale out to up to 16 racks of compute resources per controller pair. Even if you only have half a dozen racks today, BCF provides scalability and economy.
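The leaf-spine (Clos) design itself is conceptually simple: every leaf connects to every spine, so capacity and scale grow predictably as you add switches. A quick sketch of enumerating the links for an arbitrarily sized fabric:

```python
# Why leaf-spine (Clos) scales predictably: every leaf connects to every spine,
# so links = leaves x spines and any two hosts are at most leaf-spine-leaf apart.
# The counts below are arbitrary examples, not a BCF sizing guide.
from itertools import product

def leaf_spine_links(num_leaves, num_spines):
    leaves = [f"leaf{i + 1}" for i in range(num_leaves)]
    spines = [f"spine{j + 1}" for j in range(num_spines)]
    return list(product(leaves, spines))

links = leaf_spine_links(num_leaves=16, num_spines=4)
print(len(links), "links, e.g.", links[:3])
```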

You can watch Rob’s presentation on BCF here:

One of the other things Big Switch Networks has done is launch Big Switch Labs, which provides an opportunity to test drive their products and, for those of us who don’t work in large(ish) datacenters, a venue for getting our hands on a true SDN product in a lab environment. It’s a great way to gain insight into some of the problems SDN is aimed at solving, and provides a fantastic demonstration of some of the capabilities and scalability that Big Cloud Fabric can offer.


If you’re just getting your feet wet with SDN and/or Open Networking and want a brain-melting crash course on how it operates and scales in some of the world’s largest, most powerful datacenters, give Big Switch Labs a test drive. Big Cloud Fabric provides datacenter management and control modeled around the same principles as other massive hyperscale fabrics, but designed to be “within reach” for today’s enterprise customers and their own datacenter workloads.

Updated Study Plans

It’s been a while since I’ve sat down and really mapped out where my studies are going. The last several months have been rather crazy, with some family issues, having a new baby (not me, my wife), and changing jobs. As things settle down I am going to re-focus on completing my CCNP, and continue along the VCP-NV track, while also planning to upgrade my current VCP5-DCV to the VCP6-DCV.

CCNP R&S Progress-O-Meter:

SWITCH – 4/6/2013

TSHOOT – 1/24/2015

ROUTE – In progress…(again)

On Pretentiousness

At the 2015 Cisco Live Welcome Keynote, I was fortunate enough to tag along with some other Cisco Champions and Social Media folks who were provided advance seating to the event. This gave us an opportunity to see some of the behind-the-scenes last-minute preparation that goes into the presentation and experience the production from a different perspective than we would traditionally experience as members of the general audience.

During this time a number of us were using Twitter and Periscope to share the experience and provide sounds and images from within the room using the #CLUS hashtag. One such photo included a panoramic view of the stage from the perspective of our seating area, which gave a great view of the stage and the entire production area.

This photo received a response that began a very unfortunate exchange on Twitter:

“Is this what pretentiousness looks like?”

The exchange degraded into vulgar personal attacks and references to genitalia, and the person who initiated it with the comment above has since deleted their Twitter account. Not surprising, considering how far south this conversation went.

Some Clarity

For those who missed it, essentially we (those who had been granted advance access to the keynote) were accused of “showing off” to the rest of Twitter. This accusation came from a former member of the Cisco Champions team, and someone who I had personally been following very early on since joining Twitter, and who had been, at least until this incident, a respected member of the social media community.

I can’t speak to the motivation behind the comments made, but I can say with certainty that nothing we were sharing with the rest of the community was or has ever been meant as bragging, or showing off. As part of the Cisco Champions team, or any other Social Media group, the intent is to share and provide insight to the community as a whole. It serves to involve as many people as possible in an inclusive manner, not as an exclusive, pretentious group.

Evidence of this is clear in that the group of people gathering and socializing at the Twitter Lounge and Social Media Hub over the years has grown exponentially. And let’s face it, many of us in this industry are fairly introverted, and if there were some underlying sense of cliquishness or exclusivity, we wouldn’t be welcoming new faces to the events year after year.

Ultimately, whether you are a member of Cisco Champions, VMware vExpert, Microsoft MVP, EMC Elect, or any other similar group, the goal is engagement rather than exclusion. These ladies and gentlemen are there to participate in and grow engagement with the community at large.

Now, this is a definite give-and-take relationship and there is some work involved – as a member of one of these groups you are going to spend some of your personal time engaged and involved in the community, whether it is through blogging, webinars, podcasts, etc. The reward or benefit from this is perhaps some exclusive access, whether it’s VIP seating at an event, or a sneak preview of a new product release or product updates.

Let’s call these what they are: perks. It’s a fair trade for the effort involved in creating content, and they are not there to cause any kind of divide in the community, but rather to highlight the benefits of becoming more involved.

Final Thoughts

If you can’t say something nice…

Electronic communication, whether it be email, text, Twitter, etc. all tend to distance the creator from their audience. It’s well-known on the Internet that many people have a strong sense of anonymity and thus the “keyboard warriors” are born, those who feel they can say whatever they want to whomever they want without fear of repercussion or reprisal.

Sometimes this feeling carries over to a medium in which you aren’t entirely anonymous, and whatever you say is going to be a part of your online resume or footprint, and could have lasting effects in the long-term.

I believe the source of these comments understands this, and this is at least part of the reason these comments were removed and ultimately their Twitter account was deleted.

It’s also evidence that they don’t have the integrity to stand by their comments.

For those of us who continue to participate in events like Cisco Live as members of the larger Social Media community, I believe we will continue to share and engage those around us by sharing content and insight. If you see something that makes you stop and say “I’d like to be part of that” then by all means, join us.