The Continuing Evolution of Forward Networks – Networking Field Day 34

Check out those backdrops!

Digital Twin

I was very fortunate to be a delegate at Networking Field Day 13, all the way back in 2016. This was a milestone event for Forward, as it marked their official move out of “stealth” as a Silicon Valley startup. Their initial presentation was impressive, and the Forward Networks platform offered something I had not seen before: an accurate digital copy of your network, which you could query to understand paths and flows, and use to test proposed changes to prevent misconfiguration.

Fast forward to Networking Field Day 34, and Forward Networks are presenting again, this time to highlight the maturity of their product, and how they are integrating AI and LLMs to allow for an even better experience, with natural language queries and enhanced visibility into your network digital twin.

At its core, the Forward Networks platform remains a full digital twin of your environment, allowing you to search, query, verify, and predict how traffic is behaving, and will behave, in your environment. They are vendor-agnostic, meaning you can easily have a mix of Cisco, Juniper, HPE-Aruba, Arista, etc. and still leverage the power of the platform. A simple local agent crawls your network with SSH credentials (or via API if you have devices that don’t support SSH) and builds a snapshot of your network, which you can then import into the tool and begin working with.

Having evolved quite a lot since 2016, Forward Networks now includes integrations with the three major cloud providers, and security tie-ins to platforms like Rapid7 and Tenable to identify CVEs that may impact your network devices. Now they have taken the next step, integrating generative AI into AI Assist as part of the Network Query Engine (NQE).

AI Assist allows the use of an LLM to generate queries against the network model. These queries can be saved for later use in your own repository of queries, or you can use a number of pre-packaged queries out of the box. The reverse is also true: you can use Summary Assist to analyze a query and provide a plain language summary of what it is doing.

Proving the Negatives

If you’ve been in networking for any length of time, you know the feeling of having to “defend” the network because it’s the first thing that gets blamed when something isn’t working. We’re constantly having to prove a negative, which is sometimes hard to do. It can involve a lot of jumping around your network in the CLI, pinging and checking routes, doing packet captures, etc. and there’s no easy way to translate a lot of these methods into a simple to understand view of your network, and where the traffic is or is not going.

The Forward Networks platform provides a simple, easy to understand analysis and view of traffic flow across your network in a 100% mathematically accurate carbon copy. Queries can be copied and shared, so now you can send a link to your Dev team and show them that, despite their initial assessment with no troubleshooting or factual information, it is *not* the network.

Continuing Forward

The team at Forward Networks continue to evolve and strengthen their platform, and the integration of LLMs into AI Assist and the Network Query Engine is a perfect fit. In an era where everyone is trying to shoehorn AI into their product, whether or not it makes sense to, this is an excellent example of what is still a very immature technology, put to good use.

If you want to learn more, and check out a customer testimonial around automation and cost savings from one of Forward Networks’ biggest customers, you can watch the recordings from the presentations here.

She’s a Keeper! Keeper Security presents at Security Field Day 7.

I never thought I’d leave you, 1Password…

Password Managers aren’t a new technology. Arguably the Excel 97 spreadsheet in your corporate file share labelled DefinitelyNotPasswords.xls was an early form of Password Manager (but only if it was password-protected). The architecture behind them hasn’t really evolved much over the years; they’ve just moved away from being flat files of all your most coveted secrets, protected by yet another code word/phrase, into nicely presented applications, or cloud solutions that simply scramble your bucket of passwords with some fancy encryption algorithm.

There are different storage models, each with their own pros and cons. Storing the passwords locally means they aren’t sitting in the cloud on someone else’s infrastructure, but that also means if your device is compromised, those passwords could be compromised as well. Store the passwords in the cloud, and well, they’re in the cloud, on hardware you don’t own or manage, and what happens when you lose communication to that infrastructure? You might not be able to update your vault. What happens in case of a breach? Has all of your data been exposed? There are certainly examples of this happening, as well as vulnerabilities in local versions of various password managers.

Ultimately there is no silver bullet, no perfect solution that is going to be completely impervious to a code vulnerability, a hack, or poor infrastructure security. But having a properly organized password manager, regardless of which model you choose, is still better than not having one at all.

This week at Security Field Day 7, I was introduced to Keeper. I’d heard of them before but hadn’t really dug into their product very much as I had chosen and settled into a password manager quite a few years ago, and truthfully had no complaints, and no reason to look elsewhere. I’ve been a 1Password user for several years, and even brought 1Password to my employer where we’ve adopted the enterprise version.

What stood out the most for me with the presentation from Keeper was that it seemed to be the first password manager that was purpose built from the ground up to be an enterprise-grade security tool. Many of today’s popular subscription-based password managers, 1Password included, evolved from a free product, aimed at consumers. That doesn’t mean they’re not secure, but that features were developed with the consumer in mind first, not the enterprise. Some of the more enterprise-y features they might have now may seem to have been tacked on as an afterthought, or to simply check off a box that might get the product adopted into enterprise.

Zero-Knowledge and Zero-Trust

Keeper has taken security of customer data very seriously, as they should, but their discussion around encryption and the methods used to protect and store password and secret data was next level. Keeper has absolutely no knowledge of a user’s master password, or stored passwords/secrets, as the keys to encrypt and decrypt this data are only stored on the user’s device. The data in your vault isn’t protected with a single key pair either: every single record in the vault is encrypted with its own keys. Those keys are then wrapped in another key if contained in a shared folder.
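
Keeper publishes the full details of their encryption model, but the per-record key-wrapping idea itself is easy to illustrate. Here’s a toy Python sketch of the concept using the cryptography library; the names and flow are my own simplification, not Keeper’s actual implementation:

```python
# Toy illustration of per-record keys plus key wrapping (NOT Keeper's code).
# Each record gets its own AES-256-GCM key; that key is then encrypted
# ("wrapped") with a folder key, so sharing a folder shares one wrap key,
# not every record key.
import os
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

def encrypt_record(plaintext: bytes, folder_key: bytes) -> dict:
    record_key = AESGCM.generate_key(bit_length=256)   # unique key per record
    nonce = os.urandom(12)
    ciphertext = AESGCM(record_key).encrypt(nonce, plaintext, None)
    wrap_nonce = os.urandom(12)
    wrapped_key = AESGCM(folder_key).encrypt(wrap_nonce, record_key, None)
    return {"nonce": nonce, "ciphertext": ciphertext,
            "wrap_nonce": wrap_nonce, "wrapped_key": wrapped_key}

def decrypt_record(blob: dict, folder_key: bytes) -> bytes:
    record_key = AESGCM(folder_key).decrypt(blob["wrap_nonce"], blob["wrapped_key"], None)
    return AESGCM(record_key).decrypt(blob["nonce"], blob["ciphertext"], None)

folder_key = AESGCM.generate_key(bit_length=256)       # would itself be wrapped per user
blob = encrypt_record(b"hunter2", folder_key)
assert decrypt_record(blob, folder_key) == b"hunter2"
```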

Their credentials are solid, and they are the only FIPS 140-2 validated password manager that I am aware of.

If you’re really into the nerdy side of encryption, check out all the details here.

“Keeper is the most secure, certified, tested and audited password security platform in the world. We are the only SOC2 and ISO27001 certified password management solution in the industry and Privacy Shield Compliant with the U.S. Department of Commerce’s EU-U.S. Privacy Shield program, meeting the European Commission’s Directive on Data Protection.”

Authentication and 2FA

The list of features they support is exhaustive, from SSO support via SAML 2.0 with any identity provider you can think of, to biometric support that includes Windows Hello, TouchID, FaceID, and Android. All of these options feature their Zero Knowledge model that completely protects your information in flight during the authentication process.

Two-Factor Authentication enforcement is available, along with Role-Based Access Control. Keeper supports all popular 2FA methods with your authenticator of choice, including Google Authenticator, Microsoft Authenticator, Duo, RSA, or FIDO2 keys like YubiKey. They even have their own integration with wearable technology like Apple Watch and Android Wear devices through KeeperDNA.

Even better, you can just use Keeper for all your 2FA codes and stop having to use 3-4 different apps for your TOTP/OTP supported logins.
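
Part of why this works with any login is that TOTP is an open standard (RFC 6238); every authenticator derives the same codes from the same seed. A minimal Python sketch of the derivation:

```python
# Minimal RFC 6238 TOTP: HMAC-SHA1 over the current 30-second time step.
import base64, hmac, struct, time

def totp(secret_b32: str, period: int = 30, digits: int = 6) -> str:
    key = base64.b32decode(secret_b32, casefold=True)
    counter = struct.pack(">Q", int(time.time()) // period)  # 8-byte big-endian step
    digest = hmac.new(key, counter, "sha1").digest()
    offset = digest[-1] & 0x0F                               # dynamic truncation
    code = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)

print(totp("JBSWY3DPEHPK3PXP"))  # matches any authenticator app seeded with this value
```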

Unfortunately, they also support SMS for 2FA, which I’d personally like to see more products completely remove as an option. This is likely my only complaint about Keeper but I understand it’s an option some people insist on using.

BreachWatch

With credential stuffing and spraying attacks on the rise, it is vital that passwords can be checked against known breached passwords. Keeper offers a feature called BreachWatch for checking vault information against known breached information on the Dark Web. You will be alerted to change a known breached password if there is a match.
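
BreachWatch’s internals are Keeper’s own, but the general technique for checking a password against breach data without revealing it is k-anonymity, popularized by Have I Been Pwned. A quick sketch of that technique (using HIBP’s public API, not BreachWatch):

```python
# k-anonymity breach check via the HIBP Pwned Passwords range API:
# only the first 5 hex characters of the SHA-1 hash ever leave your machine.
import hashlib
import requests

def times_pwned(password: str) -> int:
    digest = hashlib.sha1(password.encode()).hexdigest().upper()
    prefix, suffix = digest[:5], digest[5:]
    resp = requests.get(f"https://api.pwnedpasswords.com/range/{prefix}", timeout=10)
    resp.raise_for_status()
    for line in resp.text.splitlines():
        candidate, _, count = line.partition(":")
        if candidate == suffix:
            return int(count)
    return 0

print(times_pwned("hunter2"))  # a depressingly large number
```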

Hard to Switch, but…

The list of features Keeper offers goes on. If you can think of it, they likely already have it, and if not, they’re working on it.

If you take your credential security seriously, you’re using a password manager, and 2FA wherever you can. Once you’ve gotten yourself into a particular product like this, it can be a daunting task to switch to a new one. I myself have over 1200 items in my 1Password vaults, and thus far I’ve had no compelling reason to think about migrating my secrets to another platform. Until now.

Keeper’s presentation at Security Field Day 7 truly has me considering signing up for a trial at the very least.

Check out their presentation over at Tech Field Day.

Rapid Incident Response with PathSolutions Security Operations Manager – Security Field Day 3

As I delve further and further into “all things security” along my career path, it has become clear to me that one of the key skills a good Security Professional must have is the ability to filter out noise, and focus on identifying the critical pieces of information that require action, while safely being able to ignore the rest. Modern security tools, whether they are Firewalls, IDS/IPS, Proxies, Web Application Firewalls, Content Filters, etc. all collect, report, and alert on a lot of information. It can be overwhelming, and this is especially true for smaller, flatter IT teams that perhaps don’t have a dedicated Security Operations Center (SOC), or even an actual Security Team. Quite often, the “Security Team” is one person, and that person may also fill the role of Network Administrator, Server Administrator, or any number of other roles that some larger IT teams might have distributed across several individuals.

In these situations, having a tool or process that can consolidate and help with filtering and focusing the important data is key to being able to avoid information paralysis – the idea of having too much information to really be able to act on any of it in a meaningful way. This is SIEM – or Security Information and Event Management. Now, I’ve found SIEM can be interchangeably used as a noun when referring to a specific tool that performs this function, or as a verb when describing the act of processing the data from multiple sources into actionable information. In either case, the end result is the most critical – the ability to gather data from multiple sources, and render it down to something useful, and actionable.

This week at Security Field Day 3, I was fortunate to participate in a fantastic conversation with PathSolutions CTO Tim Titus, as he presented TotalView Security Operations Manager and its capabilities as a SecOps tool that can greatly improve awareness and response time to security events within your network.

60 Second Decisions

Investigating alerts can be tedious, and can take up a lot of time, only to find out in many cases that the alert was benign, and doesn’t require intervention. TotalView Security Operations Manager is a security orchestration, automation, and response (SOAR) product designed to optimize event response, reduce wasted time on false positives, and provide a faster path to quarantine and remediation.

Immediately upon an indication of suspicious activity, the Security Operations Manager dashboard provides almost instant details for the potentially compromised asset: the switch and port it is connected to, what it is (operating system, manufacturer), who is logged into it, what security groups/access they have, what Indicators of Compromise (IoC) are detected, and what destination(s) this asset is talking to on or outside the network, and whether any of these locations could be a known malicious or C&C (Command and Control) destination. With this information presented, the option to quickly quarantine the asset is offered, and is as simple as shutting down the switch port with the click of a button. All of this information is sourced natively, with no installed agents, no need for SPAN ports, or network taps. It is all done through NetFlow, SNMP, and WMI (Windows Management Instrumentation).
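
The quarantine action itself is conceptually just a config push to the switch. A hypothetical stand-in using Netmiko shows how simple the underlying operation is (device details are placeholders; this is not PathSolutions’ internal mechanism):

```python
# What "quarantine" boils down to: shut the access port the suspect device is on.
from netmiko import ConnectHandler  # pip install netmiko

def quarantine_port(host: str, username: str, password: str, interface: str) -> str:
    switch = {"device_type": "cisco_ios", "host": host,
              "username": username, "password": password}
    with ConnectHandler(**switch) as conn:
        output = conn.send_config_set([f"interface {interface}", "shutdown"])
        conn.save_config()
    return output

# quarantine_port("10.0.0.2", "admin", "***", "GigabitEthernet1/0/14")
```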

In roughly 60 seconds, enough information is presented to enable you to make a swift, informed decision on what action to take, saving countless minutes or hours of correlating information from disparate tools or infrastructure in order to determine if there is in fact a problem. Should this end user workstation suddenly start talking to a known bad IP in North Korea? Probably not! Shut it down.


TotalView Security Operations Manager doesn’t stop there, and Tim walked us through an in-depth demo of their solution.

Device Vulnerability Reporting

It would be almost too easy to insert a Buzz Lightyear meme captioned with “Vulnerabilities. Vulnerabilities everywhere…” because it’s true. Just a few days ago (as of this writing), Microsoft’s Patch Tuesday saw the release of 111 fixes for various vulnerabilities, the third largest in Microsoft’s history. Keeping up with patches and software updates for any size network can be a daunting task, and more often than not, there is simply not enough time to patch absolutely everything. We must pick and choose what gets patched by evaluating risk, and triaging updates based on highest risk or exposure.


TotalView Security Operations Manager is able to provide constant monitoring of all of your network assets for operating system or device vulnerabilities by referencing the NIST Vulnerability Database (NVD) every 24 hours, identifying those with a known vulnerability, and allowing you to dig deeper into the CVE to assist with risk assessment.
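
The NVD that Security Operations Manager references is publicly queryable, so you can get a feel for the data behind this feature yourself. A sketch against the NVD REST API, with a made-up CPE string:

```python
# Query the NIST NVD CVE API (v2.0) for known vulnerabilities matching a CPE.
import requests

NVD_URL = "https://services.nvd.nist.gov/rest/json/cves/2.0"

def cves_for_cpe(cpe_name: str, limit: int = 20) -> list[str]:
    resp = requests.get(NVD_URL,
                        params={"cpeName": cpe_name, "resultsPerPage": limit},
                        timeout=30)
    resp.raise_for_status()
    return [item["cve"]["id"] for item in resp.json().get("vulnerabilities", [])]

# Hypothetical CPE for an IOS version; substitute your device's actual CPE string.
# print(cves_for_cpe("cpe:2.3:o:cisco:ios:15.2(4)e:*:*:*:*:*:*:*"))
```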

Communications Monitoring and Geographic Risk Profiling

Do you know which of your devices are talking to each other? Do you know where in the world your devices are sending data? These are both questions that can sometimes be difficult to answer without some baseline understanding of all of the traffic across your network. With Communications Policy Monitoring and Alerting, Security Operations Manager is able to trigger an alert when a device starts communicating with another device that it shouldn’t be talking to, based on policies you define.
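
Conceptually, this kind of policy monitoring boils down to checking observed flows against an allowlist. A toy sketch of the idea (networks and flows invented for illustration):

```python
# Toy communications-policy check: alert on flows not covered by a defined policy.
from ipaddress import ip_address, ip_network

ALLOWED = [
    (ip_network("10.10.0.0/24"), ip_network("10.20.0.0/24")),  # workstations -> servers
    (ip_network("10.20.0.0/24"), ip_network("0.0.0.0/0")),     # servers -> anywhere
]

def flow_allowed(src: str, dst: str) -> bool:
    return any(ip_address(src) in s and ip_address(dst) in d for s, d in ALLOWED)

for src, dst in [("10.10.0.5", "10.20.0.8"), ("10.10.0.5", "203.0.113.9")]:
    if not flow_allowed(src, dst):
        print(f"ALERT: unexpected flow {src} -> {dst}")
```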

The Geographic Risk profiling looks at where your devices are communicating globally, presented in an easy to understand map view, quickly showing if and when you may have an asset that is sending data somewhere it shouldn’t. The Chord View within the dashboard breaks out the number of flows by country, which presents a nice quick visual, giving you an idea of the percentage of your data flowing to appropriate vs. questionable destinations.


New Device Discovery and Interrogation

Not everyone has a full Network Access Control (NAC) system in place. Let’s be honest, they’re not simple to set up, and can often be responsible for locking out legitimate devices from accessing the network at inconvenient times. Without NAC, network operators are often blind to new devices being connected. With Security Operations Manager, in the event that new devices are connected, they are discovered and interrogated to find out what they are, and what they are communicating with. This gives tremendous flexibility to monitor random items being connected, and makes it simple to decide how they should be treated.


Rapid Deployment

Touting a 30 minute deployment, with only a single 80MB Windows VM required, this seems too good to be true, right? Maybe. There are some dependencies here that, if not already in place, will require some ground work to get all of the right information flowing to the tool. As Tim mentions, there are no requirements for agents to be installed, or taps; all of the data is sourced natively via SNMP, NetFlow and WMI. This means all you need to provide the Security Operations Manager VM is SNMP access to all of your routers, switches, firewalls, gateways, etc. as well as access to the NetFlow data, and WMI credentials for your Windows environment. Setting all of that up, if it’s not already in place, will take some planning, and time. It’s especially important to ensure that SNMP is set up correctly, and securely. The ability of Security Operations Manager to gather 100% of the data from your network relies on you having correctly configured and prepared 100% of your devices for these protocols.
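
On the “set up SNMP securely” point: that means SNMPv3 with authentication and privacy, not a v2c community string of “public”. A quick pysnmp sketch of the kind of authenticated poll a monitoring tool would make (host and credentials are placeholders):

```python
# SNMPv3 (authPriv) poll of sysName using pysnmp -- the secure way to let a
# monitoring tool read your devices. Host and credentials are placeholders.
from pysnmp.hlapi import (SnmpEngine, UsmUserData, UdpTransportTarget, ContextData,
                          ObjectType, ObjectIdentity, getCmd,
                          usmHMACSHAAuthProtocol, usmAesCfb128Protocol)

iterator = getCmd(
    SnmpEngine(),
    UsmUserData("somuser", "auth-pass", "priv-pass",
                authProtocol=usmHMACSHAAuthProtocol,
                privProtocol=usmAesCfb128Protocol),
    UdpTransportTarget(("10.0.0.1", 161)),
    ContextData(),
    ObjectType(ObjectIdentity("SNMPv2-MIB", "sysName", 0)),
)
error_indication, error_status, _, var_binds = next(iterator)
if error_indication or error_status:
    print(error_indication or error_status.prettyPrint())
else:
    for name, value in var_binds:
        print(f"{name} = {value}")
```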

Final Thoughts

Every so often I will come away from a product presentation and really feel like it’s a product that was meant for me, or other folks who find themselves on smaller teams but still managing decent-sized infrastructure. IT teams tend to run slim, and the prevalence of automation, and the need for it have justified some of the lower staffing ratios seen throughout the industry. Less so in large enterprise, but in mid-size or smaller enterprise networks, tools like Security Operations Manager help reduce the noise, and expedite decision making when it comes to monitoring and identifying problematic or compromised devices within the network.

PathSolutions have evolved what began as a tool for network administrators (later adding insights for voice/telecom administrators) into a product that now takes all of the data they were already collecting from your infrastructure and boils it down to something quickly parsed and understood by security administrators. Even better if you happen to fill all three of those roles on your infrastructure team.

It’s surprisingly simple, lightweight, and very quick to get up and running. I’m looking forward to diving deeper into Security Operations Manager’s sandbox myself, and invite you to as well.

Check out the full presentation and demo from Security Field Day 3.

Also, feel free to take a look at the PathSolutions Sandbox to try it yourself.

Nerdy Bullets

– All written in C/C++
– Backend storage is SQLite
– 13 months data retention (default) – but can be scaled up or down based on specific needs
– Data cleanup is done via SQL scripts, and can be customized based on your retention needs
– API integration with some firewall vendors (Palo Alto, as an example) to poll detailed data where SNMP is lacking
– Integrated NMAP to scan devices on the fly
– IP geolocation db updated every 24 hours
– Flow support – NetFlow, sFlow, and (soon) JFlow
– Security intelligence feeds from Firehall

Cisco Catalyst Wifi, Take Two

On November 13th, Cisco announced their next-generation wireless platform with the release of the Catalyst 9800 Series Wireless Controller.

You read that right, the next WLC platform from Cisco is running on Catalyst and expands Cisco’s DNA-Center architecture into the wireless space.

The Catalyst 9800 controllers come in a variety of form factors. The option for a standalone hardware controller is still here with the 9800-40 and 9800-80, or the 9800 series can be run as a VM in a private or public cloud. A third option is now to run embedded wireless on the Catalyst 9k series switches.

Embedded wireless controllers on Catalyst switches…that sounds familiar, doesn’t it?

Cisco made a similar move a few years ago with an architecture called Converged Access. This embedded the wireless controller functionality into IOS XE on the 3650 and 3850 access switches. For various reasons, it did not live up to expectations, and Cisco killed it in IOS XE Everest 16.5.1a in late 2017.

Cisco and Aironet

Cisco acquired Aironet Wireless Communications in 1999 for $799M. Since then, Cisco wireless access points have generally been referred to as “Aironet” products by name. This includes the software that runs on the wireless controllers and access points, AireOS.

AireOS came from Cisco’s acquisition of Airespace in 2005. Airespace were the developers of the AP/Controller model and the Lightweight Access Point Protocol (LWAPP), which was the precursor to CAPWAP.

(Credit to Jake Snyder for correcting me on the origins of AireOS)

Whatever AireOS version is running on your wireless controller is the same one you have on your access points. Cisco has developed the platform into what it is today, and very little of the original AireOS remains.

With this iteration, or rather re-invention, of the Wireless Controller, Cisco have highlighted three key improvements over their predecessor wireless software.

Always-On

Controller redundancy is always critical to prevent downtime in the event of failure. Here, Cisco are touting stateful switchover with an active/standby model in which client state is maintained on the standby controller, offering no downtime for clients in the event of a failure.

Patches and minor software updates now will not change the base image of the controller. Updates can be done without client downtime. Patches for specific AP models can be done without affecting the base image or other access point models with per-AP device packs. These are installed to the controller and then pushed only to the model of AP they are for.

New AP models can also be joined to the controller without impact to the overall base image with the AP device packs, allowing new hardware to join an existing environment without a major upgrade.

Citing “no disruption” base image/version upgrades, the new 9800 controllers can be updated independently of the access points, whereas previously the software version running on the controller and access points was coupled. Upgrades were done to the controller, and then pushed to the access points. More often than not, this resulted in interruption to clients on affected access points, some inevitable rebooting of the controller and APs, and quite often a few orphaned access points that never quite upgraded properly or failed to rejoin the controller.

Cisco have made many improvements to the upgrade process over the years, including staged firmware upgrades, however in large wireless deployments, firmware upgrades would not generally be considered zero-downtime.

With the new controller architecture using an RF-based intelligent rolling upgrade process, Cisco has aimed at eliminating some of these issues. During the upgrade process, the standby or secondary controller is first upgraded to the new image. You can then specify a percentage of access points you would like upgraded at once (5%-25%), and the controller then determines which APs should be upgraded using the AP neighbor information and the number of clients on each AP. APs with no clients are upgraded first. Clients on access points that are to be upgraded are steered toward neighboring access points in order to prevent interruption in service.
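
Ignoring the RF neighbor data the real controller also weighs, the wave-selection logic described above might look something like this toy sketch:

```python
# Toy rolling-upgrade scheduler: upgrade N% of APs per wave, idle APs first.
# Cisco's actual logic also uses RF neighbor relationships; this only sorts
# by client count to show the batching idea.
def upgrade_waves(ap_clients: dict[str, int], percent: int = 10) -> list[list[str]]:
    batch = max(1, len(ap_clients) * percent // 100)
    ordered = sorted(ap_clients, key=ap_clients.get)     # fewest clients first
    return [ordered[i:i + batch] for i in range(0, len(ordered), batch)]

waves = upgrade_waves({"ap1": 0, "ap2": 12, "ap3": 3, "ap4": 0}, percent=50)
print(waves)  # [['ap1', 'ap4'], ['ap3', 'ap2']]
```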

The idea of steering clients to other access points or 5GHz radios instead of 2.4GHz radios isn’t new, and because I’m not a wireless expert I won’t comment on exactly how it’s done, but it is my understanding that it is difficult to guarantee that the client will “listen” to the steering mechanism. I feel that even with this intelligent RF behind the upgrade process, some clients will inevitably experience a loss of connectivity during the upgrade.

Once the access point is upgraded, it then joins the already-upgraded controller, and resumes servicing clients.

After all access points are joined to the upgraded controller, the primary controller begins its upgrade process.

Secure

Encrypted Traffic Analytics was first announced as part of the Catalyst 9K switch launch, and uses advanced analytics and Cisco Stealthwatch to detect malware in encrypted flows, without the need for SSL decryption. ETA is now available for wireless traffic on the 9800 platform, if deployed in a centralized model, meaning all wireless traffic is tunneled back to the controller.

This is a great feature considering the only other option for gaining visibility into encrypted traffic is usually some form of sketchy certificate man-in-the-middle voodoo. In many situations this works okay for corporate domain-joined machines as here you can control the certificate trusts, but if you provide wireless to any BYOD devices or to the general public in any way, this often results in people not using your wireless because of certificate issues.

Deploy Anywhere

Cisco is offering a lot of flexibility in deployment options for this new wireless controller.

Branch offices can look at the embedded software controller on Catalyst 9K switches for up to 200 APs, and 4K clients.

Edit: Since the original publication of this post, I’ve clarified that the option to run the 9800 controller on a Catalyst 9K switch is only available as an SD-Access Fabric Mode deployment option. SD-Access requires DNA Center. This is an expensive proposition for what could truly have been a fantastic option for small/medium branch office deployments.

Private or public cloud options are available on KVM, VMware, Cisco ENCS, and will be available on AWS. These options support 1000, 3000, and up to 6000 APs, and 10K, 32K, and 64K clients. The AWS public cloud option only supports FlexConnect deployment models, which makes sense as tunneling all client traffic back to your controller in this case would get expensive quickly.

Physical appliance options include the 9800-40 at 2000 APs, 32K clients and 40Gbps (4x10Gbps interfaces), as well as the 9800-80 at 6000 APs, 64K clients, and 80Gbps (8x10Gbps interfaces). The 9800-80 also has a modular option which allows for GigE, 10GigE, 40GigE, and 100GigE uplinks.

Each of these options has identical setup, configuration, management, and features.

Lessons Learned?

Overall, the presentation of this new wireless platform seems solid. Cisco have acknowledged the problems with Converged Access, and seem to have checked off all of the missing boxes from that first attempt. Feature parity was a big one, and Cisco insists here that all features will be the same up to the existing controller software version 8.8 (the current version is 8.5 at the time of this post), so that would give Cisco and their customers quite a bit of time to flesh out the new architecture.

Now, AireOS isn’t going to disappear suddenly. Cisco have said that they are going to continue to develop and support the existing line of controllers and AireOS software, until they can be sure that this new architecture has been successfully adopted by their customers. Customers who previously bought into Converged Access may not be lining up to be the first customers to try out the new platform, but the popularity of the Catalyst 9K switches should provide a good foundation for the embedded controller to gain a foothold.

You can check out Cisco’s presentation at Networking Field Day 19 here:

 

Free Cisco CCNA Lab Guide

I’ve just had a look at the free Cisco CCNA Lab Guide from Neil Anderson at the Flackbox blog. The eBook contains complete configuration lab exercises and solutions that help prepare you for and pass the Cisco CCNA Routing and Switching exam (200-125). It’s also useful as a configuration reference for Cisco routers and switches even if you’re not interested in taking the exam.

The eBook contains 350 pages with 25 complete lab exercises along with solutions which cover everything on the latest 200-125 CCNA and 100-105 and 200-105 ICND exams. The lab exercises can be run completely for free on your laptop – no additional equipment is necessary.

The guide contains full instructions on how to install the software and also download links for the lab start-up files, so you can immediately get into the hands on practice that will help you learn the material and pass the exam.

Below is the list of the 25 lab exercises included in the eBook:

  • The IOS Operating System
  • The Life of a Packet
  • The Cisco Troubleshooting Methodology
  • Cisco Router and Switch Basics
  • Cisco Device Management
  • Routing Fundamentals
  • Dynamic Routing Protocols
  • Connectivity Troubleshooting
  • RIP Routing Information Protocol
  • EIGRP Enhanced Interior Gateway Routing Protocol
  • OSPF Open Shortest Path First
  • VLANs and Inter-VLAN Routing
  • DHCP Dynamic Host Configuration Protocol
  • HSRP Hot Standby Router Protocol
  • STP Spanning Tree Protocol
  • EtherChannel
  • Port Security
  • ACL Access Control Lists
  • NAT Network Address Translation
  • IPv6 Addressing
  • IPv6 Routing
  • WAN Wide Area Networks
  • BGP Border Gateway Protocol
  • Cisco Device Security
  • Network Device Management

The different lab exercises help you explore Cisco IOS operating system Command Line Interface (CLI) navigation. Each has a guided walkthrough of the IOS command line interface and exercises that will familiarise you with Cisco IOS configuration. The labs are presented in two parts – first the lab exercise and then the detailed answer key.

Neil wanted the guide to be completely free and as simple to use as possible so it uses the free software GNS3 and Packet Tracer for all the exercises. GNS3 is the best software for routing labs while Packet Tracer is the best for switching labs.

The downloadable start-up files load in either GNS3 or Packet Tracer so you can get up and running with the labs immediately. But if you have your own physical lab, you can refer to the topology diagrams and use them as instructions on cabling it up.

The guide also contains troubleshooting tips that will further expand your networking knowledge. These are explained in a logical manner to give you a systematic way of troubleshooting issues as they arise.

You can download the guide for free at https://www.flackbox.com/cisco-ccna-lab-guide to take your networking skills up a notch and further your career.

Forward Thinkers, Forward Networks.

Maintenance windows. Let’s be honest, they suck. If you ask any network admin, they will likely tell you the midnight maintenance windows are their least favorite part of the job. They are a necessity due to the very nature of what we do: build, operate, and maintain large, complex networks, where any change can have far-reaching, and often unpredictable, impact. Impact to production systems that we must avoid whenever possible. So, we schedule downtime and amp up our caffeine intake for an evening of changes and testing whatever we may have broken.

No matter how meticulous you are in your planning, no matter how well you know the subtle intricacies of your environment, something, somewhere is going to go wrong. Even if you are one of the lucky few to have a lab environment in which to test changes, it’s often not even close to the scale of your actual network.

But, what if you had a completely accurate, full-scale model of your network, and could test those changes without having to risk your production network? A break/fix playground that would allow you to vet any changes you needed to make, which would in turn, allow you the peace of mind of shorter, smoother maintenance windows, or perhaps (GASP!) no maintenance windows at all?

Go ahead, break it.

That’s what Forward Networks’ co-founders David Erickson and Brandon Heller want you to do within their Forward Platform, as they bring about a new product category they call Network Assurance:

“Reducing the complexity of networks while eliminating the human error, misconfiguration, and policy violations that lead to outages.”

At Network Field Day 13, only a few days after Forward Networks came out of stealth, we had the privilege of hearing, for the first time, exactly who and what Forward Networks was, and how their product would “accelerate an industry-wide transition toward networks with greater flexibility, agility, and automation, driven by a new generation of network control software.”

David Erickson, CEO and co-founder, spoke to how they have recognized that modern networks are complex, made up of hundreds if not thousands of devices, are often heterogeneous, and can contain millions of lines of configuration, rules, and policy. The tools we have to manage these networks are outdated (ping, traceroute, SNMP, etc.) and the time spent as a network admin going through the configuration of these devices looking for problems is overwhelming at times. As a result, a significant portion of outages in today’s networks are caused by simple human error, which has far-reaching impact to business, and brand.

This is not a simulation or emulated model of your network, but a full-scale replica, in software, that you can use to review, verify and test against, without risk to production systems. They claim their algorithm traces through every port in your network to determine where every possible packet could go within the network as it is presently configured. The “all packet”.
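
To give a feel for what computing “where every possible packet could go” involves, here is a drastically simplified sketch: model each device’s forwarding table and walk longest-prefix-match forwarding hop by hop. Forward’s real model covers ACLs, NAT, L2, and every header field; this only shows the flavor of the computation:

```python
# Drastically simplified reachability trace over per-device forwarding tables.
from ipaddress import ip_address, ip_network

FIB = {
    "edge":  [(ip_network("0.0.0.0/0"), "core")],
    "core":  [(ip_network("10.1.0.0/16"), "dist1"),
              (ip_network("0.0.0.0/0"), "edge")],
    "dist1": [(ip_network("10.1.1.0/24"), "web-servers")],
}

def trace(device: str, dst: str, path: tuple = ()) -> tuple:
    if device in path:                   # forwarding loop detected
        return path + (device, "LOOP")
    if device not in FIB:                # reached a leaf segment
        return path + (device,)
    for prefix, next_hop in sorted(FIB[device], key=lambda r: r[0].prefixlen,
                                   reverse=True):
        if ip_address(dst) in prefix:    # longest prefix match wins
            return trace(next_hop, dst, path + (device,))
    return path + (device, "DROPPED")

print(trace("edge", "10.1.1.10"))  # ('edge', 'core', 'dist1', 'web-servers')
```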

Applications

The three applications that were demonstrated for us were Search, Verify, and Predict.

Search – think “Google” for your network. Search devices and behavior within an interactive topology.

Verify – See if your network is doing what you think it should be doing. All policy is applied with some intent; is your intent being met?

Predict – When you identify the need for a change, how can you be sure the change you make will work? How do you know that change won’t break something else? Test your proposed changes against the copy of your network and see exactly what the impacts will be.

Forward Search

Brandon Heller offered an in-depth demo of these tools, beginning with Search. Looking at a visual overview of the demo network, he was able to query in very simple terms for specific traffic, in this case traffic from the Internet to his web servers. In a split second, Search zoomed in on a subset of the network topology, showing exactly where this traffic would flow. Diving further into the results, each device would then show the rules or configuration that allowed this traffic across the device in an intuitive step-through menu that traced the specified path through the entire network, and highlighted the relevant configuration or code.

This was all done in a few seconds, on a heterogeneous topology of Juniper, Arista, and Cisco devices.

Normally, tracing the path through the network would require a network admin, with knowledge of each of those vendors, to manually test with tools like ping and traceroute, and also comb through each configuration device-by-device along the path he or she thought was the correct one, in order to verify the traffic was flowing properly.

The response time on the queries was snappy, and Brandon explained this was due to the fact that, like a search engine, everything about the network was indexed ahead of time, making queries almost instantaneous.

Forward Verify

It’s one thing to understand how your network should behave, and another to be able to test and confirm this behavior. Forward Verify has two ways of doing this. The first is a library of predefined checks that identify common configuration errors. Things like duplex consistency: fairly common, yet easy-to-miss configuration errors.

The second is with network-specific policy checks. Here once again, a simple, intuitive query verified that bidirectional traffic to and from the Internet could reach the web servers via HTTP and SSH.

When there is a failure, a link is provided which allows you to drill down into the pertinent devices and their configuration and see where your policy check is failing.

Forward Predict

When a problem is identified or a change to the network configuration is necessary, Forward Predict is the final tool in the suite, and in my opinion, the most important one, as it allows you to test a change against your modeled network to see what impact it will have. This is huge, as typically changes are planned, implemented and then tested in a production environment in a change or maintenance window.

Forward Predict, while it may not eliminate the need for proper planning and implementation, allows you to build and test configuration changes in what is essentially a fully duplicated sandbox model of your exact environment. This is going to make those change windows a lot less painful as you already know what the outcome will be, rather than troubleshooting problems that weren’t anticipated when the changes were planned.

Moving “Forward”

A common sentiment among NFD delegates during this presentation was that Forward Networks’ product did some amazing things, however we wondered if there was an opportunity here to move this product one step further and have it actually implement or make the changes to the network, after the changes have been vetted by Forward Predict.

Forward Adjust, perhaps?

Understandably, this is going to involve a lot of testing, especially in light of the fact that Forward is completely vendor-neutral and touts the ability to work with complex, mixed environments. Making changes in those types of environments adds a lot of responsibility to this platform, and with that comes risk. Risk that most engineers might be a little skeptical to entrust to a single platform.

Time will tell, and I look forward to hearing more about Forward Networks’ development over the upcoming months, and seeing where the Network Assurance platform takes us.

Check out the entire presentation over at Tech Field Day, including a fantastic demonstration from Behram Mistree on how Forward Verify can help mitigate and diagnose outages in complex, highly resilient networks.

 

 

Hyperscale Networking for the Masses

In my career I’ve typically been responsible for plumbing together branch, campus, and (very) small enterprise networks whose datacenters were defined by single-digit rack numbers. So, when I’m reading or watching news about datacenter networking I often have a difficult time putting it into perspective, especially when the topic is focused on warehouse- and football-field-sized datacenters. This might explain why I have not spent a lot of time working with or learning about Software Defined Networking, because it seems to me that SDN is a solution to a problem of scale, and scale isn’t something I’ve had to deal with.

As networks grow, management of configuration and policy eventually becomes ungainly and increasingly difficult to keep consistent. Having to log into 100, 200, even 1000 devices to make a change is cumbersome, and so we as networkers seek to automate this process in some way. There have been applications and tools developed over the years that leverage existing management protocols like SNMP and others to provide a single-pane view to managing changes to your network, but once again these don’t scale to the size and scope that we’re talking about with SDN.

Taken to the extreme, SDN and Open Networking have allowed companies like Facebook and Google to actually define and design their own data center infrastructure, using merchant silicon. The argument here being that Moore’s Law is coming to an end. Commodity hardware is catching up to or has caught up to custom built silicon and the premium that many were willing to pay for these custom ASICs is no longer required in order to stay on the cutting edge of data networking.

Amin Vahdat, Fellow and Technical Lead for Networking at Google, spoke about this at the Open Networking Summit earlier this year, and contributed to a paper on Google’s Datacenter Network for SIGCOMM ’15. In both presentations, Amin outlines how Google has, over the course of the last 7-8 years, achieved 1.3 Pbps of bisection bandwidth in their current datacenters with their home-grown Jupiter platform. I would encourage you to check out both the video and the paper to learn more.

ONS 2015 Keynote – Amin Vahdat
Jupiter Rising: A Decade of Clos Topologies and Centralized Control in Google’s Datacenter Network

This application of SDN is dramatic. Few organizations have the ability, or need, to develop their own SDN solution for their own use. So how can this same scale-out model be applied to these other, smaller datacenters?

Recently I was invited to attend Networking Field Day 10 in San Jose, and we had an opportunity to visit Big Switch Networks. Rob Sherwood, CTO for Big Switch, spoke about some of the same principles around SDN, citing the Facebook and Google examples, and explained that there was “a tacit assertion that the incumbent vendors, the products that they build do not work for companies at this scale.”

Their solution? Big Cloud Fabric, designed to offer hyperscale-style networking to any enterprise. It is designed around the same three principles seen in Google’s infrastructure:

1) Merchant Silicon
2) Centralized Control
3) Clos Topology

Operating on 1U white-box/bare-metal switches running Switch Light OS, the leaf-spine topology is managed through the Big Cloud Fabric Controller. Several deployment options exist, including integration with OpenStack and VMware, and based on the current Broadcom chip being used, it can scale out to up to 16 racks of compute resources per controller pair. Even if you only have half a dozen racks today, BCF provides scalability, and economy.
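
The Clos/leaf-spine principle is what makes that scale-out claim concrete: every leaf connects to every spine, so capacity grows by adding switches rather than replacing them. A trivial sketch:

```python
# Why leaf-spine scales predictably: every leaf links to every spine, so any
# two servers are at most leaf -> spine -> leaf apart, and adding a spine adds
# uniform bandwidth to every rack.
def leaf_spine_links(leaves: int, spines: int) -> list[tuple[str, str]]:
    return [(f"leaf{l}", f"spine{s}")
            for l in range(1, leaves + 1) for s in range(1, spines + 1)]

links = leaf_spine_links(leaves=16, spines=4)  # e.g. one leaf per rack
print(len(links))  # 64 uniform links; add a fifth spine and every rack gains capacity
```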

You can watch Rob’s presentation on BCF here:

One of the other things Big Switch Networks has done is launch Big Switch Labs, which provides an opportunity to test drive their products and, for those of us who don’t work in large(ish) datacenters, a venue for getting your hands on a true SDN product in a lab environment. It’s a great way to gain insight into some of the problems SDN is aimed at solving and provides a fantastic demonstration of some of the capabilities and scalability that Big Cloud Fabric can offer.


If you’re just getting your feet wet with SDN, and/or Open Networking and want a brain-melting crash course on how it operates and scales in some of the world’s largest, most powerful datacenters, give Big Switch Labs a test drive. Big Cloud Fabric provides datacenter management and control modeled around the same principles as other massive hyperscale fabrics, but designed to be “within reach” for today’s enterprise customers and their own datacenter workloads.

On Pretentiousness

At the 2015 Cisco Live Welcome Keynote, I was fortunate enough to tag along with some other Cisco Champions and Social Media folks who were provided advance seating to the event. This gave us an opportunity to see some of the behind-the-scenes last-minute preparation that goes into the presentation and experience the production from a different perspective than we would traditionally experience as members of the general audience.

During this time a number of us were using Twitter and Periscope to share the experience and provide sounds and images from within the room using the #CLUS hashtag. One such photo included a panoramic view of the stage from the perspective of our seating area, which gave a great view of the stage and the entire production area.

This photo received a response that began a very unfortunate exchange on Twitter:

“Is this what pretentiousness looks like?”

The exchange degraded into vulgar personal attacks and references to genitalia, and the person who initiated it with the comment above has since deleted their Twitter account. Not surprising, considering how far south this conversation went.

Some Clarity

For those who missed it, essentially we (those who had been granted advance access to the keynote) were accused of “showing off” to the rest of Twitter. This accusation came from a former member of the Cisco Champions team, and someone who I had personally been following very early on since joining Twitter, and who had been, at least until this incident, a respected member of the social media community.

I can’t speak to the motivation behind the comments made, but I can say with certainty that nothing we were sharing with the rest of the community was or has ever been meant as bragging, or showing off. As part of the Cisco Champions team, or any other Social Media group, the intent is to share and provide insight to the community as a whole. It serves to involve as many people as possible in an inclusive manner, not as an exclusive, pretentious group.

Evidence of this is clear in that the group of people gathering and socializing at the Twitter Lounge and Social Media Hub over the years has grown exponentially. And let’s face it, many of us in this industry are fairly introverted, and if there were some underlying sense of cliquishness or exclusivity, we wouldn’t be welcoming new faces to the events year after year.

Ultimately whether you are a member of Cisco Champions, VMware vExpert, Microsoft MVP, EMC Elect or any other similar group, the goal is engagement rather than exclusion. These ladies and gentlemen are there to participate in, and grow, engagement with the community at large.

Now, this is a definite give and take relationship and there is some work involved – as a member of one of these groups you are going to spend some of your personal time engaged and involved in the community whether it is through blogging, webinars, podcasts, etc. and the reward or benefit from this is perhaps some exclusive access whether it’s VIP seating at an event, or a sneak-preview of a new product release or product updates.

Let’s call these what they are, perks. It’s a fair trade for the effort involved in creating content, but it is not there to cause any kind of divide in the community, but rather to highlight the benefits of becoming more involved.

Final Thoughts

If you can’t say something nice…

Electronic communication, whether it be email, text, Twitter, etc. all tend to distance the creator from their audience. It’s well-known on the Internet that many people have a strong sense of anonymity and thus the “keyboard warriors” are born, those who feel they can say whatever they want to whomever they want without fear of repercussion or reprisal.

Sometimes this feeling carries over to a medium in which you aren’t entirely anonymous, and whatever you say is going to be a part of your online resume or footprint, and could have lasting effects in the long-term.

I believe the source of these comments understands this, and this is at least part of the reason these comments were removed and ultimately their Twitter account was deleted.

It’s also evidence that they don’t have the integrity to stand by their comments.

For those of us who continue to participate in events like Cisco Live as members of the larger Social Media community, I believe we will continue to share and engage those around us by sharing content and insight. If you see something that makes you stop and say “I’d like to be part of that” then by all means, join us.

Looking for the next opportunity!

As some of you already know, I’ve recently become a free agent and have begun the search for my next great job. I’ve learned a lot about the “brave new world” of job hunting over the last couple of weeks, and to be honest it’s been a bit scary.

I’ve only had two employers over the past 18 years, and in both cases I was laid off due to staff reductions. I jokingly told my wife that someday I’d like to experience what it is like to actually quit a job, rather than having a job quit me. After leaving my role as a Sr. Business Manager with Convergys (a large contact center organization) in June of 2007, it had been 10 years since I had applied for and interviewed for a job, and I found the idea of re-writing my resume and hitting the pavement to find a new career rather daunting. In October that same year I was contacted by someone I had worked with previously who was now working in the HR department of a public school division. She explained their IT department needed some temporary help for about 3 months, and although my role when we worked together had been in Operations, she knew I had technical skills, and wanted to know if I was interested. I accepted, thinking the work would pay some bills in the short-term while I continued to tweak my resume and find full-time employment. Instead, I re-kindled my passion for hands-on technical work, and ended up accepting a permanent position in January 2008, and worked there until April of this year.

That was my first taste of social networking and finding a job.

8 years later it seems leveraging the power of social media and professional networks is the absolute best way to find that new role. The general consensus seems to be that sending your resume out electronically to a bunch of automated HR systems, or submitting your CV and cover letter through a web form is not going to get you that position you wanted. It is frightening to read articles on the subject of modern recruiting explaining how automated software scans and scores your resume and rejects it before a real human being ever reads it. How prevalent that actually is I don’t know, but I do know that when I send a resume via email, I often envision it being packed away in a warehouse and forgotten like the Ark of The Covenant in Raiders of the Lost Ark.

Decisions…decisions

I have a few decisions to make. The first and foremost seems to be deciding what I want to do next. In 1997 I began doing technical support in a call center and fast-forward 10 years later I had relocated twice, and been promoted through various roles within that same organization with experience in training, client services, project management, and operations management. I had managed multi-million dollar budgets, with staff and operations spanning multiple cities in Canada and the US. I had developed my business skills, and although each of the projects I had worked on over the years were technical in nature, I had not really been hands-on with technology in some time. I knew I wanted to get back to that.

As a Systems Analyst with my most recent employer, a K-12 public school division, I had been able to spend the last 7+ years “doing IT” again. I’ve focused on networking and virtualization, and even knocked out a few certifications. The technology is what I am truly passionate about and being in a position to learn something new every day was fantastic. While it wasn’t a large infrastructure, I’ve had exposure and developed skills and experience with Cisco, HP, Dell, Microsoft, VMware, NetApp, Fortinet, and a number of other technologies. It was truly a great experience to work in a small IT shop and have access into a little bit of everything.

Somewhat parallel to this I decided to combine my business knowledge and my IT skills and started my own business 2 years ago offering managed IT services to small businesses that can’t afford their own dedicated IT staff. I’m able to partner with them and understand both the key issues that drive their business, while assisting them meet any technology needs they have. The possibility of growing the business is there, but with a family, and my wife presently on maternity leave, there is something to be said for the comfort and security of full-time employment. Mainly the steady income and benefits.

I could perhaps work for a vendor, doing pre-sales or post-sales support, and really get to know one particular technology. I could work for a reseller, which might provide exposure to a larger variety of products. Or, I could join another IT team, but if I did it would have to be a significantly larger organization. I want to experience work in a real data center; no more two-rack switch closets with a portable AC unit serving as one.

My “dream job” would probably be working somewhere with responsibility for a decent-sized VMware cluster, maybe on Cisco UCS or another converged/hyper-converged platform, and management of the underlying L2/L3 network infrastructure.

Wherever I go, I want to be able to make a real contribution and continue to develop myself as an IT professional. I want to ask dumb questions and learn from others and I want to be part of a great team.

Ongoing Learning

In my previous role there was some opportunity for on-the-job learning, but very little time or budget was set aside for real professional development. The reality is, in a public education environment, budgets seem to dwindle year after year and there is constant juggling between departments as to where the dollars are needed the most. Funding someone to take a $3000 course at Global Knowledge was out of the question.

That being said, I believe ongoing learning is critical, and found ways on my own to learn, play with, study and prepare for certifications. I’ve developed a fairly decent home lab, without raising too many red flags with my wife in terms of our household budget, and have been able to prepare for and pass a number of certifications over the last several years.

I’m in the process of wrapping up my CCNP R&S with one exam left (ROUTE), and completed the VCP5-DCV in December. I’ll likely focus on learning more about VMware’s NSX product and perhaps look at writing the VCP6-NV exam along with upgrading my VCP5-DCV to the VCP6 version.

Long term, I plan to dedicate myself to the challenge of the CCIE.

Success stories

I’ve read and been inspired by a couple of other folks in the industry who have used social media as a platform or jumping-off point to find their new career, and although I certainly don’t have the same sphere of influence these people have, I’m going to try to do the same. Hat tip to Keith Townsend for sharing his story over at VirtualizedGeek.com and also to Sean Thulin whose journey is told on his blog at Thulin’ Around and congratulations to both of them on their new roles.

Now, do I expect my dream job to simply fall into my lap? Of course not. I’ll be engaged in some of the more traditional methods of searching online and reaching out directly to a handful of contacts who may know of some unlisted positions. First of all however, I’ll need to tweak my resume to fool those pesky HR screening tools!

So, if you or someone you know are aware of an opportunity for a skilled, loyal (2 jobs in 18 years!) networking and virtualization professional, or simply would like to learn a little bit more about me, feel free to reach me here, or on Twitter or LinkedIn. I’d love to hear from you!

Otherwise feel free to share, retweet, or carrier pigeon this article and help me cast the net as far and wide as possible.

Two Out of Three Ain’t Bad?

The last several months have been quite a blur. My wife and I were expecting the arrival of our second child in April so way back in October 2014 I decided to spend the last few months of relative freedom catching up on some studying, in the hopes that I could knock out a few exams before some deadlines passed.

I had two goals. The first was to complete my CCNP certification, as Cisco had announced the end of the current track effective January 30th, 2015. I had started and stopped studying for ROUTE so many times I was beginning to wonder if I was ever going to actually finish it. I had already passed SWITCH, and I, like many others, was saving TSHOOT for last.

The second goal was to attempt the VCP5-DCV exam. I had taken the VMware vSphere: Install, Configure, Manage course early in 2014 and had a voucher for 85% off the exam, but it had to be used by the end of 2014. I didn’t think I was prepared for it, but why waste an 85% discount? I decided to at least get a peek at the exam and gauge where I needed to focus in order to pass when I took a “real” shot at it.

My Nemesis – ROUTE

I’ve never failed a Cisco exam more than once. Each time I’ve failed an exam I’ve taken a little time to regroup, and then focus right back on the areas I was deficient in, scheduled a re-take and passed. With ROUTE, this was not the case. I had failed it previously twice, both as my free exam at Cisco Live. Maybe it was the environment, staying in a hotel, lack of sleep, or the fact that it was “free” and something in my subconscious didn’t take it seriously, but for whatever reason I had not been able to massage a passing score out of this particular exam.

Now, my exposure to a lot of the L3 subjects has been limited, in that my day job had very little routing other than some static routes between sites and our ISP, so I had my work cut out for me starting all over again and learning OSPF, EIGRP, and BGP from scratch.

I dedicated myself beginning in October to studying for this exam. I was going to pass it if it killed me. I had Wendell Odom’s CCNP ROUTE 642-902 Official Certification Guide, I had video training from Pluralsight, INE, and CBT Nuggets, I had the Boson practice exams, I had physical lab gear, I had virtual lab gear. This was it, I was going to pass.

Not So Fast…

December came a lot quicker than I had anticipated. You see I was fighting with two deadlines, the expiration of my VCP exam voucher at the end of December, and the end of the current CCNP track of exams. I had hoped to pass ROUTE by mid-December and then take a run at the VCP exam, knowing it was just a trial run, and then finish off TSHOOT sometime in January.

By mid-December I felt I wasn’t ready for ROUTE yet, and my studying was getting more and more difficult as I read and re-read certain chapters and concepts that I just didn’t seem to grasp very well. It was time to take a break.

So, I scheduled the VCP5-DCV exam for December 29th and spent a couple of weeks re-reading Mastering VMware vSphere 5.5 by Scott Lowe and Nick Marshall, playing around in my VMware home lab, and testing myself with the MeasureUp practice exams.

By the time the 29th rolled around I actually felt pretty good. I mean, I didn’t expect to pass, but I thought maybe if the exam gods were in the spirit of the holidays, I might have a shot…

And I passed!

Back to ROUTE

Passing the VCP gave me a boost and so I re-focused on the ROUTE exam with a scheduled exam on January 16th. When exam day rolled around I felt I had a good shot at passing. The usual light nervousness hit me as I sat down at the PC and began to read through the usual Cisco exam agreement, but I focused and started the exam.

Well, I failed, and not by much. I was devastated. I had felt so prepared, but some of the simulations just caught me off guard for some reason. Back in my car I scrambled to recall areas that I needed to re-focus on and take notes, but I was seriously considering walking away from this exam for a while.

With the encouragement of a number of friends and peers on social media, I decided to at least take a run at TSHOOT before the end of January. This would at least mean I had 2 of the 3 exams under my belt and I could re-focus on the new ROUTE exam in February.

TSHOOT

I scheduled TSHOOT for January 24th, and just in case, re-scheduled ROUTE for January 29th. Knowing I could cancel up to 48 hours in advance, if I didn’t pass TSHOOT I wasn’t going to take another run at ROUTE.

I didn’t study much for TSHOOT to be honest. I’ve heard from many people it’s the type of exam you can either do, or you can’t. If you understand the L2/L3 technologies behind the topology (freely published and available from Cisco) then it all comes down to whether or not you can troubleshoot in an orderly, systematic way that eliminates possible problems, and identifies the root cause of the issue.

I did run through some of the tickets in the Boson TSHOOT practice exams, more or less to get comfortable with the format. I also did a bit of review on the “dry” subjects that would likely be part of the multiple choice questions that focused on methodologies like ITIL, etc.

When I sat the exam on the 24th I didn’t think I could feel any more relaxed. The way the exam is formatted, you pretty much know if you got the ticket right or not, so by the end of the exam I was expecting to see a perfect score.

It wasn’t perfect, but it was about as close to perfect as you can get. I think I may have gotten one of the five multiple choice questions wrong, but seeing a score that high was confirmation at least that I did in fact have the skills necessary to continue with this career path. I had been pretty discouraged after failing ROUTE yet again, but this gave me the boost I needed to take another run at it.

Re-ROUTE

I didn’t see much of my family between the 24th and 29th, I was so focused on reviewing the areas I needed to improve to pass ROUTE. I felt really good going into the exam center on the 29th.

So good in fact that I think I got over-confident. I had some repeat questions and simulations from my previous attempt and when faced with those I had the attitude “Oh yeah, I know this” and didn’t spend enough time really making sure I was answering the question correctly. I got through the exam way too quickly but 100% expected to see a passing score.

Nope.

And it was really close, too.

Looking on the Bright Side

I passed two out of three exams in a 4 month period, ending up 2/3 of the way to completing my CCNP and adding the VCP5-DCV to my list of accomplishments. I think I’m okay with that.

I’ve already purchased the new Official Certification Guide for the new 300-101 ROUTE exam, along with some practice exams, and although there are some new topics on the exam I don’t think it will be all that different from the old exam.

Two goals for this year will be to complete the CCNP and then I would like to focus on VMware’s NSX product and perhaps write the VCP-NV exam. I’ll also have to think about upgrading my VCP certification to version 6 sometime.

Certifications aren’t easy, as anyone who has ever taken one will tell you. You have to be able to take a failure and learn from it, and not get too discouraged. I know I’ll pass ROUTE, I’m stubborn that way.