Defending ESnet with ZoMbis!

Zeek is a powerful open source network security monitoring software extensively used by ESnet. Zeek (formally called Bro) was initially developed by researchers at Berkeley Lab; it allows users to identify & manage cyber threats by tracking and logging network traffic activity. Zeek operates as a passive monitor, providing a holistic view of what is transpiring in the network and on all network traffic. 

In a previous post, I presented some of our efforts in approaching the WAN security using Zeek for general network monitoring, with successes and challenges found during the process. In this blog post I’ll focus on our efforts in using Zeek as  part of security monitoring for the ESnet6 management network – ZoMbis (Zeek on Management based information system).

ZoMbis on the ESnet6 management network:

Most research and educational networks employ a dedicated management network as a best practice. The management network provides a configuration command and control layer, as well as conduits for all of the inter-routing communications between the devices used to move our critical customer data. Because of the sensitive nature of these communications, the management network needs to be protected from external and general user network traffic (websites, file transfers, etc.), and our staff needs to have detailed visibility on management network activity.

At ESnet, we typically use real IP addresses for all internal network resources, and our management network is allocated a fairly large address space block advertised in our global routing table, to help protect against opportunistic hijacking attacks. By isolating our management network from user data streams, the amount of routine background noise is vastly reduced making the use of Zeek, or any network monitoring security capability, much more effective. 

The above diagram shows an overview of the deployment strategy of Zeek on the ESnet6 management network. The blue dots in the diagram show the locations that will have equipment running Zeek instances for monitoring the network traffic on the management network. The traffic from the routers on those locations is mirrored to the Zeek instances using a spanning port, and the Zeek logs generated are then aggregated in our central security information and event logging and management system (SIEM).

Scott Campbell presented ‘Using Zeek in ESnet6 Management Network Security Monitoring’ during virtual Zeek Week held last year that explained the overall strategy for deployment of Zeek on the management network in greater detail. Some ZoMbis deployment highlights are:

ESnet 6’s new management network will use only IPv6. From a monitoring perspective this change from the traditional IPv4 poses a number of interesting challenges; In particular, IPv6 traffic employs more multicast and link-local traffic for local subnet communications. Accordingly, we are in the process of adjusting and adding to Zeek’s policy based detection scripts to support these changes in network patterns. These new enhancements and custom scripts being written by our cybersecurity team to support IPv6 will be of interest to other Zeek users and we will release them to the entire Zeek community soon. 

The set of Zeek policy created for this project can be broken out into two general groups. The first of these is protocol mechanics – particularly looking closer between layer 2 and 3 where there are a number of interesting security behaviors with IPv6.  A subset of notices that these protocol mechanic policies will provide are:

  • ICMP6_RtrSol_NotMulticat – Router solicitation not multicast
  • ICMP6_RtrAnn_NotMulticat – Router announcement should be a multicast request
  • ICMP6_RtrAnn_NewMac – Router announcement from an unknown MAC
  • ICMP6_MacIPChange – If the MAC <-> IP mapping changes
  • ICMP6_NbrAdv_NotRouter – Advertisement comes from non-router
  • ICMP6_NbrAdv_UnSolicit – Advertisement is not solicited
  • ICMP6_NbrAdv_OverRide – Advertisement without override
  • ICMP6_NbrAdv_NoRequest – Advertisement without known request

The second set of Zeek policies that have been developed in support for ZoMbis involves taking advantage of predictable management network behavioral patterns – we build policy to model anticipated behaviors and let us know if something is amiss. For example looking at DNS and NTP behavior we can identify unexpected hosts and data volumes, since we know which systems are supposed to be communicating with one another, and what patterns traffic between these components should follow.

Stay tuned for the part II of this blogpost, where I will discuss ways of using Sinkholing, together with ZoMbis, to provide better understanding and visibility of unwanted traffic upon the management network.

Zeek and stream asymmetry research at ESnet

In my previous post, we discussed use of the open-source Zeek software to support network security monitoring at ESnet.  In this post, I’ll talk a little about work underway to improve Zeek’s ability to support network traffic monitoring when faced with stream asymmetry.

This comes from recent work by two of my colleagues on the ESnet Security team.

Scott Campbell and Sam Oehlert presented ‘Running Zeek on the WAN: Experiences and solutions for large scale flow asymmetry’ during a workshop held last year at CERN Geneva that explained the phases and deployment of the Zeek-on-the-WAN (ZoW) pilot in detail.

Scott Campbell at CERN presenting ‘Running Zeek on the WAN’
The asymmetry problem on a WAN (example)

Some of the significant findings and results from this presentation are highlighted below:

  • Phase I: Initial Zeek Node Design Considerations 
    • Select locations that provide an interesting network vantage point – in the case of our ESnet network, we deployed Zeek nodes on our commodity internet peerings (eqx-sj, eqx-chi, eqx-ash) since they represent the interface to the vast majority of hostile traffic.
    • Identifying easy traffic to test with and using spanning ports to forward traffic destined to the stub network on each of the routers used for collection.
  • Phase I: Initial Lessons learned from testing and results
    • Some misconfigurations were found in the ACL prefix lists. 
    • We increased visibility into our WAN side traffic through implementation of new background methods.
    • Establishing a new process for end-to-end testing, installing and verifying Zeek system reporting. 
  • Phase II:  Prove there is more useful data to be seen
    • For phase II we moved towards collection of full peer connection records, from statistical sampling based techniques. Started running Zeek on traffic crossing the interfaces which connect ESnet network peers to the internet from the AS (Autonomous system) responsible for most notices. .
    • To get high fidelity connection information without being crushed by data volume, define a subset of packets that are interesting – zero length control packets (Syn/Syn-Ack/Fin/Rst) from peerings.
  • Phase II: Results
    • A lot of interesting activity got discovered like information leakage in syslogs, logins (and attempted logins) using poorly secure authentication protocols, and analysis of the amount of asymmetric traffic patterns gave valuable insights to understand better the asymmetric traffic problems.
  • Ongoing Phase III: Expanding the reach of traffic collection on WAN
    • We are currently in the process of deploying Zeek nodes at another three WAN locations for monitoring commodity internet peering – PNWG (peering at Seattle WA), AM-SIX (peering at Amsterdam) and LOND (peering at London)
Locations for the ZoW systems, the pink shows ongoing Phase III deployment

As our use of Zeek on the WAN side of ESnet continues to grow, the next phase to the ZoW pilot is currently being defined.  We’re working to incorporate these lessons learned on how to handle traffic asymmetry into these next phases of effort. 

Some (not all) solutions being taken into consideration include: 

  • Aggregating traffic streams at a central location to make sense out of the asymmetric packet streams and then run Zeek on the aggregated traffic, or
  • Running Zeek on the individual asymmetric streams and then aggregating these Zeek streams @ 5-tuple which will be aggregation of connection metadata rather than the connection stream itself. 

We are currently exploring these WAN solutions as part of providing better solutions to both ESnet, and connected sites.

Zeekurity at ESnet

Zeek is an open source network security monitoring software extensively used by ESnet.  Zeek (formally called Bro) was initially developed by researchers at Berkeley Lab. Zeek allows users to identify & manage cyber threats by monitoring network traffic. It acts as a passive monitoring software (NSM – Network Security Monitor), that gives a holistic view of what is transpiring in the network and gives visibility into the network traffic. 

In order to better understand network behavior and provide flexible security services, we use Zeek as an important part of our data center security architecture and are experimenting with placing Zeek clusters on various WAN high value points. This is providing technical insights as well as significant challenges. 

In this post we would present some of our efforts in approaching the WAN security using Zeek for network monitoring, with successes and challenges hit during the process and interesting things learned.

Zeek on the ESnet LAN:

Monitoring local area and data center networks is a familiar and less complex network traffic monitoring design, and ESnet is no different. The traffic flowing through the LAN networks is currently monitored using two Zeek clusters, one at Brookhaven National Lab and another for the west coast at Berkeley Lab. We have implemented BHR (black hole routing) functionality on our data center routers to block external actors which violate our established policies based on Zeek detections on both IPv4 and IPv6 protocol stacks. 

Apart from network security monitoring using “standard” Zeek detection scripts, many enhancements and custom scripts written by the ESnet Security team members serve a vital role in detecting various kinds of suspicious activity. Recently, a Zeek package – Zeek-Known-outbound contributed by Michael “Dop” Dopheide won the first prize in the Zeek Package Contest-2 held in May 2020. The package provides the ability to track and alert on outbound service usage to a list of ‘watched’ countries, and also adds the country codes for the origin and recipient hosts in one of the log files that Zeek generates called conn.log, to log all the connection attempts seen on the network. The motivation behind this work came from the discovery of few systems contacting hosts in foreign countries for package updates, and DNS services found during routine log analysis. 

Zeek on the ESnet WAN:

To augment our LAN efforts on a wider scale, we have been experimenting with monitoring the network traffic on the WAN side of the network using Zeek in order to get more visibility and to provide improved security/network services. Most of this work is experimental: iterative design changes as we use what we learn from stage 1 to stage 3 and beyond.

  • Some notable differences and challenges from typical LAN network: 
    • Data Volume: There are a large number of WAN links that run at 1-400Gb/s
    • Data Encapsulation: Data with variable length headers is problematic, so we have been employing a load balancer to address this problem. 
    • Asymmetric Data Flows: This is a hard problem to solve, especially when the network is distributed across the country. When the packets corresponding inbound and outbound flows between two network nodes follow different paths, it can be challenging to reconcile conversation activities as part of network monitoring.
    • Technical Integration: Coordinating activities between teams distributed geographically  introduces challenges, which we are developing ways to overcome.

At ESnet we thrive to push the boundaries and try innovative ways to address challenges, Zeek on the WAN is an example of that and in my next article I will discuss some ways we have been experimenting with to address above noted complex problems and specifically going into details of the research been done in addressing Asymmetric Data Flows on WAN.