Thursday, January 31, 2013

NAT64 on a hairpin interface with cisco ios 15.X

Okay, if you followed my earlier post, which I put up just last night from my notes on previous NAT and ipv6 setups, you'll know it got me thinking:


Can we conduct NAT64 on a hairpin interface? Well, I found the answer after investing about 1hr of playing around. 1st, let's look at the design;


ipv6-lan = 2001:470:C021:1::/64
ipv4-lan = 1.1.1.0/24
NAT4_address = 5.0.0.5
NAT6_address = 2001:179:179::1 

Router =3825ISR
IOScode= ADVENTERPRISEK9-M Version 15.1(4)M4


Here's a graphical representation of the design;



The goal here was to do all of the NAT on a single interface. We are using a cisco 3825 for the NAT64. It's connected to a layer2 switch, with an ipv4 host and an ipv6 host on the same layer2 lan segment. The single gige interface will handle both ipv6 and ipv4 traffic.


The cfg ( 3825 gig0/0 intf ) ;

   
!

interface GigabitEthernet0/0
 description this is a single dual/stacked interface both ipv4/v6

 ip address 1.1.1.253 255.255.255.0

 ip flow ingress

 ip flow egress

 duplex auto

 speed auto

 media-type rj45

 ipv6 address 2001:470:C021:1::1/64

 ipv6 enable

 ipv6 nat

end


Okay, now let's look at the simple ipv6/ipv4 nat cfg.

1st, I had problems using a source-list and pool, so I reverted my cfg to a static nat for this example. I will continue to look into the issues with the source list and nat pool. Once I've figured that out, I will post an update in another post on my blog, so stay tuned.


Here are my nat rules. I left some of the old cfg in place, commented out, for your reference;

!
ipv6 nat translation icmp-timeout 5

ipv6 nat v4v6 source 1.1.1.2 2001:179:179::1
! the following line is one method I tried that failed, don't know why, but one packet was entering
!

! ipv6 nat v6v4 source list myv6 pool nat6 overload
!

ipv6 nat v6v4 source 2001:470:C021:1:21F:5BFF:FEEA:AFA 5.0.0.5
! the following line didn't work either during my testing
!

! ipv6 nat v6v4 pool nat6 10.0.0.2  10.0.0.2 prefix-length 30
!

ipv6 nat prefix 2001:179:179::/96
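
To make the two static rules easier to picture, here is a minimal Python sketch of the lookup they imply. It is only a model of the mappings in the cfg above, not how IOS implements them, and the names `V6V4_SOURCE`, `V4V6_SOURCE` and `translate_v6_to_v4` are made up for illustration.

```python
import ipaddress

# Static mappings copied from the cfg above.
# "ipv6 nat v6v4 source": the ipv6 host and the ipv4 address it appears as.
V6V4_SOURCE = {
    ipaddress.IPv6Address("2001:470:C021:1:21F:5BFF:FEEA:AFA"):
        ipaddress.IPv4Address("5.0.0.5"),
}
# "ipv6 nat v4v6 source": the ipv4 host and the ipv6 address it is reachable at.
V4V6_SOURCE = {
    ipaddress.IPv4Address("1.1.1.2"): ipaddress.IPv6Address("2001:179:179::1"),
}
# "ipv6 nat prefix": only destinations inside this /96 are translated.
NAT_PREFIX = ipaddress.IPv6Network("2001:179:179::/96")


def translate_v6_to_v4(src6, dst6):
    """Return the (src4, dst4) pair for an ipv6 packet, or None if no rule matches."""
    src6 = ipaddress.IPv6Address(src6)
    dst6 = ipaddress.IPv6Address(dst6)
    if dst6 not in NAT_PREFIX:
        return None
    # Reverse the v4v6 table so we can look up the ipv4 host behind dst6.
    v6_to_v4_dst = {v6: v4 for v4, v6 in V4V6_SOURCE.items()}
    if src6 not in V6V4_SOURCE or dst6 not in v6_to_v4_dst:
        return None
    return V6V4_SOURCE[src6], v6_to_v4_dst[dst6]


print(translate_v6_to_v4("2001:470:c021:1:21f:5bff:feea:afa", "2001:179:179::1"))
```

For the macbook talking to 2001:179:179::1, the sketch returns the pair 5.0.0.5 and 1.1.1.2, which is the same pairing the translation table shows further down.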


Okay, here's what the debug shows ( the lines below are in debug ip packet detail format, as captured on the ipv4 host );


Feb  1 01:37:30.943: IP: tableid=0, s=5.0.0.5 (FastEthernet0/0), d=1.1.1.2 (FastEthernet0/0), routed via RIB

*Feb  1 01:37:30.943: IP: s=5.0.0.5 (FastEthernet0/0), d=1.1.1.2 (FastEthernet0/0), len 36, rcvd 3

*Feb  1 01:37:30.947:     ICMP type=8, code=0

*Feb  1 01:37:30.947: IP: tableid=0, s=1.1.1.2 (local), d=5.0.0.5 (FastEthernet0/0), routed via FIB

*Feb  1 01:37:30.947: IP: s=1.1.1.2 (local), d=5.0.0.5 (FastEthernet0/0), len 36, sending

*Feb  1 01:37:30.947:     ICMP type=0, code=0

and our nat translation table as seen on the single NAT hairpin interface;

router3825#sh ipv6 nat tr ver

Prot  IPv4 source              IPv6 source

      IPv4 destination         IPv6 destination

---   ---                      ---

      1.1.1.2                  2001:179:179::1

      create 00:20:42, use 00:00:03,



tcp   5.0.0.5,61837            2001:470:C021:1:21F:5BFF:FEEA:AFA,61837

      1.1.1.2,22               2001:179:179::1,22

      create 00:00:03, use 00:00:00, left 23:59:59,



---   5.0.0.5                  2001:470:C021:1:21F:5BFF:FEEA:AFA

      ---                      ---

      create 00:01:07, use 00:00:03,


and our ipv6 nat statistics;

router3825>show ipv6 nat statistics
Total active translations: 4 (2 static, 2 dynamic; 2 extended)
NAT-PT interfaces:
  GigabitEthernet0/0

Hits: 10  Misses: 0
Expired translations: 36
router3825>


And here's the cisco 1841 that I configured as my ipv4 host for testing;

ccie02#show ip int fas 0/0  | i add

  Internet address is 1.1.1.2/24

  Broadcast address is 255.255.255.255

  Helper address is not set

  Network address translation is disabled

ccie02#

 
And when we finally had things working, we could ping and ssh from my macosx host ( ipv6 ) to the cisco ( ipv4 ), all on a hair-pinned interface.

Ken-Felixs-MacBook:~ root# ping6 2001:179:179::1
PING6(56=40+8+8 bytes) 2001:470:c021:1:21f:5bff:feea:afa --> 2001:179:179::1
Request timeout for icmp_seq=0
Request timeout for icmp_seq=1
16 bytes from 2001:179:179::1, icmp_seq=2 hlim=253 time=1.719 ms
16 bytes from 2001:179:179::1, icmp_seq=3 hlim=253 time=1.568 ms
Request timeout for icmp_seq=4
16 bytes from 2001:179:179::1, icmp_seq=5 hlim=253 time=1.579 ms
16 bytes from 2001:179:179::1, icmp_seq=6 hlim=253 time=1.617 ms
16 bytes from 2001:179:179::1, icmp_seq=7 hlim=253 time=1.599 ms
16 bytes from 2001:179:179::1, icmp_seq


And we validate on the cisco 1841 with the cli cmd show users, after executing a ssh -6 to this device from my macbook.
 

ccie02#sh user

    Line       User       Host(s)              Idle       Location

*  0 con 0                idle                 00:00:00

 194 vty 0     cisco      idle                 00:00:52 5.0.0.5



  Interface    User               Mode         Idle     Peer Address




So what this means: if you're on a lan segment that must handle ipv4 traffic, but you don't have another interface and don't want to enable a vlan sub-interface, you can hairpin on a dual-stacked cisco router interface.

So for example, say you have ipv6-only hosts and maybe an ipv4-only host ( i.e. a printer ) that you want to integrate into your existing layer2 segment, so that the ipv6 machines can use the printer. With nat64 on a hairpin, you can do this without wasting any physical or virtual interfaces.

I hope you find this posting useful in your ipv4-to-ipv6 migrations.

Ken Felix
Freelance Network & Security Engineer, specializing in ipv6 migration designs and planning

kfelix "@" hyperfeed  "dot" com
  
 



BGP messages-types

This post will show you some of the differences between the bgp message types. Using tshark/wireshark we can monitor the BGP messages.

Here's an UPDATE message. Notice how big it is, and how much information is within the message?


Border Gateway Protocol
    UPDATE Message
        Marker: 16 bytes
        Length: 70 bytes
        Type: UPDATE Message (2)
        Unfeasible routes length: 0 bytes
        Total path attribute length: 43 bytes
        Path attributes
            ORIGIN: EGP (4 bytes)
                Flags: 0x40 (Well-known, Transitive, Complete)
                    0... .... = Well-known
                    .1.. .... = Transitive
                    ..0. .... = Complete
                    ...0 .... = Regular length
                Type code: ORIGIN (1)
                Length: 1 byte
                Origin: EGP (1)
            AS_PATH: 29816 16967 7018 2914 9318 38402 (17 bytes)
                Flags: 0x40 (Well-known, Transitive, Complete)
                    0... .... = Well-known
                    .1.. .... = Transitive
                    ..0. .... = Complete
                    ...0 .... = Regular length
                Type code: AS_PATH (2)
                Length: 14 bytes
                AS path: 29816 16967 7018 2914 9318 38402
                    AS path segment: 29816 16967 7018 2914 9318 38402
                        Path segment type: AS_SEQUENCE (2)
                        Path segment length: 6 ASs
                        Path segment value: 29816 16967 7018 2914 9318 38402
            NEXT_HOP: 144.223.130.2 (7 bytes)
                Flags: 0x40 (Well-known, Transitive, Complete)
                    0... .... = Well-known
                    .1.. .... = Transitive
                    ..0. .... = Complete
                    ...0 .... = Regular length
                Type code: NEXT_HOP (3)
                Length: 4 bytes
                Next hop: 144.223.130.2 (144.223.130.2)
            COMMUNITIES: 16967:666 16967:1001 16967:7018 (15 bytes)
                Flags: 0xc0 (Optional, Transitive, Complete)
                    1... .... = Optional
                    .1.. .... = Transitive
                    ..0. .... = Complete
                    ...0 .... = Regular length
                Type code: COMMUNITIES (8)
                Length: 12 bytes
                Communities: 16967:666 16967:1001 16967:7018
                    Community: 16967:666
                        Community AS: 16967
                        Community value: 666
                    Community: 16967:1001
                        Community AS: 16967
                        Community value: 1001
                    Community: 16967:7018
                        Community AS: 16967
                        Community value: 7018
        Network layer reachability information: 4 bytes
            1.238.7.0/24
                NLRI prefix length: 24
                NLRI prefix: 1.238.7.0 (1.238.7.0)


Also note that the well-known mandatory attributes ( ORIGIN, AS_PATH, NEXT_HOP ) are present, along with the optional COMMUNITIES attribute and the NLRI itself.
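
As a quick sanity check on the sizes in that capture, the 70 byte total can be rebuilt from its parts. This is just arithmetic on the numbers shown above; the 19 byte header and the two 2-byte length fields are the standard BGP UPDATE layout, and the variable names are my own.

```python
BGP_HEADER_LEN = 19       # 16-byte marker + 2-byte length + 1-byte type
WITHDRAWN_LEN_FIELD = 2   # the "Unfeasible routes length" field itself
PATH_ATTR_LEN_FIELD = 2   # the "Total path attribute length" field itself

# Sizes as reported in the capture (each includes the attribute's own header).
path_attributes = {"ORIGIN": 4, "AS_PATH": 17, "NEXT_HOP": 7, "COMMUNITIES": 15}
withdrawn_routes = 0      # nothing withdrawn in this UPDATE
nlri = 4                  # 1 length byte + 3 prefix bytes for 1.238.7.0/24

attr_total = sum(path_attributes.values())
message_len = (BGP_HEADER_LEN + WITHDRAWN_LEN_FIELD + withdrawn_routes
               + PATH_ATTR_LEN_FIELD + attr_total + nlri)

print(attr_total, message_len)
```

The attribute sizes add up to the 43 bytes reported as the total path attribute length, and the whole message comes to the 70 bytes in the Length field.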


Here's a route-withdrawal message;

Border Gateway Protocol
    UPDATE Message
        Marker: 16 bytes
        Length: 27 bytes
        Type: UPDATE Message (2)
        Unfeasible routes length: 4 bytes
        Withdrawn routes:
            2.93.232.0/24
                Withdrawn route prefix length: 24
                Withdrawn prefix: 2.93.232.0 (2.93.232.0)
        Total path attribute length: 0 bytes
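
To see where the 27 bytes come from, here is a small Python sketch that builds the same kind of withdrawal by hand. It assumes the standard UPDATE layout ( header, withdrawn-routes length, withdrawn prefixes, path-attribute length ); the helper names `encode_prefix` and `build_withdraw` are made up for this example.

```python
import ipaddress
import struct


def encode_prefix(prefix):
    """Encode an ipv4 prefix as one length byte plus only the significant prefix bytes."""
    net = ipaddress.IPv4Network(prefix)
    nbytes = (net.prefixlen + 7) // 8
    return bytes([net.prefixlen]) + net.network_address.packed[:nbytes]


def build_withdraw(prefix):
    """Build an UPDATE that withdraws one prefix and carries no path attributes."""
    withdrawn = encode_prefix(prefix)
    body = (struct.pack("!H", len(withdrawn))   # unfeasible routes length
            + withdrawn                         # the withdrawn prefix itself
            + struct.pack("!H", 0))             # total path attribute length = 0
    header = b"\xff" * 16 + struct.pack("!HB", 19 + len(body), 2)  # type 2 = UPDATE
    return header + body


msg = build_withdraw("2.93.232.0/24")
print(len(msg), msg.hex())
```

A /24 only needs 3 prefix bytes plus its length byte, which is why the capture reports an unfeasible routes length of 4.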


and a KeepAlive;


Border Gateway Protocol
    KEEPALIVE Message
        Marker: 16 bytes
        Length: 19 bytes
        Type: KEEPALIVE Message (4)


Notice how simple and sweet these last 2 messages are? ( not too much involved in a KA )

Typically a full internet view will generate a lot of BGP message handling. Every change to the table increments the bgp table version, and a bgp speaker can stay busy handling path changes and updates.


Due to the above, we need to select router models with faster CPUs and gobs of memory in order to manage the bgp table.

For example, the ipv4 BGP table is way over 400K prefixes, as seen by this Hurricane Electric route-server;

( output truncated )

Neighbor        V    AS MsgRcvd MsgSent   TblVer  InQ OutQ Up/Down  State/PfxRcd
216.218.252.147 4  6939  202075     729        0    0    0 12:13:18   440036
216.218.252.148 4  6939       0       0        0    0    0 never    Active    
216.218.252.150 4  6939  181951     731        0    0    0 12:15:37   439869
216.218.252.151 4  6939  183015    1158        0    0    0 11:50:31   443386
216.218.252.153 4  6939  202289     727        0    0    0 12:09:02   440032
216.218.252.154 4  6939  239474     729        0    0    0 12:13:05   440039
216.218.252.155 4  6939       0       0        0    0    0 never    Active    
216.218.252.156 4  6939  207788     731        0    0    0 12:15:37   440034
216.218.252.157 4  6939  182187     883        0    0    0 12:01:02   439953
216.218.252.158 4  6939       0       0        0    0    0 never    Active    
216.218.252.159 4  6939  194688     832        0    0    0 11:56:47   440032




versus the ipv6 table, which is way under 20K;


Neighbor        V    AS MsgRcvd MsgSent   TblVer  InQ OutQ Up/Down  State/PfxRcd
2001:470:0:d::1 4  6939   29017    1272        0    0    0 11:46:48    11915
2001:470:0:e::1 4  6939   25214     732        0    0    0 12:16:39    11915
2001:470:0:12::1
                4  6939       0       0        0    0    0 never    Active    
2001:470:0:13::1
                4  6939   26106     732        0    0    0 12:16:37    11915
2001:470:0:16::1
                4  6939   24880     728        0    0    0 12:10:03    11915
2001:470:0:17::1
                4  6939   24497    1081        0    0    0 12:05:38    11915
2001:470:0:19::1
                4  6939   26330     730        0    0    0 12:14:29    11915
2001:470:0:1a::1
                4  6939   24978     729        0    0    0 12:13:30    11915
2001:470:0:1b::1



BGP supports the following message types;

  • Open = just that: we open a session, and here we pass the router's peer info ( AS number, hold time, router-id ) and capabilities
  • Update = route updates ( new or changed paths with their attributes such as communities and next hop, plus withdrawals )
  • Notification = only seen on errors or some other event that terminates the session
  • Route-Refresh = typically only seen when we reconfig a policy and during soft resets
  • KeepAlive = helps ensure the neighbors are alive ( cisco defaults to a 60sec keepalive interval )

KAs and UPDATEs are the normal messages seen on any BGP router once peering has been established. It's good to be aware of this, and of the fact that messages are always being sent back and forth.
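
Every one of these messages starts with the same 19 byte header that shows up in the captures above ( marker, length, type ). Here is a minimal Python sketch of reading that header; the type numbers are the standard ones ( UPDATE = 2 and KEEPALIVE = 4 match the captures ), and the function name `parse_header` is just my own.

```python
import struct

MESSAGE_TYPES = {
    1: "OPEN",
    2: "UPDATE",
    3: "NOTIFICATION",
    4: "KEEPALIVE",
    5: "ROUTE-REFRESH",
}


def parse_header(data):
    """Return (length, type name) from the first 19 bytes of a BGP message."""
    if len(data) < 19:
        raise ValueError("a BGP header is 19 bytes")
    marker, length, msg_type = struct.unpack("!16sHB", data[:19])
    if marker != b"\xff" * 16:
        raise ValueError("marker must be 16 bytes of 0xff")
    return length, MESSAGE_TYPES.get(msg_type, "UNKNOWN")


# A KEEPALIVE is nothing but the header: length 19, type 4.
keepalive = b"\xff" * 16 + struct.pack("!HB", 19, 4)
print(parse_header(keepalive))
```

That is the whole KA: no body at all, which is why it is the cheapest message on the session.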


Ken Felix
Freelance Network/Security Engineer
kfelix  hyperfeed com



BGP origin types ( a quick explanation )

If you have ever worked with BGP, you will have seen that the route information contains an origin of IGP, EGP, or INCOMPLETE. What does this mean?


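·       IGP = means the prefix is interior to the originating AS. On cisco, this is what you get when the prefix is injected into BGP with a network statement, whichever interior source ( OSPF/EIGRP/RIP/static ) put it in the routing table

·       EGP = means the prefix was learned via the original Exterior Gateway Protocol ( EGP ), the long-retired predecessor of BGP. It does not mean the route came from an eBGP peer

·       INCOMPLETE = means the prefix was learned by some other means, typically redistribution ( of a static, connected, or IGP route ) into BGP. An aggregate also inherits it when one of its component routes is incomplete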

Take a look at this prefix as seen on a route-server at host dot net ( telnet route-server.host.net );

route-server> show ip bgp 27.89.0.0/16

BGP routing table entry for 27.89.0.0/16

Paths: (1 available, best #1, table Default-IP-Routing-Table)

  Not advertised to any peer

  13645 19151 2516

    64.135.0.1 from 64.135.0.1 (64.135.0.1)

      Origin IGP, localpref 100, valid, external, best

      Community: 13645:3111

      Last update: Mon Jan 21 13:40:18 2013

 


The above shows the originating AS is 2516 and the origin is IGP.

Now look at the following;

route-server> show ip bgp 1.229.138.0/24

BGP routing table entry for 1.229.138.0/24

Paths: (1 available, best #1, table Default-IP-Routing-Table)

  Not advertised to any peer

  13645 19151 9318 38388

    64.135.0.1 from 64.135.0.1 (64.135.0.1)

      Origin incomplete, localpref 100, valid, external, best

      Community: 13645:3111

      Last update: Mon Jan 21 13:40:14 2013

The above was originated by AS38388, but the origin is incomplete. So we can guess it entered BGP via redistribution or an aggregate.

And now the  following;



show ip bgp ipv4 unicast  172.16.29.0/24

BGP routing table entry for 172.16.29.0/24, version 124

Paths: (1 available, best #1, table default, not advertised to EBGP peer)

Multipath: eBGP

  Advertised to update-groups:

     65001   

  Local

    10.0.0.1 (metric 91) from 10.11.0.1 (10.11.0.1)

      Origin EGP, metric 0, localpref 220, valid, internal, best

      Community: 65001:777 no-export

      Originator: 10.11.1.26, Cluster list: 10.23.44.1

Okay, the last route was something that I created in my local LAN. Notice how the origin differs in each of the three samples?

Okay, what does this mean in bgp path selection?

Well, part of the tie breaker, when we have paths for the same destination but with different origin types, is to prefer the lowest origin code.

IGP

  then

EGP

  then

Incomplete

So we select paths with an IGP origin over those with an EGP origin, and paths with an incomplete origin are least preferred.
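
Here is that one step as a minimal Python sketch. The origin codes ( IGP = 0, EGP = 1, INCOMPLETE = 2 ) are the standard values; the path dictionaries and the function name `prefer_by_origin` are hypothetical. Keep in mind this step is only reached once the earlier steps ( local preference, AS path length and so on ) have tied.

```python
ORIGIN_CODE = {"IGP": 0, "EGP": 1, "INCOMPLETE": 2}


def prefer_by_origin(paths):
    """Keep only the paths with the lowest origin code (the origin tie breaker)."""
    best = min(ORIGIN_CODE[p["origin"]] for p in paths)
    return [p for p in paths if ORIGIN_CODE[p["origin"]] == best]


# Hypothetical candidate paths for one prefix, already tied on the earlier steps.
candidates = [
    {"next_hop": "10.0.0.1", "origin": "INCOMPLETE"},
    {"next_hop": "10.0.0.2", "origin": "IGP"},
    {"next_hop": "10.0.0.3", "origin": "EGP"},
]
print(prefer_by_origin(candidates))
```

If more than one path survives ( say two IGP-origin paths ), the selection simply carries on to the next tie breaker.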
 
Think of it in this fashion: if you wanted to call someone directly, which source for their phone # would you trust most?

1: the person who gives you his/her number directly?

or

2: a phone number found in the white pages of the phonebook?

or

3: some number found lying on the ground, where we have no clue as to who/what/where that number originated from?

Now, this means very little in real life, because we can enforce or control the origin type via a route-map
 
i.e
 
 
Enter configuration commands, one per line.  End with CNTL/Z.
rtr1(config)#route-map setorigin permit 10
rtr1(config-route-map)#set origin?
  egp         remote EGP
  igp         local IGP
  incomplete  unknown heritage

So we can change any of the origin information or attributes. I hope this helps you in your understanding of BGP origin types.
 

 
Ken Felix
Freelance Network/Security Engineer
Kfelix     at hyperfeed d-o-t com
 
 





Wednesday, January 30, 2013

NAT-64 on cisco howto get by when you have ipv4-only machines

This post will explain how to use a cisco router for ipv6-to-ipv4 NAT/PAT operation. Let's say you have the following setup;


                                               ipv6-lan----------Router------------ipv4-lan


Okay, so obviously that's going to cause havoc if the ipv6 hosts ever need to reach anything on the ipv4 side. In my case, I had an ipv4-only printer that I wanted to use in my all-ipv6 environment ( remember my rant on ipv6, if you have been following my blog, and how most SOHO printers don't understand ipv6? )

Well, these types of IPv6-to-IPv4 issues are very common in most SOHO/SMB/ENTERPRISE networks. So in my case, I'm using a cisco router to act as a protocol translator, also known as NAT-PT or, by the newer term, NAT64 ( strictly speaking, the ipv6 nat commands used below are cisco's NAT-PT implementation ).

Router =3825ISR
IOScode= ADVENTERPRISEK9-M Version 15.1(4)M4

Okay let's look at how simple this setup could be using PAT;

IPv6 local-lan = 2001:470:C021:1::0/64
ipv4 local-lan = 192.168.0.0/24

gw = .1 in both protocol version for each lan

Okay, the router's ipv6 interface is set up to send RAs to my inside hosts, so my macosx machines will receive the prefix and default gateway from this router advertisement

i.e ( ifconfig en0 )

en0: flags=8863<UP,BROADCAST,SMART,RUNNING,SIMPLEX,MULTICAST> mtu 1500
    ether 00:1f:5b:ea:0a:fa
    inet6 fe80::21f:5bff:feea:afa%en0 prefixlen 64 scopeid 0x4
    inet6 2001:470:c021:1:21f:5bff:feea:afa prefixlen 64 autoconf
    media: autoselect (1000baseT <full-duplex,flow-control>)
    status: active
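
Notice how the autoconf address above is just the RA prefix plus an interface-id built from the mac address. Here is a minimal Python sketch of that EUI-64 style derivation, assuming the host uses the classic mac-based method ( which is what this output shows ); the function name `eui64_address` is my own.

```python
import ipaddress


def eui64_address(prefix, mac):
    """Build the autoconf address from a /64 prefix and a mac address."""
    octets = bytearray(int(part, 16) for part in mac.split(":"))
    octets[0] ^= 0x02                       # flip the universal/local bit
    interface_id = bytes(octets[:3]) + b"\xff\xfe" + bytes(octets[3:])
    net = ipaddress.IPv6Network(prefix)
    return ipaddress.IPv6Address(
        int(net.network_address) | int.from_bytes(interface_id, "big"))


addr = eui64_address("2001:470:c021:1::/64", "00:1f:5b:ea:0a:fa")
print(addr)
```

Feeding in the en0 mac and the /64 from the RA gives back the same 2001:470:c021:1:21f:5bff:feea:afa address that ifconfig reports.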


and the gateway ( output shortened )

Ken-Felixs-MacBook:downloads kenfelix1$ netstat -rn -f inet6
Routing tables

Internet6:
Destination                             Gateway                         Flags         Netif Expire
default                                 fe80::21d:70ff:fe39:7f00%en0    UGSc            en0
::1                                     ::1                             UH              lo0
2001:470:c021:1::/64                    link#4                          UC              en0
2001:470:c021:1::1                      0:1d:70:39:7f:0                 UHLW            en0
2001:470:c021:1:21f:5bff:feea:afa       0:1f:5b:ea:a:fa                 UHL             lo0


On  ipv6-lan = gi0/0 , and ipv4-lan = gi0/1 interfaces;

interface GigabitEthernet0/0
 description ipv6 lan and my test bed/lab no ipv4 address space in my lab
 no ip address
 ip nbar protocol-discovery
 ip flow ingress
 ip flow egress
 ip virtual-reassembly in
 duplex auto
 speed auto
 media-type rj45
 analysis-module monitoring
 ipv6 address 2001:470:C021:1::1/64
 ipv6 enable
 ipv6 nat

!
interface GigabitEthernet0/1
 description  my printer lan and any non ipv6 device here
 ip address 192.168.0.1 255.255.255.0
 duplex auto
 speed auto
 media-type rj45
 ipv6 enable
 ipv6 nat
end



Okay, so now let's look at the ipv6 nat configuration;

1st; we define an ipv6 access-list for my ipv6 local lan. This list would contain your /64 prefix or the ipv6 addresses that you want to allow. Here, I'm allowing my full /64 prefix

!
ipv6 access-list myv6
    remark  Ken's ipv6 internal network
    permit ipv6 2001:470:C021:1::/64 any
!



In a real setup, we would probably be more specific and allow just the print-server and the protocol(s) and ports.
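
If you want to check which sources the myv6 list would let through before touching the router, the match is easy to model off-box. This is a minimal Python sketch of the single permit line above and nothing more; `acl_permits` is a made-up helper, and like a real ACL it falls through to deny.

```python
import ipaddress

# Model of: permit ipv6 2001:470:C021:1::/64 any
PERMITTED_SOURCES = ipaddress.IPv6Network("2001:470:C021:1::/64")


def acl_permits(source):
    """True if the source falls inside the permitted /64, else the implicit deny."""
    return ipaddress.IPv6Address(source) in PERMITTED_SOURCES


print(acl_permits("2001:470:c021:1:21f:5bff:feea:afa"))  # my macbook
print(acl_permits("2001:470:c021:2::10"))                # a different /64
```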

2nd;

just like in ipv4, we assign a source list and an overload statement, using our source-list named "myv6"

ipv6 nat v4v6 source 192.168.0.2 2001:178:178::1
ipv6 nat v6v4 source list myv6 interface GigabitEthernet0/1 overload
ipv6 nat prefix 2001:178:178::/96



So in the above lines, we specify that 2001:178:178::/96 is the ipv6 prefix that our NATs will use when presenting the ipv4 hosts to the ipv6 side; here 192.168.0.2 is reachable as 2001:178:178::1. The v6v4 source is overloaded against the gi 0/1 ipv4 address { 192.168.0.1 }

Here's what the debug ipv6 nat shows;


Jan 11 06:01:02.935: %SYS-5-CONFIG_I: Configured from console by kfelix on console
*Jan 11 06:01:04.259: IPv6 NAT: IPv6->IPv4: icmp src (2001:470:C021:1:21F:5BFF:FEEA:AFA) -> (192.168.0.1), dst (2001:178:178::1) -> (192.168.0.2)
*Jan 11 06:01:06.259: IPv6 NAT: IPv6->IPv4: icmp src (2001:470:C021:1:21F:5BFF:FEEA:AFA) -> (192.168.0.1), dst (2001:178:178::1) -> (192.168.0.2)
*Jan 11 06:01:08.259: IPv6 NAT: IPv6->IPv4: icmp src (2001:470:C021:1:21F:5BFF:FEEA:AFA) -> (192.168.0.1), dst (2001:178:178::1) -> (192.168.0.2)
*Jan 11 06:01:10.259: IPv6 NAT: IPv6->IPv4: icmp src (2001:470:C021:1:21F:5BFF:FEEA:AFA) -> (192.168.0.1), dst (2001:178:178::1) -> (192.168.0.2)
*Jan 11


and here's the nat translation table using cmd "show ipv6 nat tran verbose"

 router3825#show ipv6 nat trans verbose
Prot  IPv4 source              IPv6 source
      IPv4 destination         IPv6 destination
---   ---                      ---
      192.168.0.2              2001:178:178::1
      create 00:44:43, use 00:00:05,

icmp  192.168.0.1,2817         2001:470:C021:1:21F:5BFF:FEEA:AFA,2817
      192.168.0.2,2817         2001:178:178::1,2817
      create 00:00:43, use 00:00:15, left 00:00:44,

tcp   192.168.0.1,56312        2001:470:C021:1:21F:5BFF:FEEA:AFA,56312
      192.168.0.2,22           2001:178:178::1,22
      create 00:00:05, use 00:00:03, left 23:59:56,

router3825#


As you can see, I have both an icmp and a tcp translation from host 2001:470:C021:1:21F:5BFF:FEEA:AFA to 192.168.0.2


Key points to take away;


> enable ipv6 nat on both the ipv6 and ipv4 interfaces
> ipv6 cef should be enabled
> ipv6 unicast-routing must be enabled
> ipv4 routing ( ip routing, on by default ) must be enabled
> the ipv6 nat prefix must be defined with a /96 prefix length
> otherwise it's the same style of configuration that we are used to in ipv4, but with the ipv6 nat v6v4 and v4v6 cmds


What I've found with cisco is that if you run into problems, clearing the ipv6 nat translations, and removing and re-adding the ipv6 nat statements, seems to help. NAT64 has been picky in operation on earlier cisco code.

You can also use the same setup to build static 1-to-1 mappings, v6-to-v4 and v4-to-v6. I decided on a simple PAT overload in my case.

I hope this post is helpful for those looking at ipv6 who still have ipv4 devices within their enterprise network. I'm going to try the same setup, but with the twist of having ipv4 and ipv6 on the same interface.

So basically gi 0/0 will be dual-stacked, with the ipv4 devices NAT'd into the v6 space. Stay tuned and I will post my success or failure :). This would be somewhat of a hairpin NAT.

The above NAT v6v4 is what works with DNS64. There, the DNS64 server handles ipv6 AAAA queries, and when a name only has an A record it translates that record into an ipv6 address with the ipv4 address embedded in it. Then the ipv6 client routes to that embedded address, and the NAT64 at the edge NATs the ipv6 client into the ipv4 world. Both the NAT64 router and the DNS64 dns-servers need access to ipv4 address space for this to work.
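
The embedding itself is simple: with a /96 prefix, the ipv4 address just becomes the last 32 bits. Here is a minimal Python sketch of what a DNS64 server would synthesize and how the translator recovers the ipv4 address. Note this is the embedded style, not the static 2001:178:178::1 mapping I used above, and the function names are made up for illustration.

```python
import ipaddress


def synthesize_aaaa(nat_prefix, ipv4):
    """Embed an ipv4 address in the low 32 bits of a /96 prefix (DNS64 style)."""
    net = ipaddress.IPv6Network(nat_prefix)
    if net.prefixlen != 96:
        raise ValueError("this sketch only handles a /96 prefix")
    return ipaddress.IPv6Address(
        int(net.network_address) | int(ipaddress.IPv4Address(ipv4)))


def extract_ipv4(nat_prefix, ipv6):
    """Recover the embedded ipv4 address, as the translator would."""
    net = ipaddress.IPv6Network(nat_prefix)
    addr = ipaddress.IPv6Address(ipv6)
    if addr not in net:
        raise ValueError("address is not inside the nat prefix")
    return ipaddress.IPv4Address(int(addr) & 0xFFFFFFFF)


aaaa = synthesize_aaaa("2001:178:178::/96", "192.168.0.2")
print(aaaa, extract_ipv4("2001:178:178::/96", aaaa))
```

So with the prefix from this post, a printer at 192.168.0.2 would be handed out to ipv6 clients as 2001:178:178::c0a8:2.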


Ken Felix
Freelance Network/Security Engineer + ipv6 migration specialist
kfelix   at hyperfeed  dot com







Thursday, January 24, 2013

Cisco TIP/TRICK on using ACLs as a network monitor

How many of you have been in a situation where the following happens;

  • an end user complains he/she can't access blah-blah-blah, with the blah-blah-blah being a server, printer or some other resource on the network
  • your firewall guys shrug their shoulders or are just plain "not helpful", and are quick to point the finger at you ( the networking team )
  • the application team claims the server is up and happy, but that's about it, & leave me alone
  • the location is a remote branch and you have no technical resource who can log into the server for a netstat -an or any review of the local connection table
  • the end_user/client is leaning on you for a fix, direction or escalation, and they are frustrated
  • you have no span, no port-tap, no sniffer; basically you're limited
  • everybody believes it's your problem, but the rest of the IT staff offers little or no help in isolating the issue
  • you're called out of bed at 02:00 to look into the matter


Okay, we have all been in the above situations at one time in our careers, right? Where we have a little bit of this going on?



The above is a typical day for any IT network gal/dude. And without the means to debug and diagnose your network topology, we are stuck in the maze of finger pointing.

Have no fear,

I have a tip that can make your life easier when you have different depts and teams that don't work together effectively, with bad collaboration and/or a whole bunch of finger pointing.

Take this drawing that I drew up, and follow me along the journey of network diagnostics, where you are blind and limited in what you can do.



In this scenario, there's a simple VPN firewall managed by an outside group that's good at "pointing the finger", a simple core ( your network gear ), and the server. Okay, very simple indeed.

The business application server is managed by another team that also points the finger. They are never around, and just plain not competent in their duties. I do a lot of debugging for them because of this.

The end_user/client {172.16.17.15} is actually a payroll lady who updates payroll details, or whatever she does, on this server. It was working correctly last week, but today she's having problems. The firewall group in this case gave her a dedicated webvpn ip_address of { --> 172.16.17.15 } and the business app server that she's accessing is located at { --> 172.16.17.88.22:356/tcp }. All she sees in her dashboard is that she can't connect to the server. She's upset and can't conduct her work.

Okay simple. or so we think ! But really it is :)

You want to quickly rule your network out as the cause of the lack of connectivity, and you want to validate that the firewall team hasn't made any changes to the previous fw policies.

We have a way of doing this using just our cisco layer2/3 switches, with no attached sniffer.

It involves a little bit of hacking around with the cisco debug function against an access-list. The ACL that we will create is NOT applied to any interface, nor do we modify any existing ACLs.

Okay watch and learn;

1st:

 I like to create a debug file for holding the data. This allows you to set this up on all switches/routers in the path between the end_user/client and the server(s).

You could let it run to the logging buffer and/or a remote syslog, but a small file sitting in flash will not hurt anything.

On my bigger switches, like a 6500/7600 with a spare CF slot, I purposely install bigger CF memory into these slots for a backup ios image, backups of the running-config, and for things like what I'm posting here.

Here's my small restricted debug file;

config t
 logging file flash:debug 4096 debugging
end

Notice I made it 4096 bytes and told it to log messages at severity debugging ( 7 ) or more severe, which is everything. I could have made it bigger, but 4096 bytes was ideal in this case.

 2nd

Next, I validate the file with a cli dir cmd;


dir flash:debug

Directory of flash:/debug



    3  -rwx        42  Jan 11 2013 07:25:48 -04:00  debug


It's also a good thing to do a "show logging" and validate the log services. You should see a line that looks like the following;

show logging

( output reduced to show only the important information )

File logging: file flash:debug,
        max size 4096, min size 0



3rd

Okay, we are almost ready, & now the fun part. You craft a simple ACL.

Key things to think about during this process;

  • be specific in your src/dst/protocol/port ( basically, what are you looking for )
  • DO NOT do an "ip any any"
  • DO NOT apply this ACL to any interface ( it's not required for this level of diagnostics )
  • make pretty damn sure the number you pick is NOT currently being used; basically, find a new unused number in the extended ACL range
  • as of the Cisco 12.4 ios-train, you cannot debug against a named ACL
  • put remarks in this ACL, like the case/ticket/date, so you or other engineers know what it's for if you should happen to leave it in accidentally or for an extended time ( it gives you a clue later on, so you don't end up saying WTF )

In my case, and for this example, I was only interested in the TCP protocol and the 1st part of the 3-way tcp handshake, the SYN.

 If I can see the SYN in my debug messages, then I know the client's connection is allowed, I can rule out the firewall team, and the network is obviously correct up to the position of my debug ACL on either sw1 or sw2.

Here's my ACL;


access-list 111 remark ticket326783 reqst  pty-dept-bs-app  Charlie O'riley 20130102

access-list 111 permit tcp host 172.16.99.55  host 172.16.129.10 syn


Okay, a very simple ACL. Here are some of the options I could have selected;


  ack          Match on the ACK bit
  dscp         Match packets with given dscp value
  eq           Match only packets on a given port number
  established  Match established connections
  fin          Match on the FIN bit
  fragments    Check non-initial fragments
  gt           Match only packets with a greater port number
  log          Log matches against this entry
  log-input    Log matches against this entry, including input interface
  lt           Match only packets with a lower port number
  neq          Match only packets not on a given port number
  option       Match packets with given IP Options value
  precedence   Match packets with given precedence value
  psh          Match on the PSH bit
  range        Match only packets in the range of port numbers
  rst          Match on the RST bit
  syn          Match on the SYN bit
  time-range   Specify a time-range
  tos          Match packets with given TOS value
  urg          Match on the URG bit



It's up to you as to what information you need to inspect for.

Since this lady only gave me the source and destination ip_addresses, that's what I put into my ACL. I could just as well have written it like one of the following;

access-list 111 permit tcp host 172.16.99.55  host 172.16.129.10 eq 356

or

access-list 111 permit ip host 172.16.99.55  host 172.16.129.10

or

access-list 111 permit tcp host 172.16.99.55 gt 1024  host 172.16.129.10 eq 356

Whatever you do, be very careful in selecting the information that you want to debug on. Also, you're not going to capture data, so I personally couldn't care less about looking at the full flow of data between client and server. The syn/syn-ack would be ideal in this setup.
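
One thing worth picturing is that an ACL entry is one-way. Here is a minimal Python model of the SYN entry from ACL 111 ( a model only, not how IOS evaluates it ); the `TcpPacket` class, the function name and the 49212 port are made up for the example. It suggests that to also catch the server's SYN-ACK you would need a second, mirrored entry with the hosts swapped.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TcpPacket:
    src: str
    dst: str
    dst_port: int
    flags: frozenset


def matches_acl_111(pkt):
    """Model of: access-list 111 permit tcp host 172.16.99.55 host 172.16.129.10 syn"""
    return (pkt.src == "172.16.99.55"
            and pkt.dst == "172.16.129.10"
            and "SYN" in pkt.flags)


client_syn = TcpPacket("172.16.99.55", "172.16.129.10", 356, frozenset({"SYN"}))
client_data = TcpPacket("172.16.99.55", "172.16.129.10", 356, frozenset({"ACK", "PSH"}))
server_syn_ack = TcpPacket("172.16.129.10", "172.16.99.55", 49212,
                           frozenset({"SYN", "ACK"}))

for pkt in (client_syn, client_data, server_syn_ack):
    print(sorted(pkt.flags), matches_acl_111(pkt))
```

Only the client's SYN matches, so the debug stays quiet for the rest of the flow, which is exactly the low-noise behaviour we want on a production box.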

Now, after the ACL is crafted with the chosen fields/ports/src+dst addresses, we can execute our debug against the ACL.

( i.e router#debug ip packet 111 detail )

When done, your screen should echo something similar to the following;

debug ip packet 111 detail

IP packet debugging is on (detailed) for access list 111

term mon

And now you monitor the debug file for any output that it collects when the user attempts to make a session.

cmd more flash:filename

e.g

core01:
more flash:debug
( a snippet of the information in that file )

Jan 11 11:25:21.402: IP: s=172.16.17.15(Vlan123), d=172.16.17.88.356, len 64, input feature
.Jan 11 11:25:21.402:     TCP src=49212, dst=356, seq=2432073100, ack=0, win=8192 SYN, MCI Check(63), rtype 0, forus FALSE, sendself FALSE, mtu 0
.Jan 11 11:25:21.402: IP: s=172.16.17.15(Vlan123), d=172.16.17.88.356, len 64, rcvd 1
.Jan 11 11:25:21.402:     TCP src=49212, dst=356, seq=2432073100, ack=0, win=8192 SYN

.Jan 11 11:25:34.019: IP: s=172.16.17.15(Vlan123), d=172.16.17.88.356, len 64, input feature
.Jan 11 11:25:34.019:     TCP src=49213, dst=356, seq=2938999768, ack=0, win=8192 SYN, MCI Check(63), rtype 0, forus FALSE, sendself FALSE, mtu 0
.Jan 11 11:25:34.019: IP: s=172.16.17.15(Vlan123), d=172.16.17.88.356, len 64, rcvd 1
.Jan 11 11:25:34.019:     TCP src=49213, dst=356, seq=2938999768, ack=0, win=8192 SYN
.Jan 11 11:25:35.789: IP: s=172.16.17.15(Vlan123), d=172.16.17.88.356, len 64, input feature
.Jan 11 11:25:35.789:     TCP src=49214, dst=356, seq=880617259, ack=0, win=8192 SYN, MCI Check(63), rtype 0, forus FALSE, sendself FALSE, mtu 0
.Jan 11 11:25:35.789: IP: s=172.16.17.15(Vlan123), d=172.16.17.88.356, len 64, rcvd 1
.Jan 11 11:25:35.789:     TCP src=49214, dst=356, seq=880617259, ack=0, win=8192 SYN

.Jan 11 11:25:36.796: IP: s=172.16.17.15(Vlan123), d=172.16.17.88.356, len 64, input feature
.Jan 11 11:25:36.796:     TCP src=49215, dst=356, seq=2038341671, ack=0, win=8192 SYN, MCI Check(63), rtype 0, forus FALSE, sendself FALSE, mtu 0
.Jan 11 11:25:36.796: IP: s=172.16.17.15(Vlan123), d=172.16.17.88.356, len 64, rcvd 1
.Jan 11 11:25:36.796:     TCP src=49215, dst=356, seq=2038341671, ack=0, win=8192 SYN

.Jan 11 11:25:37.794: IP: s=172.16.17.15(Vlan123), d=172.16.17.88.356, len 64, input feature
.Jan 11 11:25:37.794:     TCP src=49216, dst=356, seq=489952225, ack=0, win=8192 SYN, MCI Check(63), rtype 0, forus FALSE, sendself FALSE, mtu 0
.Jan 11 11:25:37.794: IP: s=172.16.17.15(Vlan123), d=172.16.17.88.356, len 64, rcvd 1
.Jan 11 11:25:37.794:     TCP src=49216, dst=356, seq=489952225, ack=0, win=8192 SYN




 With the diagnostic approach shown above, you know for sure that the firewall has allowed the client connection.
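
If you copy the file off the switch, a few lines of Python can boil it down to the distinct SYN attempts. This is a minimal sketch that assumes the line format shown in the snippet above; the regex and the `syn_attempts` helper are my own, and the sample is just two of those attempts pasted in.

```python
import re

# Matches the TCP detail lines that carry a SYN in the debug output.
SYN_LINE = re.compile(
    r"TCP src=(\d+), dst=(\d+), seq=(\d+), ack=(\d+), win=\d+ SYN")

SAMPLE = """\
.Jan 11 11:25:21.402:     TCP src=49212, dst=356, seq=2432073100, ack=0, win=8192 SYN, MCI Check(63), rtype 0, forus FALSE, sendself FALSE, mtu 0
.Jan 11 11:25:21.402:     TCP src=49212, dst=356, seq=2432073100, ack=0, win=8192 SYN
.Jan 11 11:25:34.019:     TCP src=49213, dst=356, seq=2938999768, ack=0, win=8192 SYN
"""


def syn_attempts(log_text):
    """Return the distinct (source port, sequence number) pairs seen with SYN set."""
    seen = set()
    for line in log_text.splitlines():
        match = SYN_LINE.search(line)
        if match:
            src_port, _dst_port, seq, _ack = map(int, match.groups())
            seen.add((src_port, seq))   # each SYN is logged more than once
    return sorted(seen)


print(syn_attempts(SAMPLE))
```

A run of fresh source ports, each with a new sequence number and ack=0, is the client retrying over and over: the SYNs are making it this far, but nothing useful is coming back from the server side.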

Once again, if you execute the above at various points throughout the network infrastructure, you can quickly prove or disprove that the client is allowed, and whether any upstream ACLs, rules or other devices have prevented access. You can take a systematic approach and start close to the src ( firewall ) or dest ( server ) if you have multiple systems in the path. I did the above debug on the last switch ( #2 ) in this example.

In this scenario, it was found that the business-app software had a pseudo internal error: the service showed as running, and the task manager window indicated all was fine, but in reality the service was NOT listening or responding to the client requests. So the issue was escalated, and the application team was dragged out of bed to look into it.

 Okay, I'm sorry firewall guys. It wasn't your fault. This time :)

On a final note: the logfile shown in this example logs all messages, including other console messages. You could play with tcl scripting ( aka tickle ) to clean up the logfile output.

For example, here's a very basic tcl script,

more flash:logfilter.tcl
# filter to show any log messages that don't have TIMEZONE
#
dir flash:debug
#
#
more flash:debug | ex EST: ;
 


The script does a "dir" of the flash:debug file and then a "more" of its contents, excluding anything with the EST timezone in it. It can be executed using any one of the following means;

Router#tclsh flash:logfilter.tcl

or

Router#tclsh logfilter.tcl


If I ever get un-lazy, I will add to that tcl script to first check that the file exists and is more than 0 bytes before displaying the content. My tcl scripting experience is just as bad as my perl scripting :)

Remember that the logfile we created rolls over at 4096 bytes, & will collect debug messages ( severity 7 ) and anything more severe. You can increase the file size, but beware, it will chew up storage space on the flash/disk/bootflash device. Chassis-style switches or routers with spare storage slots can benefit from my earlier suggestion of populating the spare slot with a CF ( Compact Flash ) card.

I hope you found this post interesting, and I challenge you to explore debugging w/ACLs when all else fails and you are limited by location, available personnel, or the lack of a sniffer for inspecting packets.

Ken Felix
Freelance Network/Security Engineer
kfelix  at hyperfeed d-o-t com