Category: Linux

admin | July 15, 2020

How to check if ports are open on an array of servers

Okay, there are a whole bunch of ways you can do this. This is just the way I played around with to save myself a bunch of time, using ncat, which is the Nmap project's reimplementation of netcat.

1.Ensure your jump host can ssh to all your newly deployed machines, using either a root password or an ssh key of some sort.

2.You will also need to install ncat

a.yum install nmap-ncat (redhat/centos)
Note: make sure this is installed on all the new servers, since the script runs on them.

3.Open your editor, paste in the script below and save the file

b.vi portcheckscriptnick.sh & save

c.chmod +x portcheckscriptnick.sh (makes the script executable)

portcheckscriptnick.sh – this checks whether your new server can talk to all the hosts listed below, and whether each of the ports is up or down on each of them

============================

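#!/bin/bash

# Hosts to test from the machine this script runs on
hosts="nick1 nick2 nick3 nick4"

for host in $hosts; do
  for port in 22 53 67 68; do
    # ncat -z only tests whether a TCP connection can be opened (add -u for UDP)
    if ncat -z "$host" "$port"; then
      echo "port $port $host is up"
    else
      echo "port $port $host is down"
    fi
  done
done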
========================================

4.Next, create the list of servers that your for loop will cycle through, so you can check that every one of them can reach those machines and ports

d.Create a file called servers.txt

i.vi servers.txt

ii.Add your hosts in a single column

Example:

Server1

Server2

Server3

Server4

e.Save the file servers.txt

5.Now we are going to have a for loop cycle through the list, log into each host, run the script there and write the results to a file for us to look at.

6.Run the command below from the jump host to see whether each server can communicate with the hosts and ports it needs. If any ports show as down, check the firewalls to find out why the host is unable to communicate.

• for HOST in $(cat servers.txt) ; do ssh root@$HOST "bash -s" < portcheckscriptnick.sh ; echo $HOST ; done 2>&1 | tee -a port.status

Note: the file port.status will be created on the jump host, and you can simply look through it to see if any ports were down on any of the hosts.
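Once port.status grows past a screenful, it can be easier to pull out only the failures. Here is a minimal sketch of that, assuming the lines keep the exact "port <port> <host> is down" wording the script prints. The function name list_down_ports is made up for this example.

```shell
#!/bin/bash
# list_down_ports: hypothetical helper, not part of the original script.
# Reads a status file and prints "host:port" for every line reporting a port as down.
# Any other line (for example the host names echoed by the loop) is ignored.
list_down_ports() {
  awk '$1 == "port" && $NF == "down" { print $3 ":" $2 }' "$1"
}
```

You would call it as list_down_ports port.status. Note that it only shows which target and port failed; to see which server the failure came from you still need to look at the surrounding lines in the file.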

This is what the script output looks like on one host if it's working properly

[root@nick ~]# ./portcheckscriptnick.sh

port 22 192.168.1.11 is up

port 53 192.168.1.11 is down

port 67 192.168.1.11 is down

port 68 192.168.1.11 is down

This is what it will look like when you run it against your list of new hosts from your jump box

[root@nick ~]# for HOST in $(cat servers.txt) ; do ssh root@$HOST "bash -s" < portcheckscriptnick.sh ; echo $HOST ; done

root@192.168.1.11's password:

port 22 nick1 is up

port 53 nick1 is down

port 67 nick1 is down

port 68 nick1 is down

port 22 nick2 is up

port 53 nick2 is down

port 67 nick2 is down

port 68 nick2 is down


admin | July 8, 2020

How to setup SMTP port redirect with IPTABLES and NAT

RedHat/Centos

Okay, it's really easy to do. You will need to add the following to /etc/sysctl.conf
Note: these are kernel parameter changes

1.vi /etc/sysctl.conf and add the following lines

kernel.sysrq = 1

net.ipv4.tcp_syncookies=1

net.ipv4.ip_forward=1 (important)

net.ipv4.conf.all.route_localnet=1 (important)

net.ipv4.conf.default.send_redirects = 0

net.ipv4.conf.all.send_redirects = 0

2.Save the file and run

• sysctl -p (this will load the new kernel parameters)
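Before moving on, it can be worth double checking that the two important keys really made it into the file. The sketch below is just one way to do that. The function name check_sysctl_file is made up for this example, and it assumes each setting sits on its own line as key=1 or key = 1, with no trailing notes on the line.

```shell
#!/bin/bash
# check_sysctl_file: hypothetical helper for illustration.
# Usage: check_sysctl_file FILE KEY...
# Prints "missing: KEY" for every KEY that FILE does not set to 1.
check_sysctl_file() {
  local file="$1" key
  shift
  for key in "$@"; do
    # Split each line on "=", strip blanks from both sides, and look for KEY=1
    if ! awk -F= -v k="$key" '
        { gsub(/[ \t]/, "", $1); gsub(/[ \t]/, "", $2) }
        $1 == k && $2 == "1" { found = 1 }
        END { exit !found }' "$file"; then
      echo "missing: $key"
    fi
  done
}
```

For example: check_sysctl_file /etc/sysctl.conf net.ipv4.ip_forward net.ipv4.conf.all.route_localnet. No output means both keys are set.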

3.If you already have iptables running, save the running config so you can add the new redirect rules to it

• iptables-save > iptables.back

4.Now edit the iptables.back file and add the redirect rules

• vi iptables.back

It will probably look something like the rules below.

EXAMPLE

# Generated by iptables-save v1.2.8 on Thu July 6 18:50:55 2020

*filter

:INPUT ACCEPT [0:0]

:FORWARD ACCEPT [0:0]

:OUTPUT ACCEPT [2211:2804881]

:RH-Firewall-1-INPUT - [0:0]
-A INPUT -j RH-Firewall-1-INPUT
-A FORWARD -j RH-Firewall-1-INPUT
-A RH-Firewall-1-INPUT -i lo -j ACCEPT
-A RH-Firewall-1-INPUT -p icmp -m icmp --icmp-type 255 -j ACCEPT
-A RH-Firewall-1-INPUT -p esp -j ACCEPT
-A RH-Firewall-1-INPUT -p ah -j ACCEPT
-A RH-Firewall-1-INPUT -m state --state RELATED,ESTABLISHED -j ACCEPT
# make sure to have port 1025 open
-A RH-Firewall-1-INPUT -p tcp -m tcp --dport 1025 -m state --state NEW -j ACCEPT
-A RH-Firewall-1-INPUT -p tcp -m tcp --dport 443 -m state --state NEW -j ACCEPT
-A RH-Firewall-1-INPUT -p tcp -m tcp --dport 8443 -m state --state NEW -j ACCEPT
# make sure to have port 25 open
-A RH-Firewall-1-INPUT -p tcp -m tcp --dport 25 -m state --state NEW -j ACCEPT
-A RH-Firewall-1-INPUT -p tcp -m tcp --dport 80 -m state --state NEW -j ACCEPT
-A RH-Firewall-1-INPUT -p tcp -m tcp --dport 21 -m state --state NEW -j ACCEPT
-A RH-Firewall-1-INPUT -p tcp -m tcp --dport 22 -m state --state NEW -j ACCEPT
-A RH-Firewall-1-INPUT -p tcp -m tcp --dport 106 -m state --state NEW -j ACCEPT
-A RH-Firewall-1-INPUT -p tcp -m tcp --dport 143 -m state --state NEW -j ACCEPT
-A RH-Firewall-1-INPUT -p tcp -m tcp --dport 465 -m state --state NEW -j ACCEPT
-A RH-Firewall-1-INPUT -p tcp -m tcp --dport 993 -m state --state NEW -j ACCEPT
-A RH-Firewall-1-INPUT -p tcp -m tcp --dport 995 -m state --state NEW -j ACCEPT
-A RH-Firewall-1-INPUT -p tcp -m tcp --dport 8222 -m state --state NEW -j ACCEPT
-A RH-Firewall-1-INPUT -j REJECT --reject-with icmp-host-prohibited

COMMIT

#ADD this section with another Commit like below

# Completed on Thu July 6 18:50:55 2020

# Generated by iptables-save v1.2.8 on Thu July 6 18:50:55 2020

*nat

:PREROUTING ACCEPT [388:45962]

:POSTROUTING ACCEPT [25:11595]

:OUTPUT ACCEPT [25:11595]

-A PREROUTING -p tcp -m tcp --dport 1025 -j REDIRECT --to-ports 25

COMMIT

# Completed on Thu July 6 18:50:55 2020

• Save the file

5.Next you want to reload the new config

• iptables-restore < iptables.back

6.Now you should be able to see the new rules and test

• iptables -L -n -t nat (should show the rules)

[root@nick ~]# iptables -L -n | grep 1025

ACCEPT tcp -- 0.0.0.0/0 0.0.0.0/0 tcp dpt:1025 state NEW

[root@nick ~]# iptables -L -n -t nat| grep 1025

REDIRECT tcp -- 0.0.0.0/0 0.0.0.0/0 tcp dpt:1025 redir ports 25

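Note:

You will need to run telnet from another machine, not from the host itself. Locally generated packets never pass through the PREROUTING chain, so the redirect does not apply to connections the host makes to itself. 🙂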

[root@nick1 ~]# telnet 192.168.86.111 1025

Trying 192.168.86.111...

Connected to localhost.

Escape character is '^]'.

220 nick.ansible.com ESMTP Postfix


admin | July 2, 2020

How to rebuild a drive that’s fallen out of a software raid

Now I know nobody uses this kind of raid technology anymore, but it was one of the cool things I learned from my mentor when I first started my career centuries ago. I happened to find this in my archives and thought I would write it up to share.

There is another way to do this using mdadm & sfdisk. When I find time I will share how to do that as well.

1.First, check whether a drive has fallen out of the raid by running the command below

• cat /proc/mdstat

md2 : active raid1 hda3[0] hdc3[1]

524096 blocks [2/2] [UU]

md1 : active raid1 hda2[0] hdc2[1]

524096 blocks [2/2] [UU]

md0 : active raid1 hda1[0]

78994304 blocks [2/1] [U_] (this one shows a drive has fallen out)

Note: Look for the array showing [U_]. The underscore means one of its drives has fallen out of the raid.

2.To add it back in, run the lines below, working out the missing partition from the drive assignments of the healthy arrays above.

• raidhotadd /dev/md0 /dev/hdc1

• echo -n 6666666 > /proc/sys/dev/raid/speed_limit_max (this raises the rebuild speed limit)
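If you have a lot of arrays, eyeballing /proc/mdstat for an underscore gets old quickly. The sketch below picks out the degraded arrays for you. It assumes the two-line layout shown above (array name on one line, status brackets on the next), and the function name degraded_arrays is made up for this example.

```shell
#!/bin/bash
# degraded_arrays: hypothetical helper for illustration.
# Prints the name of every md array whose status field contains "_" (a missing member).
# Reads /proc/mdstat unless another file is given as the first argument.
degraded_arrays() {
  awk '
    /^md[0-9]+ *:/ { name = $1 }        # remember the array named on the header line
    /\[[U_]*_[U_]*\]/ { print name }    # a status like [U_] or [_U] means degraded
  ' "${1:-/proc/mdstat}"
}
```

Run with no arguments it reads the live /proc/mdstat. For the example above it would print md0.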

How to rebuild a failed drive in software raid if you replaced the drive:

1.Check the state of the arrays

• cat /proc/mdstat

md2 : active raid1 hda3[0] hdc3[1]
524096 blocks [2/2] [UU]
md1 : active raid1 hda2[0] hdc2[1]
524096 blocks [2/2] [UU]
md0 : active raid1 hda1[0]
78994304 blocks [2/1] [U_]

2.Recreate the partitions on the new drive as follows, using the same mirror drive designations from /proc/mdstat.

• sfdisk -d /dev/hda | sfdisk /dev/hdc (/dev/hda is the source and /dev/hdc the destination; this duplicates all three partitions of the good drive onto the new drive)

• echo 6666666 > /proc/sys/dev/raid/speed_limit_max (raises the rebuild speed limit)

3.Next check the partitions by running

• df -h

• fdisk -l

Disk /dev/hdc: 81.9 GB, 81964302336 bytes

16 heads, 63 sectors/track, 158816 cylinders

Units = cylinders of 1008 * 512 = 516096 bytes

Device Boot Start End Blocks Id System

/dev/hdc1 * 1 156735 78994408+ fd Linux raid autodetect

/dev/hdc2 156736 157775 524160 fd Linux raid autodetect

/dev/hdc3 157776 158815 524160 fd Linux raid autodetect

Disk /dev/hda: 81.9 GB, 81964302336 bytes

16 heads, 63 sectors/track, 158816 cylinders

Units = cylinders of 1008 * 512 = 516096 bytes

Device Boot Start End Blocks Id System

/dev/hda1 * 1 156735 78994408+ fd Linux raid autodetect

/dev/hda2 156736 157775 524160 fd Linux raid autodetect

/dev/hda3 157776 158815 524160 fd Linux raid autodetect

———————————————————————

Filesystem Size Used Avail Use% Mounted on

/dev/md0 75G 11G 60G 16% /

none 251M 0 251M 0% /dev/shm

/dev/md1 496M 8.1M 463M 2% /tmp

4.Next you want to rebuild the arrays onto the new drive, so run the following, updating the drive designations to match your own drive assignments.

• raidhotadd /dev/md0 /dev/hdc1

• raidhotadd /dev/md1 /dev/hdc2

• raidhotadd /dev/md2 /dev/hdc3

Note: each md device should be paired with the matching partition on the new drive, for example '/dev/md0 /dev/hdc1'.


admin | July 1, 2020

How to add a new SCSI LUN while server is Live

REDHAT/CENTOS:

In order to get wwn ids from a server:

• cat /sys/class/scsi_host/host0/device/fc_host\:host0/port_name

• cat /sys/class/scsi_host/host1/device/fc_host\:host1/port_name

Or:

• systool -av -cfc_host | grep port_name | awk '{ print $3 }' | cut -d\" -f 2 | cut -dx -f 2
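If you have more than a couple of HBAs, you can loop over them instead of running cat once per host. This is only a sketch. It assumes your driver exposes the ports under /sys/class/fc_host, and the function name list_wwpns is made up for this example.

```shell
#!/bin/bash
# list_wwpns: hypothetical helper for illustration.
# Prints the port_name (WWPN) of every FC host found under the given sysfs directory,
# with the leading "0x" removed. The default directory is an assumption about your setup.
list_wwpns() {
  local base="${1:-/sys/class/fc_host}" f
  for f in "$base"/host*/port_name; do
    if [ -r "$f" ]; then
      sed 's/^0x//' "$f"
    fi
  done
}
```

Calling list_wwpns with no arguments prints one WWPN per line, or nothing at all if the server has no FC hosts.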

1.To add a new SAN LUN while live:

Run this to find the new disks after you have added them to your VM

• rescan-scsi-bus.sh

Note: rescan-scsi-bus.sh is part of the sg3_utils package

2.Check that it has been found; it will show up as mpath(something)

• multipath -l

# That's it, unless you want to change the name from mpath(something) to something else

1.Change the shortcut name

• vi /etc/multipath/bindings

2.Remove the autogenerated default mpath device

• multipath -f mpath(something)

# Go into the multipath console and re-add the multipath device with your new shortcut name (nickdsk2 in this case)

• multipathd -k

• add map nickdsk2

Note: Not going to lie, sometimes you can do all this and still need a reboot, but the majority of the time this should work. But what do I know...haha


admin | June 17, 2020

How to figure out switch and port via tcpdump

Okay, if you have ever worked in a place where the network was complete chaos, with no documentation or network maps to help you figure out where something resides, this one is for you.

You can sometimes use tcpdump to work out which switch and port a server is plugged into. The filter below captures a single CDP (Cisco Discovery Protocol) announcement from the switch, so it only works if the switch has CDP enabled.

Syntax

tcpdump -nn -v -i <NIC_INTERFACE> -s 1500 -c 1 'ether[20:2] == 0x2000'

Example:

root@ansible:~ # tcpdump -nn -v -i eth0 -s 1500 -c 1 'ether[20:2] == 0x2000'
tcpdump: listening on eth0, link-type EN10MB (Ethernet), capture size 1500 bytes
03:25:22.146564 CDPv2, ttl: 180s, checksum: 692 (unverified), length 370
Device-ID (0x01), length: 11 bytes: 'switch-sw02'
Address (0x02), length: 13 bytes: IPv4 (1) 192.168.1.15
Port-ID (0x03), length: 15 bytes: 'Ethernet0/1'
Capability (0x04), length: 4 bytes: (0x00000028): L2 Switch, IGMP snooping
Version String (0x05), length: 220 bytes:
Cisco Internetwork Operating System Software
IOS ™ C2950 Software (C2950-I6Q4L2-M), Version 12.1(14)EA1a, RELEASE SOFTWARE (fc1)
Copyright (c) 1986-2003 by cisco Systems, Inc.
Compiled Tue 02-Sep-03 03:33 by Nicola tesla
Platform (0x06), length: 18 bytes: 'cisco WS-C2950T-24'
Protocol-Hello option (0x08), length: 32 bytes:
VTP Management Domain (0x09), length: 6 bytes: 'ecomrd'
Duplex (0x0b), length: 1 byte: full
AVVID trust bitmap (0x12), length: 1 byte: 0x00
AVVID untrusted ports CoS (0x13), length: 1 byte: 0x00
1 packets captured
2 packets received by filter
0 packets dropped by kernel

root@ansible:~ #
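The two lines you usually care about in that output are Device-ID (the switch) and Port-ID (the switch port). If you are doing this on a lot of servers, a small filter saves some scrolling. This is a sketch only. It assumes tcpdump prints the values in plain single quotes, and the function name cdp_summary is made up for this example.

```shell
#!/bin/bash
# cdp_summary: hypothetical helper for illustration.
# Reads verbose tcpdump CDP output on stdin and prints only the switch name and port.
cdp_summary() {
  sed -n \
    -e "s/.*Device-ID (0x01).*bytes: '\(.*\)'.*/switch: \1/p" \
    -e "s/.*Port-ID (0x03).*bytes: '\(.*\)'.*/port: \1/p"
}
```

You would pipe the capture straight into it: tcpdump -nn -v -i eth0 -s 1500 -c 1 'ether[20:2] == 0x2000' | cdp_summary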

Written by Nick Tailor


admin | June 17, 2020

How to increase disk size on virtual scsi drive using gpart

1.Log in to VMware vSphere Client and shut down the server VM guest.

2.Select the VM guest and click “Edit virtual machine settings”. The Virtual Machine Properties window will appear. Under “Hardware”, click Hard Disk 2 (which holds the /data partition) and change the provisioned size to 200 GB.
Power on the VM guest after editing the disk size.

3.Take VM snapshot of VM guest.

4.Log on to the VM guest as “root” using an SSH client such as PuTTY.

5.List the SCSI devices using the command – cat /proc/scsi/scsi

6.Run the following command to see the name of the disk and its SCSI address
ls -d /sys/block/sd*/device/scsi_device/* | awk -F '[/]' '{print $4,"- SCSI",$7}'

7.Run the following commands to confirm that the new size of the SCSI disk you resized in step 2 has been picked up

1.echo 1 > /sys/class/scsi_device/2\:0\:1\:0/device/rescan

2.fdisk -l | grep Disk

3.df -h

8.Stop cron and any other services that might be using the disk, for example
service crond stop

9.Unmount the “/data” partition using the command – umount /data
Note: If you get a “Device is busy” error, make sure that your current session is not inside /data.

10.Perform the following steps to grow the /data partition into the added disk space, based on the partition table type

For GPT partition type

In this case the parted -l command will show output like the below for the “sdb” disk
*****************************************************
Model: VMware Virtual disk (scsi)
Disk /dev/sdb: 215GB
Sector size (logical/physical): 512B/512B
Partition Table: gpt
Number Start End Size File system Name Flags
1 1049kB 215GB 215GB ext4 Linux LVM lvm
*****************************************************

4.Execute the command – gdisk /dev/sdb

5.Type “p” to print the partition table

6.Type “d” to delete the partition

7.Type “p” to check that the partition is deleted

8.Type “n” to create a new partition

9.Type “1” as the partition number

10.Press “Enter” twice to accept the default first and last sectors (the first sector must be the same as the original partition's start, otherwise the existing data will not line up)

11.Type “8E00” as the partition type code (Linux LVM)

12.Type “p” to check the newly added partition

13.Type “w” to write the partition table

14.Type “Y” to continue

15.To mount the /data partition run the command – mount /data

16.To resize the file system run the command – resize2fs /dev/sdb1

17.To check the increased disk space run the command – df -h


admin | June 3, 2020

How to check that your ansible config isn’t missing any routes from your route table

REDHAT/CENTOS

Okay, so some of you use ansible like me and deal with complicated networks, where servers have a route list a mile long that you might need to migrate or copy into ansible. You want to save yourself some time and still be accurate, because missing routes can be problematic and time consuming to troubleshoot after the fact.

Here is something cool you can do.

On the client server, you can use the “ip” command with the r flag to list the routes.

Example:

It will look something like this.

[root@ansibleserver]# ip r
default via 192.168.1.1 dev enp0s8
default via 10.0.2.2 dev enp0s3 proto dhcp metric 100
default via 192.168.1.1 dev enp0s8 proto dhcp metric 101
10.0.2.0/24 dev enp0s3 proto kernel scope link src 10.0.2.15 metric 100
192.168.1.0/24 dev enp0s8 proto kernel scope link src 192.168.1.12 metric 101
10.132.100.0/24 dev mgt proto kernel scope link src 10.16.110.1 metric 1011
10.132.10.0/24  dev mgt proto kernel scope link src 10.16.110.1 metric 101
10.136.100.0/24 dev mgt proto kernel scope link src 10.16.110.1 metric 1011
10.136.10.0/24  dev mgt proto kernel scope link src 10.16.110.1 metric 101
10.134.100.0/24 dev mgt proto kernel scope link src 10.16.110.1 metric 1011
10.133.10.0/24  dev mgt proto kernel scope link src 10.16.110.1 metric 101
10.127.10.0/24  dev mgt proto kernel scope link src 10.16.110.1 metric 101
10.122.100.0/24 dev mgt proto kernel scope link src 10.16.110.1 metric 101
10.134.100.0/24 dev mgt proto kernel scope link src 10.16.110.1 metric 101
10.181.100.0/24 dev mgt proto kernel scope link src 10.16.110.1 metric 101
10.181.100.0/24 dev mgt proto kernel scope link src 10.16.110.1 metric 101
10.247.200.0/24 dev mgt proto kernel scope link src 10.16.110.1 metric 101
10.172.300.0/24 dev mgt proto kernel scope link src 10.16.110.1 metric 101
10.162.100.0/24 dev mgt proto kernel scope link src 10.16.110.1 metric 101
10.161.111.0/24 dev mgt proto kernel scope link src 10.16.110.1 metric 101
10.161.0.0/16   dev mgt proto kernel scope link src 10.16.110.1 metric 101
10.233.130.0/24 dev mgt proto kernel scope link src 10.16.110.1 metric 101
10.60.140.0/24   dev mgt proto kernel scope link src 10.16.110.1 metric 101

Now what you want to do is take all the networks that show up on the “mgt” interface and put them in a text file

vi ips1

Copy them in one after the other in a single column and save the file.

10.132.100.0/24
10.132.10.0/24
10.136.100.0/24
10.136.10.0/24
10.134.100.0/24
10.133.10.0/24
10.127.10.0/24
10.122.100.0/24

Now your ansible route section will probably look something like this...

Example of ansible yaml file “ansiblefile”

routes:
    - device: mgt
      gw: 10.16.110.1
      route:
        - 10.132.100.0/24
        - 10.132.10.0/24
        - 10.136.100.0/24
        - 10.136.10.0/24
        - 10.134.100.0/24
        - 10.133.10.0/24
        - 10.127.10.0/24
        - 10.122.100.0/24
        - 10.134.100.0/24
        - 10.181.100.0/24
        - 10.181.100.0/24
        - 10.247.200.0/24
        - 10.172.300.0/24
        - 10.162.100.0/24
        - 10.161.111.0/24
        - 10.161.0.0/16
        - 10.233.130.0/24

So what you want to do now is copy and paste the routes from the file so they line up perfectly with the correct spacing in your yaml file.
Note: If they aren’t lined up correctly your playbook will fail.
You can either copy them into a text editor like textpad or notepad++ and use the replace function to add the “- ” prefix (8 spaces before the - and 1 space between the - and the ip), or you can use a perl or sed script to do it right from the command line.

# If you want to edit the file in-place
sed -i -e 's/^/prefix/' file

Example:

sed -e 's/^/        - /' ips1 > ips2

Okay, now you should have a new file called ips2 that looks like the below, with 8 spaces from the left margin.

        - 10.136.100.0/24
        - 10.136.10.0/24
        - 10.134.100.0/24
        - 10.133.10.0/24
        - 10.127.10.0/24
        - 10.122.100.0/24

Now cat ips2

cat ips2

Then highlight everything inside the file

[highlighted]
        - 10.136.100.0/24
        - 10.136.10.0/24
        - 10.134.100.0/24
        - 10.133.10.0/24
        - 10.127.10.0/24
        - 10.122.100.0/24
[highlighted]

Open your ansible yaml that contains the route section and paste what you highlighted just below “route:”, starting right at the left margin. Everything should line up perfectly. Save the ansible file.

routes:
    - device: mgt
      gw: 10.16.110.1
      route:
[paste highlight]
        - 10.132.100.0/24
        - 10.132.10.0/24
        - 10.136.100.0/24
        - 10.136.10.0/24
        - 10.134.100.0/24
        - 10.133.10.0/24
[paste highlight]

Okay, now we need to check that you didn’t accidentally miss any routes between the route table and your ansible yaml.

For this we use the original ips1 file, the one with just the routes and without the “- ” prefix.
Make sure the ansible yaml file and the ips1 file are inside the same directory to make life easier.

We can run a little compare script like so
while read a b c d e; do if [[ $(grep -w $a ansiblefile) ]]; then :; else echo $a $b $c $d $e; fi ; done < <(cat ips1)

Note:
If there are any routes missing from the ansible file it will spit them out. You can keep running this until the list shows no results, minus any gateway ips of course.

Example:

[root@ansibleserver]# while read a b c d e; do if [[ $(grep -w $a ansiblefile) ]]; then :; else echo $a $b $c $d $e; fi ; done < <(cat ips1)
10.168.142.0/24
10.222.100.0/24
10.222.110.0/24
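If you end up running the compare often, the one-liner can be wrapped in a small function. The version below is a sketch along the same lines, with one deliberate change: it uses grep -F so the dots in the route are matched literally instead of as "any character". The function name missing_routes and the file names in the usage line are just the ones from this example.

```shell
#!/bin/bash
# missing_routes: hypothetical helper for illustration.
# Usage: missing_routes ROUTES_FILE YAML_FILE
# Prints every route from ROUTES_FILE (first column) that does not appear in YAML_FILE.
missing_routes() {
  local routes_file="$1" yaml_file="$2" route _rest
  while read -r route _rest; do
    [ -n "$route" ] || continue   # skip blank lines
    # -F: fixed string (dots are not wildcards), -w: whole word, -q: no output
    if ! grep -qwF -- "$route" "$yaml_file"; then
      echo "$route"
    fi
  done < "$routes_file"
}
```

Usage: missing_routes ips1 ansiblefile. No output means every route in ips1 was found in the yaml.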

By Nick Tailor


admin | June 2, 2020

How to change the currently active slave of a bonded interface

RedHat / CentOS :

Interface bonding, as we all know, is very useful for providing fault tolerance and increased bandwidth. We can change the active slave interface of a bond without interrupting production work. In the example below we have the bond bond0 with 2 slaves, em0 and em1 (em0 being the active slave). We will make em1 the active slave and then replace slave em0 with a new slave, em2.

# cat /proc/net/bonding/bond0
Ethernet Channel Bonding Driver: v3.7.1 (April 27, 2011)

Bonding Mode: fault-tolerance (active-backup)
Primary Slave: None
Currently Active Slave: em0
MII Status: up
MII Polling Interval (ms): 100
Up Delay (ms): 5000
Down Delay (ms): 5000

Slave Interface: em0
MII Status: up
Speed: 1000 Mbps
Duplex: full
Link Failure Count: 0
Permanent HW addr: 00:21:28:b2:65:26
Slave queue ID: 0

Slave Interface: em1
MII Status: up
Speed: 1000 Mbps
Duplex: full
Link Failure Count: 0
Permanent HW addr: 00:21:28:b2:65:27
Slave queue ID: 0

1. Change the active slave to em1

The ifenslave command can be used to attach, detach or change the currently active slave interface of a bond. Change the active slave interface to em1:

# ifenslave -c bond0 em1

Check the bonding status again to ensure that em1 is the new active slave :

# cat /proc/net/bonding/bond0
Ethernet Channel Bonding Driver: v3.7.1 (April 27, 2011)

Bonding Mode: fault-tolerance (active-backup)
Primary Slave: None
Currently Active Slave: em1
MII Status: up
MII Polling Interval (ms): 100
Up Delay (ms): 5000
Down Delay (ms): 5000

Slave Interface: em0
MII Status: up
Speed: 1000 Mbps
Duplex: full
Link Failure Count: 0
Permanent HW addr: 00:3b:26:b2:68:26
Slave queue ID: 0

Slave Interface: em1
MII Status: up
Speed: 1000 Mbps
Duplex: full
Link Failure Count: 0
Permanent HW addr: 00:3b:26:b2:68:27
Slave queue ID: 0

The switch of active slave should take effect immediately, but on critical production systems, please schedule a maintenance window or test it in an identical test environment first.

2. Attach the new slave interface

We can now attach the new slave interface em2 to the bonding.

# ifenslave bond0 em2

3. Detach the old slave interface

Once we have attached the new slave interface, we can detach the old slave and remove it from the bond.

# ifenslave -d bond0 em0

4. Verify

Confirm that the new slave is now the standby interface in the bonding.

# cat /proc/net/bonding/bond0
Ethernet Channel Bonding Driver: v3.7.1 (April 27, 2011)

Bonding Mode: fault-tolerance (active-backup)
Primary Slave: None
Currently Active Slave: em1
MII Status: up
MII Polling Interval (ms): 100
Up Delay (ms): 5000
Down Delay (ms): 5000

Slave Interface: em1
MII Status: up
Speed: 1000 Mbps
Duplex: full
Link Failure Count: 0
Permanent HW addr: 00:21:29:bf:55:30
Slave queue ID: 0

Slave Interface: em2
MII Status: up
Speed: 1000 Mbps
Duplex: full
Link Failure Count: 0
Permanent HW addr: 00:19:1a:d1:43:61
Slave queue ID: 0
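The status file is long, and for this procedure you really only need two things from it: which slave is active and which slaves are in the bond. The two small functions below are a sketch of how to pull those out. Their names are made up for this example, and they assume the file layout shown above.

```shell
#!/bin/bash
# bond_active_slave / bond_slaves: hypothetical helpers for illustration.
# Both take the path of a bonding status file such as /proc/net/bonding/bond0.

# Prints the interface named on the "Currently Active Slave" line
bond_active_slave() {
  awk -F': ' '/^Currently Active Slave:/ { print $2 }' "$1"
}

# Prints every interface named on a "Slave Interface" line, one per line
bond_slaves() {
  awk -F': ' '/^Slave Interface:/ { print $2 }' "$1"
}
```

For example, bond_active_slave /proc/net/bonding/bond0 after step 1 should print em1, and bond_slaves /proc/net/bonding/bond0 after step 3 should list em1 and em2.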

If you want to make the changes permanent

The changes we just made are temporary and will be lost after a reboot of the server. To make them permanent we have to make a few more changes.

Delete the file /etc/sysconfig/network-scripts/ifcfg-em0, as we are no longer using this interface in the bond, and create a new file for the new slave interface :

# rm /etc/sysconfig/network-scripts/ifcfg-em0

# vi /etc/sysconfig/network-scripts/ifcfg-em2
DEVICE=em2
BOOTPROTO=none
ONBOOT=yes
MASTER=bond0
SLAVE=yes


admin | May 27, 2020

How to fix the infiniband issue when migrating multiple bonded nics to redhat 7

Okay, so some of you may be using Mellanox FPGA cards, which basically bypass the BUS to give lower latency on your network response time.

Say you have used an OS like SUSE with a butt load of bonded nics, and you want to migrate the OS and all the bonded nic configurations in an automated fashion using ansible or some other configuration management tool.

What some of you might run into is that when the OS comes up for the first time, some of the Mellanox nics boot up in infiniband mode, which results in the bonded nics showing up as down. I will show you how to determine this and fix it.

So the first thing you want to do is determine which bonds are showing down

How to check which bonds are down.

1.grep -c down /proc/net/bonding/*

◦ this lists every bond with a count of the lines that contain “down”; anything above 0 means an interface is down

Example

root@ansibleclient:~> grep -c down /proc/net/bonding/*

/proc/net/bonding/bond1:0

/proc/net/bonding/bond2:0

/proc/net/bonding/bond3:1 (this indicates that one interface is down)

2.Once you have determined that a bond has an interface that is down, you want to figure out if it’s the Mellanox nic.

• cat /proc/net/bonding/bond3

i.this will give you the mac addresses of the nics inside the bond.

Example

Bonding Mode: fault-tolerance (active-backup)

Primary Slave: None

Currently Active Slave: eth4

MII Status: up

MII Polling Interval (ms): 100

Up Delay (ms): 0

Down Delay (ms): 0

Slave Interface: eth4

MII Status: up

Speed: 10000 Mbps

Duplex: full

Link Failure Count: 0

Permanent HW addr: 00:02:c9:e9:e9:11

Slave queue ID: 0

Slave Interface: eth5

MII Status: up

Speed: 10000 Mbps

Duplex: full

Link Failure Count: 0

Permanent HW addr: 00:02:c9:e9:e9:12

Slave queue ID: 0

Ethernet Channel Bonding Driver: v3.7.1 (April 27, 2011)

3.Next, run ‘ip a’ and see if those interfaces are listed

Example – it should look something like this. If you don’t see the down nic here (for our example let’s say it’s eth5), this could mean it’s in infiniband mode and not ethernet mode. The output also shows whether each interface is up or down, which is very important when troubleshooting the interface.

[root@nickansible]# ip a

1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000

link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00

inet 127.0.0.1/8 scope host lo

valid_lft forever preferred_lft forever

2: enp0s3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000

link/ether 08:00:26:9a:33:59 brd ff:ff:ff:ff:ff:ff

inet 10.0.2.15/24 brd 10.0.2.255 scope global noprefixroute dynamic enp0s3

valid_lft 82770sec preferred_lft 82770sec

3: enp0s8: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000

link/ether 08:00:26:88:5a:fd brd ff:ff:ff:ff:ff:ff

inet 192.168.1.11/24 brd 192.168.1.255 scope global noprefixroute dynamic enp0s8

valid_lft 82773sec preferred_lft 82773sec

4.Okay, now we need to determine if eth5 is in fact the Mellanox card. For that we need the nic information

• ethtool -i eth5

Example.
It will look something like this.

[root@nick ansible]# ethtool -i eth5

driver: e1000

version: 7.3.21-k8-NAPI

firmware-version:

expansion-rom-version:

bus-info: 0000:00:18.0 (this is the important info you need)

supports-statistics: yes

supports-test: yes

supports-eeprom-access: yes

supports-register-dump: yes

supports-priv-flags: no

• Now you want to take the bus info and determine if it is in fact the Mellanox card

◦ lspci -s 0000:00:18.0

Example

[root@nick ansible]# lspci -s 0000:00:18.0

00:18.0 Ethernet controller: Mellanox Technologies MT26448 [ConnectX EN 10GigE, PCIe 2.0 5GT/s]

5.Okay, now we know for sure that the nic that is down is in fact the Mellanox nic. We want to manually force it into ethernet mode, but first check what mode it reports (use the bus-info of your own card in the paths below)

• cat /sys/bus/pci/devices/0000\:18\:00.0/mlx4_port0

ii.if this doesn’t return “ETH” then it’s in infiniband mode

• cat /sys/bus/pci/devices/0000\:18\:00.0/mlx4_port1

iii.if this doesn’t return “ETH” then it’s in infiniband mode

6.Now what we want to do is manually change the nic to ethernet mode

• echo eth > /sys/bus/pci/devices/0000\:18\:00.0/mlx4_port1

iv.If you cat them now it should say “ETH”

Okay, so now when you do ‘ip a’ you should see the nics up, and if you check the status of the bonds there should be 0 interfaces down. You might have to bring the interface down and up.

7.You can do this simply by

• ifdown eth5 && ifup eth5

v.If there are no errors, the cursor will simply move to the next line after a brief delay.
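The check from steps 5 and 6 can be wrapped up so you don’t have to cat each port file by hand. This is a rough sketch only. It assumes the port files are named mlx4_port<N> inside the card’s PCI device directory, as in the paths above, and the function name ports_not_eth is made up for this example.

```shell
#!/bin/bash
# ports_not_eth: hypothetical helper for illustration.
# Usage: ports_not_eth /sys/bus/pci/devices/<bus-info>
# Prints every mlx4_port* file in that directory whose content does not mention "eth".
ports_not_eth() {
  local dev_dir="$1" f
  for f in "$dev_dir"/mlx4_port*; do
    [ -r "$f" ] || continue        # skip if the glob matched nothing readable
    if ! grep -qi 'eth' "$f"; then
      echo "$f"
    fi
  done
}
```

Any path it prints is a port you would still need to switch with the echo eth command from step 6. No output means every port it found already reports ethernet mode.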

Now the issue here is what happens if you aren’t able to get rpms from Mellanox that are supported by patching in your organisation. You’re going to need a way to ensure that the nic starts up in ethernet mode whenever the server reboots, otherwise you could be in a very bad situation if the server boots and the nic comes up in infiniband mode.

So there are a couple of ideas I came up with to solve this.

Options:

1.You can simply add the echo line to /etc/rc.local

• echo eth > /sys/bus/pci/devices/0000\:18\:00.0/mlx4_port1

i.This should bring the interface back to “ETH”, however you might need to add some more lines to bring the interface up properly.

2.This is the approach I chose and the cooler way to go about it. In redhat 7 you can define an ifup-pre-local script, which will run any time “ifup” is run.

Here is how you set that up.

1.Create a file called “/etc/sysconfig/network-scripts/ifup-pre-local”

a.vi /etc/sysconfig/network-scripts/ifup-pre-local

2.Now you can add whatever script you want. My colleague and I came up with a script that looks at the mac and bus info, and if certain buses and macs show up it runs the echo to move the ports into eth mode

Add this inside and save the file

#!/bin/bash

# ID of the previous match, used to count ports on the same card
LID="00:00:00:00"

# Walk every interface config file and pull out its HWADDR
for i in `ls /etc/sysconfig/network-scripts/ifcfg-* 2> /dev/null`
do
  for j in `grep HWADDR $i | awk -F\" '{print $2}'`
  do
    ID1=$(echo $j | awk -F\: '{print $2":"$3}')
    ID2=$(echo $j | awk -F\: '{print $4":"$5}')
    ID="$ID1:$ID2"
    PORT=$(echo $j | cut -c 16-17)

    # Look for a card that is still showing up as an infiniband (ib) interface
    for k in `ls /sys/bus/pci/devices/0000\:*\:00.0/net/ib[0-9]/address 2> /dev/null`
    do
      grep "$ID1.*$ID2" $k 1> /dev/null
      if [ $? -eq 0 ]; then
        # First port on a new card is port 1, the next match on the same card is port 2
        if [ "x$ID" != "x$LID" ]; then
          mlxport=1
        else
          let "mlxport++"
        fi
        LID=$ID
        p=$(echo $k | awk -F/ '{print "/sys/bus/pci/devices/"$6"/"}')
        echo "Running: echo eth > ${p}mlx4_port${mlxport}"
        echo eth > ${p}mlx4_port${mlxport}
      fi
    done
  done
done

3.Next you want to create a symlink inside /sbin

b.Move into /sbin

i.cd /sbin

c.now create a symlink for ifup-pre-local

ii.ln -s /etc/sysconfig/network-scripts/ifup-pre-local ifup-pre-local

Note: the script also needs to be executable (chmod +x /etc/sysconfig/network-scripts/ifup-pre-local), otherwise ifup will skip it.

Now when you run ifup it will run that script, which checks whether any of those buses and macs are in infiniband mode and brings them into eth. It is safer to do it this way, because if you restart the network and for some reason the nic goes back into infiniband, someone new would have no idea and would spend a while trying to figure it out.

How to deploy this fix via an ansible role coming soon...


admin | March 27, 2020

How to build a server using kickstart satellite 6.x

Note: This document assumes that your capsule servers are already configured, your dhcpd service is running and your subnets have already been added to the config.

Manual process:

HOST TAB

1.On the top menu bar click on the HOSTS

a.Create hosts

Under Create hosts there are a bunch of tabs that need to be filled out.

Name * (This is the name of your vm) – “nick.test1.com”
This value is used also as the host’s primary interface name.

Organisation * Which ever ORG which want the host to live in (LCH)

Location * london

Host Group – We will do this later; for now just choose an existing non-prod group.

Deploy on – Bare Metal

Lifecycle Environment Non-Prod

Content View – Select a content view that exists, check under content view

Content Source – leave blank

Interfaces TAB

Type : Interface

MAC address : Grab the MAC address from vCenter, or log in to the existing OS and get the interface MAC address

Device identifier : eno16780032

DNS name : “nick.test1.com”

Domain : nicktailor.com

IPv4 Subnet: the subnet the VLAN lives on (this is set up on the capsule server)
nick-10.61.120.0-26(10.61.120.0/26)

IPv6 Subnet

IPv4 address : 10.61.120.45

Managed (checked)

Primary (checked)

Provision (checked)

Remote execution (checked)
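Before saving the interface, it is worth checking that the IPv4 address really falls inside the subnet you picked. A small sketch of that check in shell arithmetic; the function names are made up, and the values are the ones from this example.

```shell
#!/bin/bash
# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() {
    local IFS=.
    # Split the address on dots into $1..$4
    set -- $1
    echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
}

# Return 0 if address $1 lies inside CIDR block $2 (e.g. 10.61.120.0/26).
ip_in_subnet() {
    local ip=$1
    local net=${2%/*}
    local bits=${2#*/}
    local mask
    mask=$(( (0xFFFFFFFF << (32 - bits)) & 0xFFFFFFFF ))
    [ $(( $(ip_to_int "$ip") & mask )) -eq $(( $(ip_to_int "$net") & mask )) ]
}

# Example use:
#   ip_in_subnet 10.61.120.45 10.61.120.0/26 && echo "inside" || echo "outside"
```

A /26 covers 64 addresses, so 10.61.120.0/26 runs from 10.61.120.0 to 10.61.120.63 and 10.61.120.45 is inside it, while 10.61.120.64 would not be.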

Operating System TAB

Architecture * :x86_64

Operating system *: RHEL Server 7.4

Media Selection: Synced Content / All Media

Select the installation media that will be used to provision this host. Choose ‘Synced Content’ for Synced Kickstart Repositories or ‘All Media’ for other media.

Media *: RHE7-cap01 (this is where the repositories live)

Partition table *: RHEL7-TESTING (make sure this is attached to a host group and operating system, under HOSTS & CONFIGURE)

PXE loader : PXELinux BIOS (this is for the PXE Boot)

Custom partition table (leave blank unless you want to override)

Whatever text (or ERB template) you use in here will be used as your OS disk layout options. If you want to use the partition table option, delete all of the text from this field.

Root password : password

Password must be 8 characters or more

Parameters TAB

Puppet class parameters

Puppet class Name Value Omit

Global parameters:

Capsule : nick-cap01.com

Activation_keys: RHEL7-2017-12-PROD

nick-cap01.com
kt_activation_keys: RHEL7-2017-12-Prod

(if you override the default key it shows up below)

puppet_server : nick-pup02.com

Host parameters:

Name Value Actions

kt_activation_keys

RHEL7-2017-12-Non-Prod (nonprod)

Additional Information TAB

Owned by: Nick Tailor

Enabled: Include this host within satellite reporting (check this)

Hardware Model

Comment: Blank

Next Step – Create a hostgroup

Under Configure select Host Groups (you need a host group for your deployment to work properly; without it, it will not work)

Note: Generally it's easier to clone an existing host group, change the name and edit the settings to save you time. However, for the purposes of this document, we are going to go through the whole process.

1.Click on Create Host Group (Top right)

Host Group Tab

Parent

Name *: Nick-hax0r-servers (Project name – servers)

Lifecycle Environment: NON-PROD (make sure you have lifecycle environment configured)

Content View : RHEL7-2019-03 (Make sure to select a content view that exists. You can go to Content Views to see which ones exist, then copy and paste the name exactly)

Content Source: nick-cap01.com (This is the capsule server where the content for the repositories exists for the dev environment, and where the subnets are defined that these project servers can DHCP from for PXE boot)

Puppet Environment: Non_Production_RHEL7_2019_03_127
Note: (Define this if you have a Puppet environment configured with Satellite. If you do, your Puppet environment will need to match this content view)

Compute profile : Blank

Puppet Master: Blank

Puppet CA: Blank

OpenSCAP Capsule : Blank

Note: (This is good for pulling server information and vulnerabilities)

Network TAB

Domain: nicktailor.com

IPv4 Subnet: NTC-10.61.120.0-26(10.61.120.0/26)
Note: (These subnets are defined in satellite under Infrastructure and then Subnets)

IPv6 : No Subnet

Realm: Blank

Operating System TAB

Architecture: x86_64

Operating system * : RHEL Server 7.4
(Note: This section is very important. You will need to attach the partition table to the operating system under Hosts and Operating System. If you do not, then when you make your provision template this host group will not be able to see the partition table you created when you choose the OS you want to deploy.)

Media Selection: Synced Content / All Media

Select the installation media that will be used to provision this host. Choose ‘Synced Content’ for Synced Kickstart Repositories or ‘All Media’ for other media.

Media *: RHEL7-nick-cap01

Partition table *: RHEL7-Testing
(Note: This is created under HOSTS and Partition Table)

PXE loader: Blank

Root password: Password (set this for your server to desired setting)

Parameters TAB

Global Parameters

Host group parameters:

Name: Value:

Capsule nick-cap01.com

puppet_server nick-pup02.com
Note: (You only need to define this if you have a Puppet server environment configured)

Locations TAB

Under Selected Items:

Add London

Organizations TAB

Under Selected Items:
Add organizations you want to have access to the host group
ADD: LCH

Activation Keys TAB

Activation keys: RHEL7-2017-12-Non-Prod (this key defines which organization, host group, repositories and lifecycle environment the host initially gets registered with. You can manually change these settings afterwards, but it's probably good to make a proper key to save you lots of time.)

Next Step – Create Partition Table
HOSTS and Partition Tables

1.Click on Create Partition Table
(Note: It's generally better to clone an existing table and edit it as needed; however, for the purposes of this doc, we will go through the settings. You will also need to add this table to your operating system under Hosts and Operating System for the provision template to work properly.)
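If you would rather script this than click through the tab, the same key can be set as the kt_activation_keys parameter with hammer. This is a sketch, not the way this document's environment was built: it assumes the hammer CLI is installed and already authenticated, the wrapper name is made up, and the host group and key names are the ones from this example. With DRY_RUN=1 it only prints the command so you can review it first.

```shell
#!/bin/bash
# Hypothetical wrapper: set kt_activation_keys on a host group with hammer.
# With DRY_RUN=1 the command is printed instead of being run.
set_hostgroup_activation_key() {
    # Reuse the positional parameters to hold the full command line
    set -- hammer hostgroup set-parameter \
        --hostgroup "$1" \
        --name kt_activation_keys \
        --value "$2"
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Example use:
#   DRY_RUN=1 set_hostgroup_activation_key Nick-hax0r-servers RHEL7-2017-12-Non-Prod
```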

Template TAB

Name * : GTP-RHEL7-Testing (Name your partition table scheme)

Default

Default templates are automatically added to new organisations and locations

Snippet

Operating system family: RED HAT

Input:

Note: This is a standard lvm setup using ext4 for the OS. If you are going to use dual boot, then you want to change the first 3 lines

zerombr
clearpart --drives=sda --all --initlabel
part /boot --fstype ext4 --size=1024 --asprimary --ondisk=sda
part pv.00 --size=1 --grow --asprimary --ondisk=sda
volgroup vgroot pv.00
logvol / --name=lv_root --vgname=vgroot --size=15360 --fstype ext4
logvol swap --name=lv_swap --vgname=vgroot --size 6144 --fstype swap
logvol /var --name=lv_var --vgname=vgroot --size 10240 --fstype ext4
logvol /opt --name=lv_opt --vgname=vgroot --size 10240 --fstype ext4
logvol /var/tmp --name=lv_var_tmp --vgname=vgroot --size 5120 --fstype ext4 --fsoptions=nodev,nosuid,noexec
logvol /var/log --name=lv_var_log --vgname=vgroot --size 5120 --fstype ext4
logvol /var/log/audit --name=lv_var_log_audit --vgname=vgroot --size 2048 --fstype ext4
logvol /var/coredumps --name=lv_crash --vgname=vgroot --size 16384 --fstype ext4
logvol /tmp --name=lv_tmp --vgname=vgroot --size 5120 --fstype ext4 --fsoptions=nodev,nosuid,noexec
logvol /home --name=lv_home --vgname=vgroot --size 5120 --fstype ext4 --fsoptions=nodev
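The sizes in this table are in MiB and add up to 81920 (1024 for /boot plus 80896 across the logical volumes), so the target disk needs a bit more than 80 GiB. If you edit the sizes, a small helper can redo the sum for you. This is a sketch with a made-up function name: it assumes the options are typed with two plain hyphens, it skips the pv.00 line because that partition grows to fill the disk, and it only understands --size (not --percent or --grow on a logvol).

```shell
#!/bin/bash
# Hypothetical helper: add up the fixed --size values (MiB) of part and
# logvol lines in a kickstart partition snippet (files or stdin).
ks_required_mib() {
    awk '
        # The LVM physical volume grows to fill the disk, so leave it out
        $1 == "part" && $2 ~ /^pv\./ { next }
        $1 == "part" || $1 == "logvol" {
            for (i = 3; i <= NF; i++) {
                if ($i ~ /^--size=/) {          # form: --size=1024
                    sub(/^--size=/, "", $i)
                    total += $i
                } else if ($i == "--size") {    # form: --size 1024
                    total += $(i + 1)
                }
            }
        }
        END { print total + 0 }
    ' "$@"
}

# Example use, with the table saved to a file of your choosing:
#   ks_required_mib partition-table.txt
```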

Dual Boot template:

Note: Change the drive designation from sda to sdx (x being whatever the new drive designation is). In the example below it's /dev/sdc

clearpart --drives=sdc --all --initlabel
part /boot --fstype ext4 --size=1024 --asprimary --ondisk=sdc
part pv.00 --size=1 --grow --asprimary --ondisk=sdc