Author: admin

How to do a full restore if you wiped all your LVM’s

I’m sure some of you have had the wonderful opportunity to experience losing all your LVM info in error. Well, all is not lost and there is hope. I will show ya how to restore it.

The beauty of LVM is that it naturally creates a backup of the volume group metadata (the layout of your Logical Volumes) in the following location.

  • /etc/lvm/archive/

Now if you had just wiped out your LVM and it was simply using one physical disk for all your LVMs, you could do a full restore with the following.

      • vgcfgrestore -f /etc/lvm/archive/(archive file to restore) (destination volumegroup)
        o    (ie.) vgcfgrestore -f /etc/lvm/archive/vg_dev1_006.000001.vg vg_dev

If you had multiple disks attached to your volume group, then you need to do a couple more things to be able to do a restore.

  • Cat the archive file /etc/lvm/archive/whatevervolumegroup.vg; you should see something like below

physical_volumes {

                        pv0 {
                                    id = "ecFWSM-OH8b-uuBB-NVcN-h97f-su1y-nX7jA9"
                                    device = "/dev/sdj"         # Hint only
                                    status = ["ALLOCATABLE"]
                                    flags = []
                                    dev_size = 524288000    # 250 Gigabytes
                                    pe_start = 2048
                                    pe_count = 63999          # 249.996 Gigabytes
                        }
}

 

You will need to recreate every physical volume listed in that .vg file, with its original UUID, before the volume group can be restored.

    • pvcreate --restorefile /etc/lvm/archive/vgfilename.vg --uuid <UUID> <DEVICE>

      (IE) pvcreate --restorefile /etc/lvm/archive/vg_data_00122-1284284804.vg --uuid ecFWSM-OH8b-uuBB-NVcN-h97f-su1y-nX7jA9 /dev/sdj
  • Repeat this step for all the physical volumes in the archive vg file until they have all been created.

Once you have completed the above step you should be able to restore the volume groups that were wiped

    • vgcfgrestore -f /etc/lvm/archive/(archive file to restore) (destination volumegroup)

      o (ie.) vgcfgrestore -f /etc/lvm/archive/vg_dev1_006.000001.vg vg_dev
  • Running the commands vgdisplay and pvdisplay should show you that everything is back the way it should be
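
If the volume group had a lot of disks, typing one pvcreate per UUID gets tedious. Here is a minimal sketch of how the commands could be generated from the archive file. The helper name print_pvcreate_cmds is made up for this example, and it only prints the commands (a dry run). It assumes the archive has the usual layout with physical_volumes listed before logical_volumes. The device name in the archive is only a hint, so check each one before running anything.

```shell
#!/bin/sh
# Dry run: print one pvcreate command per physical volume found in an
# LVM archive file. Nothing is executed; review the output first.
# print_pvcreate_cmds is a hypothetical helper name, not an LVM tool.
print_pvcreate_cmds() {
    archive="$1"
    awk -v file="$archive" '
        # Only look at id/device lines inside the physical_volumes block,
        # so the volume group id and logical volume ids are ignored.
        /^[ \t]*physical_volumes[ \t]*[{]/ { in_pv = 1; next }
        /^[ \t]*logical_volumes[ \t]*[{]/  { in_pv = 0 }
        in_pv && /^[ \t]*id[ \t]*=/ {
            split($0, parts, "\"")
            uuid = parts[2]
        }
        in_pv && /^[ \t]*device[ \t]*=/ {
            split($0, parts, "\"")
            printf "pvcreate --restorefile %s --uuid %s %s\n", file, uuid, parts[2]
        }
    ' "$archive"
}
```

Call it with the path to the archive file, read through what it prints, and only then paste the lines you are happy with.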

If you have questions email nick@nicktailor.com

Cheers

 

How to setup NFS server on Centos 6.x

Setup NFS Server in CentOS / RHEL / Scientific Linux 6.3/6.4/6.5

1. Install NFS in Server

  • [root@server ~]# yum install nfs* -y

2. Start NFS service

  • [root@server ~]# /etc/init.d/nfs start

Starting NFS services:                                     [  OK  ]

Starting NFS mountd:                                       [  OK  ]

Stopping RPC idmapd:                                       [  OK  ]

Starting RPC idmapd:                                       [  OK  ]

Starting NFS daemon:                                       [  OK  ]

  • [root@server ~]# chkconfig nfs on

3. Install NFS in Client

  • [root@vpn client]# yum install nfs* -y

4. Start NFS service in client

  • [root@vpn client]# /etc/init.d/nfs start

Starting NFS services:                                     [  OK  ]

Starting NFS quotas:                                       [  OK  ]

Starting NFS mountd:                                       [  OK  ]

Stopping RPC idmapd:                                       [  OK  ]

Starting RPC idmapd:                                       [  OK  ]

Starting NFS daemon:                                       [  OK  ]

  • [root@vpn client]# chkconfig nfs on

5. Create shared directories in server

Let us create a shared directory called ‘/home/nicktailor’ on the server and let the client users read and write files in the ‘/home/nicktailor’ directory.

  • [root@server ~]# mkdir /home/nicktailor
  • [root@server ~]# chmod 755 /home/nicktailor/

6. Export shared directory on server

Open /etc/exports file and add the entry as shown below

  • [root@server ~]# vi /etc/exports
  • add the following below
  • /home/nicktailor 192.168.1.0/24(rw,sync,no_root_squash,no_all_squash)

where,

 /home/nicktailor  – shared directory

192.168.1.0/24      – IP address range of clients allowed to access the shared folder

rw                          – Make the shared folder writable

sync                       – Reply to client requests only after the changes have been written to disk

no_root_squash   – Do not map the client root user to the anonymous user; root on the client keeps root privileges on the share (it can read, write and delete the files in the shared directory)

no_all_squash     – Do not map the other users to the anonymous user; client UIDs and GIDs are kept as they are

Now restart the NFS service.

  • [root@server ~]# /etc/init.d/nfs restart

Shutting down NFS daemon:                                  [  OK  ]

Shutting down NFS mountd:                                  [  OK  ]

Shutting down NFS services:                                [  OK  ]

Starting NFS services:                                     [  OK  ]

Starting NFS mountd:                                       [  OK  ]

Stopping RPC idmapd:                                       [  OK  ]

Starting RPC idmapd:                                       [  OK  ]

Starting NFS daemon:                                       [  OK  ]

7. Mount shared directories in client

Create a mount point to mount the shared directories of server.

To do that create a directory called ‘/nfs/shared’ (You can create your own mount point)

  • [root@vpn client]# mkdir -p /nfs/shared

Now mount the shared directories from server as shown below

  • [root@vpn client]# mount -t nfs 192.168.1.200:/home/nicktailor/ /nfs/shared/

The mount may hang for a while and then fail with a connection timed out error, as it did for me. Well, don’t panic, the firewall on the server is probably blocking the clients from mounting the share. Either stop iptables to rectify the problem or, better, allow the NFS service ports through iptables.

To do that, open the /etc/sysconfig/nfs file and uncomment the port lines: RQUOTAD_PORT, LOCKD_TCPPORT, LOCKD_UDPPORT, MOUNTD_PORT, STATD_PORT and STATD_OUTGOING_PORT.

  • [root@server ~]# vi /etc/sysconfig/nfs

#
# Define which protocol versions mountd
# will advertise. The values are "no" or "yes"
# with yes being the default
#MOUNTD_NFS_V2="no"
#MOUNTD_NFS_V3="no"
#
#
# Path to remote quota server. See rquotad(8)
#RQUOTAD="/usr/sbin/rpc.rquotad"
# Port rquotad should listen on.
RQUOTAD_PORT=875
# Optinal options passed to rquotad
#RPCRQUOTADOPTS=""
#
#
# Optional arguments passed to in-kernel lockd
#LOCKDARG=
# TCP port rpc.lockd should listen on.
LOCKD_TCPPORT=32803
# UDP port rpc.lockd should listen on.
LOCKD_UDPPORT=32769
#
#
# Optional arguments passed to rpc.nfsd. See rpc.nfsd(8)
# Turn off v2 and v3 protocol support
#RPCNFSDARGS="-N 2 -N 3"
# Turn off v4 protocol support
#RPCNFSDARGS="-N 4"
# Number of nfs server processes to be started.
# The default is 8.
#RPCNFSDCOUNT=8
# Stop the nfsd module from being pre-loaded
#NFSD_MODULE="noload"
# Set V4 grace period in seconds
#NFSD_V4_GRACE=90
#
#
#
# Optional arguments passed to rpc.mountd. See rpc.mountd(8)
#RPCMOUNTDOPTS=""
# Port rpc.mountd should listen on.
MOUNTD_PORT=892
#
#
# Optional arguments passed to rpc.statd. See rpc.statd(8)
#STATDARG=""
# Port rpc.statd should listen on.
STATD_PORT=662
# Outgoing port statd should used. The default is port
# is random
STATD_OUTGOING_PORT=2020
# Specify callout program
#STATD_HA_CALLOUT="/usr/local/bin/foo"
#
#
# Optional arguments passed to rpc.idmapd. See rpc.idmapd(8)
#RPCIDMAPDARGS=""
#
# Set to turn on Secure NFS mounts.
#SECURE_NFS="yes"
# Optional arguments passed to rpc.gssd. See rpc.gssd(8)
#RPCGSSDARGS=""
# Optional arguments passed to rpc.svcgssd. See rpc.svcgssd(8)
#RPCSVCGSSDARGS=""
#
# To enable RDMA support on the server by setting this to
# the port the server should listen on
#RDMA_PORT=20049

Now restart the NFS service

  • [root@server ~]# /etc/init.d/nfs restart

Shutting down NFS daemon:                                  [  OK  ]

Shutting down NFS mountd:                                  [  OK  ]

Shutting down NFS services:                                [  OK  ]

Starting NFS services:                                     [  OK  ]

Starting NFS mountd:                                       [  OK  ]

Stopping RPC idmapd:                                       [  OK  ]

Starting RPC idmapd:                                       [  OK  ]

Starting NFS daemon:                                       [  OK  ]

Add the NFS rules (ports 2049, 111, 32769, 32803, 892, 875 and 662) to the ‘/etc/sysconfig/iptables’ file, above the final REJECT rules.

  • [root@server ~]# vi /etc/sysconfig/iptables

# Firewall configuration written by system-config-firewall
# Manual customization of this file is not recommended.
*filter
:INPUT ACCEPT [0:0]
:FORWARD ACCEPT [0:0]
:OUTPUT ACCEPT [0:0]
-A INPUT -m state --state NEW -m udp -p udp --dport 2049 -j ACCEPT
-A INPUT -m state --state NEW -m tcp -p tcp --dport 2049 -j ACCEPT
-A INPUT -m state --state NEW -m udp -p udp --dport 111 -j ACCEPT
-A INPUT -m state --state NEW -m tcp -p tcp --dport 111 -j ACCEPT
-A INPUT -m state --state NEW -m udp -p udp --dport 32769 -j ACCEPT
-A INPUT -m state --state NEW -m tcp -p tcp --dport 32803 -j ACCEPT
-A INPUT -m state --state NEW -m udp -p udp --dport 892 -j ACCEPT
-A INPUT -m state --state NEW -m tcp -p tcp --dport 892 -j ACCEPT
-A INPUT -m state --state NEW -m udp -p udp --dport 875 -j ACCEPT
-A INPUT -m state --state NEW -m tcp -p tcp --dport 875 -j ACCEPT
-A INPUT -m state --state NEW -m udp -p udp --dport 662 -j ACCEPT
-A INPUT -m state --state NEW -m tcp -p tcp --dport 662 -j ACCEPT
-A INPUT -m state --state ESTABLISHED,RELATED -j ACCEPT
-A INPUT -p icmp -j ACCEPT
-A INPUT -i lo -j ACCEPT
-A INPUT -m state --state NEW -m tcp -p tcp --dport 22 -j ACCEPT
-A INPUT -j REJECT --reject-with icmp-host-prohibited
-A FORWARD -j REJECT --reject-with icmp-host-prohibited
COMMIT
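
Typing a dozen near-identical rules by hand is where typos creep in. As a rough sketch, the NFS rules can be generated from a list of protocol:port pairs and then pasted into the file. The function name nfs_iptables_rules is made up for this example; the ports are the ones set in /etc/sysconfig/nfs above, so adjust them if yours differ.

```shell
#!/bin/sh
# Print iptables-save style ACCEPT rules, one per proto:port argument.
# nfs_iptables_rules is a hypothetical helper name; it only prints text
# and never touches the running firewall.
nfs_iptables_rules() {
    for pair in "$@"; do
        proto=${pair%%:*}   # text before the colon, e.g. tcp
        port=${pair##*:}    # text after the colon, e.g. 2049
        printf '%s\n' "-A INPUT -m state --state NEW -m $proto -p $proto --dport $port -j ACCEPT"
    done
}
```

For the ports used in this post you would call it with udp:2049 tcp:2049 udp:111 tcp:111 udp:32769 tcp:32803 udp:892 tcp:892 udp:875 tcp:875 udp:662 tcp:662 and paste the output above the REJECT lines.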

Now restart the iptables service

[root@server ~]# service iptables restart

iptables: Flushing firewall rules:                         [  OK  ]

iptables: Setting chains to policy ACCEPT: filter          [  OK  ]

iptables: Unloading modules:                               [  OK  ]

iptables: Applying firewall rules:                         [  OK  ]

Again mount the share from client

  • [root@vpn client]# mount -t nfs 192.168.1.200:/home/nicktailor/ /nfs/shared/

Finally the NFS share is mounted without any connection timed out error.

To verify whether the shared directory is mounted, enter the mount command in client system.

  • [root@vpn client]# mount

/dev/mapper/vg_vpn-lv_root on / type ext4 (rw)

proc on /proc type proc (rw)

sysfs on /sys type sysfs (rw)

devpts on /dev/pts type devpts (rw,gid=5,mode=620)

tmpfs on /dev/shm type tmpfs (rw,rootcontext="system_u:object_r:tmpfs_t:s0")

/dev/sda1 on /boot type ext4 (rw)

none on /proc/sys/fs/binfmt_misc type binfmt_misc (rw)

sunrpc on /var/lib/nfs/rpc_pipefs type rpc_pipefs (rw)

nfsd on /proc/fs/nfsd type nfsd (rw)

192.168.1.200:/home/nicktailor/ on /nfs/shared type nfs (rw,vers=4,addr=192.168.1.200,clientaddr=192.168.1.29)

8. Testing NFS

Now create some files or folders in the ‘/nfs/shared’ directory which we mounted in the previous step.

  • [root@vpn shared]# mkdir test
  • [root@vpn shared]# touch file1 file2 file3

Now go to the server and change to the ‘/home/nicktailor’ directory.

[root@server ~]# cd /home/nicktailor/

  • [root@server nicktailor]# ls

file1  file2  file3  test

  • [root@server nicktailor]#

The files and directories created from the client are now listed on the server. Files created in the directory on the server show up on the client in the same way.

9. Automount the Shares

If you want to mount the shares automatically instead of mounting them manually after every reboot, add the NFS line (the last line shown below) to the ‘/etc/fstab’ file of the client system.

  • [root@vpn client]# vi /etc/fstab 

#

# /etc/fstab

# Created by anaconda on Wed Feb 27 15:35:14 2013

#

# Accessible filesystems, by reference, are maintained under ‘/dev/disk’

# See man pages fstab(5), findfs(8), mount(8) and/or blkid(8) for more info

#

/dev/mapper/vg_vpn-lv_root /                       ext4    defaults        1 1

UUID=59411b1a-d116-4e52-9382-51ff6e252cfb /boot                   ext4    defaults        1 2

/dev/mapper/vg_vpn-lv_swap swap                    swap    defaults        0 0

tmpfs                   /dev/shm                tmpfs   defaults        0 0

devpts                  /dev/pts                devpts  gid=5,mode=620  0 0

sysfs                   /sys                    sysfs   defaults        0 0

proc                    /proc                   proc    defaults        0 0

192.168.1.200:/home/nicktailor /nfs/shared            nfs     rw,sync,hard,intr 0 0
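
A mangled fstab line can leave the client hanging at boot, so it is worth a quick sanity check before you reboot. Below is a small sketch that flags any line whose field count looks wrong. The function name check_fstab_fields is made up for this example; it treats 4 to 6 whitespace-separated fields as acceptable, since the last two fields (dump and pass) are optional.

```shell
#!/bin/sh
# Report fstab lines that do not have between 4 and 6 fields.
# check_fstab_fields is a hypothetical helper name. Exit status is 0
# when every line looks sane and 1 when at least one line is suspect.
check_fstab_fields() {
    awk '
        /^[ \t]*#/ { next }     # skip comment lines
        NF == 0    { next }     # skip blank lines
        NF < 4 || NF > 6 {
            printf "line %d: %d fields: %s\n", NR, NF, $0
            bad = 1
        }
        END { exit bad }
    ' "$1"
}
```

Run it against /etc/fstab on the client after editing. This only counts fields, it does not prove the share will mount, so running mount -a as root afterwards is still a good idea.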

10. Verify the Shares

Reboot your client system and verify whether the share is mounted automatically or not.

  • [root@vpn client]# mount

/dev/mapper/vg_vpn-lv_root on / type ext4 (rw)

proc on /proc type proc (rw)

sysfs on /sys type sysfs (rw)

devpts on /dev/pts type devpts (rw,gid=5,mode=620)

tmpfs on /dev/shm type tmpfs (rw,rootcontext="system_u:object_r:tmpfs_t:s0")

/dev/sda1 on /boot type ext4 (rw)

none on /proc/sys/fs/binfmt_misc type binfmt_misc (rw)

sunrpc on /var/lib/nfs/rpc_pipefs type rpc_pipefs (rw)

192.168.1.200:/home/nicktailor on /nfs/shared type nfs (rw,sync,hard,intr,vers=4,addr=192.168.1.200,clientaddr=192.168.1.29)

nfsd on /proc/fs/nfsd type nfsd (rw)

 

How to setup a NFS server on Debian

DEBIAN SETUP

Make sure you have NFS server support in your server’s kernel (a kernel module named “nfsd.ko” under your /lib/modules/`uname -r`/ directory structure)

$ grep NFSD /boot/config-`uname -r`

or similar (wherever you’ve stashed your config file, for example, perhaps in /usr/src/linux/.config.)

There are at least two mainstream NFS server implementations that people use (excluding those implemented in Python and similar): one implemented in user space, which is slower but easier to debug, and the other implemented in kernel space, which is faster. Below shows the setup of the kernel-space one. If you wish to use the user-space server, then install the similarly-named package.

First, the packages to begin with:

  1.  $ aptitude install nfs-kernel-server portmap

Note that portmap defaults to only listening on 127.0.0.1 (localhost), so if you wish to allow connections from your local network, you need to edit /etc/default/portmap and comment out the “OPTIONS” line. We also need to ensure that the /etc/hosts.allow file allows connections to the portmap port.

2.   Now run the following commands. They edit the portmap configuration file and allow, in hosts.allow, whichever subnet your NFS server is on (192.168.1.x in this example)

      •           $ perl -pi -e 's/^OPTIONS/#OPTIONS/' /etc/default/portmap
      •           $ echo "portmap: 192.168.1." >> /etc/hosts.allow
      •           $ /etc/init.d/portmap restart
      •           $ echo "rpcbind: ALL" >> /etc/hosts.allow

See ‘man hosts.allow’ for examples on the syntax. But in general, specifying only part of the IP address like this (leaving the trailing period) treats the specified IP address fragment as a wildcard, allowing all IP addresses in the range 192.168.1.0 to 192.168.1.255 (in this example.) You can do more “wildcarding” using DNS names, and so on too.

3.   Then, edit the /etc/exports file, which lists the server’s filesystems to export over NFS to client machines. The following example shows the addition of a line which adds the path “/example”, for access by any machine on the local network (here 192.168.1.*).

      •           $ echo "/example 192.168.1.0/255.255.255.0(rw,no_root_squash,subtree_check)" >> /etc/exports
      •           $ /etc/init.d/nfs-kernel-server reload

This tells the server to serve up that path, readable/writable, with clients connecting as root keeping root access instead of being mapped to ‘nobody’, and to use ‘subtree_check’ to silence a warning message. The second command then reloads the server.
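
One catch with echo >> is that running it twice leaves you with a duplicate entry in the exports file. A minimal sketch of a safer append is below. The function name add_export is made up for this example; it only adds the line if no existing entry starts with the same directory.

```shell
#!/bin/sh
# Append an export entry only if the directory is not already listed.
# add_export is a hypothetical helper name.
# usage: add_export <exports file> <directory> <client spec with options>
add_export() {
    exports_file="$1"
    dir="$2"
    spec="$3"
    # The first field of an exports line is the exported directory.
    if awk -v d="$dir" '$1 == d { found = 1 } END { exit !found }' "$exports_file" 2>/dev/null; then
        echo "already exported: $dir" >&2
        return 0
    fi
    printf '%s %s\n' "$dir" "$spec" >> "$exports_file"
}
```

For the example above that would be add_export /etc/exports /example '192.168.1.0/255.255.255.0(rw,no_root_squash,subtree_check)', followed by the same nfs-kernel-server reload.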

4.   On the client machine where you wish to mount the NFS share, type the following

    • $ mount 192.168.1.100:/example /mnt/example

The result should look similar to this (with your own export path and mount point on the last line) if you type

    • $ mount <enter>

/dev/sda3 on / type ext4 (rw,errors=remount-ro)
tmpfs on /lib/init/rw type tmpfs (rw,nosuid,mode=0755)
proc on /proc type proc (rw,noexec,nosuid,nodev)
sysfs on /sys type sysfs (rw,noexec,nosuid,nodev)
udev on /dev type tmpfs (rw,mode=0755)
tmpfs on /dev/shm type tmpfs (rw,nosuid,nodev)
devpts on /dev/pts type devpts (rw,noexec,nosuid,gid=5,mode=620)
/dev/sda1 on /tmp type ext4 (rw)
rpc_pipefs on /var/lib/nfs/rpc_pipefs type rpc_pipefs (rw)
nfsd on /proc/fs/nfsd type nfsd (rw)
192.168.1.100:/nicktest on /mnt/nfs type nfs (rw,nolock,addr=192.168.1.100)

How to move off san boot to local disk with HP servers

How to move off san boot to local disk
===========================
1. Add the disks to the server
Next run rescan-scsi-bus.sh to see if the new disks show up

2. Set up the RAID controller, F8 (HP)
3. Boot off of the System Rescue CD
4. Find the new drive, use fdisk -l
5. Copy the partitions over using dd and reboot to see the new partition table

Examples:

  • dd if=/dev/mapper/3600508b4000618d90000e0000b8f0000 of=/dev/sda bs=1M
    or 
  • dd if=/dev/sdar of=/dev/cciss/c0d0 bs=1M

Reboot, then unpresent the SAN volumes from Virtual Connect or whatever storage interface you’re using.
You need to have the boot-from-SAN volumes disabled in VCEM

6. Make a new swap partition using cfdisk and then run

  • mkswap /dev/cciss/c0d0p9  (this is controller 0, drive 0, partition 9)
  • The size of the swap partition will vary. I used 32000M when I created it in cfdisk; you are free to use fdisk to do this also.

7. Now you need to mount / to a directory, so make an empty directory

  • mkdir /mnt/root
    and mount it, examples below
    mount /dev/sda6 /mnt/root or mount /dev/cciss/c0d0p6 /mnt/root

9. Fix the fstab (cfdisk works great for this if you’re on the System Rescue disk)

  • vi /mnt/root/etc/fstab
  • change /dev/mapper/mpath0p* to /dev/cciss/c0d0p*
  • comment out the old swap volume
  • add the new swap (/dev/cciss/c0d0p9)
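
If there are a lot of entries to change, the device rename can be previewed with sed before you touch the real file. This is only a sketch under the assumptions of this post: the old devices are named /dev/mapper/mpath0pN and the new ones /dev/cciss/c0d0pN with the same partition numbers. The function name rewrite_mpath_devices is made up for this example, and it prints the result instead of editing the file in place.

```shell
#!/bin/sh
# Print a copy of the given file with /dev/mapper/mpath0pN replaced by
# /dev/cciss/c0d0pN. rewrite_mpath_devices is a hypothetical helper name;
# the original file is left untouched.
rewrite_mpath_devices() {
    sed 's#/dev/mapper/mpath0p\([0-9][0-9]*\)#/dev/cciss/c0d0p\1#g' "$1"
}
```

Point it at /mnt/root/etc/fstab, compare the output with the original, and only then save it over the file. The swap line still needs fixing by hand, as the new swap partition number is different.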

10. Next fix /mnt/root/etc/multipath.conf (vi /mnt/root/etc/multipath.conf)

  • uncomment: devnode "^cciss!c[0-9]d[0-9]*"
  • for EMC add:

device {
vendor "EMC"
product "Invista"
product_blacklist "LUNZ"
getuid_callout "/sbin/scsi_id -g -u -s /block/%n"
features "0"
hardware_handler "0"
path_selector "round-robin 0"
path_grouping_policy multibus
rr_weight uniform
no_path_retry 5
rr_min_io 1000
path_checker tur
}

11. Next mount the boot partition

Examples

  • mount /dev/sda1 /mnt/root/boot
    or 
  • mount /dev/cciss/c0d0p1 /mnt/root/boot

12. Edit grub.conf

  • vi /mnt/root/boot/grub/grub.conf
  • change /dev/mapper/mpath0p* to /dev/sda*
    or 
  • change /dev/mapper/mpath0p* to /dev/cciss/c0d0p*

13. Edit device.map

  • vi /mnt/root/boot/grub/device.map
  • change /dev/mapper/mpath0 to /dev/sda
    or 
  • change /dev/mapper/mpath0 to /dev/cciss/c0d0

14. Fix the initrd (do this in an empty scratch directory, since cpio unpacks into the current directory)

  • zcat /mnt/root/boot/initrd-2.6.18-3……. | cpio -i -d
  • edit the file ‘init’
  • change the mkrootdev line to /dev/cciss/c0d0p6 (this is the / partition)
  • change the resume line to /dev/cciss/c0d0p9 (this is the new swap partition)

15. Make a backup of the original initrd

  • mv /mnt/root/boot/initrd-2.6.18-…. /mnt/root/boot/initrd-2.6.18-……backup

16. recompress the new initrd

  • find . | cpio -o -H newc |gzip -9 > /mnt/root/boot/initrd-2.6.18-348.2.1.el5.img

17. Upon reboot, change the boot order in the BIOS settings to use the HP Smart Array controller

18. You may need to create a new initrd using the Red Hat Linux install disc (rescue mode) if the initrd doesn’t boot.

  • chroot to /mnt/sysimage for /
  • then go to the boot partition and
  • mkinitrd -f -v initrd-2.6.18-53.1.4.el5.img 2.6.18-53.1.4.el5

19. reboot

 

 

 

 

How to Upgrade and Downgrade Packages with RHN Satellite 5.0

RHN Satellite Package upgrade and downgrade processes

Listing packages installed or available for upgrading on a host.

  1. Click on systems
    1. Next click on the target hostname
    2. Now click on the software tab
  • If you click on list/remove installed packages, this will show you the packages currently installed on the target host. You can also search for a specific package in the search field above the listed packages.
  • If you click on upgrade packages, this will only list the packages currently available from the channels the host system is subscribed to.

Note: just because you don’t see newer packages available does not mean they are not out there.

 

Package Search on all available channels

 

  1. There are two ways you can do this
    1. Method 1 – Click on Channels at the very top, then package search, next type in the package name
  • Once you have found the package, click on the package name and it will take you to a details screen. On that screen there is an “available from:” section, which lists the channels subscribed to Satellite that provide the package you are looking for.
    2. Method 2 – This is the way I like to do it – Click on systems, then software tab, and then install new packages
  • Next search for the package you wish to install. This will show the latest available package from all channels available, and if you click on the package it will show the available channels for that specific package.

 

Upgrading  packages

  1. Click on systems, select the host, then software tab, and then upgrade
    1. Search for the packages you wish to upgrade and select them by checking the box to the left of each one

Note: I would recommend against using select all. If you click this button it will literally select all the packages available, even those not listed on the current page. Selecting them individually is the way to go.

  2. Once you have finished checking boxes, scroll down to the bottom right and select upgrade packages. It will go to a confirmation screen; click on confirm.
  3. This will then be queued.
  4. If you click on Events, you should see it there, and within about a 5 minute window it should disappear. If it does not, then something is wrong and you need to get hold of the Satellite admin to investigate.

Downgrading  packages

  1. Click on systems, select the host, then software tab, and then profiles
    1. Under “Compare to Stored Profile”, select the stored profile for the date/time you want to go back to and hit compare.
    2. You should see a list of packages that it is now going to sync back to; select sync packages, bottom right.
    3. You should see it go to the events page. After about 5 mins it should no longer be listed in events, which means the server picked up the job and should begin downgrading shortly.

How to patch using RHN Satellite 5.0

Create a roll back tag

.

1.Log into Satellite
2.Click on Systems
a.Now select System Groups
b.Next to the group you wish to patch, click on “Use in SSM”
c.At the top right of the screen click on Manage (you should see the number of machines selected for that group in brackets)
3.Under Provisioning
d.Click on snapshot rollback
e.Now click on the “Tag systems” tab
f.Type in the name of the tag, depending on the group, i.e. (DEV1-Sept26-2013)
g.Click on Tag current snapshots (this will tag the whole group with a rollback tag, should you ever need it)
h.If you needed to roll back, instead of “Tag systems” you would select the “Rollback” tab
i.Now click on Manage again, top right
4.Under Channels
j.Click on Channel memberships
k.Now select Base Channels
l.Change the i386 channel to the latest i386 channel available and do the same for x86_64. You may also notice there are RHN5 & RHN6 channels.
m.Click on confirm subscriptions
n.Then click on Alter subscriptions, bottom right
o.Now select child channels and ensure any child channels you need are subscribed as well (i.e. Clustering storage, Network tools, VMware etc.)
5.Now click on Manage again, ensuring the correct number of servers is still being managed.
p.Click on Schedule errata updates
q.Scroll to the bottom of the screen and select all
r.Click on Apply Errata
s.And now Schedule Updates
6.If you click on Schedule on the top menu, it should show you all the updates running
7.Click on Systems
t.Click on System Groups
u.Select the group you wish to view
v.Click on the “systems” tab inside the systems group
w.Now if you click on the “systems” tab periodically you should see the patching counting down to zero. Any server that is not counting down has an issue and you will need to log in as root to figure out what is wrong. (Refer to Common problems and fixes)

.

Troubleshooting Guide

Errata does not appear to be counting down in systems group

 Log into the culprit server
 Confirm that enabled = 1 is set in the file /etc/yum/pluginconf.d/rhnplugin.conf by typing

cat /etc/yum/pluginconf.d/rhnplugin.conf

If it isn’t set, yum will try to use the local repos, and not the channels on the Satellite server

 If the above doesn’t work, you may want to ensure that you can connect to the Satellite server by running telnet to the Satellite on the following ports
 telnet kam1opapp99.connex.bclc.com 80
 telnet kam1opapp99.connex.bclc.com 443
 telnet kam1opapp99.connex.bclc.com 5222
1.The response for all of these should look like

Trying 10.20.0.8…

Connected to kam1opapp99.connex.bclc.com.

Escape character is ‘^]’.
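
If telnet is not installed on the box, the same three checks can be done from a bash prompt. This is a rough sketch that assumes bash and the coreutils timeout command are available. The function names port_open and check_ports are made up for this example.

```shell
#!/bin/sh
# Return success if a TCP connection to host:port can be opened.
# /dev/tcp is a bash feature, hence the explicit bash -c; timeout(1)
# stops the check from hanging on filtered ports.
# port_open and check_ports are hypothetical helper names.
port_open() {
    timeout 3 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$1" "$2" 2>/dev/null
}

# Print one status line per port for the given host.
check_ports() {
    host="$1"
    shift
    for port in "$@"; do
        if port_open "$host" "$port"; then
            echo "$host:$port open"
        else
            echo "$host:$port CLOSED or filtered"
        fi
    done
}
```

For the Satellite server above you would run check_ports kam1opapp99.connex.bclc.com 80 443 5222 and expect all three to come back open.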

 

 Next run yum -y update and watch for errors
 A common error is “cpio: open failed – Permission denied” or something similar
2.This usually means you have a mount point that is read only
3.Type mount at the command prompt to see if that is the case.

[root@kam1odapp19<dev>:~]# mount

/dev/mapper/vg_local-root on / type ext3 (rw)

proc on /proc type proc (rw)

sysfs on /sys type sysfs (rw)

devpts on /dev/pts type devpts (rw,gid=5,mode=620)

/dev/mapper/vg_local-usr on /usr type ext3 (rw,nodev)

/dev/mapper/vg_local-tmp on /tmp type ext3 (rw,noexec,nosuid,nodev)

/dev/mapper/vg_local-home on /home type ext3 (rw,nodev)

 If you see (ro,nodev) instead of (rw,nodev) on the /usr mount

(this means the partition is read only and yum cannot write updates to the /usr directory)

 To fix, type mount -o remount,rw /usr
 And run yum -y update again.
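
On a server with a long mount list it is easy to miss the one read-only entry. A small sketch that filters the mount output down to the read-only mount points is below. The function name list_ro_mounts is made up for this example, and it assumes the usual "device on mountpoint type fstype (options)" layout with no spaces in the mount point.

```shell
#!/bin/sh
# Read mount(8) output on stdin and print the mount points whose option
# list contains ro. list_ro_mounts is a hypothetical helper name.
list_ro_mounts() {
    # Field 3 is the mount point and field 6 is the "(options)" list.
    awk '$2 == "on" && $6 ~ /^\((ro|.*,ro)[,)]/ { print $3 }'
}
```

Use it as mount | list_ro_mounts. Anything it prints is a candidate for the remount command above.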

 If this still fails then escalate to a Senior Linux System Administrator..hahaha, JUST JOKES 😛

 Upon reboot the server does not come back up
 This could be the result of many things, however the most common is grub failure. To correct this we need to re-install grub manually from a Red Hat boot CD
4.Mount the Red Hat disk 1 .img file on the VM or server and boot to the prompt
5.At the prompt type “linux rescue” and hit <enter>
6.Once you reach the shell prompt type “chroot /mnt/sysimage” (you should see a note above the prompt telling you how to do it)
7.Now you want to view the grub conf with “cat /boot/grub/grub.conf” and write down the following lines somewhere in notepad, as you will need them
 kernel /vmlinuz-2.6.18-348.6.1.el5 ro root=/dev/vg_local/root rhgb quiet audit=1
 initrd /initrd-2.6.18-348.6.1.el5.img
8.Next cd into the /boot directory
9.Type “grub” <enter>, this will take you to the grub prompt
 Now you need to tell grub to load the kernel & initrd manually, as indicated below
 grub> kernel /boot/vmlinuz-2.6.18-348.6.1.el5

(result will look something like this)

[Linux-bzImage, setup=0x1400, size=0x15f464]

 grub> initrd /boot/initrd-2.6.18-348.6.1.el5.img
 (result will look something like this)
 [Linux-initrd @ 0x376000, 0x79e3d bytes]
 If the initrd gives an error don’t worry, it does that sometimes. Proceed to setting up the boot partition anyway
 grub> setup (hd0)

(Result –should look like below)

Checking if “/boot/grub/stage1” exists… yes

Checking if “/boot/grub/stage2” exists… yes

Checking if “/boot/grub/e2fs_stage1_5” exists… yes

Running “embed /boot/grub/e2fs_stage1_5 (hd0)”… failed (this is not fatal)

Running “embed /boot/grub/e2fs_stage1_5 (hd0,2)”… failed (this is not fatal)

Running “install /boot/grub/stage1 (hd0) /boot/grub/stage2 p /boot/grub/menu.lst “… succeeded

 Done.
 Reboot image

10.If that does not work escalate to Senior Systems Administrator
HAHAH…JUST JOKES 😛

.

 File System Check Fails upon reboot
 If you see the following message after a reboot

Give root password for maintenance (or type Control-D to continue)

 You will need to boot into single user mode and run an fsck on the partition that is failing the file system check.
 To boot into single user mode, you edit the boot instructions for the GRUB menu entry you wish to boot and add a kernel parameter. Brief instructions for how to do this are below.
11.Select (highlight) the GRUB boot menu entry you wish to use.
12.Press e to edit the GRUB boot commands for the selected boot menu entry.
13.Look near the bottom of the list of commands for a line similar to

kernel /vmlinuz-2.6.18-348.12.1.el5PAE ro root=LABEL=/

14.You want to add “init=/bin/sh” to the end of the kernel line and then press “b” to boot
 It should look like so

kernel /vmlinuz-2.6.18-348.12.1.el5PAE ro root=LABEL=/ init=/bin/sh

15.Next you want to run fsck -y <whatever partition needs to be checked>
 Run this on an unmounted partition. Never run it on a mounted partition, as you can corrupt the data if you do.

.



How to upgrade mysql 5.1 to 5.6 with WHM doing master-slave

How to upgrade MYSQL in a production environment with WHM

Okay, so if you have a master-slave database setup with large InnoDB and MyISAM databases, you probably want to upgrade to MySQL 5.6. The performance tweaks make a difference, especially when it comes to utilizing multiple cores.

Most of the time cPanel is really good at click upgrade and it works. However, if you’re running a more complex MySQL setup, then simply clicking upgrade in cPanel for MySQL isn’t going to do the trick. I’ve outlined the process below to help anyone else trying to do this.

  1. Making a backup of the database using Percona and mysqldump
  • The first thing you need to do is make a backup of everything. Since we have large InnoDB and MyISAM databases, using mysqldump alone can be slow.
  • Using Percona, this will back up everything:
    i.    innobackupex /backupdirectory (this will be an uncompressed backup; see my blog post on multithreaded backups and restores using Percona for more details on how to use Percona backup)
    ii.    Next you need to make a mysqldump of all your databases
  • mysqldump --all-databases > all-databases.sql (old school)
  • I do it a bit differently. I have a script that makes a full dump of all the databases and creates a separate sql file for each db, in case I need to import a specific database after the fact.
    http://nicktailor.com/files/mysqldumpbackup.sh (here is the script; edit it according to your needs)
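
The per-database dump idea can be sketched roughly as below. This is not the linked script, just a minimal illustration: BACKUP_DIR, filter_user_dbs and dump_each_db are made-up names, and it assumes the mysql and mysqldump clients can log in without prompting (for example through ~/.my.cnf).

```shell
#!/bin/bash
# Sketch: dump every database into its own .sql file.
# BACKUP_DIR and both function names are hypothetical, not from the linked script.

BACKUP_DIR="${BACKUP_DIR:-/backups/mysql}"

# Read database names on stdin and drop MySQL's virtual schemas,
# which are not meant to be dumped and restored.
filter_user_dbs() {
    grep -Ev '^(information_schema|performance_schema)$'
}

# Dump each remaining database to $BACKUP_DIR/<name>.sql.
dump_each_db() {
    local dbs db
    mkdir -p "$BACKUP_DIR" || return 1
    dbs="$(mysql -N -B -e 'SHOW DATABASES' | filter_user_dbs)" || return 1
    while read -r db; do
        [ -n "$db" ] || continue
        echo "Dumping ${db}"
        mysqldump "$db" > "${BACKUP_DIR}/${db}.sql" || return 1
    done <<< "$dbs"
}

# Usage (uncomment to run):
# dump_each_db
```

With one file per database, importing a single database later is just mysql <dbname> < <dbname>.sql.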

2.   Now you need to upgrade MySQL, so log into WHM and run the MySQL upgrade in the MySQL section of WHM. If you’re running a dedicated db server and have disabled Apache, re-enable it in WHM temporarily, because WHM will be recompiling PHP and EasyApache with the new MySQL modules. Once it’s done you can disable it again.

  • If your MySQL upgrade fails, check your permissions on mysql, or you can force the upgrade from the command line:
    /usr/local/cpanel/scripts/mysqlup --force

And after that run

         /usr/local/cpanel/scripts/easyapache

3. Since WHM upgrades /var/lib/mysql regardless of whether you specified another directory for your data, we’re going to have to do a little bit of extra work. While we’re doing this, we’re going to shrink the ibdata1 file, which clears out any InnoDB corruption and saves you a ton of space.

  • Find your MySQL data directory. If it’s different from /var/lib/mysql, do the steps below; if it’s the same, you don’t need to.
    i.    Delete everything inside the data directory
    ii.    Copy everything from /var/lib/mysql to the MySQL data directory
                  cp -a /var/lib/mysql/. /datadirectory/
    iii.    Try to start MySQL. If you get an error saying mysql can’t create a pid, it’s probably due to your my.cnf, since some settings no longer work with MySQL 5.6. The easiest way to figure out which ones is to comment stuff out until it works. I will provide a sample one that worked for me. It’s also easier to start up with the grant tables skipped to avoid all the grant permission checks; simply uncomment the #skip-grant-tables line in the my.cnf file
    http://www.nicktailor.com/files/my.cnf.sample (this sample has the performance tweaks enabled in it)

         iv.    Once MySQL is started, you want to fix up InnoDB while you’ve got the chance. If you weren’t using /var/lib/mysql as your data directory, the upgrade will have already created new ibdata1, ib_logfile0 & ib_logfile1 files. If this is not the case, simply rename those files and restart MySQL, and it will create brand spanking new ones
         v.    Now we need to restore everything. I have SSD drives, and if you have large DBs you should only be using SSDs anyway. You need to import the dump back into MySQL using the all-databases.sql file you created earlier.

  •          mysql -u root -p<password> < all-databases.sql (best to run this in a screen session on Linux, as it will take a while and you don’t want to lose your connection during this)

       vi.    Once the import is complete, you need to run mysql_upgrade to upgrade all the databases and tables that didn’t get upgraded to the new version, followed by a mysqlcheck

  • mysql_upgrade -u root -p<password>
  • mysqlcheck --all-databases --check-upgrade --auto-repair

Now you should be able to set grant permissions and things. If you miss the mysql_upgrade step, some of your sites may work and some may not. In addition, you will probably be unable to set grant permissions in MySQL; you’ll most likely get a connection error.
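
Step 3 starts with finding out whether your data directory differs from /var/lib/mysql. A rough way to check that from the shell is sketched below. The function names are made up for illustration, and the parsing is deliberately naive: it takes the last datadir line in the file and ignores sections and !include directives.

```shell
#!/bin/bash
# Sketch: read datadir from a my.cnf-style file.
# get_datadir and needs_copy are hypothetical helper names.

# Print the configured datadir, or MySQL's usual default if none is set.
get_datadir() {
    local cnf="${1:-/etc/my.cnf}" dir
    dir="$(sed -n -e 's/[[:space:]]*$//' \
                  -e 's/^[[:space:]]*datadir[[:space:]]*=[[:space:]]*//p' \
                  "$cnf" 2>/dev/null | tail -n 1)"
    echo "${dir:-/var/lib/mysql}"
}

# Succeeds when the data directory is somewhere other than /var/lib/mysql,
# which is the case where the extra copy steps apply.
needs_copy() {
    [ "$(get_datadir "$1")" != "/var/lib/mysql" ]
}

# Usage:
# if needs_copy /etc/my.cnf; then echo "custom datadir: $(get_datadir /etc/my.cnf)"; fi
```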

4. If you have a slave db, continue reading. The next piece is fixing our slave. Thanks to Percona we can do this quickly. You will notice that your ibdata1 file is tiny and clean now, so the backup will be super fast.

  • You need to take a full backup using Percona
               i.    innobackupex /directoryyouwanttobackupto
  • Now you need to copy the uncompressed backup to your slave server. You can either scp or rsync, whatever works for you. I have a GigE switch, so I rsync it over
              i.    rsync -rva --numeric-ids --progress . root@192.168.0.20:/backupdirectory (this is just a sample)

  • On the slave:
              i.    Stop MySQL
                    a.   /etc/init.d/mysql stop

             ii.    Delete the contents of the data directory on the slave
                    b. rm -rf /mysqldatadirectory/*

            iii.    Do a full Percona restore (the backup must have been prepared with innobackupex --apply-log first)
                    c.  innobackupex --copy-back /backupdirectory

5.     Once MySQL is restored, change the ownership of the MySQL files to mysql:mysql, edit your my.cnf, and start up MySQL, and you should be good to go. You will need to fix replication; read my MySQL failover setup post on how to do that if you’re not sure.
                     chown -R mysql:mysql /mysqldatadirectory
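
The slave-side steps above can be strung together in one small script. The sketch below is only an outline under a few assumptions: the paths are placeholders, the backup has already been copied over and prepared, and the run/restore_slave names and the DRY_RUN switch are invented here. DRY_RUN defaults to 1, so it prints the commands instead of executing them until you turn that off.

```shell
#!/bin/bash
# Sketch of the slave restore sequence. All names and paths are hypothetical.
# DRY_RUN=1 (the default) only prints what would be executed.

DRY_RUN="${DRY_RUN:-1}"

# Run a command, or just print it when DRY_RUN=1.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

restore_slave() {
    local backup_dir="$1" data_dir="$2"
    if [ -z "$backup_dir" ] || [ -z "$data_dir" ] || [ "$data_dir" = "/" ]; then
        echo "usage: restore_slave <backupdir> <datadir>" >&2
        return 1
    fi
    run /etc/init.d/mysql stop                   || return 1
    # empty the data directory, including hidden files and sub-directories
    run find "$data_dir" -mindepth 1 -delete     || return 1
    run innobackupex --copy-back "$backup_dir"   || return 1
    run chown -R mysql:mysql "$data_dir"         || return 1
    run /etc/init.d/mysql start
}

# Usage:
# restore_slave /backupdirectory /mysqldatadirectory            # prints the plan
# DRY_RUN=0 restore_slave /backupdirectory /mysqldatadirectory  # actually runs it
```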

How to check what processes are using your swap

Here is a little script that will show you what processes are using your swap

http://www.nicktailor.com/files/swapusage

vi swapusage && chmod +x swapusage

copy paste below & save

./swapusage (to run)

#!/bin/bash

# find-out-what-is-using-your-swap.sh
# -- Get current swap usage for all running processes
# --
# -- rev.0.3, 2012-09-03, Jan Smid - alignment and indentation, sorting
# -- rev.0.2, 2012-08-09, Mikko Rantalainen - pipe the output to "sort -nk3" to get sorted output
# -- rev.0.1, 2011-05-27, Erik Ljungstrom - initial version
SCRIPT_NAME=`basename $0`;
SORT="kb"; # {pid|kB|name} as first parameter, [default: kb]
[ "$1" != "" ] && { SORT="$1"; }

[ ! -x `which mktemp` ] && { echo "ERROR: mktemp is not available!"; exit; }
MKTEMP=`which mktemp`;
TMP=`${MKTEMP} -d`;
[ ! -d "${TMP}" ] && { echo "ERROR: unable to create temp dir!"; exit; }

# one temp file per sort order
>${TMP}/${SCRIPT_NAME}.pid;
>${TMP}/${SCRIPT_NAME}.kb;
>${TMP}/${SCRIPT_NAME}.name;

SUM=0;
OVERALL=0;
echo "${OVERALL}" > ${TMP}/${SCRIPT_NAME}.overal;

for DIR in `find /proc/ -maxdepth 1 -type d -regex "^/proc/[0-9]+"`;
do
    PID=`echo $DIR | cut -d / -f 3`
    PROGNAME=`ps -p $PID -o comm --no-headers`

    # sum the Swap: lines only; newer kernels also list SwapPss: in smaps
    for SWAP in `grep '^Swap:' $DIR/smaps 2>/dev/null| awk '{ print $2 }'`
    do
        let SUM=$SUM+$SWAP
    done

    if (( $SUM > 0 ));
    then
        echo -n ".";
        echo -e "${PID}\t${SUM}\t${PROGNAME}" >> ${TMP}/${SCRIPT_NAME}.pid;
        echo -e "${SUM}\t${PID}\t${PROGNAME}" >> ${TMP}/${SCRIPT_NAME}.kb;
        echo -e "${PROGNAME}\t${SUM}\t${PID}" >> ${TMP}/${SCRIPT_NAME}.name;
    fi
    let OVERALL=$OVERALL+$SUM
    SUM=0
done
echo "${OVERALL}" > ${TMP}/${SCRIPT_NAME}.overal;
echo;
echo "Overall swap used: ${OVERALL} kB";
echo "========================================";
case "${SORT}" in
    name )
        echo -e "name\tkB\tpid";
        echo "========================================";
        cat ${TMP}/${SCRIPT_NAME}.name|sort -r;
        ;;

    kb )
        echo -e "kB\tpid\tname";
        echo "========================================";
        cat ${TMP}/${SCRIPT_NAME}.kb|sort -rh;
        ;;

    pid | * )
        echo -e "pid\tkB\tname";
        echo "========================================";
        cat ${TMP}/${SCRIPT_NAME}.pid|sort -rh;
        ;;
esac
rm -fR "${TMP}/";
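
If you only want a quick look, a shorter variant is possible. The sketch below uses a different technique from the script above: instead of summing the Swap lines in smaps, it reads the single VmSwap line the kernel puts in /proc/<pid>/status. The swap_by_process name and the optional proc-root argument are made up for this example, and process names containing spaces will be cut at the first space.

```shell
#!/bin/bash
# Sketch: per-process swap usage from the VmSwap field of /proc/<pid>/status.
# swap_by_process and its optional argument are hypothetical.

# Print "kB<TAB>pid<TAB>name" for every process using swap, largest first.
swap_by_process() {
    local proc_root="${1:-/proc}" status pid name kb
    for status in "$proc_root"/[0-9]*/status; do
        [ -r "$status" ] || continue
        pid="${status%/status}"
        pid="${pid##*/}"
        name="$(awk '/^Name:/ { print $2; exit }' "$status" 2>/dev/null)"
        kb="$(awk '/^VmSwap:/ { print $2; exit }' "$status" 2>/dev/null)"
        # kernel threads have no VmSwap line, so treat a missing value as 0
        if [ "${kb:-0}" -gt 0 ] 2>/dev/null; then
            printf '%s\t%s\t%s\n' "$kb" "$pid" "$name"
        fi
    done | sort -rn
}

# Usage:
# swap_by_process | head -20
```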

How to pass a “password” to su as a variable in a script

How to pass a “password” to su as a variable in script and execute tasks to an array of hosts

Some of you may work for organizations that do access control for Linux servers. In that case they may not use ssh keys for root, and are still doing the unthinkable: allowing password authentication.

This means you have to log into a server and then “su -” to root before you can execute commands, and if you have an array of servers this can be tedious and time consuming. I was told by everyone that you can’t pass a “password” as a variable in a script to su, as it’s not allowed.

Guess what…that’s a lie, because I’m going to show you how to do it securely.

  1. You need to install something called expect on all your servers. This tool automates interactive programs: it makes the script supply the typing a human would normally do where needed. You can pass variables to it and use it as a wrapper inside another script.
    1. “yum install expect”, or on Debian “apt-get install expect”
  2. Now log into your server as your user, not root, and inside the home directory set up the following two scripts
    1. Create a file called gotroot
    2. vi gotroot, add the script below and save.

The script below is a wrapper script that you will use inside another bash script later in this tutorial.

Usage would be

./gotroot <user> <host> <userpass> <rootpass>

These arguments are used to log in to the remote host, where the script executes the send commands below, in our case “ls -al”. Once it’s done, it exits, logs out of the server, and returns you to the host you started from. This script does not account for ssh fingerprinting, so you will need to make sure your user has accepted each server’s host key fingerprint before using this script. I will add fingerprinting in future, just got lazy.

What I like to do is write a bash script that is going to do a bunch of tasks, scp it over to all the servers as my user, then comment out the ls -al section and uncomment the section where you tell it to run the bash script. This will then log in to the server, su to root, execute your bash script, exit and log out.

========================
#!/usr/bin/expect -f
# arguments: <user> <host> <userpass> <rootpass>
set mypassword [lindex $argv 2]
set mypassword2 [lindex $argv 3]
set user [lindex $argv 0]
set host  [lindex $argv 1]
spawn ssh -tq $user@$host

########################################################

#this section is only needed if you are NOT using ssh keys
#("assword:" matches both "Password:" and "password:")
#expect "assword:"
#send "$mypassword\r"
#expect "$ "

#########################################################

send "su -\r"
expect "Password:"

send "$mypassword2\r"
#root's prompt usually ends in "# " rather than "$ "; adjust the pattern to
#match yours, otherwise expect waits for its timeout before continuing
expect "$ "

#this will execute command on the remote host

send "ls -al\r"
expect "$ "

#this will execute script you want to run on remote host
#send "/home/nicktailor/script.pl\r"
#expect "$ "

#this command will exit the remote host
#(first exit leaves the root shell, second one logs out the user)
send "exit\r"
send "exit\r"

interact

==============================================
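
Since gotroot does not handle the ssh fingerprint prompt, one way to take care of it beforehand is to collect each server’s host key into known_hosts. The sketch below does that with ssh-keyscan, which is part of OpenSSH. The function names and the KEYSCAN and KNOWN_HOSTS variables are made up for this example. Keep in mind that ssh-keyscan simply trusts whatever answers on the network, so compare the fingerprints against a trusted source if that matters in your environment.

```shell
#!/bin/bash
# Sketch: record host keys for a list of servers before running gotroot.
# KEYSCAN, KNOWN_HOSTS, host_known and add_host_keys are hypothetical names.

KEYSCAN="${KEYSCAN:-ssh-keyscan}"
KNOWN_HOSTS="${KNOWN_HOSTS:-$HOME/.ssh/known_hosts}"

# Succeeds if the host already has a plain (unhashed) entry in known_hosts.
host_known() {
    awk -v h="$1" '
        { n = split($1, names, ","); for (i = 1; i <= n; i++) if (names[i] == h) found = 1 }
        END { exit !found }
    ' "$KNOWN_HOSTS"
}

# Append the host key of every server given as an argument, skipping known ones.
add_host_keys() {
    local host
    mkdir -p "$(dirname "$KNOWN_HOSTS")" || return 1
    touch "$KNOWN_HOSTS" || return 1
    for host in "$@"; do
        if host_known "$host"; then
            echo "skip ${host}"
        elif "$KEYSCAN" "$host" >> "$KNOWN_HOSTS" 2>/dev/null; then
            echo "added ${host}"
        else
            echo "failed ${host}" >&2
        fi
    done
}

# Usage:
# add_host_keys host1 host2 host3
```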

How to wrap this script so it will run against an array of hosts within bash.

  1. Create a file called hosts
    1. vi hosts and add the script below to it.
    2. Also create a file called logs.txt: “touch logs.txt”

This script lets you use the above script as a wrapper: it will go and execute the command you set in gotroot on each host in the SERVERS variable listed below.

In addition, it will prompt you for the user name, user password and root password, and will not show the passwords as you enter them; it prompts you the same way “su -” would. It then takes those credentials and uses them for each host. The passwords will not show up in logs or shell history, as some security departments would have issues with that. Note that they are handed to gotroot as command-line arguments, so they are visible in the local process list while it runs.

So you simply type “./hosts”

It will prompt you for whatever it requires to continue. Just be sure that you add the array of hosts you want to execute the tasks on, and that your user has accepted each host’s ssh fingerprint first. Expect scripts are extremely easy to learn once you play with this.

=======================================
#!/bin/bash

#########################
# some colour constants #
#########################

CLR_txtblk='\e[0;30m' # Black - Regular
CLR_txtred='\e[0;31m' # Red
CLR_txtgrn='\e[0;32m' # Green
CLR_txtylw='\e[0;33m' # Yellow
CLR_txtblu='\e[0;34m' # Blue
CLR_txtpur='\e[0;35m' # Purple
CLR_txtcyn='\e[0;36m' # Cyan
CLR_txtwht='\e[0;37m' # White
CLR_bldblk='\e[1;30m' # Black - Bold
CLR_bldred='\e[1;31m' # Red
CLR_bldgrn='\e[1;32m' # Green
CLR_bldylw='\e[1;33m' # Yellow
CLR_bldblu='\e[1;34m' # Blue
CLR_bldpur='\e[1;35m' # Purple
CLR_bldcyn='\e[1;36m' # Cyan
CLR_bldwht='\e[1;37m' # White
CLR_undblk='\e[4;30m' # Black - Underline
CLR_undred='\e[4;31m' # Red
CLR_undgrn='\e[4;32m' # Green
CLR_undylw='\e[4;33m' # Yellow
CLR_undblu='\e[4;34m' # Blue
CLR_undpur='\e[4;35m' # Purple
CLR_undcyn='\e[4;36m' # Cyan
CLR_undwht='\e[4;37m' # White
CLR_bakblk='\e[40m'   # Black - Background
CLR_bakred='\e[41m'   # Red
CLR_bakgrn='\e[42m'   # Green
CLR_bakylw='\e[43m'   # Yellow
CLR_bakblu='\e[44m'   # Blue
CLR_bakpur='\e[45m'   # Purple
CLR_bakcyn='\e[46m'   # Cyan
CLR_bakwht='\e[47m'   # White
CLR_txtrst='\e[0m'    # Text Reset

#########################

# hosts to run gotroot against (space separated)
SERVERS="host1 host2 host3"
#echo -e "${CLR_bldgrn}Enter Servers (space separated)${CLR_txtrst}"
#read -p "servers: " SERVERS
echo -e "${CLR_bldgrn}Enter User${CLR_txtrst}"
read -r -p "user: " USERNAME
echo -e "${CLR_bldgrn}Enter Password${CLR_txtrst}"
# -s keeps the password from being echoed to the terminal
read -r -p "password: " -s USERPW
echo
echo -e "${CLR_bldgrn}Enter Root Password${CLR_txtrst}"
read -r -p "password: " -s ROOTPW
echo
for machine in $SERVERS; do
    # quote the arguments so passwords containing spaces survive
    ~/gotroot "${USERNAME}" "${machine}" "${USERPW}" "${ROOTPW}" 2>&1 | tee -a logs.txt
done

======================================
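
If the list of servers gets long, it may be easier to keep it in a file than in the SERVERS variable. The helper below is a small sketch of that idea; read_hosts and the servers.txt file name are made up for this example. It prints one host per line and ignores blank lines and # comments.

```shell
#!/bin/bash
# Sketch: load the host list from a file instead of hard-coding SERVERS.
# read_hosts and servers.txt are hypothetical names.

# Print one host per line, stripping comments, surrounding whitespace and blank lines.
read_hosts() {
    sed -e 's/#.*$//' \
        -e 's/^[[:space:]]*//' \
        -e 's/[[:space:]]*$//' \
        -e '/^$/d' "$1"
}

# Usage inside the hosts script, replacing the hard-coded list:
# SERVERS="$(read_hosts ~/servers.txt)"
```

Because the for loop in the hosts script splits $SERVERS on whitespace, the newline-separated output works there without further changes.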

Hope you enjoyed this tutorial and if you have any questions email nick@nicktailor.com

How to do multi-threaded backups and restores for mysql

There are probably a lot of people out there who have the standard master-slave MySQL database servers running with InnoDB and MyISAM databases.

This is not usually a problem unless you have a high amount of traffic going to your InnoDB databases. Since mysqldump does not do multithreaded dumps or restores, this can be problematic if your replication is broken to the point where you need to do a full restore.

If your database was, say, 15 gigs in size, consisting of InnoDB and MyISAM dbs in a production environment, this would be brutal, as you would need to lock the tables on the primary while you’re restoring to the slave. Since MySQL does not do multithreaded restores, this could take 12 hours or more; keep in mind this is dependent on hardware. To help you gauge your problem, here are the servers we had when we ran into this issue:

Xeon quad core, SATA 1T drives, 18 gigs of ram (Master and Slave)

Fortunately, there is a solution 🙂

There is a free application called xtrabackup by Percona which does multithreaded backups and restores of MyISAM and InnoDB combined. In this blog I will explain how to set it up, and what I did to minimize downtime for the business.

What you should consider doing

Drive I/O is a factor with high-traffic database servers and can seriously impede performance, so we built new servers with the same specs but with SSD drives this time.

Xeon quad core, (sata3) 1T, (SSD) 120G 18 gigs of ram

This is not necessary; however, if database traffic is high you should consider SSD or even fibre channel drives if your setup supports it.

Xtrabackup is free unless you use MySQL Enterprise, then it’s $5000/server to license it. Honestly, using MySQL Enterprise is in my opinion just stupid. It is exactly the same except you get support, the same support you could get online or on IRC or any forum, which is probably better. Why pay for something you don’t need to?

Install and setup

Note: This will need to be installed on both master and slave database servers, as this process will replace the mysqldump and restore method you use.

  1. rpm -Uhv http://www.percona.com/downloads/percona-release/percona-release-0.0-1.x86_64.rpm
  2. yum install percona-xtrabackup.x86_64 (Master & Slave, both servers)

Create  backup

Note: There are a number of ways you can do this. You can have it write to a directory such as /tmp while it’s doing the backup, or you can have it stream to stdout and compress into a file. I will show you both ways.

  1. innobackupex, the tool shipped with xtrabackup that we are going to use, looks at the /etc/my.cnf file for the MySQL data directory, so we do not have to define a lot in our command string. For this example we did not set up a MySQL password; if you did, simply add --user=<user> --password=<pass> to the string.

This process took 5 minutes on a 15gig database with Xeon quad core, (sata3) 1T, (SSD) 120G 18G ram

2. innobackupex <outputdirectory>
     Eg. innobackupex /test (this command will create another directory inside this one with a time stamp; it’s a full backup of all databases, InnoDB and MyISAM, uncompressed.)

3. innobackupex --stream=tar ./ | gzip - > /test/test.tar.gz (this command does the same as the above, except it outputs to stdout and compresses the full backup into the tar.gz file)

Note: you also need to use the -i option to untar it, eg. tar -ixvf test.tar.gz. Ensure MySQL is stopped on any slave before restoring, and don’t forget to chown -R mysql:mysql the files after you restore the data to the data directory using the innobackupex --copy-back command.

Note: I have experienced issues getting replication to start after doing a full backup and restore onto a slave with InnoDB and MyISAM using the innobackupex stream compressed with gzip. After untarring, for whatever reason the InnoDB log files had some corruption, which caused the slave to stop replication immediately upon connecting to the master.

If the stream compression doesn’t work, do an uncompressed backup as shown above, and then rsync the data from your master to the slave via a GigE switch if possible (ie. rsync -rva --numeric-ids <source> <destination>:/)

Our 15gig DB compressed to 3.4gigs
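
Given the corruption I ran into with the compressed stream, it is worth at least checking that the archive itself is intact before shipping it to the slave. The sketch below uses gzip -t for that; the verify_backup name is made up. It only validates the gzip layer, so it would not necessarily catch the InnoDB log problem described above, but it does rule out a truncated or damaged file.

```shell
#!/bin/bash
# Sketch: sanity-check a compressed backup before copying it to the slave.
# verify_backup is a hypothetical helper name.

# Return 0 when the file exists, is non-empty and its gzip stream is intact.
verify_backup() {
    local archive="$1"
    if [ ! -s "$archive" ]; then
        echo "missing or empty: ${archive}" >&2
        return 1
    fi
    if ! gzip -t "$archive" 2>/dev/null; then
        echo "corrupt gzip stream: ${archive}" >&2
        return 1
    fi
    echo "ok: ${archive}"
}

# Usage:
# verify_backup /test/test.tar.gz && scp /test/test.tar.gz user@host:
```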

  1. Now copy the tar file or the directory that innobackupex created to the slave server via scp

scp -r * user@host:  <-(edit accordingly)

Doing a Restore

Note: The beauty of this restore is that it will be a multi-threaded restore utilizing multiple cores instead of just one. Since our server data directory is now sitting on SSD, disk I/O wait will be almost nil, increasing performance significantly and reducing load.

  1. On your primary database server, log into MySQL and lock the tables
    1. mysql> FLUSH TABLES WITH READ LOCK;
    2. Now on your slave: doing a restore of all the databases is pretty easy.
      1. innobackupex --copy-back /test/2013-02-03_17-21-52/ (update the path to wherever the innobackupex files are; the backup must have been prepared with innobackupex --apply-log first)

This took 3 mins to restore a 15gig DB with InnoDB and MyISAM for us on
Xeon quad core, (sata3) 1T, (SSD) 120G 18 gigs of ram

Setting up the backup crons

  1. If you were using mysqldump as part of your MySQL backup process, you will need to change it to use the following.
  2. Create a directory on the slave called mysqldumps
  3. Create a file called backups.sh and save it.
    1. Add the following to it.

#!/bin/bash
innobackupex --stream=tar ./ | gzip - > /mysqldumps/innobackup-$(date +"%F").tar.gz

Note: our backups are being stored on our sata3 drive and the data directory resides on the SSD

  1. Now add this to your crontab as root; again, change the cron schedule to run however often you need.
    1. 0 11,23 * * * /fullpath/backups.sh
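
One thing to watch with the cron above: it runs twice a day, but the file name only carries the date, so the 23:00 run overwrites the 11:00 backup. If you want to keep both, put the time in the name as well. A small sketch, where BACKUP_DIR and backup_path are made-up names:

```shell
#!/bin/bash
# Sketch: build a backup file name that includes the time of day,
# so two runs on the same date do not overwrite each other.
# BACKUP_DIR and backup_path are hypothetical names.

BACKUP_DIR="${BACKUP_DIR:-/mysqldumps}"

# Print a path such as /mysqldumps/innobackup-2013-02-03_2300.tar.gz
backup_path() {
    echo "${BACKUP_DIR}/innobackup-$(date +%F_%H%M).tar.gz"
}

# Usage in backups.sh:
# innobackupex --stream=tar ./ | gzip - > "$(backup_path)"
```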

Setting up diskspacewatch for the SSD drive.

  1. Since the SSD drive is 120G, we need to set up an alert to watch the space threshold. If you do not have the resources to implement a tool specifically to monitor disk space, you can write a script that watches the disk space and sends out an email alert in the event the threshold is reached.
  2. Run df -h on your server and find the partition you want to watch. Edit (df /disk2) in the script to whichever partition that is; the threshold is defined by ( if [ $usep -ge 80 ]; then)
  3. Create a file called diskspacewatch, add the script below and save

#!/bin/sh
# print "<use%> <filesystem>" for the watched partition
df /disk2 | grep -vE '^Filesystem|tmpfs|cdrom' | awk '{ print $5 " " $1 }' | while read output;
do
  echo $output
  # strip the % sign so the value can be compared as a number
  usep=$(echo $output | awk '{ print $1}' | cut -d'%' -f1  )
  partition=$(echo $output | awk '{ print $2 }' )
  if [ $usep -ge 80 ]; then
    echo "SSD Disk on slave database server Running out of space!!  \"$partition
($usep%)\" on $(hostname) as on $(date)" |
    mail -s "Alert: Almost out of disk space $usep%" nick@nicktailor.com
  fi
done

  1. Now you want to set up a cron that runs this script every hour, or however often you want
    1. 0 * * * * /path/diskspacewatch
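
If you want to try out the threshold logic without waiting for a disk to fill up, the parsing can be pulled into small functions that read df output from stdin. This is only a sketch: parse_usage and over_threshold are made-up names, it assumes the POSIX layout you get from df -P, and mount points containing spaces are not handled.

```shell
#!/bin/bash
# Sketch: threshold check on `df -P` output read from stdin.
# parse_usage and over_threshold are hypothetical helper names.

# Print "<percent> <mountpoint>" for every filesystem line (header skipped).
parse_usage() {
    awk 'NR > 1 { sub(/%$/, "", $5); print $5, $6 }'
}

# Print a line for every mount point at or above the given percentage.
over_threshold() {
    local threshold="$1"
    parse_usage | awk -v t="$threshold" '$1 + 0 >= t + 0 { print $2 " is at " $1 "%" }'
}

# Usage:
# df -P /disk2 | over_threshold 80
```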

That’s my tutorial on a mysql multithreaded backup and restore setup. If you have questions email nick@nicktailor.com

 

 

 
