Wednesday, March 04, 2020

Blackbox Exporter + Prometheus and icmp/ping issues on Ubuntu 18.04

If ICMP (ping) probes fail when using Blackbox Exporter with Prometheus, the binary most likely lacks permission to open raw sockets.

Run the following command against your blackbox_exporter binary:

sudo setcap cap_net_raw=+ep blackbox_exporter

You should now see the pings returning data.

Friday, January 03, 2020

Splunk is running out of disk space

This usually happens when /opt/splunk/var/lib/splunk/defaultdb fills up and the free space on disk is less than 5GB.
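
A quick way to see how close a volume is to that threshold is to compare its free space with the limit. The sketch below is only an illustration: the helper names are made up for this example, and the 5000 MB figure is an assumption based on Splunk's default minFreeSpace setting.

```python
import shutil

# Assumption: Splunk's default minimum free space, in MB.
MIN_FREE_MB = 5000


def free_mb(path: str) -> int:
    """Return the free space, in whole MB, on the filesystem holding path."""
    return shutil.disk_usage(path).free // (1024 * 1024)


def below_threshold(free_megabytes: int, min_free_mb: int = MIN_FREE_MB) -> bool:
    """True when free space has dropped under the minimum."""
    return free_megabytes < min_free_mb


def report(path: str) -> str:
    """Build a one-line summary of free space versus the threshold."""
    mb = free_mb(path)
    state = "below" if below_threshold(mb) else "above"
    return f"{path}: {mb} MB free ({state} the {MIN_FREE_MB} MB threshold)"
```

On the Splunk host, print(report("/opt/splunk/var/lib/splunk")) would show where the index volume stands.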

Adding more disk space should do the trick. If that is not possible right away, there are other tricks you can use to buy more time.

1. Change the minimum free space in $SPLUNK_HOME/etc/system/local/server.conf
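
A minimal sketch of the stanza involved, assuming the minFreeSpace setting under [diskUsage] is the one being changed (the value is in MB, so 2000 corresponds to roughly 2GB):

```ini
# $SPLUNK_HOME/etc/system/local/server.conf
# Assumption: lower the minimum free space from the default 5000 MB to 2000 MB.
[diskUsage]
minFreeSpace = 2000
```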


This allows the disk to fill up until only 2GB is free. Once this limit is hit, the Splunk indexers stop indexing data until more space is available.

NB: This is NOT recommended unless you really need to keep your Splunk instance indexing until disk space is increased.

2. If you have another disk or partition with free space on your Splunk instance, you can archive your index data to that disk/partition.


Splunk's indexers store indexed data in directories called buckets, which move through a series of stages as they age: hot, warm, cold and finally frozen.



More here: https://docs.splunk.com/Documentation/Splunk/7.3.0/Indexer/Setaretirementandarchivingpolicy

We are going to limit the maximum size of our index and archive the frozen data to /var/splunk/archives

To do this we first create indexes.conf in $SPLUNK_HOME/etc/system/local/ if it does not exist.

Next we add the following:
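
A minimal sketch of such a stanza, assuming maxTotalDataSizeMB and coldToFrozenDir are the settings in use (the size is given in MB, so 100GB is written as 100000):

```ini
# $SPLUNK_HOME/etc/system/local/indexes.conf
# Assumption: cap the "main" index at about 100GB and copy frozen
# buckets to the archive directory instead of deleting them.
[main]
maxTotalDataSizeMB = 100000
coldToFrozenDir = /var/splunk/archives
```

The archive directory must already exist and be writable by the user Splunk runs as.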


In this case "main" is the name of our index and we are limiting its growth to 100GB. Once this limit is reached, Splunk freezes the oldest data and archives it to the directory above.

Restart Splunk for the changes to take effect. Splunk will almost immediately move frozen index data to the archive directory and your Splunk instance will be healthy again.



Tuesday, May 21, 2019

VMware ESXi 6.7 - System logs on host are stored on non-persistent storage

When trying to apply the recommendation in the KB article (https://kb.vmware.com/s/article/2032823) via the web client, we received the following error:


"Update option values failed!" "A specified parameter was not correct:"

To work around this issue, we log into the ESXi host via SSH and issue the following commands:

esxcli system syslog config get

#This dumps the current host config

We are interested in changing "Local Log Output"

Now browse to the directory where you wish to have the logs stored. Copy the entire path and use it in the command below to set the log directory.

If it is shared storage, it is best to create a subdir for the host e.g. /vmfs/volumes/storagepool1/logs/esxihost1


Set the syslog directory like so:

esxcli system syslog config set --logdir=/vmfs/volumes/storagepool1/logs/esxihost1

Go back to your web client and confirm the setting has updated. You can also run "esxcli system syslog config get" to confirm the settings have taken effect.

Wednesday, March 20, 2019



Dell Modular Disk Storage Manager reports unreadable sectors after a power failure

Fix below:

Open cmd as administrator.

Navigate to the directory containing the SMcli exe and execute the following to connect to the storage array via its two controller IPs:

C:\Program Files\Dell\MD Storage Software\MD Storage Manager\client>SMcli 192.168.100.1 192.168.100.2

Entering interactive mode. Please type desired command.

clear allVirtualDisks unreadableSectors
;
[Press CTRL+c]
Script execution complete.

All done.

If this error recurs, you will need to do a support file export to identify the affected disks, as this may then be a genuine hardware issue.

Thursday, March 07, 2019

Setting a static IP on Ubuntu server 18

Ubuntu now uses netplan (more info here -> https://netplan.io), which uses YAML to define and create the required networks. The legacy way of making these changes via /etc/network/interfaces is therefore deprecated.

See below:

#sudo cat /etc/network/interfaces

# ifupdown has been replaced by netplan(5) on this system.  See
# /etc/netplan for current configuration.
# To re-enable ifupdown on this system, you can run:
#    sudo apt install ifupdown


1. Let's start by confirming our interface names:

#ifconfig -a

ens160: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 172.1.2.3  netmask 255.255.255.0  broadcast 172.1.2.255
        inet6 fe80::250:56ff:fea8:38f8  prefixlen 64  scopeid 0x20<link>
        ether 00:50:56:a8:38:f9  txqueuelen 1000  (Ethernet)
        RX packets 1545  bytes 955401 (955.4 KB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 827  bytes 101663 (101.6 KB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0


2. Now identify the config file netplan is using:

#ls -la  /etc/netplan/
total 12
drwxr-xr-x  2 root root 4096 Mar  6 06:15 .
drwxr-xr-x 90 root root 4096 Mar  6 06:16 ..
-rw-r--r--  1 root root  409 Mar  6 06:15 50-cloud-init.yaml


Note that the interface name is "ens160", taken from step 1 above.

For our example we are going to set the static IP to 172.1.2.3 with a subnet mask of 255.255.255.0 and gateway of 172.1.2.1

3. Make the necessary changes as shown below:

#sudo vi /etc/netplan/50-cloud-init.yaml

# This file is generated from information provided by
# the datasource.  Changes to it will not persist across an instance.
# To disable cloud-init's network configuration capabilities, write a file
# /etc/cloud/cloud.cfg.d/99-disable-network-config.cfg with the following:
# network: {config: disabled}
network:
    ethernets:
        ens160:
            addresses: [172.1.2.3/24]
            gateway4:   172.1.2.1
            dhcp4: no
            nameservers:
              addresses: [1.1.1.1,2.2.2.2]

    version: 2

4. Lastly apply the configuration
#sudo netplan apply
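
Netplan expects the address in CIDR notation, so the 255.255.255.0 mask from the example becomes the /24 suffix in step 3. As a rough sketch (the helper names are made up for this example), Python's standard ipaddress module can do the conversion and confirm that the gateway sits in the same subnet:

```python
import ipaddress


def to_cidr(address: str, netmask: str) -> str:
    """Combine an IPv4 address and dotted netmask into address/prefix form."""
    return ipaddress.ip_interface(f"{address}/{netmask}").with_prefixlen


def gateway_in_subnet(cidr: str, gateway: str) -> bool:
    """Check that the gateway belongs to the network of the given interface."""
    network = ipaddress.ip_interface(cidr).network
    return ipaddress.ip_address(gateway) in network


if __name__ == "__main__":
    # Values taken from the example in this post.
    cidr = to_cidr("172.1.2.3", "255.255.255.0")
    print(cidr)
    print(gateway_in_subnet(cidr, "172.1.2.1"))
```

A gateway outside the subnet is a common reason for a host losing connectivity after netplan apply, so it is worth checking before applying the change remotely.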








Friday, March 01, 2019

VMware PowerCLI on Windows 10

This is as simple as running Windows Powershell as administrator and running the following:

Install-Module -Name VMware.PowerCLI


Close and reopen PowerShell to have access to your new vSphere PowerCLI commands.
When connecting to your environments you may get the following error:


Connect-VIServer : 2019/03/01 8:55:20 AM        Connect-VIServer                Error: Invalid server certificate. Use
Set-PowerCLIConfiguration to set the value for the InvalidCertificateAction option to Prompt if you'd like to connect
once or to add a permanent exception for this server.
Additional Information: Could not establish trust relationship for the SSL/TLS secure channel with authority


You can run the following to get around this certificate warning:

Set-PowerCLIConfiguration -InvalidCertificateAction Ignore -Confirm:$false



Thursday, February 28, 2019

How to check Cisco Catalyst Switch Port Traffic Utilization



Sometimes it is necessary to see what an uplink to a switch, VMware ESXi host etc. is doing in terms of traffic.

A useful command is show interface summary


This then outputs the following:

cat2960s#sh int summary

 *: interface is up
 IHQ: pkts in input hold queue     IQD: pkts dropped from input queue
 OHQ: pkts in output hold queue    OQD: pkts dropped from output queue
 RXBS: rx rate (bits/sec)          RXPS: rx rate (pkts/sec)
 TXBS: tx rate (bits/sec)          TXPS: tx rate (pkts/sec)
 TRTL: throttle count

  Interface               IHQ   IQD  OHQ   OQD  RXBS RXPS  TXBS TXPS TRTL
-------------------------------------------------------------------------
* Vlan1                    0     0    0     0  1000    1  1000    1    0
  FastEthernet0            0     0    0     0     0    0     0    0    0
* GigabitEthernet1/0/1     0     0    0  1747  1000    1 52000   49    0
* GigabitEthernet1/0/2     0     0    0  1757     0    0 52000   49    0
* GigabitEthernet1/0/3     0     0    0 191986 480000  106 290000  122    0
* GigabitEthernet1/0/4     0     0    0  1645 92000   40 158000   90    0
* GigabitEthernet1/0/5     0     0    0  1733     0    0 52000   49    0
* GigabitEthernet1/0/6     0     0    0  5969 1172000  658 13763000  1271    0
* GigabitEthernet1/0/7     0     0    0  1756     0    0 52000   49    0
* GigabitEthernet1/0/8     0     0    0  1726  1000    1 52000   50    0
* GigabitEthernet1/0/9     0     0    0  1952 147000  121 506000  191    0
* GigabitEthernet1/0/10    0     0    0  1737  6000    2 52000   49    0
* GigabitEthernet1/0/11    0     0    0  1736     0    0 52000   49    0
...

Note: RX is traffic received and TX is traffic sent.

Let's dig a bit deeper into an interface:

cat2960s# sh int gigabitEthernet 1/0/29 summary

 *: interface is up
 IHQ: pkts in input hold queue     IQD: pkts dropped from input queue
 OHQ: pkts in output hold queue    OQD: pkts dropped from output queue
 RXBS: rx rate (bits/sec)          RXPS: rx rate (pkts/sec)
 TXBS: tx rate (bits/sec)          TXPS: tx rate (pkts/sec)
 TRTL: throttle count

  Interface               IHQ   IQD  OHQ   OQD  RXBS RXPS  TXBS TXPS TRTL
-------------------------------------------------------------------------
* GigabitEthernet1/0/29    0     0    0 53071 477756000  39677 10729000  18785    0


From the output above we can deduce the following:

RX Traffic = 477.76 Mbit/s (477756000 bits/s)
TX Traffic =  10.73 Mbit/s (10729000 bits/s)
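
The conversion is just a division by 1,000,000. A small sketch of the same arithmetic follows; the function names are made up for this example, and the 1 Gbit/s link speed is an assumption based on the interface being a GigabitEthernet port (a port that negotiated a lower speed would need a different value).

```python
def bits_to_mbits(bits_per_second: int) -> float:
    """Convert a rate in bits/s to Mbit/s (1 Mbit = 1,000,000 bits)."""
    return bits_per_second / 1_000_000


def utilization_percent(bits_per_second: int,
                        link_speed_bps: int = 1_000_000_000) -> float:
    """Express a rate as a percentage of the link speed (assumed 1 Gbit/s)."""
    return 100 * bits_per_second / link_speed_bps


if __name__ == "__main__":
    # RXBS and TXBS values for GigabitEthernet1/0/29 from the output above.
    for label, rate in (("RX", 477_756_000), ("TX", 10_729_000)):
        print(f"{label}: {bits_to_mbits(rate):.2f} Mbit/s, "
              f"{utilization_percent(rate):.1f}% of the link")
```

Seen this way, the port is receiving at a little under half of its assumed capacity while sending at about one percent of it.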