Date: 04-12-20  Time: 22:51 PM

Author Topic: Load Balancing VIO clients  (Read 14900 times)


potatoman

  • New Member
  • *
  • Posts: 2
  • Karma: +0/-0
Load Balancing VIO clients
« on: October 22, 2010, 08:34:00 AM »
Hi,

Looking for best practices and information on how to calculate the amount of Mem/CPU to allocate for your VIO.

I have recently load balanced the clients of our VIO servers by running chpath -l hdiskX -p vscsiX -a priority=1 (or priority=2) on each path. Do I need to reboot the client for this to take effect? Also, if I have 2 VIO servers serving a client and I reboot number 1, would number 2 keep the connections active after number 1 is back up, or would the load be rebalanced on the fly?

I am a little new to VIO, please be gentle.

Michael

  • Administrator
  • Hero Member
  • *****
  • Posts: 1273
  • Karma: +0/-0
Re: Load Balancing VIO clients
« Reply #1 on: November 19, 2010, 09:10:06 AM »
My apologies for taking so long to reply - "Show unread posts" failed me.

Anyway, chpath should work instantly, and restore automatically.

But, I'll need to check that - I do not know all the parameters (attributes) from memory. I am going to try and look on my IVM based systems - only one VIOS - so I may not be able to verify MPIO settings real soon.
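A minimal sketch of how the priorities from the question could be spread over several disks. It only prints the chpath commands (a dry run), so they can be reviewed before being run on the client. The adapter names vscsi0 and vscsi1, the disk names, and the function name gen_chpath are assumptions for illustration; the chpath syntax is the one quoted in the question.

```shell
#!/bin/sh
# Dry run: print (do not execute) chpath commands that alternate the
# preferred path between two virtual SCSI client adapters.
# Assumption: every disk has one path via vscsi0 and one via vscsi1.
gen_chpath() {
    n=0
    for disk in "$@"; do
        if [ $((n % 2)) -eq 0 ]; then
            p0=1; p1=2      # even-numbered disks prefer vscsi0
        else
            p0=2; p1=1      # odd-numbered disks prefer vscsi1
        fi
        echo "chpath -l $disk -p vscsi0 -a priority=$p0"
        echo "chpath -l $disk -p vscsi1 -a priority=$p1"
        n=$((n + 1))
    done
}

# Example disk names; substitute the client's own hdisks.
gen_chpath hdisk0 hdisk1
```

Since the algorithm is fail_over, this gives a static split of disks between the two VIO servers rather than dynamic balancing per I/O.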

If you have a setup with 8Gb Fibre Channel cards and a SAN switch that supports NPIV, you could also use dual VIOS, NPIV and native PCM drivers in the client - getting dynamic load balancing. The client thinks it has two (or more) HBAs installed.

Michael

Re: Load Balancing VIO clients
« Reply #2 on: November 19, 2010, 01:33:12 PM »
Doing a little research...

Basically, there are two classes of objects we are interested in: disk and adapter

disk, subclass vscsi
adapter, subclass vdevice

# lsdev -PH -c disk -s vscsi       
class type  subclass description

disk  vdisk vscsi    Virtual SCSI Disk Drive
# lsdev -PH -c adapter -s vdevice
class   type       subclass description

adapter IBM,l-lan  vdevice  Virtual I/O Ethernet Adapter (l-lan)
adapter IBM,v-scsi vdevice  Virtual SCSI Client Adapter
adapter hvterm1    vdevice  LPAR Virtual Serial Adapter


We have at least one of each - let's look at vscsi0 and hdisk0 for attribute names we could modify:
# lsattr -El vscsi0
vscsi_err_recov delayed_fail N/A                       True
vscsi_path_to   0            Virtual SCSI Path Timeout True
# lsattr -El hdisk0
PCM             PCM/friend/vscsi                 Path Control Module        False
algorithm       fail_over                        Algorithm                  True
hcheck_cmd      test_unit_rdy                    Health Check Command       True
hcheck_interval 0                                Health Check Interval      True
hcheck_mode     nonactive                        Health Check Mode          True
max_transfer    0x40000                          Maximum TRANSFER Size      True
pvid            00c39b9daec18c210000000000000000 Physical volume identifier False
queue_depth     3                                Queue DEPTH                True
reserve_policy  no_reserve                       Reserve Policy             True


Both attributes of the vscsi adapter look interesting - more on that below. For hdisk0 I am interested in algorithm, hcheck_cmd, hcheck_interval, and hcheck_mode. reserve_policy is not important for load balancing - but can be important for availability.
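As a small aid for reading listings like the ones above: the last column of lsattr -El output says whether an attribute is user-settable. The sketch below keeps only the lines ending in True. The function name settable_attrs is made up for this example, and the sample input is copied from the hdisk0 listing so the filter can be tried without an AIX system.

```shell
#!/bin/sh
# Print name=value for attributes that lsattr marks as user-settable
# (last column "True"). Reads lsattr -El output on stdin.
settable_attrs() {
    awk '$NF == "True" { print $1 "=" $2 }'
}

# Sample lines from the hdisk0 listing; on a live client this would be
#   lsattr -El hdisk0 | settable_attrs
settable_attrs <<'EOF'
PCM             PCM/friend/vscsi                 Path Control Module        False
algorithm       fail_over                        Algorithm                  True
hcheck_interval 0                                Health Check Interval      True
pvid            00c39b9daec18c210000000000000000 Physical volume identifier False
reserve_policy  no_reserve                       Reserve Policy             True
EOF
```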

Using odmget I can get the default values for these attributes:

# for i in algorithm hcheck_cmd hcheck_interval hcheck_mode reserve_policy vscsi_err_recov vscsi_path_to
do
 clear
 echo ==== $i ====
 odmget -q attribute=$i PdAt
 read x
done


## This command generates a lot of output, as there are different unique types that each have their own range of values, so I am editing the output down to the values I am interested in...

PdAt:
        uniquetype = "PCM/friend/vscsi"
        attribute = "algorithm"
        deflt = "fail_over"
        values = "fail_over"
        width = ""
        type = "R"
        generic = "DU"
        rep = "sl"
        nls_index = 3

PdAt:
        uniquetype = "PCM/friend/vscsi"
        attribute = "hcheck_cmd"
        deflt = "test_unit_rdy"
        values = "test_unit_rdy, inquiry"
        width = ""
        type = "R"
        generic = "DU"
        rep = "sl"
        nls_index = 12

PdAt:
        uniquetype = "PCM/friend/vscsi"
        attribute = "hcheck_interval"
        deflt = "0"
        values = "0-3600,1"
        width = ""
        type = "R"
        generic = "DU"
        rep = "nr"
        nls_index = 7

PdAt:
        uniquetype = "PCM/friend/vscsi"
        attribute = "hcheck_mode"
        deflt = "nonactive"
        values = "enabled,failed,nonactive"
        width = ""
        type = "R"
        generic = "DU"
        rep = "sl"
        nls_index = 6

PdAt:
        uniquetype = "disk/vscsi/vdisk"
        attribute = "reserve_policy"
        deflt = "no_reserve"
        values = "no_reserve, single_path"
        width = ""
        type = "R"
        generic = "DU"
        rep = "sl"
        nls_index = 16



LESSON learned: Looking at the output above - I could drop reserve_policy, as it really concerns something else, and use the following command instead to get all the default attributes and possible settings for PCM/friend/vscsi - the uniquetype I am interested in for my hdisks!

# odmget -q uniquetype=PCM/friend/vscsi PdAt

PdAt:
        uniquetype = "PCM/friend/vscsi"
        attribute = "dvc_support"
        deflt = ""
        values = "disk/vscsi/vdisk"
        width = ""
        type = "R"
        generic = ""
        rep = "sl"
        nls_index = 2

PdAt:
        uniquetype = "PCM/friend/vscsi"
        attribute = "algorithm"
        deflt = "fail_over"
        values = "fail_over"
        width = ""
        type = "R"
        generic = "DU"
        rep = "sl"
        nls_index = 3

PdAt:
        uniquetype = "PCM/friend/vscsi"
        attribute = "link_meth"
        deflt = ""
        values = ""
        width = ""
        type = "R"
        generic = ""
        rep = ""
        nls_index = 0

PdAt:
        uniquetype = "PCM/friend/vscsi"
        attribute = "hcheck_mode"
        deflt = "nonactive"
        values = "enabled,failed,nonactive"
        width = ""
        type = "R"
        generic = "DU"
        rep = "sl"
        nls_index = 6

PdAt:
        uniquetype = "PCM/friend/vscsi"
        attribute = "hcheck_cmd"
        deflt = "test_unit_rdy"
        values = "test_unit_rdy, inquiry"
        width = ""
        type = "R"
        generic = "DU"
        rep = "sl"
        nls_index = 12

PdAt:
        uniquetype = "PCM/friend/vscsi"
        attribute = "hcheck_interval"
        deflt = "0"
        values = "0-3600,1"
        width = ""
        type = "R"
        generic = "DU"
        rep = "nr"
        nls_index = 7


And just to show another command syntax (using AND and LIKE constructs in odmget -q)

# for i in vscsi_err_recov vscsi_path_to
do
echo ==== $i ====
 odmget -q "attribute=$i AND uniquetype like adapter/vdevice/*" PdAt
done
==== vscsi_err_recov ====

PdAt:
        uniquetype = "adapter/vdevice/IBM,v-scsi"
        attribute = "vscsi_err_recov"
        deflt = "delayed_fail"
        values = "delayed_fail, fast_fail"
        width = ""
        type = "R"
        generic = "DU"
        rep = "sl"
        nls_index = 0
==== vscsi_path_to ====

PdAt:
        uniquetype = "adapter/vdevice/IBM,v-scsi"
        attribute = "vscsi_path_to"
        deflt = "0"
        values = "0-3600,1"
        width = ""
        type = "R"
        generic = "DU"
        rep = "nr"
        nls_index = 2


Michael

Re: Load Balancing VIO clients
« Reply #3 on: November 19, 2010, 01:57:16 PM »
In the previous post I researched the names I am interested in. What do these attributes mean?

For the hdisk:
algorithm: only has one legal value, so fail_over is always the default behavior - if enabled

hcheck_mode: "Health Check Mode": values: "enabled,failed,nonactive", default: "nonactive"
hcheck_cmd: "Health Check Command": values:  "test_unit_rdy, inquiry", default:  "test_unit_rdy"
hcheck_interval: "Health Check Interval" (seconds): values: "0-3600,1", default: "0"

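With hcheck_interval set to 0, health checking is disabled: the client does not probe the paths on its own, so a failed path is not re-enabled automatically once it is available again. With a positive value, every hcheck_interval seconds the client executes the hcheck_cmd on the paths selected by hcheck_mode to determine their status.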
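A hedged sketch of turning health checking on: the helper below checks the interval against the 0-3600 range reported by PdAt and prints the matching chdev command instead of running it. The function name hcheck_chdev and the 60-second value are only examples, not recommendations from this thread. The -P flag records the change in the ODM so that it applies at the next reboot, which is the usual route when the disk is in use.

```shell
#!/bin/sh
# Dry run: validate a health check interval and print the chdev command.
# Range 0-3600 comes from the PdAt "values" field shown earlier.
hcheck_chdev() {
    disk=$1
    interval=$2
    case $interval in
        ''|*[!0-9]*)
            echo "interval must be a whole number" >&2
            return 1 ;;
    esac
    if [ "$interval" -gt 3600 ]; then
        echo "interval must be between 0 and 3600" >&2
        return 1
    fi
    # hcheck_mode is left at its default (nonactive) explicitly.
    echo "chdev -l $disk -a hcheck_interval=$interval -a hcheck_mode=nonactive -P"
}

# Example: check hdisk0 every 60 seconds (illustrative value).
hcheck_chdev hdisk0 60
```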

For load balancing I am most interested in the adapter used (as this determines path).

The variables are:
vscsi_err_recov: "VSCSI Error Recovery algorithm" : values: "delayed_fail, fast_fail", default: "delayed_fail"
vscsi_path_to: "Virtual SCSI Path Timeout": values: "0-3600,1", default: "0"

The vscsi_err_recov parameter has a function similar to the fc_err_recov attribute of the Fibre Channel SCSI interface fscsiX:
Quote
# lsattr -El fscsi0
attach       al           How this adapter is CONNECTED         False
dyntrk       no           Dynamic Tracking of FC Devices        True
fc_err_recov delayed_fail FC Fabric Event Error RECOVERY Policy True
scsi_id      0x1          Adapter SCSI ID                       False
sw_fc_class  3            FC Class for Fabric                   True

In short, with vscsi_err_recov set to fast_fail, the VIO client adapter will send a FAST_FAIL datagram to the VIO server and fail the I/O immediately rather than after a delay. This may help to improve MPIO failover.

The vscsi_path_to attribute functions much like hcheck_interval. A value of 0 disables it, while a positive value allows the virtual client adapter driver to determine the health or status of the VIO Server, which improves and expedites path failover processing.
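The two adapter attributes can be handled the same way. This sketch validates the values against what PdAt lists (delayed_fail or fast_fail, and 0-3600) and prints the chdev command for review. The function name vscsi_chdev and the 30-second timeout are assumptions for illustration. As with the disks, -P defers the change to the next reboot of the client.

```shell
#!/bin/sh
# Dry run: validate vscsi adapter settings and print the chdev command.
# Allowed values are taken from the PdAt stanzas for adapter/vdevice/IBM,v-scsi.
vscsi_chdev() {
    adapter=$1
    recov=$2
    path_to=$3
    case $recov in
        delayed_fail|fast_fail) ;;
        *)
            echo "vscsi_err_recov must be delayed_fail or fast_fail" >&2
            return 1 ;;
    esac
    case $path_to in
        ''|*[!0-9]*)
            echo "vscsi_path_to must be a whole number" >&2
            return 1 ;;
    esac
    if [ "$path_to" -gt 3600 ]; then
        echo "vscsi_path_to must be between 0 and 3600" >&2
        return 1
    fi
    echo "chdev -l $adapter -a vscsi_err_recov=$recov -a vscsi_path_to=$path_to -P"
}

# Example: fast_fail with a 30 second path timeout (illustrative value).
vscsi_chdev vscsi0 fast_fail 30
```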

potatoman

Re: Load Balancing VIO clients
« Reply #4 on: November 20, 2010, 11:33:21 AM »
Thanx.