Date: 18-09-20  Time: 11:23 AM

Author Topic: number of VIOS for large virtualized server  (Read 1630 times)


fplatel

  • New Member
  • *
  • Posts: 3
  • Karma: +0/-0
number of VIOS for large virtualized server
« on: April 14, 2020, 09:05:50 AM »
Hello
One of my customers is about to acquire three E980 POWER9 servers with a lot of cores (168 cores activated in total) and memory (more than 4 TB of RAM in total). Two servers will be on the production site, plus one on the DR site.
They want to consolidate several LPARs from various P6/P7/P8 servers; I can't remember exactly how many, but it will probably be dozens.
Network virtualization will be mostly via vNIC with SR-IOV 10 Gbps adapters, and storage access via NPIV over 32 Gb FC adapters.

As I plan to implement vNIC failover with a highly redundant architecture, especially for some critical LPARs, I am wondering whether it is good practice to use more than two VIOS per physical machine. I am thinking about dedicating a pair of VIOS to the critical LPARs while the other LPARs use another pair of VIOS. Or is it better to use only two VIOS for all LPARs, with more entitled processing capacity per VIOS?
My concern is to ensure that the critical LPARs get constant performance, which is why I want dedicated access to the virtualized resources.

Thanks in advance for sharing your advice.
Best regards
Fabrice


Michael

  • Administrator
  • Hero Member
  • *****
  • Posts: 1266
  • Karma: +0/-0
Re: number of VIOS for large virtualized server
« Reply #1 on: April 15, 2020, 07:50:28 AM »
It is quite common to have more than one pair of VIOS servers, for various reasons; the ones you mention are common.
If you are considering LPM, especially frequent LPM, consider a dedicated pair of MSPs (Mover Service Partitions) with zero client partitions. This ensures that the "application" networks are not (directly) affected by LPM. Note: even though MSPs such as these are not hosting clients in the classic SEA Failover sense, they could still be involved in a vNIC failover scheme.
Further, if you are considering using LPM to migrate partitions, remember that the POWER6 ones will have to go down at least once: POWER9 only supports POWER9, POWER8 and POWER7 compatibility modes. An inactive LPM (with the partition powered off) should be possible between POWER6 and POWER9.
And, in your position, I would try to plan AIX updates, if any, prior to the migrations. Also, before any critical partitions are on the POWER9, do any firmware updates, including an IPL (if there are any DEFERRED updates).
Lastly, do not forget to examine PEP (Power Enterprise Pools) and SRR (Simplified Remote Restart).
Enjoy - it sounds like you get to build/architect something to be proud of!
Michael
« Last Edit: April 15, 2020, 09:43:22 AM by Michael »

fplatel

  • New Member
  • *
  • Posts: 3
  • Karma: +0/-0
Re: number of VIOS for large virtualized server
« Reply #2 on: April 20, 2020, 09:54:58 AM »
Hello Michael and many thanks for your quick answer.
Yes, we are planning to use LPM, but not for frequent LPAR moves: it is intended primarily for failover in case of hardware failure or maintenance on one of the two production E980s. We will make use of PEP, and most of the cores are mobile activations.

So what you were saying about LPM is that it is common practice to keep some VIOS idle (no clients attached) just to use them as MSPs with dedicated network access?

Regarding vNIC failover: as each backing device is statically bound to one VIOS and we will have at least two LAN switches, I plan to dedicate a set of four SR-IOV ports to my critical LPARs. Since I have four VLANs, I think I will use one physical SR-IOV port from each of four adapters to create the logical ports for my four VLANs, like this:

vNIC1/VLAN1: main logical port on port1 of adapter1 managed by VIOS1, with backups on port1 of adapter2/VIOS2, port1 of adapter3/VIOS1 and port1 of adapter4/VIOS2.

For the other vNICs/VLANs I use the same set of physical ports to define the backing devices, but rotate the main port over the set of four physical ports. The main backing devices will be:
vNIC2/VLAN2 => port1 of adapter2 / VIOS2
vNIC3/VLAN3 => port1 of adapter3 / VIOS1
vNIC4/VLAN4 => port1 of adapter4 / VIOS2

adapter1 is attached to LAN switch 1
adapter2 is attached to LAN switch 2
adapter3 is attached to LAN switch 1 (or switch 3)
adapter4 is attached to LAN switch 2 (or switch 4)
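If it helps, the rotation above can be sketched in a few lines of Python. The adapter/VIOS/switch labels are just this post's naming, not any HMC or VIOS API:

```python
# Sketch of the backing-device rotation described above.
# Labels (adapterN, VIOSn, switchN) follow the post, not any real API.

ADAPTERS = [  # (adapter, owning VIOS, LAN switch)
    ("adapter1", "VIOS1", "switch1"),
    ("adapter2", "VIOS2", "switch2"),
    ("adapter3", "VIOS1", "switch1"),
    ("adapter4", "VIOS2", "switch2"),
]

def vnic_plan(n_vlans=4):
    """For each vNIC/VLAN, rotate which adapter holds the main
    logical port; the remaining three adapters serve as backups."""
    plan = {}
    for i in range(n_vlans):
        primary = ADAPTERS[i % len(ADAPTERS)]
        backups = [a for a in ADAPTERS if a is not primary]
        plan[f"vNIC{i + 1}/VLAN{i + 1}"] = {"primary": primary, "backups": backups}
    return plan

for name, p in vnic_plan().items():
    adapter, vios, switch = p["primary"]
    backups = ", ".join(a for a, _, _ in p["backups"])
    print(f"{name}: main port1 on {adapter} ({vios}, {switch}); backups on {backups}")
```

Each vNIC keeps three backups across both VIOS and both switch fabrics, so no single adapter, VIOS or switch failure isolates a VLAN, while the main-port rotation spreads steady-state traffic over all four physical ports.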

I am in charge of only some critical LPARs that need a highly redundant, high-performance architecture, but my customer will probably follow the same kind of architecture for the other LPARs on the same servers.

Thank you also for your advice about the POWER6 migration and about software and firmware levels; I will give instructions so that we have the latest levels in place during the validation tests.

Best regards

Fabrice

Michael

  • Administrator
  • Hero Member
  • *****
  • Posts: 1266
  • Karma: +0/-0
Re: number of VIOS for large virtualized server
« Reply #3 on: April 20, 2020, 05:55:48 PM »
Quote
Hello Michael and many thanks for your quick answer.
Yes we are planning to use LPM but not for frequent LPAR move: it is supposed to be used primarily for failover in case of hardware failure or maintenance of one of the two production E980, we will make use of PEP and most of the cores are mobile activation.

So what you were saying about LPM is that it is a common practice to keep some VIOS idle (no client attached) just to use them as MSP with a dedicated network access ?
Yes, it is (becoming) common practice, especially when LPM is frequent and/or frame evacuation must be as quick as possible, generally with a dedicated (non-user) network.
Remember: for a failover (crash of a frame or site), LPM is too late. For that you need to be LPM-capable, but you would use SRR for the "movement" of the partition. SRR ensures that when the frame/site comes back online, the "original" LPAR gets removed, so there is no accidental double activation.
Quote
Regarding vNIC failover, as each backing device is statically bound to one VIOS and we will have at least 2 LAN switches, I plan to dedicate a set of 4 sr-iov ports for my critical LPARs and as I have 4 VLANs I think I will use one physical sr-iov port from 4 adapters to create logical ports for my 4 VLANs like this :

vNIC1/VLAN1 : main logical port on port1 of adapter1 managed by VIOS1 and backup on port1 of adapter2/VIOS2, port1 of adapter3/VIOS1 and port1 of adapter4/VIOS2

for other vNICs/VLANs I use the same set of physical ports to define the backing devices but with a rotation of main ports over the set of 4 physical ports : main backing device will be
vNIC2/VLAN2 => port1 of adapter2 / VIOS2
vNIC3/VLAN3 => port1 of adapter3 / VIOS1
vNIC4/VLAN4 => port1 of adapter4 / VIOS2

adapter1 is attached to LAN switch 1
adapter2 is attached to LAN switch 2
adapter3 is attached to LAN switch 1 (or switch 3)
adapter4 is attached to LAN switch 2 (or switch 4)
Sounds about right, but I prefer diagrams for networking.
Have you been to a recent POWER TechU? If so, there are some good presentations on vNIC setups for performance and elimination of SPOFs.
Quote

I am in charge for only some critical LPARs that need a highly redundant and performant architecture, but my customer will probably follow the same kind of architecture for other LPARs on the same servers.

Thank you also for your advice about POWER6 migration and also regarding software and firmware levels, I will give instructions so that we will have latest levels during the validation tests.

Best regards

Fabrice
Glad to be of assistance.

fplatel

  • New Member
  • *
  • Posts: 3
  • Karma: +0/-0
Re: number of VIOS for large virtualized server
« Reply #4 on: April 29, 2020, 02:47:47 PM »
Hi Michael
I was wondering: if I have two LPARs communicating on the same VLAN through vNICs whose main backing devices are on the same SR-IOV physical port, the packets won't leave the adapter, which is fine for me. But does it make a difference whether I choose the same VIOS for the main backing device on both LPARs, or two different VIOS?
My concern is to have the shortest communication path between the two LPARs over this VLAN: one LPAR is for an application environment and the other will host some Oracle databases. The application is badly written, with small SQL queries executed in loops, so it is highly sensitive to any change in latency, and I want the smallest possible network latency between these two LPARs.
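To put numbers on why this worries me, here is a back-of-the-envelope sketch. The figures are purely illustrative, not measurements from this setup:

```python
# Back-of-the-envelope: an application issuing small SQL queries
# serially in a loop pays the network round trip once per query,
# so total elapsed time grows linearly with per-hop latency.

def loop_time_ms(n_queries, rtt_us, query_cpu_us=50):
    """Elapsed time (ms) for a serial loop of small queries:
    each iteration costs one round trip plus some fixed CPU work.
    rtt_us and query_cpu_us are illustrative microsecond figures."""
    return n_queries * (rtt_us + query_cpu_us) / 1000.0

# 10,000 serial queries: tripling the round-trip latency from
# 100 us to 300 us more than doubles the loop's elapsed time.
print(loop_time_ms(10_000, 100))  # 1500.0 ms
print(loop_time_ms(10_000, 300))  # 3500.0 ms
```

With thousands of serial round trips, even a small per-packet latency difference between network paths multiplies directly into the application's response time.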

By the way, do you have any information about the performance/latency difference between vNIC adapters and an SR-IOV logical port directly assigned to the LPAR? Is there a significant difference?

Michael

  • Administrator
  • Hero Member
  • *****
  • Posts: 1266
  • Karma: +0/-0
Re: number of VIOS for large virtualized server
« Reply #5 on: April 30, 2020, 08:30:24 AM »
Quote
Hi Michael
I was wondering : if I have two lpars communicating on one same vlan through vnics defined with the main backing device on the same sr-iov physical port, the packets won't get out of the adapter and it is fine for me, but does it make a difference if I choose the same VIOS for the main backing device on both LPARs or if I choose two different VIOS ?
My concern is to have the shortest communication path between both lpars over this vlan, because one lpar is for an application environment and the other lpar will host some oracle databases, and the application is badly written with small SQL queries executed in loops so it is highly impacted by any change of latency and I want the smallest possible network latency for the communication between these two lpars.
Not sure I understand the question. However, if your goal is to ensure that multiple LPARs share the same physical port, I would define the backup ports on the same backup VIOS. Assumption: if the main port has failed, it has failed for all LPARs connecting through it. For any other kind of failure, I would not be able to guess at the system-behavior impact on partition-to-partition communication.
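That grouping rule can be checked mechanically. A small sketch with hypothetical names (this is not an HMC or VIOS tool, just the rule expressed as code):

```python
# Rule: LPARs whose vNICs share the same main physical port should
# all fail over to the same backup VIOS, so a port failure moves
# them together and they keep a common path afterwards.

from collections import defaultdict

def check_backup_grouping(vnics):
    """vnics: list of (lpar, main_port, backup_vios) tuples.
    Returns the main ports whose client LPARs disagree on backup VIOS."""
    backups_by_port = defaultdict(set)
    for _lpar, port, backup in vnics:
        backups_by_port[port].add(backup)
    return [port for port, backups in backups_by_port.items() if len(backups) > 1]

plan = [
    ("lparA", "adapter1-port1", "VIOS2"),
    ("lparB", "adapter1-port1", "VIOS2"),  # same port, same backup: consistent
    ("lparC", "adapter2-port1", "VIOS1"),
]
print(check_backup_grouping(plan))  # [] means every shared port is consistent
```

An empty result means every shared main port fails over as one group; any port listed is one where two LPARs would land on different VIOS after a failover.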
Quote
By the way, do you have any information about the difference of performance / latency between vNIC adapters vs sr-iov logical port directly assigned to the LPAR ? is there a significant difference ?
No personal experience.
The way I understand the role of the VIOS in vNIC communication, there should be no, or a nearly unmeasurable, performance difference. Just as with a "direct connect" SR-IOV logical port, the partition's data moves directly to the port rather than via the VIOS, as it would with SEA.
Hope this helps!

orian

  • New Member
  • *
  • Posts: 2
  • Karma: +0/-0
Re: number of VIOS for large virtualized server
« Reply #6 on: June 21, 2020, 02:56:40 PM »
vNIC through VIOS:

  VIOC                                    VIOS
  aix tcp/ip stack                        SEA bridge (data moving)
  vNIC driver          -> vNIC driver  -> enet adapter driver
          physical memory DMA          -> outside/network switch

vs.

SR-IOV logical port assigned directly:

  VIOC
  aix tcp/ip stack
  enet adapter driver
  physical memory DMA / SR-IOV sharing -> outside/network switch

The appended delay could be up to a hundred milliseconds or more in the worst case. Throughput shouldn't be impacted for lightweight network traffic, but if the VIOS is short of physical CPU (entitled PU) and there is already a large network delay, for example over a long distance (more than a hundred km) between the production and DR sites, the appended delay may not be acceptable.

To use CPU more effectively, use VIOS/SEA and balance the CPU allocation based on real tests.
To get the best performance, especially for response-sensitive applications, SR-IOV would be the best choice.

orian

  • New Member
  • *
  • Posts: 2
  • Karma: +0/-0
Re: number of VIOS for large virtualized server
« Reply #7 on: June 21, 2020, 03:04:09 PM »
Quote
Hi Michael
I was wondering : if I have two lpars communicating on one same vlan through vnics defined with the main backing device on the same sr-iov physical port, the packets won't get out of the adapter and it is fine for me, but does it make a difference if I choose the same VIOS for the main backing device on both LPARs or if I choose two different VIOS ?
My concern is to have the shortest communication path between both lpars over this vlan, because one lpar is for an application environment and the other lpar will host some oracle databases, and the application is badly written with small SQL queries executed in loops so it is highly impacted by any change of latency and I want the smallest possible network latency for the communication between these two lpars.

If you just want communication between LPARs on the same box, and no partial takeover happens (I mean the app may end up running on a different box than the DB), nothing else is needed: no physical card, no VIOS. Just create a VLAN and put a virtual Ethernet adapter on each partition onto that VLAN; that is all.

Don't forget to enable firmware memory mirroring on the E980 to have redundancy, since there is only ONE such VLAN.