What’s New in vSphere 5

Here is a list I compiled after completing my What’s New In vSphere 5 class.  I won’t bother getting into licensing….

- VMDK size is no longer limited by the datastore block size
- VMFS datastores scale to 64 TB, with a 2 TB limit on individual files
- 32 vCPUs per VM, up from 8, and 1 TB of RAM
- NFS datastores increase from 32 to 256
- SATP modules are loaded on demand; no more setting them from the command prompt
- iSCSI port binding is GUI driven; no more setting it from the command prompt
- Software-based FCoE initiator

- VAAI: thin provisioning stun plus VMFS space reclamation. As the datastore reaches maximum capacity, instead of overfilling and risking data corruption, VMs are put in suspend mode at 97%. Space reclamation works with the storage array vendor to shrink the LUN size on a thin-provisioned volume as used space decreases, which increases storage efficiency.
- vCenter client for Linux
- vCenter appliance, Linux based. No, it doesn’t support linked mode.
- Profile-Driven Storage: create storage profiles for SLAs based on performance and availability
- Storage DRS: DRS based on I/O and size
- Virtual machine format 8: not really exciting, USB 3.0 support and also Windows Aero
- Storage I/O Control: extended to NFS
- Network I/O Control: more granular, offering per-virtual-machine control
- vSphere Web Client: wasn’t this in ESX 2?
- vMotion RTT latency tolerance increases from 5 ms to 10 ms in Enterprise Plus

http://www.vmware.com/files/pdf/products/vsphere/vmware-what-is-new-vsphere5.pdf

http://www.vmware.com/files/pdf/techpaper/Whats-New-VMware-vSphere-50-Storage-Technical-Whitepaper.pdf

http://www.vmware.com/files/pdf/techpaper/Whats-New-VMware-vSphere-50-Networking-Technical-Whitepaper.pdf

Why VMware Enterprise Plus?

I got this question twice this week: “What is the benefit of VMware Enterprise Plus?” When I was a VMware customer I went Enterprise Plus right away. Lots of people complained; however, it’s great value IMO. Perhaps the biggest value of Enterprise Plus is simply the increase in licensed cores per processor from 6 to 12. The current Nehalem-EX processors have 8 cores and AMD’s have 12. Besides this, there are other great features:

  1. Host Profiles: used to automate configuration and ensure compliance. VMware adds more configuration parameters with each release.
  2. 8-core SMP virtual machines
  3. Network I/O Control: QoS for virtual network resources
  4. Storage I/O Control: prioritization of VM access to shared storage, redistributes VMs un
  5. Distributed Virtual Switch: besides trimming down to one switch and adding hosts to it for consistency, there are a few other benefits: PVLANs, Load Based Teaming, ingress traffic shaping, and it is required for Cisco Nexus 1000V integration

Here is a quick review of PVLANs with a few use cases (they were new to me):

  • Community: ports can communicate with each other within the community and with the router. Use case: a DMZ with one app server and one corresponding database server.
  • Isolated: ports can only communicate with the router. Use case: a DMZ with a standalone web server that doesn’t need to communicate with other servers in the DMZ, as is often the case. Second use case: a VDI environment where desktops don’t need to communicate with each other (this example is from Eric Slueth’s blog video on PVLANs).
  • Promiscuous: for routers.
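
The three port types above boil down to a small set of reachability rules. Here is a minimal Python sketch of them; the tuple layout, the function name `can_communicate` and the community number 10 are illustrative assumptions, not anything taken from the vDS configuration itself.

```python
from typing import Optional, Tuple

# A port is (port_type, community_id). community_id only matters for
# community ports. This layout is an assumption made for illustration.
Port = Tuple[str, Optional[int]]


def can_communicate(a: Port, b: Port) -> bool:
    """Return True if two ports in the same primary PVLAN may talk."""
    type_a, community_a = a
    type_b, community_b = b
    # Promiscuous ports (routers, gateways) can talk to everything.
    if "promiscuous" in (type_a, type_b):
        return True
    # Isolated ports talk only to promiscuous ports, not even to each other.
    if "isolated" in (type_a, type_b):
        return False
    # Two community ports talk only when they share a community.
    return community_a == community_b


router = ("promiscuous", None)
app_server = ("community", 10)
db_server = ("community", 10)
web_server = ("isolated", None)
vdi_desktop = ("isolated", None)

print(can_communicate(app_server, db_server))    # same community
print(can_communicate(web_server, vdi_desktop))  # isolated to isolated
print(can_communicate(web_server, router))       # isolated to router
```

The DMZ app/database pair and the VDI desktops from the use cases map directly onto the `community` and `isolated` entries here.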

Configuration Best Practices for vSphere and HP EVA

Recently I was helping a customer architect a solution and went back to the best practices guide for vSphere and the EVA. You can find it here:  http://h20195.www2.hp.com/V2/GetPDF.aspx/4AA1-2185ENW.pdf

I thought I would write a quick summary with comments, as some of it seems not to have been kept up to date.

  1. When creating hosts in Command View, create them as Type: VMware
  2. Use a single disk group, as large as possible, with disks of the same size
  3. Single disk group sparing is sufficient unless MTTR is > 7 days
  4. Don’t use Vraid6; Vraid5 provides adequate redundancy
  5. When using disks of varying performance characteristics, still use a single disk group
  6. Balance ownership of your VMFS LUNs across both controllers, with alternating failover and failback policies. Read more here: http://www.ivobeerens.nl/?p=465
  7. Configure the Round Robin advanced parameter to IOPS=1. I don’t think HP is right here; this was fixed in vSphere 4 Update 1, so it shouldn’t be necessary
  8. When using DR groups, ensure that the DR groups’ managing controllers are spread across controllers
  9. Use fewer than 16 VMDKs per VMFS datastore
  10. Use the same LUN ID on every host for a given LUN
  11. Create VMFS datastores through vCenter for proper alignment
  12. For VMs with heavy I/O load, use paravirtualized virtual adapters for the VM data LUNs
  13. Verify proper alignment of data drives for Windows 2003 machines from within the OS
  14. Set Round Robin multipathing on your hosts and then reboot. Don’t forget to set your Microsoft clusters to MRU. Here is the quick and easy command:
esxcli nmp satp setdefaultpsp --satp VMW_SATP_ALUA --psp VMW_PSP_RR

VMware NMP Errors and LUN Dropping with HP EVA SANs

Issue:

Recently I came across a situation where a customer was experiencing sporadic LUN drops in their vSphere clusters. This occurred in both 4.1 and 4.0. What was unusual is that it occurred across two HP storage arrays. The LUNs would disappear for a few minutes and then come back. I looked at it from the angle of performance problems in their SAN environment.

These 0x28 errors, shown below, aren’t unique to HP arrays; other manufacturers, including EMC and IBM, seem to be fighting them as well. You can read about it here:

Looking at the vmkernel logs, there were lots of lines like:

Dec 17 07:03:24 esx01 vmkernel: 22:01:35:32.584 cpu7:4517)NMP: nmp_CompleteCommandForPath: Command 0x2a (0x410001176200) to NMP device "naa.600508b40008dfcc0000600000cc0000" failed on physical path "vmhba2:C0:T0:L1" H:0x0 D:0x28 P:0x0 Possible sense data: 0x0 0x0 0x0.
Dec 17 07:03:24 esx01 vmkernel: 22:01:35:32.584 cpu7:4517)ScsiDeviceIO: 770: Command 0x2a to device "naa.600508b40008dfcc0000600000cc0000" failed H:0x0 D:0x28 P:0x0 Possible sense data: 0x0 0x0 0x0.
Here is the HP Published Solution
NOTE: The above-mentioned URL will take you to a non-HP Web site. HP does not control and is not responsible for information outside of the HP Web site.
The hexadecimal values H:0x0 D:0x28 P:0x0 decode to Task Set Full, as per the above article.
VMware reports this error when the storage controller returns a Queue Full or BUSY signal to an I/O request.
A storage controller may return a Queue Full or BUSY signal when it encounters resource congestion due to overutilization.
In VMware environments, this may be caused by high queue depth at controller ports during heavy workload or by large I/Os issued by VMware. By default VMware is capable of sending I/O blocks up to 32MB.
In many cases the following steps have helped mitigate the issue:
  1. Capture the evaperf logs during the time the errors are reported and ensure that the array utilization is well within the acceptable safe IOPS values for the given configuration.
  2. Set the maximum IO size to 128 as mentioned in the below VMware article.
  3. Follow the EVA – VMware best practices guide and ensure the multipath policy is set correctly.
    Click here to access the technical article available at http://h20195.www2.hp.com/v2/GetPDF.aspx/4AA1-2185ENW.pdf.
  4. Enable Adaptive Queue depth throttling as mentioned in the below article.
    NOTE: The above-mentioned URLs will take you to a non-HP Web site. HP does not control and is not responsible for information outside of the HP Web site.
    For EVA, a QFullSampleSize value of 32 and a QFullThreshold value of 8 have been found helpful in many cases.
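
When there are a lot of these vmkernel lines to go through, the H:/D:/P: triplet can be pulled out and decoded with a few lines of script. The following is a minimal Python sketch: the device status table contains only a handful of standard SCSI status codes, and the function name `decode_status` and the returned dictionary layout are assumptions for illustration. The regular expression also accepts the "×" character in place of "x", since logs pasted through a web page sometimes end up with it.

```python
import re

# A few standard SCSI device status codes (SCSI Architecture Model).
DEVICE_STATUS = {
    0x00: "GOOD",
    0x02: "CHECK CONDITION",
    0x08: "BUSY",
    0x18: "RESERVATION CONFLICT",
    0x28: "TASK SET FULL",
}

# Matches the host / device / plugin status triplet in a vmkernel line.
STATUS_RE = re.compile(
    r"H:0[x×]([0-9a-fA-F]+) D:0[x×]([0-9a-fA-F]+) P:0[x×]([0-9a-fA-F]+)"
)


def decode_status(line):
    """Return the decoded status triplet, or None if the line has none."""
    match = STATUS_RE.search(line)
    if match is None:
        return None
    host, device, plugin = (int(value, 16) for value in match.groups())
    return {
        "host": host,
        "device": device,
        "plugin": plugin,
        "device_text": DEVICE_STATUS.get(device, "unknown"),
    }


sample = (
    "ScsiDeviceIO: 770: Command 0x2a to device "
    '"naa.600508b40008dfcc0000600000cc0000" failed '
    "H:0x0 D:0x28 P:0x0 Possible sense data: 0x0 0x0 0x0."
)
print(decode_status(sample)["device_text"])
```

Counting how many lines decode to TASK SET FULL or BUSY per LUN over time is a quick way to see whether the queue depth changes above are having any effect.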

SNMP Monitoring HP ESX Servers Tips and Tricks

Recently I was at a customer’s site doing a vSphere 4.1 upgrade and SAN expansion. I configured the ESX hosts with SNMP monitoring for proactive support, using Insight Remote Support for call-home functionality and alerting in case of hardware failures. Here are three good points I came across in dealing with SNMP monitoring configurations:

1)  If you deploy the latest 4.1 image through the HP Insight Server Deployment Console, you won’t have to install the HP agents. I really like this; it saves a lot of time.

Note:  If you’re not using HP Insight Server Deployment, you can get the agent here at this link; you will have to install it manually. The install is pretty lengthy, more than just an RPM. You can take all the defaults and then modify what I show in step 2 below.

2)  The easiest way to configure the settings is post-install through the System Management Homepage. You can configure one system, then copy its configuration as shown in the System Management Homepage to the System Management Homepage of the other ESX hosts. Below is the information that you will want to copy from the host. Alternatively, if you are just starting out, fill in the placeholders in <> below with your desired configuration and you’re good to go.

dlmod cmaX /usr/lib64/libcmaX64.so
rwcommunity <your private community name> 127.0.0.1
rocommunity <your public community name> 127.0.0.1
rwcommunity  <your private community name> <your ip of Insight Control server w/ Remote Support>
rocommunity  <your public community name> <your ip of Insight Control server w/ Remote Support>
trapcommunity <your public community name>
trapsink <your ip of Insight Control server w/ Remote Support> <your public community name>
syscontact Root <root@localhost> (configure /etc/snmp/snmp.local.conf)
syslocation Unknown (edit /etc/snmp/snmpd.conf)
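
If you have more than a couple of hosts, filling in the placeholders by hand gets tedious. Here is a minimal Python sketch that renders the community and trap lines above from three values. The function name `render_snmpd_conf`, the example community names and the 192.0.2.10 address are all made up for illustration, and the syscontact/syslocation lines are left out on purpose since they are host specific.

```python
def render_snmpd_conf(rw_community, ro_community, insight_ip):
    """Render the community and trap lines for one ESX host."""
    lines = [
        "dlmod cmaX /usr/lib64/libcmaX64.so",
        f"rwcommunity {rw_community} 127.0.0.1",
        f"rocommunity {ro_community} 127.0.0.1",
        f"rwcommunity {rw_community} {insight_ip}",
        f"rocommunity {ro_community} {insight_ip}",
        f"trapcommunity {ro_community}",
        f"trapsink {insight_ip} {ro_community}",
    ]
    return "\n".join(lines) + "\n"


# Example values only; substitute your own communities and server address.
conf = render_snmpd_conf("myprivate", "mypublic", "192.0.2.10")
print(conf)
```

The output is plain text, so it can be compared against what the System Management Homepage shows on an already configured host before it is copied anywhere.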

3) Don’t forget: to send test traps you have to comment out the “Defaults requiretty” line in /etc/sudoers on each ESX box :)

Rethinking Network Design with FlexFabric and vSphere 4.1

The biggest changes we have seen with vSphere are the vDS (distributed virtual switch) and FT (Fault Tolerance). On the network side of the house, one of the biggest changes in vSphere 4.1 is Network I/O Control (NetIOC), not to mention the seldom-mentioned LBT (load-based teaming). While these changes occur on the VMware side, HP’s virtual 10Gb networking is also changing, bringing us Flex-Fabric, which is enough change to cause real confusion. I see these changes as bringing a certain synergy to the datacenter from an HP blade perspective with vSphere 4.1 implementations. I also see serious cost savings and increased efficiency, not to mention a cleaner design.

Here is the look from the physical topology:

Flex-Fabric brings HP’s Flex-10 converged I/O and one-hop FCoE. While that is maybe not as far as Cisco goes with their current Nexus line, there is still a major cost savings to the customer. Take for instance the current Flex-10 Virtual Connect implementation: each c7000 would need a minimum of four switches, two for Virtual Connect SAN connections and two for network uplinks. Now with converged I/O the customer could buy two switches and save roughly 40k per c7000. The two switches would each have uplinks to both SAN and network.

What will this look like inside of virtual connect/flex-fabric?

Instead of getting four FlexNICs you will get three and one converged adapter.

Will this work on G5s or G6s? What about previous Flex-10 modules?

No, unfortunately: only G7s and Flex-Fabric modules.

Any recommendations or thoughts on a network design with vSphere 4.1 and Flex-Fabric?

VMware provides us with this best practice document: http://www.vmware.com/files/pdf/techpaper/VMW_Netioc_BestPractices.pdf  I honestly haven’t seen the latest cookbook for Virtual Connect. Hopefully it addresses the distributed virtual switch now; that was lacking last I saw.

Utilize vSphere Network I/O Control and 802.1Q with different dv port groups for virtual machine traffic, FT, vMotion, and service console to achieve fully dynamic use of your 10Gb bandwidth, and only use two uplinks (the converged I/O) in an active/active scenario. I really don’t see the sense in having one vDS with different dv port groups mapped to different uplinks on different virtual FlexNICs, since VMware already has features to tackle I/O contention and prioritize latency-sensitive traffic, like shares, limits, and traffic shaping. I would avoid the use of limits and reservations where possible. Shares will trump limits and reservations, providing better use of capacity.

Limit vMotion through egress traffic shaping at the dv port group, as ingress isn’t needed with NetIOC. This will help in a situation with multiple vMotions from many hosts. Picture a scenario where you place multiple hosts in a cluster in maintenance mode and DRS is set to fully automated. Limiting the maximum vMotion bandwidth will help ensure latency-sensitive traffic isn’t interrupted. The example below limits vMotion to 3 Gbps.

Network Resource Pool | Host Limit | Physical Share | Share Value
FT | Unlimited | High | 100
vMotion Traffic | 3 Gbps | Normal | 50
Management | Unlimited | Normal | 50
VirtualMachine Traffic | Unlimited | Custom | 75
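
To get a feel for what these share values mean on a saturated 10Gb uplink, here is a minimal Python sketch of the arithmetic. It is a simplified model and not how the NetIOC scheduler is actually implemented: it assumes every listed pool is actively sending, that shares are positive, that bandwidth is split in proportion to shares, and that any pool hitting its limit hands the excess back to the others. The function name `allocate` is made up, and the vMotion limit is treated as 3 Gbit/s.

```python
def allocate(link_gbps, pools):
    """Split link_gbps among pools given as {name: (shares, limit or None)}."""
    allocation = {}
    active = dict(pools)
    remaining = float(link_gbps)
    while active:
        total_shares = sum(shares for shares, _ in active.values())
        # Pools whose proportional slice would exceed their limit get capped.
        capped = {
            name
            for name, (shares, limit) in active.items()
            if limit is not None and remaining * shares / total_shares > limit
        }
        if not capped:
            for name, (shares, _) in active.items():
                allocation[name] = remaining * shares / total_shares
            break
        for name in capped:
            allocation[name] = active[name][1]
            remaining -= active[name][1]
            del active[name]
    return allocation


pools = {
    "FT": (100, None),
    "vMotion Traffic": (50, 3.0),
    "Management": (50, None),
    "VirtualMachine Traffic": (75, None),
}
for name, gbps in allocate(10, pools).items():
    print(f"{name}: {gbps:.2f} Gbps")
```

With all four pools busy, vMotion's proportional slice (50 of 275 shares) is already under 3 Gbit/s, so the limit only bites when fewer pools are competing, which is exactly the mass-maintenance-mode scenario described above.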

So in active/active Flex-10 or Flex-Fabric, does this mean that it will load balance automatically?

This isn’t exactly what you think. This is where LBT steps in. To use it, select “Route based on physical NIC load” in the teaming and failover settings of your dv port group. LBT will only move a flow when the mean send and receive utilization of an uplink exceeds 75% of capacity over a 30-second period, so it won’t move flows more often than every 30 seconds. There may be some hidden way to adjust that setting; I just don’t know it. :)
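
The 75% rule is easy to express as a check. Below is a minimal Python sketch, not VMware's implementation: it assumes you already have utilization samples covering one 30-second window, and it assumes that either direction (send or receive) crossing the threshold is enough to count. The function name `uplink_is_saturated` and the sample values are made up.

```python
def uplink_is_saturated(samples, capacity_mbps, threshold=0.75):
    """samples: (tx_mbps, rx_mbps) pairs covering one 30-second window."""
    if not samples:
        return False
    mean_tx = sum(tx for tx, _ in samples) / len(samples)
    mean_rx = sum(rx for _, rx in samples) / len(samples)
    # Assumption: the busier of the two directions decides.
    return max(mean_tx, mean_rx) > threshold * capacity_mbps


# Three samples on a 10Gb uplink: mean send is 8000 Mbps, above 7500.
window = [(7800, 1200), (8100, 900), (8100, 1500)]
print(uplink_is_saturated(window, 10000))
```

A short burst inside the window does not trip the check as long as the mean stays under the threshold, which matches the point that LBT reacts to sustained load rather than spikes.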

The actual best practices from VMware are as follows:

NetIOC Best Practices: VMware provides us with this best practice document: http://www.vmware.com/files/pdf/techpaper/VMW_Netioc_BestPractices.pdf

Flex-Fabric Best Practices:      http://h20000.www2.hp.com/bc/docs/support/SupportManual/c02499726/c02499726.pdf

Best practice 1: When using bandwidth allocation, use “shares” instead of “limits,” as the former has greater flexibility for unused capacity redistribution. Partitioning the available network bandwidth among different types of network traffic flows using limits has shortcomings. For instance, allocating 2Gbps bandwidth by using a limit for the virtual machine resource pool provides a maximum of 2Gbps bandwidth for all the virtual machine traffic even if the team is not saturated. In other words, limits impose hard limits on the amount of the bandwidth usage by a traffic flow even when there is network bandwidth available.

Best practice 2: If you are concerned about physical switch and/or physical network capacity, consider imposing limits on a given resource pool. For instance, you might want to put a limit on vMotion traffic flow to help in situations where multiple vMotion traffic flows initiated on different ESX hosts at the same time could possibly oversubscribe the physical network. By limiting the vMotion traffic bandwidth usage at the ESX host level, we can prevent the possibility of jeopardizing performance for other flows going through the same points of contention.

Best practice 3: Fault tolerance is a latency-sensitive traffic flow, so it is recommended to always set the corresponding resource-pool shares to a reasonably high relative value in the case of custom shares. However, in the case where you are using the predefined default shares value for VMware FT, leaving it set to high is recommended.

Best practice 4: We recommend that you use LBT as your vDS teaming policy while using NetIOC in order to maximize the networking capacity utilization.

NOTE: As LBT moves flows among uplinks it may occasionally cause reordering of packets at the receiver.

Best practice 5: Use the DV Port Group and Traffic Shaper features offered by the vDS to maximum effect when configuring the vDS. Configure each of the traffic flow types with a dedicated DV Port Group. Use DV Port Groups as a means to apply configuration policies to different traffic flow types, and more important, to provide additional Rx bandwidth controls through the use of Traffic Shaper. For instance, you might want to enable Traffic Shaper for the egress traffic on the DV Port Group used for vMotion. This can help in situations when multiple vMotions initiated on different vSphere hosts converge to the same destination vSphere server.

VCAP-DCD-BETA-Experience

Today I was scheduled to start my VCAP-DCD beta test. For the last month or so I have been working on prep for my DCA on Dec 6th, which means a lot of lab prep that takes forever. Last week I got a beta invite for the DCD, so I chose at the last minute to switch gears. Houston didn’t show available seats until Friday evening. I actually called Pearson VUE (VMware) Friday morning to see if there was a glitch when there were 0 open seats.

I decided to take the test and wish for the best.

Arriving at the test center at 8:00am, when my test was scheduled to start, I was confronted with a long line of doctors and told to take a number, literally. These doctors had to give biometrics, both palm and fingerprints, for their exam, and it took a hella long time for them to get going. I was told, “Not to worry, your exam doesn’t start till you’re seated.”

Anyhow, I was seated around 8:25 and then started a 10-minute survey. The exam seemed reasonable in difficulty as I was plowing away; however, all of a sudden I realized the questions were taking me too long, with 130 questions in 4 hours. The questions contained a lot of situations and were involved, which means you have to read them more than once. I could only answer the first 90 or so questions properly, then sped through the rest in my last half hour, and my test ended at 12:10, so I didn’t even get 4 hours. I think my clock started at 8:00 even though my exam hadn’t started then. Ugh. The problem is that these situational questions carry a ton of information; in hindsight, knowing what I know now, I would have skimmed the questions more. I also had a drag-and-drop question that I had to reset 7 times. Basically you drag 4 items from the left, more than once, to the 7 on the right, but that whole system was bad. I wasn’t expecting a perfect, glitch-free exam though. For the most part I thought the questions were good, although one Visio one was pretty awful. If the exam is cut down to about 100 questions, that will probably be about right. I did see that one guy only finished 80/130, so who knows. Hopefully I am not penalized for quick-answering the last 30 of my questions.

For anyone who reads this, here is my advice: watch your time, don’t dwell on any one thing, and if you don’t know it, move on. Oh, and btw, there is no back button; you can’t go back. :) I am glad I am not getting my results for 6-8 weeks, after my DCA.

Troubleshooting NetApp VSC 2.0 with vSphere 4.1

I recently came across a few issues implementing NetApp Virtual Storage Console with vSphere 4.1. If timeouts occur when adding filers to the Provisioning and Cloning section, or you get the generic message “A general Error has occured”, try the following workaround steps:

1) Does your filer have NFS enabled? Even if you’re not using it, enable it; support can provide you with a temporary key.

https://now.netapp.com/Knowledgebase/solutionarea.asp?id=kb59430

2) The e0M interface issue: the interface should be temporarily disabled (bug # 320355).

http://now.corp.netapp.com/NOW/cgi-bin/bol?Type=Detail&Display=320355

3) Increasing the ZAPI timeout via the VSCPreferences file is covered in the KB below, and also in VSC’s known issues and the IAG on the NOW site. Links below:

https://now.netapp.com/Knowledgebase/solutionarea.asp?id=kb59154

http://now.netapp.com/NOW/download/software/vsc_win/2.0/

4) Enable SSL 2.0 in your web browser.

VMware vSphere Site Survey Fault Tolerance Troubleshooting

I was troubleshooting a hardware compatibility issue with Fault Tolerance when I ran into this neat utility (Site Survey). A link to it can be found here on the VMware site. Once it is installed, close your vCenter client and open it back up, click on a cluster, and then click the Site Survey tab. A report will generate after a few minutes.

Also on this site is a CPU identification utility for vMotion compatibility testing, etc.

The ProLiant VMware Support Matrix with Insight Manager Agents for vSphere 4.x

Recently, working on an ESX 4.1 upgrade, I was tasked with getting the Systems Insight Manager agents for the appropriate HP servers installed and configured. I found this HP website extremely beneficial for cross-referencing HP server models with the supported ESX versions, along with a compatibility matrix for the appropriate SIM agent version with a download link :)

The page is called “VMware from HP”; here is the link. It covers supportability not only for ESX/ESXi but also for FT. This also makes it very easy to find the appropriate Insight Manager agents. Worth a bookmark at the least.

BL series VMware ESX/ESXi Server
FT ESXi 4.0 ESXi 4.0 U1 ESXi 4.0 U2 ESXi 4.1 ESX 4.0 ESX 4.0 U1 ESX 4.0 U2 ESX 4.1
BL2x220c G6(2)
BL260c G5
BL280c G6(2)
BL460c
BL460c G5
BL460c G6
BL465c
BL465c G5
BL465c G6
BL465c G7
BL480c
BL490c G6
BL495c G5
BL495c G6
BL680c G5
BL685c
BL685c G5
BL685c G6
BL685c G7
DL series VMware ESX/ESXi Server
FT ESXi 4.0 ESXi 4.0 U1 ESXi 4.0 U2 ESXi 4.1 ESX 4.0 ESX 4.0 U1 ESX 4.0 U2 ESX 4.1
DL160 G6(1)
DL160se G6
DL165 G7
DL170h G6(1)
DL180 G6(1)
DL320 G6(1)
DL360 G5
DL360 G6
DL360 G7 (3) (3) (3)
DL365
DL365 G5
DL370 G6
DL380 G5
DL380 G6
DL380 G7
DL385 G2
DL385 G5p
DL385 G5
DL385 G6
DL385 G7
DL580 G4(1)
DL580 G5
DL580 G7
DL585 G2
DL585 G5
DL585 G6
DL585 G7
DL785 G5
DL785 G6
DL980 G7
ML series VMware ESX/ESXi Server
FT ESXi 4.0 ESXi 4.0 U1 ESXi 4.0 U2 ESXi 4.1 ESX 4.0 ESX 4.0 U1 ESX 4.0 U2 ESX 4.1
ML150 G6(1)
ML330 G6
ML350 G5
ML350 G6
ML370 G5
ML370 G6
SL series VMware ESX/ESXi Server
FT ESXi 4.0 ESXi 4.0 U1 ESXi 4.0 U2 ESXi 4.1 ESX 4.0 ESX 4.0 U1 ESX 4.0 U2 ESX 4.1
SL165z G7
SL160z G6(1)
SL170z G6(1)
SL2x170z G6
Agents VMware ESX Server
4.0 4.0 U1 4.0 U2 4.1
8.2.5
8.3.0
8.3.1
8.4.0
8.5.1
8.6.0
* “Supported” indicates that an operating system has been successfully tested on the server and drivers are available. Servers are supported with one socket populated.

FT = Fault Tolerance.
(1) Not supported for ESXi
(2) Requires the HP custom ESXi image available at hp.com
(3) Need to use April 2010 image – ESXi HD-USB-SD Image Installer CD (583772-003.iso)

Active Directory Authentication with VMware vSphere ESX/ESXi 4.1 Gotchas

This post assumes you already know how to configure ESX/ESXi 4.1 for Active Directory. If not, this will get you up and running: http://ict-freak.nl/2010/09/12/how-to-configure-vsphere-4-1-active-directory-authentication/

3 Gotchas

1) After joining ESX/ESXi hosts to the domain and granting a group or user Administrator access, login failures occur.

-Looking in the /var/log directory, the output references the “ESX Admins” group during the authentication failure.

Oct  1 09:27:36 hostname lsassd[13781]: 0xf7544b90:Failed to find user or group. [Error code: 40071]
Oct  1 09:28:04 hostname nssquery: Group lookup failed for ‘YourDomain\ESX Admins’

Oct  1 09:29:04 hostname nssquery: Group lookup failed for ‘YourDomain\ESX Admins’
Oct  1 09:30:05 hostname nssquery: Group lookup failed for ‘YourDomain\ESX Admins’
Oct  1 09:32:06 hostname last message repeated 2 times
Oct  1 09:34:07 hostname last message repeated 2 times
Oct  1 09:36:08 hostname last message repeated 2 times

-After creating an ESX Admins group in Active Directory and then assigning it the Administrator role in vCenter, authentication worked properly.

2) If you log in to an ESX/ESXi 4.1 host that is joined to the domain and your AD account is a member of more than 32 security groups, the host will either reboot or become unresponsive. VMware knowledge base article: ESX host reboots, becomes unresponsive, or experiences a purple diagnostic screen when logging into the service console

3) After authenticating “properly” with AD credentials, I noticed I was stuck in a home directory of / rather than /home/%username%.

-Looking into this further, I found a knowledge base article on it:

Home directories are not automatically created for Domain Users on ESX/ESXi 4.1 hosts that are joined to an Active Directory Domain

The create-homedir codepath has been disabled on ESX/ESXi 4.1.  Attempting to configure this behavior using the /etc/likewise/lsassd.conf file will not succeed.  To configure home directories for Active Directory user accounts, the directories must be manually created.

The /etc/likewise/lsassd.conf file can be modified to detail the location of the home directories once they exist by adding or modifying these lines:

homedir-prefix = /home
homedir-template = %H/%U

Here homedir-prefix = /home sets the starting point for all home directories to /home, and homedir-template = %H/%U sets the home directory to the homedir-prefix (%H) followed by the user account name (%U). The variable %D can also be used to substitute the Active Directory domain name into the user’s home directory.
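
Since the directories have to be created by hand, it helps to know exactly which path a given template produces for each account. Here is a minimal Python sketch of the substitution described above. It only illustrates the %H, %U and %D expansion and is not the Likewise code itself; the function name `expand_homedir` and the sample account and domain names are made up.

```python
def expand_homedir(template, prefix, user, domain):
    """Expand %H (prefix), %D (domain) and %U (user) in a homedir template."""
    return (
        template.replace("%H", prefix)
        .replace("%D", domain)
        .replace("%U", user)
    )


# With the settings shown above:
print(expand_homedir("%H/%U", "/home", "jsmith", "YOURDOMAIN"))
# With the domain name included in the template:
print(expand_homedir("%H/%D/%U", "/home", "jsmith", "YOURDOMAIN"))
```

Running this over a list of the AD accounts that will log in gives you the list of directories to create on each host.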

Run these commands in sequence to restart the lsassd daemon and clear the Active Directory cache for these settings to take effect.

  1. /etc/init.d/lsassd stop
  2. rm /etc/likewise/db/lsass-adcache.filedb
  3. /etc/init.d/lsassd start

NMI Driver missing for HP servers and ESX/ESXi on 4.1

http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=1021609

What is NMI?

NMI stands for Non-Maskable Interrupt.  It can be generated by a hard memory error, or by a button in ILO.  When a hard memory error occurs you want the system to crash so that data isn’t corrupted.  When a system is in a hard hung state, and totally unresponsive, you can use the NMI facility in ILO to crash the system so that a dump is created.

Where to get the patch?

http://h20000.www2.hp.com/bizsupport/TechSupport/SoftwareDescription.jsp?lang=en&cc=us&prodTypeId=3709945&prodSeriesId=3884113&swItem=MTX-c2dd60fc9642432dae086527f4&prodNameId=3884114&swEnvOID=4091&swLang=13&taskId=135&mode=3

VM Won’t Power On…File Locked on an Unknown Host

In the process of completing a vSphere 4 migration I ran into a problem where I was unable to power on migrated machines due to file locks: the VM was still registered on a host in the 3.5 cluster. Through research I came across articles where customers had similar issues with DVR owning a lock.

Here are a few tricks if you end up with file lock issues:

  1. SSH into the host on which your VM won’t power on.
  2. cd to your VM’s directory on your VMFS datastore.
  3. Run: vmkfstools -D {yourvmname}.vmdk; grep -i owner /var/log/vmkernel
     Your output will look similar to this:

Aug 13 10:12:22 ESXHOSTNAME vmkernel: gen 717, mode 1, owner c24d759-19fb4dc3-e694-001f295d02f6 mtime 3915152674347]

  4. The last segment of the owner value (001f295d02f6) is the MAC address of the host that holds the lock. Next we need to check our hosts for that MAC address; rather than searching for the whole address, the last four digits will do fine.
  5. On each host that is a lock contender, SSH in (PuTTY) and run this command: ifconfig | grep -i 02:f6
     If it isn’t the host, you will just get the command prompt back; if it is the host, your output will look like this:

vmnic6    Link encap:Ethernet  HWaddr 00:1F:29:5D:02:F6

  6. Back in vCenter, select the VM and remove it from inventory.
  7. Re-add the virtual machine on the host that has the lock by browsing the VMFS datastore and clicking import virtual machine.
  8. Power on your VM.