I’m taking part in a webinar on 29th April in conjunction with PernixData and the UK-based Computing magazine.
I’ll be presenting a case study on what I found when virtualising Microsoft SQL Server under VMware and how PernixData FVP helped solve some significant storage performance problems.
Details for the webinar and the registration form are available at http://view6.workcast.net/register?pak=7542766467978297&referrer=pernix
I was asked today to figure out a quick (implemented in less than 6 weeks from today, 10th December!), short-term (less than 12 months before being replaced) solution to provide between 20 and 50 additional remote desktops for our existing VMware Horizon View environment. I have some constraints and options to consider.
I have been running PernixData FVP software in my production datacenter for a little over a month now and I wanted to share some of the results we’ve seen from deploying this amazing software.
I recently deployed a Linux HA cluster for NFS in active/passive mode using DRBD, running on VMware and using vCenter for fencing.
Whilst I was testing this deployment I noted that during our nightly backup process Veeam would (as is normal) request a snapshot of the cluster VMs and then release the snapshot after the backup was complete.
The VMs appeared to stop responding to cluster heartbeats briefly whilst the snapshots were taken and released; this is something that Veeam customers are occasionally familiar with and usually it’s not an issue. However, because of this brief dropout in heartbeats, the passive node would then begin to promote itself to active status. To avoid a split-brain scenario, and because I had configured the cluster to use vCenter for fencing, the previously active node would be reset as soon as this promotion began.
Generally speaking, Linux VMs prefer to be shut down cleanly and safely; regular hard resets soon caused problems in my cluster, and on several days of testing I’d come into the office to find that the cluster had just given up and both nodes were in a passive state, unable to decide which of them should be active and which should be passive.
The solution I’ve implemented is to use a Veeam feature called “Enable VMware Tools quiescence”. This allows me to use VMware Tools to run a script prior to snapshot creation which correctly and cleanly forces the VM being backed up into passive mode (if it was active to begin with), allowing the other node to take over the cluster without heartbeat issues.
Having ticked the appropriate box in the Veeam backup job, I then needed to deploy the necessary scripts on the VMs themselves. There are two scripts I’ve hacked up that work together to make this work.
In my cluster I have two hosts, LinuxNFS-1 and LinuxNFS-2, and I’ve hardcoded the names of the hosts into my scripts. I’m pretty sure there’s a more elegant way of pulling the other hostname out of the cluster cdm tool, but this works well enough for my purposes.
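To make the idea concrete, here is a minimal sketch of the pre-freeze half of that approach, written in Python rather than taken from the original scripts. It assumes a Pacemaker cluster managed with the crm shell (`crm node standby` and `crm node online`); the post does not say which cluster stack is in use, so treat those commands as placeholders. The function names and the `--apply` flag are invented for illustration. On Linux guests VMware Tools conventionally looks for `/usr/sbin/pre-freeze-script` and `/usr/sbin/post-thaw-script`, so a wrapper at those paths could call something like this.

```python
#!/usr/bin/env python3
"""Illustrative pre-freeze helper: move this node to standby before a snapshot."""
import socket
import subprocess
import sys

# Hostnames are hardcoded, as described in the post.
NODES = ("LinuxNFS-1", "LinuxNFS-2")


def canonical(name):
    """Map a hostname (any case, optional domain suffix) to its entry in NODES."""
    short = name.split(".")[0].lower()
    for node in NODES:
        if node.lower() == short:
            return node
    raise ValueError(f"{name!r} is not a known cluster node")


def peer_of(name):
    """Return the other node in the two-node cluster."""
    me = canonical(name)
    return NODES[1] if me == NODES[0] else NODES[0]


def standby_command(name):
    """Pre-freeze step. ASSUMPTION: Pacemaker managed with the crm shell."""
    return ["crm", "node", "standby", canonical(name)]


def online_command(name):
    """Post-thaw step. ASSUMPTION: same cluster stack as above."""
    return ["crm", "node", "online", canonical(name)]


def main():
    me = socket.gethostname()
    print(f"Putting {canonical(me)} into standby; {peer_of(me)} should take over")
    subprocess.run(standby_command(me), check=True)


# Only touch the cluster when explicitly asked to, so a dry run is harmless.
if __name__ == "__main__" and "--apply" in sys.argv:
    main()
```

The matching post-thaw script would run `online_command()` for the same node once the snapshot has been released, so that the node rejoins the cluster as the passive member.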
So I sat and passed VMware’s Advanced Professional exam for Data Centre Design today, and whilst I’m super happy that I passed, I only just scraped a passing score and I had some serious concerns whilst sitting the exam that I had some holes in my knowledge base.
Having had a few hours to think and reconsider, I’ve concluded that actually I’ve got one large, gaping hole in my knowledge base: Fibre Channel. It’s the one VMware storage technology that I’ve only got a theoretical knowledge of. I don’t have any practical hands-on experience with FC, and it shows.
I found that every time an FC question came up on the DCD exam (and there were quite a few) I dropped out of the flow of the exam; I had to think harder about what the question wanted and what the answer was – I found these questions much more difficult than I should have.
So – what do I do about this? I need a practical way to gain some hands-on experience with FC equipment. I’m thinking maybe I need to add some second hand FC HBAs, switches and storage to my test environment. Of course it can’t be too expensive but it also can’t be too old and no longer representative of the current state of typical FC installations.
I’m going to look for advice from folks I know who have lots of FC experience (look out, Twitter…) and then I guess I’ll be spending a good bit of time on eBay…
Those of us who have spent any time in a technology role will be familiar with the quantity of spam generated by recruitment agencies, either trying to sell us on a new role or to convince us of how many really great candidates they are currently representing.
I know plenty of IT folk whose voice mail and e-mail filters are constantly under pressure from the efforts of recruiters but today I experienced an all new form of spam, one that actually got my attention, even if only briefly…
Yep that’s right – I got an actual hard copy postcard delivered to me at work, individually hand written and stamped so that it wouldn’t look like commercial mail.
The message on the back of the postcard read “Hi, I’ve joined [name of recruiter] and it’s great! I’m still available but might not be for long as there’s a financial services technology company interested in my 7 years experience in IT project management. Get in touch with me through [name of recruiter] I’m candidate #99999 if you want to see my CV.”
I’m impressed that this recruitment agency went out of its way to get my attention. I’m also impressed that they were willing to spend real money on doing so (sending out e-mail spam is very low cost; this particular postcard had an Irish 60 cent stamp on it). I’m also impressed by the timing; summer is typically when postcards show up in offices from staff who want everyone to know what a great time they are having by the beach and how much they don’t miss everyone back at HQ.
However much I was impressed by this I do hope it doesn’t catch on – or if it does maybe Google will launch a new Bayesian filter service for postal networks, I think maybe Postini might be a good name for it!
Did you ever wish you could just forget the last hour or so and start again?
I thought I’d found something useful today. Unfortunately it turns out that even though the %DRPRX counter was running comfortably at zero, I had also managed to completely break all connectivity to my nested ESXi hosts, so it’s back to the drawing board for this problem. I’ll leave the post here so that anyone who’s interested can see what doesn’t work!
I was getting some unexpected connectivity problems with some nested ESXi hosts I was running in my test lab. I had a look at esxtop on the real ESXi host and noticed in the network screen that I was getting unexpected values in %DRPRX. A quick dig online found this VMware KB: http://kb.vmware.com/kb/1010071
This article suggests that this problem is caused by not having sufficient receive buffers in the guest, and it provides instructions for increasing the number of receive buffers for Windows and Linux guests – unfortunately that’s not going to work with an ESXi guest.
The same article, however, also points out that the default number of receive buffers is much greater when using vmxnet3 compared to e1000, which got me thinking… Can you use a vmxnet3 NIC with ESXi? Well, it didn’t take long to find this post from William Lam: http://www.virtuallyghetto.com/2012/09/nested-esxi-51-supports-vmxnet3-network.html. William discovered that ESXi 5.1 does allow the use of vmxnet3 NICs; it’s unsupported but it works.
Anyway, a quick modification of my virtual machines and I’ve now got my nested ESXi hosts working with vmxnet3 NICs, and suddenly %DRPRX is a nice healthy 0.00.
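If you would rather not watch the esxtop network screen by hand, the same check can be scripted against esxtop’s batch output (`esxtop -b -n 1 > sample.csv`). The sketch below is an illustration only: it assumes the batch CSV names the %DRPRX counter with a column ending in “% Received Packets Dropped”, which may differ between ESXi versions, and `dropped_rx` is a made-up helper name.

```python
import csv
import io

# ASSUMPTION: esxtop batch mode labels %DRPRX columns with this suffix.
DROP_RX_SUFFIX = "% Received Packets Dropped"


def dropped_rx(csv_text, threshold=0.0):
    """Return {column name: worst value} for receive-drop columns above threshold."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    # Indexes of every column that looks like a receive-drop counter.
    cols = [i for i, name in enumerate(header) if name.endswith(DROP_RX_SUFFIX)]
    worst = {}
    for row in reader:
        for i in cols:
            try:
                value = float(row[i])
            except (ValueError, IndexError):
                continue  # skip blank or malformed samples
            if value > threshold:
                worst[header[i]] = max(value, worst.get(header[i], 0.0))
    return worst


if __name__ == "__main__":
    sample = (
        "ts,portA % Received Packets Dropped,portB % Received Packets Dropped\n"
        "t1,0.00,1.50\n"
        "t2,0.00,0.75\n"
    )
    for port, pct in dropped_rx(sample).items():
        print(f"{port}: {pct:.2f}")
```

Bear in mind what the note at the top of this post says: a port that has lost connectivity altogether also reports zero drops, so an empty result here is not proof that the network is healthy.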