Enterprise Cloud and All-You-Can-Eat Buffets

What does Enterprise Cloud have in common with an all-you-can-eat buffet?

Delegation and Self-Service

Beyond automating tasks to run on a regular schedule and avoid manual handling, many tasks are being automated so that they can be delegated to other people.

Consider the case where you have 10 VDI desktops deployed. As common tasks come up, such as restoring files from snapshots, diagnosing performance issues or provisioning new desktops, it’s easy to jump in and take care of matters by hand. Take that number to 1000 and you’re likely going to start to see issues maintaining those by hand as you scale. Get to 10,000 or more and it’s an entirely different class of problem.

This doesn’t just apply to VDI — DevOps deployments and Enterprise server farms are seeing the same kinds of challenges as they scale too.

In order to scale past a few systems, you need to start delegating some tasks to someone else, whether that be a helpdesk team of some kind, a developer or application owner, or even the end user of a VDI desktop.

However, delegation and self-service are not just a case of dumping a bunch of tech in front of folks and wishing them luck. In most cases, these folks won’t have the technical domain knowledge required to safely manage their portion of infrastructure. We need to identify the tasks that they need to be able to perform and package those up safely and succinctly.

Buffet!

Consider a restaurant with an all-you-can-eat buffet. One of the nice ones — we’re professionals here. Those buffets don’t have a pile of raw ingredients, knives and hotplates, yet they’re most definitely still self-service.

You’re given a selection of dishes to choose from. They’ve all been properly prepared and safely presented, so that you don’t need to worry about the food preparation yourself. There is the possibility of making some bad decisions (roast beef and custard), but you can’t really go far enough to actually do yourself any great harm.

They do this to scale: more patrons with lower overhead costs, such as staff.

DIY Self-Service

As we deploy some kind of delegation or self-service infrastructure, we need to:

  1. Come up with a menu of tasks that we wish to allow others to perform,
  2. Work out the safety constraints around putting them in the hands of others, and
  3. Probably still have staff to pour the bottomless mimosas, rather than simply installing a tap.

We introduced the first two of these in previous series of articles. In particular, #1 is a case of listing and defining one or more business problems, as we saw in the automation series. For example, users who accidentally delete or lose an important file might need a way to retrieve files from a snapshot from a few days ago. #2 refers to taking and validating very limited user input. In the restore example above, we’d probably only allow the user to specify the day that contains the snapshot they’re looking for and maybe the name of their VM.
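To make the "very limited user input" idea concrete, here is a minimal sketch in Python of validating the two values from the restore example. The seven-day window, the VM naming pattern and the function name are all assumptions made up for illustration; your own limits will differ.

```python
import re
from datetime import date, timedelta

# Hypothetical limits for illustration; neither value comes from a real product.
MAX_SNAPSHOT_AGE_DAYS = 7
VM_NAME_PATTERN = re.compile(r"[A-Za-z0-9][A-Za-z0-9-]{0,62}")


def validate_restore_request(vm_name, snapshot_day, today):
    """Return the cleaned (vm_name, day) pair, or raise ValueError."""
    vm_name = vm_name.strip()
    if not VM_NAME_PATTERN.fullmatch(vm_name):
        raise ValueError("VM name contains unexpected characters")

    # date.fromisoformat raises ValueError for anything that is not a date.
    day = date.fromisoformat(snapshot_day.strip())
    if day > today:
        raise ValueError("snapshot day is in the future")
    if today - day > timedelta(days=MAX_SNAPSHOT_AGE_DAYS):
        raise ValueError("snapshot day is older than the retention window")
    return vm_name, day
```

Anything that fails validation is rejected before any automation touches a VM, which is the buffet equivalent of keeping the knives in the kitchen.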

Public Cloud

Self-service and autonomy are among the things that Public Cloud has brought to the table at a generic level. By understanding the specifics of your own Enterprise, you can not only meet, but exceed that Public Cloud agility within your own data centre. This can also be extended to seamlessly include Public Cloud for the hybrid case.

Next Steps

As with each of these series, we’re starting here with a high-level overview and will follow that up with an illustrative example over the coming articles. We’ll build on what we’ve learned in those previous series and we’ll again use the common System Center suite to get some hands-on experience. As always, the concepts and workflow apply quite well to tools other than System Center too.

To summarise, delegation and self-service are essential for most organisations as they scale. When used to safely give other groups autonomy, they can save you and your team significant time and effort.

[Buffet picture by Kenming Wang and used unmodified under SA2.0]

A Very Particular Set Of Skills

Has anybody not heard of the recent ransomware attack known as WannaCry? No? Good. Hopefully you’re only aware of it through news articles, but for far too many folks, this is not the case.

We all keep our patches up to date and we all use various levels of protection to limit the attack surface and potential spread of these kinds of attacks.

Unfortunately, and for various reasons, these kinds of attacks can still wreak havoc.

When this does happen, it doesn’t have to ruin your year.

For an individual virtual machine affected by this, simply:

  1. Revert your affected VM back to a previous snapshot using SyncVM
  2. Start the VM disconnected from the network
  3. Apply any updates to close the exploited security hole
  4. Reconnect to the network
  5. Don’t pay the ransom

In cases where there are a very large number of affected VMs, a lot of this process can be automated.
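As a rough sketch of what that automation might look like, the Python below runs an ordered list of recovery steps against each VM. The step functions are hypothetical placeholders for whatever tooling performs the real work (reverting with SyncVM, starting the VM isolated, patching, reconnecting); nothing here is a real Tintri or hypervisor API.

```python
def recover_vms(vm_names, steps):
    """Run each recovery step, in order, for every VM.

    `steps` is a list of (label, function) pairs. Each function takes a VM
    name and raises an exception on failure. The functions are placeholders.
    """
    results = {}
    for vm in vm_names:
        try:
            for label, step in steps:
                step(vm)
        except Exception as exc:
            # Stop working on this VM so that it is never reconnected to the
            # network unpatched, then carry on with the next VM.
            results[vm] = f"failed at '{label}': {exc}"
        else:
            results[vm] = "recovered"
    return results
```

The important design choice is that a failure stops the remaining steps for that VM only, so one stubborn machine doesn't hold up the rest and never reaches the "reconnect" step in a bad state.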

To misappropriate and misquote the famous speech from the movie Taken, our customers have a very particular set of skills. Skills we’ve been assisting them with over a long career.

[Ransom Note image by Sheila Sund and used unmodified under CC BY 2.0]

Orchestration For Enterprise Cloud

In our last series, we looked at taking a business problem or need and turning it into a piece of automated code. We started out by breaking the problem down into smaller pieces. We then put together a piece of sortamation to demonstrate the overall solution, made it more modular and added some error handling.

In this series, we’re going to extend upon this and integrate our automation into an orchestration framework. This approach will apply to any orchestration framework, but we’ll use System Center Orchestrator 2016 in our examples.

Why Orchestration?

Primarily for delegation of tasks. We may have written a script that we can run to perform some mundane task, but for us to be able to successfully scale, we need to start putting some of these tasks into the hands of others.

As a dependency for delegation, we also want to use automation and orchestration as a way to guarantee us consistent results and general damage prevention. Sure, we could just allow everyone access to SCVMM or vCenter to manage their virtual machines, but that’s a recipe for disaster. Orchestration gives us a way to safely grant controlled access to limited sets of functionality to make other groups or teams more self-sufficient.

The Process

Much like in our earlier automation series, we want to start by defining a business problem and breaking it down to smaller tasks from there. We also want to extend this to include the safe delegation of the task. That means thinking carefully about the input we’ll accept from the user, the information we’ll give back to the user, and the diagnostic information we’ll need to collect so that if something goes wrong, we can take a look later.

The fictitious, but relevant, business problem that we’re going to solve in this series is a common DevOps problem:

Developers want to be able to test their code against real, live production data to ensure realistic test results.

The old approach would be to have someone dump a copy of the production application database and restore the dump to each of the developers’ own database instances. This is expensive in time and capacity and can adversely impact performance. It’s also error-prone.

Instead, we’ll look at making use of Tintri’s SyncVM technology to use space-efficient clones to be able to nearly-instantly make production data available to all developers. We’ll do this with some PowerShell and a runbook in System Center Orchestrator.

We can then either schedule the runbook to be executed nightly, or we can make the runbook available to the helpdesk folks, who can safely execute the runbook from the Orchestrator Web Console. [Later, we’ll look at another series that shows us how to make this available to the developers themselves — probably through a Service Manager self-service portal — but let’s not get too far ahead of ourselves]

Core Functionality

Our production VM and developer VMs all have three virtual disks:

  1. A system drive that contains the operating system, any tools or applications that the developer needs, and the application code either in production or in development.
  2. A disk to contain database transaction logs.
  3. A disk to contain database data files.

In our workflow, we’ll want to use SyncVM to take vDisks #2 and #3 from a snapshot of the production VM, and attach them to the corresponding disk slots on the developer’s VM. We want this process to not touch vDisk #1, which contains the developer’s code and development tools.

Inputs

For us to use SyncVM (as we saw previously), we need to pass in some information to the Tintri Automation Toolkit about what to sync. Looking at previous similar code, we probably need to know the following:

  • The Tintri VMstore to connect to.
  • The production VM and the snapshot of it that we wish to sync.
  • The set of disks to sync.
  • The destination developer VM.

In order to limit potential accidents, we probably want to limit how much of this comes from the person we’re delegating this to. In our example, we’ll assume a single VMstore and a single known production VM. We’ve also established earlier that there is a consistent and common pattern for the sets of disks to sync (always vDisk #2 and #3). The only parameter where there should be any user input is the destination VM. The rest can all be handled consistently and reliably within our automation.
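One way to picture this is a small function that accepts only the destination VM and fills in everything else itself. The sketch below is in Python rather than the PowerShell we’ll use in the series, and every name in it (the VMstore host, the production VM, the allowed developer VMs) is a hypothetical placeholder.

```python
from dataclasses import dataclass

# All values below are hypothetical placeholders, not taken from a real site.
VMSTORE = "vmstore01.example.com"
PRODUCTION_VM = "prod-db01"
DISKS_TO_SYNC = (2, 3)  # vDisk #1 (OS, tools, code) is never touched
ALLOWED_DEVELOPER_VMS = frozenset({"dev-alice01", "dev-bob01"})


@dataclass(frozen=True)
class SyncRequest:
    vmstore: str
    source_vm: str
    disks: tuple
    destination_vm: str


def build_sync_request(destination_vm):
    """Turn the single piece of user input into a complete sync request."""
    name = destination_vm.strip().lower()
    if name == PRODUCTION_VM.lower():
        raise ValueError("refusing to sync onto the production VM")
    if name not in ALLOWED_DEVELOPER_VMS:
        raise ValueError(f"{destination_vm!r} is not a known developer VM")
    return SyncRequest(VMSTORE, PRODUCTION_VM, DISKS_TO_SYNC, name)
```

Because the VMstore, source VM and disk set are constants inside the automation, the person running it has no way to point the sync at the wrong place.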

Output

After this has been executed, we’ll need to let the user know if it succeeded or failed, along with any helpful information they’ll need. We’ll also want to track a lot of detailed information about our automation so that if an issue arises, we have some hope of resolving it.
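A minimal sketch of that split between what the user sees and what we record might look like the following. It is plain Python with the standard logging module; the function name and the idea of a short reference ID are assumptions for illustration, not something Orchestrator provides in this form.

```python
import logging
import uuid

log = logging.getLogger("runbook")


def run_delegated_task(task, *args):
    """Run task(*args); return (succeeded, user_message) and log the details."""
    run_id = uuid.uuid4().hex[:8]  # short reference the user can quote to us
    task_name = getattr(task, "__name__", repr(task))
    log.info("run %s: starting %s with args %r", run_id, task_name, args)
    try:
        result = task(*args)
    except Exception:
        # The full traceback goes to the log, not to the user.
        log.exception("run %s: failed", run_id)
        return False, f"The request failed. Please quote reference {run_id}."
    log.info("run %s: succeeded with result %r", run_id, result)
    return True, f"The request completed successfully (reference {run_id})."
```

The user gets a clear yes or no plus a reference, while the detail we need for troubleshooting stays in the log.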

Summary

So far, we have again defined a business need or business problem (synchronisation of production data to developer VMs), defined the set of inputs we’ll need and where those will come from, and we’ve defined the outputs.

In the next installment, we’ll start to get our hands dirty with System Center Orchestrator, followed by PowerShell and the Tintri Automation Toolkit for PowerShell.

[Orchestra image by Sean MacEntee and used unmodified under CC2.0]

Hyper-V I/O Multi-tenancy for Private Cloud part II

Overview

Last week, we looked at the use of user-based access controls to keep storage tenants on separate SMB shares for Hyper-V hosted private clouds. This week, we’ll expand on that through separation using VLANs.

VLANs themselves are pretty straightforward as a way to partition a physical network into multiple virtual, isolated networks. Where this can get a little murky is when we put Kerberos and SMB into the mix. That’s the area that we’ll aim to clear up here this week.

[Freecell image by mahemoff under CC by 2.0]

Networks, Subnetworks and Names

We can create separation by creating a distinct VLAN for each tenant’s compute. Alice might be on VLAN 11 and Bob might be on VLAN 12. The Tintri VMstore serving I/O for these tenants would have its data interface on each tenant’s VLAN.

Each of these VLANs would be assigned a different subnet and the VMstore would have an IP address on each VLAN. Each tenant can communicate with the VMstore’s data interface, but tenants can’t communicate with each other. Easy.

So what’s the problem?

As we saw in Kerberos Fundamentals a few weeks ago, we have a DNS name that maps to a data IP address on the VMstore and has an associated Service Principal Name (SPN) in Active Directory. It might be tempting to have that DNS record resolve to all of the data IPs assigned to the VMstore (for each VLAN), but that won’t work in a lot of cases. Most DNS servers by default will return those IP addresses in a round-robin rotation. Using our Alice and Bob example, Bob’s compute trying to resolve the DNS name of the Tintri VMstore would have a high likelihood of getting an IP address on Alice’s VLAN/subnet, which is not reachable by Bob.
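To see why a single DNS name for every data IP falls over, here is a small Python sketch that checks which answers Bob’s compute could actually reach. The addresses and /24 subnets match the example table later in this article; the function name is made up for illustration.

```python
import ipaddress

# One VMstore data IP per tenant VLAN, as in the Alice and Bob example.
VMSTORE_DATA_IPS = ["172.16.11.1", "172.16.12.1"]
BOB_SUBNET = ipaddress.ip_network("172.16.12.0/24")


def reachable(answers, client_subnet):
    """Return the DNS answers that fall inside the client's own subnet."""
    return [ip for ip in answers if ipaddress.ip_address(ip) in client_subnet]


usable = reachable(VMSTORE_DATA_IPS, BOB_SUBNET)
print(usable)  # ['172.16.12.1']
print(f"{len(usable)} of {len(VMSTORE_DATA_IPS)} answers are usable by Bob")
```

With only two tenants, half of the answers are already useless to Bob, and the odds get worse with every tenant VLAN added.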

Split-DNS or a DNS server that responds differently to queries based on source IP address (such as views in ISC BIND) are valid ways to get past this, but can be more unwieldy to manage and troubleshoot.

Extra SPNs

By creating a DNS record for each tenant data IP address and an associated SPN, we have a solution to the problem where all components (DNS records and SPNs) are kept together. The following table shows an example:

Data path DNS name        IP address and netmask    VMstore SPN
alice-data.vmlevel.com    172.16.11.1/24            cifs/alice-data.vmlevel.com
bob-data.vmlevel.com      172.16.12.1/24            cifs/bob-data.vmlevel.com

Done.
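If you have more than a couple of tenants, it can help to generate these records from a naming convention rather than typing them out. The Python sketch below reproduces the rows in the table above. It assumes the 172.16.<VLAN>.1/24 addressing pattern that the example happens to follow; that pattern and the function name are illustrative only.

```python
import ipaddress

DOMAIN = "vmlevel.com"


def tenant_record(tenant, vlan_id):
    """Build the data path DNS name, data IP and SPN for one tenant.

    Assumes the 172.16.<vlan>.1/24 pattern seen in the example table.
    """
    if not 1 <= vlan_id <= 254:
        raise ValueError("this addressing sketch only covers VLAN IDs 1-254")
    dns_name = f"{tenant.lower()}-data.{DOMAIN}"
    interface = ipaddress.ip_interface(f"172.16.{vlan_id}.1/24")
    return {"dns_name": dns_name, "ip": str(interface), "spn": f"cifs/{dns_name}"}


for tenant, vlan in [("alice", 11), ("bob", 12)]:
    print(tenant_record(tenant, vlan))
```

The DNS records and SPNs themselves still need to be created with your usual DNS and Active Directory tooling; this only keeps the names consistent.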

You’ll notice that when deploying the VMstore, we assign the SMB data path hostname in the UI so that the VMstore knows the name used by default for its SPN. It is not necessary to tell the Tintri VMstore about the subsequent SPNs or data path names for Alice and Bob above.

So as you can see, it isn’t necessarily obvious, but it isn’t very difficult to use SMB and Kerberos in a multi-tenant environment where VLAN and subnet separation has been employed.

Hyper-V I/O Multi-tenancy for Private Cloud part I

Overview

This is the first of a two-part series of articles that will talk a little about multi-tenancy for Hyper-V in a Hosted Private Cloud deployment. Not all Private Clouds need this level of separation, but for the cases where it’s a requirement, this series aims to help.

In the context of Hyper-V, I/O Multi-tenancy is where we serve storage for VMs over SMB 3 to two or more clients (tenants) where they are not aware that the storage is shared with others. Essentially there needs to be some degree of security and privacy between tenants — especially when it comes to traversing the data network.

This first article will deal with SMB shares for user-based access control as separation, and we’ll extend upon that in the second article, which will discuss VLANs and how they impact Kerberos authentication.

In the case of VM-aware storage, once inside the storage array, each tenant’s I/Os are kept separate from each other through either implicit or explicit VM-level QoS. That might be a topic for another day.

[Public Domain image taken by Bart Lumber]

An Example

To keep this topical, let’s say that we’re a hosting service for US Presidential Candidates and their campaign folks. Using standard fictional security characters, we’ll call these two candidates Alice and Bob.

Each has access to a Hyper-V compute platform and manages their own VMs, which are stored on their Tintri VMstore. Both groups need the ability to manage and maintain their own virtual machines and applications, but neither group should be able to see the other’s, which might contain sensitive emails or tax returns. Bob should see Bob’s stuff and Alice should see Alice’s. Neither should see anything else.

Separate Shares

By creating an SMB share (or more) each, we can grant access to only the parties that need it. Probably the easiest way to do this is with the (free) Tintri PowerShell Toolkit.

Consider the following illustrative example of creating a share each that is accessible by the right compute nodes for both Alice and Bob. The VMstore in this case being vmstore01.vmlevel.com.

PS:\> Connect-TintriServer vmstore01.vmlevel.com
PS:\> New-TintriSmbShare -Name AliceVMs
PS:\> Grant-TintriSmbShareAccess -Name AliceVMs -User alice-hv01$ -Access FullControl
PS:\> Grant-TintriSmbShareAccess -Name AliceVMs -User alice-hv02$ -Access FullControl

PS:\> New-TintriSmbShare -Name BobVMs
PS:\> Grant-TintriSmbShareAccess -Name BobVMs -User bob-hv01$ -Access FullControl
PS:\> Grant-TintriSmbShareAccess -Name BobVMs -User bob-hv02$ -Access FullControl

Above, we create a connection to the VMstore, we create a new SMB share for each tenant (Alice and Bob) and grant each tenant’s compute nodes access to the correct share.

VM Administrators

In single-tenant deployments, in addition to granting hosts/clusters of compute nodes access to the storage, it’s necessary to grant your VM Administrators access to the VMstore to manage storage and create VMs. This is most simply done by assigning their Active Directory group(s) the Tintri Super Admins RBAC role, which allows them the administrative access that they need.

It might seem like a straightforward case of simply assigning both Bob and Alice’s VM admins the same role. However, that would grant both parties access to management functionality over the VMstore, and would grant each party access to the other’s VM data, which is the very thing we’re trying to avoid.

Instead, we just grant both parties explicit access to their own share using the same PowerShell cmdlets above, but naming Active Directory groups containing Alice and Bob’s VM administrators:

PS:\> Grant-TintriSmbShareAccess -Name BobVMs -User "Bob Admins" -Access FullControl
PS:\> Grant-TintriSmbShareAccess -Name AliceVMs -User "Alice Admins" -Access FullControl

And there you have it. Bob’s team can now create VMs against the share \\vmstore01-data.vmlevel.com\BobVMs and Alice’s team can do the same from their compute nodes against \\vmstore01-data.vmlevel.com\AliceVMs. Neither can see the other’s share, and our own admins can manage both should the need arise.

Summary

It’s possible with SMB 3 shares and access control lists to implement the foundation for a multi-tenant environment. In the next article, we’ll expand on this to include network-level multi-tenancy without breaking Kerberos.

Kerberos and External Trusts

Let’s say that for whatever reason, you have your Hyper-V compute in one Active Directory domain, say VMLEVEL.COM, and your Tintri VM-Aware Storage appliances in a different Active Directory domain, say TINTRI.COM. We know that users and computers in one domain can only be authenticated to access resources in another domain if there’s a trust relationship between them.

In the case of Active Directory forest trusts, the magic that is the global catalogue allows the domain controllers in one domain to resolve Service Principal Names (SPNs) for other domains within that forest.

However, external trusts by default don’t share the same types of information between domains that you find with forest trusts. External trusts are also at times preferred over forest trusts. One example is multitenancy for hosted private clouds.

Let’s see what happens here, so that we can see what the impact is and what we can do about it.

A Kerberos client wanting access to a particular service will construct a ticket request containing a Service Principal Name identifying the service. It does so by taking a service name (cifs for SMB 3.0), appending a slash and the fully-qualified DNS name of the host (vmstore01-data.vmlevel.com for example) and that’s kind-of it: cifs/vmstore01-data.vmlevel.com.
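As a tiny illustration of that construction, the Python sketch below derives the SPN from the host portion of a UNC path. The function name is made up, and a real Kerberos client does rather more than this; it’s only meant to show where the two halves of the SPN come from.

```python
def spn_for_unc_path(unc_path, service="cifs"):
    """Derive the SPN a client would request for the host in a UNC path."""
    if not unc_path.startswith("\\\\"):
        raise ValueError("not a UNC path")
    # Everything between the leading double backslash and the next
    # backslash is the host name the client was asked to connect to.
    host = unc_path[2:].split("\\", 1)[0]
    if not host:
        raise ValueError("UNC path has no host name")
    return f"{service}/{host}"


print(spn_for_unc_path(r"\\vmstore01-data.vmlevel.com\BobVMs"))
# cifs/vmstore01-data.vmlevel.com
```

Note that the SPN follows the name the client used, which is why the DNS name and the SPN registered in Active Directory have to agree.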

In the forest trust case, the domain controllers are able to search the whole forest for an account with that SPN. In the case of external trusts, this doesn’t happen by default and the client will fail to get a ticket.

It is possible, however, to configure a set of domains to search by both Kerberos clients and Kerberos KDCs. To do so, start gpedit.msc (the Group Policy Editor), then navigate down through:

  • Local Computer Policy
    • Administrative Templates
      • System

And then under both Kerberos and KDC, locate the Use forest search order parameter. By setting it to enabled and entering the externally trusted domain, you should find that Kerberos tickets can be requested across the external trust from this domain to the remote domain. In many cases, you may also want to set the same in the opposite direction. It is important to have this setting propagated to all domain controllers in the domain for consistent results.

So whilst Kerberos won’t work across external trusts by default, it is possible to use the forest search order tunable to enable it for select domains.

Per-User Delegation

In a previous article on Constrained Delegation, we looked at the need to allow Hyper-V hosts to perform impersonation/delegation on behalf of your virtualisation administrators or other Hyper-V hosts for certain SMB 3.0 storage operations.

Per-user Active Directory controls exist to prevent certain accounts from being impersonated even in cases where constrained delegation may be allowed.

The Account is sensitive and cannot be delegated attribute for each user account in Active Directory can prevent it from being used for delegation. Consider the following:

  1. Hyper-V host hyperv01.vmlevel.com is configured for constrained delegation for the SMB service on a Tintri VMstore, vmstore01.vmlevel.com. The Service Principal Name configured for delegation is cifs/vmstore01-data.vmlevel.com.
  2. Alice does not have the Account is sensitive and cannot be delegated checkbox checked on her account in Active Directory.
  3. Bob does have the Account is sensitive and cannot be delegated checkbox checked against his account in Active Directory.

In this case, if Bob attempts to use Hyper-V Manager on his desktop to create a VM running on the Hyper-V host above, but stored on the Tintri VMstore, he’ll see an error message from the Hyper-V host telling him that access has been denied.

If Alice tries the same thing, she’ll be able to create and administer VMs.
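The outcome for Alice and Bob can be modelled as a simple two-part check. This Python sketch is a toy model of the decision, not how Active Directory implements it, and the class and function names are invented for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Host:
    name: str
    # SPNs this host is allowed to delegate to (constrained delegation).
    delegation_spns: set = field(default_factory=set)


@dataclass
class User:
    name: str
    sensitive: bool = False  # "Account is sensitive and cannot be delegated"


def can_delegate(host, user, target_spn):
    """True if the host may act on the user's behalf against target_spn."""
    if user.sensitive:
        return False  # the per-user flag wins, whatever the host allows
    return target_spn in host.delegation_spns


hyperv01 = Host("hyperv01.vmlevel.com", {"cifs/vmstore01-data.vmlevel.com"})
alice = User("alice")
bob = User("bob", sensitive=True)

print(can_delegate(hyperv01, alice, "cifs/vmstore01-data.vmlevel.com"))  # True
print(can_delegate(hyperv01, bob, "cifs/vmstore01-data.vmlevel.com"))    # False
```

Both conditions have to hold: the host must be trusted to delegate to that SPN, and the user’s account must not be flagged as sensitive.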

Tintri doesn’t have any explicit best practices around which accounts to set or clear this flag for — it’s up to the business. However, many operations require the use of delegation. If you find that some users are able to perform VM administration tasks, but others cannot, it’s worth checking the state of this user account flag in Active Directory.