About a month ago, while working on the first iteration of the Fabric capacity monitoring report, I stumbled across a bit of an invisible fence. I posted a quick teaser with a screenshot on LinkedIn to see if anyone else could spot it:
It's very subtle, so you have to look closely. In case you haven't spotted it yet: the location (region) of your artifacts is directly tied to the region in which your capacity exists.
As I said, it's quite subtle, but this is actually a very big deal for several reasons.
Region Lock
When you create a Fabric workspace, you must assign it to a capacity to use the Fabric features. That capacity has to exist first, which means creating it in the Azure portal, and when you create it you pick the region it will live in.
Let's assume you currently have a Trial Fabric capacity that exists in East US 2. We'll create a new workspace and assign it to the trial capacity. We'll then create a lakehouse in the new workspace.
After creating the workspace and lakehouse, you decide to reassign the workspace to a different capacity in another region. No big deal, right?
Unfortunately, by adding a lakehouse to the workspace we have region-locked ourselves. To move the workspace to a capacity in another region, we would first need to remove the lakehouse, and with it the storage behind it.
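To make the rule concrete, here is a minimal sketch of the region-lock check as described above. It is a plain Python model, not a call into any Fabric or Azure API: the `Capacity` and `Workspace` classes, the `can_reassign` function, and the example names are all hypothetical, and the rule it encodes is only the behavior observed in this article.

```python
from dataclasses import dataclass, field


@dataclass
class Capacity:
    name: str
    region: str


@dataclass
class Workspace:
    name: str
    capacity: Capacity
    # Names of Fabric items (e.g. lakehouses) that live in the workspace.
    fabric_items: list[str] = field(default_factory=list)


def can_reassign(workspace: Workspace, target: Capacity) -> tuple[bool, str]:
    """Model the region lock: a workspace holding Fabric items
    cannot be moved to a capacity in a different region."""
    if workspace.capacity.region == target.region:
        return True, "target capacity is in the same region"
    if not workspace.fabric_items:
        return True, "workspace has no Fabric items"
    items = ", ".join(workspace.fabric_items)
    return False, f"region-locked by: {items}"


# Hypothetical example mirroring the scenario above.
trial = Capacity("Trial", "East US 2")
west = Capacity("Production", "West US")
ws = Workspace("Demo Workspace", trial, ["Demo Lakehouse"])

allowed, reason = can_reassign(ws, west)
print(allowed, reason)
```

Running a check like this against an inventory of your workspaces before a migration is one way to find out which ones are already fenced in.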
The limitation here makes sense when you consider that storage costs differ by region. Now, I know what everyone is going to say: "Who cares, just create a shortcut and be done with it." Well, welcome to the conversation.
The Hidden Cost of Shortcuts
The reason I included "hidden" in the title of this section is due to the lack of transparency in the documentation. When it comes to pricing, we generally know how much storage and compute will cost, but the documentation around network and data transfer costs is almost non-existent. There's a single line buried in the pricing whitepaper:
https://azure.microsoft.com/en-us/pricing/details/microsoft-fabric/
I've been hesitant to write this article because I haven't been able to get a definitive answer, until now.
Let's review the following scenario:
In the above diagram, we have a lakehouse in a workspace assigned to a capacity in East US 2. Another team has requested access to our data and created a shortcut from their workspace assigned to a capacity in West US per the recommended approach.
Even when leveraging shortcuts, this scenario still produces a read operation that crosses regions (East US 2 to West US) and therefore will (eventually) incur data transfer fees. I say "eventually" because we're not currently seeing these charges, but your team should be aware that they are coming.
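As a rough way to reason about exposure, the sketch below flags shortcuts whose source and consumer capacities sit in different regions and multiplies an expected read volume by a per-GB rate. Everything here is an assumption for illustration: the `Shortcut` class, both function names, and the example data are hypothetical, and the rate is a parameter you must supply yourself, since no rate is given in this article.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Shortcut:
    name: str
    source_region: str    # region of the capacity hosting the lakehouse
    consumer_region: str  # region of the capacity hosting the shortcut


def cross_region_shortcuts(shortcuts: list[Shortcut]) -> list[Shortcut]:
    """Return the shortcuts whose reads would cross a region boundary."""
    return [s for s in shortcuts if s.source_region != s.consumer_region]


def estimated_transfer_cost(gb_read: float, rate_per_gb: float) -> float:
    """Linear estimate: data read across regions times a per-GB rate.
    The rate is a placeholder supplied by the caller, not a published price."""
    if gb_read < 0 or rate_per_gb < 0:
        raise ValueError("gb_read and rate_per_gb must be non-negative")
    return gb_read * rate_per_gb


# Hypothetical inventory: one shortcut matches the scenario above.
inventory = [
    Shortcut("sales_shortcut", "East US 2", "West US"),
    Shortcut("finance_shortcut", "East US 2", "East US 2"),
]

for s in cross_region_shortcuts(inventory):
    print(f"{s.name}: {s.source_region} -> {s.consumer_region}")
```

A linear estimate is deliberately simplistic. Its purpose is to surface which shortcuts are worth a conversation, not to predict an invoice.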
Final Thoughts
When planning your Fabric implementation, capacity planning is going to play a big part, but not just from a cost and sizing perspective. Planning the location of the capacities is equally important if you want to avoid things like region lock and unexpected line items on your monthly bill.
If you haven't already, I encourage you to check out my last article on deployment as many of the broader architectural considerations will also be applicable here.
https://lucidbi.co/fabric-architecture-considerations
If you'd like to learn more about how Lucid can support your team, let's connect on LinkedIn and schedule an intro call.