Getting started with Azure Service Fabric

If you are planning to build the next Facebook or Twitter, you will probably need to adopt a microservices architecture that lets you scale out to thousands of machines (scalability) and stay always on, with zero downtime during application upgrades or hardware failures (availability and reliability). While a microservices architecture provides these critical features, it also raises operational and communication difficulties that need to be handled. The most common one is service discovery, in other words, how services communicate with each other when there may be thousands of them and each one may stop functioning or move to a different host machine at any time. Another major challenge is how to apply system upgrades to such a large number of hosted services. There are two popular solutions for building microservices-based applications: one uses containers and orchestrators such as Kubernetes or Docker Swarm to solve the management and operational issues, and the other is Azure Service Fabric, Microsoft’s distributed systems platform for packaging, deploying and managing scalable and reliable microservices and containers.

Azure Service Fabric Series

This post is the first in a series on Azure Service Fabric. The purpose of the series is to get you familiar with the platform and show you how to use it to build and manage enterprise, cloud-scale applications. You will learn the different types of services you can use (Stateless, Stateful, Actors), how to scale them, how to handle deployments or upgrades in a fully distributed system and much more. By the end of the series we will probably also build and learn how to manage a microservices-based application that can scale up to thousands of machines. But first things first: in this post we’ll keep it simple and learn just the basics. More specifically:

  • Install the Azure Service Fabric SDK: You will prepare your development environment by installing and configuring a local cluster
  • Create your first Stateless and Stateful services: You will scaffold a stateless and a stateful service. We’ll study their code and understand their differences and when to use one over the other
  • Deploy the services on the local cluster and study their behavior using the Diagnostic Events viewer in Visual Studio. You will use PowerShell to check and monitor your cluster’s status
  • Learn the basic configuration options such as a service’s Instance Count or Partition Count

Install Azure Service Fabric SDK

When using Azure Service Fabric (ASF) you deploy your application services to a Service Fabric cluster, which consists of one or more nodes. Ideally you want a similar environment on your development machine so that you can test and simulate your multi-node application’s behavior locally. Luckily, you can install the Azure Service Fabric SDK and run your services as if they were running in a production environment. Install the SDK by clicking one of the following links depending on your development environment.

Azure PowerShell

Go ahead and install PowerShell and Azure PowerShell. They can be used to monitor and manage the services deployed in a Service Fabric cluster.

After installing the SDK you should see the Azure Service Fabric icon at the bottom right of your screen.

Right click the icon and select the Manage Local Cluster menu item to open the Service Fabric Explorer. This is a panel where you can see all the applications and services hosted on your local cluster. Notice that the cluster can be configured as a single-node or a 5-node cluster (through the Switch Cluster Mode menu item on the tray). Alternatively, you can open Service Fabric Explorer by navigating to http://localhost:19080/Explorer. Before opening the explorer, make sure the cluster is running by selecting Start Local Cluster from the tray icon.

A Service Fabric cluster is a shared pool of network-connected physical or virtual machines (nodes) where microservices are deployed and managed. A cluster can scale up to thousands of machines. Each physical machine or VM in a cluster is usually thought of as a node, but what Service Fabric actually treats as a node is a pair of executables, Fabric.exe and FabricGateway.exe, which are started by a Windows service named FabricHost.exe.

When the Service Fabric local cluster is up and running you should be able to see these processes in the Windows Task Manager. The following screenshot shows them for a local cluster in 5-node mode.

Stateless services

A service in Service Fabric (SF) is an isolated unit responsible for delivering specific functionality. It should be possible to manage, scale and evolve it independently of the other services in the cluster. A Stateless service, as its name implies, is a service that doesn’t save any state or, to be more precise, doesn’t save any state locally. This is the main difference from a Stateful service, which does save some type of state locally. Open Visual Studio 2017 as Administrator and create a new project of type Service Fabric Application named CounterApplication. You will find the template under the Cloud templates.

Click next, select the .NET Core Stateless Service template and name the service CounterStatelessService.

When VS finishes scaffolding the SF project your solution should look like this:

Each SF application has a named type, which by default is <solution-name>Type. It is defined at the application level in the solution, specifically in the ApplicationManifest.xml file. Go ahead and open that file.

<ApplicationManifest ApplicationTypeName="CounterApplicationType"
                     ApplicationTypeVersion="1.0.0"
                     xmlns="http://schemas.microsoft.com/2011/01/fabric"
                     xmlns:xsd="http://www.w3.org/2001/XMLSchema"
                     xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">

This file also defines which service types the application consists of plus any parameters and configuration to be used when the application is provisioned on the cluster. In our example the ApplicationManifest.xml file defines that we need to import the CounterStatelessServicePkg package with version “1.0.0” and instantiate [CounterStatelessService_InstanceCount] number of CounterStatelessService service instances.

  <Parameters>
    <Parameter Name="CounterStatelessService_InstanceCount" DefaultValue="-1" />
  </Parameters>
  <!-- Import the ServiceManifest from the ServicePackage. The ServiceManifestName and ServiceManifestVersion 
       should match the Name and Version attributes of the ServiceManifest element defined in the 
       ServiceManifest.xml file. -->
  <ServiceManifestImport>
    <ServiceManifestRef ServiceManifestName="CounterStatelessServicePkg" ServiceManifestVersion="1.0.0" />
    <ConfigOverrides />
  </ServiceManifestImport>
  <DefaultServices>
    <!-- The section below creates instances of service types, when an instance of this 
         application type is created. You can also create one or more instances of service type using the 
         ServiceFabric PowerShell module.
         
         The attribute ServiceTypeName below must match the name defined in the imported ServiceManifest.xml file. -->
    <Service Name="CounterStatelessService" ServicePackageActivationMode="ExclusiveProcess">
      <StatelessService ServiceTypeName="CounterStatelessServiceType" InstanceCount="[CounterStatelessService_InstanceCount]">
        <SingletonPartition />
      </StatelessService>
    </Service>
  </DefaultServices>

Lots of things were defined in the ApplicationManifest.xml file, so we need to see where all these values come from and how they affect the final deployed application. Switch to the CounterStatelessService project and check the first lines of its ServiceManifest.xml file. Each service has a manifest file that defines several configuration properties, such as the service type(s) to be activated, the package, config and code names, and the entry points.

<ServiceManifest Name="CounterStatelessServicePkg"
                 Version="1.0.0"
                 xmlns="http://schemas.microsoft.com/2011/01/fabric"
                 xmlns:xsd="http://www.w3.org/2001/XMLSchema"
                 xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">

The first two lines of the previous snippet declare the name of the service package and its current version. Notice that these match the ServiceManifestName and ServiceManifestVersion attributes declared in the ApplicationManifest.xml file. Version numbers play a significant role in application upgrades, which we will cover in a future post. In the same way that an SF application has a specific type, a service has a type as well, defined in the ServiceManifest.xml file.

<ServiceTypes>
    <!-- This is the name of your ServiceType. 
         This name must match the string used in the RegisterServiceAsync call in Program.cs. -->
    <StatelessServiceType ServiceTypeName="CounterStatelessServiceType" />
  </ServiceTypes>

Note that this value matches the ServiceTypeName of the service to be activated in the ApplicationManifest.xml file. The next important configuration property is the EntryPoint, which defines what your service actually runs.

<CodePackage Name="Code" Version="1.0.0">
    <EntryPoint>
      <ExeHost>
        <Program>CounterStatelessService.exe</Program>
      </ExeHost>
    </EntryPoint>
  </CodePackage>

An entry point can also be used to run initialization scripts or code before the service instance is activated. At the end of the day, a service is an executable program like a normal console app. Program.cs registers the service type with Service Fabric: before ASF can activate an instance of a service, its service type needs to be registered.

private static void Main()
{
    try
    {
        // The ServiceManifest.XML file defines one or more service type names.
        // Registering a service maps a service type name to a .NET type.
        // When Service Fabric creates an instance of this service type,
        // an instance of the class is created in this host process.

        ServiceRuntime.RegisterServiceAsync("CounterStatelessServiceType",
            context => new CounterStatelessService(context)).GetAwaiter().GetResult();

        ServiceEventSource.Current.ServiceTypeRegistered(Process.GetCurrentProcess().Id, typeof(CounterStatelessService).Name);

        // Prevents this host process from terminating so services keep running.
        Thread.Sleep(Timeout.Infinite);
    }
    catch (Exception e)
    {
        ServiceEventSource.Current.ServiceHostInitializationFailed(e.ToString());
        throw;
    }
}

The Main method also contains scaffolded code that generates logs using Event Tracing for Windows (ETW), which is very useful for understanding the behavior, state and failures of your SF applications and services. A Stateless service class inherits from the StatelessService class.

internal sealed class CounterStatelessService : StatelessService
{
    public CounterStatelessService(StatelessServiceContext context)
        : base(context)
    { }
    // code omitted

Optionally, the service registers communication listeners so that it can accept requests from clients or other services. Service discovery is one of the best features of SF: it provides a simple, straightforward way for services to find and communicate with each other even when they move to a different host machine or fail. You are not limited to HTTP and can choose among several communication protocols, which helps performance for internal service-to-service communication. We will see more on service discovery in a future post. Last but not least, a service has an optional RunAsync method that defines what the service does when it is instantiated.

protected override async Task RunAsync(CancellationToken cancellationToken)
{
    long iterations = 0;

    while (true)
    {
        cancellationToken.ThrowIfCancellationRequested();

        ServiceEventSource.Current.ServiceMessage(this.Context, "Iteration-{0}   |   {1}",
                                                    ++iterations, this.Context.InstanceId);

        await Task.Delay(TimeSpan.FromSeconds(5), cancellationToken);
    }
}

All the CounterStatelessService does is log a message with the current number of iterations plus the instance ID of the service every 5 seconds. This method runs every time an instance of the service is created. In the case of a stateless service such as CounterStatelessService, each instance has its own iterations variable, which means the instances all log values of different counters.

Deploy the Stateless service

Let’s deploy the Service Fabric application and see what happens. You can do it either by pressing F5 and debugging in VS as usual, or by right clicking the CounterApplication project and selecting Publish. If you choose the F5 option, the ASF application will be automatically deployed to your local cluster and de-provisioned when you stop debugging. If you choose the Publish option, a new window will open where you have to choose:

  • Target profile: The options are the publish profiles that exist in the CounterApplication/PublishProfiles folder
  • Connection Endpoint: The endpoint is defined (or not) in the publish profile chosen in the previous step. If you choose PublishProfiles/Local.1Node.xml or PublishProfiles/Local.5Node.xml, your local cluster will be selected. Otherwise you have to enter the endpoint of your cluster. This is defined in the publish profile XML files as follows:
    <?xml version="1.0" encoding="utf-8"?>
    <PublishProfile xmlns="http://schemas.microsoft.com/2015/05/fabrictools">
      <ClusterConnectionParameters ConnectionEndpoint="" />
      <ApplicationParameterFile Path="..\ApplicationParameters\Cloud.xml" />
      <CopyPackageParameters CompressPackage="true" />
    </PublishProfile>
    

    If ClusterConnectionParameters is left empty, the local cluster is selected.

  • Application Parameters file: The options are the files that exist in the CounterApplication/ApplicationParameters folder, and your choice should match your publish profile selection

When choosing the PublishProfiles/Local.1Node.xml or PublishProfiles/Local.5Node.xml publish profile, make sure that your local cluster’s mode is the same: choose Local.5Node.xml when your cluster runs in 5-node mode and Local.1Node.xml otherwise. By default the local cluster is configured to run with 5 nodes, so your publish window should look like this:

Publish the CounterApplication (preferably using F5 during this post) to your local cluster and then open the Diagnostic Events view. You can open the view in VS by navigating to View -> Other Windows -> Diagnostic Events.

Now check the changes that happened in the Service Fabric Explorer.

At this point we will pause and explain a few things. First of all, only one instance of our Stateless service is deployed even though we run the cluster in 5-node mode. Why did that happen? The ApplicationManifest.xml file defines that the default number of CounterStatelessService instances is -1, which in Azure Service Fabric means that the service should be deployed to all nodes in the cluster. We should therefore have seen one instance deployed per node, but we see only one instance (in my case deployed on _Node_0, but in yours it can be different). This happened because the CounterStatelessService_InstanceCount parameter was overridden by a value provided in the Local.5Node.xml application parameters file:

<?xml version="1.0" encoding="utf-8"?>
<Application xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" Name="fabric:/CounterApplication" xmlns="http://schemas.microsoft.com/2011/01/fabric">
  <Parameters>
    <Parameter Name="CounterStatelessService_InstanceCount" Value="1" />
  </Parameters>
</Application>

Back in Service Fabric Explorer, there are several types of nodes in the tree:

  • Application type: The top-level node is the application type, which is created when you create a Service Fabric application. In our case it is CounterApplicationType (1.0.0), as defined in the ApplicationManifest.xml file
  • Application instance: It’s the second level in the hierarchy, directly under the application type node. When you deploy an ASF application you get an instance of that application type on the cluster named fabric:/ApplicationName, in our case fabric:/CounterApplication
  • Service type: The nodes that represent the services registered in Azure Service Fabric. In a real scenario you will have many of them. The service name has the format fabric:/ApplicationName/ServiceName, which in our case is fabric:/CounterApplication/CounterStatelessService
  • Partition type: A partition is identified by a GUID and makes more sense for stateful services. In our case we have one partition.
    <Service Name="CounterStatelessService" ServicePackageActivationMode="ExclusiveProcess">
      <StatelessService ServiceTypeName="CounterStatelessServiceType" InstanceCount="[CounterStatelessService_InstanceCount]">
        <SingletonPartition />
      </StatelessService>
    </Service>
    

    We will discuss partitions in the Stateful services section.

  • Replica or Instance type: It shows the cluster node where the service is currently hosted and running


Assuming that CounterStatelessService was activated on _Node_3, as shown in the replica node, you can drill down in Service Fabric Explorer and get the Instance ID of that service.

Switch back to the VS Diagnostic Events window and confirm that you get logs from that instance. Now we will simulate a system failure: we will make the node where CounterStatelessService is currently deployed fail and see what happens. To do this, find the current node (in my case _Node_3), click on the 3 dots on the right and select Deactivate (restart).

You will be prompted to enter the node’s name to confirm the restart. While the node is restarting, keep the Diagnostic Events window open and watch what happens.

As you can see, multiple Service Fabric events have fired. The most important are the highlighted ones, where the currently active service instance received a cancellation request through its CancellationToken. When one of your services receives a cancellation request, make sure it stops any work in progress. Next you can see that the node deactivation completed and a new CounterStatelessService instance was activated. Since this is a new instance, the logs show a new Instance ID and the iterations variable starting over. When Azure Service Fabric detected that a node had failed, it checked [CounterStatelessService_InstanceCount] and determined that one instance should always be active, so it found a healthy node in the cluster and created a new instance for you automatically. The same happens if you change the [CounterStatelessService_InstanceCount] value: Service Fabric will always try to keep the configured number of instances active.
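
How a service loop can honor a cancellation request is easier to see in isolation. The following is a minimal sketch in plain C#, with no Service Fabric dependency: RunLoopAsync is a hypothetical stand-in for RunAsync, the CancellationTokenSource plays the role of the runtime signalling a node deactivation, and the short delays are assumptions chosen only to keep the demo fast.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class CancellationSketch
{
    // Records whether the cleanup step ran (stands in for releasing real resources).
    public static bool CleanedUp;

    // Hypothetical stand-in for RunAsync: loops until the token is cancelled.
    public static async Task RunLoopAsync(CancellationToken cancellationToken)
    {
        long iterations = 0;
        try
        {
            while (true)
            {
                // Throws OperationCanceledException once cancellation is requested.
                cancellationToken.ThrowIfCancellationRequested();
                Console.WriteLine($"Iteration-{++iterations}");
                // Passing the token lets the delay itself end early on cancellation.
                await Task.Delay(TimeSpan.FromMilliseconds(50), cancellationToken);
            }
        }
        finally
        {
            // Runs on cancellation too: stop timers, flush buffers, close connections.
            CleanedUp = true;
        }
    }

    public static async Task Main()
    {
        // Simulates the runtime requesting cancellation after 200 ms.
        using (var cts = new CancellationTokenSource(TimeSpan.FromMilliseconds(200)))
        {
            try
            {
                await RunLoopAsync(cts.Token);
            }
            catch (OperationCanceledException)
            {
                Console.WriteLine($"Loop cancelled, cleanup done: {CleanedUp}");
            }
        }
    }
}
```

The point of the try/finally is that cleanup runs no matter where in the loop the cancellation lands, and the exception is allowed to propagate so the caller knows the loop ended because it was asked to.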

You can reactivate the node you deactivated in the same way, by selecting Activate.

If you set [CounterStatelessService_InstanceCount] to 5 in a 5-node cluster, ASF will distribute the service evenly by creating one instance on each node. You might expect that deactivating a node in that case would make ASF create a new instance on one of the other healthy nodes, so that one node would host 2 instances of the service. This won’t happen, because of a placement constraint: multiple instances of a single partition cannot be placed on the same node. If you want to test a setup where a node can host more than one instance of the service, apply the following configuration:

  • Change the partition count of the CounterStatelessService to 5: check the modifications made in the ApplicationManifest.xml file, where a new CounterStatelessService_PartitionCount parameter was added and the partition type changed from SingletonPartition to UniformInt64Partition.
    <?xml version="1.0" encoding="utf-8"?>
    <ApplicationManifest xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" ApplicationTypeName="CounterApplicationType" ApplicationTypeVersion="1.0.0" xmlns="http://schemas.microsoft.com/2011/01/fabric">
      <Parameters>
        <Parameter Name="CounterStatelessService_InstanceCount" DefaultValue="-1" />
        <Parameter Name="CounterStatelessService_PartitionCount" DefaultValue="1" />
      </Parameters>
      <!-- Import the ServiceManifest from the ServicePackage. The ServiceManifestName and ServiceManifestVersion 
           should match the Name and Version attributes of the ServiceManifest element defined in the 
           ServiceManifest.xml file. -->
      <ServiceManifestImport>
        <ServiceManifestRef ServiceManifestName="CounterStatelessServicePkg" ServiceManifestVersion="1.0.0" />
        <ConfigOverrides />
      </ServiceManifestImport>
      <DefaultServices>
        <!-- The section below creates instances of service types, when an instance of this 
             application type is created. You can also create one or more instances of service type using the 
             ServiceFabric PowerShell module.
             
             The attribute ServiceTypeName below must match the name defined in the imported ServiceManifest.xml file. -->
        <Service Name="CounterStatelessService" ServicePackageActivationMode="ExclusiveProcess">
          <StatelessService ServiceTypeName="CounterStatelessServiceType" InstanceCount="[CounterStatelessService_InstanceCount]">
            <UniformInt64Partition PartitionCount="[CounterStatelessService_PartitionCount]" LowKey="-9223372036854775808" HighKey="9223372036854775807" />
          </StatelessService>
        </Service>
      </DefaultServices>
    </ApplicationManifest>
    
  • Change the parameter values in the ApplicationParameters/Local.5Node.xml file to reflect an environment where there are 5 different partitions of the CounterStatelessService with 1 instance each:
    <?xml version="1.0" encoding="utf-8"?>
    <Application xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" Name="fabric:/CounterApplication" xmlns="http://schemas.microsoft.com/2011/01/fabric">
      <Parameters>
        <Parameter Name="CounterStatelessService_InstanceCount" Value="1" />
        <Parameter Name="CounterStatelessService_PartitionCount" Value="5" />
      </Parameters>
    </Application>
    

Hit F5 to publish the app in the local cluster and check the Service Fabric Explorer.

What you see is 5 different partitions, each with a single instance of the CounterStatelessService, deployed on 5 different cluster nodes. Try to deactivate a node and Service Fabric will create a new instance of the service on one of the remaining 4 nodes. In the following screenshot notice that _Node_3 was deactivated and a new instance was created on _Node_4.
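
The two outcomes described in this section (one partition with 5 instances versus 5 partitions with 1 instance each) can be reasoned about with a toy model. The sketch below is not how Service Fabric actually computes placement; it only encodes the single rule stated in this post, that a node can host at most one instance of a given partition. The class and method names are hypothetical.

```csharp
using System;

public static class PlacementSketch
{
    // Toy rule: a node hosts at most one instance of a given partition,
    // so each partition can run at most one instance per healthy node.
    public static int PlaceableInstances(int partitionCount, int instancesPerPartition, int healthyNodes)
    {
        int perPartition = Math.Min(instancesPerPartition, healthyNodes);
        return partitionCount * perPartition;
    }

    public static void Main()
    {
        // 1 partition x 5 instances, with one of the 5 nodes deactivated:
        // the fifth instance has no node left to go to.
        Console.WriteLine(PlaceableInstances(1, 5, 4)); // prints 4

        // 5 partitions x 1 instance each, with one of the 5 nodes deactivated:
        // the displaced instance belongs to a different partition than its new
        // neighbor, so it can share a node.
        Console.WriteLine(PlaceableInstances(5, 1, 4)); // prints 5
    }
}
```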

Partitions make more sense in Stateful services and this is where we will explain them in detail.

Stateful services

Sometimes you need your microservices to save some type of local state and to survive and restore that state after system failures. This is where Service Fabric Stateful services come in, raising the level of reliability and availability in scalable, distributed systems. Service Fabric provides reliable data structures (dictionaries and queues) that are persisted and replicated automatically to secondary replicas in the cluster. The main idea is simple: a partition is nothing more than a set of replicas. There is only one primary replica in a partition and all writes go through it. All other replicas are secondaries and don’t accept or process requests. All state changes are replicated to the secondaries and are handled in transactions, meaning a change is considered committed only once it has been acknowledged by a quorum (majority) of the replicas in the set. Let’s create our first Stateful service and explain how it works in more detail.

Right click the CounterApplication and select Add => New Service Fabric Service...

Select the .NET Core Stateful Service template and name the service CounterStatefulService.

A Stateful service inherits from the StatefulService class.

internal sealed class CounterStatefulService : StatefulService
{
    public CounterStatefulService(StatefulServiceContext context)
        : base(context)
    { }
    // code omitted

The StatefulService class has a property named StateManager of type IReliableStateManager. This property gives you access to a Reliable State Manager, which is used to access the reliable collections as if they were local data.

public abstract class StatefulService : StatefulServiceBase
{
    protected StatefulService(StatefulServiceContext serviceContext);
    protected StatefulService(StatefulServiceContext serviceContext, IReliableStateManagerReplica reliableStateManagerReplica);

    public IReliableStateManager StateManager { get; }
}

Keep in mind that reliable data structures aren’t actually local data but distributed, which means that their lifetimes must be managed properly to ensure consistency and data integrity across all replicas. Change the RunAsync method as follows:

protected override async Task RunAsync(CancellationToken cancellationToken)
{
    var counterDictionary = await this.StateManager.GetOrAddAsync<IReliableDictionary<string, long>>("counter");

    while (true)
    {
        cancellationToken.ThrowIfCancellationRequested();

        using (var tx = this.StateManager.CreateTransaction())
        {
            var result = await counterDictionary.TryGetValueAsync(tx, "iteration");

            ServiceEventSource.Current.ServiceMessage(this.Context, "Iteration-{0}   |   {1}",
                (result.HasValue ? result.Value.ToString() : "Value does not exist."), this.Context.ReplicaOrInstanceId);

            await counterDictionary.AddOrUpdateAsync(tx, "iteration", 0, (key, value) => ++value);

            // If an exception is thrown before calling CommitAsync, the transaction aborts, all changes are 
            // discarded, and nothing is saved to the secondary replicas.
            await tx.CommitAsync();
        }

        await Task.Delay(TimeSpan.FromSeconds(5), cancellationToken);
    }
}
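
The read-then-AddOrUpdate step in the loop above determines what gets logged on each pass. As a rough illustration only, the sketch below replays that step against an in-memory ConcurrentDictionary, which is an assumption standing in for the reliable dictionary: it has no transactions, persistence or replication, but its AddOrUpdate method takes the same add value and update function.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;

public static class AddOrUpdateSketch
{
    // Runs the read-then-AddOrUpdate step `passes` times and returns what each pass would log.
    public static List<string> Simulate(int passes)
    {
        var counter = new ConcurrentDictionary<string, long>();
        var logged = new List<string>();

        for (int i = 0; i < passes; i++)
        {
            // Read first, as the service does before updating.
            logged.Add(counter.TryGetValue("iteration", out long current)
                ? current.ToString()
                : "Value does not exist.");

            // Adds 0 when the key is missing, otherwise stores the old value plus one.
            counter.AddOrUpdate("iteration", 0, (key, value) => value + 1);
        }

        return logged;
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(" | ", Simulate(4)));
        // prints: Value does not exist. | 0 | 1 | 2
    }
}
```

The reliable dictionary follows the same add-then-update pattern, with the difference that its changes only take effect once the transaction commits.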

Here is an example of a reliable dictionary in action. First we use the StateManager to get a reference to a reliable dictionary named counter that holds <string, long> key/value pairs. Next we create a transaction and try to read the value for the key iteration in the dictionary. Then we add or update the value for that key, and finally we commit the transaction. Before testing the behavior of our Stateful service, remove the CounterStatelessService project from the solution so that we can focus on the stateful service only. The ApplicationManifest.xml file will change automatically and should look like this:

<?xml version="1.0" encoding="utf-8"?>
<ApplicationManifest xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" ApplicationTypeName="CounterApplicationType" ApplicationTypeVersion="1.0.0" xmlns="http://schemas.microsoft.com/2011/01/fabric">
  <Parameters>
    <Parameter Name="CounterStatefulService_MinReplicaSetSize" DefaultValue="3" />
    <Parameter Name="CounterStatefulService_PartitionCount" DefaultValue="1" />
    <Parameter Name="CounterStatefulService_TargetReplicaSetSize" DefaultValue="3" />
  </Parameters>
  <!-- Import the ServiceManifest from the ServicePackage. The ServiceManifestName and ServiceManifestVersion 
       should match the Name and Version attributes of the ServiceManifest element defined in the 
       ServiceManifest.xml file. -->
  <ServiceManifestImport>
    <ServiceManifestRef ServiceManifestName="CounterStatefulServicePkg" ServiceManifestVersion="1.0.0" />
    <ConfigOverrides />
  </ServiceManifestImport>
  <DefaultServices>
    <!-- The section below creates instances of service types, when an instance of this 
         application type is created. You can also create one or more instances of service type using the 
         ServiceFabric PowerShell module.
         
         The attribute ServiceTypeName below must match the name defined in the imported ServiceManifest.xml file. -->
    <Service Name="CounterStatefulService" ServicePackageActivationMode="ExclusiveProcess">
      <StatefulService ServiceTypeName="CounterStatefulServiceType" TargetReplicaSetSize="[CounterStatefulService_TargetReplicaSetSize]" MinReplicaSetSize="[CounterStatefulService_MinReplicaSetSize]">
        <UniformInt64Partition PartitionCount="[CounterStatefulService_PartitionCount]" LowKey="-9223372036854775808" HighKey="9223372036854775807" />
      </StatefulService>
    </Service>
  </DefaultServices>
</ApplicationManifest>

Also change the ApplicationParameters/Local.5Node.xml file as follows:

<?xml version="1.0" encoding="utf-8"?>
<Application xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" Name="fabric:/CounterApplication" xmlns="http://schemas.microsoft.com/2011/01/fabric">
  <Parameters>
    <Parameter Name="CounterStatefulService_PartitionCount" Value="1" />
    <Parameter Name="CounterStatefulService_MinReplicaSetSize" Value="3" />
    <Parameter Name="CounterStatefulService_TargetReplicaSetSize" Value="3" />
  </Parameters>
</Application>

Hit F5 and deploy the Service Fabric application to your local cluster. From the parameters above, we expect 1 partition with 3 replicas to be deployed in the cluster. Confirm this by opening the Service Fabric Explorer.

What you see is that there is indeed one partition with 3 replicas of our CounterStatefulService. The primary is hosted on _Node_2 and the 2 secondaries are active and hosted on _Node_0 and _Node_1. The interesting part, though, is in the Diagnostic Events view.

Notice that only one replica is active and logging the value from the counter dictionary, and of course this is the primary one. Now let’s run the same test we ran on the stateless service after publishing the app. Switch to the Service Fabric Explorer and Deactivate (restart) the node where the primary replica is hosted (in my case _Node_2). Then switch back to the Diagnostic Events view and take a look at what happened.

This is pretty amazing. Service Fabric detected the node’s failure and, with it, the failure of the primary replica on that node. Then, based on the configuration parameters, it elected a new primary by promoting one of the existing secondary replicas. The new primary replica was able to read the last value of the iteration key in the counter dictionary, which was originally written by the replica that was running on _Node_2. One strange thing you might notice in the Service Fabric Explorer is that Service Fabric didn’t create a new replica on some other available node but instead kept the replica on that node as an active secondary, marked unhealthy. This makes sense because it assumes that the node will recover, so there is no need to create a new replica on a different node in the cluster. If it did create a new replica of the stateful service on a different node, it would also have to synchronize the state to that node, which of course consumes resources (especially network bandwidth). If you want to test this scenario, click the 3 dots on the node currently hosting the primary replica and select Deactivate (remove data). You will see that a new replica is created on one of the available nodes in the local cluster.

Partitions

Partitioning is all about divide and conquer: it increases scalability and performance by splitting the state and processing into smaller logical units. The first thing you need to know before viewing some examples is that partitioning works differently for stateless and stateful services in Service Fabric. All instances in a stateless service partition are active and running (and probably accepting client requests as well), while, as we have already mentioned, only one replica is actually running in a stateful service partition and all the others simply participate in the write quorum of the set (syncing state). For example, if you have a stateless service with 1 partition and 5 instances in a 5-node cluster, then you have 1 service instance up and running on each node.

On the other hand, if you have a stateful service with the same configuration, each node will again host a replica of the service but only one of them, the primary, will be up and running, accepting client requests. All the others will be active secondaries syncing the state and nothing more.
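The difference between the two cases can be sketched in a few lines of Python. This is a toy model under stated assumptions, not the Service Fabric API: partition_layout and majority_quorum are hypothetical helpers, the first node is arbitrarily picked as the primary, and the write quorum is assumed to be a simple majority of the replicas.

```python
def partition_layout(kind, nodes):
    """Describe what each node hosts for one partition spread over `nodes`."""
    if kind == "stateless":
        # Every instance is active and may serve client requests.
        return {node: "instance (serving)" for node in nodes}
    if kind == "stateful":
        # Only one replica is primary; the rest just replicate its state.
        roles = {node: "active secondary (syncing)" for node in nodes}
        roles[nodes[0]] = "primary (serving)"
        return roles
    raise ValueError(f"unknown service kind: {kind}")


def majority_quorum(replica_count):
    # Assumption: a write must be acknowledged by a simple majority of replicas.
    return replica_count // 2 + 1


nodes = [f"_Node_{i}" for i in range(5)]
for kind in ("stateless", "stateful"):
    layout = partition_layout(kind, nodes)
    serving = [n for n, role in layout.items() if "serving" in role]
    print(kind, "->", len(serving), "node(s) serving requests")
print("write quorum for 5 replicas:", majority_quorum(5))
```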

We will cover Service Fabric listeners and communication stacks in a future post, but for now just keep in mind that service instances can be targeted through the key of the partition they belong to. You don’t actually need partitioning in stateless services since they don’t save any state locally and there is nothing to distribute evenly. If you need to scale, just add more instances of the service by increasing the instance count parameter and that’s all.

Targeting specific instances

The only scenario where multiple partitions make sense in a stateless service is when you want to route certain requests to specific instances of the service.

Partitioning in stateful services is about splitting responsibilities so that each partition processes a smaller portion of the state. What do we mean by that? Let’s take an example where you have a stateful service that accepts requests for storing demographic data in a city with 5 regions. You could create 5 different Named partitions using the region code as the partition key. This way, all requests related to a given region code end up in the same partition (and its replicas), which results in better load balancing since requests are distributed to different partitions depending on the region code.
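The routing idea behind Named partitions can be illustrated with a short Python sketch. Everything here is hypothetical: the region codes R1 to R5, the resolve_partition and route helpers and the request payloads are made up for the example, and in a real application the Service Fabric client would resolve the partition for you.

```python
from collections import defaultdict

# Hypothetical region codes; each one doubles as the name of a Named partition.
REGION_PARTITIONS = ["R1", "R2", "R3", "R4", "R5"]


def resolve_partition(region_code):
    """Return the named partition responsible for a region code."""
    if region_code not in REGION_PARTITIONS:
        raise KeyError(f"no partition for region {region_code!r}")
    return region_code


def route(requests):
    """Group (region_code, payload) requests by their target partition."""
    by_partition = defaultdict(list)
    for region_code, payload in requests:
        by_partition[resolve_partition(region_code)].append(payload)
    return dict(by_partition)


requests = [("R1", "record-a"), ("R3", "record-b"), ("R1", "record-c")]
routed = route(requests)
print(routed)
```

Both R1 requests land in the same partition, so the state for that region lives in exactly one replica set.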

Be careful though to choose a good partitioning strategy, because you may end up with partitions that serve more traffic than others, which probably means they store more state as well. In our previous example, assuming that 2 of the 5 regions account for 80% of the city’s population, those 2 partitions serve far more traffic than the other 3.

So when choosing a partition strategy try to figure out how to evenly split the amount of state across the partitions.
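The imbalance in the region example is easy to quantify. The Python sketch below assumes a hypothetical split in which two regions hold 40% of the population each and the remaining 20% is spread over the other three; the exact numbers and the load_report helper are made up for illustration.

```python
# Hypothetical population shares in percent: two regions hold 80% of the city.
population_share = {"R1": 40, "R2": 40, "R3": 7, "R4": 7, "R5": 6}


def load_report(shares):
    """Compare each partition's share with a perfectly even split."""
    even = 100 / len(shares)  # 20% per partition when there are 5 of them
    return {name: round(share / even, 2) for name, share in shares.items()}


report = load_report(population_share)
for name, factor in report.items():
    print(f"{name}: {factor}x the even share")
```

With these numbers the two big partitions carry twice the even share while the small ones carry about a third of it, which is exactly the kind of skew a good partition key should avoid.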
Another key aspect of Service Fabric partitioning is that SF will always try to distribute partitions and replicas across all available nodes in the cluster to even out the workload. This doesn’t happen only once during deployment but also whenever nodes fail or new ones are added. Let’s say you start your Service Fabric application in a 4-Node cluster and you have a stateful service with 8 partitions of 3 replicas each (one primary, two secondaries).

Notice how Service Fabric has evenly distributed all 8 primary replicas by deploying 2 of them on each node. Scaling out the 4-Node cluster to an 8-Node cluster would result in redistributing partitions and primary replicas across the 8 nodes as follows:

Service Fabric detected that new nodes were added to the cluster and tried its best to relocate replicas in order to even out the workload. Rebalancing the primary replicas across all 8 nodes causes the client requests to be distributed across all 8 nodes as well, which should increase the overall performance of the application.
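The numbers in this example can be reproduced with a toy placement model in Python. Note that this is only a sketch: simple round-robin placement is an assumption made for illustration and is not how Service Fabric’s resource balancing actually works, and the place and summarize helpers are hypothetical.

```python
from collections import Counter


def place(partitions, replicas_per_partition, node_count):
    """Round-robin placement: replica r of partition p goes to node (p + r) % node_count.

    Replica 0 of each partition is treated as its primary.
    """
    placement = {}
    for p in range(partitions):
        placement[p] = [(p + r) % node_count for r in range(replicas_per_partition)]
    return placement


def summarize(placement):
    """Count primaries and total replicas hosted on each node."""
    primaries = Counter(nodes[0] for nodes in placement.values())
    replicas = Counter(node for nodes in placement.values() for node in nodes)
    return dict(sorted(primaries.items())), dict(sorted(replicas.items()))


for node_count in (4, 8):
    primaries, replicas = summarize(place(8, 3, node_count))
    print(f"{node_count} nodes -> primaries per node: {primaries}")
    print(f"{node_count} nodes -> replicas per node:  {replicas}")
```

With 4 nodes every node hosts 2 primaries, and with 8 nodes every node hosts exactly 1, so the request-serving replicas are spread over twice as many machines.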

Monitor Service Fabric applications with PowerShell

While you can monitor your Service Fabric cluster status and services using the UI (Azure or Orchestrators), you can also use PowerShell cmdlets. These cmdlets are installed by the Service Fabric SDK. When developing on your local machine, most of the time you will have many services (or partitions and replicas if you prefer) published in the local cluster. In case you wish to debug a specific replica, you can find the information you need using PowerShell. In the following example I deployed both the CounterStatelessService and the CounterStatefulService in the cluster, with 1 partition of 3 instances and 2 partitions of 3 replicas each, respectively.

What if I wanted to debug the primary replica of the Stateful service which is hosted on _Node_0? The first thing we need to do is connect to the cluster by typing the following command in PowerShell.

Connect-ServiceFabricCluster 

The Connect-ServiceFabricCluster cmdlet creates a connection to a Service Fabric cluster and when called with no parameters connects to your local cluster.

You can check the applications published on your cluster using the Get-ServiceFabricApplication cmdlet.

Now let’s see what’s happening on _Node_0, the node we are interested in, by running the following command:

Get-ServiceFabricDeployedReplica -NodeName "_Node_0" -ApplicationName "fabric:/CounterApplication"


As you can see, there are 2 replicas of the CounterStatefulService service coming from 2 different partitions. The primary is the one we are interested in, and the output gives us its process id, which is 9796. We can now switch to VS, select Debug => Attach to process.., find the process with that id and start debugging.

You can find all the PowerShell Service Fabric cmdlets here.

That’s it, we have finished! We saw the most basic things you need to know before taking a deeper dive into Service Fabric. In upcoming posts we are going to dig deeper and learn about Actors, the available communication stacks, how services communicate with each other and much more, so stay tuned till next time..

In case you find my blog’s content interesting, register your email to receive notifications of new posts and follow chsakell’s Blog on its Facebook or Twitter accounts.



Categories: asp.net core, Azure
