NiFi - Configure to poll Azure WebJobs

by Lewis


Posted on January 1, 2018 at 12:00 PM


In this article, I detail the steps I use to work around not being able to use Continuous WebJobs on the Azure Free Tier.



I have several applications that run in various cloud architectures. Underlying all of them is a series of microservices that perform specific pieces of functionality independently of UI threads and the like. These microservices are queue-based, with messages consumed by Azure WebJobs. WebJobs is a feature of Azure App Service that enables you to run a program or script in the same context as a web app, API app, or mobile app.

WebJob types

The following table describes the differences between continuous and triggered WebJobs.

Continuous | Triggered
Starts immediately when the WebJob is created. To keep the job from ending, the program or script typically does its work inside an endless loop. If the job does end, you can restart it. | Starts only when triggered manually or on a schedule.
Runs on all instances that the web app runs on. You can optionally restrict the WebJob to a single instance. | Runs on a single instance that Azure selects for load balancing.
Supports remote debugging. | Doesn't support remote debugging.

A WebJob can time out after 20 minutes of inactivity. Only requests to the actual web app reset the timer. Viewing the configuration in the Azure Portal or making requests to the advanced tools site doesn't reset the timer.

Continuous WebJobs are not available on the Free Tier, whereas Triggered WebJobs are available on all tiers. The messages I send to my WebJobs aren't critical or real-time, but I would like the WebJob to respond within a reasonable amount of time. Because WebJobs are triggered by hitting their endpoint, we can use a scheduled NiFi workflow to trigger the WebJob.

 

NiFi

As mentioned previously, a simple HTTP request is enough to trigger the WebJob, so polling its endpoint causes the job to execute its payload. In this example, we use the InvokeHTTP processor to touch our WebJob URL, which should trigger it.

[Image: WebJob process group]

In this group, the calls are triggered from a process outside the group. The call comes in and we touch each of the WebJob URLs in sequence. When we get a response from each WebJob, the flow records the date and time of the call in an attribute. Finally, it calls into the "Completed" port, which triggers a further process outside this group. In this example, I trigger three WebJobs. The flow looks like this:

[Image: NiFi flow]
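For readers who think in code, the sequence inside the group can be sketched in a few lines of Python. This is only an illustration of the order of operations, not how NiFi implements it: the `run_ping_flow` function, the `trigger` callable, the attribute names, and the second and third job names are all hypothetical.

```python
from datetime import datetime, timezone
from typing import Callable, Dict, List


def run_ping_flow(job_names: List[str], trigger: Callable[[str], int]) -> Dict[str, str]:
    """Touch each WebJob in order and record when each call returned."""
    attributes: Dict[str, str] = {}
    for name in job_names:
        status = trigger(name)  # blocks until the WebJob endpoint responds
        attributes[f"{name}.status"] = str(status)
        attributes[f"{name}.calledAt"] = datetime.now(timezone.utc).isoformat()
    # Stands in for handing the flowfile to the "Completed" output port.
    attributes["completed"] = "true"
    return attributes


if __name__ == "__main__":
    # Stub trigger so the sketch runs without network access.
    # Only "LoggingWebJob" comes from the article; the other names are made up.
    result = run_ping_flow(
        ["LoggingWebJob", "SecondWebJob", "ThirdWebJob"],
        lambda name: 200,
    )
    for key, value in result.items():
        print(key, "=", value)
```

Passing the trigger in as a callable keeps the sequencing separate from the HTTP call itself, much as the flow separates the attribute steps from InvokeHTTP.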

 

Let's break down the steps in this flow:

 

UpdateAttribute

This processor sets an attribute that is used in the next step. In this example, we set the "WebJobName" attribute to "LoggingWebJob", the name of the WebJob we wish to poll.
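The URL configured on the next processor refers to this attribute as ${WebJobName}. As a rough illustration of that substitution, Python's `string.Template` happens to use the same `${name}` syntax. This only mimics the idea and is not NiFi's Expression Language; "myproject" is a placeholder app name and `resolve_remote_url` is a hypothetical helper.

```python
from string import Template

# URL form taken from the InvokeHTTP configuration in this article;
# "myproject" is a placeholder for the real app name.
REMOTE_URL = Template(
    "https://myproject.scm.azurewebsites.net/api/continuouswebjobs/${WebJobName}/start"
)


def resolve_remote_url(attributes: dict) -> str:
    """Fill the ${WebJobName} placeholder from the flowfile's attributes."""
    # substitute() raises KeyError if the attribute is missing.
    return REMOTE_URL.substitute(attributes)


if __name__ == "__main__":
    flowfile_attributes = {"WebJobName": "LoggingWebJob"}
    print(resolve_remote_url(flowfile_attributes))
```

Keeping the job name in an attribute means the same InvokeHTTP configuration can be reused for each WebJob by changing only the attribute value.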

 

InvokeHTTP

[Image: InvokeHTTP attributes]

This processor hits the WebJob URL, which triggers the WebJob to execute.

To configure our InvokeHTTP processor, we need:

  • The WebJob URL, which should look similar to this: https://<app_name>.scm.azurewebsites.net/api/continuouswebjobs/<webjob_name>/start
  • Azure credentials
  • A configured SSL Context Service (in NiFi)

With the prerequisites listed above gathered, these are the relevant properties we need to configure:

Property | Description
Remote URL | The URL we want to hit. It takes the following form: https://<projectname>.scm.azurewebsites.net/api/continuouswebjobs/${WebJobName}/start
SSL Context Service | We need HTTPS to hit our endpoint, so we need a configured SSL Context Service.
Basic Authentication Username & Basic Authentication Password | The credentials used to access the Remote URL. These are configured in Azure.
Content-Type | I set this to a variable defined outside the process group. It is simply set to "application/json".
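To make the properties above concrete, here is a minimal Python sketch of the request they describe, built with the standard library only. It rests on a few assumptions not stated in the article: that the endpoint expects a POST, that "myproject", "user", and "pass" are placeholders, and that `build_webjob_request` is a hypothetical helper name. Python verifies HTTPS certificates against the system trust store by default, which plays roughly the role of the SSL Context Service here.

```python
import base64
import urllib.request


def build_webjob_request(
    project: str, webjob: str, username: str, password: str
) -> urllib.request.Request:
    """Build (but do not send) the request that touches the WebJob URL."""
    url = f"https://{project}.scm.azurewebsites.net/api/continuouswebjobs/{webjob}/start"
    # HTTP Basic authentication: base64("username:password").
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        url,
        data=b"",        # empty body
        method="POST",   # assumption: the start endpoint expects POST
        headers={
            "Authorization": f"Basic {token}",
            "Content-Type": "application/json",
        },
    )


if __name__ == "__main__":
    request = build_webjob_request("myproject", "LoggingWebJob", "user", "pass")
    print(request.get_method(), request.full_url)
    # To actually send it (requires real credentials and network access):
    # with urllib.request.urlopen(request, timeout=30) as response:
    #     print(response.status)
```

Building the request separately from sending it makes it easy to inspect the URL and headers before pointing it at a real app.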

 

UpdateAttribute

[Image: UpdateAttribute attributes]

This step is not required to trigger our WebJob. It only sets some attributes that are used outside this process group, in another flow that records the call in a DynamoDB table.

 

With the processors configured as above, we should be in a good state to hit our WebJobs. If you created a separate process group, exit the group. Add a "GenerateFlowFile" processor and drag a connection from it to trigger the ping flow:

[Image: Trigger]

Then configure its scheduling to be whatever you require. This example triggers every 60 minutes. You may want to set this lower for testing.
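Outside NiFi, the same timer-driven idea can be sketched as a simple loop. This is only an illustration of fixed-interval triggering, not a replacement for the processor's scheduling; `run_on_schedule` and its parameters are hypothetical names, and the `sleep` argument exists so the loop can be exercised without real waiting.

```python
import time
from typing import Callable


def run_on_schedule(
    action: Callable[[], None],
    interval_seconds: float,
    runs: int,
    sleep: Callable[[float], None] = time.sleep,
) -> None:
    """Call `action` `runs` times, waiting `interval_seconds` between calls."""
    for i in range(runs):
        action()
        if i < runs - 1:  # no need to wait after the final run
            sleep(interval_seconds)


if __name__ == "__main__":
    # Short interval for a quick local check; the example in this article
    # uses 60 minutes, which would be interval_seconds=3600.
    run_on_schedule(lambda: print("trigger ping flow"), interval_seconds=0.1, runs=3)
```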

 

Testing

As I knew my WebJob already worked, all I needed to do was ensure that this workflow executed correctly. I performed the following:

  1. Closed any web browsers that were viewing WebJob status. I found that this could trigger the job and confuse things.
  2. Set the GenerateFlowFile processor to run every 2 minutes.
  3. Sent a message to the Logging Queue and waited two minutes.
  4. Two minutes later, saw traffic moving briefly through NiFi, with no error bulletins on any processors.
  5. Viewed the logging data (DynamoDB) and observed that my test message had been consumed and recorded.

 

All done!