
Background

At Plantronics, we have expanded our mission from building best-in-class enterprise headsets to relentlessly providing insights that improve audio communications. The data we collect from headsets is aggregated in our cloud SaaS offering, Plantronics Manager Pro. We also expose the data via APIs (see the links below), which help our partners, MSPs, and customers innovate and build solutions and apps tailored to their own use cases.

Our APIs fall into two categories, like most API offerings:

  1. REST - a pull mechanism using APIs for historical analytics use cases
  2. Real-time streaming - PubNub messaging for real-time relay of device and call events from Plantronics headsets (Quick Disconnect, Mute, etc.), as in our call center wall board integrations, e.g. the Agentq wall board
 

Streaming analytics backend architecture

The Streaming API architecture uses Storm topologies to subscribe to real-time events pushed from Hubs (the Plantronics client software on a laptop or mobile platform) to a PubNub channel.

Architecture flow
The diagram above shows the command-and-control flow for setting up authorization of the real-time data flow between PubNub, the Plantronics cloud, and a partner application. For the basics of starting a real-time stream with Plantronics APIs, see: http://developer.plantronics.com/article/getting-started-real-time-apis

Components:

API servers (host the REST APIs and read/write to the persistence layer: MySQL, HBase, MongoDB).

1. A 3rd-party app calls into the Streaming API over REST to set up a subscription for a specific real-time stream.
2. As part of the SUCCESS/201 response, the app gets PubNub subscription metadata (which pub/sub channel, keyset, etc.). The app is now ready to subscribe to the real-time events it is interested in; a sketch of this handshake appears after the list.
3. The Plantronics cloud internally pushes the intent to subscribe to real-time events to the relevant tenant's Hubs. The Hub client software, which communicates with customer headsets (via Bluetooth or corded USB), then starts publishing real-time events to the Plantronics cloud through a negotiated PubNub channel/keyset.
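To make steps 1 and 2 concrete, here is a minimal sketch, in Java, of the app side of this handshake. The endpoint URL, request body, and channel/key values are hypothetical placeholders, and the subscription code assumes an early release of the PubNub Java SDK v4:

```java
import com.pubnub.api.PNConfiguration;
import com.pubnub.api.PubNub;
import com.pubnub.api.callbacks.SubscribeCallback;
import com.pubnub.api.models.consumer.PNStatus;
import com.pubnub.api.models.consumer.pubsub.PNMessageResult;
import com.pubnub.api.models.consumer.pubsub.PNPresenceEventResult;

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.Arrays;

public class StreamSubscriber {

    public static void main(String[] args) throws Exception {
        // Step 1: register interest in a real-time stream via the REST Streaming API.
        // The URL and request body below are hypothetical placeholders.
        HttpClient http = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("https://api.example-plantronics-cloud.com/streaming/subscriptions"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(
                        "{\"streamType\": \"CALL_EVENTS\", \"tenantId\": \"tenant-123\"}"))
                .build();
        HttpResponse<String> response = http.send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println("subscription response: " + response.statusCode());

        // Step 2: the 201 response carries PubNub metadata (channel, subscribe key, ...).
        // JSON parsing is elided; assume these values came from the response body.
        String subscribeKey = "sub-c-...";                 // from response
        String channel = "tenant-123-call-events";         // from response

        PNConfiguration config = new PNConfiguration();
        config.setSubscribeKey(subscribeKey);
        PubNub pubnub = new PubNub(config);

        pubnub.addListener(new SubscribeCallback() {
            @Override
            public void status(PubNub pubnub, PNStatus status) { /* connectivity changes */ }

            @Override
            public void message(PubNub pubnub, PNMessageResult message) {
                // Each message is a device/call event (Quick Disconnect, Mute, ...).
                System.out.println("event: " + message.getMessage());
            }

            @Override
            public void presence(PubNub pubnub, PNPresenceEventResult presence) { }
        });

        pubnub.subscribe().channels(Arrays.asList(channel)).execute();
    }
}
```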


 

Apache Storm topologies

The topology is designed to subscribe to the PubNub cloud:
  • PubNub spouts - subscribe to PubNub channels for multiple tenants, listening for device and call events coming in from Hubs. Each tenant's Hubs (running on customer laptops/mobiles) push events into multiple PubNub channels, which balances the load across spouts and PubNub SDK threads.
  • PubNub bolts - each bolt maintains a cache of the relevant 3rd-party app subscriptions and the corresponding PubNub metadata. Bolts parse the event JSON from the spout, perform some aggregation, store any analytics stats needed in HBase, and push the event to every app that has a subscription to the event stream. A sketch of the topology wiring follows.
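As a concrete illustration, here is a minimal sketch of how such a topology could be wired using Storm's core APIs (Storm 2.x signatures). The class names and the internal event queue are assumptions for the sketch, not the production code; the PubNub subscription itself is elided:

```java
import org.apache.storm.Config;
import org.apache.storm.StormSubmitter;
import org.apache.storm.spout.SpoutOutputCollector;
import org.apache.storm.task.TopologyContext;
import org.apache.storm.topology.BasicOutputCollector;
import org.apache.storm.topology.OutputFieldsDeclarer;
import org.apache.storm.topology.TopologyBuilder;
import org.apache.storm.topology.base.BaseBasicBolt;
import org.apache.storm.topology.base.BaseRichSpout;
import org.apache.storm.tuple.Fields;
import org.apache.storm.tuple.Tuple;
import org.apache.storm.tuple.Values;

import java.util.Map;
import java.util.concurrent.LinkedBlockingQueue;

public class RealtimeTopology {

    /** Spout that drains events a PubNub listener has pushed onto an in-memory queue. */
    public static class PubNubSpout extends BaseRichSpout {
        private SpoutOutputCollector collector;
        private LinkedBlockingQueue<String> queue; // fed by the PubNub SDK callback

        @Override
        public void open(Map<String, Object> conf, TopologyContext context,
                         SpoutOutputCollector collector) {
            this.collector = collector;
            this.queue = new LinkedBlockingQueue<>();
            // Here the spout would create a PubNub client and subscribe to the
            // channels of the tenants assigned to this spout instance.
        }

        @Override
        public void nextTuple() {
            String eventJson = queue.poll();
            if (eventJson != null) {
                collector.emit(new Values(eventJson));
            }
        }

        @Override
        public void declareOutputFields(OutputFieldsDeclarer declarer) {
            declarer.declare(new Fields("eventJson"));
        }
    }

    /** Bolt that parses the event, aggregates, and relays to subscribed apps. */
    public static class PubNubBolt extends BaseBasicBolt {
        @Override
        public void execute(Tuple input, BasicOutputCollector collector) {
            String eventJson = input.getStringByField("eventJson");
            // Parse JSON, update aggregates, write stats to HBase, and publish
            // the event to every 3rd-party app subscribed to this stream.
        }

        @Override
        public void declareOutputFields(OutputFieldsDeclarer declarer) { }
    }

    public static void main(String[] args) throws Exception {
        TopologyBuilder builder = new TopologyBuilder();
        builder.setSpout("pubnub-spout", new PubNubSpout(), 4);
        builder.setBolt("pubnub-bolt", new PubNubBolt(), 8)
               .shuffleGrouping("pubnub-spout");
        StormSubmitter.submitTopology("realtime-events", new Config(),
                builder.createTopology());
    }
}
```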

Problem

With that background on the overall use case and the workflow for setting up real-time streaming for 3rd-party apps, we can describe the challenge that this article provides an architectural pattern for.

Requirement

This article focuses on business-logic state management, not the time-series data aggregation state that many articles cover. As described above, tenants can be subscribed and unsubscribed using APIs, and a 3rd-party app might subscribe to and unsubscribe from the streams it is interested in. Tenants can come and go; an application might reach end of life. So how do we update the cached state on the Storm spouts/bolts? I researched many options against our requirements. Direct, very-low-latency database access is not a requirement here, since this is a periodic cache update that tells components when to start or stop reading certain tenants, publishing to apps, and so on. However, keeping the real-time event-processing path unaffected by database reads/writes was non-negotiable.

Solution

The diagram above illustrates the state management approach we zeroed in on, which has worked well for us. The following component descriptions explain the ideas and the flow.
  1. Dynamic state manager - This singleton instance periodically kicks off, on an independent thread, an update of a local cache that maintains the active apps and their event-stream subscriptions, the tenants enabled for real-time along with their event subscriptions, and other metadata needed for subscribing to tenants and publishing to partner applications. The dynamic state manager updates its state periodically using REST calls to the API servers; it does not talk to the databases directly.
  2. Spout - The spout periodically checks the dynamic state manager and updates its local cache of which tenants it holds subscriptions for, so it can add and delete tenants as needed. Use something like a regular Java timer to kick off this task, and run the update on your stream-processing framework's thread (in this case a Storm worker thread) to avoid synchronization issues.
  3. Bolt - The bolt periodically updates its cache of apps and their subscriptions, using the tick-tuple construct from Storm's documentation to kick off the refresh. A sketch of the state manager and the tick-tuple refresh follows.
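Here is a minimal sketch of the pattern in Java. DynamicStateManager and fetchStateFromApiServers are hypothetical names introduced for illustration; the tick-tuple wiring uses Storm's standard Config.TOPOLOGY_TICK_TUPLE_FREQ_SECS mechanism:

```java
import org.apache.storm.Config;
import org.apache.storm.topology.BasicOutputCollector;
import org.apache.storm.topology.OutputFieldsDeclarer;
import org.apache.storm.topology.base.BaseBasicBolt;
import org.apache.storm.tuple.Tuple;
import org.apache.storm.utils.TupleUtils;

import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;

/** Singleton that refreshes subscription state from the API servers, never the DB. */
class DynamicStateManager {
    private static final DynamicStateManager INSTANCE = new DynamicStateManager();

    // Immutable snapshot swapped atomically so readers never block on a refresh.
    private final AtomicReference<Map<String, Object>> snapshot =
            new AtomicReference<>(new HashMap<>());
    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor();

    private DynamicStateManager() {
        // Refresh on an independent thread, as in component 1 above.
        scheduler.scheduleAtFixedRate(this::refresh, 0, 60, TimeUnit.SECONDS);
    }

    static DynamicStateManager getInstance() { return INSTANCE; }

    Map<String, Object> current() { return snapshot.get(); }

    private void refresh() {
        Map<String, Object> fresh = fetchStateFromApiServers(); // hypothetical REST call
        snapshot.set(fresh);
    }

    private Map<String, Object> fetchStateFromApiServers() {
        // GET active tenants, app subscriptions, and PubNub metadata from the
        // REST API servers; parsing is elided in this sketch.
        return new HashMap<>();
    }
}

/** Bolt that refreshes its local subscription cache on Storm tick tuples. */
public class SubscriptionAwareBolt extends BaseBasicBolt {
    private Map<String, Object> localCache = new HashMap<>();

    @Override
    public Map<String, Object> getComponentConfiguration() {
        // Ask Storm to send this bolt a tick tuple every 60 seconds.
        Map<String, Object> conf = new HashMap<>();
        conf.put(Config.TOPOLOGY_TICK_TUPLE_FREQ_SECS, 60);
        return conf;
    }

    @Override
    public void execute(Tuple input, BasicOutputCollector collector) {
        if (TupleUtils.isTick(input)) {
            // The refresh runs on the bolt's own executor thread, so the local
            // cache needs no extra synchronization with event processing.
            localCache = DynamicStateManager.getInstance().current();
            return;
        }
        // Normal path: process the real-time event using the cached state.
        // ... parse, aggregate, and relay to the apps found in localCache ...
    }

    @Override
    public void declareOutputFields(OutputFieldsDeclarer declarer) { }
}
```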
This pattern has several advantages:
  • It avoids tight coupling with the database schema by keeping the interface contract at the REST APIs.
  • Some of the APIs already given to external parties can be reused.
  • Testability improves at the service level and in the Storm components, using mock libraries.
  • Performance tuning can focus on the API server layer instead of being spread across the Storm components, which gives good isolation of subcomponents for performance testing.