So what’s going on outside the Core Contact Center

I happened to be watching a few documentaries on how Einstein and other key scientists arrived at the contributions they are now known for… I looked back at my work and saw a plethora of technologies, buzzwords and jargon thrown around everywhere, and I started wondering whether it was possible to bring all of them under one logical roof. What I noticed was that the core contact center gets a lot of attention, but the surrounding soft processes and technologies outside the core have been ignored. Let me first explain what I mean by the core:

It has three primary components:

  1. The Connect Services – The focus is on expanding the ways in which the customer can connect with the contact center. Nowadays this extends to finding ways for agents to perform their service from multiple channels too, beyond the rather traditional phone + PC method.
  2. Self Service Services – These focus on diverting the interaction from human to machine. The motivations can be many, but the bottom line is that it’s a machine doing the human’s job. The rant about IVR Hell is the most popular slogan in every CC salesperson’s narration, and it continues into the Rise of the Bots in service of humanity, with the best bots always being the ones from their own garages.
  3. Automation Services – These focus on ensuring that the customer gets serviced by the right agent or bot, based on information collected during the interaction or from history.

All of these have been fundamental to any contact center solution for more than two decades, and hence I never got around to blogging about the various transformations happening here. What could happen outside the core, however, is never discussed, and hence it is the key subject of this blog.

Let’s visualize this Core Thingy

So the experience gained by the customer when interacting with the Connect Services we call “Customer Experience”, aka “CX”. Similarly, the experience the core gives to agents becomes the “Agent Experience”, aka “AX”… Right?

Wrong… Let’s see why…

Let’s focus on the customer and see what is actually driving their CX… I hear your mind voice… you just thought about the new term “Omni-Channel”… and something else is coming up… “Customer Engagement”… Ah, now I hear something else… OK, stop… I’m here to tell my opinion… not yours!

In my opinion, Customer Experience is governed by three key activities:

  1. Engineering – This is where the engineers tirelessly build the core and associated solutions block by block. After crossing the mindless desert of bureaucracy, the storm of politics and whirlpools of bugs, the engineers bring solutions to production. In the SDLC era this used to mean seemingly lifelong projects, but DevOps has cut the cycle short, so engineers now cross smaller obstacles rather than the larger ones of before…
  2. Experience – Once the solutions are brought to production, the customer gets to use them, and hence you get “Customer Experience”. Thankfully there are tools that can quantitatively measure these customer experiences using DataOps. This used to be a laborious manual task, but nowadays it has become automatic to a large extent, letting the data engineers focus on insights.
  3. Insight – Gaining insights is an activity typically performed by supervisors, but business managers and marketing managers are now slowly getting into these tools as well, to better their side of the business. These insights result in stories, which in turn fuel the next round of engineering.

Now let’s visualize what I’m talking about …

In traditional environments this whole cycle would happen once a month at most, but the way things are moving in the digital economy, it has actually moved on to an event-based model thanks to AI…

On a similar note, the same cycle goes on on the agent side as well, contributing to and improving the “Agent Experience” and “Agent Engagement”.

So what else could be happening here… Most of the engineering activity happens on the CC platform, and the data about customer and agent experiences and interaction histories is stored in data stores.

So Let’s bring them all together:

So Let’s look at this new box called Platform we just added… It’s basically the core of the contact center exposed to Developers and Infrastructure Engineers.

The AppOps Team would use Observability Tools to understand the Services’ performance and bottlenecks.

The AIOps team, on the other hand, uses experience monitoring and uptime monitoring solutions together with automated remediation solutions.

For the developer there is the DevOps stack, with the code repository to store their configurations and code. Continuous integration ensures that the ready-to-release software/configuration gets tested functionally, and for security vulnerabilities as well, before landing on the platform.

So this is what it all would look like:

So the Platform has a lot of real-time and historical data in the Data Store… Let’s see what the Data Folks do with it…

So if you have a truly data engineering minded org, then the data engineers and scientists would like to have their own layer of lakes to hold the processed data in a usable form.

Most Orgs would use prebuilt Analytics solutions to serve business metrics to Business Managers and Contact Metrics to Supervisors…

There could and should be more outside the core that typically gets ignored in most orgs… If you know of anything I missed, please do let me know.

An Approach to Cognify Enterprise Applications

I recently witnessed the setup of my brand new Windows 10 laptop and was surprised when Cortana guided the installation with voice recognition! This was happening before the OS was even on the laptop! … I wouldn’t have imagined this 5 years ago, and I set off imagining how the experience would have been if the setup designer had decided to completely remove any mouse/keyboard inputs. Further, what if Cortana had matured enough to converse with me naturally, without any pre-coded questions being asked in sequence! Instead of saying yes or no, I babble about how good the laptop looks, and Cortana responds with affirmation or otherwise while gently getting me to answer the key questions needed before the full-blown OS installation can start… It sounds cool, and in future releases this may be the reality!

Back to the topic of enterprise applications: conversational experiences are being continuously developed and improved upon, with the bots learning how to converse from both pre-built flows and historical conversation logs. In the enterprise context it now becomes important that CIOs and CTOs start thinking about how their business applications can be used on these conversational platforms. Enterprise leaders need to think carefully about how this gets architected and deployed so that it does not become something mechanical and irritating like traditional IVR solutions. To succeed in this endeavor we need to look not just at the new cognitive platform but also at the services expected to be enabled on the bot, and keep the experience exciting so that it does not meet the same fate as IVR.

I see the following SUPER aspects of the solution that should be scrutinised carefully before project initiation:

  • Service – Look at where the service is currently performed and check whether it can viably be integrated with the cognitive platform.
  • User Experience – Look at how complex the service is to execute over automated interfaces like phone, virtual assistants and chat UI.
  • Peripherals – Look at the peripherals through which the services are currently provided and check whether they can be reused or would need replacement. An oversight here could lead to urgent and expensive replacement later and to decreased user adoption.
  • Environment – Different services are performed in different work conditions, and careful consideration is needed so that services are not offered in conditions where they are inappropriate. For example, having a loud personal assistant speak out a bank balance could embarrass users and raise privacy concerns of a different nature.
  • Reliability – Here the cognitive platform itself should be judged for fragility, not just in terms of uptime but in terms of handling edge cases. This is where the continuous unsupervised learning capability needs to be looked at very carefully and evaluated, to ensure that the platform builds up cognition over time.

Here is an approach for how enterprise leaders can start moving their workforce to embrace cognitive applications:

Step 1) Service Audit – Perform an Audit of Services Being performed and the related applications.

Step 2) Cognitive Index Evaluation – Use the SUPER aspects to evaluate how well each service lends itself to cognification.

Step 3) Build Road Map – Categorise the Services in terms of ease of introduction and ease of development and batch them in phases.

Step 4) Identify Rollout Strategy – Based on the complexity and the number of possible solutions and channels under consideration, one or more POCs may need to be initiated, followed by bigger rollouts. If many business applications need to be integrated, Business Abstraction Layer solutions could be brought in to significantly reduce integration time.

Step 5) Monitor and Manage – While the cognitive solution brings a reduction in service tickets to IT, injecting capabilities like ‘Undirected Engagement’ could create the need to monitor and manage conversations in terms of ethics, privacy and corporate diversity policy.
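
To make Steps 2 and 3 a little more concrete, here is a rough sketch of how a Cognitive Index could be computed and used to batch services into phases. Everything in it is my own assumption for illustration: the 1 to 5 rating scale, the plain average, the phase thresholds and the two sample services are hypothetical, not a standard method.

```javascript
// The five SUPER aspects from the list above.
const SUPER_ASPECTS = ["service", "userExperience", "peripherals", "environment", "reliability"];

// Average the per-aspect ratings into one index.
// The 1-5 integer scale is an assumed convention, not part of any standard.
function cognitiveIndex(ratings) {
  let total = 0;
  for (const aspect of SUPER_ASPECTS) {
    const value = ratings[aspect];
    if (!Number.isInteger(value) || value < 1 || value > 5) {
      throw new RangeError(`Rating for ${aspect} must be an integer from 1 to 5`);
    }
    total += value;
  }
  return total / SUPER_ASPECTS.length;
}

// Map an index to a rollout phase. The thresholds are arbitrary examples.
function rolloutPhase(index) {
  if (index >= 4) return 1;
  if (index >= 3) return 2;
  return 3;
}

// Hypothetical output of the Step 1 service audit.
const audit = [
  {
    name: "Expense approval",
    ratings: { service: 3, userExperience: 2, peripherals: 4, environment: 3, reliability: 3 },
  },
  {
    name: "Password reset",
    ratings: { service: 5, userExperience: 4, peripherals: 5, environment: 4, reliability: 4 },
  },
];

// Score every service, then list the easiest candidates first.
const roadmap = audit
  .map((svc) => {
    const index = cognitiveIndex(svc.ratings);
    return { name: svc.name, index, phase: rolloutPhase(index) };
  })
  .sort((a, b) => b.index - a.index);

console.log(roadmap);
```

A weighted average may suit better if, say, Environment concerns should be able to veto a service outright; the plain average is just the simplest starting point.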

What do you think?

Speech to Text for PSTN using Twilio cPaaS

I’ve been looking for a cloud Speech Recognition capability that can work over the plain old telephone network (PSTN)… but it looks like there are only three options that I’m aware of…

  1. Cisco Tropo ASR

    Tropo Automatic Speech Recognition (ASR) has been in the market for over 2 years. I have sample code on GitHub that you could try. The limitation that I felt with this solution was that it is a word spotting solution: you provide a set of words or sentences that you expect to hear, and Tropo ASR matches the speech against that set. While this was cool a decade ago, we live in an era where the Google Chrome browser is capable of converting spoken sentences to text immediately on budget phones! I also found the Tropo ASR to have a delay of 3-7 seconds on average between the user finishing their utterance and the ASR providing its detection result. So I give this a skip.

  2. Jugaad Solution

    I have seen some trials that took the ‘Recording’ available in Twilio’s Gather verb, sent it to cloud based Speech to Text providers like IBM Watson and then used the result. While this solution works, the UX suffers, as the delay is upwards of 10 seconds! So this too is not a practical solution.

  3. Twilio Speech to Text

    Twilio has now announced its own Speech to Text capability as part of its Gather verb, and in my tests it works perfectly. This service is in beta right now and costs about $0.02 per 15-second batch of speech. Of course the rate reduces with volume.
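
To get a feel for what that pricing means per call, here is a small back-of-the-envelope sketch. It assumes that a partially used 15-second batch is billed as a full batch, which is my reading and not something I have confirmed; the function name is made up, and volume discounts are ignored.

```javascript
// Beta list price quoted above: about $0.02 per 15-second batch.
// Work in cents to avoid floating point rounding surprises.
const CENTS_PER_BATCH = 2;
const BATCH_SECONDS = 15;

// Assumption: a partially used batch is charged as a whole batch.
function estimateSpeechCostCents(speechSeconds) {
  if (speechSeconds <= 0) return 0;
  return Math.ceil(speechSeconds / BATCH_SECONDS) * CENTS_PER_BATCH;
}

// 40 seconds of speech spans 3 batches, so roughly 6 cents.
console.log(estimateSpeechCostCents(40));
```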

In this blog I’ll show how to quickly try out Twilio Speech to Text in 10-15 minutes.

I’m going to use the Glitch web IDE to build the code. You can remix your own version from the URL provided to get started even more quickly.

The Plan

So I’m going to make a very simple app that will receive the call, look at the result, tell the caller what it heard and then disconnect… Simple. As per the Gather documentation, there are three important receivers that we need to build in our app.

Call Hook -> This is the URL that you would configure on the Twilio portal as part of the number procurement. When Twilio gets a call on this number, it will immediately send the available call details to the ‘call-hook’ URL. For this blog I’m going to use the ‘’. You can replace the URL as per your solution later.

This URL needs to return a TwiML (XML) message with the next steps. I plan to just tell Twilio to gather the speech and send it back to the ‘action’ URL.
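
Here is a minimal sketch of what the call hook could return, written as plain JavaScript without the Twilio helper library. The `/action` and `/partial` paths, the prompt wording and the function names are my own placeholders and not taken from the actual app; `input`, `action`, `method` and `partialResultCallback` are attributes of the Gather verb.

```javascript
// Escape text so it is safe inside XML attributes and element bodies.
function escapeXml(text) {
  return String(text)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&apos;");
}

// Build the TwiML for the call hook: prompt the caller and gather speech.
// actionUrl and partialUrl are hypothetical endpoints in your own app.
function buildCallHookTwiml(actionUrl, partialUrl) {
  return [
    '<?xml version="1.0" encoding="UTF-8"?>',
    "<Response>",
    `  <Gather input="speech" action="${escapeXml(actionUrl)}" method="POST" partialResultCallback="${escapeXml(partialUrl)}">`,
    "    <Say>Please tell me something.</Say>",
    "  </Gather>",
    // Reached only if the Gather ends without any input.
    "  <Say>I did not hear anything. Goodbye.</Say>",
    "</Response>",
  ].join("\n");
}

console.log(buildCallHookTwiml("/action", "/partial"));
```

Whatever web framework you use, the handler just needs to send this string back with an XML content type.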

action -> This is the URL that Twilio will contact once the user has stopped speaking and Twilio has done the speech to text conversion. In my tests this takes less than a second. Twilio will expect TwiML in return telling it what to do next. I intend to just parse the incoming message for SpeechResult and Confidence and speak them back…
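
The action receiver could look roughly like this. It is a sketch that assumes the request body arrives as a form-encoded string (which is how Twilio posts webhooks by default); the function names and the spoken wording are hypothetical, and only SpeechResult and Confidence are read.

```javascript
// Escape text so it is safe inside XML element bodies.
function escapeXml(text) {
  return String(text)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&apos;");
}

// Pull the two fields we care about out of the form-encoded webhook body.
function parseGatherResult(body) {
  const params = new URLSearchParams(body);
  return {
    speechResult: params.get("SpeechResult") || "",
    confidence: Number(params.get("Confidence") || 0),
  };
}

// Build TwiML that reads the result back to the caller and hangs up.
function buildActionTwiml(body) {
  const { speechResult, confidence } = parseGatherResult(body);
  const percent = Math.round(confidence * 100);
  const message = speechResult
    ? `I heard: ${speechResult}. Confidence ${percent} percent.`
    : "Sorry, I could not make out what you said.";
  return [
    '<?xml version="1.0" encoding="UTF-8"?>',
    "<Response>",
    `  <Say>${escapeXml(message)}</Say>`,
    "  <Hangup/>",
    "</Response>",
  ].join("\n");
}

// Example body, shaped like what the action URL would receive.
console.log(buildActionTwiml("SpeechResult=book+a+taxi&Confidence=0.91"));
```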

partialResultCallback -> This is the most exciting part and is similar to Google’s StreamingRecognize solution. Here Twilio sends the text (UnstableSpeechResult) while the user is still speaking, instead of waiting for the user to stop. This feature could be useful if your solution is actually doing some real-time NLP.
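
As a toy illustration of that real-time idea, the sketch below keeps the latest partial transcript per call and does naive keyword spotting on it. The keyword list, the in-memory map and the function name are all hypothetical; it reads only UnstableSpeechResult plus the CallSid that Twilio includes in its webhooks, and a real solution would hand the text to a proper NLP service instead.

```javascript
// Latest partial transcript per call, keyed by CallSid (in-memory, demo only).
const latestPartialByCall = new Map();

// Hypothetical intent keywords, purely for illustration.
const KEYWORDS = ["agent", "balance", "cancel"];

// Handle one partialResultCallback request body (form-encoded string).
// Returns the keywords spotted so far in the unstable transcript.
function handlePartialResult(body) {
  const params = new URLSearchParams(body);
  const callSid = params.get("CallSid") || "unknown";
  const partial = (params.get("UnstableSpeechResult") || "").toLowerCase();

  latestPartialByCall.set(callSid, partial);

  const words = partial.split(/\W+/).filter(Boolean);
  return KEYWORDS.filter((keyword) => words.includes(keyword));
}

// Example: the caller is halfway through a sentence.
console.log(handlePartialResult("CallSid=CA123&UnstableSpeechResult=I+want+to+cancel"));
```

Since the unstable text can change as the caller keeps speaking, anything triggered from it should be treated as a hint rather than a final decision.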

Show me the Code

I have used a simple Swagger-based Node.js app to implement this… You can find the code at 

Hope this helps ….

Call Flow Manager for Skype For Business Server

Another tool that I’m sure others too have been waiting for, but which Microsoft has not delivered.

Andrew Morpeth has blogged about this upcoming tool, which says bye-bye to the PowerShell + non-intuitive LCP combination…

I hope he manages to deliver the Routing Group Designer and IVR Designer for Skype for Business successfully…

In the screenshots currently shown on his blog, the RGS part is definitely much simpler than the native options… The IVR does have to catch up with the capabilities provided by Avaya, Cisco and Genesys… For now, the Lync/S4B community needs such tools to move Lync into the contact center space, which is currently filled with expensive solutions that have more unused features than needed.