Sandeep Raina, Product Marketing director, MYCOM OSI

As CSPs undergo digital transformation – which means offering digital services through a Telco Cloud environment – they face a whole set of operational challenges: operating a new virtualised network, focusing on customer experience more than before, and providing high reliability and availability for upcoming IoT services. Simplifying operations can help tackle these challenges.

The complexity of operations in a Telco Cloud stems from the following:

The high speed at which digital services will be deployed – Digital services require real-time dynamic deployment, adaptation and customisation. Automation of many Operations Centre processes, including monitoring, orchestration, feedback, audit and messaging, is needed to support this.

Running a hybrid network: part virtualised, part physical – Since the process of virtualisation will take 3-4 years to stabilise, extra vigilance is required as new nodes/VNFs are added or removed. Seamless operations will require systems that quickly adapt to network changes, says Sandeep Raina, Product Marketing director, MYCOM OSI.

Dynamic services need constant and consistent management – Policy-based management is required for constant and rapid management, leading to automated, simplified configuration in a virtualised environment.

Additional attention to IoT operations – IoT traffic is expected to run on highly reliable and error-free networks, which sets the expectation that the IoT network, services and devices will have minimal failures. Every new piece of equipment, software and device brings its own failure points, so fault management must be strengthened to keep the number of faults down.

Impact on life-critical or mission-critical communication – In a hyper-connected world, failed devices or connections might not only breach SLAs with massive penalties but, more importantly, impact lives. Although complex mesh topologies with high availability and redundancy will serve to minimise failures, they still require a highly efficient system to discover, interpret and manage faults.

Operations Centres need to be more proactive and predictive – This comes from the need to minimise performance degradations, prevent failures and eliminate critical customer-impacting problems.

Integration and consolidation of OSS components is the first step towards simplification of the Telco Cloud operations. Automation – including machine learning – is the next.

Integration and consolidation: The introduction of NFV, with network functions and services hosted on common resources, inherently helps to achieve the required integration to an extent. Open REST APIs also help in connecting the OSS layers. Hosting OSS functionalities (analytics, automation and SQM) in the cloud can further accelerate the integration of the functionalities the Operations Centre requires. Finally, introducing topology-based root cause analysis integrates services with the underlying network, closing the remediation loop.

Automation: Automating the Operations Centre means encapsulating the best practices for standard operating procedures and using machine learning to derive or improve them. This frees up resources by automating and orchestrating complex processes across multiple domains and functions. Not only does it reduce human error and increase employee productivity, but it also greatly simplifies complex operations involving a large number of processes. The simplification benefits can be reaped by various functions, including planning, optimisation and business teams. The highest level of automation would lead to the desired zero-touch Operations Centre. Building a zero-touch Operations Centre for the Telco Cloud will require the following key steps:

Automating critical OSS actions

Exploiting machine learning for efficiency

Self-healing and optimisation by feedback loop
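The feedback-loop step can be illustrated with a minimal Python sketch. It assumes a KPI where higher is worse (latency, for example), and every name in it (`closed_loop`, `restart_vnf`, `scale_out_vnf`, the simulated `state`) is hypothetical rather than taken from any OSS product. The loop measures the KPI, runs automated actions in order, re-measures after each one, and escalates to a human only if nothing works.

```python
from typing import Callable, List


def closed_loop(measure: Callable[[], float],
                actions: List[Callable[[], None]],
                target: float) -> str:
    """Run remediation actions in order until the KPI returns within target.

    Returns "healthy" if no action was needed, "remediated" if an action
    fixed the problem, and "escalate" if every action was tried without success.
    """
    if measure() <= target:
        return "healthy"
    for action in actions:
        action()                      # automated OSS action
        if measure() <= target:       # feedback: re-measure after acting
            return "remediated"
    return "escalate"                 # hand over to a human operator


# Simulated network state, for illustration only (hypothetical values).
state = {"latency_ms": 80.0}
log: List[str] = []


def restart_vnf() -> None:
    log.append("restart_vnf")         # has no effect in this simulation


def scale_out_vnf() -> None:
    log.append("scale_out_vnf")
    state["latency_ms"] = 30.0        # pretend extra capacity fixes latency


outcome = closed_loop(lambda: state["latency_ms"],
                      [restart_vnf, scale_out_vnf],
                      target=50.0)
```

In this simulation the restart does not help, the scale-out does, and the loop stops there. A real deployment would replace the simulated functions with calls into the monitoring and orchestration systems.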

Here are some suggested use cases for the simplified (Integrated and Automated) zero-touch Operations Centre:

QoS-driven orchestration in hybrid networks: Using integrated performance and fault data on network/services, QoS policies can be derived and applied to orchestrate both physical and virtualised (hybrid) networks. This requires an integrated SQM/automation/orchestration system.
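As a rough illustration of policy-driven orchestration across a hybrid network, the Python sketch below treats a QoS policy as plain data: a KPI, a threshold, and one action for virtualised nodes and another for physical ones. The policy fields, node names, KPI values and action names are all assumptions made for the example, not a description of any particular SQM or orchestration system.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class QosPolicy:
    kpi: str              # KPI the policy watches
    max_value: float      # policy is breached when the KPI exceeds this
    virtual_action: str   # action for a virtualised node (hypothetical name)
    physical_action: str  # action for a physical node (hypothetical name)


def plan_actions(policies: List[QosPolicy],
                 measurements: Dict[str, Dict[str, float]],
                 is_virtual: Dict[str, bool]) -> List[str]:
    """Return "node: action" entries for every breached policy, ordered by node name."""
    planned = []
    for node in sorted(measurements):
        for policy in policies:
            value = measurements[node].get(policy.kpi)
            if value is not None and value > policy.max_value:
                action = (policy.virtual_action if is_virtual[node]
                          else policy.physical_action)
                planned.append(f"{node}: {action}")
    return planned


# Illustrative data only.
policies = [QosPolicy("packet_loss_pct", 1.0, "scale_out", "open_work_order")]
measurements = {
    "vnf-firewall-1": {"packet_loss_pct": 2.5},
    "router-west-3": {"packet_loss_pct": 1.8},
    "vnf-dns-2": {"packet_loss_pct": 0.1},
}
is_virtual = {"vnf-firewall-1": True, "router-west-3": False, "vnf-dns-2": True}

plan = plan_actions(policies, measurements, is_virtual)
```

The same breach leads to different actions depending on the node type: the virtualised firewall is scaled out, while the physical router gets a work order.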

Management of end-to-end IoT: IoT traffic can be managed by using analytics to forecast patterns and prevent IoT network, service and device failures. This includes building dashboards for service availability, incident and unavailability breakdown by location, and geolocation-based service impact.

Prediction of SLA breaches: Machine learning, when integrated with analytics based on performance/fault data, offers a powerful predictive management capability that anticipates problems and helps protect customer SLAs.
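A very simple stand-in for such prediction is trend extrapolation. The Python sketch below is not a machine learning model: it fits a least-squares straight line to recent KPI samples and estimates how many sampling intervals remain before the SLA threshold is crossed. It assumes a KPI where higher is worse and evenly spaced samples, and the function name and sample values are hypothetical.

```python
from typing import Optional, Sequence


def steps_until_breach(samples: Sequence[float], threshold: float) -> Optional[float]:
    """Estimate the sampling intervals left until the KPI crosses `threshold`.

    Fits a least-squares line to the samples (x = 0, 1, 2, ...) and
    extrapolates it. Returns 0.0 if the latest sample already breaches,
    and None if the trend is flat or improving.
    """
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples to fit a trend")
    if samples[-1] >= threshold:
        return 0.0
    mean_x = (n - 1) / 2
    mean_y = sum(samples) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(samples))
    slope = sxy / sxx
    if slope <= 0:
        return None                       # no upward trend, no predicted breach
    intercept = mean_y - slope * mean_x
    crossing_x = (threshold - intercept) / slope
    return max(0.0, crossing_x - (n - 1))  # intervals after the latest sample


# Latency in ms at four consecutive intervals, against a 20 ms SLA (made-up data).
remaining = steps_until_breach([10.0, 12.0, 14.0, 16.0], threshold=20.0)
```

With latency rising by 2 ms per interval from 16 ms, the estimate is two intervals until the 20 ms limit is reached, which gives the Operations Centre time to act before the SLA is breached.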

Service impact analysis and root cause analysis: With SQM integrated with fault data, faster service impact visualisation is possible for the Telco Cloud. Also, by automating root cause analysis, problems can be identified quickly to reduce mean time to repair.
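One way to picture topology-based root cause analysis is as a walk over a dependency graph. The Python sketch below assumes the topology is available as a mapping from each element to the elements it depends on, and that it contains no circular dependencies. An alarmed element is treated as a probable root cause only if nothing it depends on, directly or indirectly, is also alarmed. The function name and the element names are hypothetical.

```python
from typing import Dict, Iterable, List, Set


def probable_root_causes(depends_on: Dict[str, List[str]],
                         alarmed: Iterable[str]) -> Set[str]:
    """Return alarmed elements that have no alarmed element upstream of them."""
    alarmed_set = set(alarmed)

    def has_alarmed_upstream(node: str) -> bool:
        # Depth-first walk over everything `node` depends on.
        seen: Set[str] = set()
        stack = list(depends_on.get(node, []))
        while stack:
            current = stack.pop()
            if current in seen:
                continue
            seen.add(current)
            if current in alarmed_set and current != node:
                return True
            stack.extend(depends_on.get(current, []))
        return False

    return {node for node in alarmed_set if not has_alarmed_upstream(node)}


# Illustrative topology: two services share one compute host.
topology = {
    "video-service": ["vEPC"],
    "vEPC": ["compute-host-7"],
    "voice-service": ["vIMS"],
    "vIMS": ["compute-host-7"],
}
alarms = ["video-service", "vEPC", "compute-host-7", "voice-service"]

roots = probable_root_causes(topology, alarms)
```

Four alarms collapse into a single probable root cause, the shared compute host, even though one intermediate function (`vIMS`) raised no alarm of its own.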

Automating outage recoveries: By automating fault management, network outage recoveries can be accelerated. Additionally, by integrating fault management with the OSS ecosystem (Trouble-Ticket, Inventory, Orchestrators, SQM, CRM, Work Force Management, etc.), problems are reported and solved much faster.

A simplified zero-touch Operations Centre provides many benefits. However, it does require drastic changes in the way OSS components integrate and interact with each other and in how network/service data is visualised and actioned in the Operations Centre. Introducing analytics, machine learning, messaging bots, automated RCA and orchestration will simplify the operational complexities of the hyper-converged network and its services.

