by Juan Penaranda, Corning Optical Communications and Ryan Fontaine, Citadel Analytics
In this article, Corning Optical Communications partnered with Dallas-based Citadel Analytics, which has been deploying an AI platform in multitenant data centers (MTDCs).
Artificial Intelligence and Machine Learning are at the Edge of What?
A quote commonly found on the internet goes: “Knowledge is knowing that a tomato is a fruit. Wisdom is not putting it in a fruit salad.” Machine Learning (ML) would lead to knowing a tomato is a fruit, but Artificial Intelligence (AI) would suggest not putting it in a fruit salad. Jokes aside, ML would eventually pick that up too. There is much more to AI and ML than meets the eye, from language translation to more accurate diagnosis of complex diseases. To give you an idea of how much computing power AI and ML need: in 2017, training one of Baidu’s Chinese speech recognition models required not only four terabytes of training data but also 20 exaflops of compute, or 20 billion billion math operations, across the entire training cycle.
The balance that providers need to strike with AI and ML is delivering the highest quality of service at the lowest cost. How does one provide the highest quality of service? By reducing latency and being able to handle the bandwidth demands of future applications. Latency can be reduced by shortening the physical distance that the data must travel between the device and the processor. Overall, these latency demands are driving smaller data centers closer to where data is created and consumed, which optimizes both transmission costs and quality of service.
The second part of the balance is the cost of running these applications. In traditional architectures, costs rose with the amount of data moved and the distance, or number of “hops,” it traveled. AI and ML have dramatically increased the amount of data being transferred, which has resulted in greater transport costs. Edge data centers are increasingly the answer because of their proximity to where the data is being created, and MTDCs are where a good portion of the edge will be housed. MTDCs offer the lowest risk in deploying a local data center while also offering the fastest path to revenue.
Before we discuss AI, ML, edge data centers, and MTDCs, it’s worth going over definitions to make sure everyone is on the same page.
Artificial Intelligence is the umbrella term that covers all other types of AI/ML. It is typically defined as the theory and development of computer systems able to perform tasks that normally require human intelligence, such as visual perception, speech recognition, decision-making, and translation between languages. The best way to describe these relationships is by visualizing Russian nesting dolls that all fit inside one another. AI is the biggest doll, Machine Learning goes inside that, and deep learning goes inside the Machine Learning doll.
Machine Learning is an application of AI that provides systems the ability to automatically learn and improve from experience without being explicitly programmed.
Edge data centers are facilities that bring the computing and processing powers of data centers closer to where the data is being created, by de-centralizing some of the latency dependent applications from the core data center. Customers are moving to the edge for many reasons including reduced transmission costs, increased quality of service, security, and future-proofing.
Multitenant data centers (MTDCs), also known as colocation data centers, are facilities where organizations can rent space to host their data. MTDCs provide the space and networking equipment to connect an organization to service providers at minimal cost. Businesses can lease this space to meet varying needs, from a single server rack to a complete purpose-built module.
AI and ML are the most revolutionary technologies we have seen since electricity. They are more powerful than the internet and mobility revolutions combined. These technologies are so powerful and so impactful because they make sense of vast amounts of data quickly and efficiently. We live in a data-generating and data-driven world (market analysts estimate that over 80% of the data in existence today was created in the last two years), and without tools to make sense of the data we would drown in it.
To give a quick example, this year the world will create roughly 40 zettabytes of information. That’s 40 TRILLION gigabytes of information. There’s no way humans can make sense of all that information – even if every human worked day and night together, it is mathematically impossible.
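The unit conversion above is easy to check. The short sketch below uses decimal (SI) units and, purely as an illustration, divides the total by a world population of roughly 8 billion people; that population figure and the function names are assumptions, not numbers from Citadel Analytics or Corning.

```python
# Decimal (SI) unit sizes, in bytes.
GIGABYTE = 10**9
TERABYTE = 10**12
ZETTABYTE = 10**21


def zettabytes_to_gigabytes(zettabytes):
    """Convert a whole number of zettabytes to gigabytes."""
    return zettabytes * ZETTABYTE // GIGABYTE


def terabytes_per_person(zettabytes, population):
    """Spread a yearly data volume evenly across a population."""
    return zettabytes * ZETTABYTE / population / TERABYTE


annual_data_zb = 40                 # figure quoted in the article
world_population = 8_000_000_000    # assumption: roughly 8 billion people

total_gb = zettabytes_to_gigabytes(annual_data_zb)
share_tb = terabytes_per_person(annual_data_zb, world_population)

print(f"{total_gb:,} GB created in total")
print(f"{share_tb:.1f} TB per person")
```

Under that assumption, every person on the planet would have to review several terabytes of new data each year, which is why manual analysis does not scale.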
So how do we make sense of all that data? With AI and ML. These technologies LOVE data – it is their oxygen. By using powerful and properly trained AI / ML models we can accurately process vast amounts of information, revealing the very valuable bits of data we need to act on.
A good example of this is MRI ML models. They are trained on MRIs with known outcomes, each confirmed to be cancerous or not cancerous (both positive and negative results). This is called the training phase.
Then a new set of MRIs is loaded into the now trained model and analyzed. This data also contains MRIs with known outcomes, but the model has never seen them before. These new MRIs are called a validation dataset. The data is run through the trained model, and the results are computed and displayed. The results are then evaluated against the performance metrics chosen for that model. If the results are acceptable, the model is considered trained and is ready for more testing/validation or live deployment. If the validation data fails to meet the metrics, we circle back and either redesign the model or give it more data so it is better trained for the next validation test. This is called the validation phase.
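The training and validation phases described above can be sketched in a few lines. This is a deliberately tiny illustration, not a real MRI model: each "scan" is reduced to one synthetic number, the "model" is a single cutoff value, and the 90% accuracy target is a hypothetical metric chosen for the example.

```python
import random


def make_labeled_scans(count, rng):
    """Synthetic stand-in for scans with known outcomes: (score, is_positive)."""
    scans = []
    for _ in range(count):
        is_positive = rng.random() < 0.5
        center = 0.7 if is_positive else 0.3
        scans.append((rng.gauss(center, 0.1), is_positive))
    return scans


def accuracy(threshold, scans):
    """Fraction of scans where 'score above threshold' matches the known label."""
    correct = sum((score > threshold) == label for score, label in scans)
    return correct / len(scans)


def train(scans):
    """Training phase: pick the cutoff that best fits the labeled examples."""
    candidates = [step / 100 for step in range(101)]
    return max(candidates, key=lambda t: accuracy(t, scans))


def validate(threshold, scans, target_accuracy):
    """Validation phase: score the trained model on data it has never seen."""
    achieved = accuracy(threshold, scans)
    return achieved, achieved >= target_accuracy


rng = random.Random(42)
training_set = make_labeled_scans(400, rng)
validation_set = make_labeled_scans(200, rng)   # held out from training
TARGET_ACCURACY = 0.90                          # hypothetical metric

threshold = train(training_set)
achieved, accepted = validate(threshold, validation_set, TARGET_ACCURACY)

if accepted:
    print(f"accuracy {achieved:.2%}: ready for more validation or deployment")
else:
    print(f"accuracy {achieved:.2%}: redesign the model or add training data")
```

The key design point is that the validation set is generated separately and never passed to `train`, mirroring the requirement that validation MRIs have known outcomes but are new to the model.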
The benefits of AI are not always in the areas where most businesses expect. Most businesses that Citadel Analytics has dealt with as clients expect a jump in sales or a big jump in savings and cost reductions through efficiency gains. And while those do happen over time, the biggest initial benefit is employee satisfaction and performance. Citadel Analytics has found that employees who work for a company that understands how to utilize AI / ML tend to be significantly happier with their jobs and want to stay with their current employer at significantly higher rates than employees of companies that do not utilize AI.
This makes a lot of sense because AI / ML is all about automating the “boring” stuff and letting your workforce do what they are good at and passionate about without suffering a drop in efficiency (businesses usually see a spike in productivity). Having happier employees and reducing employee churn is a massive benefit when you utilize AI, but it is often overlooked at first by many businesses.
How to cable and deploy AI / ML?
The traditional problem with AI has been its enormous processing power requirements. Thankfully companies like Nvidia, Intel, AMD, and many more are closing the processing power gap. This enables companies such as BMW, Walmart, Target, and countless more to deploy AI Edge capabilities. This entails installing powerful hardware on premises that will process your on-prem data using the pre-trained model. This dramatically cuts down on latency and the requirement for real-time bandwidth.
However, the problem is that no one can really do an “edge only” AI/ML deployment because while the hardware can handle the data processing of pre-trained models, it’s not powerful enough to train and update the model – much more powerful hardware is needed for that.
This is where the hybrid approach comes in.
In a typical hybrid design, the edge server will process all on-prem data utilizing the trained model. Companies deploying these edge servers will often select MTDCs as the best fit for their key drivers, allowing for flexibility as the network and applications evolve. For the optical infrastructure, MTDCs typically deploy single-mode fiber to enable the end users to scale. For the company deploying AI / ML, it’s important to account for the bandwidth needs of both today and tomorrow in its network. Citadel Analytics typically uses a rule of thumb, which is to take the average bandwidth expected and multiply it by four. That is how much bandwidth a company should be accounting for within its system. Bandwidth is the first thing to get eaten up in real-world AI/ML deployments.
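The rule of thumb above is simple arithmetic, sketched here as a small helper. The function name and the 2.5 Gb/s example value are hypothetical; only the 4x multiplier comes from the article.

```python
def provisioned_bandwidth_gbps(average_gbps, headroom_factor=4):
    """Bandwidth to plan for, given the expected average (4x rule of thumb)."""
    if average_gbps < 0:
        raise ValueError("average bandwidth cannot be negative")
    return average_gbps * headroom_factor


expected_average_gbps = 2.5   # hypothetical site average
planned_gbps = provisioned_bandwidth_gbps(expected_average_gbps)
print(f"Plan for {planned_gbps} Gb/s")
```

Keeping the multiplier as a parameter makes it easy to revisit the estimate if a deployment turns out to need more or less headroom.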
This uptick in bandwidth is also highlighting the need for high-density solutions for both the MTDCs and the end users. The MTDCs maximize their revenue-generating whitespace, while the end user can efficiently utilize the whitespace they’ve invested in. This space and infrastructure can come in many shapes and sizes, but overall, it’s important to look for a provider with product breadth that supports anything you want to do (single-mode or multimode, LC or MTP). For the end user, one way to reduce the total cost of ownership is to increase density and reduce power consumption. This can be enabled through parallel optics and port breakout, where a higher-speed transceiver is broken out into several lower-speed links in the cabling. For example, a single 40G transceiver can be operated as four individual 10G links, which allows for high-density switching. To really take advantage of AI and ML, there will need to be a network of interconnected data centers providing the computing closer to where the data is being created. This creates a need to scale globally with a consistent and modular product that offers a full solution from the edge to the central data centers.
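To make the density gain from port breakout concrete, the sketch below counts how many lower-speed links a switch can serve when each parallel-optics port is broken out. The 32-port switch is a hypothetical example, and the class and function names are illustrative only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BreakoutPort:
    """A parallel-optics port that can be split into equal-speed lanes."""
    port_speed_gbps: int
    lanes: int

    @property
    def lane_speed_gbps(self):
        return self.port_speed_gbps // self.lanes


def breakout_links(port, port_count):
    """Number of individual links available when every port is broken out."""
    return port.lanes * port_count


port_40g = BreakoutPort(port_speed_gbps=40, lanes=4)   # 40G split into 4 x 10G
switch_ports = 32                                      # hypothetical switch size

links = breakout_links(port_40g, switch_ports)
print(f"{switch_ports} x {port_40g.port_speed_gbps}G ports -> "
      f"{links} x {port_40g.lane_speed_gbps}G links")
```

The same faceplate space serves four times as many 10G connections, which is where the density and power savings come from.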
MTDCs are the lowest-risk investment and provide a faster path to revenue because available capacity speeds up deployment. To train and update the model, all the data is sent to the training servers located at a private data center, a primary multitenant data center location, or the cloud. Those training servers will use the new data to automatically train and enhance the accuracy of the model being used.
By shortening the distance between the data and the compute to less than 10 km, providers can often reduce latency by 45%. What does this mean? There will still be a need for a central data center, but there will also be a push for smaller and more regional data centers closer to where the data is being produced. MTDCs will be the main vehicle for these smaller, regional data centers. Edge data centers will be an extension of, or hosted at, these interconnection-dense MTDCs, with both services needing each other to provide a full service for the customer and network. MTDCs with the most interconnected facilities and ecosystem-rich customer mixes will capitalize on the initial benefits of the edge data center opportunity.
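The link between distance and latency can be estimated from how fast light travels in fiber. The sketch below computes propagation delay only, assuming a typical single-mode group index of about 1.468; it ignores switching, queuing, and processing time, so it does not reproduce the 45% figure above. The 10 km and 100 km distances are illustrative.

```python
SPEED_OF_LIGHT_KM_PER_S = 299_792.458   # speed of light in vacuum
FIBER_GROUP_INDEX = 1.468               # assumption: typical single-mode fiber


def one_way_delay_us(distance_km, group_index=FIBER_GROUP_INDEX):
    """Propagation delay over fiber in microseconds (no equipment delay)."""
    return distance_km * group_index / SPEED_OF_LIGHT_KM_PER_S * 1e6


def round_trip_delay_us(distance_km, group_index=FIBER_GROUP_INDEX):
    """Out-and-back propagation delay in microseconds."""
    return 2 * one_way_delay_us(distance_km, group_index)


for distance_km in (10, 100):           # illustrative edge vs. regional distances
    print(f"{distance_km:>4} km: "
          f"{round_trip_delay_us(distance_km):.1f} us round trip")
```

Propagation delay grows linearly with distance, at roughly 5 microseconds per kilometer each way, so moving compute closer to the data removes delay that no amount of faster hardware can recover.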
What can we look forward to?
AI and ML are applications that are here to stay. There is a clear business and human case for the magnitude of revenue and productivity these applications will provide to adopters. The current architecture isn’t optimized for the shift to more bandwidth at the regional level while maintaining the lowest cost and highest quality of service. We will see these applications running at the edge, which for most companies means MTDCs. MTDCs offer the lowest risk and fastest path to revenue for these companies. This will result in a push for more interconnected facilities with higher-density solutions in more locations, rather than in larger central hubs as we’ve seen before.