
Cedexis Solves Avoidable Outages in Real-Time

Portland, Ore. – August 15, 2017 – Cedexis, the leader in crowd-optimized application and content delivery for clouds, CDNs and data centers, today announced the release of its connected Sonar service, which uses low-latency synthetic monitoring to eliminate costly and avoidable outages by ensuring consistent application delivery. Providing exceptional quality of experience (QoE) to application consumers by eliminating outages and slowdowns is at the heart of building a profitable and sustainable cloud-native service, but has proven to be an elusive goal.

Synthetic monitoring uses programmed requests of customer-designated endpoints to validate, on an ongoing basis, that those endpoints are available for use. Until now, the marketplace has offered only two substantial choices:

  • Implement disconnected synthetic monitoring, which requires manual intervention when problems arise. The time from anomaly detection to resolution can range from minutes to hours, often resulting in prolonged outages and slowdowns.
  • Implement cloud vendor-specific synthetic monitoring, which delivers automatic intervention when problems arise. The time from anomaly detection to resolution is measured in just minutes, but is generally restricted to re-routing within that vendor’s cloud infrastructure, resulting in shorter, but still meaningful, outages and slowdowns.

With the release of Sonar, there is now a third, and more effective, option:

  • Implement Sonar connected synthetic monitoring, which delivers an automatic intervention when problems arise. The time from anomaly detection to resolution is measured in just seconds, and traffic can be re-routed across and between substantially any infrastructure (from data center to hosting facility to cloud provider), resulting in the elimination of most outages and slowdowns entirely.

Sonar is able to reduce the MTTR (mean time to repair) – and thus prevent consumer-visible outages and slowdowns – automatically owing to two key characteristics:

  1. Sonar is connected to the broader Cedexis application delivery platform. As such, the data that is collected automatically flows into the Openmix global traffic manager, which is able to adjust its traffic routing decisions in just seconds.
  2. Sonar is configurable to run endpoint tests as frequently as every two seconds, providing up-to-date telemetry moving at the speed of the Internet. By contrast, cloud vendor-specific solutions often limit their testing frequency to 30 – 120 second intervals – far from sufficient to contend with rapidly-evolving global network conditions.
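To make that two-second cadence concrete, here is a minimal sketch, in TypeScript, of what a high-frequency synthetic check loop can look like. The endpoint URL, the interval, and the onStatusChange callback are illustrative assumptions for this post, not the internals of Sonar itself.

// Minimal synthetic monitor sketch: polls an endpoint on a fixed interval
// and reports status changes so a traffic manager could react in seconds.
// The URL, interval, and callback are illustrative, not the Sonar internals.

type Status = "up" | "down";

async function probe(url: string, timeoutMs: number): Promise<Status> {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), timeoutMs);
  try {
    const res = await fetch(url, { method: "HEAD", signal: controller.signal });
    return res.ok ? "up" : "down";
  } catch {
    return "down"; // network error or timeout counts as unavailable
  } finally {
    clearTimeout(timer);
  }
}

function startMonitor(
  url: string,
  intervalMs: number,
  onStatusChange: (status: Status) => void,
): void {
  let last: Status | undefined;
  setInterval(async () => {
    const current = await probe(url, intervalMs);
    if (current !== last) {
      last = current;
      onStatusChange(current); // e.g. push the new status to the traffic manager
    }
  }, intervalMs);
}

// Test every two seconds, mirroring the frequency described above.
startMonitor("https://endpoint.example.com/health", 2000, (status) =>
  console.log(`endpoint is now ${status}`),
);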

“Synthetic monitoring data, as a core input to traffic routing decisions, must be accurate, frequent, and rapidly integrated into algorithms,” said Josh Gray, Chief Architect at Cedexis. “However, data is only as valuable as the actions that it can automatically activate. Updating traffic routing in just seconds is the key to making outages a thing of the past and ensuring unparalleled user experience.”

The updated Sonar synthetic monitoring service enhances the industry-leading actionable intelligence that already powers the Openmix global traffic management engine. The Cedexis application delivery platform (ADP) uniquely uses three different sources of actionable data to ensure the smoothest internet traffic logistics:

  • Radar: the world’s largest community of instantaneous user experience data
  • Fusion: the powerful 3rd party data ingestion tool that makes APM, Local Load Balancer, Cloud metrics, and any other dataset actionable in delivery logic
  • Sonar: a massively scalable and architecture-agnostic synthetic testing tool that is immune to the latency issues of proprietary cloud tools

“No global traffic management platform can provide reliable, real-time traffic shaping decisions without access to accurate, actionable data,” noted Ryan Windham, Cedexis CEO. “The evolution of Sonar to provide industry-leading latency levels confirms our commitment to delivering an end to avoidable outages.”

Which Is The Best Cloud or CDN?

Oh no, you’re not tricking us into answering that directly – it’s probably the question we hear more often than any other. The answer we always provide: it depends.

Unsatisfying? Fair enough. Rather than handing you a fish, let us show you how to go haul in a load of bluefin tuna.

What a lot of people don’t know is that, for free, you can answer this sort of question all by yourself on the Cedexis portal. Just create an account, click through on the email we send, and you’re off to the races (go on – go do it now, we’ll wait…it’s easier to follow along when you have your own account).

The first thing you’ll want to do is find the place where you get all this graphical statistical goodness: click Radar, then select Performance Report, as shown below:

With this surprisingly versatile (and did we mention free) tool, you can answer all the questions you ever had about traffic delivery around the world. For instance, say you’re interested in working out which continents have the best and worst availability. Simply change the drop-down near the top left to show ‘Continent’ instead of ‘Platform’, and voila – an entirely unsurprising result:

Now that’s a pretty broad brush. Perhaps you’d like to know how a particular group of countries or states look relative to one another – simply select those countries or states from the Location section on the right-hand side of the screen and you’re off to the races. Do the same with Platforms (that’s the cloud providers and CDNs), and adjust your view from Availability to Throughput or Latency to see how the various providers are doing when they are Available.

So, if you’re comparing a couple of providers, in a couple of states, you might end up with something that looks like this:

Be careful though – across 30 days, measured day to day, it looks like there’s not much difference to be seen, nor much improvement to be found by using multiple providers. Make sure you dig in a little deeper – maybe to the last 7 days, 48 hours, or even 24 hours. Look what can happen when you focus in on, for instance, a 48-hour period:

There are periods there where having both providers in your virtual infrastructure would mean the difference between serving your audience really well and being, to all intents and purposes, unavailable for business.

If you’ve never thought about using multiple traffic delivery partners in your infrastructure – or have considered it, but rejected it in the absence of solid data – today would be a great day to go poke around. More and more operations teams are coming to the realization that they can eliminate outages, guarantee consistent customer quality, and take control over the execution and cost of their traffic delivery by committing to a Hybrid Cloud/CDN strategy.

And did we mention that all this data is free for you to access?

 

Live and Generally Available: Impact Resource Timing

We are very excited to be officially launching Impact Resource Timing (IRT) for general availability.

IRT is Impact’s powerful window into the performance of different sources of content for the pages in your website. For instance, you may want to distinguish the performance of your origin servers relative to cloud sources, or advertising partners; and by doing so, establish with confidence where any delays stem from. From here, you can dive into Resource Timing data sliced by various measurements over time, as well as through a statistical distribution view.

What is Resource Timing? Broadly speaking, resource timing measures latency within an application (i.e. the browser). It uses JavaScript as the primary mechanism to instrument various time-based metrics for all the resources requested and downloaded for a single website page by an end user. Individual resources are objects such as JS, CSS, images and other files that the website page requests. The faster the resources are requested and loaded on the page, the better the quality of experience (QoE) for users. By contrast, resources that cause longer latency can produce a negative QoE for users. By analyzing resource timing measurements, you can isolate the resources that may be causing degradation issues for your organization to fix.
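For readers who want to see the raw material, here is a small, generic example of reading Resource Timing entries in the browser with TypeScript. It uses only the standard W3C Resource Timing interface; it is not the Impact collection code, and cross-origin resources only expose detailed timings when the server sends a Timing-Allow-Origin header.

// Generic illustration of the W3C Resource Timing API (not Impact's own collection code).
// Each entry describes one resource (JS, CSS, image, etc.) that the page requested.
// Cross-origin resources without Timing-Allow-Origin report 0 for the detailed fields.
const entries = performance.getEntriesByType("resource") as PerformanceResourceTiming[];

for (const entry of entries) {
  console.log({
    resource: entry.name,                                // URL of the resource
    durationMs: entry.duration,                          // total fetch time
    tcpConnectMs: entry.connectEnd - entry.connectStart, // TCP connection time
    ttfbMs: entry.responseStart - entry.requestStart,    // rough time to first byte
  });
}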

Resource Timing Process:

Cedexis IRT makes it easy for you to track resources from identified sources – by domain (*.myDomain.com), by sub-domain (e.g. images.myDomain.com), and by the provider serving your content. In this way, you can quickly group together types of content, and identify the source of any latency. For instance, you might find that origin-located content is being delivered swiftly, while cloud-hosted images are slowing down the load time of your page; in such a situation, you would now be in a position to consider a range of solutions, including adding a secondary cloud provider and a global server load balancer to protect QoE for your users.
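Building on the generic example above, one simple way to approximate that kind of grouping is to bucket entries by hostname. This is purely illustrative; Impact’s grouping is configured in the portal rather than in page code.

// Group Resource Timing entries by hostname and compute an average duration
// per source, to spot which class of content is slowing the page down.
// Illustrative only; Impact's grouping is configured in the Cedexis portal.
const byHost = new Map<string, number[]>();

for (const entry of performance.getEntriesByType("resource") as PerformanceResourceTiming[]) {
  const host = new URL(entry.name).hostname;   // e.g. images.myDomain.com
  const durations = byHost.get(host) ?? [];
  durations.push(entry.duration);
  byHost.set(host, durations);
}

for (const [host, durations] of byHost) {
  const avg = durations.reduce((sum, d) => sum + d, 0) / durations.length;
  console.log(`${host}: ${durations.length} resources, avg ${avg.toFixed(1)} ms`);
}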

Some benefits of tracking Resource Timing:

  • See which hostnames – and thus which classes of content – are slowing down your site.
  • Determine which resources impact your overall user experience.
  • Correlate resource performance with user experience.

Impact Resource Timing from Cedexis allows you to see how content sources are performing across various measurement types such as Duration, TCP Connection Time, and Round Trip Time. IRT reports also give you the ability to drill down further by Service Providers, Locations, ISPs, User Agent (device, browsers, OS) and other filters.

Check out our User Guide to learn more about our Measurement Type calculations.

There are two primary reports in this release of Impact Resource Timing: the Performance report, which gives you a trending view of resource timing over time, and the Statistical Distribution report, which presents Resource Timing data as a statistical distribution. Both reports have very dynamic reporting capabilities that allow you to easily pinpoint resource-related issues for further analysis.


Using the Performance report, you can isolate which grouped resources are causing potential end-user experience issues – by hostname, page or service provider – and when the issue happened. Drill down even further to see whether the issue was global, localized to a specific location, or tied to certain user devices or browsers.

IRT is now available for all in the Radar portal – take it for a spin and let us know your experiences!

Why The Web Is So Congested

If you live in a major city like London, Tokyo, or San Francisco, you learn one thing early: driving your car through the city center is about the slowest possible way to get around. Which is ironic, when you think about it, as cars only became popular because they made it possible to get around more quickly. There is, it seems, an inverse relationship between efficiency and popularity, at least when it comes to goods that pass through a public commons like roads.

Or like the Internet.

Think about all that lovely 4K video you could be consuming if there were nothing between you and your favorite VOD provider but a totally clear fiber optic cable. But unless you live in a highly over-provisioned location, that’s exactly what’s not going on; rather, you’re lucky to get a full HD picture, and even luckier if it stays at 1080p, without buffering, all the way through. Why? Because you’re sharing a public commons – the Internet – and its efficiency is being chewed away by popularity.

Let’s do some math to illustrate this:

  • Between 2013 and January 2017, the number of web users increased by 1.4 billion people to just over 3.7 billion. Today Internet penetration stands at 50% (or, put another way, half the world isn’t online yet)
  • In 2013, the average amount of Internet data per person was 7.9GB per month; by 2015 it was 9.9GB, with Cisco expecting it to reach over 25GB by 2020 – so assume something in the range of 15GB by 2017.
  • Logically, then, in 2013 web traffic would have been around 2.3B * 7.9GB per month (roughly 18.2 exabytes); by 2017 it would have been around 3.7B * 17GB per month (62.9 exabytes)
  • If we assume another billion Internet users by 2020, we’re looking at 4.7B * 25GB per month – or a full 117.5 exabytes (see the quick arithmetic check below)
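Here is that back-of-the-envelope arithmetic in runnable TypeScript, using the figures quoted in the list above (a billion users times one gigabyte per person works out to one exabyte):

// Back-of-the-envelope traffic estimates: users (billions) x GB/person/month
// gives exabytes/month, since 1 billion GB = 1 exabyte.
const scenarios = [
  { year: 2013, usersBillions: 2.3, gbPerPerson: 7.9 },
  { year: 2017, usersBillions: 3.7, gbPerPerson: 17 },
  { year: 2020, usersBillions: 4.7, gbPerPerson: 25 },
];

for (const s of scenarios) {
  const exabytesPerMonth = s.usersBillions * s.gbPerPerson;
  console.log(`${s.year}: ~${exabytesPerMonth.toFixed(1)} EB/month`);
}
// 2013: ~18.2 EB/month, 2017: ~62.9 EB/month, 2020: ~117.5 EB/month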

In just seven years, the monthly web traffic will have grown 600% (based on the math, anyway: Cisco is estimating closer to 200 exabytes monthly by 2020).

And that is why the web is so busy.

But it doesn’t describe why the web is congested. Congestion happens when there is more traffic than transit space – which is why, as cities get larger and more populous, governments add lanes to major thoroughfares, meeting the automobile demand with road supply.

Unfortunately, unlike cars on roads, Internet traffic doesn’t travel in straight lines from point to point. So even though infrastructure providers have been building out capacity at a madcap pace, it’s not always connected in such a way that makes transit efficient. And, unlike roads, digital connections are not built out of concrete, and often become unavailable – sometimes for long enough to cause consternation and PR challenges, and sometimes just for a minute or so, stymying a relative handful of customers.

For information to get from A to B, it has to traverse any number of interconnected infrastructures, from ISPs to the backbone to CDNs, and beyond. Each is independently managed, meaning that no individual network administrator can guarantee smooth passage from beginning to end. And with all the traffic that has been – and will continue to be – added to the Internet, it has become essentially a guarantee that some portion of content requests will bump into transit problems along the way.

Let’s also note that the modern Internet is characterized less by cat memes, and more by the delivery of information, functionality, and ultimately, knowledge. Put another way, the Internet today is all about applications: whether represented as a tile on a smart phone home screen, or as a web interface, applications deliver the intelligence to take the sum total of all human knowledge that is somewhere on the web and turn it into something we can use. When you open social media, the app knows who you want to know about; when you consult your sports app, it knows which teams you want to know about first; when you check your financial app, it knows how to log you in from a fingerprint and which account details to show first. Every time that every app is asked to deliver any piece of knowledge, it is making requests across the Internet – and often multiple requests of multiple sources. Traffic congestion doesn’t just endanger the bitrate of your favorite sci fi series – it threatens the value of every app you use.

Which is why real-time predictive traffic routing is becoming a topic that web native businesses are digging deeper into. Think of it as Application Delivery for the web – a traffic cop that spots congestion and directs content around it, so that it’s as though it never happened. This is the only way to solve for efficient routing around a network of networks without a central administrator: assume that there will be periodic roadblocks, and simply prepare to take a different route.

The Internet is increasingly congested. But by re-directing traffic to the pathways that are fully available, it is possible to get around all those traffic jams. And, actually, it’s possible to do today.

Find out more by reading the story of how Rosetta Stone improved performance for over 60% of their worldwide customers.

 

Amazon Outage: The Aftermath

Amazon AWS S3 Storage Service had a major, widely reported, multi-hour outage yesterday in their US-East-1 data center. The S3 service in this particular data center was one of the very first services Amazon launched when it introduced cloud computing to the world more than 10 years ago. It’s grown exponentially since – storing over a trillion objects and servicing a million requests per second – supporting thousands of web properties (this article alone lists over 100 well-known properties that were impacted by this outage).

Amazon has today published a description of what happened. The summary is that this was caused by human error. One operator, following a published run book procedure, mis-typed a command parameter, setting a sequence of failure events in motion. The outage started at 9:37 am PST. A nearly complete S3 service outage lasted more than three hours, and full recovery of other S3-dependent AWS services took several hours more.

A few months ago, Dyn taught the industry that single-sourcing your authoritative DNS creates the risk the military describes as “two is one, one is none.” This S3 incident underscores the same lesson for object storage. No service tier is immune. If a website, content, service or application is important, redundant alternative capability at all layers is essential. And this requires appropriate capabilities to monitor and manage that redundancy. After all, fail-over capacity is only as good as the system’s ability to detect the need to fail over – and then to actually do it. This has been at the heart of Cedexis’ vision since the beginning, and as we continue to expand our focus in streaming/video content and application delivery, this will continue to be an important and valuable theme as we seek to improve the Internet experience of every user around the world.

Even the very best, most experienced services can fail. And as applications are increasingly decomposed into service-oriented architectures, the deeply nested dependencies between services may not always be apparent. (In this case, for example, the AWS status website had an underlying dependency on S3 and thus incorrectly reported the service at 100% health during most of the outage.)

We are dedicated to delivering data-driven, intelligent traffic management for redundant infrastructure of any type. Incidents like this should continue to remind the digital world that redundancy, automated failover, and a focus on the customer experience are fundamental to the task of delivering on the continued promise of the Internet.

How To Deliver Content for Free!

OK, fine, not for free per se, but using bandwidth that you’ve already paid for.

Now, the uninitiated might ask what’s the big deal – isn’t bandwidth essentially free at this point? And they’d have a point – the cost per Gigabyte of traffic moved across the Internet has dropped like a rock, consistently, for as long as anyone can remember. In fact, Dan Rayburn reported in 2016 seeing prices as low as ¼ of a penny per gigabyte. Sounds like a negligible cost, right?

As it turns out, no. As time has passed, the amount of traffic passing through the Internet has grown. This is particularly true for those delivering streaming video: consumers now turn up their noses at sub-broadcast quality resolutions, and expect at least an HD stream. To put this into context, moving from HD as a standard to 4K (which keeps threatening to take over) would result in the amount of traffic quadrupling. So while CDN prices per gigabyte might drop 25% or so each year, a publisher delivering four times the traffic is still looking at an increasingly large delivery bill.

It’s also worth pointing out that the cost of delivery relative to delivering video through a traditional network, such as cable or satellite, is surprisingly high. An analysis by Redshift for the BBC clearly identifies the likely reality that, regardless of the ongoing reduction in per-terabyte pricing, “IP service development spend is likely to increase as [the BBC] faces pressure to innovate”, meaning that online viewers will be consuming more than their fair share of the pie.

Take back control of your content…and your costs

So, the price of delivery is out of alignment with viewership, and is increasing in practical terms. What’s a streaming video provider to do?

Allow us to introduce Varnish Extend, a solution combining the powerful Varnish caching engine that is already part of delivering 25% of the world’s websites; and Openmix, the real-time user-driven predictive load balancing system that uses billions of user measurements a day to direct traffic to the best pathway.

Cedexis and Varnish have both found that the move to the Cloud left a lot of broadcasters, as well as OTT providers, with unused bandwidth available on premises. By making it easy to transform an existing data center into a private CDN Point of Presence (PoP), Varnish Extend empowers companies to easily make the most of all the bandwidth they have paid for, by setting up Varnish nodes on premises, or on cloud instances that offer lower operational costs than using CDN bandwidth.

This is especially valuable for broadcasters/service providers whose service is limited to one country: the global coverage of a CDN may be overkill, when the same quality of experience can be delivered by simply establishing POPs in strategic locations in-country.

Unlike committing to an all-CDN environment, using a private CDN infrastructure like Varnish Extend supports scaling to meet business needs – costs are based on server instances and decisions, not on the amount of traffic delivered. So as consumer demands grow, pushing for greater quality, the additional traffic doesn’t push delivery costs over the edge of sanity.

A global server load balancer like Openmix automatically checks available bandwidth on each Varnish node as well as each CDN, along with each platform’s performance in real-time. Openmix also uses information from the Radar real user measurement community to understand the state of the Internet worldwide and make smart routing decisions.

Your own private CDN – in a matter of hours

Understanding the health of both the private CDN and the broader Internet makes it a snap to dynamically switch end-users between Varnish nodes and CDNs, ensuring that cost containment doesn’t come at the expense of customer experience – simply establish a baseline of acceptable quality, then allow Openmix to direct traffic to the most cost-effective route that will still deliver on quality.

Implementing Varnish Extend is surprisingly simple (some customers have implemented their private CDN in as little as four hours):

  1. Deploy Varnish Plus nodes within an existing data center or on a public cloud.
  2. Configure Cedexis Openmix to leverage these nodes as well as existing CDNs.
  3. Result: End-users are automatically routed to the best delivery node based on performance, costs, etc.
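To give a flavor of step 2, here is a hedged sketch of the kind of decision logic a global server load balancer applies: prefer the cheaper private PoP whenever it is healthy and within the quality baseline, otherwise fall back to a CDN. The platform names, thresholds, and data shapes are invented for illustration; this is not the Openmix scripting API.

// Hypothetical GSLB decision sketch (not the Openmix API): prefer the cheaper
// private Varnish PoP whenever it is healthy and fast enough, otherwise fall
// back to a commercial CDN.

interface PlatformStatus {
  name: string;
  available: boolean;   // from synthetic health checks
  latencyMs: number;    // from real-user measurements
  costPerGB: number;    // illustrative unit cost
}

function choosePlatform(platforms: PlatformStatus[], maxAcceptableLatencyMs: number): string {
  // Keep only platforms that are up and meet the quality baseline...
  const acceptable = platforms.filter(
    (p) => p.available && p.latencyMs <= maxAcceptableLatencyMs,
  );
  // ...then pick the cheapest of those; if none qualify, fall back to the
  // fastest available platform regardless of cost.
  if (acceptable.length > 0) {
    return acceptable.sort((a, b) => a.costPerGB - b.costPerGB)[0].name;
  }
  const available = platforms.filter((p) => p.available);
  return available.sort((a, b) => a.latencyMs - b.latencyMs)[0]?.name ?? "cdn-backup";
}

// Example: the on-premises Varnish PoP wins as long as it stays within the
// quality baseline; otherwise traffic shifts to the CDN.
console.log(
  choosePlatform(
    [
      { name: "varnish-pop-paris", available: true, latencyMs: 38, costPerGB: 0.001 },
      { name: "cdn-global", available: true, latencyMs: 31, costPerGB: 0.004 },
    ],
    50,
  ),
); // -> "varnish-pop-paris"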

Learn in detail how to implement Varnish Extend

Sign up for Varnish Software – Cedexis Summit in NYC

References/Recommended Reading:

Cedexis Predictions for 2017

Content & Application Delivery Predictions for 2017

It’s that time of the year, already, when we look back and evaluate how the year went – and look forward to prognosticate about what’s right around the corner.

It’s been a huge year for Internet operators, between hacking scandals (and hints of hacking scandals), DDOS takedowns, and the continued mammoth uptake of cloud services and streaming video. We’re back over a billion websites worldwide (the number was first reached in 2014, but then saw a dip, for reasons that remain opaque); the US e-commerce economy seems on pace to exceed $400B; and streaming video is so popular that DirecTV Now is about to launch – not only will you not need a satellite, you’ll simply stream DirecTV over the Internet.

All these things have conspired to boost the amount of traffic on the Internet – one projection says that more traffic will flow through the Internet in 2017 than all prior years combined. And more traffic means more challenges in making sure your content reaches your customers properly. So here are some bold predictions on what will impact your choices as you plan to do battle for bandwidth in the next 12 months:

  • SSL / TLS will be adopted at scale by website publishers: as HTTP2 sees widespread adoption, there is a wide-open window for a broad upgrade to the more secure TLS protocol (seriously, if you’re still on SSL it’s only a matter of time…). Global Content Distribution Networks (CDNs) are investing heavily in expanding and optimizing their SSL/TLS services – and, believe it or not, more than half of all the measurements taken by the Radar community are now over SSL/TLS.
  • Real User Monitoring (RUM) will emerge as a critical Application Performance Management (APM) metric: serverless architectures are making existing APM solutions less valuable, as they simply can’t ‘see’ everything that is happening across the cloud. Instead, companies will turn to RUM: measures of what is actually happening at the point of consumption. Only a clear understanding of the experience being enjoyed (or not!) by the consumer will permit meaningful tuning for all forms of content and application delivery, from file downloads to streaming video to API access – and beyond.
  • Hybrid CDN architectures will gather momentum: cloud ubiquity, scalability and cost effectiveness will drive ‘CDN offload’ scenarios: moving some traffic back to publisher delivery from the CDNs. Increasingly popular off-the-shelf content and application caching solutions like Varnish Software will continue to decrease the complexity of deploying private networks. And as large scale web publishers adopt do-it-yourself (DIY) content and application delivery strategies, enhancing experience by relying less on CDNs will become an industry-wide trend.
  • Content Delivery budgets will shift to Quality of Experience (QoE) spending: consumers don’t care how their content gets to them – but they do care about the experience they receive, and also vote with their feet when dissatisfied. With CDN pricing in decline, and differentiation harder to establish, publishers will feel the pressure to invest in solutions that optimize QoE in order to attract, and retain, their audiences. And as consumers come to expect ever more dynamic and personalized applications, efficient delivery will need to be balanced against QoE. The winners will not be those with the most complex applications, but rather those who can deliver captivating and seamless experiences.
  • Application Delivery becomes QoE driven: call 2017 the Year of the Human, because it will be the year that ‘application health checks’, typically generated synthetically by probes, are recognized as failing to reflect real-world user experience. They will be replaced by RUM measurements that accurately reflect the QoE reality. This will lead to significant investments in APM and Big Data solutions, which will be used to sort through the voluminous data to deliver experiences that delight audiences.

2017 will continue to be an exciting year for everyone involved in content and application delivery performance. We at Cedexis look forward to sharing our thoughts on the evolving space, and the data around what works and what does not, in the coming year.

Network Resilience for the Cloud

If we’ve learned nothing else over the last few weeks, it has been that the Internet is an unruly, inherently insecure network. The ups and downs of Dyn – taken offline by hackers, yet subsequently purchased for what appears to be north of half a billion dollars – remind us that we are still a generation or two away from comfortable consistency. More importantly, they remind cloud businesses that they are relying upon a network over which they have only limited control.

Peter Deutsch and others at Sun Microsystems proposed, a number of years ago, the Fallacies of Distributed Computing. They are:

  1. The network is reliable.
  2. Latency is zero.
  3. Bandwidth is infinite.
  4. The network is secure.
  5. Topology doesn’t change.
  6. There is one administrator.
  7. Transport cost is zero.
  8. The network is homogeneous.

The briefest look through this list tells you that these are brilliantly conceived, and as valid today as they were when first introduced in 1994 (in fairness, number 8 was added in 1997). Indeed, turn them upside down and you can already see the poster to go on every Operations team’s wall, reminding them that:

  1. The Internet is inherently unreliable
  2. Internet latency is a fact of life, and must be anticipated
  3. Bandwidth is shared, precious, and limited
  4. No Internet-connected system is 100% secure
  5. Internet topology changes quicker than the staircases at Hogwarts
  6. There are so many administrators of the Internet there may as well be none
  7. Transport always has a cost – your job is to keep it low
  8. The Internet consists of an infinite number of misfitting pieces

In a recent paper commissioned by Cedexis from The FactPoint Group, a new paradigm is proposed: stop building fault-tolerant systems, and start building failure-tolerant systems. Simply stated, an internally-managed network can be constructed with redundancy and failover capabilities, with a reasonable goal of near-100% consistent service.  Cloud architectures, however, have so many moving parts and interdependencies, that no amount of planning can eliminate failures. Cloud architecture, therefore, requires a design that assumes failures will happen and plans for them.


(Click here if you’d like to read the paper in full)

This means there’s more to this than load balancing – we’re really talking about resource optimization. Take, for example, caching within a private cloud. First, we can use a Global Traffic Manager (GTM) to maximize Quality of Experience (QoE) by routing traffic along the pathways that will deliver content the most quickly and efficiently. Second, we can use intelligent caching to protect against catastrophic failure: a well-tuned Varnish server, for instance, can continue to serve cached content while an unavailable origin server is repaired and put back into service. In a situation where DNS services are down, the GTM can use Real User Measurements (RUM) to spot the problem and direct requests to the right Varnish server (a well-constructed decision set can even contain IP addresses for such emergencies). The Varnish server can check for the availability of its origin and, if DNS problems prevent it from sourcing fresh content, can serve cached content.
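The shape of that layered fallback can be sketched as follows. fetchFromOrigin, readFromCache, and reportHealth are hypothetical helpers standing in for the origin, the Varnish cache, and the GTM health feed; this is neither Varnish VCL nor the Openmix API, just the logic in outline.

// Failure-tolerant content fetch sketch: try the origin, fall back to the
// cache, and report health so the GTM can route future requests elsewhere.
// fetchFromOrigin, readFromCache, and reportHealth are hypothetical helpers.

async function serveContent(
  path: string,
  fetchFromOrigin: (path: string) => Promise<string>,
  readFromCache: (path: string) => Promise<string | undefined>,
  reportHealth: (originUp: boolean) => void,
): Promise<string> {
  try {
    const body = await fetchFromOrigin(path);
    reportHealth(true);               // origin healthy: GTM can keep routing here
    return body;
  } catch {
    reportHealth(false);              // origin failing: GTM should shift new traffic
    const cached = await readFromCache(path);
    if (cached !== undefined) {
      return cached;                  // serve stale content while the origin recovers
    }
    throw new Error(`no origin and no cached copy for ${path}`);
  }
}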

Will this solve for every challenge? Assuredly not – but the multi-layered preparation for failure greatly improves our chances of protecting against extended outages. Meanwhile, the agility adopted by Operations teams as they prepare for failure means a more subtle, sophisticated set of network architectures, which lend themselves to far greater resiliency.

As applications increasingly become a tightly-knit conglomeration of web-connected services and resources, planning for failure is not a choice, it is an imperative. Protecting against the variety of threats to the shared Internet requires agility, forethought, and the zen-like acceptance that failure is inevitable.

APM Tools: 2 MORE reasons to include APM tools in your real-time traffic decisions!

In a previous post we have described how the Cedexis “Fusion” tool allows users to ingest APM data in real time and make it actionable to avoid outages.

As I stated in that posting:

“…there is a persistent need to synthetically monitor specific elements of an application stack to ensure uptime and understand WHAT is failing when something inevitably fails”

To briefly review Fusion, take a glance at this diagram:


So now on to the great news! We are really excited to add 2 more premier partners to our growing Fusion family of APM tool integrations! 

Two new important Fusion partners join the Cedexis community.

Both Azure and AWS (CloudWatch) have been added as APM data feeds to our growing set of APM integration partners. These include great names like New Relic, Rigor, Catchpoint and many others. See the full list here.

These integrations allow the customer to pull in operational data about the CPU, disk, memory and network usage of an instance into Openmix applications. They can then shed or halt traffic if the machine gets overloaded on any of those metrics. So your app stays up. And you can be the on-call hero!


In both cases you can simply configure your servers with whichever APM tool you are using. They will both produce simple JSON feeds that can be ingested into Openmix for real-time decisions. Let’s take a look at the JSON you can expect to get back from each of the tools.

Azure Openmix JSON

The Azure Fusion data feed produces a JSON object that contains Azure Virtual Machine monitoring metrics. The Fusion Azure data structure that will be sent to Openmix currently looks like this:

{  
  "disk_writes_kbps": 0.279,
  "network_in_mb": 3286.48,
  "memory_available_mb": 1479,
  "disk_reads_kbps": 0,
  "memory_available_pct": 88,
  "network_out_mb": 2343.892,
  "cpu_time_pct": 20.5
}

AWS CloudWatch Openmix JSON

The CloudWatch Fusion data feed produces a JSON object that contains AWS CloudWatch Virtual Machine monitoring metrics. The Fusion CloudWatch data structure that will be sent to Openmix currently looks like this:

{  
  "disk_writes_kbps": 0.279,
  "network_in_mb": 3286.48,
  "memory_available_mb": 1479,
  "disk_reads_kbps": 0,
  "memory_available_pct": 88,
  "network_out_mb": 2343.892,
  "cpu_time_pct": 20.5
}

As you can see, both tools return very similar data. In fact, you can imagine deploying servers on both of these cloud vendors, monitoring each with its native tools, and switching between them in real time based on network performance plus server health. And that is, in fact, the point – a sketch of what that switching logic might look like follows below.
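Here is one such sketch, consuming the feed fields shown above. The thresholds and the tie-breaking on round-trip time are invented for illustration; the real policy lives in your own Openmix application, and this is not the Openmix scripting API.

// Sketch of shedding logic driven by the Fusion feed fields shown above.
// Thresholds are illustrative; the real decision policy is up to the operator.

interface FusionHealth {
  disk_writes_kbps: number;
  network_in_mb: number;
  memory_available_mb: number;
  disk_reads_kbps: number;
  memory_available_pct: number;
  network_out_mb: number;
  cpu_time_pct: number;
}

interface CandidatePlatform {
  name: string;          // e.g. "azure-east" or "aws-us-east-1" (illustrative names)
  health: FusionHealth;  // parsed from the JSON feed above
  rttMs: number;         // network performance, e.g. from Radar measurements
}

function isOverloaded(h: FusionHealth): boolean {
  // Shed traffic if CPU is hot or free memory is getting scarce.
  return h.cpu_time_pct > 85 || h.memory_available_pct < 15;
}

function pickPlatform(candidates: CandidatePlatform[]): string | undefined {
  return candidates
    .filter((c) => !isOverloaded(c.health))       // drop overloaded machines
    .sort((a, b) => a.rttMs - b.rttMs)[0]?.name;  // then take the fastest network path
}

// Example: both clouds are healthy, so the one with the lower RTT wins.
const sample: FusionHealth = {
  disk_writes_kbps: 0.279, network_in_mb: 3286.48, memory_available_mb: 1479,
  disk_reads_kbps: 0, memory_available_pct: 88, network_out_mb: 2343.892,
  cpu_time_pct: 20.5,
};
console.log(pickPlatform([
  { name: "azure-east", health: sample, rttMs: 42 },
  { name: "aws-us-east-1", health: sample, rttMs: 55 },
])); // -> "azure-east"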

Fusion – The Key to Multi-Cloud, Hybrid-Cloud Solutions

The Cloud Maturity Model clearly shows that as we evolve up the stack to Multi-Vendor/Multi-Cloud, we will need to be able to monitor the application stack’s performance as part of the real-time switching.


These two new APM integrations provide just the feature set to do that for users who have Azure or AWS in their portfolio of clouds.

Fusion: The glue that allows Cedexis customers to make APM data actionable.

Application Performance Monitoring (or APM) for Cloud applications is a hot topic. APM tools help online businesses figure out what’s going on with their service. The level of analysis that APM tools provide is rich, varied and important. RUM is awesome. Yet, there is a persistent need to synthetically monitor specific elements of an application stack to ensure uptime and understand WHAT is failing when something inevitably fails. At Cedexis, we have developed an integration architecture that allows us to ingest these important pieces of information and use them as part of the traffic routing decision. We call this architectural framework Fusion.


A common use case is an eCommerce company that has multiple CDNs using Dynamic Site Acceleration (DSA). This (smart) eCommerce company also uses Rigor, Keynote or Catchpoint for Shopping Cart Process Validation. In practice, this amounts to the APM tool running through the shopping cart process every so often to validate that the process is working as designed. So, this company can ingest the DSA benchmarks + the APM process checks into Openmix and ensure the best performance and a working shopping cart every time.
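A hedged sketch of how those two signals might be combined in a routing decision follows; the CDN names and field names are invented for illustration, and this is not the Openmix scripting API.

// Combine DSA performance benchmarks with APM shopping-cart checks:
// a CDN is only eligible if the cart flow is passing through it, and the
// fastest eligible CDN wins. Names and fields are illustrative.

interface CdnState {
  name: string;
  dsaResponseMs: number;     // Dynamic Site Acceleration benchmark
  cartCheckPassing: boolean; // synthetic shopping-cart validation from the APM tool
}

function routeRequest(cdns: CdnState[]): string | undefined {
  return cdns
    .filter((c) => c.cartCheckPassing)                     // a working cart is non-negotiable
    .sort((a, b) => a.dsaResponseMs - b.dsaResponseMs)[0]  // then best DSA performance
    ?.name;
}

console.log(routeRequest([
  { name: "cdn-a", dsaResponseMs: 120, cartCheckPassing: false }, // fast, but cart is broken
  { name: "cdn-b", dsaResponseMs: 150, cartCheckPassing: true },
])); // -> "cdn-b"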


Cedexis Fusion is about availability, cost control and options – options to define your own set of DNS decision-routing rules. Fusion addresses questions like:

  • Is your website available to all users?
  • Are your servers at capacity?
  • Are your CDN costs reaching contractual limits (cost control)?
  • Do you want to provide your own data to influence DNS decisions?

Fusion ingests all this data, packages it up and makes it available to Openmix for custom DNS routing decisions. Openmix can route traffic to reduce your CDN costs and to ensure none of your servers are at disk, CPU or memory capacity. If one of these conditions occurs, Openmix can route away from that server, or it can route on your own user-defined data (JSON, XML, HTTP, CSV, text) using a custom HTTP GET data feed. It’s all possible with Cedexis Fusion.

Cedexis has aligned itself with the best tools in the world of synthetic application monitoring. We have pre-built packages for names like Catchpoint, Rigor, Keynote, New Relic and many others (over 20 integrations).

It’s extremely easy to get started. First, go to the Fusion menu under Openmix:

As you can see, the best APM tools in the world already have pre-built integrations. We call these recipes.


Many CDNs also have pre-built integrations that allow our users to gather usage data and use that data to avoid nasty overages. This ensures that a multi-CDN customer can not only have great performance but can do so without it costing an arm and a leg. So, back to our eCommerce customer above – assuming the performance of the two DSA CDNs is within some reasonable range of each other, the traffic can be routed to the CDN that is not threatening to run into an overage. The same type of logic can be used to hit minimum commits, as sketched below.
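Here is a hedged sketch of that overage-avoidance logic; the usage figures and the performance tolerance band are invented for illustration.

// Overage-avoidance sketch: if two CDNs perform within a tolerance band of
// each other, prefer the one with more contractual headroom left this month.

interface CdnUsage {
  name: string;
  responseMs: number;      // current performance benchmark
  usedGB: number;          // usage so far this billing period (from the CDN's Fusion feed)
  contractedGB: number;    // contractual commit / threshold before overage pricing
}

function chooseCdn(cdns: CdnUsage[], toleranceMs: number): string {
  const fastest = [...cdns].sort((a, b) => a.responseMs - b.responseMs)[0];
  // Any CDN within the tolerance band is "good enough" on performance...
  const closeEnough = cdns.filter((c) => c.responseMs - fastest.responseMs <= toleranceMs);
  // ...so among those, pick the one with the most commit headroom remaining.
  return closeEnough.sort(
    (a, b) => (b.contractedGB - b.usedGB) - (a.contractedGB - a.usedGB),
  )[0].name;
}

console.log(chooseCdn([
  { name: "cdn-a", responseMs: 95, usedGB: 480, contractedGB: 500 },   // near overage
  { name: "cdn-b", responseMs: 110, usedGB: 200, contractedGB: 500 },  // lots of headroom
], 25)); // -> "cdn-b"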

The provisioning process for Fusion is quite simple. You choose a provider, add your credentials for security and then gather the data that the APM tool or CDN is delivering. Cedexis then automatically distributes that data to its global network of workers that allow your decision scripts to utilize the data in real time.


For example, if you are monitoring your site with New Relic you might be ingesting this data into Openmix for DNS decisions.


When implementing Fusion, these activities should always be considered:

  • You must configure the desired external data access points – i.e. define where the data is coming from.
  • Ingest and transform external data for use in custom decision-routing scripts.
  • Make external data available for real-time DNS routing decisions.

Fusion is indicative of the Cedexis philosophy – provide best-of-breed service on our core platform (Openmix) and integrate with the best products in the industry to provide our clients with top performance and reliability.

To learn more about Fusion, check out our website.