10 Ways to Make Your Outage Emergency Room Fun

Originally published on the DevOps Digest website. By Andrew Marshall, Director of Product Marketing at Cedexis


It’s 3:47am. You and the rest of the Ops team have been summoned from your peaceful slumber to mitigate an application delivery outage. As you catch up on the frantic emails, Slack chats, and text messages from everyone on your international sales team (Every. Single. One.), your mind races as you switch into problem-solving mode. It’s time to start thinking about how to make this mitigation FUN!


No need to rub it in, but … if you had turned on a software-defined application delivery platform for your hybrid infrastructure, you’d all be sound asleep right now. Automated real-time delivery decisions and failover would be nice, right? Just sayin’.


Your coworkers’ opinions on gaming consoles, Spider-Man movies, and music differ from yours! The perfect animated GIF will remind them of your erudite tastes while you have their attention. Extra credit if you can work in some low-key shade about their not listening to your equally sophisticated opinions on optimized outage mitigation.


Oh hey, look at that! Your cloud provider’s health dashboard page says everything is fine…because it’s powered by the services that went down. Help your team vent their creative energy (and frustration) with some fun customized MS Paint updates to the offending page. Bonus points for art that reminds everyone of the value of a multi-cloud strategy powered by a programmable application delivery platform.


Tired of “learning lessons” from these emergency room drills? You depend on NGINX for your local load balancing (LLB), but you don’t have a way to use those LLB health metrics and data to automate global delivery. Disjointed delivery intelligence means you don’t know how your apps will land with users. You need an end user-centric approach to app delivery that automates the best delivery path and ensures failover is in place at all times. Micro-outages often fly under your passive monitoring radar, but that doesn’t mean your users don’t notice them. An active, integrated app delivery approach re-routes traffic automatically before you lose business. Post-mortems are fun… but making sure your apps survive the last mile is arguably more fun.


“Sales needs to sell more stuff.” You’ll feel better.


Sure, your Mode 1 ADC hardware was a sunk cost, so you’re stuck with it for a while. But you’re one unnecessary emergency closer to having a fully software-defined application delivery platform for your hybrid cloud. And now you’re even closer. And closer … Tick. Tock.


Probably best to do this during normal work hours. User experience data from around the world can detect degrading, sluggish resources in real time, and user-centric app delivery logic powered by RUM can make quick re-routing decisions automatically. No more getting woken up after the application crashes. While you’re wishing you had RUM on your side, you can look up some fun facts about the countries experiencing app outages. Did you know Luxembourgish is an official language of Luxembourg?


You too can be the weird colleague who sends emails with crazy, middle-of-the-night time stamps.


Browse around to see what you could have purchased with the money you were just forced to spend on unplanned cloud instance provisioning in order to keep your app running. That desktop Nerf missile launcher (or 700 of them) would have been pretty nice.


You’ve just proven it’s not that much fun after all. Don’t just dump everything onto one cloud and call it done. Clouds go down for so many reasons. Use an application delivery platform control layer to build in the capability to auto-switch to an available resource, while you sleep soundly. Running on multi-cloud without an abstracted control layer removes most of the value of the cloud. Swear off the game of chance. Out loud. Right now.


More information:

Announcing Cedexis Netscope: Advanced Network Performance and Benchmarking Analysis

The Cedexis Radar community collects tens of billions of real user monitoring data points each day, giving Cedexis users unparalleled insight into how applications, videos, websites, and large file downloads are actually being experienced by their users. We’re excited to announce a product that offers a new lens into the Radar community dynamic data set: Cedexis Netscope.

Know how your service stacks up, down to the IP subnet
Metrics like network throughput, availability, and latency don’t tell the whole story of how your service is performing, because they are network-centric, not user-centric: however comprehensively you track network operations, what matters is the experience at the point of consumption. Cedexis Netscope provides you with additional user-centric context to assess your service, namely the ability to compare your service’s performance to the results of the “best” provider in your market. With up-to-date Anonymous Best comparative data, you’ll have a data-driven benchmark to use for network planning, marketing, and competitive analysis.

Highlight your Service Performance:

  • Relative to peers in your markets
  • In specific geographies
  • Compared with specific ISPs
  • Down to the IP subnet
  • Including both IPv4 and IPv6 addresses
  • With comprehensive data on latency and throughput
  • Covering both static and dynamic delivery

Actionable insights
Netscope provides detailed performance data that can be used to improve your service for end users. IT Ops teams can use automated or custom reports to view performance from your ASN versus peer groups in the geographies you serve. This lets you fully understand how you stack up versus the “best” service provider, using the same criteria. Real-time logs organized by ASN can be used to inform instant service repairs or for longer-term planning.

Powered by the world’s largest user experience community
Real User Monitoring (RUM) means fully understanding how internet performance impacts customer satisfaction and engagement. Cedexis gathers RUM data from each step between the client and any of the clouds, data centers, and CDNs hosting your applications to build a holistic picture of internet health. Every request creates more data, continuously updating this unique real-time virtual map of the web.

Data and alerts, your way
To effectively evaluate your service and enable real-time troubleshooting, Netscope lets you roll up data by the ASN, country, region, or state level. You can zoom in within a specific ASN at the IP subnet level, to dissect the data in any way your business requires. This data will be stored in the cloud on an ongoing basis. Netscope also allows users to easily set up flexible network alerts for performance and latency deviations.
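Under the hood, a deviation alert is conceptually simple: keep a rolling baseline, and flag samples that stray too far from it. Here’s a minimal Python sketch of that general technique; to be clear, this is an illustration of the idea, not Netscope’s actual implementation or API:

```python
from collections import deque

class LatencyDeviationAlert:
    """Fire when a latency sample deviates from a rolling baseline."""

    def __init__(self, window=100, threshold_factor=1.5):
        self.samples = deque(maxlen=window)   # rolling window of samples (ms)
        self.threshold_factor = threshold_factor

    def observe(self, latency_ms):
        """Record one sample; return True if it should raise an alert."""
        triggered = False
        if len(self.samples) == self.samples.maxlen:
            baseline = sum(self.samples) / len(self.samples)
            triggered = latency_ms > baseline * self.threshold_factor
        self.samples.append(latency_ms)
        return triggered

# Example: with a short window, a spike to 120 ms against a ~43 ms
# baseline crosses the 1.5x threshold and fires.
alert = LatencyDeviationAlert(window=4, threshold_factor=1.5)
for sample in [42, 45, 44, 43, 120]:
    if alert.observe(sample):
        print(f"latency deviation: {sample} ms")
```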

Netscope helps ISP Product Managers and Marketers better understand:

  • How well users connect to the major content distributors
  • How well users/businesses connect to public clouds (AWS, Google Cloud, Azure, etc.)
  • When, where, and how often outages and throughput issues happen
  • What happens during different times of day
  • Where the risks are during big events (FIFA World Cup, live events, video/content releases)
  • How service on mobile looks versus web
  • How the ISP stacks up vs. “the best” ISP in the region

Bring Advanced Network analysis to your network
Netscope provides a critical data set you need for your network planning and enhancement. With its real-time understanding of worldwide network health, Netscope gives you the context and actionable data you need to delight customers and increase your market share.

Ready to use this data with your team?

Set up a demo today


The Cloud Is Coming

Still think the cloud (or should that be The Cloud?) is a possible-but-not-definite trend? Take a look at IDC’s projection of IT deployment types:

[Chart: IDC projection of IT deployment types. Credit: Forbes]

So much to unpack! What really jumps out is that:

  • Traditional data centers drop in share, but hang in there around 50%: self-managed hardware will be a fact of life as far out as we can see
  • Public cloud will double by 2021, but it isn’t devouring everything, because in the final analysis no Operations team wants to give up all control
  • Private cloud expands rapidly, as the skills to use the technology become more widespread
  • But most importantly… in the very near future, almost every shop will likely be running a hybrid network, which combines traditional data centers, private cloud deployments, public clouds for storage and computation, and CDNs for delivery (don’t forget that Cisco famously predicted over half of all Internet traffic would traverse a CDN by the year after next)

It’s a brave new world, indeed, that has so many options in it.

If it is true, though, that cloud computing will be a $162B-a-year business by 2020 (per Gartner), and that 74% of technology CFOs say cloud computing will have the most measurable impact on their business in 2017, then this year will end up having been one of upheaval and transformation. As ever more complex permutations of public/private infrastructure hit the market, the challenge of keeping everything straight will rapidly multiply: can one truly be said to be optimizing without centralized tracking and traffic management for all resources, whether they sit in your own NOC, under Amazon’s tender care in Virginia, or at some unidentified POP somewhere in Western Europe?

The truth is that, as with all transformations, this move to hybrid networks will be marked by the classic Hype Cycle.

We are fast approaching the Peak of Inflated Expectations; the sudden fall into the Trough of Disillusionment will be precipitated by the realization that there are now so many different sources of computation in the mix that nobody is quite sure where the savings are. Perhaps we’re saving money by using different CDNs in different geographies – but it’s hard to tell if we’re balancing for economic benefit; perhaps we’re making the right move by storing all our images on a global cloud, but it’s hard to tell whether adding a second (with the inevitable growth in storage fees) would result in faster audience growth; perhaps we’re right to avoid sending content requests back to origin, but at the same time, that seems like a lot of resources to not use.

The Slope of Enlightenment will hit when the tools come along to put all the metrics of all the elements of the hybrid network onto a single pane: balancing between nodes that are, at an abstract level at least, equally measurable, configurable, and tunable will start us down the path to the Plateau of Productivity.

The Cloud is coming; how long we spend in the Trough of Disillusionment trying to figure out how to make it hum like a well-oiled machine is assuredly on us.

Amazon Outage: The Aftermath

Amazon’s AWS S3 storage service had a major, widely reported, multi-hour outage yesterday in its US-East-1 region. The S3 service in this particular region was one of the very first services Amazon launched when it introduced cloud computing to the world more than 10 years ago. It has grown exponentially since, storing over a trillion objects, servicing a million requests per second, and supporting thousands of web properties (this article alone lists over 100 well-known properties that were impacted by this outage).

Amazon has today published a description of what happened. In summary, the outage was caused by human error: one operator, following a published runbook procedure, mistyped a command parameter, setting a sequence of failure events in motion. The outage started at 9:37 am PST. A nearly complete S3 service outage lasted more than three hours, and full recovery of other S3-dependent AWS services took several hours more.

A few months ago, Dyn taught the industry that single-sourcing your authoritative DNS creates the risk the military describes as “two is one, one is none.” This S3 incident underscores the same lesson for object storage. No service tier is immune. If a website, content, service, or application is important, redundant alternative capability at every layer is essential, and that in turn requires the tooling to monitor and manage the redundancy. After all, failover capacity is only as good as the system’s ability to detect the need to fail over, and then to actually do it. This has been at the heart of Cedexis’ vision since the beginning, and as we continue to expand our focus in streaming/video content and application delivery, it will remain an important and valuable theme as we seek to improve the Internet experience of every user around the world.
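To make that concrete, here’s a minimal Python sketch of the detect-and-failover idea for redundant object storage. The endpoints are hypothetical, and a real deployment would put this logic in a traffic management layer rather than in application code:

```python
import urllib.request

# Hypothetical redundant object-storage origins, in order of preference.
ORIGINS = [
    "https://assets.us-east.example.com",
    "https://assets.us-west.example.com",
    "https://assets.backup-provider.example.net",
]

def healthy(origin, timeout=2):
    """Active health check: can we fetch a tiny known object quickly?"""
    try:
        with urllib.request.urlopen(origin + "/health.txt", timeout=timeout) as resp:
            return resp.status == 200
    except OSError:
        return False

def pick_origin():
    """Return the first healthy origin; None signals a multi-provider outage."""
    for origin in ORIGINS:
        if healthy(origin):
            return origin
    return None
```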

Even the very best, most experienced services can fail. And with the increasing decomposition of service-oriented architectures, the deeply nested dependencies between services may not always be apparent. (In this case, for example, the AWS status website had an underlying dependency on S3, and thus incorrectly reported the service at 100% health during most of the outage.)

We are dedicated to delivering data-driven, intelligent traffic management for redundant infrastructure of any type. Incidents like this should continue to remind the digital world that redundancy, automated failover, and a focus on the customer experience are fundamental to the task of delivering on the continued promise of the Internet.

How To Deliver Content for Free!

OK, fine, not for free per se, but using bandwidth that you’ve already paid for.

Now, the uninitiated might ask what’s the big deal – isn’t bandwidth essentially free at this point? And they’d have a point – the cost per Gigabyte of traffic moved across the Internet has dropped like a rock, consistently, for as long as anyone can remember. In fact, Dan Rayburn reported in 2016 seeing prices as low as ¼ of a penny per gigabyte. Sounds like a negligible cost, right?

As it turns out, no. As time has passed, the amount of traffic passing through the Internet has grown. This is particularly true for those delivering streaming video: consumers now turn up their noses at sub-broadcast-quality resolutions, and expect at least an HD stream. To put this into context, moving from HD as a standard to 4K (which keeps threatening to take over) would quadruple the amount of traffic. So while CDN prices per gigabyte might drop 25% or so each year, a publisher delivering four times the traffic is still looking at an increasingly large delivery bill.

It’s also worth pointing out that the cost of delivery, relative to delivering video through a traditional network such as cable or satellite, is surprisingly high. An analysis by Redshift for the BBC clearly identifies the likely reality that, regardless of the ongoing reduction in per-terabyte pricing, “IP service development spend is likely to increase as [the BBC] faces pressure to innovate”, meaning that online viewers will be consuming more than their fair share of the pie.

Take back control of your content…and your costs

So, the price of delivery is out of alignment with viewership, and is increasing in practical terms. What’s a streaming video provider to do?

Allow us to introduce Varnish Extend, a solution that combines the powerful Varnish caching engine, already part of delivering 25% of the world’s websites, with Openmix, the real-time, user-driven predictive load balancing system that uses billions of user measurements a day to direct traffic to the best pathway.

Cedexis and Varnish have both found that the move to the cloud left a lot of broadcasters and OTT providers with unused bandwidth available on premises. By making it easy to transform an existing data center into a private CDN point of presence (PoP), Varnish Extend empowers companies to make the most of the bandwidth they have already paid for, by setting up Varnish nodes on premises, or on cloud instances that offer lower operational costs than CDN bandwidth.

This is especially valuable for broadcasters/service providers whose service is limited to one country: the global coverage of a CDN may be overkill, when the same quality of experience can be delivered by simply establishing POPs in strategic locations in-country.

Unlike committing to an all-CDN environment, using a private CDN infrastructure like Varnish Extend supports scaling to meet business needs – costs are based on server instances and decisions, not on the amount of traffic delivered. So as consumer demands grow, pushing for greater quality, the additional traffic doesn’t push delivery costs over the edge of sanity.

A global server load balancer like Openmix automatically checks the available bandwidth on each Varnish node and each CDN, along with each platform’s performance, in real time. Openmix also uses information from the Radar real user measurement community to understand the state of the Internet worldwide and make smart routing decisions.

Your own private CDN – in a matter of hours

Understanding the health of both the private CDN and the broader Internet makes it a snap to dynamically switch end-users between Varnish nodes and CDNs, ensuring that cost containment doesn’t come at the expense of customer experience – simply establish a baseline of acceptable quality, then allow Openmix to direct traffic to the most cost-effective route that will still deliver on quality.

Implementing Varnish Extend is surprisingly simple (some customers have implemented their private CDN in as little as four hours):

  1. Deploy Varnish Plus nodes within an existing data center or on a public cloud.
  2. Configure Cedexis Openmix to leverage these nodes as well as existing CDNs.
  3. Result: end users are automatically routed to the best delivery node based on performance, cost, and so on.
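For a flavor of what the routing logic in step 2 boils down to, here’s a hedged Python sketch with made-up platform names and capacity figures (production Openmix applications are written on the Openmix platform itself, not in this form):

```python
# Hypothetical platforms: two on-premises Varnish nodes plus a CDN fallback.
PLATFORMS = {
    "varnish-paris": {"type": "varnish", "capacity_mbps": 900},
    "varnish-lyon":  {"type": "varnish", "capacity_mbps": 600},
    "cdn-global":    {"type": "cdn",     "capacity_mbps": float("inf")},
}

def route(live_stats):
    """Prefer a healthy on-prem node with bandwidth headroom; overflow to CDN.

    live_stats maps platform name to real-time measurements, e.g.
    {"ok": True, "used_mbps": 410.0, "latency_ms": 23.0}.
    """
    candidates = [
        name for name, meta in PLATFORMS.items()
        if meta["type"] == "varnish"
        and live_stats[name]["ok"]
        and live_stats[name]["used_mbps"] < 0.8 * meta["capacity_mbps"]
    ]
    if candidates:
        # Among nodes with headroom, pick the best-performing one.
        return min(candidates, key=lambda name: live_stats[name]["latency_ms"])
    return "cdn-global"   # the pay-per-GB safety valve
```

The key design point: the decision prefers sunk-cost, on-premises capacity while it has headroom, and only spills over to metered CDN bandwidth when it must.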

Learn in detail how to implement Varnish Extend

Sign up for Varnish Software – Cedexis Summit in NYC


Mobile Video is Devouring the Internet

In late 2009 – fully two years after the introduction of the extraordinary Apple iPhone – mobile was barely discernible on any measurement of total Internet traffic. By late 2016, it finally exceeded desktop traffic volume. In a terrifyingly short period of time, mobile Internet consumption moved from an also-ran to a behemoth, leaving behind the husks of marketing recommendations to “move to Web 2.0” and to “design for Mobile First”. And along the way, Apple encouraged us to buy into the concept that the future (of TV at least) is apps.

Unsurprisingly, the key driver of all this traffic is – as it always is – video. One in every three mobile device owners watches videos of at least 5 minutes’ duration, which is generally considered the point at which the user has moved from short-form, likely user-generated, content to premium video (think: TV shows and movies). And once viewers pass the 5-minute mark, it’s a tiny step to full-length, studio-developed content, which is a crazy bandwidth hog. Consider that video is expected to represent fully 75% of all mobile traffic by 2020, up from just 55% in 2015.

As consumers get more interested in video, producers aren’t slowing down. By 2020, it is estimated that it would take an individual fully 5 million years to watch the video being published and made available in just a month. And while consumer demand varies around the world – 72% of Thailand’s mobile traffic is video, for instance, versus just 41% in the United States – the reality is that, without some help, the mobile Web is going to be straining under the weight of near-unlimited video consumption.

What we know is that, hungry as they are for content, streaming video consumers are fickle and impatient. Akamai demonstrated years ago the 2-second rule: if a requested piece of content isn’t available in under 2 seconds, Internet users simply move on to the next thing. And numerous studies have shown definitively that when re-buffering (the dreaded pause in playback while the viewing device downloads the next section of the video) exceeds just 1% of viewing time, audience engagement collapses, resulting in dwindling opportunities to monetize content that was expensive to acquire, and can be equally costly to deliver.

How big of a problem is network congestion? It’s true that big, public, embarrassing outages across CDNs or ISPs are now quite rare. However, when we studied the network patterns of one of our customers, we found that what we call micro-outages (outages lasting 5 minutes or less) happen literally hundreds to thousands of times a day. That single customer was looking at some 600,000 minutes of directly lost viewing time per month – and when you consider how long each viewer might have stayed, and their decreased inclination to return in the future, that number likely translates to several million minutes lost indirectly.

While mobile viewers are more likely to watch their content through an app (48% of all mobile Internet users) than a browser (18%), they still receive the content through the chaotic maelstrom of a network that is the Internet. As such, providers have to work out the best pathways to use to get the content there, and to ensure that the stream will have consistency over time so that it doesn’t fall prey to the buffering bug.

Most providers use stats and analysis to work out the right pathways – so they can look at how various CDN/ISP combos are working, and pick the one that is delivering the best experience. Strikingly, though, they often have to make routing decisions for audience members who are in geographical locations that aren’t currently in play, which means choosing a pathway without any recent input on which is going to be the best pathway – this is literally gambling with the experience of each viewer. What is needed is something predictive: something that will help the provider to know the right pathway the first time they have to choose.

This is where the Radar Community comes in: by monitoring, tracking, and analyzing the activity of billions of Internet interactions every day, the community knows which pathways are at peak health, and which need a bit of a breather before getting back to full speed. So, when using Openmix to intelligently route traffic, the Radar community data provides the confidence that every decision is based on real-time, real-user data – even when, for a given provider, they are delivering to a location that has been sitting dormant.

Mobile video is devouring the Web, and will continue to do so, as consumers prefer their content to move, dance, and sing. Predictively re-routing traffic in real-time so that it circumvents the thousands of micro-outages that plague the Internet every day means never gambling with the experience of users, staying ahead of the challenges that congestion can bring, and building the sustainable businesses that will dominate the new world of streaming video.

How to Make Cloud Pay Its Own Way

RightScale came out with a wonderful report on the state of the cloud industry, and we learned some important new things:

  • 77% of organizations are at least exploring private cloud implementations
  • 82% of enterprises are executing a hybrid cloud strategy
  • 26% of respondents now list cost as a significant challenge – ironically, given the importance of cost-cutting in the early growth of cloud services

The growth in hybrid cloud adoption is particularly striking: by RightScale’s count, only 6% of companies are exclusively looking at private cloud and 18% exclusively at public cloud, while a full 71% have a toe dipped into each pool.

Meanwhile, Cisco estimates that two thirds of all Internet traffic will traverse at least one content delivery network by 2020 – which tends to imply that most organizations are, right now, invested in getting the most out of some combination of private cloud, public cloud, CDN, and, presumably, physically-managed data center.

Fundamentally, there are a few core ways that we see organizations using this market basket of delivery pathways – and, naturally, our Openmix global server load balancer – to better serve their customers, and to protect their economics as demand grows, apparently insatiable. The core strategies are:

  1. Balance CDNs, offload to origin. For web-centric businesses, delivering content across the Internet is fundamental to their success (possibly their survival), so they tend to rely upon one or more CDNs to get content to their users effectively. Over time, they tend to expand the number of CDN relationships, in order to improve quality across geographies, and to make the most of pricing differences between providers. Once they get this set to equilibrium, they discover that there is unused capacity at origin (or within a private or public cloud instance) to which they can offload traffic, maximizing the return they get on committed capacity, and minimizing unnecessary spend.
  2. Balance clouds, offload to CDN. For businesses that are highly geographically-focused, it is often more effective to create what is essentially a self-managed CDN, establishing PoPs through cloud providers in population centers where their customers actually originate. Even the most robust internally-managed system, however, is subject to traffic spikes that are way beyond expectations (and committed throughput limits), and so these companies build relationships with CDNs in which excess traffic is offloaded at peak times.
  3. Balance Hybrid Cloud. Organizations at the far right of Rightscale’s cloud maturity scale (in their words, the Cloud Explorers and Cloud Focused) are starting to view each of the delivery options not as wildly distinct options, but merely as similar-if-different-looking cogs in the machine. As such, they look at load and cost balancing through a pragmatic prism, in which each user is simply served through the lowest cost provider, so long as it can pass a pre-defined quality bar (a specified latency rate, for instance, or a throughput level). By shifting the mindset away from ‘primary’ and ‘offload’ networks, organizations are able to build strategies that optimize for both cost and quality.

Of course, balancing traffic across a heterogeneous set of delivery networks (and provider types), while adjusting for a combination of economic and quality-of-service metrics, requires three things (sketched in code after the list):

  1. Real-time visibility of the state of the Internet beyond the view of the individual publisher, in order to be able to evaluate Quality of Service levels prior to selecting a delivery provider
  2. Real-time visibility into the current economic situation with each contracted provider: which offers the lowest cost option, based on unit pricing, contract commitments, and so forth
  3. Real-time traffic routing, which takes the data inputs, compares them to the unique requirements of the requesting publisher, and seamlessly directs traffic along the right pathway
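Put together, those three inputs reduce the per-request decision to something remarkably small. Here’s a minimal Python sketch, with illustrative names and thresholds rather than a production algorithm:

```python
def choose_provider(providers, quality_bar_ms=150):
    """Serve each request via the cheapest provider that clears the quality bar.

    providers: dicts built from real-time RUM and contract data, e.g.
    {"name": "cdn-a", "available": True, "latency_ms": 80, "cost_per_gb": 0.011}.
    """
    eligible = [
        p for p in providers
        if p["available"] and p["latency_ms"] <= quality_bar_ms
    ]
    if not eligible:
        # Nobody clears the bar: degrade gracefully to the best performer.
        available = [p for p in providers if p["available"]] or providers
        return min(available, key=lambda p: p["latency_ms"])
    return min(eligible, key=lambda p: p["cost_per_gb"])
```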

Not an easy recipe, perhaps, but once assembled, it creates the opportunity to apply sophisticated algorithms to delivery – in effect, to exercise a Wall Street-level arbitrage approach that results in a combination of delighted customers and reduced infrastructure costs.

Or, put another way, the opportunity to make your hybrid cloud strategy pay for itself – and more.

To find out more about real-time predictive traffic routing, please take a look around our Openmix pages, read about how to deliver 100% availability with a Hybrid CDN architecture, and visit our Github repository to see how easy it is to build your own real-time load balancing algorithm.

Make Mobile Video Stunning with Smart Load Balancing

If there’s one thing about which there is never an argument, it’s this: streaming video consumers never want to be reminded that they’re on the Internet. They want their content to start quickly, play smoothly and uninterrupted, and be visually indistinguishable from traditional TV and movies. Meanwhile, the majority of consumers in the USA (and likely a similar proportion worldwide) prefer to consume their video on mobile devices. And as if that weren’t challenging enough, there are now suggestions that live video consumption will grow – according to Variety, by as much as 39 times! That seems crazy until you consider that Cisco predicted video would represent 82% of all consumer Internet traffic by 2020.

It’s no surprise that congestion can result in diminished viewing quality, leading over 50% of all consumers to, at some point, experience buffer rage from the frustration of not being able to play their show.

Here’s what’s crazy: there’s tons of bandwidth out there – but it’s stunningly hard to control.

The Internet is a best-effort environment, over which even the most effective Ops teams can wield only so much control, because so much of it is either resident with another team, or is simply somewhere in the amorphous ‘cloud’. While many savvy teams have sought to solve the problem by working with a Content Delivery Network (CDN), the sheer growth in traffic has meant that some CDNs are now dealing with as much traffic as the whole Internet transferred just a few years ago… and are themselves now subject to their own congestion and outage challenges. For this reason, plenty of organizations now contract with multiple CDNs, as well as placing their own virtual caching servers in public clouds, and even deploying their own bare-metal CDNs in data centers where their audiences are centered.

With all these great options for delivering content, Ops teams must make real-time decisions on how to balance the traffic across them all. The classic approaches to load balancing have been (with many thanks to NGINX):

  • Availability – Any servers that cannot be reached are automatically removed from the list of options (this prevents total link failure).
  • Round Robin – Requests are distributed across the group of servers sequentially.
  • Least Connections – A new request is sent to the server with the fewest current connections to clients. The relative computing capacity of each server is factored into determining which one has the least connections.
  • IP Hash – The IP address of the client is used to determine which server receives the request.

You might notice something each of those has in common: they all focus on the health of the system, not on the quality of the experience actually being had by the end user. Anything that balances based on availability tends to be driven by what is known as synthetic monitoring, which is essentially one computer checking another computer is available.

But we all know that just because a service is available doesn’t mean that it is performing to consumer expectations.
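For concreteness, here’s roughly what two of those classic strategies look like in Python. Note that neither consults anything about the end user’s actual experience:

```python
import itertools

SERVERS = ["10.0.0.1", "10.0.0.2", "10.0.0.3"]

# Round Robin: hand requests to servers in strict rotation.
_rotation = itertools.cycle(SERVERS)

def round_robin():
    return next(_rotation)

# Least Connections: send each new request to the least-busy server.
connections = {server: 0 for server in SERVERS}

def least_connections():
    server = min(SERVERS, key=lambda s: connections[s])
    connections[server] += 1   # caller decrements when the connection closes
    return server
```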

That’s why the new generation of Global Server Load Balancer (GSLB) solutions goes a step further. Today’s GSLB uses a range of inputs, including:

  • Synthetic monitoring – to ensure servers are still up and running
  • Community Real User Measurements – a range of inputs from actual customers of a broad range of providers, aggregated, and used to create a virtual map of the Internet
  • Local Real User Measurements – inputs from actual customers of the provider’s own service
  • Integrated 3rd party measurements – including cost bases and total traffic delivered for individual delivery partners, used to balance traffic based not just on quality, but also on cost

Combined, these data sources allow video streaming companies not only to guarantee availability, but also to tune their total network for quality, and to optimize within that for cost. Or put another way – streaming video providers can now confidently deliver the quality of experience consumers expect and demand, without breaking the bank to do it.
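As one hedged illustration of how those inputs can be blended (illustrative logic, not any product’s actual scoring), local measurements can take precedence wherever they’re statistically meaningful, with community data filling in the gaps:

```python
def blended_latency(platform, local_rum, community_rum, min_local_samples=50):
    """Estimate a platform's latency from local RUM, leaning on community
    RUM wherever our own traffic is too thin to be statistically useful.

    local_rum:     e.g. {"cdn-a": {"p95_ms": 90.0, "samples": 12}}
    community_rum: e.g. {"cdn-a": {"p95_ms": 110.0}}
    """
    local = local_rum.get(platform)
    community = community_rum[platform]
    if local and local["samples"] >= min_local_samples:
        return local["p95_ms"]
    if local:
        # Thin local data: trust it in proportion to its sample count.
        weight = local["samples"] / min_local_samples
        return weight * local["p95_ms"] + (1 - weight) * community["p95_ms"]
    return community["p95_ms"]
```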

When you know that you are running across the delivery pathway with the highest quality metrics, at the lowest cost, based on the actual experience of your users – that’s a stunning result. And it’s only possible with smart load balancing, combining traditional synthetic monitoring with the real-time feedback of users around the world, and the 3rd party data you use to run your business.

If you’d like to find out more about smart load balancing, keep looking around our site. And if you’re going to be at Mobile World Congress at the end of the month, make an appointment to meet with us there so we can show you smart load balancing in real life.

Tracking Video QoS Just Got A Whole Lot Easier

If you follow this blog, you know we’ve mentioned before that we’ve been working with innovative customers to create a way to track video Quality of Service (QoS) metrics and make sense of them.

It’s exciting therefore to share that now anyone and everyone can track video QoS in Radar.

Video is fundamentally different from a lot of other online content: not only is it huge (projections are that in the next four or five years video will make up as much as 80% of Internet traffic), it is inherently synchronous. Put another way, your customer might not notice if a page takes an extra second or two to load, but they surely notice if their favorite prime time show keeps stalling out and showing the re-buffering spinner. So our new Performance Report focuses on the key elements that matter to viewers, specifically:

  • Response Time: how long it takes the content source to respond to a request from the intended viewer. Longer is worse!
  • Re-Buffering Ratio: the share of viewing time spent with the content stalled, the viewer frustrated, and the player trying to catch up. Lower is better!
  • Throughput: the speed at which chunks of the video are being delivered to the player after request. Faster is better!
  • Video Start Time: how long it takes for the video to start after viewer request. Shorter is better!
  • Video Start Failures: the percentage of requested video playbacks that simply never start. Lower is better!
  • Bitrate: the actual bitrate experienced by the viewer (bitrate is a pretty solid proxy for picture quality, as the larger the bitrate, the higher the likely resolution of the video). In this case, higher or lower may be better, depending on your KPIs.
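To make those definitions concrete, here’s a short Python sketch showing how two of these metrics fall out of a raw player event timeline. The event names are illustrative, not the Radar tag’s actual schema:

```python
def video_qos(events):
    """Derive start time and re-buffering ratio from a player event timeline.

    events: ordered (timestamp_seconds, event_name) tuples.
    """
    request_at = next(t for t, name in events if name == "request")
    first_frame = next(t for t, name in events if name == "playing")
    video_start_time = first_frame - request_at

    stalled, stall_began = 0.0, None
    for t, name in events:
        if name == "stall":
            stall_began = t
        elif name == "playing" and stall_began is not None:
            stalled += t - stall_began
            stall_began = None
    session = events[-1][0] - first_frame
    rebuffer_ratio = stalled / session if session else 0.0
    return video_start_time, rebuffer_ratio

vst, ratio = video_qos([(0.0, "request"), (1.2, "playing"), (30.0, "stall"),
                        (31.5, "playing"), (120.0, "ended")])
print(f"start: {vst:.1f}s, re-buffering: {ratio:.1%}")  # start: 1.2s, re-buffering: 1.3%
```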

Once you enable the tag for your account and add it to your video-bearing pages (see below), you’ll be able to track all these for your site. And, as with all Radar reports, you can slice and dice the results in all sorts of different ways to get a solid picture of how your service is doing, video-wise. Analyses might include:

  • How do my CDNs compare at different times of day, in different locations, or on different kinds of device?
  • What is the statistical distribution of service provided through my clouds? Does general consistency hide big peaks and valleys, or is service generally within a tight boundary?
  • What is the impact of throughput fluctuations to bitrates, video start times, or re-buffering ratios? What should I be focused on to improve my service for my unique audience?

In no time, you’ll have a deep and clear sense of what’s going on with video delivered through your HTML5 player, and be able to extrapolate this to make key decisions on CDN partnering, cloud distribution, and global server load balancing solutions. The ability to really dig down into things like device type and OS – as well as the more expected geography, time, delivery platform, and so forth – means you’ll be able to isolate issues that are not, in fact, delivery-related: for instance, it is possible to see a dip in quality and assume it’s cloud-related, only to discover, in drilling down, that the drop occurs on only one particular device/OS combination, and thus uncover a hiccup in a new product release.

So here’s the scoop. Collecting these QoS metrics isn’t just easy – it’s free, just like our other Radar real user measurements. With the video QoS, you’ll be tracking your own visitors’ experiences, and be able to compare them over time.

The tag works with HTML5 players running in a browser, and it unsurprisingly takes a bit more planning to implement than our standard tag, so you’ll likely want to drop us a line to get started. We’ll be delighted to help you get this up and running – just contact us by going to your Portal and navigating to Impact -> Video Playback Data, then clicking the Contact button.

Cedexis Openmix, an alternative to Route 53 load balancing

Congratulations on taking your company to the cloud! Now, are you ready for it to fail?


You see, in the cloud, you have to build for failure.

At the very least, geodiversity is required to ensure high availability. That means using your cloud’s other geographical regions to protect yourself from availability issues.

When you deploy in a second or third cloud region, how do you direct traffic to the correct region? How do you make the internet users going to your site or mobile app go to a different region when your East Coast cloud is out of commission?

There are a few tools out there to do just that. One of them is owned and operated by one of the major cloud providers. It’s called Route 53.

Route 53 started out as an authoritative DNS service for AWS users. AWS has more recently introduced features that allow you to push your traffic between different AWS regions. Today, we want to take a look at how that product stacks up.

Route 53 allows users to set up conditional routing trees that establish rules for when to fail over to your various other cloud instances in different regions. These conditional routing trees also depend on some form of health check that you establish. Sound like a lot of work? It is.
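For a sense of the moving parts, here’s roughly what a single primary/secondary failover pair looks like when configured through boto3 (placeholder IDs throughout). You’d still have to create and monitor the health checks, and repeat this for every record and region pair:

```python
import boto3

route53 = boto3.client("route53")

route53.change_resource_record_sets(
    HostedZoneId="Z_PLACEHOLDER",          # placeholder hosted zone
    ChangeBatch={"Changes": [
        {"Action": "UPSERT", "ResourceRecordSet": {
            "Name": "www.example.com.", "Type": "A", "TTL": 60,
            "SetIdentifier": "east-primary",
            "Failover": "PRIMARY",
            "HealthCheckId": "hc-east-PLACEHOLDER",  # placeholder health check
            "ResourceRecords": [{"Value": "203.0.113.10"}],
        }},
        {"Action": "UPSERT", "ResourceRecordSet": {
            "Name": "www.example.com.", "Type": "A", "TTL": 60,
            "SetIdentifier": "west-secondary",
            "Failover": "SECONDARY",
            "ResourceRecords": [{"Value": "198.51.100.20"}],
        }},
    ]},
)
```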

Cedexis offers a smarter alternative to Route 53 load balancing:

  • The Cedexis Radar community collects billions of latency and availability measurements of every AWS instance every day.
  • Radar is real-time data. Route 53 is a daily score. So, if your cloud is having an issue, your users will get routed elsewhere tomorrow. Not helpful.
  • Radar provides performance data on non-AWS endpoints, public and private. 
  • Cedexis has off-the-shelf APM integrations (New Relic, AppDynamics, CloudWatch, Catchpoint and more)
  • Openmix is application defined, allowing custom algorithms. Route 53 allows for nested policies, which are complex and restrictive.
  • Openmix can be queried via DNS and HTTP (think video players, games, mobile apps, etc.) – see the sketch after this list.
  • Cedexis is separate from primary DNS. You can keep using AWS, or any other DNS solution for authoritative DNS.
  • Only Cedexis has the capabilities that allow you to:
    • Deliver performance-based load balancing between AWS and your private data center
    • Optimize video stream delivery from your player or CMS
    • Automate traffic management based on APM data (or data from any other top synthetic monitoring tool, for that matter)
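As noted in the list above, a routing decision can be consumed from almost anywhere. Here’s a brief Python sketch using a hypothetical hostname and response schema:

```python
import json
import socket
import urllib.request

# Via DNS: resolving a managed hostname yields the currently-best endpoint.
best_ip = socket.gethostbyname("www.example.com")   # hypothetical managed name

# Via HTTP: handy inside video players, games, and mobile apps that want
# the decision back as data rather than as a DNS answer.
URL = "https://routing.example.com/decision?app=player"   # hypothetical endpoint
with urllib.request.urlopen(URL) as resp:
    decision = json.load(resp)                            # hypothetical schema

print(best_ip, decision)
```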


Join us for a free webinar on Wednesday, October 28 at 10:00am PT, where we will take a deeper dive into the benefits of Openmix over Route 53. Register now!

To learn more about how we can help companies with GSLB, read our Tango Case Study.