Misunderstandings
about protection strategies and differences in software implementations
among vendors make SONET protection a headache.
|
|
John Brandte, Ncomm
|
|
Protecting SONET networks should be far easier than it has turned out
to be. Standards bodies, vendors, and providers have been hard at work
for some time trying to protect all aspects of SONET networks -- be it
the facility equipment, the physical network links, or the switching
equipment. But confusion about different protection mechanisms and
unresolved incompatibility between vendors' solutions continue to
stand in the way of effective protection, fail-over, and recovery in
heterogeneous environments.
|
|
Protection switching -- for availability or reliability?
|
|
Let's begin by clarifying that protection switching, as defined in the
standard, is a technique for addressing network availability (mean
duration of failure) rather than reliability (mean time between
failures).
|
|
Availability, or uptime, is measured by how quickly network operations
are restored after a failure, with at least "five 9s" (99.999%)
availability being the carrier standard. The motivation for
implementing automatic protection switching (APS) is that even with the
most reliable circuit, an outage -- even of short duration -- continues
to be painful.
|
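To put "five 9s" in concrete terms, the short calculation below converts an availability percentage into the downtime it allows per year. It is only an illustrative sketch: the function name is made up for this example, and a 365-day year is assumed.

```python
# Convert an availability percentage into the downtime it allows per year.
# Assumption: a 365-day year (leap years ignored).
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes


def downtime_minutes_per_year(availability_percent: float) -> float:
    """Return the yearly downtime implied by an availability percentage."""
    unavailable_fraction = 1.0 - availability_percent / 100.0
    return unavailable_fraction * MINUTES_PER_YEAR


if __name__ == "__main__":
    for availability in (99.9, 99.99, 99.999):
        minutes = downtime_minutes_per_year(availability)
        print(f"{availability}% availability -> {minutes:.2f} min/year")
```

At 99.999%, the allowance works out to a little over five minutes of outage per year, which is why even short failures matter.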
|
Backing up critical circuits has always been a key network design
consideration, and automatic fail-over to alternative facilities has
been available for decades. Leased-line analog circuits (1.2-19.2
kbits/sec) were backed up with dial-up analog circuits. Digital
circuits (2.4-64 kbits/sec) were initially backed up by switched analog
circuits, then by switched 4-wire and 2-wire digital service. With
services like T1, using the idle leased line as a hot standby became
more prevalent.
|
|
While protection methods like these met basic customer needs, there was
an enormous downside to consider -- they were all proprietary to a
single equipment provider. Heterogeneous environments were not an
option. Changing vendors meant "forklift changes" not just in hardware
but also in operations. This situation still exists today with
technologies like T1/T3 and E1/E3, which have neither the open
standards nor uniform implementations of protection switching necessary
to achieve cross-vendor interoperability.
|
|
Fortunately things are beginning to improve, thanks to newer physical
layer WAN technologies implemented with SONET/SDH that, unlike T1/E1,
have specific standards for APS.
|
|
Defining APS -- facility or equipment protection?
|
|
APS is often mistakenly used to describe two different kinds of
protection -- that of equipment and that of the transmission facility.
The methods to achieve equipment and facility protection are different,
and only facility protection is defined in the SONET APS standard.
|
|
Equipment protection switching addresses hardware failures that occur
while the transport facility (fiber/coax/copper) itself is still
functional. Should the equipment
fail, alternative hardware is substituted. Usually, protection ports
are on different boards from the protected ports to avoid cascading
failure and enable nondisruptive hardware repairs.
|
|
The SONET APS standard defines facility protection switching,
which deals with transport link failure. Should the transport medium
become compromised, a mechanism is put in place to supply an
alternative physical path. Redundant facilities are provisioned that
may be switched into the original port or a new port.
|
|
There are four standards that define APS for SONET/SDH transport. The
general set for SONET is described in document T1.105; GR-253 addresses
linear APS, while GR-1400 and GR-1230 specify unidirectional
path-switched ring (UPSR) and bidirectional line-switched ring (BLSR),
respectively. The SDH equivalents are outlined in ITU-T documents
G.783, Q.784, G.826, and G.774.
|
|
SONET APS can be configured in linear (point-to-point) or ring network
architectures, depending on the different needs of the application,
traffic and incumbent equipment.
|
|
Linear APS at work
|
|
1+1 linear APS provides two redundant fiber links, each carrying
identical traffic, with receivers at each end monitoring the bit
streams and choosing the "best" link. It is costly because two
receivers are required at each end point, twice as much fiber is
needed, and no additional capacity is gained. Although 1+1 is the most
expensive technique, it also offers the fastest recovery, often without
any data loss.
|
|
1:1 works slightly differently and is less costly. Although 1:1 also
requires a backup or "protection" fiber for each primary or "working"
fiber, the protection fiber remains idle or carries low-priority
traffic when not switched in.
|
|
1:n provides a single backup fiber for up to 14 primary fibers. The
secondary fiber can carry low-priority traffic when not in use as a
backup. This method is much less expensive than the other linear APS
alternatives, because one secondary fiber provides coverage for
multiple primary fibers.
|
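The cost difference between the three linear schemes comes down to how many protection fibers each one needs. The sketch below counts them for a given number of working fibers, using the 14-fiber limit cited above for 1:n. The function and constant names are hypothetical, and the model ignores everything except fiber count.

```python
import math

MAX_WORKING_PER_PROTECTION = 14  # 1:n limit cited in the text


def protection_fibers(working: int, scheme: str) -> int:
    """Return how many protection fibers a linear APS scheme needs."""
    if working < 0:
        raise ValueError("working fiber count cannot be negative")
    if scheme in ("1+1", "1:1"):
        # One dedicated protection fiber per working fiber.
        return working
    if scheme == "1:n":
        # One shared protection fiber per group of up to 14 working fibers.
        return math.ceil(working / MAX_WORKING_PER_PROTECTION)
    raise ValueError(f"unknown scheme: {scheme}")


if __name__ == "__main__":
    for scheme in ("1+1", "1:1", "1:n"):
        print(scheme, protection_fibers(14, scheme))
```

For 14 working fibers, 1+1 and 1:1 each need 14 protection fibers, while 1:n needs one.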
|
All linear APS solutions share the drawback of asymmetric delay.
Additional buffering at the nodes needed to overcome this problem
raises equipment costs. But linear APS is simple to install and
provides adequate point-to-point availability.
|
|
Ring APS at work
|
|
There are two main types of ring APS: UPSR and BLSR. UPSR is the
simpler of the two, with its dual counter-rotating fiber links, each of
which carries identical traffic. Both sending and receiving nodes
monitor the two fibers and select the better of the two signals based
on criteria such as bit error rate and Alarm Indication Signal (AIS).
|
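The selection step described above can be sketched as a simple comparison of the two received copies. This is a minimal illustration, not a standards-defined algorithm: the `PathStatus` type, the field names, and the rule of staying on the current path when the two are equal are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class PathStatus:
    name: str
    ais: bool              # Alarm Indication Signal seen on this path
    bit_error_rate: float  # measured bit error rate on this path


def select_path(current: PathStatus, alternate: PathStatus) -> PathStatus:
    """Pick the better of two copies of the same traffic.

    A path without AIS beats a path with AIS; otherwise the lower bit
    error rate wins. On a tie the current path is kept, so the selector
    does not switch without a reason.
    """
    current_rank = (current.ais, current.bit_error_rate)
    alternate_rank = (alternate.ais, alternate.bit_error_rate)
    return alternate if alternate_rank < current_rank else current


if __name__ == "__main__":
    clockwise = PathStatus("clockwise", ais=True, bit_error_rate=0.0)
    counter = PathStatus("counter-clockwise", ais=False, bit_error_rate=1e-9)
    print(select_path(clockwise, counter).name)
```

Because the decision uses only what the receiver observes locally, no signaling with the transmitter is involved.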
|
UPSR has several advantages: the receiving node makes all decisions
without interacting with local or remote transmitters, no
communications channel is needed, and service is virtually
uninterrupted. The downsides are the need for redundant transceivers
and the introduction of asymmetric delay.
|
|
BLSR is frequently used in core network applications and is more
complicated, using line switching to redirect traffic to the protection
fiber in the event of failure. BLSR uses the K1/K2 bytes along with
other local indications to raise a flag to switch. Once the flag is
raised, an independent "controller" communicates via the K1/K2 bytes
with the local backup facility (through the backup SONET/SDH
transceiver) and then contacts a far-end transceiver to prepare for
transfer of traffic from the failed working facility to the protection
facility.
|
|
What happens after the switch is synchronized and executed depends on
whether the configuration is revertive or non-revertive. If revertive,
traffic is automatically switched back to the original working facility
once it is recognized as operationally sound. In non-revertive
scenarios, traffic remains on the protection fiber until it's manually
switched back.
|
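The difference between the two modes is easiest to see as a small state model. The sketch below is a simplified illustration with hypothetical class and method names; it tracks only which facility carries traffic and leaves out the signaling involved in a real switch.

```python
from dataclasses import dataclass


@dataclass
class ProtectionGroup:
    revertive: bool
    active: str = "working"  # facility currently carrying traffic

    def on_working_failed(self) -> None:
        # A failure on the working facility moves traffic to protection.
        self.active = "protection"

    def on_working_repaired(self) -> None:
        # Only a revertive group switches back on its own.
        if self.revertive:
            self.active = "working"

    def manual_switch_back(self) -> None:
        # Operator action, the only way back in non-revertive mode.
        self.active = "working"


if __name__ == "__main__":
    for mode in (True, False):
        group = ProtectionGroup(revertive=mode)
        group.on_working_failed()
        group.on_working_repaired()
        print(f"revertive={mode}: traffic on {group.active}")
```

Revertive implementations commonly wait for a period after the repair before switching back; that delay is omitted here for brevity.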
|
Switches are generally configured as non-revertive. Bad lines often
appear to be fixed for short time periods while the provider is
troubleshooting and repairing line problems. Automatically switching
back and forth repeatedly would disrupt service unnecessarily.
|
|
All APS must meet certain performance thresholds to be
standards-compliant. Failure detection and switchover together must
complete within a maximum restoral time of 60 msec: 10 msec to detect
the failure and 50 msec to switch.
|
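A test harness could check measured switch times against this budget along the following lines. This is a sketch under the figures quoted above; the function name and the idea of reporting violations as text are assumptions for illustration.

```python
DETECT_BUDGET_MS = 10  # time allowed to detect the failure
SWITCH_BUDGET_MS = 50  # time allowed to complete the switch
TOTAL_BUDGET_MS = DETECT_BUDGET_MS + SWITCH_BUDGET_MS  # 60 msec overall


def budget_violations(detect_ms: float, switch_ms: float) -> list[str]:
    """Return a description of each budget the measurement exceeds."""
    violations = []
    if detect_ms > DETECT_BUDGET_MS:
        violations.append(
            f"detection took {detect_ms} msec (limit {DETECT_BUDGET_MS})"
        )
    if switch_ms > SWITCH_BUDGET_MS:
        violations.append(
            f"switch took {switch_ms} msec (limit {SWITCH_BUDGET_MS})"
        )
    return violations


if __name__ == "__main__":
    print(budget_violations(8, 45))   # within both limits
    print(budget_violations(12, 45))  # detection too slow
```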
|
There are several different criteria for making the switching decision, including:
- AIS
- Loss of pointer
- Unequipped (indicated in the C2 byte)
- Remote defect indication
- Bit-error ratio (severely errored seconds/errored seconds).
|
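The criteria listed above can be gathered into one decision function, sketched below. The names are hypothetical, and the bit-error threshold is an arbitrary value chosen only for illustration. For simplicity the example compares a raw bit-error ratio rather than counting errored seconds, and treats a C2 value of 0x00 as the unequipped indication.

```python
from enum import Enum


class Trigger(Enum):
    AIS = "alarm indication signal"
    LOP = "loss of pointer"
    UNEQUIPPED = "unequipped (C2 byte)"
    RDI = "remote defect indication"
    BER = "bit-error ratio over threshold"


C2_UNEQUIPPED = 0x00  # C2 signal label value for an unequipped path
BER_THRESHOLD = 1e-3  # hypothetical threshold, for illustration only


def active_triggers(
    ais: bool, lop: bool, c2: int, rdi: bool, ber: float
) -> list[Trigger]:
    """Return every switching criterion currently raised."""
    checks = [
        (Trigger.AIS, ais),
        (Trigger.LOP, lop),
        (Trigger.UNEQUIPPED, c2 == C2_UNEQUIPPED),
        (Trigger.RDI, rdi),
        (Trigger.BER, ber > BER_THRESHOLD),
    ]
    return [trigger for trigger, raised in checks if raised]


def should_switch(ais: bool, lop: bool, c2: int, rdi: bool, ber: float) -> bool:
    """Any single raised criterion is enough to request a switch."""
    return bool(active_triggers(ais, lop, c2, rdi, ber))


if __name__ == "__main__":
    print(active_triggers(ais=True, lop=False, c2=0x00, rdi=False, ber=1e-9))
```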
|
Where these indications are used depends on the type of APS. Line-level
indicators are used for linear models, while both line- and path-level
indicators are used for ring configurations. The K1/K2 bytes are used
only when the backup facility is shared or when communication with the
far end is required, typically in bidirectional and 1:n APS.
|
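For readers who want to see what the K1/K2 signaling carries, the sketch below splits the two bytes into fields. The layout is the one commonly documented for linear APS: a request code and channel number in K1, and a channel number, architecture bit, and status bits in K2. Treat it as an assumption to verify against the standard itself. The function names are hypothetical, and only a few request codes are listed.

```python
# Assumed linear APS layout (verify against the standard before relying on it):
#   K1: upper 4 bits = request code, lower 4 bits = channel number
#   K2: upper 4 bits = channel number, next bit = architecture,
#       lowest 3 bits = mode or line status
REQUEST_NAMES = {  # partial list, for illustration
    0b1111: "lockout of protection",
    0b1110: "forced switch",
    0b1101: "signal fail (high priority)",
    0b1100: "signal fail (low priority)",
    0b0000: "no request",
}
K2_STATUS = {
    0b111: "AIS-L",
    0b110: "RDI-L",
    0b101: "bidirectional",
    0b100: "unidirectional",
}


def decode_k1(k1: int) -> dict:
    """Split a K1 byte (0-255) into request and channel fields."""
    return {
        "request": REQUEST_NAMES.get(k1 >> 4, "other"),
        "channel": k1 & 0x0F,
    }


def decode_k2(k2: int) -> dict:
    """Split a K2 byte (0-255) into channel, architecture, and status."""
    return {
        "channel": k2 >> 4,
        "architecture": "1:n" if (k2 >> 3) & 1 else "1+1",
        "status": K2_STATUS.get(k2 & 0b111, "other"),
    }


if __name__ == "__main__":
    print(decode_k1(0xC1))
    print(decode_k2(0x1D))
```

Under this layout the channel field is four bits wide, which is consistent with the limit of 14 working fibers in 1:n once values are set aside for special uses.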
|
Today's struggle for interoperability
|
|
In looking at the implementation considerations associated with SONET
APS, it's clear that, despite standardization efforts, providing
interoperable protection switching is still not simple. Providers and
equipment vendors acknowledge this complexity and are eager to find a
solution. Yet vendors, currently using custom implementations, are
fighting a losing battle to provide reliable protection switching that
is fully interoperable across vendor platforms.
|
|
The root of the problem lies in the software -- multiple
implementations, each of which interprets the standard in a slightly
different way. While the standards themselves are well defined, it is
difficult to make fully interoperable products when each vendor's APS
software implements the standard just a little differently. In
addition, WAN transport expertise may be outside a vendor's area of
core competency, so the critical functions necessary to build stable
APS source code implementations may not be well understood by
developers who are not SONET experts.
|
|
Then consider that there are several critical areas that are extremely
difficult to coordinate in cross-platform environments -- including
timing, mapping, failure notification, alarm condition handling, and
predictable performance -- and you're at the heart of the problem.
|
|
So what are the options? Well, carriers can continue to single-source
equipment and avoid heterogeneous implementations. Or vendors and
providers can hope that standards bodies or, more likely, a vendor
consortium will define interoperability conformance details.
|
|
There is also another alternative. Board manufacturers and software
vendors are developing and delivering standards-compliant,
interoperable APS implementations. Whether vendors choose embedded
board-level firmware or source-code software to ease their
interoperability woes, they no longer need to be held captive by
single-sourcing, and more alternatives are becoming available every
day.
|
|
John Brandte is vice president of marketing and business development at Ncomm
(Salem, NH), a provider of APS and other telecom wide area networking
source code, software and custom consulting. His experience in the
communications industry spans engineering, standards development,
business planning, and analysis.
|
Interested in a subscription to Lightwave Magazine? Click here to subscribe!
|