Since 2001, IT in every industry has come under intense pressure to make
organizations perform more efficiently while still contributing to
the bottom line. Nowhere is this more apparent than in financial services,
where batch job scheduling has become a critical component of IT
success.
The U.S. Securities and Exchange Commission has asked that, by June 2005,
all stock trades be cleared on what is called trade day plus one, or T+1.
This requirement will force a switch from Wall Street's
traditional batch processing systems to a real-time processing network
that never crashes, according to a Computerworld article. The article
adds that while upgrading to comply with T+1 will cost about $8 billion,
the financial services industry will see savings of about $2.7 billion
a year. In addition, the industry will have lower costs, lower error
rates, and higher productivity while gaining the ability to handle
greater transaction volume.
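As a rough illustration of what those figures imply, the short Python sketch below computes a simple payback period. It assumes a one-time cost and constant yearly savings with no discounting, which the article does not state; the function name is hypothetical.

```python
def payback_years(upfront_cost: float, annual_savings: float) -> float:
    """Years of savings needed to recover the upfront cost (no discounting)."""
    if annual_savings <= 0:
        raise ValueError("annual savings must be positive")
    return upfront_cost / annual_savings


# Figures quoted above, in billions of dollars.
print(round(payback_years(8.0, 2.7), 2))  # 2.96
```

Under those assumptions the upgrade would pay for itself in roughly three years.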
Based on the ROI figures for the financial services industry, it becomes
clear that there are significant monetary benefits to be gained from
implementing an automated job scheduling solution. It also becomes clear
that, beyond the direct addition to the bottom line through cost savings,
there are benefits in freeing up resources so that they can be used
more productively (e.g., talented staff can be put to better
use on more important IT projects).
Taken together, these figures suggest that automating job scheduling can
offer significant benefits to enterprises of every size. Because not all
job schedulers are created equal or yield the same benefits, you need to
understand how the different types of schedulers work and what attributes
to look for in an automated job scheduler. This article provides
a crash course on the subject.
How Job Schedulers Work
Job scheduling is one of the most important components of a
production computing environment. Job schedulers do many things. They
initiate and help manage long, complex jobs, such as payroll runs and
inventory reports. They also launch and monitor applications.
Most computer environments use some kind of job scheduler. As distributed
computing environments have grown, however, some job schedulers have not
scaled to meet the challenges of enterprise computing. Mainframe schedulers
enjoy a reputation for power and robustness, but can be limited to
working on mainframes. Unix schedulers, on the other hand, have a
reputation for being severely limited in function, but have cross-platform
abilities that mainframe schedulers lack.
When
beginning to manage batch workloads in open systems environments,
most companies launch their first jobs using manual methods. This
technique is understandable and appropriate. However, this technique
quickly breaks down when the number of machines and batch jobs increases.
For
example, Unix and NT systems provide job launchers. These native tools
allow users to launch jobs at specific times and specific dates. These
commands provide a basis for scheduling, yet on their own do not deliver
a solution for complex scheduling requirements. They rely on operators
manually submitting jobs from a workstation. This technique is costly,
potentially unreliable, and error-prone.
In distributed
systems, the job launchers in Unix and NT systems provide simple job
launching capability. They offer the ability to start a batch job
at a specific time, based upon an adequate set of time and date matching
criteria. They perform simple job scheduling tasks such as kicking
off a backup every Saturday.
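The kind of time and date matching these launchers perform can be sketched in a few lines of Python. This is only an illustration of the idea, not the implementation of any particular Unix or NT tool; the `matches` function and the `saturday_backup` entry are hypothetical.

```python
from datetime import datetime


def matches(schedule: dict, when: datetime) -> bool:
    """Return True when every field named in the schedule matches `when`.

    A field that is left out acts as a wildcard.
    """
    actual = {
        "minute": when.minute,
        "hour": when.hour,
        "weekday": when.weekday(),  # Monday is 0, Saturday is 5
    }
    return all(actual[field] in allowed for field, allowed in schedule.items())


# Hypothetical entry: start the backup at 02:00 every Saturday.
saturday_backup = {"minute": {0}, "hour": {2}, "weekday": {5}}

print(matches(saturday_backup, datetime(2005, 6, 4, 2, 0)))  # a Saturday: True
print(matches(saturday_backup, datetime(2005, 6, 6, 2, 0)))  # a Monday: False
```

Note that the only input is the clock. Nothing in this model knows whether an earlier job succeeded.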
The
biggest weakness of these native tools is their inability to monitor
and to correlate the execution of one job with the results of another.
If a backup job fails, these tools don't know they should suspend the
jobs that update the tape catalogs or delete yesterday's old files.
If the backup finishes early, these tools can't move up jobs that
are to be executed upon completion of the backup.
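What is missing is a way to tie one job's start to another job's result. The sketch below shows the idea in Python under simplifying assumptions: jobs are plain functions that return True on success, they are listed with prerequisites first, and the job names are invented for illustration.

```python
from typing import Callable, Dict, List


def run_with_dependencies(
    jobs: Dict[str, Callable[[], bool]],
    depends_on: Dict[str, List[str]],
) -> Dict[str, str]:
    """Run jobs in the order given, honoring dependencies.

    A job runs as soon as everything it depends on has succeeded, and it is
    suspended when any prerequisite failed or was itself suspended.
    """
    status: Dict[str, str] = {}
    for name, action in jobs.items():  # assumes prerequisites are listed first
        prerequisites = depends_on.get(name, [])
        if any(status.get(p) != "succeeded" for p in prerequisites):
            status[name] = "suspended"
            continue
        status[name] = "succeeded" if action() else "failed"
    return status


jobs = {
    "backup": lambda: False,  # simulate a failed backup
    "update_tape_catalog": lambda: True,
    "delete_old_files": lambda: True,
}
depends_on = {
    "update_tape_catalog": ["backup"],
    "delete_old_files": ["backup"],
}
print(run_with_dependencies(jobs, depends_on))
```

Because dependent jobs start when the backup completes rather than at a fixed clock time, an early finish moves them up automatically.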
Also,
these native tools can only start jobs that are time-dependent. This
limitation makes it difficult to create a job that runs when a file
disappears or when a system resource reaches a certain threshold.
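A trigger that is not based on time can be modeled as a condition that is polled. The Python sketch below is a minimal illustration of that idea; the helper names, the job names, and the `/tmp/feed.lock` path are hypothetical, and a real scheduler may well use event notification instead of polling.

```python
import os
import shutil
from typing import Callable, List, Tuple

Trigger = Callable[[], bool]


def file_missing(path: str) -> Trigger:
    """Condition that holds once the file at `path` no longer exists."""
    return lambda: not os.path.exists(path)


def disk_usage_above(path: str, fraction: float) -> Trigger:
    """Condition that holds when the disk holding `path` is fuller than `fraction`."""
    def check() -> bool:
        usage = shutil.disk_usage(path)
        return usage.used / usage.total > fraction
    return check


def poll_once(rules: List[Tuple[str, Trigger]]) -> List[str]:
    """Return the names of the jobs whose trigger condition currently holds."""
    return [job for job, trigger in rules if trigger()]


rules = [
    ("load_feed", file_missing("/tmp/feed.lock")),  # hypothetical lock file
    ("purge_old_files", disk_usage_above(".", 0.90)),
]
print(poll_once(rules))  # result depends on the machine it runs on
```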
Job launcher configuration files are difficult to maintain. Even minor
changes to a job's start time are time consuming and error prone.
And there are no layered tools to make job creation easier. Remember,
these tools are simple job launching tools designed for low volume
environments. They lack the critical features required for complex,
large systems.
To make
up for this deficiency, many systems administrators create their own
job management system. They use these native tools to initiate a job
controller and create scripts that detect failure conditions, initiate
other jobs, and provide some degree of checkpoint and restart capabilities.
While
these solutions often work adequately for small job streams, they
rarely scale to handle job loads of complex network environments.
They also lack sophisticated user interfaces and reporting tools that
allow users to keep audit trails of job streams.
More
importantly, home-grown job schedulers quickly turn into full-time
programming commitments. As dependence increases on the tool, more
and more features get added. The result is usually a varied mix of
scripts, programs, and Unix utilities that only a few people actually
understand. Such a situation is prone to problems.
Mainframe
job scheduling is the complete opposite of Unix job scheduling. Mainframe
tools provide robust scheduling capabilities that handle huge, complex
job streams with ease. Mainframe schedulers group jobs into collections,
treating the collection as a single entity whose execution, success,
or failure can be tracked and used to trigger other jobs or collections
of jobs. Users start jobs and job collections using time triggers or
other criteria, such as creation of a file, mounting a tape, or the
shutdown of a database. The job scheduler is aware of almost all activity
within the system and can respond accordingly.
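The notion of a collection treated as a single entity can be sketched as follows. This is a simplified Python model, not the design of any specific mainframe scheduler; the `JobCollection` class and the collection names are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class JobCollection:
    name: str
    jobs: List[Callable[[], bool]]
    on_success: List["JobCollection"] = field(default_factory=list)
    status: str = "pending"

    def run(self) -> str:
        # The collection succeeds only if every member job succeeds.
        # all() stops at the first failure, so later jobs are not started.
        self.status = "succeeded" if all(job() for job in self.jobs) else "failed"
        if self.status == "succeeded":
            for follower in self.on_success:
                follower.run()  # the collection's success triggers the next one
        return self.status


reports = JobCollection("reports", [lambda: True])
payroll = JobCollection("payroll", [lambda: True, lambda: True], on_success=[reports])
payroll.run()
print(payroll.status, reports.status)  # succeeded succeeded
```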
Using
screen-oriented user interfaces, system operators can track the status
of jobs, noting which are running long and which are completing. Using
this interface, operators can suspend jobs, delay execution, restart
jobs, and track schedule slippage. It's possible to alert an operator
if a job exceeds a maximum run time, or if a job fails to start because
its execution criteria were not met.
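The two alert conditions just described can be expressed as simple checks over job records. The Python sketch below assumes a hypothetical `JobRun` record with times given in seconds; it is meant only to make the conditions concrete.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class JobRun:
    name: str
    scheduled_start: float  # seconds since an arbitrary reference point
    max_runtime: float      # seconds
    started_at: Optional[float] = None
    finished_at: Optional[float] = None


def alerts(runs: List[JobRun], now: float) -> List[str]:
    """Return operator alerts for late starts and overlong runs."""
    messages = []
    for run in runs:
        if run.started_at is None:
            if now > run.scheduled_start:
                messages.append(f"{run.name}: did not start on schedule")
        else:
            end = run.finished_at if run.finished_at is not None else now
            if end - run.started_at > run.max_runtime:
                messages.append(f"{run.name}: exceeded maximum run time")
    return messages


runs = [
    JobRun("payroll", scheduled_start=0, max_runtime=3600, started_at=0),
    JobRun("inventory", scheduled_start=100, max_runtime=600),
]
for message in alerts(runs, now=4000):
    print(message)
```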
Mainframe
schedulers also offer good reporting tools. They create execution
logs and report job failure and success. Analyzing these reports over
a period of time lets users see trends, such as accounting job streams
that take longer and longer or backup jobs that begin to press against
the limits of backup windows.
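A trend of this kind can be estimated from logged run times. The Python sketch below uses the average change between consecutive runs as a deliberately crude growth estimate; the function names and the sample durations are illustrative assumptions, not data from any real log.

```python
import math
from statistics import mean
from typing import List, Optional


def average_growth(durations: List[float]) -> float:
    """Average change in run time between consecutive runs (needs 2+ runs)."""
    return mean(later - earlier for earlier, later in zip(durations, durations[1:]))


def runs_until_window_reached(durations: List[float], window: float) -> Optional[int]:
    """Estimate how many more runs fit before the run time reaches the window.

    Returns 0 if the latest run already reached it, None if run times
    are not growing.
    """
    if durations[-1] >= window:
        return 0
    growth = average_growth(durations)
    if growth <= 0:
        return None
    return math.ceil((window - durations[-1]) / growth)


backup_minutes = [200, 210, 220, 230]  # made-up run times from an execution log
print(runs_until_window_reached(backup_minutes, window=260))  # 3
```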
What to Look For in an Automated Job Scheduler
With
the increase in jobs in all businesses and the need to have these
jobs run more quickly, it makes sense and pays dividends to automate
job scheduling. Automating job scheduling yields several tangible
benefits:
· Reduces personnel costs while freeing up those human resources for
more important and more profitable projects
· Launches jobs on time, thus improving efficiency and reducing the
potential for human error
· Optimizes resources, allowing more work to be accomplished.
A properly functioning job scheduling solution also allows new resources
to be added or existing resources to be reconfigured with minimal
impact on IT operations.
Whether the platform is NT/2000, Unix, mainframe, or something else,
there are specific capabilities a good automated job scheduler should have.
A good scheduler supports non-temporal job triggers such as file creation
or system alerts. Users must be able to suspend a job stream, slip a
schedule to another time of day, and cancel a single instance of a
job without affecting its overall schedule. There should be no limit
to the number of jobs that can be created, and the system should be
as easy to use with 10,000 jobs as it is with 10.
And
the job scheduler should be not only a technical asset, but a business
asset. It should reduce costs, increase productivity, and maximize
efficiency so that IT can fulfill its mission of adding value to the
business.
Several job scheduling architectures have emerged for heterogeneous,
distributed environments: collaborative; master and agent; and variations
of master and agent, which include master, submaster, and agent as well as
console, master, and agent. Because master and agent and its variations
have so much in common, one need only compare the collaborative
architecture with the master and agent architecture.
Master and Agent Architecture
The traditional architecture for job scheduling solutions is the master
and agent architecture. Schedulers using this model generally evolved
from mainframe concepts. This architecture involves putting a full
implementation of the job scheduler on one server, the master, and
putting agents on a series of other servers, the agents.
In the
master and agent configuration, jobs are set up, scheduled and administered
from the master server. The actual work is done on the agents. The
agents communicate with the master throughout the job run as the master
passes parameters and other critical data to the agent. Jobs might
be partitioned among agents. As the job is passed from server to server,
communications must be maintained between agents and master. This
makes network availability critical to successful completion of jobs.
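The division of labor, and its dependence on the network, can be pictured with a small Python model. The `Master` and `Agent` classes below are hypothetical stand-ins, and the `reachable` flag simply simulates network availability.

```python
from typing import Dict, List, Tuple


class Agent:
    """Does the actual work; holds no schedule of its own."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.reachable = True  # stands in for network availability

    def execute(self, job: str, params: dict) -> str:
        return f"{job} ran on {self.name} with {params}"


class Master:
    """Holds all job definitions and hands work to the agents."""

    def __init__(self, agents: List[Agent]) -> None:
        self.agents: Dict[str, Agent] = {a.name: a for a in agents}
        self.log: List[Tuple[str, str]] = []

    def dispatch(self, job: str, agent_name: str, params: dict) -> bool:
        agent = self.agents[agent_name]
        if not agent.reachable:
            # Without the network the master cannot pass parameters or
            # learn the outcome, so the job cannot proceed.
            self.log.append((job, "lost contact with agent"))
            return False
        self.log.append((job, agent.execute(job, params)))
        return True


master = Master([Agent("db01")])
print(master.dispatch("nightly_load", "db01", {"date": "2005-06-04"}))  # True
master.agents["db01"].reachable = False
print(master.dispatch("nightly_load", "db01", {"date": "2005-06-05"}))  # False
```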
On the one hand, the master and agent central administration allows tight
control over jobs. This benefit comes at the cost of a central, top-down,
rather inflexible tree structure. On the other hand, the most
significant limitation of master and agent systems is the requirement
for the master and agents to remain in sync. When the network or central
server is interrupted, how long will it take to reconstruct your activity?
The well-known volatility of distributed networks is an important
factor when evaluating schedulers based on master/agent architecture.
A second
area of concern is performance. In master and agent environments,
communication continually flows between the master and each of the
agents. As the workload increases, so does the network traffic.
As the traffic increases, the potential for overload expands.
Another
aspect to consider is scalability. A master can only support a limited
number of agents, and this depends on the number of jobs to be run.
Creating a new master or instance creates a new and separate administration.
The more instances you create, the more management you need. When
you create a new instance, you need to recreate all jobs. The process
can take days, weeks, or even months. The process itself can lead
to errors and failures at any point along the way. While the new instances
can be managed by the same administrator, within reason, the inability
to administer the entire job scheduling environment from a single
point increases complexity, and the likelihood of confusion and errors.
This
lack of scalability can affect your overall costs drastically. When
you create a new master, you need to add new hardware at the master
and agent levels. In a large enterprise, this could quickly grow to
a $1 million problem.
Collaborative Architecture
Designed for distributed environments, the collaborative architecture
leverages the combined computing power of networks. In collaborative
architecture environments, a full copy of the job scheduler is installed
on every server on the network. With this technique, once a server is
given the parameters for a job, it can run the job independently.
Each
server runs jobs independently of all others. Communication occurs
for coordination and updates. This architecture effectively uses network
resources to combine mainframe-like robustness with distributed flexibility.
Administration
in collaborative environments is flexible. You can manage your job
scheduling from either a central point or at the local level.
Since
the collaborative architecture was designed for distributed environments,
it has many benefits. With a full working copy of the software on
every server, network downtime has a diminished effect. Jobs continue
to run even during network outages. The same applies to individual
servers. If one server crashes, all other servers in the network continue
their jobs. Any interdependent jobs are held until the crashed server
resumes activity.
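This behavior can be illustrated with a small Python model in which every server carries its own scheduler state. The `Server` class, the server names, and the job names are assumptions for the sake of the example, and a real product would exchange status over the network instead of reading a peer object directly.

```python
from typing import Dict, List, Optional, Tuple

Dependency = Optional[Tuple[str, str]]  # (peer server name, job name on that peer)


class Server:
    """One server holding its own full copy of the (hypothetical) scheduler."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.up = True
        self.completed: set = set()

    def run_due_jobs(self, jobs: List[Tuple[str, Dependency]],
                     peers: Dict[str, "Server"]) -> List[str]:
        """Run local jobs; return the names of jobs held for a peer."""
        held = []
        for job, dependency in jobs:
            if dependency is not None:
                peer_name, peer_job = dependency
                peer = peers[peer_name]
                if not peer.up or peer_job not in peer.completed:
                    held.append(job)  # wait for the peer to catch up
                    continue
            self.completed.add(job)  # independent work carries on
        return held


ledger, reports = Server("ledger"), Server("reports")
peers = {"ledger": ledger, "reports": reports}
ledger.up = False  # simulate a crash before "close_books" has run

held = reports.run_due_jobs(
    [("rotate_logs", None), ("daily_summary", ("ledger", "close_books"))],
    peers,
)
print(held)               # ['daily_summary']
print(reports.completed)  # {'rotate_logs'}
```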
Since
jobs can run locally, network communications and overhead decrease.
This decrease translates into improved network and system performance.
In a collaborative environment, scaling is limited only by the size of your
network. Some job schedulers might be able to handle 500 servers each
running 1,000 jobs for a total of 500,000 jobs. Replicating jobs is
straightforward. Based on logical views of jobs and the environment,
even the most complex jobs can be replicated in minutes.
Another distinct advantage is more efficient use of hardware resources.
Typically, in a collaborative architecture, your total job scheduling
overhead is about one percent of central processing unit resources
on each server in the network. In master and agent configurations, you need
a dedicated server for the job scheduler itself plus a backup server
in case the master fails. This hardware is in addition to the resources
used on each server. Because of the limits on scalability, each time
you expand to a new master configuration, you need to add hardware
and software for the job scheduling server.
CIO - Measure Your ROI!
The pressure on IT to produce the promised savings and efficiencies from
the new technologies it implements will only increase, and in an era of
fiscal belt tightening it increases even more. Automated job scheduling
can alleviate some of this pressure while adding value to the business.
The time has come to measure that value in terms of return on investment.