GXP Grid & Cluster Shell
GXP is a parallel shell tool that can work on any machine reachable
via a "remote-exec" command (ssh, rsh, batch queueing
commands, etc.). Machines may be in your office LAN, in a cluster, or
in multiple LANs/clusters widely distributed. These machines may be
behind NAT routers or firewalls, as long as you can reach them via
multiple hops of remote-exec commands (e.g., ssh to a gateway and then
to hosts behind firewalls). No matter where they are, GXP connects
all reachable machines with underlying remote-exec commands and lets
you operate them very efficiently. By typing a command once, you can
run it on hundreds of computers in seconds (demo video).
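The multi-hop reachability described above can be illustrated with a small sketch. It is not GXP's implementation; it only shows how a command line that hops through a gateway with plain ssh might be composed. The function name `nested_ssh_command` and the host names `gateway` and `node01` are hypothetical.

```python
import shlex


def nested_ssh_command(hops, command):
    """Return an argv list that runs `command` on the last host in `hops`,
    reaching it by chaining ssh through every earlier host in order."""
    if not hops:
        raise ValueError("at least one host is required")
    remote = command
    # Wrap from the innermost hop outwards. Each layer is quoted so that the
    # shell on the intermediate host passes it on unchanged.
    for host in reversed(hops[1:]):
        remote = "ssh {} {}".format(shlex.quote(host), shlex.quote(remote))
    return ["ssh", hops[0], remote]


if __name__ == "__main__":
    # Hypothetical hosts: node01 is reachable only from gateway.
    print(nested_ssh_command(["gateway", "node01"], "hostname"))
```

The returned list can be passed to `subprocess.run`. Each invocation would authenticate at every hop again, which is exactly the per-command cost that GXP's long-lived daemons are meant to avoid.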
What you can do with GXP
Basically, you can run an identical or similar command line on many
machines in parallel and get the results back interactively. You can
interactively choose the hosts to execute commands on by various criteria.
Tools accompanying GXP allow you to run embarrassingly parallel jobs
on any machine reachable via GXP.
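As a rough illustration of the idea (selecting hosts by a criterion, then running one command on all of them in parallel), here is a minimal Python sketch. It does not use GXP and is not how GXP works internally: it opens a fresh ssh connection per host and per command. The names `select_hosts`, `ssh_runner`, and `run_on_hosts` are hypothetical, and selection by regular expression is only one assumed example of a criterion.

```python
import re
import subprocess
from concurrent.futures import ThreadPoolExecutor


def select_hosts(hosts, pattern):
    """Keep only the hosts whose name matches the regular expression `pattern`."""
    regex = re.compile(pattern)
    return [h for h in hosts if regex.search(h)]


def ssh_runner(host, command):
    """Run `command` on `host` with plain ssh; return (exit status, stdout)."""
    proc = subprocess.run(["ssh", host, command], capture_output=True, text=True)
    return proc.returncode, proc.stdout


def run_on_hosts(hosts, command, runner=ssh_runner, max_workers=32):
    """Run `command` on every host concurrently; return {host: runner result}."""
    if not hosts:
        return {}
    with ThreadPoolExecutor(max_workers=min(max_workers, len(hosts))) as pool:
        futures = {h: pool.submit(runner, h, command) for h in hosts}
        return {h: f.result() for h, f in futures.items()}


if __name__ == "__main__":
    # Hypothetical host names; a stand-in runner keeps the demo off the network.
    all_hosts = ["alpha01", "alpha02", "beta01"]
    chosen = select_hosts(all_hosts, r"^alpha")
    print(run_on_hosts(chosen, "hostname", runner=lambda h, c: (0, h)))
```

Passing the runner as a parameter keeps the sketch testable without real machines; leaving it at the default `ssh_runner` would contact the hosts over ssh.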
Some notable features
- Easy to install
- When you use GXP on many hosts (as you probably
do), you need to install it only on one host. GXP automatically copies
itself to remote hosts you run commands on, as necessary. It is
written in Python, so installation takes only a download and a PATH setting
(plus whatever your underlying remote-exec commands demand for
successful authentication).
- Fast in distributed environments
- GXP brings up daemon processes on
hosts you run commands on and keeps them running until you quit a
session. It does not require connection establishment and
authentication for every single command submission. This makes
operations fast, especially in distributed environments where some
hosts are remote and in large-scale environments with many hosts.
- Scalable
- Daemon processes are connected in a tree structure so as
to keep the number of processes directly connected to a single process
below a constant threshold. Thus GXP comfortably scales to hundreds of
processes without stressing the root host.
- Interactive
- GXP allows you to interactively specify hosts on which
you bring up daemons and then select hosts to run commands on. It
keeps these configurations in a session state. In a simple
cluster-like environment, you typically need only a few interactions
with GXP to run commands in parallel on cluster hosts. In more complex
and heterogeneous environments, you can write whatever configuration
is necessary as an ordinary shell script of the kind you are already familiar with.
- Useful for parallel processing
- GXP comes with a tool for running
so-called "embarrassingly parallel" tasks with very little learning and
setup. It does not need any other infrastructure (e.g., a batch
queueing system).
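The bounded fan-out mentioned under "Scalable" can be sketched as follows. This is only an illustration under stated assumptions, not GXP's code: how GXP actually chooses a parent for each daemon is not described here, so the sketch simply fills the tree breadth-first. The names `build_tree` and `depth`, the limit of 8 children, and the host names are hypothetical; host names are assumed to be unique.

```python
from collections import deque


def build_tree(hosts, max_children):
    """Arrange `hosts` in a tree rooted at hosts[0] in which no node has more
    than `max_children` direct children. Returns {parent: [children]}."""
    if max_children < 1:
        raise ValueError("max_children must be at least 1")
    children = {h: [] for h in hosts}
    if not hosts:
        return children
    pending = deque([hosts[0]])  # nodes that can still accept children
    for host in hosts[1:]:
        parent = pending[0]
        children[parent].append(host)
        pending.append(host)
        if len(children[parent]) == max_children:
            pending.popleft()  # this parent is full; move on to the next one
    return children


def depth(children, root):
    """Number of hops from `root` to its deepest descendant."""
    if not children[root]:
        return 0
    return 1 + max(depth(children, c) for c in children[root])


if __name__ == "__main__":
    hosts = ["h%03d" % i for i in range(200)]  # hypothetical host names
    tree = build_tree(hosts, max_children=8)
    print(max(len(c) for c in tree.values()), depth(tree, hosts[0]))
```

With a fixed limit on direct children, the depth of the tree grows only logarithmically with the number of hosts, so the root never has to hold one connection per host.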