What is Nebulous?

Nebulous is the file distribution program for pantasks. It takes a list of available hosts and distributes the output files from any pantasks processing across those hosts, so that no single disk is overloaded with file I/O.

How do I configure Nebulous?

In your site.config file, specify the nebulous server:

# nebulous server
NEB_SERVER STR http://ipp004/nebulous/ # Nebulous server

Then set up a metadata config file that tells pantasks where Nebulous is allowed to place files. This is the ipphosts.config file referenced in the server.pro script on the Pantasks_server_mode page.

How do I list the contents of a directory located by Nebulous?

Use neb-ls, which works like 'ls' for Nebulous.

Nebulous has assigned a path, how do I find the file?

For example, suppose you want to find some files that were processed with Nebulous file assignments:

+------------------------------------+---------+-------------------------------------------------------------------------------+
| workdir                            | chip_id | path_base                                                                     |
+------------------------------------+---------+-------------------------------------------------------------------------------+
| neb://@HOST@.0/gpc1/scitest.200807 |     293 | neb://ipp009.0/gpc1/scitest.200807/o4642g0414o.19490/o4642g0414o.19490.ch.293 |
| neb://@HOST@.0/gpc1/scitest.200807 |     293 | neb://ipp011.0/gpc1/scitest.200807/o4642g0414o.19490/o4642g0414o.19490.ch.293 |
| neb://@HOST@.0/gpc1/scitest.200807 |     294 | neb://ipp009.0/gpc1/scitest.200807/o4642g0415o.19491/o4642g0415o.19491.ch.294 |
+------------------------------------+---------+-------------------------------------------------------------------------------+

If you know the full Nebulous name of the file (including the extension), pass it to ipp_datapath.pl, which prints the file's location on disk:

ipp_datapath.pl neb://ipp009.0/gpc1/scitest.200807/o4642g0414o.19490/o4642g0414o.19490.ch.293.XY10.ch.fits

/data/ipp006.0/nebulous/ae/ed/3145232.gpc1:scitest.200807:o4642g0414o.19490:o4642g0414o.19490.ch.293.XY10.ch.fits

Alternatively, the program neb-locate is slightly faster. Use the --path argument to return just the path to the file of interest, without the "file://" prefix (useful for scripting).
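
As a rough sketch of the scripting use mentioned above, the function below resolves several Nebulous names in one go. The function name neb_paths and the NEB_LOCATE_CMD override are made up for this illustration and are not part of Nebulous. The argument order (--path followed by the Nebulous name) is an assumption, and neb-locate may also need to know which server to use.

```shell
# Hypothetical helper: print the on-disk path for each Nebulous name given.
# NEB_LOCATE_CMD is an illustrative override (handy for dry runs);
# it is not a variable that Nebulous itself defines.
neb_paths() {
    for uri in "$@"; do
        # --path asks for the bare path, without the "file://" prefix.
        # Assumed invocation: neb-locate --path <nebulous name>
        "${NEB_LOCATE_CMD:-neb-locate}" --path "$uri" || return 1
    done
}

# Example call (needs a working Nebulous installation):
#   neb_paths neb://ipp009.0/gpc1/scitest.200807/o4642g0414o.19490/o4642g0414o.19490.ch.293.XY10.ch.fits
```

The function stops at the first name that fails to resolve, so a calling script can check its exit status.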

neb-locate says that "--server" is a required option, but I already set the NEB_SERVER in site.config!

A workaround is just to give it what it wants. Put this in your .tcshrc or equivalent:

alias neb-locate "neb-locate --server http://ipp004/nebulous/"

(Replace the http://ipp004/nebulous/ URL with the URL of whatever machine is running your Nebulous server.)

A more convenient solution is to set the environment variable NEB_SERVER to the URL of your Nebulous server.
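
A minimal sketch of setting the variable, assuming the same example server URL used above (substitute your own). The export form is for bash, sh, or zsh; the tcsh form is shown as a comment.

```shell
# bash/sh/zsh: put this in ~/.bashrc or equivalent.
# The URL is the example server from this page; use your own.
export NEB_SERVER="http://ipp004/nebulous/"

# tcsh/csh equivalent, for ~/.tcshrc:
#   setenv NEB_SERVER http://ipp004/nebulous/
```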

A pantasks process failed, where did Nebulous put the .trace and .log files?

First, figure out the base path of the output files you are looking for. It may appear in the error stream on pantasks, or you can query the MySQL database, dig it out of ippMonitor, or possibly use one of the ippTools. Once you have path_base, you need to determine the extension of your trace or log file. If a chip-level process failed, you need to know which chip; this is stored in the "class_id" column of the chipProcessedImfile table in MySQL. Then tack on the .log or .trace extension to complete the query.

Here is an example: