11.3. File type plugins
11.3.1. Extending ATHENA to read new file types
ATHENA uses IFEFFIT's read_data() function or LARCH's read_ascii() function to import data. This means that ATHENA's notion of an acceptable data format is identical to IFEFFIT's (or LARCH's) notion. Put the other way around, if IFEFFIT (or LARCH) can read a data file, so can ATHENA.
In practice, this works great. IFEFFIT is able to read the data files generated by many of the world's XAS beamlines. And so, consequently, is ATHENA. Sadly, there are many beamlines that use a format that confounds IFEFFIT and ATHENA. LARCH is rather more intelligent, but is still unable to read some of the wackier file types. There are two obvious ways that I could deal with data from those beamlines:
- Refuse to deal with them and require the user to transform the data into a form that IFEFFIT can handle.
- Hard-wire code into ATHENA to deal with each new data format as I become aware of it.
Neither of those is particularly user-friendly. Instead, ATHENA relies on a plugin architecture that allows it to be extended on the fly to handle new data formats without changes to the underlying code.
This page documents the plugin architecture so that ATHENA's users can write their own file type plugins.
11.3.2. Overview of how plugins work
In simple language, a plugin is a short file containing special perl code placed in a special location. ATHENA uses the code contained in that file to recognize and pre-process data files so that they can be imported properly using IFEFFIT or LARCH.
In somewhat more technical language, a plugin is just a perl module placed on your computer in a place where ATHENA can find it. The module is loaded when ATHENA starts and its methods are available whenever data are imported.
When a plugin is available for use, it is invoked every time a file is imported into ATHENA using the Open file function. The new file is checked using one of the plugin's methods to ascertain if the file is of the sort serviced by the plugin. If the file is recognized, another method in the plugin transforms the original data file into a form that is readable by IFEFFIT or LARCH. This transformation is done in a way that leaves the original data file unchanged.
If the transformation is successful, the user is presented with ATHENA's column selection dialog and can import data in the normal manner. Ideally, a plugin is written in a way that makes the import of the data into ATHENA a completely transparent process for the user.
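The recognize-then-transform flow described above can be sketched in a few lines of plain perl. This is only an illustration under stated assumptions: Toy::Plugin::CommentHeader and import_with_plugins are hypothetical names, the "file" is just a string, and real plugins are Moose classes extending Demeter::Plugins::FileType rather than hand-rolled packages.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a plugin: a plain package with is() and fix().
package Toy::Plugin::CommentHeader;

sub new { my ($class, %args) = @_; return bless {%args}, $class }

# Recognize the data (here just a string) by the start of its first line.
sub is  { my ($self) = @_; return $self->{text} =~ /^EXAFS/ ? 1 : 0 }

# Return a transformed copy; the original text is never modified.
sub fix { my ($self) = @_; return "# " . $self->{text} }

package main;

# Try each plugin in order; the first whose is() returns true gets to fix().
sub import_with_plugins {
    my ($text, @plugin_classes) = @_;
    for my $class (@plugin_classes) {
        my $plugin = $class->new(text => $text);
        return $plugin->fix if $plugin->is;
    }
    return $text;    # no plugin claimed the data: pass it through unchanged
}

my $original = "EXAFS scan\n1 2 3\n";
my $fixed    = import_with_plugins($original, 'Toy::Plugin::CommentHeader');
print $fixed;
```

Data that no plugin recognizes is handed on unchanged, which mirrors the idea that a well-behaved plugin is invisible to the user.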
11.3.3. Example plugin
Here is a complete example of a functional plugin taken from the DEMETER distribution. This plugin allows ATHENA to import files from NSLS beamline X10C. As you can see, the plugin is quite short. The following sections of this page will explain this example in detail.
package Demeter::Plugins::X10C;

use Moose;
extends 'Demeter::Plugins::FileType';

has '+is_binary'    => (default => 0);
has '+description'  => (default => "NSLS beamline X10C");
has '+version'      => (default => 0.1);
has '+metadata_ini' => (default =>
                        File::Spec->catfile(File::Basename::dirname($INC{'Demeter.pm'}),
                                            'Demeter', 'share', 'xdi', 'x10c.ini'));

sub is {
  my ($self) = @_;
  open D, $self->file or $self->Croak("could not open " . $self->file . " as data (X10C)\n");
  my $first = <D>;
  close D, return 0 unless (uc($first) =~ /^EXAFS/);    # first line must begin with EXAFS
  while (<D>) {
    close D, return 1 if (uc($_) =~ /^\s+DATA START/);  # test the current line, not $first
  };
  close D;
  return 0;                                             # never found the DATA START line
};

sub fix {
  my ($self) = @_;
  my $new = File::Spec->catfile($self->stash_folder, $self->filename);
  ($new = File::Spec->catfile($self->stash_folder, "toss")) if (length($new) > 127);
  open D, $self->file or die "could not open " . $self->file . " as data (fix in X10C)\n";
  open N, ">".$new or die "could not write to $new (fix in X10C)\n";
  my $header = 1;
  my $null = chr(0).'+';
  while (<D>) {
    $_ =~ s/$null//g;                       # clean up nulls
    print N "# " . $_ if $header;           # comment headers
    ($header = 0), next if (uc($_) =~ /^\s+DATA START/);
    next if ($header);
    $_ =~ s/([eE][-+]\d{1,2})-/$1 -/g;      # clean up 5th column
    print N $_;
  };
  close N;
  close D;
  $self->fixed($new);
  return $new;
}

sub suggest {
  my ($self, $which) = @_;
  $which ||= 'transmission';
  if ($which eq 'transmission') {
    return (energy      => '$1',
            numerator   => '$4',
            denominator => '$6',
            ln          => 1,);
  } else {
    return ();
  };
};

__PACKAGE__->meta->make_immutable;
1;
11.3.4. Namespace
The module must be in a particular namespace. The namespace is defined by the package statement on the first line of the example. The package must be below the Demeter::Plugins namespace and should have a name that is descriptive of the format it is made for. In the case of the example, the plugin is intended to transform files from NSLS beamline X10C, so the full namespace of the module is Demeter::Plugins::X10C. The use Moose and extends lines at the top, together with the two closing lines (__PACKAGE__->meta->make_immutable and 1;), are requisite boilerplate which allows this module to work properly with DEMETER and ATHENA.
11.3.5. Required methods and variables
The plugin must supply three methods and must set several attributes of the Plugin object.
11.3.5.1. required attributes
The has declarations near the top of the example set the required attributes in a way that allows them to be accessed outside the scope of this module.
is_binary
- A boolean that tells ATHENA whether the input file is in a text or binary format. ATHENA handles binary files slightly differently in the column selection dialog.
description
- A short text string describing the purpose of this plugin. This string will be displayed in the plugin registry. This description should be no more than a few dozen characters.
version
- The numeric version of the plugin.
metadata_ini
- The file in the share/xdi/ folder that contains metadata common to the beamline and facility.
headers
- A reference to a hash containing additional metadata related to the work done by the plugin.
11.3.5.2. the is method
The is method is the first subroutine in the example. This method is called by ATHENA to try to recognize an input data file as being of a particular format. In the case of this example, the X10C file is recognized by some of the text in the first few lines of the file. When the file is recognized, this method returns a true value. If the test fails, it returns 0. When ATHENA sees the true return value, it applies the fix method to transform the data file into an IFEFFIT- or LARCH-friendly format.
It is quite important that the is method be fast. It is possible that a data file will have to be tested against a large number of plugins. If the is method is slow, file import will be slow.
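One way to keep recognition cheap is to reject on the first line whenever possible and to cap how far into the file the test will read. The following is a minimal stand-alone sketch of that idea, not code from DEMETER: looks_like_x10c is a hypothetical helper, and the default cap of 40 lines is an assumed value chosen only for illustration.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: return 1 if $path looks like an X10C file, else 0.
# $max_lines caps how many lines after the first we are willing to read.
sub looks_like_x10c {
    my ($path, $max_lines) = @_;
    $max_lines //= 40;    # assumed cap, not a value taken from DEMETER
    open(my $fh, '<', $path) or return 0;
    my $first = <$fh>;
    # Cheapest test first: most foreign files are rejected after one line.
    unless (defined $first and uc($first) =~ /^EXAFS/) {
        close $fh;
        return 0;
    }
    my ($found, $count) = (0, 0);
    while (my $line = <$fh>) {
        if (uc($line) =~ /^\s+DATA START/) { $found = 1; last }
        last if ++$count >= $max_lines;    # give up rather than scan a huge file
    }
    close $fh;    # always close, whatever the outcome
    return $found;
}

# Demonstration with a throw-away file.
my ($out, $demo) = tempfile(UNLINK => 1);
print $out "EXAFS data\nsome header\n   DATA START\n1 2 3\n";
close $out;
print looks_like_x10c($demo) ? "recognized\n" : "not recognized\n";
```

The cap trades a small risk of missing a file with an unusually long header for a bounded cost per plugin, which matters when many plugins are enabled.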
11.3.5.3. the fix method
The fix method is the second subroutine in the example. This method is called when the is method returns true. In some manner it makes a copy of the original data file and transforms that copy into a form that can be read by IFEFFIT or LARCH. This method needs to follow a number of strict rules. Within those rules, however, there is a lot of flexibility about how the transformation is accomplished and the scope of what that transformation does to the data.
First and most important, never alter the original data! Either work on the contents of the original file in memory or make a copy of the data, preferably in the stash folder (a folder known to DEMETER as a place for writing scratch files). Near the top of the fix method, we see that a file is opened in the stash folder for holding the transformed data. As the data is processed, the output is written to that file (see the two print statements inside the while loop).
Do whatever chore needs doing to transform the portion of the original data file that needs attention. Afterwards, close both the input and output files. It is essential that the files be closed, particularly on Windows, which locks opened files from other uses.
Finally, set the fixed attribute of the object to the path and name of the transformed file and return that same string.
In the example given on this page, the first thing the fix method does is to create a file name in the stash directory for the transformed file. The first catfile statement tells ATHENA to give the stash file the same name as the original file (before calling this method, ATHENA sets the filename attribute appropriately) but in the stash directory (the catfile method builds a fully resolved filename in a platform-transparent manner). The next statement checks the length of the fully resolved filename, falling back to a short name if it exceeds 127 characters, to avoid running into one of IFEFFIT's internal limitations.
Three things are done to transform an X10C file. The header is stripped of null characters, the header is commented out by putting # characters in the first column, and a formatting problem in some files involving a lack of white space between columns is resolved. Each line of the original file is read, operated on, and written to the transformed file in the stash directory. The while loop reads through the file line by line and performs the operations.
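The three operations can be tried in isolation on a few in-memory lines. This is a sketch that mirrors the regular expressions of the example plugin; fix_lines is a hypothetical helper and the sample lines are made up for illustration, not taken from a real X10C file.

```perl
use strict;
use warnings;

# Hypothetical helper: apply the example's three clean-up steps to a list
# of raw lines and return the transformed copies. The inputs are untouched.
sub fix_lines {
    my @raw = @_;
    my @out;
    my $header = 1;
    for my $line (@raw) {
        (my $copy = $line) =~ s/\0+//g;          # 1. strip null characters
        if ($header) {
            push @out, "# " . $copy;             # 2. comment out header lines
            $header = 0 if uc($copy) =~ /^\s+DATA START/;
            next;
        }
        $copy =~ s/([eE][-+]\d{1,2})-/$1 -/g;    # 3. separate fused columns
        push @out, $copy;
    }
    return @out;
}

# Made-up sample: a header with embedded nulls and a data line in which a
# negative number runs into the exponent of the preceding column.
my @fixed = fix_lines(
    "EXAFS\0\0 scan\n",
    "   DATA START\n",
    "8000.0 1.5E+05-2.0E+04\n",
);
print @fixed;
```

The third substitution works because a minus sign directly after a complete exponent (such as E+05) can only be the sign of the next number, so inserting a space there is safe.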
The two close statements after the loop close the new and original file handles. A plugin should always close its file handles. This is not such a huge issue under unix, but Windows places a lock on any open file handle. If you fail to close one, no other process will be able to do anything with that file for as long as ATHENA is running.
In its last statement, the method returns the fully resolved name of the transformed file. At no point was the original file altered. When ATHENA exits, it will clean up the stash directory, thus avoiding a pile-up of unnecessary data files.
DEMETER ships with a number of different kinds of plugins. Some of them perform simple, linear transformations (like this one). Others interpret binary data. A couple export project files rather than data files. One even performs an on-the-fly deadtime correction for data from an energy dispersive detector. Examine them for hints about how to create your own plugins.
11.3.5.4. the suggest method
The suggest method is the last subroutine in the example. It provides feedback that the column selection dialog uses in selecting initial guesses for the columns containing the numerator and denominator of the data. In this case, the method suggests columns for transmission data but makes no suggestions for fluorescence data.
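To make the shape of the return value concrete, here is a small stand-alone sketch. The suggest function copies the logic of the example without the object, while describe is a hypothetical consumer written only for illustration; how ATHENA's column selection dialog actually uses these values is not specified on this page.

```perl
use strict;
use warnings;

# Stand-alone copy of the example's suggest logic (no $self needed here).
sub suggest {
    my ($which) = @_;
    $which ||= 'transmission';
    return (energy => '$1', numerator => '$4', denominator => '$6', ln => 1)
        if $which eq 'transmission';
    return ();    # empty list: no suggestion for this kind of data
}

# Hypothetical consumer: summarize the suggestion as text, or report that
# the plugin has no opinion.
sub describe {
    my ($which) = @_;
    my %hint = suggest($which);
    return "no suggestion" unless %hint;
    (my $num = $hint{numerator})   =~ s/^\$//;    # '$4' becomes '4'
    (my $den = $hint{denominator}) =~ s/^\$//;
    my $ratio = "column $num / column $den";
    return $hint{ln} ? "ln($ratio)" : $ratio;
}

print describe('transmission'), "\n";
print describe('fluorescence'), "\n";
```

Returning an empty list for an unsupported data type lets the caller distinguish "no suggestion" from a real answer with a simple truth test on the hash.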
11.3.6. ATHENA's plugin registry
Because there might be a large number of file type plugins, it is possible for the user to turn the checks for the file types on and off. In the main menu, you will find the Plugin Registry. This is a simple list of all plugins found in the system and user directories. The check buttons enable and disable the plugins. The value of the description attribute is displayed in the list (so be sure to choose a suitable and suitably short value for that variable).
Note that the order in which the plugins are displayed above is the same order in which files are checked against the plugins. User plugins are checked before system plugins. After that the plugins are ordered alphabetically. If you want your system plugins to be checked against the data first, choose a name that comes early in the alphabetical sense.
Right-clicking on an item in the registry posts the context menu shown in the figure above. All such context menus have at least one item for reading the documentation contained in the plugin source code file. Some plugins, such as the one shown, also provide a way of configuring the behavior of the plugin.
11.3.7. System plugins and user plugins
ATHENA looks in two different places for these plugins. One is ATHENA's installation location, where it finds the plugins that come with the horae distribution. The other is in the user's space (on Windows, plugins are located in C:\Program File\Ifeffit\horae\Ifeffit\Plugins\Filetype\Athena\; on unix, in $HOME/.horae/Ifeffit/Plugins/Filetype/Athena/). In both places, it reads the contents of the plugin directory and attempts to import the files whose names end in .pm.
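The discovery step, combined with the checking order given in the registry section (user plugins first, then alphabetical), might look roughly like the sketch below. It is an assumption-laden illustration rather than ATHENA's actual code: find_plugins is a hypothetical helper and temporary directories stand in for the real system and user locations.

```perl
use strict;
use warnings;
use File::Basename qw(basename);
use File::Spec;
use File::Temp qw(tempdir);

# Hypothetical helper: list plugin files, user directory first, then the
# system directory, each group sorted alphabetically by file name.
sub find_plugins {
    my ($user_dir, $system_dir) = @_;
    my @found;
    for my $dir ($user_dir, $system_dir) {
        next unless -d $dir;
        opendir(my $dh, $dir) or next;
        my @pm = grep { /\.pm\z/ } readdir $dh;    # keep only *.pm files
        closedir $dh;
        @pm = sort @pm;
        push @found, map { File::Spec->catfile($dir, $_) } @pm;
    }
    return @found;
}

# Demonstration with throw-away directories standing in for the real ones.
my $user   = tempdir(CLEANUP => 1);
my $system = tempdir(CLEANUP => 1);
for my $spec ([$system, 'X10C.pm'], [$system, 'Notes.txt'], [$user, 'Mine.pm']) {
    open(my $fh, '>', File::Spec->catfile(@$spec)) or die "cannot write: $!";
    close $fh;
}
print basename($_), "\n" for find_plugins($user, $system);
```

Files that do not end in .pm, such as Notes.txt in the demonstration, are simply ignored.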
11.3.8. Miscellaneous advice on plugins
- Cut-n-paste is an excellent way to get started on a new plugin. Make a copy of a plugin for a file that is similar to your own file and use that as the basis for your new plugin. X15B.pm is an example of a plugin for a binary format.
- You can use any module that you need, thus you have all of CPAN available to you when designing your plugin. If you need to do any seriously heavy lifting, check out the Math::Pari module or the Perl Data Language.
- Although a well-tested, robust plugin should be your goal, one of the nice features of the plugin architecture is that a “good-enough” plugin is easy to write and can quickly get you over a hurdle.
DEMETER is copyright © 2009-2016 Bruce Ravel – This document is copyright © 2016 Bruce Ravel
This document is licensed under The Creative Commons Attribution-ShareAlike License.
If DEMETER and this document are useful to you, please consider supporting The Creative Commons.