II. SOFTWARE FUNCTIONS
II.1 Using the invert.p processing file
II.2 Using Matlab functions for detailed processing
Loading your files
Viewing the raw data files
Viewing partial results with one command
Performing the inversion
Computing mean and median particle size
III. SPECIAL PROCESSING METHODS
III.1 Discarding problem data
III.2 Using a local data record as a background file
III.3 Using the optical path reduction modules (PRM)
IV. THE REASONABLENESS TEST
V. TO CONCLUDE
The formats of LISST-100 and LISST-100X data are identical. The instruments store a set of 40 variables in each data record. The first 32 of these are outputs of 32 ring detectors, each of which measures scattering of laser light from particles into a specific narrow sub-range of angles. The only difference between LISST-100 and LISST-100X is that the LISST-100X automatically switches to 10 times higher sensitivity when particle concentration is low. To keep data processing consistent between the two gains (standard or 10x), when operating at standard sensitivity the data are stored after multiplication by 10. This has an important implication for the MATLAB data processing as we shall see below. The other variables that are stored are, respectively, laser power transmitted through water (variable 33), battery voltage (variable 34), external analog voltage (variable 35), laser reference power (variable 36), pressure (variable 37), temperature (variable 38), and 2 variables which contain time information. These are variable 39: day number*100+hr, and variable 40: minutes*100+seconds.
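For readers who pre-process raw records outside MATLAB, the two time variables decode with simple integer arithmetic. Below is a minimal Python sketch of that unpacking (the function name is ours, not part of the LISST software):

```python
def decode_time(v39, v40):
    """Unpack LISST time words: v39 = day*100 + hour, v40 = minutes*100 + seconds."""
    day, hour = divmod(int(v39), 100)
    minute, second = divmod(int(v40), 100)
    return day, hour, minute, second

# Example: v39 = 12714 -> day 127, hour 14; v40 = 3059 -> minute 30, second 59
print(decode_time(12714, 3059))  # (127, 14, 30, 59)
```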
This write-up explains how to process the data from the rings. [A detailed step-by-step guide is offered in section II.] To process these data, two pieces of information are required: a measure of the background light scattering from optical surfaces, and the field data from turbid water. The background (also called zscat) MUST BE measured prior to each experiment. The field data is the sum of the background and the contributions from particles. Both the background and the scattering from particles are attenuated in water. Consequently, the data must be de-attenuated; for this purpose, the transmission measurement is included in the data stream. Transmission t is computed as the ratio of laser power transmitted through water in turbid conditions to its value in clean conditions – i.e. the ratio of variable 33 measured in situ to its value in clean water. Since laser output can drift over time, a correction for drift is applied using the laser reference sensor. If r is the ratio of laser transmitted power to laser reference power in clean water, then despite any drift in laser output, the clean-water transmitted power would be the product of the laser reference and r, i.e. r * variable 36. It then follows that the transmission for the ith data record in situ is
t = data(i,33)/[r*data(i,36)] (1)
One of the most useful steps in processing data is to view the time series of t through the experiment. This can reveal periods of sediment events, bio-fouling, and instrument health (e.g. t > 1 is not physically possible, so it must indicate something like a bad background file, unusually clear water relative to the background, or an instrument fault).
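As an illustration of Eq. 1 for readers working outside MATLAB, here is a NumPy sketch (array and function names are our own; note Python's 0-based indexing, so variable 33 is column 32):

```python
import numpy as np

def transmission(data, zsc):
    """Eq. 1: t = data(:,33) / (r * data(:,36)), with r = zsc(33)/zsc(36)."""
    r = zsc[32] / zsc[35]                 # 0-indexed: variable 33 is column 32
    return data[:, 32] / (r * data[:, 35])

# Toy background record and two toy data records (only variables 33 and 36 set)
zsc = np.array([0.0] * 32 + [1000.0, 0, 0, 500.0, 0, 0, 0, 0])
data = np.array([[0.0] * 32 + [900.0, 0, 0, 500.0, 0, 0, 0, 0],
                 [0.0] * 32 + [450.0, 0, 0, 500.0, 0, 0, 0, 0]])
t = transmission(data, zsc)   # t[0] = 0.9, t[1] = 0.45; t > 1 would flag trouble
print(t)
```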
Once the transmission time series is computed, the net particle scattering can be obtained. Depending on whether the instrument is a LISST-100X or a LISST-100, a division by 10 may first be needed to adjust the ring outputs:
data(:,1:32)= data(:,1:32)/10; if the instrument is a LISST-100X. (2)
The net scattering is computed by subtracting the background from the total. The background, if acquired using the Windows software and saved as an ASCII background file, is already stored as a standard-gain measurement (i.e. no 10x gain), regardless of the instrument type. If, however, you acquired a binary file as a background file, then the background should also be reduced, as in equation (2). You can tell a binary background file from an ASCII one by its extension: binary files have the .dat extension, while ASCII files have the .asc extension. Refer to the manual on how to acquire these data files.
As examples, if the background file is called background.asc, load it with:
background = load('background.asc');
However, if the background file was a binary file acquired with a LISST-100X, backgroundfile.dat, you must load it and divide by 10 as:
background = tt2mat('backgroundfile.dat',40);
background(1:32) = background(1:32)/10; (3)
Variables 33 to 40 are not multiplied by 10, so no division is ever required. The next step is to compute the net particle scattering, after de-attenuating and subtracting the background:
scattering(i,1:32) = data(i,1:32)/t(i) - background(1:32)*data(i,36)/background(36) (4)
In Eq. 4, the data are first de-attenuated, then the background is corrected for drift in the laser reference sensor itself over time, and finally subtracted from the de-attenuated data.
Next, one applies the ring area correction file, provided with names such as Ringarea_2023.asc. To keep variable names short, one loads this file as:
dcal = load('Ringarea_2023.asc');
and the corrected scattering (corrected for variations in the areas of the detector rings from ideal) is computed as:
c_scat(i,:)= dcal.* scattering(i,:); (5)
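Eqs. 1, 4 and 5 together amount to a few array operations. The NumPy sketch below is our own paraphrase for readers outside MATLAB, not Sequoia code; it assumes data and the background record zsc are already gain-adjusted per Eqs. (2)/(3), and dcal is the 32-element ring area array:

```python
import numpy as np

def corrected_scattering(data, zsc, dcal):
    """Eqs. 1, 4, 5: de-attenuate, subtract drift-corrected background, apply ring areas."""
    r = zsc[32] / zsc[35]                  # clean-water power ratio (0-indexed cols)
    tau = data[:, 32] / (r * data[:, 35])  # Eq. 1, per record
    drift = data[:, 35] / zsc[35]          # laser-reference drift factor
    scat = data[:, :32] / tau[:, None] - zsc[:32] * drift[:, None]  # Eq. 4
    return dcal * scat                     # Eq. 5

# Toy inputs: 32 ring values followed by variables 33-40
zsc = np.array([10.0] * 32 + [1000.0, 0, 0, 500.0, 0, 0, 0, 0])
data = np.array([[18.0] * 32 + [900.0, 0, 0, 500.0, 0, 0, 0, 0]])
dcal = np.ones(32)                         # ideal ring areas for the toy case
cscat = corrected_scattering(data, zsc, dcal)
print(round(cscat[0, 0], 6))  # 10.0  (18/0.9 - 10, with tau = 0.9)
```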
Once the net scattering is computed, as in Eq.(4), the next step is to invert the data to generate size distribution, which we also call volume distribution since it is actually the volume concentration of particles in the different size classes. This step calls the proprietary routine invert.p and is computed as follows:
volume_dist(i,1:32) = invert(c_scat(i,:),instrument_type,ST,RANDOM,SHARPEN,GREEN,WAITBARSHOW)/Vcc (6)
From the volume distribution for the ith data record above, the total concentration is computed by simple summation. The variable Vcc in Eq. (6) is the Volume Conversion Constant. This is a calibration constant provided with the instrument and can be found in the InstrumentData.txt file.
Note that Sequoia leaves it to the user to convert volume concentration to mass. This is because the LISST-100 and LISST-100X instruments do not measure mass density (for this purpose, the LISST-ST may be used). Mass density can vary from 1.01 g/cm³ for extremely light biological particles to 2.65 g/cm³ for sand- and clay-type materials. Users should consult the published literature for the density appropriate to their case.
The functionality described above is all performed automatically in the Windows software provided by Sequoia. If you're interested in processing your files in MATLAB, read on. The invert.p function, in concert with getscat.m, can perform all operations blindly; the user only needs to specify the filenames and some constants. Alternately, the user may view details of each file, in which case a file must first be loaded into Matlab and then viewed. If the file is binary (.dat), the tt2mat.m function is invoked. If the file is ASCII, a simple load command is used. Below, we describe these functionalities separately.
You will need 3 files from Sequoia: getscat.m, tt2mat.m and invert.p. They can be downloaded from the library section on our website. Place them all in your MATLAB working folder.
The first step is to convert the binary .DAT file with scattering data into corrected scattering data (cscat). This is accomplished using getscat.m:
[scat,tau,zsc,data,cscat] = getscat('datafile','zscfile',X,'ringareafile');
‘datafile’ is the path and file name for the binary .DAT file offloaded from your LISST-100 or -100X instrument.
‘zscfile’ is the path and file name for your zscat (background) file, typically obtained using the Windows SOP.
‘ringareafile’ is the path and file name for your instrument specific ringarea_xxxx.asc file, where xxxx is the serial number of your LISST instrument. This file can be found on your ship disk.
Set X=1 if your data were obtained with a LISST-100X; use any other value for LISST-100 format.
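The gain handling behind the X flag is just a conditional division of the ring columns (cf. Eq. 2). Here is a NumPy sketch of the idea for readers outside MATLAB (our own illustration, not getscat.m itself):

```python
import numpy as np

def adjust_gain(data, is_100x):
    """Divide the 32 ring columns by 10 for LISST-100X binary data (cf. Eq. 2)."""
    data = np.array(data, dtype=float)   # work on a copy
    if is_100x:
        data[:, :32] /= 10.0             # rings only; variables 33-40 are untouched
    return data

raw = np.full((2, 40), 100.0)            # two toy records
print(adjust_gain(raw, True)[0, 0])      # 10.0
print(adjust_gain(raw, False)[0, 0])     # 100.0
```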
getscat.m calls the routine tt2mat.m in order to read and convert your raw data to corrected scattering data. Once that is done, you can proceed with calling invert.p, which is a MATLAB function that returns the uncalibrated volume distribution and the midpoints of the size bins in µm. The general syntax is as follows:
[vd,dias] = invert(cscat,instrument_type,ST,RANDOM,SHARPEN,GREEN,WAITBARSHOW);
cscat is the fully corrected scattering data in n x 32 format, obtained using getscat.m
instrument_type is the type of your LISST instrument:
1 for type A (5-500 µm size range)
2 for type B (1.25-250 µm size range)
3 for type C (2.5-500 µm size range)
4 for FLOC (7.5-1500 µm size range)
ST = 1 if the data are to be inverted in LISST-ST format (8 size bins)
ST = 0 if the data are to be inverted in LISST-100/LISST-100X format (32 size bins)
RANDOM = 1 if matrices based on scattering from randomly shaped particles are to be used for inversion. NOTE: Only type B and C instruments are supported for RANDOM = 1.
RANDOM = 0 if matrices based on scattering from spherical particles are to be used for inversion.
SHARPEN = 1 causes the routine to check if the size distribution is narrow and, if so, increases the number of inversions. Use this setting if you expect a narrow size distribution (e.g. if you are analyzing narrow-size standard particles).
GREEN = 1 if inversion is for a green laser unit (only for type B instruments as of September 2010)
WAITBARSHOW = 1 if user wants a waitbar to show during processing in order to keep track of progress.
vd – volume distribution (NOT CALIBRATED WITH VCC)
dias – the midpoint of the size bins for the 8 / 32 size classes for the appropriate instrument, inversion type and laser color
In order to convert the volume distribution into calibrated units, you must divide vd by the Volume Conversion Constant (VCC), which can be found as the 4th number in the InstrumentData.txt file that was provided with your instrument (on your ship disk). This is done using the script vdcorr.m
The syntax for converting vd to calibrated vd is:
vd = vdcorr(vd,VCC,flref,lref);
vd is the vd from invert.p
VCC is the Volume Conversion Constant for your instrument
flref is the factory laser reference value for your instrument (element 36 in your factory_zsc file).
lref is the laser reference value during measurement. For LISST-100X, LISST-STX and LISST-Deep it is element 36 in the .DAT file. If you have only one measurement in your .DAT file you can specify this as data(36), otherwise specify it as data(:,36).
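As a minimal sketch of the calibration step for readers outside MATLAB (our own illustration, not vdcorr.m itself — only the division by VCC described above is shown; vdcorr.m additionally uses flref and lref to correct for laser drift):

```python
import numpy as np

def calibrate_vd(vd, vcc):
    """Divide the raw volume distribution by the Volume Conversion Constant."""
    return np.asarray(vd, dtype=float) / vcc

vd_raw = np.array([[10.0, 20.0, 30.0]])   # toy uncalibrated 3-bin distribution
print(calibrate_vd(vd_raw, 2.0))          # [[ 5. 10. 15.]]
```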
The power of this function is that it processes all data at once, producing the volume distribution for the entire data file. Once done, you can proceed with plotting your results.
The weakness of this function is that the user does not see intermediate results. The invert.p processing script assumes that the data and background files are perfect. Often this is the case. However, scientists are always well advised to look at their data in detail. For this, several functions are provided that do partial processing. These are described next.
The most basic function is tt2mat.m. It reads a binary file (background file or data file, .dat extension). It is not suitable for ASCII files. To use:
data = tt2mat('datafilename',40);
However, in case you are loading a LISST-100X binary data file one further step is necessary:
data(:,1:32) = data(:,1:32)/10;
This compensates for the factor of 10 applied to the ring values before they are stored on the data logger (cf. the introduction).
Similarly, to read a binary background file:
zsc = tt2mat('background_file_name',40);
(Again, if you are loading a LISST-100X binary background file you must compensate for the multiplication by a factor of 10:
zsc(1:32) = zsc(1:32)/10;)
The resulting variables now have a dimension of (row,40) where row is the number of records in the binary file. To compute a mean, as you would want to do with the zsc from a binary file, type
zsc = mean(zsc);
When reading an ASCII file (can be .asc or .log extension), use the following command:
data = load('datafilename');
zsc = load('background_file_name');
Specify the full file name, including the extension. Note that now, one need not specify the number of variables per record, 40. The ASCII files are typically created by processing the binary files in the LISST-SOP Windows software and DO NOT need to be divided by 10, as the Windows software already performs this operation.
It is of course possible to load a binary data file and an ASCII background file (or vice versa) and use them together. In this case, for a LISST-100X:
data = tt2mat('datafilename.dat',40);
data(:,1:32) = data(:,1:32)/10;
zsc = load('background_file_name.asc');
These functions permit you to look at the details of your data. For example, you may look at the time series of the background file (only if this is a binary file; the .asc file is already averaged before being written to disk). Viewing can be done by typing:
plot(zsc(:,1:32));
or
plot(zsc(:,1:32)');
The first of these displays a time series for each of the 32 ring detectors; the latter shows each record as a background pattern across the 32 rings. In the time series, spikes in individual rings may be revealed, suggesting bubbles or contamination while collecting the background file. In the second display, the background pattern and its variability during data acquisition are revealed. A background file should not show high variability (it should be less than a few counts); variability in backgrounds comes from contamination. For example, you may plot the standard deviation of the background light on each of the 32 rings with a simple command:
plot(std(zsc(:,1:32)));
As noted, if the background file is of the .asc or .log type, it is only a 40-variable file. The variables are the averages of multiple scans during data acquisition. This command will not work with zsc from an ASCII file.
You may use the same commands to view a time series of your in-situ data file. Generally, we advise that you first look at the laser transmission and laser reference, i.e. variables 33 and 36, as follows:
plot(data(:,33));
plot(data(:,36));
You can plot these variables simultaneously with any others, e.g.:
plot(data(:,[33 36]));
will show the laser transmitted and reference power time series. Matlab follows a color scheme when multiple variables are plotted. The color scheme follows the order: blue, green, red, cyan…
To compute and plot the optical transmission for your data series,
r = zsc(33)/zsc(36);
Use this form after zsc is loaded as an ASCII file, or after taking the mean of the variable zsc if loading the background binary file. Now, the transmission is (cf. Eq. 1):
tau = data(:,33)./r./data(:,36);
Plotting the time series of your transmission record is simple:
plot(tau);
or, to put symbols at the data points (see the Matlab guide for allowed symbols):
plot(tau,'o');
To compute the net scattering by de-attenuating the measurements and subtracting the background (cf. Eq. 4):
[row,col] = size(data);
for i = 1:row
    scat(i,:) = data(i,1:32)/tau(i) - zsc(1:32)*data(i,36)/zsc(36);
end
The getscat.m function performs all these steps automatically. To use, input the data file name, the background (zscat) file name and the ring area file name. The latter is a file provided with each instrument called RingArea_XXXX.asc, where the X's are replaced with your instrument's serial number. A multiplication with this array of 32 numbers replaces the scat variable with a corrected scat variable, called cscat (cf. Eq. 5). To use:
[scat,tau,zsc,data,cscat] = getscat('datafile','zscfile',X,'ringareafile');
where X=1 implies data in LISST-100X format, necessitating division by 10. If X is 0, no division is performed.
The final step is to invert the corrected scattering (cf. Eq. 6):
For a standard LISST-100X type C instrument, inverting using the matrices based on scattering from spherical particles, in LISST-100 data format and displaying a waitbar while processing:
[vd,dias] = invert(cscat,3,0,0,0,0,1);
Finally, perform the calibration.
vd = vdcorr(vd,VCC,flref,lref);
VC = sum(vd’);
Now the full volume distribution time series and the concentration time series VC have been computed; if the correct VCC value has been used, the output is volume distribution and concentration in µl/l.
In order to compute values such as mean and median particle sizes from the volume distribution, the MATLAB script compute_mean.m can be used. The call is:
diameters = compute_mean(vd, type, transmission)
vd is the VCC corrected volume distribution
type is the LISST instrument type (use help compute_mean for details), and
transmission is the optical transmission (tau) from getscat.m.
The latter is an optional input. The structure of the output matrix (diameters) is described in the help notes to compute_mean.m
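For readers computing statistics by hand, a volume-weighted mean and an (interpolated) median diameter follow directly from vd and dias. This Python sketch illustrates the arithmetic only; it is not the compute_mean.m source:

```python
import numpy as np

def mean_and_median_diameter(vd, dias):
    """Volume-weighted mean and interpolated median diameter of one distribution."""
    vd = np.asarray(vd, float)
    dias = np.asarray(dias, float)
    mean_d = np.sum(vd * dias) / np.sum(vd)
    cum = np.cumsum(vd) / np.sum(vd)          # cumulative volume fraction
    median_d = np.interp(0.5, cum, dias)      # diameter at the 50% point
    return float(mean_d), float(median_d)

vd = [1.0, 2.0, 1.0]        # toy 3-bin volume distribution
dias = [10.0, 20.0, 30.0]   # bin midpoint diameters, in micrometers
print(mean_and_median_diameter(vd, dias))  # (20.0, 15.0)
```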
The averaging can be done at any step. For example, one may average the raw data before computing a transmission time series, or do the averaging after computing the corrected net scattering cscat. Alternately, averaging can be done after inversion. We leave it to the user to explore the statistical consequences of the choice they make.
Often, one simply wants to remove problem data from analysis. This may occur because the instrument is out of water, where misalignment makes the apparent optical transmission zero; this in turn makes scat infinite, and so on. To remove a data point from consideration, simply type:
data(bad_data_point_index,:) = [];
If a number of points appear problematic, you can get their indices using the ginput command in Matlab. Plot the optical transmission (or any other variable), then type ginput. A cross-hair will appear when you move the cursor over the plot. Click on all the bad points; their indices will appear in the first column of the two-column variable ans. The indices are the integer parts of the values in that first column; convert them to the integers required for indexing:
bad_data_point_index = fix(ans(:,1)); and then
data(bad_data_point_index,:) = [];
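The same bad-record removal can be done outside MATLAB. A NumPy sketch with our own toy data (note that indices are 0-based in Python):

```python
import numpy as np

data = np.arange(20.0).reshape(5, 4)   # 5 toy records, 4 variables each
bad = [1, 3]                           # indices of the bad records (0-based)
data = np.delete(data, bad, axis=0)    # drop those rows entirely
print(data.shape)  # (3, 4)
```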
This is often useful in particularly sensitive environments. For example, if you are doing a profile in the water column and suspect that the instrument is going out of alignment due to pressure (we admit, sometimes this may happen!), then find a local background, again using the ginput command or by plotting some other variable. One example method: find the point where t is highest. This can be done by:
[a,b] = max(tau);
zsc = data(b,:);
One may then save this estimate of the background and process the data as already described. To save, type
save zsc.asc zsc -ascii
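The pick-the-clearest-record trick translates directly to other environments: find the record with maximum transmission and treat it as the local background. A NumPy sketch with our own toy data:

```python
import numpy as np

tau = np.array([0.80, 0.95, 0.60, 0.92])    # toy transmission time series
data = np.arange(4.0 * 40).reshape(4, 40)   # toy 4-record data file

b = int(np.argmax(tau))   # index of the clearest water (highest transmission)
zsc_local = data[b, :]    # use that whole record as a local background
print(b)  # 1
```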
Another example is encountered in looking at deep ocean profiles. Here, instead of the maximum transmission, one may find a local point that is suitable as a background. For example, plot the ratio data(:,33)./data(:,36) against depth [data(:,37)]. This is something like optical transmission. The data may involve profiles across a hydrothermal plume. Find a point just above the plume and call it the local background. You may use the ginput command to identify this local background. Again, you may save this as a background, or use it in Matlab in the detailed procedure described in II.2.
At high sediment concentrations the beam might be attenuated so much that no light is being transmitted. For this case the optical path reduction modules have been developed. They are essentially glass plugs that drop into the optical path, thereby reducing the total path length in water. Typically the path length is reduced by 50% (to 2.5 cm) or by 80% (to 1 cm). If it is decided to use a PRM, two things should be borne in mind:
- It is necessary to obtain a zscat with the PRM in place and use this zscat when processing the data obtained with that particular PRM.
- An additional post-processing step is required in order to get correct volume concentration and beam attenuation values, explained below:
Beam attenuation is computed as (-1/L)*ln(t), where L is the path length in m. It can be seen from this equation that for a given transmission value, beam attenuation is inversely proportional to path length. If beam attenuation were computed assuming a 5 cm path while an 80% PRM (1 cm effective path length) was in fact installed, the actual beam attenuation would be five times higher than the one computed using the 5 cm path length.
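To make the path-length dependence concrete, here is the arithmetic as a small Python sketch (our own illustration):

```python
import math

def beam_attenuation(t, path_m):
    """c = -(1/L) * ln(t), with L the optical path length in metres."""
    return -math.log(t) / path_m

t = 0.5
c_full = beam_attenuation(t, 0.05)   # full 5 cm path
c_prm = beam_attenuation(t, 0.01)    # 80% PRM -> 1 cm effective path
print(round(c_prm / c_full, 6))      # 5.0 -- same transmission, 5x the attenuation
```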
In a similar manner the volume concentration on the 32 rings must be adjusted. The volume conversion constant is determined in the factory using the full 5 cm path length. If a 50% PRM has been used, the LISST will underestimate the volume concentration by 50%; if an 80% PRM has been used, the underestimation will be 80%. To adjust, perform the following step immediately after the inversion (see Performing the inversion in section II above):
Read the Vcc value from the InstrumentData.txt file and type
Vcc = Vcc * (1-(%PRM/100));
vd = vd./Vcc;
VC = sum(vd’);
where %PRM is the reduction in path length (in percent) by the PRM.
Now the full volume distribution time series and the concentration time series VC have been computed, with the Vcc value properly adjusted for the PRM, so that the output is volume distribution and concentration in µl/l.
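The PRM adjustment can be sketched as follows for readers outside MATLAB (our own Python paraphrase of the lines above; prm_percent is the path-length reduction in percent):

```python
import numpy as np

def adjust_vcc(vcc, prm_percent):
    """Scale the Volume Conversion Constant for a path reduction module."""
    return vcc * (1.0 - prm_percent / 100.0)

vcc = adjust_vcc(10.0, 80)          # 80% PRM: effective VCC is about 2.0
vd = np.array([[4.0, 6.0]]) / vcc   # calibrate a toy 2-bin distribution
vc = vd.sum(axis=1)                 # total concentration per record (about 5.0)
print(round(vcc, 6), round(float(vc[0]), 6))
```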
Ever hear an old guy in an audience say, with great authority: “I don’t believe it!”. The final test of your results is reasonableness. This test uses all the knowledge and wisdom of your past experience to evaluate the current result. If it does not make sense, there is probably a good reason.
Here are some criteria that you can use to evaluate if ‘it’ does make sense:
- Is the net scattering smooth? The final net scattering variable cscat should be smooth. The nature of light scattering by particles is such that sudden spikes in an individual ring cannot arise. For example, suppose a cscat curve for the 123rd scan shows a spike at ring 23. This is not possible: no particle size produces a spike at a single ring. What is the likely cause? Probably the background data is not good, or the estimate of optical transmission is not good. Sometimes, a minor adjustment of the background or the transmission can remove such a spike. At other times, you can use the criterion of a smooth cscat and sophisticated mathematical routines to find a best-fit estimate of transmission. We have successfully used a minimization technique that seeks the smoothest cscat by varying transmission. Such routines are available in Matlab; the function fminsearch (which replaced the older fmins) is one such function.
- A persistently high value at the inner rings spells trouble. Unless large particles are always present, a persistently high value at the inner rings cannot occur in cscat. Again, this would point to a bad estimate of the background zscat.
We look forward to user feedback to expand this list.
Matlab is a powerful environment for data processing, and particularly so for those difficult situations which arise either due to instrument malfunction, or due to extreme conditions of low or high concentrations. If the data just doesn’t make sense, contact us and we’ll look at it.
Yogi Agrawal & Ole Mikkelsen
Update May 15, 2008: Confusing use of the words transmission and attenuation in the introduction was clarified.
Update September 22, 2010: Rewritten to be used with invert.p instead of nlia.p
Update March 4, 2011: Added section and reference to compute_mean.m
Update October 4, 2011: Minor re-wording
Update May 4, 2012: Added vdcorr.m correction
Update July 11, 2012: Added minor extra descriptions.
Update Jan 4, 2015: Fix broken links
Questions to this application note? Email us!