NOTE: These instructions apply to the LISST-Portable, but not to the newer LISST-Portable|XR.
I. INTRODUCTION
II. SOFTWARE FUNCTIONS
II.1 Using the invert.p processing file
II.2 Using Matlab functions for detailed processing
Loading your files
Viewing the raw data files
Viewing partial results with one command
Performing the inversion
Averages
III. DISCARDING PROBLEM DATA
IV. THE REASONABLENESS TEST
V. TO CONCLUDE
I. INTRODUCTION
The data formats of the LISST-StreamSide differ from those of the LISST-100 and LISST-100X. Each time a new experiment is started on the LISST-StreamSide, four files are created: Ldddhhmm.ASC, Ldddhhmm.DAT, Zdddhhmm.DAT and Ldddhhmm.LOG. The .ASC file contains fully processed data in a comma-separated ASCII format; the user only has to offload the .ASC file, and it can be viewed immediately in a spreadsheet, e.g. Excel. The .LOG file contains the settings of the instrument during sampling. The .DAT file with an L prefix contains the raw scattering data as well as the zscat data to be used with each individual scattering record, and has 80 variables in each data record. The .DAT file with a Z prefix contains only the zscat raw scattering data and has 40 variables in each data record. This note is concerned only with the two .DAT files and how to use and process them.
In the .DAT file with an L prefix, the first 32 variables are the outputs of the 32 ring detectors, each of which measures scattering of laser light from particles and water into a specific narrow sub-range of angles. Variables 33 to 40 are, respectively, laser power transmitted through water (variable 33), battery voltage (variable 34), empty variable (variable 35), laser reference power (variable 36), external water tank level in % (variable 37), year (variable 38), and 2 variables which contain time information. These are variable 39: day number*100+hr, and variable 40: minutes*100+seconds.
Variables 41-72 contain the raw zscat data for each of the measurements; that is, each and every scattering measurement has a zscat measurement associated with it. However, depending on the setting of the LISST-StreamSide, the zscat data in variables 41-72 can be the same for a number of different samples, as it is possible to program the LISST-StreamSide to collect zscat data at a rate different from that of the particle scattering data collection. Variables 73 to 80 are, respectively, zscat laser power transmitted through water (variable 73), battery voltage (variable 74), empty variable (variable 75), zscat laser reference power (variable 76), empty variable (variable 77), zscat year (variable 78), and 2 variables which contain zscat time information. These are variable 79: day number*100+hr, and variable 80: minutes*100+seconds.
The .DAT file with a Z prefix is identical to variables 41-80 in the .DAT file with an L prefix. This .DAT file thus only contains the zscat data, stored in variables 1-40.
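As an illustration of the packed time fields described above, the following minimal sketch unpacks variables 38 to 40 of a single record. The record values are made up, and the sketch assumes that "day number" means day of the year (day 1 = January 1); check this against your own files.

```matlab
% Hypothetical record: only the time fields (variables 38-40) are filled in.
rec = zeros(1, 80);
rec(38) = 2011;    % year
rec(39) = 30415;   % day number*100 + hour  -> day 304, hour 15
rec(40) = 2709;    % minutes*100 + seconds  -> minute 27, second 9

yr     = rec(38);
daynum = floor(rec(39) / 100);
hr     = mod(rec(39), 100);
mn     = floor(rec(40) / 100);
sec    = mod(rec(40), 100);

% Assumption: day number counts from January 1 of the given year.
t_serial = datenum(yr, 1, daynum, hr, mn, sec);
disp(datestr(t_serial, 'yyyy-mm-dd HH:MM:SS'));
```

The same arithmetic applies to the zscat time fields (variables 78 to 80), and works on whole columns if rec is replaced by a data matrix.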
This write-up explains how to process the data from the rings. [A detailed step by step guide is offered in section II]. To process this data, two pieces of information are required: a measure of the background light scattering from optical surfaces, and the field data from turbid water. The background (also called zscat) is measured automatically by the LISST-StreamSide. The frequency with which the background is collected is programmed by the user. Background measurements MUST BE obtained on a regular basis when the LISST-StreamSide is measuring in the field.
The field data is the summation of the background and contributions from particles. The background as well as the scattering from particles are both attenuated in water. Consequently, the data must be de-attenuated and for this purpose, the transmission measurement is included in the data stream. Transmission t is computed from the ratio of laser power transmitted through water in turbid conditions to its value in clean conditions – this is the ratio of variable 33 measured in situ and its value in clean water. Since laser output can drift over time, a correction for drift is applied using the laser reference sensor. If r is the ratio of laser transmitted power to laser reference measurement in clean water, then it follows that despite any drift in laser output, the clean water transmitted power would be the product of laser reference and r, i.e. r *element(36). It then follows that transmission for the ith data record in-situ is
t(i) = data(i,33) / (r * data(i,36)) (1)
One of the most useful steps in processing data is to view the time-series of t through the experiment. This can reveal periods of sediment events, bio-fouling, and instrument health (e.g. t >1 is not possible, so this must mean something like a bad previous background data file, or extreme water clarity, or instrument fault).
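A minimal sketch of Eq. (1) on synthetic numbers may help. All values below are invented, including the clean-water measurement used to define r; the aim is only to show the arithmetic and one simple way to flag physically impossible transmission values.

```matlab
% Synthetic example: 4 records, only variables 33 and 36 are filled in.
data = zeros(4, 80);
data(:, 33) = [810; 450; 640; 1000];    % laser power transmitted, in situ
data(:, 36) = [1000; 1000; 800; 1000];  % laser reference power, in situ

% Hypothetical clean-water measurement used to define r.
clean_transmitted = 900;
clean_reference   = 1000;
r = clean_transmitted / clean_reference;

% Eq. (1), evaluated for all records at once.
t = data(:, 33) ./ (r * data(:, 36));

% Records with t > 1 or t <= 0 deserve a closer look.
suspect = find(t > 1 | t <= 0);
```

Here only the last record is flagged, because its transmitted power exceeds what the clean-water ratio allows.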
Once the transmission time series is computed, the net particle scattering is computed. The net scattering is computed by de-attenuating and subtracting the background from the total (remember that the background is in variables 41-80):
scattering(i,1:32) = data(i,1:32)/t(i) - data(i,41:72) (2)
A final step is necessary to correct for drift in the laser reference sensor itself over time:
scattering(i,1:32) = scattering(i,1:32)* data(i,76)/data(i,36) (3)
Next, one applies the ring area correction file. For the LISST-StreamSide the ring area file is always called RA_C.ASC. This file can be loaded as
dcal = load('RA_C.ASC');
and the corrected scattering (corrected for variations in areas of detector rings from ideal) is done by:
c_scat(i,:)= dcal.* scattering(i,:); (4)
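Equations (2) to (4) can also be applied to all records at once rather than record by record. The sketch below uses synthetic data and a placeholder ring area vector of ones in place of the contents of RA_C.ASC; the transmission t is assumed to have been computed already from Eq. (1). bsxfun is used so that the code also runs on older MATLAB releases.

```matlab
nrec = 3;
data = zeros(nrec, 80);
data(:, 1:32)  = 200;     % total scattering on rings 1-32
data(:, 41:72) = 20;      % background (zscat) on rings 1-32
data(:, 36)    = 1000;    % laser reference, in situ
data(:, 76)    = 1000;    % laser reference, zscat
t    = [0.8; 0.5; 1.0];   % transmission from Eq. (1), assumed given
dcal = ones(1, 32);       % placeholder for load('RA_C.ASC')

% Eq. (2): de-attenuate, then subtract the background.
scattering = bsxfun(@rdivide, data(:, 1:32), t) - data(:, 41:72);

% Eq. (3): scale by the ratio of zscat to in-situ laser reference.
scattering = bsxfun(@times, scattering, data(:, 76) ./ data(:, 36));

% Eq. (4): ring area correction; dcal is forced to a row vector.
c_scat = bsxfun(@times, scattering, dcal(:).');
```

Forcing dcal to a row vector with dcal(:).' guards against the file loading as a column; whether that is needed depends on how RA_C.ASC is laid out.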
Once the corrected scattering is computed, as in Eq. (4), the next step is to invert the data to generate the size distribution, which we also call the volume distribution, since it is actually the volume concentration of particles in the different size classes. This step calls the proprietary routine nlia.dll or nlia.p and is computed as follows:
volume_dist(i,1:32) = invert(c_scat(i,:),instrument_type,ST,RANDOM,SHARPEN,GREEN,WAITBARSHOW)/Vcc (5)
From the volume distribution of the ith record above, the total concentration is computed by simple summation over the size classes. The variable Vcc in Eq. (5) is the Volume Conversion Constant. This is a calibration constant provided with the instrument and can be found in the .LOG file of the LISST-StreamSide.
Note that Sequoia leaves it to the user to convert volume concentration to mass. This is because the LISST-StreamSide instrument does not measure mass density (for this purpose, the LISST-STX may be used). Mass density can vary from 1.01 g/cm³ for extremely light biological type particles, to 2.65 g/cm³ for sand and clay type materials. Users should consult published scientific literature for the density applicable to their case.
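The conversion itself is a single multiplication. The sketch below assumes the volume concentration is expressed in ml/l and uses made-up concentrations; the density is only an example and must be chosen for the material at your site.

```matlab
C_vol   = [0.010; 0.025; 0.040];   % hypothetical volume concentrations, ml/l
density = 2.65;                    % g/cm^3, example value for sand

% 1 ml = 1 cm^3, so (ml/l) * (g/cm^3) gives g/l.
C_mass_g_per_l  = C_vol * density;
C_mass_mg_per_l = 1000 * C_mass_g_per_l;
```

If your concentrations are in other volume units, the scaling factor changes accordingly.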
II. SOFTWARE FUNCTIONS
Sequoia provides MATLAB scripts that can be used to view the zscat data from the LISST-StreamSide, as well as to process the raw scattering data into a size distribution if the user wishes to do so. Below, these functionalities are described in more detail.
II.1 Using the invert.p processing file
You will need the following files from Sequoia: getscat_lp_lss.m, tt2mat.m, invert.p, vdcorr.m, and compute_mean.m, which you can download as a ZIP archive. Place them all in your MATLAB working folder.
The first step is to convert the binary .DAT file with scattering data into corrected scattering data (cscat). This is accomplished using getscat_lp_lss.m:
[scat,tau,zsc,data,cscat,r] = getscat_lp_lss('datafile',readfile,instrument_type);
‘datafile’ is the path and file name for the binary .DAT file offloaded from your LISST-StreamSide or LISST-Portable.
readfile has a value of 0 or 1. Set it to 1 to indicate that you are going to read a file.
instrument_type is the instrument type. Set it to 2 for a type B instrument (1.25-250 µm) or 3 for a type C instrument (2.5-500 µm). All LISST-StreamSide instruments are type C.
getscat_lp_lss.m calls the routine tt2mat.m in order to read and convert your raw data to corrected scattering data. Once that is done, you can proceed with calling invert.p, which is a MATLAB function that returns the uncalibrated volume distribution and the midpoint of the size bins in µm. The general syntax is as follows:
[vd dias]=invert(cscat,instrument_type,ST,RANDOM,SHARPEN,GREEN,WAITBARSHOW);
where
cscat is the fully corrected scattering data in n x 32 format, obtained using getscat_lp_lss.m
instrument_type is
2 for type B (1.25-250 µm size range)
3 for type C (2.5-500 µm size range)
ST = 1 if the data are to be inverted in LISST-ST format (8 size bins)
ST = 0 if the data are to be inverted in LISST-StreamSide/LISST-Portable format (32 size bins)
RANDOM = 1 if matrices based on scattering from randomly shaped particles are to be used for inversion.
RANDOM = 0 if matrices based on scattering from spherical particles are to be used for inversion.
SHARPEN = 1 causes the routine to check if the size distribution is narrow and, if so, increases the number of inversions. Use this setting if you expect a narrow size distribution (e.g. if you are analyzing narrow-size standard particles).
GREEN should always be set to 0 for LISST-Portable and LISST-StreamSide instruments. It is for use with special versions of the LISST-100X.
WAITBARSHOW = 1 if user wants a waitbar to show during processing in order to keep track of progress.
Outputs are:
vd – volume distribution (NOT CALIBRATED WITH VCC)
dias – the midpoint of the size bins for the 32 size classes for the appropriate inversion type.
In order to convert the volume distribution into calibrated units, you must divide vd by the Volume Conversion Constant (VCC), which can be found in the .LOG file of your LISST-StreamSide or LISST-Portable. You also need the factory laser reference value, which can also be found in the .LOG file.
The syntax for converting vd to calibrated vd is:
vd = vdcorr(vd,VCC,flref,lref);
where
vd is the vd from invert.p
VCC is the Volume Conversion Constant for your instrument
flref is the factory laser reference value for your instrument.
lref is the laser reference value during measurement. For LISST-Portable and LISST-StreamSide it is element 36 in the .DAT file.
The power of these functions is that they process all data at once, producing the volume distribution for the entire data file. Once done, you can proceed with plotting your results.
The weakness of this function is that the user does not see intermediate results. The invert.p processing script assumes that the data and background files are perfect. Often this is the case. However, scientists are always well advised to look at their data in detail. For this, several functions are provided that do partial processing. These are described next.
II.2 Using Matlab functions for detailed processing
Loading your files:
The most basic function is tt2mat.m. It reads a binary file (background file or data file, .DAT extension). It is not suitable for ASCII files. To use:
data = tt2mat('datafilename',80);
or, in the case of zscat file from the LISST-StreamSide:
zsc = tt2mat('zscatfilename',40);
Viewing the raw data files:
These functions permit you to look at the details of your data. For example, you may look at the time series of the background file. Viewing can be done by typing:
plot(zsc(:,1:32))
or
plot(zsc(:,1:32)')
The first of these displays a time series of all 32 ring detectors; the second shows all the records as backgrounds across the 32 rings. In the time series, spikes in rings may be revealed, suggesting bubbles or contamination while the background was being collected. In the second display, the background pattern and its variability during the data acquisition will be revealed. A background file should not show high variability (i.e. the variability should be less than a few counts). Variability in backgrounds comes from contamination. For example, you may plot the standard deviation of the background light on each of the 32 rings with a simple command:
plot(std(zsc(:,1:32)))
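The same standard deviation can be used to list the rings whose background varies too much. The sketch below runs on a synthetic background with one deliberately noisy ring; the threshold of 3 counts is a hypothetical stand-in for "a few counts" and should be adjusted to your instrument.

```matlab
% Synthetic background: 10 records, quiet except for a noisy ring 5.
zsc = 50 * ones(10, 40);
zsc(:, 5) = 50 + 10 * (-1).^((1:10)');   % alternates between 40 and 60

ring_std  = std(zsc(:, 1:32));   % one value per ring
max_std   = 3;                   % hypothetical threshold, in counts
bad_rings = find(ring_std > max_std);
```

With real data, zsc would come from tt2mat rather than being constructed by hand.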
You may use the same commands to view a time series of your in-situ data file. Generally, we advise that you first look at the laser transmission and laser reference, i.e. variables 33 and 36, as follows:
plot(data(:,[ 33 36]))
This will show the laser transmitted and reference power time series. Matlab follows a color scheme when multiple variables are plotted. The color scheme follows the order: blue, green, red, cyan…
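To compute and plot the optical transmission for your data series, first compute r, the ratio of transmitted to reference laser power in clean water, from the background data: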
r = zsc(:,33)./zsc(:,36);
Now, the transmission is (cf. Eq. 1):
tau = data(:,33)./r./data(:,36);
Plotting the time series of your transmission record is simple:
plot(tau)
or, to put symbols at the data points (see Matlab guide for allowed symbols):
plot(tau,'.')
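Element-wise division requires r and the data columns to have compatible sizes, which may not hold if the Z file and the L file contain different numbers of records. One possible alternative, sketched here on synthetic numbers, is to take the clean-water ratio from the zscat values stored with each record of the L file (variables 73 and 76). This is an assumption about how you may wish to proceed, not a description of what getscat_lp_lss.m does.

```matlab
% Synthetic L-file data: 3 records, only the four laser variables are set.
data = zeros(3, 80);
data(:, 33) = [800; 400; 760];      % transmitted power, in situ
data(:, 36) = [1000; 1000; 950];    % reference power, in situ
data(:, 73) = [900; 900; 900];      % transmitted power, zscat
data(:, 76) = [1000; 1000; 1000];   % reference power, zscat

r   = data(:, 73) ./ data(:, 76);        % one clean-water ratio per record
tau = data(:, 33) ./ r ./ data(:, 36);   % same form as Eq. (1)
```

Because r and tau both have one row per data record, the sizes always match.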
Viewing partial results with one command:
The getscat_lp_lss.m function performs all these steps automatically. To use:
[scat,tau,zsc,data,cscat,r] = getscat_lp_lss('datafile',readfile,instrument_type);
Performing the inversion:
The next step is to invert the corrected scattering, cscat.
[vd dias]=invert(cscat,instrument_type,ST,RANDOM,SHARPEN,GREEN,WAITBARSHOW);
For a standard LISST-StreamSide instrument, inverting using the matrices based on scattering from randomly shaped particles:
[vd dias]=invert(cscat,3,0,1,0,0,0);
Finally, perform the calibration.
vd = vdcorr(vd,VCC,flref,lref);
C = sum(vd');
Now the full volume distribution time series and the concentration time series C have been computed. If the correct VCC value has been used, the output is volume distribution and concentration in ml/l.
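The summation over size classes can be written with an explicit dimension argument, which makes the orientation of the result unambiguous. The small matrix below is a stand-in for an n x 32 volume distribution.

```matlab
vd = [1 2 3; 4 5 6];   % stand-in for n x 32: 2 records, 3 size classes

C_row = sum(vd');      % 1 x n row vector, as written in the text
C_col = sum(vd, 2);    % n x 1 column vector, one value per record
```

Both forms give the same totals; the column form lines up directly with other per-record columns such as tau.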
Averages:
The averaging can be done at any step. For example, one may average the raw data before computing a transmission time series, or do the averaging after computing the corrected net scattering cscat. Alternately, averaging can be done after inversion. We leave it to the user to explore the statistical consequences of the choice they make.
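As one way to average, the sketch below forms block averages of n consecutive records. The array is a small stand-in for cscat and the block length is a hypothetical choice; the same lines work on raw data or on an inverted volume distribution. Incomplete blocks at the end of the series are dropped.

```matlab
cscat = reshape(1:24, 6, 4);   % stand-in: 6 records, 4 "rings"
n     = 3;                     % hypothetical number of records per average

m    = floor(size(cscat, 1) / n);   % number of complete blocks
ncol = size(cscat, 2);

% Arrange records as (record in block) x (block) x (ring), then average.
blocks    = reshape(cscat(1:m*n, :), n, m, ncol);
cscat_avg = reshape(mean(blocks, 1), m, ncol);
```

Each row of cscat_avg is the mean of n consecutive rows of the input.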
III. DISCARDING PROBLEM DATA
Often, one simply wants to remove problem data from analysis. This may occur because the instrument is out of water and apparent optical transmission is zero (due to misalignment when instrument is out of water). This causes scat to be infinite and so on. To remove a data point from consideration, simply type:
cscat(bad_data_point_index,:) = [];
If a number of points appear problematic, you can get their indices using the ginput command in Matlab. Plot the optical transmission (or whatever), then type ginput. A cross-hair will appear when you take the cursor to the plot. Click at all the bad points. Their indices will appear in the first column of the 2 column variable ans. For example:
ginput
ans =
20.7373 894.3860
26.5438 889.8246
37.0507 911.5789
40.6452 880.3509
60.5530 868.7719
The indices are the integer values of the variable ans in the first column. You can read these in and convert to integers which are required for index values:
bad_data_point_index = fix(ans(:,1));
and then
cscat(bad_data_point_index,:) = [];
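When the problem records can be described by a rule, a logical mask is an alternative to clicking on points. The sketch below uses an invented transmission series and a stand-in for cscat; the criteria (non-finite, zero or negative, or greater than 1) are examples only.

```matlab
tau   = [0.85; 0; 0.90; 1.20; 0.70];   % hypothetical transmission series
cscat = repmat((1:5)', 1, 32);         % stand-in: row k holds the value k

bad = ~isfinite(tau) | tau <= 0 | tau > 1;

% Remove the same records from every array so they stay aligned.
cscat(bad, :) = [];
tau(bad)      = [];
```

Whichever method is used, remember to delete the same rows from every variable you intend to keep using (for example data, tau and vd).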
IV. THE REASONABLENESS TEST
Ever hear an old guy in an audience say, with great authority: “I don’t believe it!”. The final test of your results is reasonableness. This test uses all the knowledge and wisdom of your past experience to evaluate the current result. If it does not make sense, there is probably a good reason.
Here are some criteria that you can use to evaluate if ‘it’ does make sense:
- Is the net scattering smooth? The final net scattering variable cscat should be smooth. The nature of light scattering by particles is such that sudden spikes in an individual ring cannot arise. For example, suppose the cscat curve for the 123rd scan shows a spike at ring 23. This is not possible. No particle size produces a spike at a single ring. What is the likely cause? Probably the background data is not good, or the estimate of optical transmission is not good. Sometimes, a minor adjustment of the background or the transmission can remove such a spike. At other times, you can use the criterion of a smooth cscat and sophisticated mathematical routines to find a best fit estimate of transmission.
- A persistent high value at the inner rings spells trouble. In situations where large particles are not always present, a persistent high value at the inner rings cannot exist in cscat. Again, this would point to a bad estimate of the background zscat.
We look forward to user feedback to expand this list.
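The smoothness criterion can be screened for automatically. The sketch below is a simple heuristic of our own choosing, not a Sequoia routine: it flags any ring whose value exceeds both neighbours by a hypothetical factor. It is run on a synthetic, smooth scattering curve with an artificial spike added at ring 23.

```matlab
% Synthetic smooth scattering curve with an artificial spike at ring 23.
rings = 1:32;
cscat = 100 * exp(-((rings - 12) / 8).^2);
cscat(23) = cscat(23) + 80;

factor = 2;     % hypothetical: how far above both neighbours is a spike
inner  = 2:31;  % rings that have a neighbour on each side

is_spike    = cscat(inner) > factor * max(cscat(inner - 1), cscat(inner + 1));
spike_rings = inner(is_spike);
```

For a full data file, the same test can be applied to each row of cscat in a loop; flagged scans are candidates for a closer look at the background and transmission.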
V. TO CONCLUDE
Matlab is a powerful environment for data processing, and particularly so for those difficult situations which arise either due to instrument malfunction, or due to extreme conditions of low or high concentrations. If the data just doesn’t make sense, contact us and we’ll look at it.
Yogi Agrawal & Ole Mikkelsen
October 31, 2011: First version for LISST-Portable and LISST-StreamSide data format
January 4, 2015: Fix broken links, minor formatting