*** Q. Do "data gaps" mean that we have some information but try to apply it
sensibly to the areas where we don't have information, or does it refer solely
to the fact that we just don't have specifics for each individual and don't
have a magic crystal ball with the future in it?
A. I put a slide show in the Module 10 Closure file, which you can access. (After
the show you'll have to find your way back here; the trick to link them eludes
me.) The first 10 slides explain this better, using technical terms I've
tried to avoid. I've tried to just skim the surface of sampling and data analysis.
When I teach this in the classroom I have a colleague, a Ph.D. chemist type, who
gives that class. It can get very involved. The take-home message is that you need
to prepare and plan for the sampling. Planning includes a careful review of the
regulations, coordination with the testing labs, and coordination with the agencies
responsible for accepting your final report.
*** Q. There were lots of words and terms in the paper by Ames and Gold that
I have never heard of. But one question is torturing me the most: Who is Rachel
Carson?
A. Rachel Carson wrote a book published in 1962, Silent Spring. The book was
a benchmark in the public's attitude toward environmental pollution, especially
pesticides, and most especially DDT. Here is a great website http://onlineethics.org/index.html
and an article about Carson: http://onlineethics.org/moral/carson/main.html
The Ames and Gold article was poking fun at her apparent main thesis, that chemicals
are bad. Her book would have lacked its appeal to the mass media if it had spoken
to the technical issues very well; besides, many were not well defined
in 1962. (That web site might provide a great final exam question.)
*** Q. Please clarify the response you were looking for in your question about
the "two commonly used" extrapolations to predict human health effects
based on animal testing.
A. See the discussion in 10B. The answer was 1.) the extrapolation from high doses
to low doses in animals, and 2.) the extrapolation from animal testing to humans.
This latter is not an "extrapolation" in the mathematical or scientific
sense.
*** Q. On the exposures page of submodule 10A, non-detected chemicals were discussed.
When many of the samples from a contaminated site do not report a particular
chemical, does this mean the value was zero, or that the test for that chemical
might not have been done?
A. The lab reports should never show zero. They show "ND" for non-detect,
or other codes, which mean the sample was presented to the lab. Some samples
misfire for one reason or another, and these are also coded. Generally the
risk assessor gets all the lab data, and then tries to arrange it so those questions
do not come up. That can be a big job on a complicated site. Be nice to your
chemist.
*** Q. Also, if you plug zeros into a t-test, the UCL will drop. Does this drop
mean a decrease, or will the UCL be eliminated to zero?
A. Plugging a whole bunch of the same number into a t-test will increase the
degrees of freedom and, if the number is close to the average, bring the UCL
in. Most environmental data has lots of zeros, and the important limits are
not that far from zero. So, how you handle the non-detects can skew your data
quite a bit. See next.
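To make the effect concrete, here is a minimal sketch of how the choice of substitution value for non-detects moves a one-sided 95% UCL on the mean. The concentrations, detection limit, and t critical value are made up for illustration; real work would follow the applicable EPA guidance.

```python
# Sketch: how substituting values for non-detects (NDs) shifts a one-sided
# 95% upper confidence limit (UCL) on the mean: UCL = mean + t * s / sqrt(n).
# All numbers below are hypothetical.
import math
import statistics

def ucl95(data, t_crit):
    """One-sided 95% UCL on the mean for a small sample."""
    n = len(data)
    return statistics.mean(data) + t_crit * statistics.stdev(data) / math.sqrt(n)

detects = [12.0, 8.0, 15.0, 9.0]   # measured concentrations (mg/kg)
nd_count = 6                       # samples reported as "ND"
dl = 2.0                           # detection limit (mg/kg)

# Two common substitution rules for the NDs:
as_zero = detects + [0.0] * nd_count
as_half_dl = detects + [dl / 2] * nd_count

# t critical value for df = 9, one-sided 95%, from a t-table.
T_CRIT_DF9 = 1.833

print(ucl95(as_zero, T_CRIT_DF9))     # zeros pull the mean, and the UCL, down
print(ucl95(as_half_dl, T_CRIT_DF9))  # half-DL substitution gives a higher UCL
```

Zeros do not drive the UCL to zero; they lower the mean and can inflate the spread, so the UCL drops but stays well above zero.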
*** Q. What would be an abnormal distribution---something other than a bell curve?
A. The most common is the log-normal. Remember the normal tapers at both ends,
or has a more or less even distribution on each side of the mean. Environmental
data has a minimum of zero and is often skewed to the low end. This is not a
normal distribution, but the log of it is. See page
three of module 12E. The first two figures, although they are talking about
risk, are similar to what environmental data often look like. You can clearly
see how the arithmetic data is skewed, but the log of it is normal. You should
not use the log directly in the t-test, but there are very similar tests that
you can use to the same purpose, namely estimating the confidence that two
means are the same.
Another common "abnormal" data set is bimodal. That is, there are
two peaks.
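The skew-then-symmetric pattern described above can be seen with simulated data: draw log-normal values (always positive, piled up near the low end) and compare the skewness before and after taking logs. The parameters here are arbitrary, not real site data.

```python
# Sketch: log-normal data is right-skewed, but its log is roughly normal.
# The mu/sigma values are arbitrary illustration choices.
import math
import random
import statistics

random.seed(1)
raw = [random.lognormvariate(1.0, 0.8) for _ in range(1000)]  # all values > 0
logs = [math.log(x) for x in raw]

def skewness(xs):
    """Simple moment-based skewness: 0 for symmetric, > 0 for a long right tail."""
    m = statistics.mean(xs)
    s = statistics.pstdev(xs)
    return sum((x - m) ** 3 for x in xs) / (len(xs) * s ** 3)

print(skewness(raw))   # strongly positive: long right tail
print(skewness(logs))  # near zero: roughly symmetric, bell-shaped
```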
*** Q. What is sensitivity analysis?
A. When you have a model, or just an equation, some of the factors (parameters,
or unknowns) affect the answer more than others. Sometimes this is obvious:
if the price of apples = 0.01X + 100Y^3, the price is much more sensitive
to changes in Y than to changes in X (provided both X and Y are greater than
1.0). In other cases it is not obvious, so you play with the model, varying
one parameter at a time to determine if your result changes much. You may find
that the parameter you were worried about does not make much difference, so
a guess is good enough. Or the model might be very sensitive, and you need to
be very careful what you use. Of course some parameters affect others, so that
is one reason a mathematician or statistician is a recommended part of the risk
assessment team.
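The one-at-a-time approach can be sketched on the toy apple-price model above: perturb each parameter by the same percentage from some baseline and see how far the output moves. The baseline values are arbitrary (both greater than 1.0, as the answer requires).

```python
# Sketch: one-at-a-time sensitivity analysis on price = 0.01*X + 100*Y**3.
# Baseline values are arbitrary illustration choices.
def price(x, y):
    return 0.01 * x + 100 * y ** 3

x0, y0 = 10.0, 2.0
base = price(x0, y0)

dx = price(x0 * 1.1, y0) - base  # output change from +10% in X
dy = price(x0, y0 * 1.1) - base  # output change from +10% in Y

print(dx)  # ~0.01: X barely matters here
print(dy)  # ~264.8: Y dominates, because of the cube
```

The same loop-over-parameters idea scales up to real models; in practice you also vary parameters jointly, since, as noted, some parameters affect others.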
Q. I have one more question regarding module 10 and the statistics refresher. How do you know what sample size would be appropriate for an assessment?
A. Ideally you "design" a study and consider the "power" of the sampling. I did not go into power, but if you are trying to distinguish whether two populations are "different," by which we usually mean have different means, then the greater the difference between their means and the smaller their standard deviations, the fewer samples we will need to determine if they are different. Based on our preliminary notions about the populations we can estimate how many samples are needed.
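A rough version of that estimate can be written down directly. A common textbook approximation for comparing two means is n per group ≈ 2(z_α + z_β)² σ² / δ², where δ is the smallest mean difference you care about detecting; the z values below are standard table lookups for 95% confidence and 80% power. This is a back-of-the-envelope sketch, not the full design procedure.

```python
# Sketch: approximate sample size per group for detecting a difference of
# delta between two means, given standard deviation sigma.
# z_alpha = 1.645 (one-sided 95%), z_beta = 0.84 (80% power): table values.
import math

def n_per_group(sigma, delta, z_alpha=1.645, z_beta=0.84):
    return math.ceil(2 * (z_alpha + z_beta) ** 2 * sigma ** 2 / delta ** 2)

# Bigger difference, or smaller spread, means fewer samples needed:
print(n_per_group(sigma=10.0, delta=5.0))   # small difference: many samples
print(n_per_group(sigma=10.0, delta=10.0))  # larger difference: far fewer
print(n_per_group(sigma=5.0, delta=5.0))    # tighter data: also far fewer
```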
For real environmental sampling, we are hampered because the suspect contamination is likely just a portion of our study area. If it is obvious, a stain, it is easy to delimit "samples of the stain." If it is not obvious, there may be dozens of "areas" that are contaminated and many that are not. An old drill pad or a gravel contractor's equipment yard might have many stains related to specific spills. What then? From the regulator's viewpoint, if you try to use an average, it may average out below the action levels even though many spots within the site would be above the action levels if you sampled those spots. Hot spots are typically the danger to on-site people and cleanup crews.
The EPA has several guidance documents about sampling, using grids and random numbers. You really have to use these if you do not have any known hot spots. If you have known hot spots, you want to sample these separately and distinguish them from the "rest of the site." Then use the grid for the rest of the site. You can get such guidance from RAGS, and there are many others. How many samples per hot spot? You want to be sure to get several at depth; surface, two feet, and four feet are common. I would want at least one sample per 1000 CY of estimated contamination, but no fewer than 5 from each hot spot.
The usual scenario is that the regulators just want enough samples to prove there is a problem. Then it is up to the PRP to define the problem. Usually the PRP will present a sampling plan, which the regulators then accept or modify. Here the regulators just want to be sure the PRP samples each hot spot; the PRP will then take lots of samples to find the limits of each hot spot, otherwise the regulators will have them clean the whole thing.
For risk assessment for populations distant from the site, on the other hand,
it is usually the average contamination at the site that determines the risk,
not the hot spots. For these the grids work fine.
Q. For the weight-of-evidence classification implemented by the EPA and the
IARC, who decides what chemical belongs to what class? Also, who decides what
the criteria are?
A. "Experts." For EPA the process goes something like this: In-house
experts decide that there is some need to classify a chemical or change a classification.
They get some budget together and contract with a beltway-bandit-type company
to study the issue and draft a report or summary based on the primary literature.
This report is then reviewed by a committee of experts, usually hired
individually for this task from industry and academia. These experts write a
recommendation to the Administrator of the EPA (who I doubt sees it; actually
the same EPA people who started the process read it). EPA also has a Scientific Review Board,
composed of outside experts, that then reviews the recommendation. Since
these classifications are not regulations per se, I'm not sure what the public
comment process is. Although I do not have personal knowledge of the people involved,
I would bet the selection of "experts" tends to be skewed toward
people who give the EPA the answers they want. There are many experts out there,
but most are busy people and not available to do this sort of consulting.
Q. Three methods were discussed for handling the parameter uncertainties. Using
the EPA defaults was seen as the cheapest. Is this one used more
than the second method? Or is there some sort of combination of the two? Is
the third one so expensive that it is often not used?
A. Pay now or pay later. If using the default assumptions results in an answer
you like, you are home free. If they result in an expensive cleanup, you want
to check them more closely, if you can.
Q. Aroclor 1260 is considered 60% chlorine by weight; what about 1016? Does
that have the same naming convention?
A. Aroclor 1016 is an exception to the naming rule. It is a random mixture of
1 through 6 chlorines per molecule and is about 41% chlorine by weight.
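The "last two digits = percent chlorine" convention can be sanity-checked with simple molecular-weight arithmetic. For a biphenyl carrying n chlorines (C12 H(10-n) Cln), the chlorine weight fraction follows from standard atomic weights; with six chlorines, typical of Aroclor 1260, it comes out near 60%.

```python
# Sketch: chlorine weight fraction of a chlorinated biphenyl C12 H(10-n) Cl(n),
# using standard atomic weights.
C, H, CL = 12.011, 1.008, 35.453

def cl_weight_fraction(n_cl):
    mw = 12 * C + (10 - n_cl) * H + n_cl * CL
    return n_cl * CL / mw

print(round(cl_weight_fraction(6), 2))  # ~0.59: hexachlorobiphenyl, close to
                                        # the "60" in Aroclor 1260
```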
Q. What is BTEX?
A. Benzene, Toluene, Ethylbenzene, and Xylene. These have 6, 7, and 8 carbons
and are light, volatile aromatics. They are all considered toxic.
Q. If the hydrocarbon groupings are useful for regulatory purposes, but are
not very valuable scientifically, then why are they used? I hope it's not just
because it's cheaper.
A. One man's trash is another man's treasure.
The toxicity needs to be evaluated for each chemical, which is impractical for
hydrocarbons, because most are mixtures that contain hundreds of chemicals.
The same is true, but to a lesser extent, for fate and transport. So, it is practical
to "bundle" certain groups of chemicals together, "gasoline range
organics" (GRO) for example, and make some sort of rational decision about
the bundle. For example (I'm making these numbers up): "if the GRO in the soil is
less than 100 mg/kg, it is unlikely to be a significant risk to wells more than
500 feet away." If you wanted to be scientific, you would need to analyze
that GRO for its constituent chemicals, probably about 30, and do a risk analysis
for each. But why bother, unless you are working for a wealthy client, or her
lawyer, who wants to prove that it is harmful or wants to confuse a jury.
Q. Your professional opinion: As you have probably noticed, I have a problem
with computer models. Yet, I have done some biological and chemical work and
have used statistical analysis methods and I am able to accept what I get from
statistics, with some caution. Do you think that there is a large difference
between computer models and statistical analysis?
A. OK, see the articles above regarding sensitivity and error terminology.
Q. Under "Special Chemicals," was there any reason in particular why
you chose the chemicals you did to discuss? I began thinking about that, and
there are many others that you could have mentioned or discussed,
so I guess I am asking what was the madness behind it? Was it just because we encounter
these chemicals more often? I suppose these are the killers, so to speak.
A. "...and all the children are above average." We are all special,
and so are our chemicals. These particular chemicals are mentioned because they
are very common chemicals in hazardous waste sites (but not fresh spills), and
their risks are computed using a slightly different algorithm than most chemicals.
Q. Does statistics fully consider the uncertainty parameters? Or is it mere
juggling of numeric data?
I think that if it fully depends on the data collected, then our results fully depend
on how the data is collected (I mean, without any errors).
I suspect any small error in data collection may lead to big mistakes in the
statistical conclusions. Thus statistics should also consider that.
A. OK, see the articles above regarding sensitivity and error terminology.