The plot I have attached shows gamma induced events from three sources of radiation (Co-60, Cs-137, and Co-57).
The distributions of the data contain a lot of noise (on the left side of the plot), because of the trigger level on the oscilloscope. I kept the trigger level at 10mv (any higher and I would lose valuable data) to get a good distribution of points. You'll notice that with the Co-60, everything up until about 0.04 V is noise (and there is a lot of it) and the gamma events are located on the interval between ~0.45 and 0.07 V.
For the other two sources, the idea is the same, it is just more difficult to distinguish the noise from gamma events, because the the signal produced by the gammas (multiplied by the SiPM) is much closer to the noise for Co-57 and Cs-137 than it is for Co-60. To get an idea about how much the original energy is amplified by the photo-multiplier, we can look at the Cs-137. If you look up the gamma energy associated with a radiating Cs-137 nucleus, you'll find it to be ~660 keV. And the peak voltage found in the experiment (see green Gaussian) is ~0.03 V.
The phenomenon shows itself even more so with the Co-57, as the distribution just appears to be one smooth exponential falloff with an increase in pulse height. But, if you look closely, you'll see a second peak just to the right of the noise peak channel (red Gaussian) that represents the gamma induced events. I am convinced of this, because if I try to run the experiment with no source or a very weak one (Am-241 (~60 keV) for example), I am left with a very noisy distribution, which is to be expected. Basically, I don't see the second peak where the Co-57 has another peak.
When I first took a look at this plot from some of Georgia de Nolfo's work I was confused in thinking that the plot was just of the peak heights from one event (one csv). But upon further inspection I learned that it is a plot of a certain number of events (5-,000-10,000) and the associated peaks calculated from the PSD algorithm we are familiar with. With a better understanding of the plot, I could recreate the experiment and use the algorithm I wrote to get the peaks.
It is important to understand the gamma induced event and why we get a pulse. When a gamma ray collides with an electron inside our p-terphenyl, inelastic scattering called Compton scattering occurs. The scattered photon from this reaction will have an energy that is some fraction of the incident energy while the remaining kinetic energy is transferred to the electron. The Compton edge is the maximum possible energy deposited in the detector (electron) occurring at 180 degrees. The range of energies deposited can be determined by the scattering angle of the resulting photon. Looking at the plot below for a 500 keV photon, we can see that the maximum energy deposited in the detector is ~320 keV.
Once the secondary photon is scattered with some fraction of the incident energy it is detected by the SiPM array. This is where the photoelectric effect takes hold. Because the energy of the gamma ray is less than the initial energy, the effect shifts from Compton to photoelectric inside the photo-multiplier. An electron is ejected from the material and multiplied inside the SiPM (keep in mind that this is a diagram of a PMT which works a little differently than the SiPM):
Other than doing the FPGA stuff with Joe, and working with the hardware in the lab, I have been tasked with comparing both the peak height and pulse shape from two competing algorithms (version 1 and version 2). Since Georgia originally sent the python code to us last summer, the code has made significant advancements in precision. An engineer named Jeff Dumonthier (who wrote the first version as well) updated it and took me through the main differences:
- The time of flight is the difference in the beginning of the pulse in the double scatter experiment I described in the other blog. The time of flight for these events is on the order of ~1ns. So precise calculation is needed to determine the exact instant of the CFD time-- the constant fraction discriminator time. Remember that the CFD was calculated as the value that is 25% of the peak voltage. If we can accurately find the time value that the pulse is at 25% of its peak value for both detectors, we can do a simple subtraction and see how long it took a gamma ray/neutron to cause an event in both. The way the DRS is set up, it only saves an event if both detectors are triggered. So the XML file we receive has 1024 time/voltage pairs (as opposed to the 502 pairs from the LeCroy scope) for detector 1, followed directly by the same number of pairs for detector 2. It is important to note that this 25% value is arbitrarily selected and needs (probably) to be optimized to produce the best version of our ToF measurement. The following plot shows the time of flight for a set of 1,000 events. You'll notice two distributions on the plot, the small one corresponding to neutron induced events, and the other to gammas. In this plot the subtraction is reversed between detector 1 and 2, so we can conclude that the neutrons have a longer ToF than gammas (as expected because of their greater weight and lower velocity) and gamma events are more probable than neutrons over the data collection interval.
- Because time is so important now, everything in the new algorithm is calculated through the time vector. So, instead of just calculating which bin (0 to 502) the pulse height and pulse CFD are at, the new code finds both the closest time bin (real value gotten from the DRS) and the actual time value of both values. Realizing that this is sort of confusing I will describe more here:
- For one event in one detector, we have 1024 time/voltage pairs. With the LeCroy we had 502 time/voltage pairs. We simply marched up the leading edge of the pulse finding the index of the time/voltage value closest to 25% of the peak. We then chose a precise number of bins coming after that index to calculate both the long and short interval. The algorithm has been updated to instead of simply finding the closest value to the CFD time, it finds the exact time of 25% of the peak by fitting a line to the leading edge. Because part of the leading edge of each pulse is 'very linear' we can take some upper and lower range of the edge (say 15% to 50% of the peak) and save all the time/voltage pairs into a matrix. From there we calculate a slope and intercept. Knowing the exact 25% voltage of the peak we can subsequently calculate the exact 25% CFD time threshold. Then we can get ToF!
Using the 'binning method of inexactness' I was getting a ToF distribution that had a similar shape to the one as above, but the measurements were scattered over a greater range of ToF channels. If you look at the largest number of counts for the ToF above you'll see ~170 events at the peak. With the bins the largest number of counts in the peak bin was ~35 events (withe the same 1,000 event set).
- The overall goal of comparing the old and new versions (and not just going with the later version) is to find a 'lightweight' enough algorithm that produces precise enough results and can be implemented on an FPGA, aka written in such a way it could be written in HDL (or synthesized by HLS for us). The way things are calculated for each algorithm is as follows:
- Version 1 Agorithm:
- Pre-Filter:
- Subtracts out first time element from time vector to normalize/zero-out set
- Inverts voltage polarity (LeCroy gives negative vals, DRS gived positive vals)
- Pass 1:
- Baseline - finds average of constant number of voltage points preceding the leading the edge of the pulse and uses that as a zero such that postY=MAV-baseline.
- Performs smoothing function (moving average filter) and finds 4 greatest values of post-smoothing array. Finds average of those four to find pulse height/index.
- Pass 2:
- Finds CFD as closest time/voltage value to fraction of the peak.
- Performs integration over two regions a fixed distance -- number of bins -- from the CFD index.
- Divides the Long/Short interval to give Pulse Shape.
- Version 2 Algorithm:
- Pre-Process:
- Subtracts out first time element from time vector to normalize/zero-out set
- Base and Scale
- Baseline - uses all voltage points before a certain time value and computes average.
- Subtracts baseline from all voltage values (no smoothing first)
- Peak
- Fits a parabola to a fixed number of points in a range of points near the max of the pulse. Takes max of parabola to find height.
- Leading Edge CFD
- Finds CFD time/voltage by fitting a line to range of voltage values along leading edge
- PSD
- Takes time/voltage pair closest to CFD time threshold and performs integration over two regions of a fixed time.
- After running through the comparisons of each algorithm for the baseline voltage, peak, cfd time, and psd calculation I found that Version 1 does a good job of finding the baseline and calculating the peak using smoothing but does not do a very good job of finding the CFD time and finding the ToF. So I adjusted the CFD calculation to manually fit a line between two points, because something like polyfit in python would be too cumbersome to replicate in the FPGA. Here are the comparisons between Version 1 (Mike) and Version 2 (Jeff) for finding the peak and CFD time:
A typical baseline voltage is -.0022 V. So the difference being ~0.0001 is not large enough to say the binning baseline calculation is useless. Version 1 matches Version 2 pretty closely.
A typical peak voltage is 0.03-0.06 V. So the difference being ~0.0001-0.0002 is not large enough to say the method of smoothing is useless. Version 1 matches Version 2 pretty closely.
Because in both algorithms the CFD voltage is calculated as a certain percentage of the peak, the difference in CFD voltage will be the same as it was for the difference in peak voltage (just a scalar modification).
A typical CFD value is 5.5e-07 so the difference for CFD time for Version 1 and 2 being 2 orders of magnitude smaller shows: The method of finding a point directly above and below the smoothed curve CFD voltage and fitting a line between those points (to find the CFD time) does as good a job as an un-smoothed range of points 15-60% of the peak with a line fitting those points to find the CFD time. Also now Version 1 has an easier transition to C or HDL.