Analyzing simulation output¶
By default, the remage output files are already reshaped in a hit-like structure and contains a time-coincidence-map for easy event-building. In a lot of cases, this is sufficient for analysis. In some cases, further post-processing with reboost is required.
Event building with the time-coincidence-map¶
Todo
this needs considerable rework
This tutorial is based on the output file produced in the Basic Tutorial. Let’s start by reading in the TCM in memory as an Awkward array:
>>> from lgdo import lh5
>>>
>>> tcm = lh5.read_as("/tcm", "output.lh5", library="ak")
let’s inspect the data with Awkward:
>>> ak.with_field(tcm, ak.local_index(tcm, axis=0), "event").show()
[{row_in_table: [0], table_key: [2], event: 0},
{row_in_table: [0], table_key: [3], event: 1},
...,
{row_in_table: [1, 1], table_key: [3, 2], event: 6},
{row_in_table: [0, 2], table_key: [1, 3], event: 7},
...,
{row_in_table: [1108, 6521], table_key: [2, 3], event: 6954}]
The instructions to build events are: the event 0 has one hit in the detector with UID 2, which can be found at row 0 of the corresponding output table. Event 1 is characterized by the hit stored in the first row of output table of detector UID 3. The event 6 is a two-detector event: one hit from UID 3 at output table row 1, one hit from UID 2 also at table row 1.
If the output tables in the remage output are keyed by UID, loading them is
straightforward. If tables are labeled after the detector name, one can still
resort to the soft links stored below /stp/__by_uid__, which are always keyed
by UID.
The following function can be used to exploit the TCM to load hit data from disk from all tables and organize it by event:
import re
import awkward as ak
from lgdo import lh5
def read_hits_as_events(
tcm: ak.Array,
field: str,
filename: str,
):
# flatten the TCM (this is needed later to read from disk)
# but first save the information needed to unflatten at the end
num_tcm = ak.num(tcm.row_in_table)
flat_tcm = ak.Array({k: ak.flatten(tcm[k]) for k in tcm.fields})
# initialise the output event array
flat_evt = None
# loop over list of tables. use the softlinks which are guaranteed to
# contain the detector UID (used in the TCM)
for table in lh5.ls(filename, "/stp/__by_uid__/"):
# extract the uid from the table name
uid = int(re.search(r"(?<=stp/__by_uid__/det)\d+", table).group())
# ask the TCM which rows of this table we have to read from disk
# remonder: row corresponds to a hit
rows_in_table = flat_tcm.row_in_table[flat_tcm.table_key == uid].to_numpy()
# read the data!
# NOTE: lh5.read() is smart enough to detect contiguous ranges. If not,
# it will read min(idx):max(idx) and then slice
# https://legend-lh5io.readthedocs.io/en/latest/api/lh5.io.html#lh5.io.core.read
data = lh5.read(f"{table}/{field}", filename, idx=rows_in_table).view_as("ak")
# concatenate data from this table to the event structure
flat_evt = ak.concatenate((flat_evt, data)) if flat_evt is not None else data
# group the data by events
return ak.unflatten(flat_evt, num_tcm)
For example:
# read the TCM from disk as an Awkward array
tcm = lh5.read_as("tcm", "output.lh5", library="ak")
# organize some interesting hit data fields into a event-oriented structure
evt = ak.Array(
{k: read_hits_as_events(tcm, k, "output.lh5") for k in ("edep", "t0", "xloc")}
)
# compute and add some event observables
evt["hpge_multiplicity"] = ak.num(evt.edep, axis=1)
evt["hpge_energy_sum"] = ak.sum(ak.sum(evt.edep, axis=-1), axis=-1)
Let’s inspect the evt array at event 0 and 6:
>>> evt[0].show()
{edep: [[0.0699, 28.1, 147, 164, 176]],
t0: [0.133],
xloc: [[0.0424, 0.0424, 0.0425, 0.0424, 0.0424]],
hpge_multiplicity: 1,
hpge_energy_sum: 516}
>>> evt[6].show()
{edep: [[0.0065, 30.2, 162, 146, 124, 166, 1.65, 0.0922], [0.0887, ...]],
t0: [0.119, 0.505],
xloc: [[0.0599, 0.0599, 0.0599, 0.0599, 0.06, 0.06, 0.054, 0.054], [...]],
hpge_multiplicity: 2,
hpge_energy_sum: 682}
We find the event structure expected from the TCM.
Post-processing with reboost¶
reboost is the general post-processing and analysis toolkit for remage output. remage internally uses reboost for the output reshaping, but does not apply any further user-defined post-processing.
reboost supports various processors for output tables from remage: HPGe pulse-shape emulation or heuristics, optical map application for scintillators, … It can be used as a command-line tool, or by writing custom code.
Important
When using reboost with a config file, the TCM will not be available after post-processing. This is currently a known limitation; in the future we will recommend moving away from config files toward a python-script based workflow.