A plotting file defines a set of options that are used for analysis and representation purposes, particularly to determine how datasets should be represented in plots and how they should be grouped together according to various criteria. The plotting files should be considered part of the implementation of the dataset, and should be read by various tools that want to sensibly represent the data.
Given a dataset labeled <DATASET>, a file found in the commondata folder (nnpdfcpp/data/commondata) that matches the regular expression PLOTTING_<DATASET>\.ya?ml (that is, the string "PLOTTING_" followed by the name of the dataset, and ending in .yaml or .yml) is considered a plotting file for that dataset.
For example, given the dataset “HERA1CCEP”, the corresponding plotting file name is:
PLOTTING_HERA1CCEP.yaml
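As a minimal sketch of the naming rule above, the following Python snippet builds the regular expression for a given dataset name and tests candidate file names against it. The helper name is_plotting_file is hypothetical and is not part of any existing tool.

```python
import re


def is_plotting_file(filename, dataset):
    """Return True if ``filename`` is a plotting file for ``dataset``."""
    # Hypothetical helper: it only encodes the naming rule from the text.
    # re.escape guards against dataset names containing regex metacharacters.
    pattern = rf"PLOTTING_{re.escape(dataset)}\.ya?ml"
    return re.fullmatch(pattern, filename) is not None


print(is_plotting_file("PLOTTING_HERA1CCEP.yaml", "HERA1CCEP"))  # True
print(is_plotting_file("PLOTTING_HERA1CCEP.yml", "HERA1CCEP"))  # True
print(is_plotting_file("PLOTTING_HERA1CCEP.txt", "HERA1CCEP"))  # False
```

Using a full match rather than a search means that names with trailing characters, such as backup files, are rejected.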
Additionally, the configuration is loaded from a per-process-type file called:
PLOTTINGTYPE_<type>.yaml
See Kinematic labels below for a list of defined types. When a key is present in both the dataset-specific file and the process-type file, the dataset-specific one always takes precedence.
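The precedence rule can be illustrated with a short Python sketch. It assumes both files have already been parsed into dictionaries. The dictionary contents are made up for illustration, and ChainMap is just one convenient way to express the lookup order, not necessarily what the tools use.

```python
from collections import ChainMap

# Hypothetical contents of the two files, already parsed into dictionaries
process_type_config = {"x": "k1", "x_scale": "log"}
dataset_config = {"x_scale": "linear", "dataset_label": "Some dataset"}

# ChainMap searches its mappings in order, so the dataset-specific one wins
config = ChainMap(dataset_config, process_type_config)

print(config["x_scale"])  # linear
print(config["x"])  # k1
```

Keys that only appear in the process-type file are still found, while keys defined in both files resolve to the dataset-specific value.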
The plot file specifies the variable as a function of which the data is to be plotted (on the x axis) as well as the variables as a function of which the data will be split into different lines in the same figure or into different figures. The possible variables (‘kinematic labels’) are described below.
The format also allows control of several plotting properties, such as whether to use a log scale and what the axis labels are.
A key called dataset_label can be used to specify a nice plotting and display label for each dataset. LaTeX math is allowed between dollar signs.
The default kinematic variables are inferred from the process type declared in the commondata files (more specifically from a substring). Currently they are:
'DIS': ('$x$', '$Q^2 (GeV^2)$', '$y$'),
'DYP': ('$y$', '$M^2 (GeV^2)$', '$\\sqrt{s} (GeV)$'),
'EWJ_JPT': ('$p_T (GeV)$', '$M^2 (GeV^2)$', '$\\sqrt{s} (GeV)$'),
'EWJ_JRAP': ('$\\eta/y$', '$M^2 (GeV^2)$', '$\\sqrt{s} (GeV)$'),
'EWJ_MLL': ('$M_{ll} (GeV)$', '$M_{ll}^2 (GeV^2)$', '$\\sqrt{s} (GeV)$'),
'EWJ_PT': ('$p_T (GeV)$', '$M^2 (GeV^2)$', '$\\sqrt{s} (GeV)$'),
'EWJ_PTRAP': ('$\\eta/y$', '$p_T^2 (GeV^2)$', '$\\sqrt{s} (GeV)$'),
'EWJ_RAP': ('$\\eta/y$', '$M^2 (GeV^2)$', '$\\sqrt{s} (GeV)$'),
'EWK_MLL': ('$M_{ll} (GeV)$', '$M_{ll}^2 (GeV^2)$', '$\\sqrt{s} (GeV)$'),
'EWK_PT': ('$p_T$ (GeV)', '$M^2 (GeV^2)$', '$\\sqrt{s} (GeV)$'),
'EWK_PTRAP': ('$\\eta/y$', '$p_T^2 (GeV^2)$', '$\\sqrt{s} (GeV)$'),
'EWK_RAP': ('$\\eta/y$', '$M^2 (GeV^2)$', '$\\sqrt{s} (GeV)$'),
'HIG_RAP': ('$y$', '$M_H^2 (GeV^2)$', '$\\sqrt{s} (GeV)$'),
'HQP_MQQ': ('$M^{QQ} (GeV)$', '$\\mu^2 (GeV^2)$', '$\\sqrt{s} (GeV)$'),
'HQP_PTQ': ('$p_T^Q (GeV)$', '$\\mu^2 (GeV^2)$', '$\\sqrt{s} (GeV)$'),
'HQP_PTQQ': ('$p_T^{QQ} (GeV)$', '$\\mu^2 (GeV^2)$', '$\\sqrt{s} (GeV)$'),
'HQP_YQ': ('$y^Q$', '$\\mu^2 (GeV^2)$', '$\\sqrt{s} (GeV)$'),
'HQP_YQQ': ('$y^{QQ} (GeV)$', '$\\mu^2 (GeV^2)$', '$\\sqrt{s} (GeV)$'),
'INC': ('$0$', '$\\mu^2 (GeV^2)$', '$\\sqrt{s} (GeV)$'),
'JET': ('$\\eta$', '$p_T^2 (GeV^2)$', '$\\sqrt{s} (GeV)$'),
'PHT': ('$\\eta_\\gamma$', '$E_{T,\\gamma}^2 (GeV^2)$', '$\\sqrt{s} (GeV)$'),
'SIA': ('$z$', '$Q^2 (GeV^2)$', '$y$')
This mapping is declared as CommonData.kinLabel_latex in the C++ code (and accessible as validphys.plotoptions.core.kinlabels_latex in the Python code).
The three kinematic variables are referred to as ‘k1’, ‘k2’ and ‘k3’ in the plot files. For example, for DIS processes, ‘k1’ refers to ‘x’, ‘k2’ to ‘Q²’, and ‘k3’ to ‘y’.
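The following Python sketch shows how k1, k2 and k3 could be resolved to their default labels. It copies only three entries of the mapping, and it assumes that the key is matched as a prefix of the process type with the longest match winning. The text only says that a substring is used, so the real lookup may differ. The function name default_labels and the process type strings passed to it are illustrative.

```python
# Excerpt of the mapping listed above; only three entries are copied here
kinlabels_latex = {
    "DIS": ("$x$", "$Q^2 (GeV^2)$", "$y$"),
    "EWK_PT": ("$p_T$ (GeV)", "$M^2 (GeV^2)$", "$\\sqrt{s} (GeV)$"),
    "EWK_PTRAP": ("$\\eta/y$", "$p_T^2 (GeV^2)$", "$\\sqrt{s} (GeV)$"),
}


def default_labels(process_type):
    """Map k1, k2, k3 to their default labels for ``process_type``."""
    # Assumption: prefix match, longest key wins (EWK_PTRAP before EWK_PT)
    candidates = [key for key in kinlabels_latex if process_type.startswith(key)]
    if not candidates:
        raise KeyError(f"No kinematic labels defined for {process_type!r}")
    key = max(candidates, key=len)
    return dict(zip(("k1", "k2", "k3"), kinlabels_latex[key]))


print(default_labels("DIS_NCE"))
print(default_labels("EWK_PTRAP"))
```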
These kinematic values can be overridden by some transformation of them. For that purpose, it is possible to define a kinematics_override key. The value must be a class defined in:
validphys2/src/validphys/plotoptions/kintransforms.py
The class must have a __call__ method that takes three parameters (k1, k2, k3), as defined in the dataset implementation, and returns three new values (k1', k2', k3'), the “transformed” kinematical variables. These are used for plotting purposes every time the kinematic variables k1, k2 and k3 are referred to. Additionally, the class must implement a new_labels method, which takes the old labels and returns the new ones, and an xq2map method, which takes the kinematic variables and returns a tuple (x, Q²) with some approximate values. An example of such a transform is:
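from numpy import ceil, sqrt  # assumed imports; the snippet needs sqrt and ceil

class dis_sqrt_scale:
    def __call__(self, k1, k2, k3):
        # For DIS, (k1, k2, k3) = (x, Q², y), so s = Q²/(x*y)
        ecm = sqrt(k2/(k1*k3))
        return k1, sqrt(k2), ceil(ecm)

    def new_labels(self, *old_labels):
        return ('$x$', '$Q$ (GeV)', r'$\sqrt{s} (GeV)$')

    def xq2map(self, k1, k2, k3, **extra_labels):
        # Here k2 is the transformed value Q, so Q² = k2*k2
        return k1, k2*k2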
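To make the required interface concrete, here is a minimal sketch of a second transform and of how its three methods might be called. The class sqrt_k2_scale and its label text are hypothetical. Plain floats are used instead of arrays to keep the sketch dependency-free. It assumes, as the example above suggests, that xq2map receives the already transformed values.

```python
from math import sqrt


class sqrt_k2_scale:
    """Hypothetical transform: replace k2 by its square root."""

    def __call__(self, k1, k2, k3):
        # Return the transformed kinematics (k1', k2', k3')
        return k1, sqrt(k2), k3

    def new_labels(self, old_k1, old_k2, old_k3):
        # Only the second label changes; the label text is illustrative
        return old_k1, r"$\sqrt{k_2}$", old_k3

    def xq2map(self, k1, k2, k3, **extra_labels):
        # Assumed to receive transformed values, so squaring k2 undoes the sqrt
        return k1, k2 * k2


transform = sqrt_k2_scale()
k1, k2, k3 = transform(0.01, 100.0, 0.5)
print(k1, k2, k3)  # 0.01 10.0 0.5
print(transform.xq2map(k1, k2, k3))  # (0.01, 100.0)
```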
Additional labels can be specified by declaring an extra_labels key in the plotting file, and specifying for each new label a value for each point in the dataset.
For example:
extra_labels:
    idat2bin: [0, 0, 0, 0, 0, 0, 0, 0, 100, 100, 100, 100, 100, 200, 200, 200, 300, 300, 300, 400, 400, 400, 500, 500, 600, 600, 700, 700, 800, 800, 900, 1000, 1000, 1100]
defines one label where the values for each of the datapoints are given in the list. Note that the name of the extra_label (in this case idat2bin) is completely arbitrary, and will be used for plotting purposes (LaTeX math syntax is allowed as well). However, adding labels manually for each point can be tedious, and should be reserved for information that cannot be recovered from the kinematics as defined in the CommonData file. Instead, new labels can be generated programmatically: every function defined in:
validphys2/src/validphys/plotoptions/labelers.py
is a valid label. These functions take as keyword arguments the (possibly transformed) kinematical variables, as well as any extra label declared in the plotting file. For example, one might declare:
def high_xq(k1, k2, k3, **kwargs):
    # Use & rather than `and` so that the comparison also works elementwise on arrays
    return (k1 > 1e-2) & (k2 > 1000)
Note that it is convenient to always declare the ‘**kwargs’ parameter so that the code doesn’t crash when the function is called with extra arguments. Similarly to the kinematics transforms, it is possible to decorate them with a @label describing a nicer LaTeX label than the function name. For example:
@label(r"$I(x>10^{-2})\times I(Q > 1000 GeV)$")
def high_xq(k1, k2, k3, **kwargs):
    return (k1 > 1e-2) & (k2 > 1000)
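The sketch below shows one way such a decorator could work, and why the ‘**kwargs’ parameter matters. The label function defined here is a hypothetical stand-in that simply stores the text as an attribute of the function; the real decorator may be implemented differently. The labeler high_x is likewise made up for illustration.

```python
def label(text):
    """Hypothetical stand-in for the @label decorator."""
    def decorator(func):
        # Assumption: the display label is stored as a function attribute
        func.label = text
        return func
    return decorator


@label(r"$I(x>10^{-2})$")
def high_x(k1, k2, k3, **kwargs):
    return k1 > 1e-2


# The labeler is called with keyword arguments, including an extra label it
# does not use; **kwargs absorbs it instead of raising a TypeError
print(high_x(k1=0.1, k2=50.0, k3=0.3, idat2bin=100))  # True
print(high_x.label)
```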
The variable as a function of which the data is plotted is simply declared as
x: <label>
For example:
x: k1
If a line_by key is specified, points with different values for each of the labels listed will be represented as different lines. For example,
line_by:
- k2
for DIS would mean that the data in the same Q² bin is plotted on the same line.
Similarly, it is possible to define a figure_by key: points with different values for the listed keys will be split across separate figures. For example:
figure_by:
- idat2bin
- high_xq
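The combined effect of figure_by and line_by can be sketched as a two-level grouping: points are first split into figures, and within each figure into lines. The snippet below is an illustration with made-up points and plain Python containers, not the grouping code used by the plotting tools.

```python
from collections import defaultdict

# Hypothetical points: each one carries its kinematics and extra labels
points = [
    {"k1": 0.01, "k2": 10.0, "idat2bin": 0},
    {"k1": 0.10, "k2": 10.0, "idat2bin": 0},
    {"k1": 0.01, "k2": 20.0, "idat2bin": 0},
    {"k1": 0.01, "k2": 20.0, "idat2bin": 100},
]
figure_by = ["idat2bin"]
line_by = ["k2"]

# figures[figure key][line key] -> list of x values (k1) on that line
figures = defaultdict(lambda: defaultdict(list))
for point in points:
    figure_key = tuple(point[name] for name in figure_by)
    line_key = tuple(point[name] for name in line_by)
    figures[figure_key][line_key].append(point["k1"])

for figure_key, lines in figures.items():
    print(figure_key, dict(lines))
```

With these inputs there are two figures (one per idat2bin value), and the first figure contains two lines (one per k2 value).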
By default the y axis represents the central value and error. However, it is possible to define a result_transform in the plotting file:
result_transform: qbinexp
The value must be a function declared in
validphys2/src/validphys/plotoptions/results_transform.py
taking the central value and the error, as well as all the labels as keyword arguments, and returning a new central value and error. For example:
def qbinexp(cv, error, **labels):
    q = labels['k2']
    # bins is assumed to be a helper that maps each value of q to an integer bin index
    qbin = bins(q)
    return 10**qbin*cv, 10**qbin*error
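To show what such a transform does to the numbers, here is a dependency-free sketch that works on plain lists. Both the bins helper and the qbinexp_lists function are hypothetical: bins is assumed to return, for each value, its index among the sorted unique values, and the list-based function only mimics the transform above for illustration.

```python
def bins(values):
    """Hypothetical helper: index of each value among the sorted unique values."""
    unique = sorted(set(values))
    return [unique.index(value) for value in values]


def qbinexp_lists(cv, error, **labels):
    """List-based variant of the transform above, for illustration only."""
    qbin = bins(labels["k2"])
    # Each point is scaled by ten to the power of its bin index
    scaled_cv = [10**b * c for b, c in zip(qbin, cv)]
    scaled_error = [10**b * e for b, e in zip(qbin, error)]
    return scaled_cv, scaled_error


cv, error = qbinexp_lists(
    [1.0, 2.0, 3.0], [0.1, 0.2, 0.3], k1=[0.1, 0.1, 0.2], k2=[4.0, 9.0, 4.0]
)
print(cv)  # [1.0, 20.0, 3.0]
print(error)  # [0.1, 2.0, 0.3]
```

Points in the lowest k2 bin are left unchanged, while points in the next bin are multiplied by ten, so the bins end up offset from each other by powers of ten.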
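Several plotting options can be specified. These include the axis labels (x_label and y_label) and the axis scale (x_scale), all of which appear in the examples in this document.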
When the results are to be plotted as a ratio, it may be convenient to alter the configuration of the plots, for example by changing the line_by labels into figure_by (because otherwise the points would overlap), or by changing the scale from log to linear. To do so, we specify the options we want to override in a normalize key. Everything defined inside will take precedence when we produce a ratio plot and will be ignored for absolute value plots. For example:
x: k1
x_label: '$\left|\eta/y\right|$'
y_label: '$d\sigma/dy$ (fb)'
line_by:
- Boson
normalize:
    figure_by:
        - Boson
extra_labels:
    Boson: ["$W^+$","$W^+$","$W^+$","$W^+$","$W^+$","$W^+$","$W^+$","$W^+$","$W^+$","$W^+$","$W^+$","$W^-$","$W^-$","$W^-$","$W^-$","$W^-$","$W^-$","$W^-$","$W^-$","$W^-$","$W^-$","$W^-$","$Z$","$Z$","$Z$","$Z$","$Z$","$Z$","$Z$","$Z$"]
Here, in ratio plots, we would split the data into a separate figure for each unique value of the key Boson (which is defined explicitly as an extra_label), but in absolute value plots only one plot would be produced, with the three bosons split across different lines.
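The override rule can be sketched in a few lines of Python, assuming the plotting file has been parsed into a dictionary. The helper name effective_options is hypothetical, and the shallow merge shown here is one plausible reading of "take precedence", not a description of the actual implementation.

```python
def effective_options(config, ratio_plot):
    """Return the plotting options for an absolute or a ratio plot."""
    # Hypothetical helper. Copy everything except the override block itself.
    options = {key: value for key, value in config.items() if key != "normalize"}
    if ratio_plot:
        # Keys inside 'normalize' take precedence, for ratio plots only
        options.update(config.get("normalize", {}))
    return options


config = {
    "x": "k1",
    "line_by": ["Boson"],
    "normalize": {"figure_by": ["Boson"]},
}
print(effective_options(config, ratio_plot=False))
print(effective_options(config, ratio_plot=True))
```

For the absolute value plot no figure_by key is present, so all bosons share one figure; for the ratio plot the figure_by entry from normalize is added.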
Plotting files are also used to define metadata related to the various datasets. These keys include:
- experiment (string): The experiment which produced the experimental data.
- process_description (string): A description of the physical process associated with the dataset. This would typically be defined in the PLOTTINGTYPE files.
- data_reference (string): A LaTeX key corresponding to the reference of the experimental paper.
- theory_reference (string): A LaTeX key corresponding to the codes used to compute the theory predictions.

A complete example (all keys are optional) looks like this:
dataset_label: "Some hypothetical dataset"
experiment: ATLAS
x: k3
x_scale: log
kinematics_override: dummy_transform #defined in kintransforms.py
line_by:
- k2
figure_by:
- idat2bin #defined below
- high_xq #defined in labelers.py
normalize: # Change the scale for ratio plots
    x_scale: linear
extra_labels:
    idat2bin: [0, 0, 0, 0, 0, 0, 0, 0, 100, 100, 100, 100, 100, 200, 200, 200, 300, 300, 300, 400, 400, 400, 500, 500, 600, 600, 700, 700, 800, 800, 900, 1000, 1000, 1100]
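A file like this one refers to labels from three sources: the kinematic variables, the extra_labels block and the labeler functions. The sketch below is a hypothetical consistency check, assuming the file has been parsed into a dictionary and that the names of the available labelers are known. The function unknown_labels is not part of any existing tool, and the extra_labels list is shortened for brevity.

```python
def unknown_labels(config, labeler_names):
    """List labels used by x, line_by or figure_by that are not defined."""
    # Hypothetical check; labeler_names stands for the functions in labelers.py
    known = {"k1", "k2", "k3"}
    known |= set(config.get("extra_labels", {}))
    known |= set(labeler_names)
    used = [config["x"]] if "x" in config else []
    used += config.get("line_by", [])
    used += config.get("figure_by", [])
    return [name for name in used if name not in known]


config = {
    "x": "k3",
    "line_by": ["k2"],
    "figure_by": ["idat2bin", "high_xq"],
    "extra_labels": {"idat2bin": [0, 0, 100]},
}
print(unknown_labels(config, labeler_names={"high_xq"}))  # []
print(unknown_labels(config, labeler_names=set()))  # ['high_xq']
```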