---
author: David Leppla-Weber
title: Search for excited quark states decaying to qW/qZ
lang: en-GB
header-includes: |
  \usepackage[onehalfspacing]{setspace}
  \usepackage{siunitx}
  \usepackage{tikz-feynman}
  \usepackage{csquotes}
  \pagenumbering{gobble}
  \setlength{\parskip}{0.5em}
  \bibliographystyle{lucas_unsrt}
abstract: |
  A search for an excited quark state, called q\*, is presented using data recorded by CMS during the years 2016, 2017
  and 2018. By analysing its decay channels to qW and qZ, lower mass limits of 6.1 TeV and 5.5 TeV, respectively, are
  established. These limits are about 1 TeV higher than those found by a previous analysis of data collected by CMS in
  2016 [@PREV_RESEARCH], which excluded the q\* particle up to masses of 5.0 TeV and 4.7 TeV, respectively. In
  addition, a comparison of the new DeepAK8 [@DEEP_BOOSTED] and the older N-subjettiness [@TAU21] tagger is conducted,
  showing that the newer DeepAK8 tagger currently performs at approximately the same level as the N-subjettiness
  tagger, but has the potential to improve further.

  ```{=tex}
  \end{abstract}
  \begin{abstract}
  Abstract 2.
  ```
documentclass: article
geometry:
- top=2.5cm
- left=2.5cm
- right=2.5cm
- bottom=2cm
papersize: a4
mainfont: Times New Roman
fontsize: 12pt
toc: true
nocite: |
  @*
---

\newpage
\pagenumbering{arabic}

# Introduction

The Standard Model is a very successful theory, describing most effects at the particle level. But it still has
shortcomings showing that it is not yet a complete "theory of everything". Many theories beyond the Standard Model
exist that try to resolve some of them.

One category of such theories is based on a composite quark model. Quarks are currently considered elementary
particles by the Standard Model. The composite quark models, on the other hand, predict that quarks consist of
particles unknown to us so far, or can bind to other particles via unknown forces. This could explain the symmetries
between particles and reduce the number of constants needed to describe the properties of the known particles. One
common prediction of those theories are excited quark states: quark states of higher energy that can decay to an
unexcited quark under the emission of a boson. This thesis searches for their decay to a quark and a W/Z boson. The
W/Z boson then decays in the hadronic channel, to two more quarks. The final state of this decay contains only quarks,
making quantum chromodynamics (QCD) effects the main background.

In a previous analysis [@PREV_RESEARCH], a lower limit on the mass of an excited quark has already been set using data
from the 2016 run of the Large Hadron Collider with an integrated luminosity of $\SI{35.92}{\per\femto\barn}$. Since
then, much more data has been collected, totalling $\SI{137.19}{\per\femto\barn}$ of data usable for research. This
thesis uses this new data as well as a new technique, based on a deep neural network, to identify decays of highly
boosted particles. By using more data and new tagging techniques, it aims to either confirm the existence of the q\*
particle or push the previously set lower mass limits of 5.0 TeV and 4.7 TeV for the decays to qW and qZ,
respectively, to even higher values. It will also directly compare the performance of this new tagging technique to an
older tagger based on jet substructure studies used in the previous analysis.

In chapter 2, a theoretical background is presented, explaining in short the Standard Model, its shortcomings and the
theory of excited quarks. Then, in chapter 3, the Large Hadron Collider and the Compact Muon Solenoid, the detector
that collected the data for this analysis, are described. After that, in chapters 4-7, the main analysis part follows,
describing how the data was used to extract limits on the mass of the excited quark particle. At the very end, in
chapter 8, the results are presented and compared to previous research.

\newpage

# Theoretical motivation

This chapter presents a short summary of the theoretical background relevant to this thesis. It first gives an
introduction to the Standard Model itself and some of the issues it raises. It then goes on to explain the background
processes of quantum chromodynamics and the theory of q*, which is the main topic of this thesis.

## Standard model {#sec:sm}

The Standard Model of particle physics has proved very successful in describing three of the four fundamental
interactions currently known: the electromagnetic, weak and strong interaction. The fourth, gravity, could not yet be
successfully included in this theory.

The Standard Model divides all particles into fermions, carrying half-integer spin, and bosons, carrying integer spin.
All known elementary fermions have spin $\frac{1}{2}$, while the known bosons have either spin one (gauge bosons) or
spin zero (scalar bosons). Fermions are further classified into quarks and leptons.
Quarks and leptons can also be categorised into three generations, each of which contains two particles, also called
flavours. For leptons, the three generations each consist of a lepton and its corresponding neutrino, namely first the
electron, then the muon and third, the tau. The three quark generations consist of first, the up and down, second, the
charm and strange, and third, the top and bottom quark. Overall, there exists a total of six quark and six lepton
flavours. A full list of particles known to the Standard Model can be found in [@fig:sm]. Furthermore, all fermions
have an associated antiparticle with reversed charge.

The matter around us is built from so-called hadrons, bound states of quarks, for example protons and neutrons.
Long-lived hadrons consist of up and down quarks, as the heavier quarks decay to those over time.

{width=50% #fig:sm}

The gauge bosons, namely the photon, the $W^\pm$ bosons, the $Z^0$ boson, and the gluon, are the mediators of the
different forces of the Standard Model.

The photon is responsible for the electromagnetic force and therefore interacts with all electrically charged
particles. It itself carries no electromagnetic charge and has no mass. Possible interactions are either scattering or
absorption. Photons of different energies can also be described as electromagnetic waves of different wavelengths.

The $W^\pm$ and $Z^0$ bosons mediate the weak force. All quarks and leptons carry a flavour, which is a conserved
quantity. Only the weak interaction breaks this conservation: a quark or lepton can, by interacting with a $W^\pm$
boson, change its flavour. The probabilities of this happening are determined by the Cabibbo-Kobayashi-Maskawa matrix:

\begin{equation}
V_{CKM} =
\begin{pmatrix}
|V_{ud}| & |V_{us}| & |V_{ub}| \\
|V_{cd}| & |V_{cs}| & |V_{cb}| \\
|V_{td}| & |V_{ts}| & |V_{tb}|
\end{pmatrix}
=
\begin{pmatrix}
0.974 & 0.225 & 0.004 \\
0.224 & 0.974 & 0.042 \\
0.008 & 0.041 & 0.999
\end{pmatrix}
\end{equation}

The probability of a quark changing its flavour from $i$ to $j$ is given by the square of the absolute value of the
matrix element $V_{ij}$. It is easy to see that a change of flavour within the same generation is far more likely than
any other flavour change.
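
As a short worked example using the matrix values above, the in-generation transition of an up-type quark compared to
the cross-generation one gives

\begin{equation}
|V_{ud}|^2 \approx 0.974^2 \approx 0.95 \qquad \text{vs.} \qquad |V_{us}|^2 \approx 0.225^2 \approx 0.05,
\end{equation}

so the transition $u \rightarrow d$ is roughly 19 times more likely than $u \rightarrow s$.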

Due to their high masses of 80.39 GeV and 91.19 GeV, respectively, the $W^\pm$ and $Z^0$ bosons decay very quickly,
either in the leptonic or in the hadronic decay channel. In the leptonic channel, the $W^\pm$ decays to a lepton and
the corresponding anti-lepton neutrino; in the hadronic channel it decays to a quark and an anti-quark of a different
flavour. As the $Z^0$ boson carries no charge, it always decays to a fermion and its antiparticle: in the leptonic
channel this might be, for example, an electron-positron pair, in the hadronic channel an up and anti-up quark pair.
This thesis examines the hadronic decay channel, where both vector bosons essentially decay to two quarks.

Quantum chromodynamics (QCD) describes the strong interaction of particles. It applies to all particles carrying
colour (e.g. quarks). The force is mediated by gluons. These bosons carry colour as well; they carry not a single
colour but a combination of a colour and an anticolour, can therefore interact with themselves, and exist in eight
different variants. As a result, processes where a gluon splits into two gluons are possible. Furthermore, the
strength of the strong force binding colour-carrying particles increases with their distance, making it at a certain
point energetically more efficient to form a new quark-antiquark pair than to separate the two particles even further.
This effect is known as colour confinement. Due to this effect, colour-carrying particles can't be observed directly,
but rather form so-called jets that cause hadronic showers in the detector. Those jets are cone-like structures made
of hadrons and other particles. The underlying process is called hadronisation.

### Shortcomings of the Standard Model

While being very successful in describing the effects observed in particle colliders or the particles reaching Earth
from cosmological sources, the Standard Model still has several shortcomings.

- **Gravity**: as already noted, the Standard Model doesn't include gravity as a force.
- **Dark matter**: observations of the rotational velocities of galaxies can't be explained by the known matter. Dark
  matter is currently our best theory to explain those observations.
- **Matter-antimatter asymmetry**: the amount of matter vastly outweighs the amount of antimatter in the observable
  universe. This can't be explained by the Standard Model, which predicts similar amounts of matter and antimatter.
- **Symmetries between particles**: why do exactly three generations of fermions exist? Why is the charge of a quark
  exactly one third of the charge of a lepton? How are the masses of the particles related? These and more questions
  cannot be answered by the Standard Model.
- **Hierarchy problem**: the weak force is approximately $10^{24}$ times stronger than gravity, and so far there is no
  satisfactory explanation as to why that is.

## Excited quark states {#sec:qs}

One category of theories that try to explain the symmetries between particles of the Standard Model are the composite
quark models. Those state that quarks consist of particles unknown to us so far. This could explain the symmetries
between the different fermions. A common prediction of those models are excited quark states (q\*, q\*\*, q\*\*\*,
...). Similar to atoms, which can be excited by the absorption of a photon and then decay again under emission of a
photon with an energy corresponding to the excited state, those excited quark states could decay under the emission of
any boson. Quarks are smaller than $10^{-18}$ m, which corresponds to an energy scale of approximately 1 TeV. The
excited quark states are therefore expected to be in that region, which causes the emitted boson to be highly boosted.

\begin{figure}
\centering
\feynmandiagram [large, horizontal=qs to v] {
a -- qs -- b,
qs -- [fermion, edge label=\(q*\)] v,
q1 [particle=\(q\)] -- v -- w [particle=\(W\)],
q2 [particle=\(q\)] -- w -- q3 [particle=\(q\)],
};
\caption{Feynman diagram showing a possible decay of a q* particle to a W boson and a quark with the W boson also
decaying to two quarks.} \label{fig:qsfeynman}
\end{figure}

This thesis searches data collected by CMS in the years 2016, 2017 and 2018 for the single excited quark state q\*,
which can decay to a quark and any boson. An example of a q\* decaying to a quark and a W boson can be seen in
[@fig:qsfeynman]. As explained in [@sec:sm], the vector boson can then decay either in the hadronic or the leptonic
decay channel. This analysis investigates only the hadronic channel with two quarks in the final state. Because the
boson is highly boosted, those quarks will be very close together and therefore appear to the detector as only one
jet. This means that the decay of a q\* particle has two jets in the final state (assuming the W/Z boson decays to two
quarks) and is therefore hard to distinguish from the QCD background described in [@sec:qcdbg].

The choice of only examining the decays of the q\* particle to the vector bosons is motivated by the branching ratios
calculated for the decay [@QSTAR_THEORY]:

: Branching ratios of the decaying q\* particle.

| decay mode                 | br. ratio [%] | decay mode                 | br. ratio [%] |
|----------------------------|---------------|----------------------------|---------------|
| $U^* \rightarrow ug$       | 83.4          | $D^* \rightarrow dg$       | 83.4          |
| $U^* \rightarrow dW$       | 10.9          | $D^* \rightarrow uW$       | 10.9          |
| $U^* \rightarrow u\gamma$  | 2.2           | $D^* \rightarrow d\gamma$  | 0.5           |
| $U^* \rightarrow uZ$       | 3.5           | $D^* \rightarrow dZ$       | 5.1           |

The decays to the vector bosons have the second highest branching ratios. The decay to a gluon and a quark is the
dominant one, but virtually impossible to distinguish from the QCD background described in the next section. This
makes the decays to the vector bosons the obvious choice.

To reconstruct the mass of the q\* particle from an event successfully recognised as the decay of such a particle, the
dijet invariant mass has to be calculated. This can be achieved by adding the jets' four-momenta, vectors consisting
of the energy and momentum of a particle. From the four-momentum it is easy to derive the mass by solving
$E=\sqrt{p^2 + m^2}$ for $m$.
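
A minimal sketch of this reconstruction in natural units (the four-momentum values below are hypothetical):

```python
import numpy as np

def invariant_mass(jets):
    """Dijet invariant mass from four-momenta (E, px, py, pz) in natural units."""
    E, px, py, pz = np.sum(jets, axis=0)            # add the four-momenta component-wise
    return np.sqrt(E**2 - (px**2 + py**2 + pz**2))  # m = sqrt(E^2 - |p|^2)

# two hypothetical, roughly back-to-back jets of ~1.5 TeV each (in GeV)
jet1 = np.array([1500.0,  1480.0, 100.0, 150.0])
jet2 = np.array([1520.0, -1490.0, -90.0, 200.0])
print(f"m_jj = {invariant_mass([jet1, jet2]):.0f} GeV")  # ~3000 GeV
```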

This theory has already been investigated in [@PREV_RESEARCH] by analysing the hadronic decay of the vector boson in
data recorded by CMS in 2016, excluding the q\* particle up to a mass of 5.0 TeV for the decay to qW and 4.7 TeV for
the decay to qZ. This thesis aims to either exclude the particle up to higher masses or find a resonance showing its
existence, using the high centre of mass energy of the LHC as well as the larger amount of data that is available now.

### Quantum chromodynamics background {#sec:qcdbg}

In this thesis, a decay with two jets in the final state is analysed. It is therefore hard to distinguish the signal
processes from QCD effects, which can also produce two jets in the final state, as can be seen in [@fig:qcdfeynman].
They also occur very frequently in proton-proton collisions such as those in the Large Hadron Collider. This is caused
by the structure of the proton. It not only consists of three quarks, called valence quarks, but also of a lot of
quark-antiquark pairs connected by gluons, called the sea quarks, which exist due to the self-interaction of the
gluons binding the three valence quarks. The QCD multijet background is therefore the dominant background for the
signal described in [@sec:qs].

\begin{figure}
\centering
\feynmandiagram [horizontal=v1 to v2] {
q1 [particle=\(q\)] -- [fermion] v1 -- [gluon] g1 [particle=\(g\)],
v1 -- [gluon] v2,
q2 [particle=\(q\)] -- [fermion] v2 -- [gluon] g2 [particle=\(g\)],
};
\feynmandiagram [horizontal=v1 to v2] {
g1 [particle=\(g\)] -- [gluon] v1 -- [gluon] g2 [particle=\(g\)],
v1 -- [gluon] v2,
g3 [particle=\(g\)] -- [gluon] v2 -- [gluon] g4 [particle=\(g\)],
};
\caption{Two examples of QCD processes resulting in two jets.} \label{fig:qcdfeynman}
\end{figure}

\newpage

# Experimental Setup

In the following, the experimental setup used to gather the data analysed in this thesis is described.

## Large Hadron Collider

The Large Hadron Collider (LHC) is the world's largest and most powerful particle accelerator [@website]. It has a
circumference of 27 km and can accelerate two beams of protons to an energy of 6.5 TeV, resulting in collisions with a
centre of mass energy of 13 TeV. It is home to several experiments, the biggest of which are ATLAS and the Compact
Muon Solenoid (CMS). Both are general-purpose detectors to investigate the particles that form during particle
collisions.

Particle colliders are characterised by their luminosity L. It allows the calculation of the number of events per
second generated in collisions via $N_{event} = L\sigma_{event}$, with $\sigma_{event}$ being the cross section of the
event. The luminosity of the LHC for a Gaussian beam distribution can be described as follows:

\begin{equation}
L = \frac{N_b^2 n_b f_{rev} \gamma_r}{4 \pi \epsilon_n \beta^*}F
\end{equation}

where $N_b$ is the number of particles per bunch, $n_b$ the number of bunches per beam, $f_{rev}$ the revolution
frequency, $\gamma_r$ the relativistic gamma factor, $\epsilon_n$ the normalised transverse beam emittance, $\beta^*$
the beta function at the collision point and $F$ the geometric luminosity reduction factor due to the crossing angle
at the interaction point:

\begin{equation}
F = \left(1+\left( \frac{\theta_c\sigma_z}{2\sigma^*}\right)^2\right)^{-1/2}
\end{equation}

At the design luminosity of $10^{34}\si{\per\square\centi\metre\per\s}$, $N_b = 1.15 \cdot 10^{11}$, $n_b = 2808$,
$f_{rev} = \SI{11.2}{\kilo\Hz}$, $\beta^* = \SI{0.55}{\m}$, $\epsilon_n = \SI{3.75}{\micro\m}$ and $F = 0.85$.
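
These numbers can be cross-checked with a few lines of Python. The relativistic gamma factor is not quoted above, so
$\gamma_r \approx 7461$ (protons at the 7 TeV design beam energy) is assumed here:

```python
import math

N_b   = 1.15e11   # protons per bunch
n_b   = 2808      # bunches per beam
f_rev = 11.2e3    # revolution frequency in Hz
gamma = 7461      # assumed: protons at the 7 TeV design beam energy
eps_n = 3.75e-6   # normalised transverse emittance in m
beta  = 0.55      # beta function at the interaction point in m
F     = 0.85      # geometric reduction factor

L = N_b**2 * n_b * f_rev * gamma / (4 * math.pi * eps_n * beta) * F  # in m^-2 s^-1
print(f"L = {L * 1e-4:.2e} cm^-2 s^-1")  # ~1e34, matching the quoted design value
```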

To quantify the amount of data collected by one of the experiments at the LHC, the integrated luminosity is introduced
as $L_{int} = \int L \, dt$.

## Compact Muon Solenoid

The data used in this thesis was recorded by the Compact Muon Solenoid (CMS). It is one of the four main experiments
at the Large Hadron Collider. It can detect all elementary particles of the Standard Model except neutrinos. For that,
it has an onion-like setup, as can be seen in [@fig:cms_setup]. The particles produced in a collision first go through
a tracking system. They then pass an electromagnetic as well as a hadronic calorimeter. This part is surrounded by a
superconducting solenoid that generates a magnetic field of 3.8 T. Outside of the solenoid are large muon chambers. In
2016, CMS recorded data corresponding to an integrated luminosity of $\SI{37.80}{\per\femto\barn}$. In 2017 it
collected $\SI{44.98}{\per\femto\barn}$ and in 2018 $\SI{63.67}{\per\femto\barn}$. Of that data,
$\SI{35.92}{\per\femto\barn}$ in 2016, $\SI{41.53}{\per\femto\barn}$ in 2017 and $\SI{59.74}{\per\femto\barn}$ in 2018
were usable for research. The combined integrated luminosity of data usable for research is therefore
$\SI{137.19}{\per\femto\barn}$.

![
The setup of the Compact Muon Solenoid showing its onion-like structure, the different detector parts and where
different particles are detected [@CMS_PLOT].
](./figures/cms_setup.png){#fig:cms_setup}

### Coordinate conventions

Per convention, the z axis points along the beam axis in the direction of the magnetic field of the solenoid, the y
axis upwards and the x axis horizontally towards the LHC centre. The azimuthal angle $\phi$, which describes the angle
in the x-y plane, the polar angle $\theta$, which describes the angle in the y-z plane, and the pseudorapidity $\eta$,
which is defined as $\eta = -\ln\left(\tan\frac{\theta}{2}\right)$, are also introduced. The coordinates are
visualised in [@fig:cmscoords]. Furthermore, to describe a particle's momentum, often the transverse momentum $p_t$ is
used. It is the component of the momentum transverse to the beam axis. Before the collision, the transverse momentum
is necessarily zero; therefore, due to momentum conservation, the sum of all transverse momenta after the collision
has to be zero, too. If this is not the case for the detected events, it implies particles that weren't detected, such
as neutrinos.
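
A minimal sketch of these two definitions:

```python
import numpy as np

def pseudorapidity(theta):
    """eta = -ln(tan(theta/2)), with theta the polar angle in radians."""
    return -np.log(np.tan(theta / 2))

def p_t(px, py):
    """Transverse momentum: the momentum component transverse to the beam (z) axis."""
    return np.hypot(px, py)

print(pseudorapidity(np.pi / 2))        # 0.0: perpendicular to the beam
print(pseudorapidity(np.radians(10)))   # ~2.44: close to the beam axis
```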

{#fig:cmscoords width=60%}

### The tracking system

The tracking system is built of two parts: closest to the collision is a pixel detector, and around that, silicon
strip sensors. They are used to reconstruct the tracks of charged particles, measuring their charge sign, direction
and momentum. They are as close to the collision as possible to be able to identify secondary vertices.

### The electromagnetic calorimeter

The electromagnetic calorimeter (ECAL) measures the energy of photons and electrons. It is made of tungstate crystals
and photodetectors. When passed by particles, the crystals produce light in proportion to the particle's energy. This
light is measured by the photodetectors, which convert this scintillation light to an electrical signal. To measure a
particle's energy, it has to deposit all of its energy in the ECAL, which is true for photons and electrons, but not
for other particles such as hadrons and muons. Those are of higher energy and therefore only deposit some of their
energy in the ECAL but are not stopped by it.

### The hadronic calorimeter

The hadronic calorimeter (HCAL) is used to detect high-energy hadronic particles. It surrounds the ECAL and is made of
alternating layers of active and absorber material. While the absorber material with its high density causes the
hadrons to shower, the active material then detects those showers and measures their energy, similar to how the ECAL
works.

### The solenoid

The solenoid, giving the detector its name, is one of its most important features. It creates a magnetic field of
3.8 T and therefore makes it possible to measure the momentum of charged particles by bending their tracks.

### The muon system

Outside of the solenoid there is only the muon system. It consists of three types of gas detectors: the drift tubes,
cathode strip chambers and resistive plate chambers. It covers a total of $0 < |\eta| < 2.4$. Muons are the only
detected particles that can pass all the other systems without a significant energy loss.

### The trigger system

The CMS features a two-level trigger system. It is necessary because the detector is unable to process all the events
due to limited bandwidth. The Level 1 trigger reduces the event rate from 40 MHz to 100 kHz; the software-based High
Level Trigger is then able to further reduce the rate to 1 kHz. The Level 1 trigger uses the data from the
electromagnetic and hadronic calorimeters as well as the muon chambers to decide whether to keep an event. The High
Level Trigger uses a streamlined version of the CMS offline reconstruction software for its decision making.

### The Particle Flow algorithm

The Particle Flow algorithm is used to identify and reconstruct all the particles arising from the proton-proton
collision by using all the information available from the different sub-detectors of the CMS. It does so by
extrapolating the tracks through the different calorimeters and associating the clusters they cross with them. The
set of the track and its clusters is then no longer used for the detection of other particles. This is first done for
muons and then for charged hadrons, so a muon can't give rise to a wrongly identified charged hadron. Due to
bremsstrahlung photon emission, electrons are harder to reconstruct; for them, a specific track reconstruction
algorithm is used. After identifying charged hadrons, muons and electrons, all remaining clusters within the HCAL
correspond to neutral hadrons and within the ECAL to photons. Once the list of particles and their corresponding
deposits is established, it can be used to determine the particles' four-momenta. From that, the missing transverse
energy can be calculated and tau particles can be reconstructed via their decay products.

## Jet clustering

Because of hadronisation it is not possible to uniquely identify the originating particle of a jet. Nonetheless,
several algorithms exist to help with this problem. The algorithm used in this thesis is the anti-$k_t$ clustering
algorithm. It arises from a generalisation of several other clustering algorithms, namely the $k_t$, Cambridge/Aachen
and SISCone clustering algorithms.

The anti-$k_t$ clustering algorithm associates hard particles with the soft particles surrounding them within a radius
$R$, where distances $\Delta R = \sqrt{(\Delta\eta)^2 + (\Delta\phi)^2}$ are measured in the $\eta$-$\phi$ plane,
forming cone-like jets. If two jets overlap, the jets' shapes change according to their hardness in terms of
transverse momentum: a softer particle's jet will change its shape more than a harder particle's. A visual comparison
of four different clustering algorithms can be seen in [@fig:antiktcomparison]. For this analysis, a radius of 0.8 is
used.
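
A minimal sketch of the anti-$k_t$ distance measure from [@ANTIKT], which drives this behaviour; the full algorithm
repeatedly merges the pair with the smallest $d_{ij}$, or promotes a particle to a jet when its beam distance $d_{iB}$
is smallest:

```python
import math

R = 0.8  # distance parameter used in this analysis

def delta_r2(a, b):
    """Squared distance in the eta-phi plane between particles a and b."""
    deta = a["eta"] - b["eta"]
    dphi = (a["phi"] - b["phi"] + math.pi) % (2 * math.pi) - math.pi  # wrap to [-pi, pi]
    return deta**2 + dphi**2

def d_ij(a, b):
    """anti-kt inter-particle distance: min(1/pt_i^2, 1/pt_j^2) * dR^2 / R^2."""
    return min(a["pt"]**-2, b["pt"]**-2) * delta_r2(a, b) / R**2

def d_iB(a):
    """anti-kt particle-beam distance."""
    return a["pt"]**-2
```

Because hard (high-$p_t$) particles have small $1/p_t^2$, they absorb nearby soft particles first, which is what
produces the regular cone-like jets seen in [@fig:antiktcomparison].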

Furthermore, to approximate the mass of a heavy particle that caused a jet, the softdrop mass can be used. It is
calculated by removing wide-angle soft particles from the jet to counter the effects of contamination from initial
state radiation, the underlying event and multiple hadron scattering. It is therefore more accurate in determining the
mass of a particle causing a jet than taking the combined mass of all constituent particles of the jet.

![
Comparison of the $k_t$, Cambridge/Aachen, SISCone and anti-$k_t$ algorithms clustering a sample parton-level event
with many random soft "ghosts". Taken from [@ANTIKT].
](./figures/antikt-comparision.png){#fig:antiktcomparison}

[@Fig:antiktcomparison] shows that the jets reconstructed using the anti-$k_t$ algorithm have the clearest cone-like
shape, which is why it is chosen for this thesis.

\newpage

# Method of analysis {#sec:moa}

This section gives an overview of how the data gathered by the LHC and CMS is analysed to either exclude the q\*
particle up to even higher masses than already done, or possibly confirm its existence.

As described in [@sec:qs], the decay of the q\* particle to a quark and a vector boson, with the vector boson then
decaying hadronically, is investigated. This is the second most probable decay of the q\* particle and easier to
analyse than the dominant decay to a quark and a gluon, making it a good choice for this analysis.
The decay q\* $\rightarrow$ qV $\rightarrow$ q + $q\bar{q}$ results in two jets, because the decay products of the
heavy vector boson are highly boosted, causing them to be very close together and therefore to be reconstructed as one
jet. The dijet invariant mass of those two jets, which is identical to the mass of the q\* particle, is reconstructed.
The only background considered is the QCD background described in [@sec:qcdbg]. A selection using different kinematic
variables as well as a tagger identifying jets from the decay of a vector boson is introduced to reduce the background
and increase the sensitivity to the signal. After that, the dijet invariant mass distribution is searched for a peak
at the resonance mass of the q\* particle.

The data studied were collected by the CMS experiment in the years 2016, 2017 and 2018. They are analysed with the
Particle Flow algorithm to reconstruct jets and all the other particles formed during the collision. The jets are then
clustered using the anti-$k_t$ algorithm with the distance parameter R being 0.8.

The analysis is conducted with two different sets of data. First, only the data collected by CMS in 2016 is used, to
compare the results to the previous analysis [@PREV_RESEARCH]. Then the combined data from 2016, 2017 and 2018 is used
to improve the previously set limits on the mass of the q\* particle. Also, two different V-tagging mechanisms are
used to compare their performance: one based on the N-subjettiness variable used in the previous analysis
[@PREV_RESEARCH], the other a novel approach using a deep neural network, which is explained in the following.

## Signal and background modelling

To make sure the setup works as intended, simulated samples of background and signal are used first. In those Monte
Carlo simulations, the different particle interactions that take place in a proton-proton collision are simulated
using the probabilities provided by the Standard Model, by calculating the cross sections of the different Feynman
diagrams. Afterwards, detector effects (like its limited resolution) are applied to make the samples look like real
data coming from the CMS detector. The q\* signal samples are simulated with the probabilities given by the q\* theory
[@QSTAR_THEORY], assuming a cross section of $\SI{1}{\pico\barn}$. The simulation was done using MadGraph. Because of
the expected high mass, the signal width is dominated by the resolution of the detector, not by the natural resonance
width.

The dijet invariant mass distribution of the QCD background is expected to fall smoothly with increasing mass. It is
therefore fitted using the following smoothly falling function with three parameters $p_0$, $p_1$, $p_2$:

\begin{equation}
\frac{dN}{dm_{jj}} = \frac{p_0 \cdot ( 1 - m_{jj} / \sqrt{s} )^{p_2}}{ (m_{jj} / \sqrt{s})^{p_1}}
\end{equation}

where $m_{jj}$ is the invariant mass of the dijet and $p_0$ is a normalisation parameter. It is the same function as
used in the previous analysis studying 2016 data only.
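
A minimal sketch of such a fit on hypothetical binned data, using `scipy` (the actual analysis uses its own fitting
machinery):

```python
import numpy as np
from scipy.optimize import curve_fit

SQRT_S = 13000.0  # centre of mass energy in GeV

def qcd_shape(mjj, p0, p1, p2):
    """Smoothly falling dijet background: p0 * (1 - x)^p2 / x^p1 with x = mjj/sqrt(s)."""
    x = mjj / SQRT_S
    return p0 * (1 - x) ** p2 / x ** p1

# hypothetical bin centres and event counts with 5 % fluctuations
mjj = np.linspace(1100, 6000, 50)
counts = qcd_shape(mjj, 1e-2, 5.0, 10.0) * np.random.normal(1.0, 0.05, mjj.size)

popt, pcov = curve_fit(qcd_shape, mjj, counts, p0=[1e-2, 4.0, 8.0])
print("fitted p0, p1, p2:", popt)
```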

The signal is fitted using a double-sided crystal ball function. It has six parameters:

- mean: the function's mean, in this case the resonance mass
- sigma: the function's width, in this case the resolution of the detector
- n1, n2, alpha1, alpha2: parameters influencing the shape of the left and right tail

A Gaussian and a Poisson function have also been studied but were found unable to reproduce the signal shape, as they
couldn't model the tails on both sides of the peak.
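
For illustration, a minimal sketch of a double-sided crystal ball density (Gaussian core with power-law tails,
normalisation omitted), using a common parametrisation; the exact convention of the fitting framework may differ:

```python
import numpy as np

def double_sided_crystal_ball(x, mean, sigma, alpha1, n1, alpha2, n2):
    """Gaussian core for -alpha1 < t < alpha2, power-law tails outside (unnormalised)."""
    t = (x - mean) / sigma
    # tail coefficients chosen so the function and its first derivative are continuous
    a1 = (n1 / alpha1) ** n1 * np.exp(-0.5 * alpha1**2)
    b1 = n1 / alpha1 - alpha1
    a2 = (n2 / alpha2) ** n2 * np.exp(-0.5 * alpha2**2)
    b2 = n2 / alpha2 - alpha2
    return np.where(
        t < -alpha1, a1 * (b1 - t) ** -n1,          # left power-law tail
        np.where(t > alpha2, a2 * (b2 + t) ** -n2,  # right power-law tail
                 np.exp(-0.5 * t**2)))              # Gaussian core

x = np.linspace(2000, 4000, 5)
print(double_sided_crystal_ball(x, mean=3000, sigma=100, alpha1=1.5, n1=3, alpha2=1.7, n2=4))
```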

An example of a fit of these functions to a toy dataset with Gaussian errors can be seen in [@fig:cb_fit]. In this
figure, a binning of 200 GeV is used; for the actual analysis a 1 GeV binning is used. The fit works very well,
confirming the functions chosen to model signal and background. This is supported by a $\chi^2/$ndof of 0.5 and a
fitted signal mean of 2999 $\pm$ 23 $\si{\giga\eV}$, very close to the expected mean of 3000 GeV. Those numbers show
that the method in use is able to successfully describe the data.

{#fig:cb_fit}

\newpage

# Preselection and data quality

To reduce the background and increase the signal sensitivity, a selection of events by different variables is
introduced. It is divided into two stages. The first one (the preselection) adds some general, physics-motivated
selection using kinematic variables and is also used to make sure a good trigger efficiency is achieved. In the second
stage, different taggers are used as a discriminator between QCD background and signal events. After the preselection,
it is verified that the simulated samples represent the real data well, by comparing the data with the simulation in
the signal region as well as in a sideband region where no signal events are expected.

## Preselection

First, all events are cleaned of jets with $p_t < \SI{200}{\giga\eV}$ or a pseudorapidity $|\eta| > 2.4$. This is done
to discard soft background and to make sure the particles are in the barrel region of the detector for an optimal
track reconstruction. Furthermore, all events in which one of the two highest-$p_t$ jets has an angular separation
smaller than 0.8 from any electron or muon are discarded, to allow future use of the results in studies of the semi-
or all-leptonic decay channels.

From a decaying q\* particle, we expect two jets in the final state. The dijet invariant mass of those two jets is
used to reconstruct the mass of the q\* particle. Therefore a cut requiring at least two jets is added.
More jets are also possible, for example caused by gluon radiation off a quark causing another jet. If this is the
case, the two jets with the highest $p_t$ are used for the reconstruction of the q\* mass.
The distributions of the number of jets before and after the selection can be seen in [@fig:njets].

\begin{figure}
\begin{minipage}{0.5\textwidth}
\includegraphics{./figures/2016/v1_Cleaner_N_jets_stack.eps}
\end{minipage}
\begin{minipage}{0.5\textwidth}
\includegraphics{./figures/2016/v1_Njet_N_jets_stack.eps}
\end{minipage}
\begin{minipage}{0.5\textwidth}
\includegraphics{./figures/combined/v1_Cleaner_N_jets_stack.eps}
\end{minipage}
\begin{minipage}{0.5\textwidth}
\includegraphics{./figures/combined/v1_Njet_N_jets_stack.eps}
\end{minipage}
\caption{Number of jets distribution showing the cut at number of jets $\ge$ 2. Left: distribution before the cut.
Right: distribution after the cut. 1st row: data from 2016. 2nd row: combined data from 2016, 2017 and 2018. The
signal curves are scaled up by a factor of 10,000 to be visible.}
\label{fig:njets}
\end{figure}

The next selection is done using $\Delta\eta = |\eta_1 - \eta_2|$, with $\eta_1$ and $\eta_2$ being the $\eta$ of the
two leading jets in terms of transverse momentum. The q\* particle is expected to be very heavy compared to the centre
of mass energy of the collision and will therefore be almost stationary. Its decay products should thus be close to
back to back, which means the $\Delta\eta$ distribution is expected to peak at 0. At the same time, particles
originating from QCD effects are expected to have a higher $\Delta\eta$, as they mainly come from less heavy
resonances. To maintain comparability, the same selection as in the previous analysis, $\Delta\eta \le 1.3$, is used.
A comparison of the $\Delta\eta$ distribution before and after the selection can be seen in [@fig:deta].

\begin{figure}
\begin{minipage}{0.5\textwidth}
\includegraphics{./figures/2016/v1_Njet_deta_stack.eps}
\end{minipage}
\begin{minipage}{0.5\textwidth}
\includegraphics{./figures/2016/v1_Eta_deta_stack.eps}
\end{minipage}
\begin{minipage}{0.5\textwidth}
\includegraphics{./figures/combined/v1_Njet_deta_stack.eps}
\end{minipage}
\begin{minipage}{0.5\textwidth}
\includegraphics{./figures/combined/v1_Eta_deta_stack.eps}
\end{minipage}
\caption{$\Delta\eta$ distribution showing the cut at $\Delta\eta \le 1.3$. Left: distribution before the cut. Right:
distribution after the cut. 1st row: data from 2016. 2nd row: combined data from 2016, 2017 and 2018. The signal
curves are scaled up by a factor of 10,000 to be visible.}
\label{fig:deta}
\end{figure}

The last cut of the preselection is on the dijet invariant mass: $m_{jj} \ge \SI{1050}{\giga\eV}$. It is important for
a high trigger efficiency and can be seen in [@fig:invmass]. It also has a huge impact on the background, because the
background usually consists of far lighter particles, whereas the q\* is expected to have a very high invariant mass
of more than 1 TeV. The $m_{jj}$ distribution should be a smoothly falling function for the QCD background and peak at
the simulated resonance mass for the signal events.
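
A minimal sketch of the full preselection chain on a per-event basis, reusing the `invariant_mass` helper sketched in
the theory chapter (the event structure is hypothetical; the actual analysis runs within a dedicated framework):

```python
def preselect(event):
    """Return the two leading jets if the event passes the preselection, else None."""
    # jet cleaning: keep jets with pt > 200 GeV inside the barrel region
    jets = [j for j in event["jets"] if j["pt"] > 200 and abs(j["eta"]) < 2.4]
    if len(jets) < 2:                      # require at least two jets
        return None
    jets.sort(key=lambda j: j["pt"], reverse=True)
    j1, j2 = jets[:2]                      # two highest-pt jets reconstruct the q* mass
    if abs(j1["eta"] - j2["eta"]) > 1.3:   # back-to-back topology: delta eta <= 1.3
        return None
    if invariant_mass([j1["p4"], j2["p4"]]) < 1050:  # m_jj >= 1050 GeV for the trigger
        return None
    return j1, j2
```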

\begin{figure}
\begin{minipage}{0.5\textwidth}
\includegraphics{./figures/2016/v1_Eta_invMass_stack.eps}
\end{minipage}
\begin{minipage}{0.5\textwidth}
\includegraphics{./figures/2016/v1_invmass_invMass_stack.eps}
\end{minipage}
\begin{minipage}{0.5\textwidth}
\includegraphics{./figures/combined/v1_Eta_invMass_stack.eps}
\end{minipage}
\begin{minipage}{0.5\textwidth}
\includegraphics{./figures/combined/v1_invmass_invMass_stack.eps}
\end{minipage}
\caption{Invariant mass distribution showing the cut at $m_{jj} \ge \SI{1050}{\giga\eV}$. It shows the expected
smoothly falling function for the background, whereas the signal peaks at the simulated resonance mass.
Left: distribution before the cut. Right: distribution after the cut. 1st row: data from 2016. 2nd row: combined data
from 2016, 2017 and 2018.}
\label{fig:invmass}
\end{figure}

After the preselection, the signal efficiency for q* decaying to qW in 2016 ranges from 48 % at 1.6 TeV to 49 % at
7 TeV. For the decay to qZ, the efficiencies are between 45 % (1.6 TeV) and 50 % (7 TeV). The amount of background
after the preselection is reduced to 5 % of the original events. For the combined data of the three years, those
values look similar. For the decay to qW, signal efficiencies between 49 % (1.6 TeV) and 56 % (7 TeV) are reached,
whereas the efficiencies for the decay to qZ are in the range of 46 % (1.6 TeV) to 50 % (7 TeV). Here, the background
could be reduced to 8 % of the original events. So while keeping around 50 % of the signal, the background was already
reduced to less than a tenth. Still, as can be seen in [@fig:njets] to [@fig:invmass], the amount of signal is very
low.

## Data - Monte Carlo comparison

To ensure high data quality, the simulated QCD background sample is now compared to the actual data of the
corresponding year collected by the CMS detector. This is done for the year 2016 and for the combined data of the
years 2016, 2017 and 2018. The distributions are rescaled so that the integrals over the invariant mass distributions
of data and simulation are the same. In [@fig:data-mc], the distributions of the three variables that were used for
the preselection can be seen for the year 2016 and the combined data of the years 2016 to 2018.
For analysing the real data from CMS, jet energy corrections have to be applied. Those calibrate the ECAL and HCAL
parts of the CMS, so that the energy of the detected particles can be measured correctly. The corrections used were
published by the CMS group. [source needed, but not sure where to find it]

\begin{figure}
\begin{minipage}{0.33\textwidth}
\includegraphics{./figures/2016/DATA/v1_invmass_N_jets.eps}
\end{minipage}
\begin{minipage}{0.33\textwidth}
\includegraphics{./figures/2016/DATA/v1_invmass_deta.eps}
\end{minipage}
\begin{minipage}{0.33\textwidth}
\includegraphics{./figures/2016/DATA/v1_invmass_invMass.eps}
\end{minipage}
\begin{minipage}{0.33\textwidth}
\includegraphics{./figures/combined/DATA/v1_invmass_N_jets.eps}
\end{minipage}
\begin{minipage}{0.33\textwidth}
\includegraphics{./figures/combined/DATA/v1_invmass_deta.eps}
\end{minipage}
\begin{minipage}{0.33\textwidth}
\includegraphics{./figures/combined/DATA/v1_invmass_invMass.eps}
\end{minipage}
\caption{Comparison of data with the Monte Carlo simulation.
1st row: data from 2016.
2nd row: combined data from 2016, 2017 and 2018.}
\label{fig:data-mc}
\end{figure}

The shape of the real data matches the simulation well. The $\Delta\eta$ distribution shows some offset between data
and simulation.

### Sideband

The sideband is introduced to make sure no bias between data and the Monte Carlo simulation is introduced. It is a
region in which no signal event is expected. Again, data and the Monte Carlo simulation are compared. For this
analysis, the region where the softdrop mass of both of the two jets with the highest transverse momentum ($p_t$) is
more than 105 GeV was chosen. 105 GeV is well above the mass of 91 GeV of the Z boson, the heavier of the two vector
bosons. It is therefore very unlikely that jets originating from a W or Z boson end up in this region.
In [@fig:sideband], the comparison of data with simulation in the sideband region can be seen for the softdrop mass
distribution as well as the dijet invariant mass distribution. As in [@fig:data-mc], the histograms are rescaled so
that the dijet invariant mass distributions of data and simulation have the same integral.
It can be seen that in the sideband region data and simulation match very well.

\begin{figure}
\begin{minipage}{0.5\textwidth}
\includegraphics{./figures/2016/sideband/v1_SDM_SoftDropMass_1.eps}
\end{minipage}
\begin{minipage}{0.5\textwidth}
\includegraphics{./figures/2016/sideband/v1_SDM_invMass.eps}
\end{minipage}
\begin{minipage}{0.5\textwidth}
\includegraphics{./figures/combined/sideband/v1_SDM_SoftDropMass_1.eps}
\end{minipage}
\begin{minipage}{0.5\textwidth}
\includegraphics{./figures/combined/sideband/v1_SDM_invMass.eps}
\end{minipage}
\caption{Comparison of data with the Monte Carlo simulation in the sideband region. 1st row: data from 2016. 2nd row:
combined data from 2016, 2017 and 2018.}
\label{fig:sideband}
\end{figure}

\newpage

# Jet substructure selection

So far it was made sure that the actual data and the simulation are in good agreement after the preselection and that
no unwanted side effects are introduced in the data by the cuts used. Now another selection has to be introduced to
further reduce the background, in order to be able to extract the hypothetical signal events from the actual data.

This is done by distinguishing between QCD and signal events using a tagger that identifies jets coming from a vector
boson. Two different taggers are used, to later compare their performance. The decay analysed includes either a W or a
Z boson, which are very heavy compared to the particles in QCD effects. This can be exploited by adding a cut on the
softdrop mass of a jet: the softdrop mass of at least one of the two leading jets is required to be between
$\SI{35}{\giga\eV}$ and $\SI{105}{\giga\eV}$. This cut already provides a good separation of QCD and signal events, on
which the two taggers presented next can build.

Both taggers provide a discriminator value used to decide whether an event originates from the decay of a vector boson
or from QCD effects. The cut on this value is optimised afterwards to make sure the maximum possible efficiency is
achieved.

## N-subjettiness

The N-subjettiness [@TAU21] $\tau_N$ is a jet shape parameter designed to identify boosted hadronically decaying
objects. When a vector boson decays hadronically, it produces two quarks, each causing a jet. But because of the high
mass of the vector bosons, the particles are highly boosted and appear, after applying a clustering algorithm, as just
one jet. This algorithm tries to figure out whether one jet might consist of two subjets by using the kinematics and
positions of the constituent particles of this jet.
The N-subjettiness is defined as

\begin{equation}
\tau_N = \frac{1}{d_0} \sum_k p_{T,k} \cdot \text{min}\{ \Delta R_{1,k}, \Delta R_{2,k}, \ldots, \Delta R_{N,k} \}
\end{equation}

with $k$ running over the constituent particles of a given jet, $p_{T,k}$ being their transverse momenta and
$\Delta R_{J,k} = \sqrt{(\Delta\eta)^2 + (\Delta\phi)^2}$ being the distance between a candidate subjet $J$ and a
constituent particle $k$ in the $\eta$-$\phi$ plane. It quantifies to what degree a jet can be regarded as composed of
$N$ subjets. Studies showed that, rather than using $\tau_N$ directly, the ratio $\tau_{21} = \tau_2/\tau_1$ is a
better discriminator between QCD events and events originating from the decay of a boosted vector boson.
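
A minimal sketch of this definition, with the subjet axes and the normalisation $d_0$ taken as given:

```python
import numpy as np

def tau_n(constituents, axes, d0):
    """N-subjettiness: sum over constituents of pt * min distance to any subjet axis."""
    total = 0.0
    for pt, eta, phi in constituents:
        dphi = np.mod(phi - axes[:, 1] + np.pi, 2 * np.pi) - np.pi
        dr = np.sqrt((eta - axes[:, 0]) ** 2 + dphi**2)  # distances to all N axes
        total += pt * dr.min()                           # keep the closest axis
    return total / d0

def tau21(constituents, axes1, axes2, d0):
    """Ratio tau2/tau1: small values indicate a two-prong (vector boson) jet."""
    return tau_n(constituents, axes2, d0) / tau_n(constituents, axes1, d0)
```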

The lower $\tau_{21}$ is, the more likely a jet is caused by the decay of a vector boson. Therefore a selection is
introduced requiring the $\tau_{21}$ of one candidate jet to be smaller than some value, which is determined by an
optimisation process described in the next chapter. As candidate jet, the one of the two highest-$p_t$ jets passing
the softdrop mass window is used. If both of them pass, the one with the higher $p_t$ is chosen.

## DeepAK8

The DeepAK8 tagger [@DEEP_BOOSTED] uses a deep neural network (DNN) to identify decays originating from a vector
boson. It claims to reduce the background rate by up to a factor of ~10 at the same signal efficiency compared to
non-machine-learning approaches like the N-subjettiness method. This is supported by [@fig:ak8_eff], showing a
comparison of background and signal efficiency of the DeepAK8 tagger with, among others, the $\tau_{21}$ tagger also
used in this analysis.

![Comparison of tagger efficiencies, showing, among others, the DeepAK8 and $\tau_{21}$ taggers used in this analysis.
Taken from [@DEEP_BOOSTED].](./figures/deep_ak8.pdf){#fig:ak8_eff width=80%}

The DNN has two input lists for each jet. The first is a list of up to 100 constituent particles of the jet, sorted by
decreasing $p_t$. A total of 42 properties per particle, such as $p_t$, energy deposit, charge and the angular
distance between the particle and the jet or subjet axes, are included. The second input is a list of up to seven
secondary vertices, each with 15 features, such as kinematics, displacement and quality criteria.
To process those inputs, a customised DNN architecture has been developed. It consists of two convolutional neural
networks (CNNs) that each process one of the input lists. The outputs of the two CNNs are then combined and processed
by a fully connected network to identify the jet. The network was trained on a sample of 40 million jets; another 10
million jets were used for development and validation.
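
A minimal Keras sketch of such a two-branch architecture, purely to illustrate the structure described above; the
layer types, sizes and counts are placeholders, not the actual DeepAK8 configuration:

```python
import tensorflow as tf
from tensorflow.keras import layers

# branch 1: up to 100 particles x 42 features; branch 2: up to 7 vertices x 15 features
particles = tf.keras.Input(shape=(100, 42), name="particles")
vertices = tf.keras.Input(shape=(7, 15), name="secondary_vertices")

def conv_branch(x):
    # 1D convolutions along the particle/vertex list, standing in for the two CNNs
    x = layers.Conv1D(64, 1, activation="relu")(x)
    x = layers.Conv1D(64, 1, activation="relu")(x)
    return layers.GlobalAveragePooling1D()(x)

merged = layers.concatenate([conv_branch(particles), conv_branch(vertices)])
hidden = layers.Dense(256, activation="relu")(merged)   # fully connected part
output = layers.Dense(2, activation="softmax")(hidden)  # e.g. a V-vs-QCD score

model = tf.keras.Model([particles, vertices], output)
```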

In this thesis, the mass-decorrelated version of the DeepAK8 tagger is used. It adds an additional mass predictor
layer that is trained to quantify how strongly the output of the non-decorrelated tagger is correlated with the mass
of a particle. Its output is fed back to the network as a penalty, so the network avoids using features of the
particles that are correlated with their mass. The result is a largely mass-decorrelated tagger of heavy resonances.
As the mass variable is already in use for the softdrop mass selection, this version of the tagger is to be preferred.

The higher the discriminator value of the deep boosted tagger, the more likely the jet is to be caused by the decay of
a vector boson. Therefore, choosing a candidate jet in the same way as for the N-subjettiness tagger, a selection is
applied requiring this candidate jet to have a WvsQCD/ZvsQCD value greater than some value determined by the
optimisation presented next.

## Optimisation {#sec:opt}

To figure out the best value to cut on the discriminators introduced by the two taggers, a measure quantifying how
good a cut is has to be introduced. For that, the significance, calculated as $\frac{S}{\sqrt{B}}$, is used, where S
stands for the number of signal events and B for the number of background events in a given interval. This measure
assumes a Gaussian error on the background, so it is calculated for the 2 TeV mass point, where enough background
events exist to justify this assumption. The assumption follows from the central limit theorem, which states that the
sum of identically distributed random variables converges to a Gaussian distribution. The significance therefore
represents how well the signal can be distinguished from the background, in units of the standard deviation of the
background. As interval, a 10 % margin around the nominal resonance mass is chosen. The significance is then
calculated for different selections on the discriminant of the two taggers and plotted in dependence on the minimum
(deep boosted) or maximum (N-subjettiness) allowed value of the discriminant to pass the selection.

The optimisation process is done using only the data from the year 2018, assuming the taggers have similar
performances on the data of the different years.
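
A minimal sketch of this scan; the discriminator values below are randomly generated stand-ins for the events inside
the $\pm$10 % window around the 2 TeV mass point:

```python
import numpy as np

rng = np.random.default_rng(0)
sig_disc = rng.beta(8, 2, size=1_000)    # hypothetical signal: peaks near 1
bkg_disc = rng.beta(2, 8, size=100_000)  # hypothetical QCD background: peaks near 0

def significance(cut):
    """S/sqrt(B) for events with discriminator >= cut (deep boosted convention)."""
    S = np.count_nonzero(sig_disc >= cut)
    B = np.count_nonzero(bkg_disc >= cut)
    return S / np.sqrt(B) if B > 0 else 0.0

cuts = np.linspace(0.5, 0.99, 50)
best = max(cuts, key=significance)
print(f"best cut: {best:.2f} -> significance {significance(best):.1f}")
```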

\begin{figure}
\begin{minipage}{0.5\textwidth}
\includegraphics{./figures/sig-db.pdf}
\end{minipage}
\begin{minipage}{0.5\textwidth}
\includegraphics{./figures/sig-tau.pdf}
\end{minipage}
\caption{Significance plots for the deep boosted (left) and N-subjettiness (right) tagger at the 2 TeV mass point.}
\label{fig:sig}
\end{figure}

As a result, the $\tau_{21}$ cut is placed at $\le 0.35$, confirming the value chosen by the previous analysis, and
the deep boosted cut is placed at $\ge 0.95$. For the deep boosted tagger, 0.97 would give a slightly higher
significance, but the slightly less strict cut is chosen because 0.97 is very close to the edge where the significance
drops sharply, and the tighter the cut, the less background is left to calculate the cross section limits, especially
at higher resonance masses.
The significance for the $\tau_{21}$ cut is 14, and for the deep boosted tagger 26.

For both taggers, a low purity category is also introduced for the high TeV regions. Using the cuts optimised for
2 TeV, very few background events are left at higher resonance masses, but those are needed to reliably calculate
cross section limits. As low purity category for the N-subjettiness tagger, a cut of $0.35 < \tau_{21} < 0.75$ is
used. For the deep boosted tagger, the complement of the high purity category is used: $VvsQCD < 0.95$.

# Signal extraction {#sec:extr}

With the optimisation done, the optimal selections for the N-subjettiness as well as the deep boosted tagger are found
and applied to the simulated samples as well as the data collected by CMS. The fit described in [@sec:moa] is
performed for all mass points of the decays to qW and qZ and for both datasets used, the one from 2016 and the
combined one of 2016, 2017 and 2018.

To extract the signal from the background, its cross section limit is calculated using a frequentist asymptotic limit
calculator. It performs a shape analysis of the dijet invariant mass spectrum to determine an expected and an observed
limit. If there is no resonance of the q\* particle in the data, the observed limit should lie within the $2\sigma$
band of the expected limit. After that, the crossing of the theory line, representing the cross sections expected if
the q\* particle existed, with the observed limit is calculated, to obtain the mass up to which the existence of the
q\* particle can be excluded. To find the uncertainty of this result, the crossing of the theory line plus,
respectively minus, its uncertainty with the observed limit is also calculated.
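
A minimal sketch of this crossing calculation via linear interpolation (all limit values below are hypothetical):

```python
import numpy as np

def mass_limit(masses, observed, theory):
    """Lowest mass where the observed limit rises above the theory cross section."""
    diff = observed - theory  # negative where the theory is excluded
    for i in range(len(masses) - 1):
        if diff[i] < 0 <= diff[i + 1]:  # sign change between two mass points
            frac = -diff[i] / (diff[i + 1] - diff[i])  # linear interpolation
            return masses[i] + frac * (masses[i + 1] - masses[i])
    return None

masses   = np.array([4.0, 5.0, 6.0, 7.0])      # TeV, hypothetical
observed = np.array([0.8, 0.3, 0.10, 0.05])    # pb, hypothetical observed limits
theory   = np.array([2.0, 0.5, 0.12, 0.03])    # pb, hypothetical theory curve
print(f"mass limit: {mass_limit(masses, observed, theory):.2f} TeV")  # 6.50 TeV
```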

## Uncertainties

For calculating the cross section limits of the signal, four sources of uncertainty are considered.

First, the uncertainty of the jet energy corrections. When measuring a particle's energy with the ECAL or HCAL part of
CMS, the electronic signals sent by the photodetectors in the calorimeters have to be converted to actual energy
values. An error in this calibration therefore causes the measured energy to be shifted to higher or lower values,
which also causes the position of the signal peak in the $m_{jj}$ distribution to vary. The uncertainty is
approximated to be 2 %.

Second, the tagger is not perfect: some events that don't originate from a V boson are wrongly selected and, on the
other hand, sometimes events that do originate from one are not. This influences the events chosen for analysis and is
therefore also considered as an uncertainty, which is approximated to be 6 %.

Third, the uncertainty on the parameters of the background fit is also considered, as it might change the background
shape slightly and therefore influence how many signal and background events are reconstructed from the data.

Fourth, the uncertainty on the luminosity of the LHC of 2.5 % is also taken into account for the final results.

# Results

This chapter starts by presenting the results for the data of the year 2016 using both taggers and comparing them to
the previous analysis [@PREV_RESEARCH]. It then goes on to show the results for the combined dataset, again using both
taggers and comparing their performances.

## 2016

Using the data collected by the CMS experiment in 2016, the cross section limits seen in [@fig:res2016] were obtained.

As described in [@sec:extr], the calculated cross section limits are then used to calculate a mass limit, meaning the
lowest possible mass of the q\* particle, by finding the crossing of the theory line with the observed cross section
limit. In [@fig:res2016] it can be seen that, for the deep boosted tagger, the observed limit in the region where
theory and observed limit cross is very high compared to the N-subjettiness tagger. Therefore the two lines cross
earlier, which results in lower exclusion limits on the mass of the q\* particle, causing the deep boosted tagger to
perform worse than the N-subjettiness tagger in terms of establishing those limits, as can be seen in [@tbl:res2016].
The table also shows the upper and lower limits on the mass, found by calculating the crossing of the theory plus,
respectively minus, its uncertainty. Because the theory line and the observed limit are very flat in the high TeV
region, even a small uncertainty of the theory can cause a large difference in the mass limit.

: Mass limits found using the data collected in 2016 {#tbl:res2016}

| Decay | Tagger       | Limit [TeV] | Upper limit [TeV] | Lower limit [TeV] |
|-------|--------------|-------------|-------------------|-------------------|
| qW    | $\tau_{21}$  | 5.39        | 6.01              | 4.99              |
| qW    | deep boosted | 4.96        | 5.19              | 4.84              |
| qZ    | $\tau_{21}$  | 4.86        | 4.96              | 4.70              |
| qZ    | deep boosted | 4.49        | 4.61              | 4.40              |

\begin{figure}
\begin{minipage}{0.5\textwidth}
\includegraphics{./figures/results/brazilianFlag_QtoqW_2016tau_13TeV.pdf}
\end{minipage}
\begin{minipage}{0.5\textwidth}
\includegraphics{./figures/results/brazilianFlag_QtoqW_2016db_13TeV.pdf}
\end{minipage}
\begin{minipage}{0.5\textwidth}
\includegraphics{./figures/results/brazilianFlag_QtoqZ_2016tau_13TeV.pdf}
\end{minipage}
\begin{minipage}{0.5\textwidth}
\includegraphics{./figures/results/brazilianFlag_QtoqZ_2016db_13TeV.pdf}
\end{minipage}
\caption{Results of the cross section limits for 2016 using the $\tau_{21}$ tagger (left) and the deep boosted tagger
(right).}
\label{fig:res2016}
\end{figure}

### Previous research

The limit established by using the N-subjettiness tagger on the 2016 data is already slightly higher than the one from
the previous analysis, which was found to be 5 TeV for the decay to qW and 4.7 TeV for the decay to qZ. This is mainly
due to the fact that in our data, the observed limit at the intersection point happens to be in the lower region of
the expected limit band, causing a very late crossing with the theory line when using the N-subjettiness tagger (as
can be seen in [@fig:res2016]). This could be caused by small differences in the setup used or slightly differently
processed data. Comparing the expected limits, there is a difference of between 3 % and 30 % between the values
calculated in this thesis and the previous analysis. However, neither of the two results was consistently lower or
higher; the difference rather fluctuates. Therefore it can be said that the results are in good agreement. The cross
section limits of the previous analysis can be seen in [@fig:prev].

\begin{figure}
\begin{minipage}{0.5\textwidth}
\includegraphics{./figures/results/prev_qW.png}
\end{minipage}
\begin{minipage}{0.5\textwidth}
\includegraphics{./figures/results/prev_qZ.png}
\end{minipage}
\caption{Previous results of the cross section limits for q\* decaying to qW (left) and q\* decaying to qZ (right).
Taken from \cite{PREV_RESEARCH}.}
\label{fig:prev}
\end{figure}

## Combined dataset

Using the combined data, the cross section limits seen in [@fig:resCombined] were obtained. Compared to using only the
2016 dataset, the cross section limits are almost cut in half. This shows the big improvement achieved by using more
than three times the amount of data.

The results for the mass limits of the combined years are as follows:

: Mass limits found using the data collected in 2016 - 2018

| Decay | Tagger       | Limit [TeV] | Upper limit [TeV] | Lower limit [TeV] |
|-------|--------------|-------------|-------------------|-------------------|
| qW    | $\tau_{21}$  | 6.00        | 6.26              | 5.74              |
| qW    | deep boosted | 6.11        | 6.31              | 5.39              |
| qZ    | $\tau_{21}$  | 5.49        | 5.76              | 5.29              |
| qZ    | deep boosted | 4.92        | 5.02              | 4.80              |

The combination of the three years not only improved the cross section limits, but also the limit on the mass of the
q\* particle. The final result is 1 TeV higher for the decay to qW and almost 0.8 TeV higher for the decay to qZ than
what was concluded by the previous analysis [@PREV_RESEARCH].

\begin{figure}
\begin{minipage}{0.5\textwidth}
\includegraphics{./figures/results/brazilianFlag_QtoqW_Combinedtau_13TeV.pdf}
\end{minipage}
\begin{minipage}{0.5\textwidth}
\includegraphics{./figures/results/brazilianFlag_QtoqW_Combineddb_13TeV.pdf}
\end{minipage}
\begin{minipage}{0.5\textwidth}
\includegraphics{./figures/results/brazilianFlag_QtoqZ_Combinedtau_13TeV.pdf}
\end{minipage}
\begin{minipage}{0.5\textwidth}
\includegraphics{./figures/results/brazilianFlag_QtoqZ_Combineddb_13TeV.pdf}
\end{minipage}
\caption{Results of the cross section limits for the three combined years using the $\tau_{21}$ tagger (left) and the
deep boosted tagger (right).}
\label{fig:resCombined}
\end{figure}

## Comparison of taggers

The results shown above already indicate that the deep boosted tagger was not able to significantly improve the
results compared to the N-subjettiness tagger.
For further comparison, [@fig:limit_comp] shows the expected limits of the different taggers for the q\* $\rightarrow$
qW and the q\* $\rightarrow$ qZ decay. It can be seen that the deep boosted tagger is at best as good as the
N-subjettiness tagger. This was not the expected result, as the deep neural network was already found to provide a
higher significance in the optimisation done in [@sec:opt], and a higher significance should also result in lower
cross section limits. Apparently, doing the optimisation only on data of the year 2018 was not the best choice. To
make sure there is no mistake in the setup, the expected cross section limits using only the high purity category of
the two taggers with 2018 data are also compared in [@fig:comp_2018]. There, the cross section limits calculated using
the deep boosted tagger are somewhat lower than with the N-subjettiness tagger, showing that the method used for
optimisation was working, but should have been applied to the combined dataset.

Recently, some issues with the training of the deep boosted tagger used in this analysis were also found, which might
explain why it didn't perform much better in general.

\begin{figure}
\begin{minipage}{0.5\textwidth}
\includegraphics{./figures/limit_comp_w.pdf}
\end{minipage}
\begin{minipage}{0.5\textwidth}
\includegraphics{./figures/limit_comp_z.pdf}
\end{minipage}
\caption{Comparison of expected limits of the different taggers using different datasets. Left: decay to qW. Right:
decay to qZ.}
\label{fig:limit_comp}
\end{figure}

{#fig:comp_2018 width=60%}

\clearpage
\newpage

# Summary

In this thesis, a limit on the mass of the q\* particle has been successfully established. By combining the data from
the years 2016, 2017 and 2018, collected by the CMS experiment, the previously set limit could be significantly
improved.

For the data analysis, the following selection was applied:

- number of jets $\ge 2$
- $\Delta\eta \le 1.3$
- $m_{jj} \ge \SI{1050}{\giga\eV}$
- $\SI{35}{\giga\eV} < m_{SDM} < \SI{105}{\giga\eV}$

For the deep boosted tagger, a high purity category of $VvsQCD > 0.95$ and a low purity category of $VvsQCD \le 0.95$
were used. For the N-subjettiness tagger, the high purity category was $\tau_{21} < 0.35$ and the low purity category
$0.35 < \tau_{21} < 0.75$. These values were found by optimising for the highest possible significance of the signal.

After the selection, the cross section limits were extracted from the data and new exclusion limits for the mass of
the q\* particle were set. These are 6.1 TeV for the decay to qW and 5.5 TeV for the decay to qZ. Those limits are
about 1 TeV higher than the ones found in the previous analysis, which were 5.0 TeV and 4.7 TeV, respectively.

Two different taggers were used to compare the results. The newer deep boosted tagger was found not to improve the
results over the older N-subjettiness tagger. This was rather unexpected, but might be caused by some training issues
that were identified recently.

This analysis can also be used to test other theories of the q\* particle that predict its existence at lower masses
than the one used, by overlaying the different theory curves in the plots shown in [@fig:res2016] and
[@fig:resCombined].

The optimisation process used to find the optimal values for the discriminant provided by the taggers was found not to
be optimal. It was only done using 2018 data, with which the deep boosted tagger showed a higher significance than the
N-subjettiness tagger. Apparently, the assumption that the same optimisation would apply to the data of the other
years as well did not hold. Using the combined dataset, the deep boosted tagger showed no better cross section limits,
which are directly related to the significance used for the optimisation, than the N-subjettiness tagger. Therefore,
with a better optimisation and the training issues of the deep boosted tagger fixed, it is very likely that the
results presented could be further improved.

\newpage

\nocite{*}