---
author: David Leppla-Weber
#title: Search for excited quark states decaying to qW/qZ
lang: en-GB
header-includes: |
  \usepackage[onehalfspacing]{setspace}
  \usepackage{siunitx}
  \usepackage{tikz-feynman}
  \usepackage{csquotes}
  \usepackage{abstract}
  \pagenumbering{gobble}
  \setlength{\parskip}{0.4em}
  \bibliographystyle{lucas_unsrt}
documentclass: article
geometry:
  - top=2.5cm
  - left=2.5cm
  - right=2.5cm
  - bottom=2cm
papersize: a4
mainfont: Times New Roman
fontsize: 12pt
toc: false
nocite: |
  @*
figPrefix: "Fig."
tblPrefix: "Table"
secPrefix: "Sec."
eqnPrefix: "Eq."
---
\begin{titlepage}
\begin{center}
\vspace*{1cm}
\Huge
\rule{\textwidth}{0.1cm}
\textbf{Search for excited quark states decaying to qW/qZ with the CMS experiment}
\rule{\textwidth}{0.1cm}

\vspace{2.5cm}
\Large
von\\
\LARGE
David Leppla-Weber\\
\Large
\vspace{0.5cm}
Geboren am\\
18.09.1996

\vfill

Bachelorarbeit im Studiengang Physik\\
Universität Hamburg\\
November 2019

\end{center}
\end{titlepage}

\newpage
\mbox{}
\vfill

1. Gutachter: Dr. Andreas Hinzmann

2. Gutachter: Jun.-Prof. Dr. Gregor Kasieczka

\newpage
\begin{abstract}

A search for an excited quark state, called q*, is presented using data from proton-proton collisions at the LHC recorded by the CMS experiment during the years 2016, 2017 and 2018 at a centre-of-mass energy of $\sqrt{s} = \SI{13}{\tera\eV}$, corresponding to an integrated luminosity of $\SI{137.19}{\per\femto\barn}$. The decay channels to q + W and q + Z are analysed, with the vector bosons decaying hadronically to $q\bar{q}'$ and $q\bar{q}$ respectively, resulting in two jets in the final state. The dijet invariant mass spectrum of these two jets is used to search for a resonance and to reconstruct the q* mass. To identify jets originating from the decay of a vector boson, a V-tagger is needed. For this, the new DeepAK8 tagger, based on a neural network, is compared to the older N-subjettiness tagger. No significant deviation from the Standard Model is observed; the q* is therefore excluded up to a mass of 6.1\ TeV (qW) and 5.5\ TeV (qZ) at a confidence level of 95\%. These limits are about 1\ TeV higher than those of a previous analysis of data with an integrated luminosity of $\SI{35.92}{\per\femto\barn}$ collected by the CMS experiment in 2016, which excluded the q* particle up to masses of 5.0\ TeV and 4.7\ TeV respectively. The DeepAK8 tagger is found to currently perform at the same level as the N-subjettiness tagger, giving a $\SI{0.1}{\tera\eV}$ better limit for the decay to qW but one $\SI{0.5}{\tera\eV}$ worse for the decay to qZ. By optimising the neural network's training for the datasets of 2016, 2017 and 2018, the sensitivity can likely be improved.

\end{abstract}

\newpage
\renewcommand{\abstractname}{Zusammenfassung}
\begin{abstract}

In dieser Arbeit wird eine Suche nach angeregten Quarkzuständen, genannt q*, durchgeführt. Dafür werden Daten von Proton-Proton-Kollisionen am LHC mit einer integrierten Luminosität von $\SI{137.19}{\per\femto\barn}$ analysiert, welche über die Jahre 2016, 2017 und 2018 bei einer Schwerpunktsenergie von $\sqrt{s} = \SI{13}{\tera\eV}$ vom CMS-Experiment aufgenommen wurden. Es wird der Zerfall des q*-Teilchens zu q + W und q + Z untersucht, bei anschließendem hadronischen Zerfall des Vektorbosons zu $q\bar{q}'$ bzw. $q\bar{q}$. Der gesamte Zerfall resultiert damit in zwei Jets, mithilfe deren invarianten Massenspektrums die q*-Masse rekonstruiert und nach einer Resonanz gesucht wird. Zur Identifizierung von Jets, welche durch den Zerfall eines Vektorbosons entstanden sind, wird ein V-Tagger benötigt. Hierfür wird der neue DeepAK8-Tagger, welcher auf einem neuronalen Netzwerk basiert, mit dem älteren N-Subjettiness-Tagger verglichen. Im Ergebnis kann keine signifikante Abweichung vom Standardmodell beobachtet werden. Das q*-Teilchen wird mit einem Konfidenzniveau von 95\% bis zu einer Masse von 6.1\ TeV (qW) bzw. 5.5\ TeV (qZ) ausgeschlossen. Das Limit liegt etwa 1\ TeV höher als das anhand des $\SI{35.92}{\per\femto\barn}$ großen Datensatzes von 2016 gefundene von 5.0\ TeV bzw. 4.7\ TeV. Beim Zerfall zu qW erzielt der DeepAK8-Tagger ein um $\SI{0.1}{\tera\eV}$ besseres Ergebnis als der N-Subjettiness-Tagger, beim Zerfall zu qZ jedoch ein um $\SI{0.5}{\tera\eV}$ schlechteres. Durch Verbesserung des Trainings des neuronalen Netzwerkes für die drei Datensätze von 2016, 2017 und 2018 gibt es aber noch Potential, die Sensitivität zu verbessern.

\end{abstract}

\newpage
\setcounter{tocdepth}{3}
\tableofcontents

\newpage
\pagenumbering{arabic}
# Introduction

The Standard Model is a very successful theory, describing most of the interactions between particles. Still, it has several limitations showing that it is not yet a full "theory of everything". Many theories beyond the Standard Model therefore exist that try to extend it in different ways to address these shortcomings.

One category of such theories is based on composite quark models. Quarks are currently considered elementary particles by the Standard Model. Composite quark models, on the other hand, predict that quarks consist of particles unknown to us so far or can bind to other particles through unknown forces. This could explain the symmetries between particles and reduce the number of constants needed to describe the properties of the known particles. One common prediction of these theories are excited quark states: quark states of higher energy that can decay to an unexcited quark under the emission of a boson. This thesis searches for their decay to a vector boson that then decays hadronically. The final state of this decay consists only of quarks forming two jets, making quantum chromodynamics the main background.

In a previous analysis [@PREV_RESEARCH], an exclusion limit on the mass of an excited quark has already been set using data from the 2016 run of the Large Hadron Collider with an integrated luminosity of $\SI{35.92}{\per\femto\barn}$. Since then, much more data has been collected by the CMS experiment, totalling $\SI{137.19}{\per\femto\barn}$ usable for analysis. This thesis takes advantage of this larger dataset as well as a new technique, based on a deep neural network, to identify decays of highly boosted particles. By using more data and new tagging techniques, it aims to either confirm the existence of the q\* particle or push the previously set lower mass limits of 5 TeV (qW) and 4.7 TeV (qZ) to even higher values. It will also directly compare the performance of this new tagging technique to an older tagger, based on jet substructure studies, that was used in the previous analysis.

In chapter 2, the theoretical background is presented, briefly explaining the Standard Model, its shortcomings and the theory of excited quarks. In chapter 3, the Large Hadron Collider and the Compact Muon Solenoid, the detector that collected the data for this analysis, are described. After that, in chapters 4-7, the main analysis follows, describing how the data were used to extract limits on the mass of the excited quark particle. At the very end, in chapter 8, the results are presented and compared to previous research.

\newpage
# Theoretical motivation

This chapter presents a short summary of the theoretical background relevant to this thesis. It first gives an introduction to the Standard Model itself and some of the issues it raises. It then explains the background processes of quantum chromodynamics and the theory of q*, which are the phenomena relevant for the search described in this thesis.

## Standard Model {#sec:sm}

The Standard Model of particle physics has proved to be very successful in describing three of the four fundamental interactions currently known: the electromagnetic, weak and strong interaction. The fourth, gravity, could not yet be successfully included in this theory.

The Standard Model divides all particles into fermions, which carry half-integer spin, and bosons, which carry integer spin. All known fermions have spin $\frac{1}{2}$, while the bosons have spin 1 (gauge bosons) or spin 0 (scalar bosons). Fermions are further classified into quarks and leptons. Quarks and leptons can also be categorised into three generations, each of which contains two particles, also called flavours. For leptons, the three generations each consist of a charged lepton and its corresponding neutrino, namely the electron, the muon and the tau. The three quark generations consist of the up and down, the charm and strange, and the top and bottom quark. A full list of particles of the Standard Model can be found in [@fig:sm]. Furthermore, all fermions have an associated antiparticle with reversed charge. Bound states of multiple quarks also exist and are called hadrons.

![
Elementary particles of the Standard Model and their mass, charge and spin [@SM].
](./figures/sm_wikipedia.pdf){width=50% #fig:sm}

The gauge bosons, namely the photon, the $W^\pm$ bosons, the $Z^0$ boson and the gluon, are the mediators of the different forces of the Standard Model.

The photon is responsible for the electromagnetic force and therefore interacts with all electrically charged particles. It carries no electromagnetic charge itself and has no mass. Possible interactions are either scattering or absorption. Photons of different energies can also be described as electromagnetic waves of different wavelengths.

The $W^\pm$ and $Z^0$ bosons mediate the weak force. All quarks and leptons carry a flavour, which is conserved in all interactions but the weak one. There, a quark or lepton can change its flavour by interacting with a $W^\pm$ boson. The probabilities of this happening are determined by the Cabibbo-Kobayashi-Maskawa matrix:
\begin{equation}
V_{CKM} =
\begin{pmatrix}
|V_{ud}| & |V_{us}| & |V_{ub}| \\
|V_{cd}| & |V_{cs}| & |V_{cb}| \\
|V_{td}| & |V_{ts}| & |V_{tb}|
\end{pmatrix}
=
\begin{pmatrix}
0.974 & 0.225 & 0.004 \\
0.224 & 0.974 & 0.042 \\
0.008 & 0.041 & 0.999
\end{pmatrix}
\end{equation}
The probability of a quark changing its flavour from $i$ to $j$ is given by the square of the absolute value of the matrix element $V_{ij}$. It is easy to see that a flavour change within the same generation is far more likely than any other flavour change.
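As a small numerical sketch (plain Python, using the approximate matrix values quoted above), the flavour-transition probabilities $|V_{ij}|^2$ can be computed directly:

```python
# Magnitudes of the CKM matrix elements as quoted in the text above.
V_CKM = {
    ("u", "d"): 0.974, ("u", "s"): 0.225, ("u", "b"): 0.004,
    ("c", "d"): 0.224, ("c", "s"): 0.974, ("c", "b"): 0.042,
    ("t", "d"): 0.008, ("t", "s"): 0.041, ("t", "b"): 0.999,
}

def transition_probability(i: str, j: str) -> float:
    """Probability of a flavour change i -> j in a charged-current
    interaction, given by |V_ij|^2."""
    return V_CKM[(i, j)] ** 2

# Same-generation transitions dominate, e.g. u -> d versus u -> s:
assert transition_probability("u", "d") > transition_probability("u", "s")
```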
Due to their high masses of 80.39 GeV and 91.19 GeV respectively, the $W^\pm$ and $Z^0$ bosons decay very quickly, either in the leptonic or in the hadronic decay channel. In the leptonic channel, the $W^\pm$ decays to a charged lepton and the corresponding (anti-)neutrino; in the hadronic channel it decays to a quark and an antiquark of a different flavour. Because the $Z^0$ boson carries no charge, it always decays to a fermion and its antiparticle: in the leptonic channel this might for example be an electron-positron pair, in the hadronic channel an up and anti-up quark pair. This thesis examines the hadronic decay channel, where both vector bosons decay to two quarks.
Quantum chromodynamics (QCD) describes the strong interaction of particles. It applies to all particles carrying colour charge (e.g. quarks). The force is mediated by gluons. These bosons carry colour as well; they do not carry a single colour but a combination of a colour and an anticolour, can therefore interact with themselves, and exist in eight different variants. As a result, processes where a gluon splits into two gluons are possible. Furthermore, the strength of the strong force binding colour-carrying particles increases with their distance, making it at a certain point energetically more efficient to form a new quark-antiquark pair than to separate the two particles even further. This effect is known as colour confinement. Due to this effect, colour-carrying particles cannot be observed directly, but rather form so-called jets that cause hadronic showers in the detector. Those jets are cone-like structures made of hadrons and other particles. The effect is called hadronisation [@HADRONIZATION].
### Shortcomings of the Standard Model

While being very successful in describing the effects observed in particle colliders or the particles reaching Earth from cosmological sources, the Standard Model still has several shortcomings.

- **Gravity**: as already noted, the Standard Model does not include gravity as a force.
- **Dark matter**: observations of the rotational velocity of galaxies cannot be explained by the known matter alone. Dark matter is currently the most popular theory to explain them.
- **Matter-antimatter asymmetry**: the amount of matter vastly outweighs the amount of antimatter in the observable universe. This cannot be explained by the Standard Model, which predicts similar amounts of matter and antimatter.
- **Symmetries between particles**: why do exactly three generations of fermions exist? Why is the charge of a quark exactly one third of the charge of a lepton? How are the masses of the particles related? These and more questions cannot be answered by the Standard Model.
- **Hierarchy problem**: the weak force is approximately $10^{24}$ times stronger than gravity and so far there is no satisfactory explanation as to why that is.
## Excited quark states {#sec:qs}

One category of theories that try to explain the symmetries between particles of the Standard Model are the composite quark models. These state that quarks consist of particles unknown so far, which could explain the symmetries between the different fermions. A common prediction of these models are excited quark states (q\*, q\*\*, q\*\*\*, ...) [@QSTAR_THEORY]. Similar to atoms, which can be excited by the absorption of a photon and then decay again under emission of a photon with an energy corresponding to the excited state, these excited quark states could decay under the emission of any boson. Quarks are measured to be smaller than $10^{-18}$ m, which corresponds to an energy scale of approximately 1 TeV. The excited quark states are therefore expected to lie in that energy region, causing the emitted boson to be highly boosted.
\begin{figure}
\centering
\feynmandiagram [large, horizontal=qs to v] {
a -- qs -- b,
qs -- [fermion, edge label=\(q*\)] v,
q1 [particle=\(q\)] -- v -- w [particle=\(W\)],
q2 [particle=\(q\)] -- w -- q3 [particle=\(q\)],
};
\caption{Feynman diagram showing the decay of a q* particle to a W boson and a quark, with the W boson decaying hadronically.} \label{fig:qsfeynman}
\end{figure}
This thesis will search the data collected by the CMS experiment in the years 2016, 2017 and 2018 for the decay of a single excited quark state q\* to a quark and a vector boson. An example of a q\* decaying to a quark and a W boson can be seen in [@fig:qsfeynman]. As explained in [@sec:sm], the vector boson can then decay either in the hadronic or in the leptonic channel. This analysis investigates only the hadronic channel with two quarks in the final state. Because the boson is highly boosted, its decay products will be very close together and therefore appear to the detector as only one jet. This means that the investigated decay of a q\* particle will have two jets in the final state and will therefore be hard to distinguish from the QCD background described in [@sec:qcdbg].

The choice of only examining the decay of the q\* particle to the vector bosons is motivated by the branching ratios calculated for the decay [@QSTAR_THEORY]:
: Branching ratios of the decaying q\* particle.

| decay mode                 | br. ratio [%] | decay mode                 | br. ratio [%] |
|----------------------------|---------------|----------------------------|---------------|
| $U^* \rightarrow ug$       | 83.4          | $D^* \rightarrow dg$       | 83.4          |
| $U^* \rightarrow dW$       | 10.9          | $D^* \rightarrow uW$       | 10.9          |
| $U^* \rightarrow u\gamma$  | 2.2           | $D^* \rightarrow d\gamma$  | 0.5           |
| $U^* \rightarrow uZ$       | 3.5           | $D^* \rightarrow dZ$       | 5.1           |
The decays to the vector bosons have the second highest branching ratios. The decay to a gluon and a quark is the dominant one, but virtually impossible to distinguish from the QCD background described in the next section. This makes the decay to the vector bosons the most promising choice.

To reconstruct the mass of the q\* particle from an event successfully recognised as the decay of such a particle, the dijet invariant mass has to be calculated. This can be achieved by adding together the four-momenta of the two jets in the final state, vectors consisting of the energy and momentum of a particle. From the four-momentum, the mass is easily derived by solving $E = \sqrt{p^2 + m^2}$ for $m$.
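As a minimal sketch of this reconstruction (plain Python, with hypothetical jet four-momenta in GeV), the dijet invariant mass follows from summing the two four-momenta and applying $m = \sqrt{E^2 - |\vec{p}|^2}$:

```python
import math

def invariant_mass(jets):
    """Invariant mass of a system of four-momenta (E, px, py, pz):
    sum the components, then solve E = sqrt(p^2 + m^2) for m."""
    E  = sum(j[0] for j in jets)
    px = sum(j[1] for j in jets)
    py = sum(j[2] for j in jets)
    pz = sum(j[3] for j in jets)
    # max(..., 0.0) guards against tiny negative values from rounding
    return math.sqrt(max(E ** 2 - px ** 2 - py ** 2 - pz ** 2, 0.0))

# Two hypothetical back-to-back 2.5 TeV jets (massless approximation):
jet1 = (2500.0,  2500.0, 0.0, 0.0)
jet2 = (2500.0, -2500.0, 0.0, 0.0)
print(invariant_mass([jet1, jet2]))  # -> 5000.0 (GeV)
```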
A search for the excited quark predicted by this theory has already been performed in [@PREV_RESEARCH], analysing data with an integrated luminosity of $\SI{35.92}{\per\femto\barn}$ recorded by the CMS experiment in 2016. Studying the hadronic decay of the vector boson, it excluded the q\* particle up to a mass of 5 TeV (qW) and 4.7 TeV (qZ). This thesis aims to either exclude the particle up to higher masses or find a resonance proving its existence, using the larger amount of data available now.
### Quantum Chromodynamics background {#sec:qcdbg}

In this thesis, a decay with two jets in the final state is analysed. It is therefore hard to distinguish the signal processes from QCD effects, which can also produce two jets in the final state, as can be seen in [@fig:qcdfeynman]. Such processes occur very frequently in proton-proton collisions like those happening in the Large Hadron Collider. This is caused by the structure of the proton: it consists not only of three quarks, called valence quarks, but also of many quark-antiquark pairs connected by gluons, called sea quarks, which exist due to the self-interaction of the gluons binding the three valence quarks. The QCD multijet background is therefore the dominant background for the signal described in [@sec:qs].
\begin{figure}
\centering
\feynmandiagram [horizontal=v1 to v2] {
q1 [particle=\(q\)] -- [fermion] v1 -- [gluon] g1 [particle=\(g\)],
v1 -- [gluon] v2,
q2 [particle=\(q\)] -- [fermion] v2 -- [gluon] g2 [particle=\(g\)],
};
\feynmandiagram [horizontal=v1 to v2] {
g1 [particle=\(g\)] -- [gluon] v1 -- [gluon] g2 [particle=\(g\)],
v1 -- [gluon] v2,
g3 [particle=\(g\)] -- [gluon] v2 -- [gluon] g4 [particle=\(g\)],
};
\caption{Two examples of QCD processes resulting in two jets.} \label{fig:qcdfeynman}
\end{figure}

\newpage
# Experimental Setup

In the following, the experimental setup used to gather the data analysed in this thesis is described.

## Large Hadron Collider

The Large Hadron Collider (LHC) [@LHC_MACHINE] is the world's largest and most powerful particle accelerator. It has a circumference of 27 km and can accelerate two beams of protons to an energy of 6.5 TeV each, resulting in collisions with a centre-of-mass energy of 13 TeV. It is home to several experiments, among others the Compact Muon Solenoid (CMS), which recorded the data used for the search presented in this thesis. The CMS is a general-purpose detector built to investigate the particles that form during particle collisions. The LHC can also collide ions, but this capability is of no interest for this analysis.
Because the two colliding beams consist of particles of the same charge, it is not possible to use the same magnetic field for both beams. Therefore, opposite magnetic-dipole fields exist in the two rings to accelerate the beams in opposite directions.

Particle colliders are characterised by their luminosity $L$, from which the number of events per second generated in collisions can be calculated as $\dot{N}_{event} = L\sigma_{event}$, with $\sigma_{event}$ being the cross section of the event. The LHC aims for a peak luminosity of $10^{34}\si{\per\square\centi\metre\per\s}$. This is achieved by colliding two bunches of protons every $\SI{25}{ns}$, with each proton beam consisting of 2808 bunches. Furthermore, the integrated luminosity, defined as $\int L\,dt$, can be used to describe the amount of data collected over a specific time interval.
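These relations can be sketched numerically. The cross section and running time below are hypothetical example values, not taken from the analysis:

```python
# Event rate from instantaneous luminosity: N_dot = L * sigma.
L = 1e34            # cm^-2 s^-1, the LHC design peak luminosity quoted above
sigma = 1e-36       # cm^2, a hypothetical 1 pb cross section (1 pb = 1e-36 cm^2)
rate = L * sigma    # events per second, about 0.01 here

# Integrated luminosity over a time interval gives an expected event count:
seconds = 3600 * 24 * 30      # one month of continuous running, for illustration
int_lumi = L * seconds        # cm^-2 (1 fb^-1 = 1e39 cm^-2)
n_events = int_lumi * sigma
print(rate, n_events)
```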
## Compact Muon Solenoid

The data used in this thesis were recorded by the Compact Muon Solenoid (CMS) [@CMS_REPORT], one of the four main experiments at the Large Hadron Collider. It can detect all elementary particles of the Standard Model except neutrinos. For that, it has an onion-like setup, as can be seen in [@fig:cms_setup]. The particles produced in a collision first pass through a tracking system, followed by an electromagnetic and a hadronic calorimeter. This part is surrounded by a superconducting solenoid that generates a magnetic field of 3.8 T. Outside of the solenoid are large muon chambers.

In 2016 the CMS recorded data corresponding to an integrated luminosity of $\SI{37.80}{\per\femto\barn}$; in 2017 it collected $\SI{44.98}{\per\femto\barn}$ and in 2018 $\SI{63.67}{\per\femto\barn}$ [@CMS_LUMI]. The amount of data usable for analysis is $\SI{35.92}{\per\femto\barn}$, $\SI{41.53}{\per\femto\barn}$ and $\SI{59.74}{\per\femto\barn}$ for the years 2016, 2017 and 2018, totalling $\SI{137.19}{\per\femto\barn}$.

![
The setup of the Compact Muon Solenoid showing its onion-like structure, the different detector parts and where different particles are detected [@CMS_PLOT].
](./figures/cms_setup.png){#fig:cms_setup}
### Coordinate conventions

By convention, the z axis points along the beam axis in the direction of the magnetic field of the solenoid, the y axis upwards and the x axis horizontally towards the LHC centre. The azimuthal angle $\phi$, which describes the angle in the x-y plane, the polar angle $\theta$, which is measured from the z axis, and the pseudorapidity $\eta$, defined as $\eta = -\ln\left(\tan\frac{\theta}{2}\right)$, are also introduced. The coordinates are visualised in [@fig:cmscoords]. Furthermore, to describe a particle's momentum, often the transverse momentum $p_t$ is used, the component of the momentum transverse to the beam axis. Before the collision, the transverse momentum is zero; therefore, due to momentum conservation, the sum of all transverse momenta after the collision has to be zero, too. If this is not the case for the detected events, it implies particles that were not detected, such as neutrinos.

![Coordinate conventions of the CMS illustrating the use of $\eta$ and $\phi$. The z axis is in beam direction [@COORD_PLOT].
](./figures/cms_coordinates.png){#fig:cmscoords width=60%}
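These coordinate definitions can be sketched as follows (plain Python, with a hypothetical momentum vector in GeV):

```python
import math

def kinematics(px, py, pz):
    """Transverse momentum, azimuthal angle and pseudorapidity
    eta = -ln(tan(theta/2)) of a momentum vector."""
    pt = math.hypot(px, py)      # component transverse to the beam (z) axis
    phi = math.atan2(py, px)     # angle in the x-y plane
    theta = math.atan2(pt, pz)   # polar angle, measured from the z axis
    eta = -math.log(math.tan(theta / 2))
    return pt, phi, eta

# A particle emitted perpendicular to the beam has theta = 90 degrees
# and therefore eta = 0:
pt, phi, eta = kinematics(100.0, 0.0, 0.0)
assert abs(eta) < 1e-12 and pt == 100.0
```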
### The tracking system

The tracking system is built of two parts: closest to the collision is a pixel detector, surrounded by silicon strip sensors. They measure the charge sign, direction and momentum of charged particles so that their tracks can later be reconstructed. They are placed as close to the collision as possible to be able to identify secondary vertices.
### The electromagnetic calorimeter

The electromagnetic calorimeter (ECAL) measures the energy of photons and electrons. It is made of lead tungstate crystals and photodetectors. When passed by particles, the crystals produce scintillation light in proportion to the particle's energy. This light is measured by the photodetectors, which convert it into an electrical signal. To measure a particle's energy, it has to deposit all of its energy in the ECAL, which is true for photons and electrons, but not for other particles such as hadrons and muons. Those interact with matter differently and therefore only deposit some energy in the ECAL but are not stopped by it.
### The hadronic calorimeter

The hadronic calorimeter (HCAL) is used to detect high-energy hadrons. It surrounds the ECAL and is made of alternating layers of active and absorber material. While the absorber material with its high density causes the hadrons to shower, the active material detects those showers and measures their energy, similar to how the ECAL works.
### The solenoid

The solenoid, giving the detector its name, is one of its most important features. It creates a magnetic field of 3.8 T and therefore makes it possible to measure the momentum of charged particles by bending their tracks.
### The muon system

Outside of the solenoid, but still within its return yoke, lies the muon system. It consists of three types of gas detectors: drift tubes, cathode strip chambers and resistive plate chambers. It covers the region $|\eta| < 2.4$. Muons are the only detected particles that can pass all the other systems without a significant energy loss.
### The trigger system

The CMS features a two-level trigger system. It is necessary because the detector is unable to process all events due to limited bandwidth. The Level-1 trigger reduces the event rate from 40 MHz to 100 kHz; the software-based High-Level Trigger then further reduces the rate to 1 kHz. The Level-1 trigger uses the data from the electromagnetic and hadronic calorimeters as well as the muon chambers to decide whether to keep an event. The High-Level Trigger uses a streamlined version of the CMS offline reconstruction software for its decision making.
### The Particle Flow algorithm

The Particle Flow algorithm [@PARTICLE_FLOW] is used to identify and reconstruct all the particles arising from the proton-proton collision by using all the information available from the different sub-detectors. It does so by extrapolating the tracks through the different calorimeters and associating the clusters they cross with them. Clusters already associated with a track are then no longer used for the reconstruction of other particles. This is done first for muons and then for charged hadrons, so a muon cannot give rise to a wrongly identified charged hadron. Due to bremsstrahlung photon emission, electrons are harder to reconstruct; for them a specific track reconstruction algorithm is used [@ERECO]. After identifying charged hadrons, muons and electrons, all remaining clusters within the HCAL correspond to neutral hadrons and within the ECAL to photons. Once the list of particles and their corresponding deposits is established, it can be used to determine the particles' four-momenta. From that, the missing transverse energy can be calculated and tau particles can be reconstructed from their decay products.
## Jet clustering

Because of hadronisation it is not possible to uniquely identify the originating particle of a jet. Nonetheless, several algorithms exist to help with this problem. The algorithm used in this thesis is the anti-$k_t$ clustering algorithm [@ANTIKT]. It arises from a generalisation of several other clustering algorithms, namely the $k_t$, Cambridge/Aachen and SISCone clustering algorithms.
The anti-$k_t$ clustering algorithm associates high-$p_t$ particles with the lower-$p_t$ particles surrounding them within a radius R in the $\eta$-$\phi$ plane, forming cone-like jets. If two jets overlap, their shapes are changed according to their hardness in terms of transverse momentum: a softer particle's jet will change its shape more than a harder particle's. A visual comparison of four different clustering algorithms can be seen in [@fig:antiktcomparison]. It shows that the jets reconstructed using the anti-$k_t$ algorithm have the clearest cone-like shape, which is why it is chosen for this thesis. For this analysis, a radius of 0.8 is used.
![
Comparison of the $k_t$, Cambridge/Aachen, SISCone and anti-$k_t$ algorithms clustering a sample parton-level event with many random soft "ghosts" [@ANTIKT].
](./figures/antikt-comparision.png){#fig:antiktcomparison}
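The clustering behaviour described above follows from the anti-$k_t$ distance measures. The sketch below (plain Python, hypothetical particle values) shows only the distance definitions, not the full iterative clustering:

```python
import math

def delta_r2(a, b):
    """Squared distance in the eta-phi plane."""
    deta = a["eta"] - b["eta"]
    # wrap the phi difference into (-pi, pi]
    dphi = (a["phi"] - b["phi"] + math.pi) % (2 * math.pi) - math.pi
    return deta ** 2 + dphi ** 2

def d_ij(a, b, R=0.8):
    """anti-kt pairwise distance: min(1/pt_i^2, 1/pt_j^2) * dR^2 / R^2."""
    return min(a["pt"] ** -2, b["pt"] ** -2) * delta_r2(a, b) / R ** 2

def d_iB(a):
    """anti-kt particle-beam distance: 1/pt^2."""
    return a["pt"] ** -2

# A soft particle near a hard one gives the smallest distance of all,
# so the pair is merged first: hard particles collect the soft particles
# around them, producing the cone-like jets seen in the figure.
hard = {"pt": 500.0, "eta": 0.0, "phi": 0.0}
soft = {"pt": 5.0, "eta": 0.3, "phi": 0.0}
assert d_ij(hard, soft) < min(d_iB(hard), d_iB(soft))
```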
Furthermore, to approximate the mass of a heavy particle that caused a jet, the soft-drop mass [@SDM] can be used. In its calculation, wide-angle soft particles are removed from the jet to reduce the contamination from initial-state radiation, the underlying event and multiple hadron scattering. It is therefore more accurate in determining the mass of the particle causing a jet than taking the combined mass of all constituent particles of the jet.
\newpage

# Method of analysis {#sec:moa}
This section gives an overview of how the data collected by CMS are analysed to either exclude the q\* particle up to even higher masses than previously done or confirm its existence.

As described in [@sec:qs], the decay of the q\* particle to a quark and a vector boson, with the vector boson then decaying hadronically, is investigated. This is the second most probable decay of the q\* particle and easier to analyse than the dominant decay to a quark and a gluon, making it a good choice for this search. It results in two jets, because the decay products of the heavy vector boson are highly boosted, causing them to be very close together and therefore to be reconstructed as one jet. The dijet invariant mass of the two jets in the final state is then used to reconstruct the mass of the q\* particle. The only background considered is the QCD multijet background described in [@sec:qcdbg]. A selection using different kinematic variables as well as a tagger identifying jets from the decay of a vector boson is introduced to reduce the background and increase the sensitivity for the signal. After that, the dijet invariant mass distribution is searched for a peak at the resonance mass of the q\* particle.
The data studied were collected by the CMS experiment in the years 2016, 2017 and 2018. They are analysed with the Particle Flow algorithm to reconstruct jets and all the other particles forming during the collision. The jets are then clustered using the anti-$k_t$ algorithm with a distance parameter R of 0.8.
The analysis is conducted in two steps. First, only the data collected by the CMS experiment in 2016 with an integrated luminosity of $\SI{35.92}{\per\femto\barn}$ are used to compare the results to the previous analysis [@PREV_RESEARCH]. Then the combined data from 2016, 2017 and 2018 with an integrated luminosity of $\SI{137.19}{\per\femto\barn}$ are used to improve the previously set limits on the mass of the q\* particle. Additionally, two different V-tagging methods are compared: one based on the N-subjettiness variable used in the previous analysis [@PREV_RESEARCH], the other a novel approach using a deep neural network, which will be explained in the following.
## Signal and background modelling

Before looking at the data collected by the CMS experiment, Monte Carlo simulations [@MONTECARLO] of background and signal are used to understand how the data are expected to look. To replicate the QCD background processes, the particle interactions taking place in a proton-proton collision are simulated using the probabilities provided by the Standard Model, obtained by calculating the cross sections of the corresponding Feynman diagrams. This was done using MadGraph [@MADGRAPH] and Pythia 8 [@PYTHIA8]. Detector effects (such as the limited resolution) are then applied so that the simulated events resemble real data coming from the CMS detector.

The q\* signal samples are simulated according to the probabilities given by the q\* theory [@QSTAR_THEORY], assuming a cross section of $\SI{1}{\pico\barn}$. The simulation was done using MadGraph [@MADGRAPH] for eleven mass points between 1.6 TeV and 7 TeV. Because of the expected high mass, the signal width is dominated by the resolution of the detector, not by the natural resonance width.

The dijet invariant mass distribution of the QCD background is expected to fall smoothly towards higher masses. It is therefore fitted using the following smoothly falling function with three parameters $p_0$, $p_1$, $p_2$:
\begin{equation}
\frac{dN}{dm_{jj}} = \frac{p_0 \cdot ( 1 - m_{jj} / \sqrt{s} )^{p_2}}{ (m_{jj} / \sqrt{s})^{p_1}}
\end{equation}
where $m_{jj}$ is the invariant mass of the dijet and $p_0$ is a normalisation parameter. It is the same function as used in the previous research studying 2016 data only, but it was also found to reliably reproduce the background shape of the other years.

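As a rough illustration of this parameterisation, the sketch below generates a toy spectrum from the function itself and refits it with `scipy.optimize.curve_fit`; the parameter values, binning and smearing are hypothetical and not taken from the analysis:

```python
import numpy as np
from scipy.optimize import curve_fit

SQRT_S = 13000.0  # centre-of-mass energy in GeV

def dijet_background(mjj, p0, p1, p2):
    """Smoothly falling background: dN/dm_jj = p0 * (1 - x)^p2 / x^p1, x = m_jj / sqrt(s)."""
    x = mjj / SQRT_S
    return p0 * (1.0 - x) ** p2 / x ** p1

# Toy spectrum: the function itself with 2 % Gaussian smearing (illustrative parameters)
rng = np.random.default_rng(7)
mjj = np.linspace(1100.0, 6000.0, 50)
truth = dijet_background(mjj, 1e-3, 6.0, 10.0)
toy = truth * rng.normal(1.0, 0.02, size=mjj.size)

popt, _ = curve_fit(dijet_background, mjj, toy, p0=[1e-3, 5.0, 9.0],
                    sigma=0.02 * truth, absolute_sigma=True)
# The refitted shape reproduces the generating function to within a few per cent
```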
The signal is fitted using a double-sided crystal ball function. It has six parameters:

- mean: the function's mean, in this case the resonance mass
- sigma: the function's width, in this case the resolution of the detector, since the natural resonance width is expected to be very small
- n1, n2, alpha1, alpha2: parameters influencing the shape of the left and right tail

A Gaussian and a Poisson function have also been studied but were found unable to reproduce the signal shape, as they could not model the tails on both sides of the peak.

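For reference, a common parameterisation of the double-sided crystal ball is sketched below, with $t = (m_{jj} - \text{mean})/\text{sigma}$; the exact convention used by the fit software may differ:

\begin{equation}
f(t) \propto
\begin{cases}
A_1 \, (B_1 - t)^{-n_1} & t \le -\alpha_1 \\
e^{-t^2/2} & -\alpha_1 < t < \alpha_2 \\
A_2 \, (B_2 + t)^{-n_2} & t \ge \alpha_2
\end{cases}
\qquad
A_i = \left(\frac{n_i}{\alpha_i}\right)^{n_i} e^{-\alpha_i^2/2}, \quad
B_i = \frac{n_i}{\alpha_i} - \alpha_i
\end{equation}

The constants $A_i$ and $B_i$ are fixed such that the Gaussian core and the power-law tails match continuously at $t = -\alpha_1$ and $t = \alpha_2$.
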
A linear combination of the signal and background function is then fitted to a toy dataset with Gaussian errors, obtained by adding simulated background and signal. The resulting coefficients of this combination give the expected signal rate for the simulated signal cross section of $\SI{1}{\pico\barn}$. An example of such a fit can be seen in [@fig:cb_fit]. In this figure, a binning of 200 GeV is used for presentational purposes; the analysis itself is conducted using a 1 GeV binning. The fit works very well and therefore confirms the functions chosen to model signal and background. This is supported by a $\chi^2 /$ ndof of 0.5 and a fitted signal mean of 2999 $\pm$ 23 $\si{\giga\eV}$, in very good agreement with the expected mean of 3000 GeV. These numbers show that the method is able to successfully describe the simulated toy data.

{#fig:cb_fit}

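The linear-combination fit can be sketched as follows; for brevity, a Gaussian stands in for the double-sided crystal ball, and all parameter values are illustrative assumptions rather than the values used in the analysis:

```python
import numpy as np
from scipy.optimize import curve_fit

SQRT_S = 13000.0

def background(m, p0, p1, p2):
    """Smoothly falling QCD background shape."""
    x = m / SQRT_S
    return p0 * (1.0 - x) ** p2 / x ** p1

def signal(m, mean=3000.0, sigma=100.0):
    """Gaussian stand-in for the double-sided crystal ball signal shape."""
    return np.exp(-0.5 * ((m - mean) / sigma) ** 2)

def model(m, p0, p1, p2, n_sig):
    """Linear combination: background plus n_sig times the signal template."""
    return background(m, p0, p1, p2) + n_sig * signal(m)

m = np.linspace(1100.0, 6000.0, 200)
toy = model(m, 1e-3, 6.0, 10.0, 50.0)  # noiseless toy data with a known signal rate
popt, _ = curve_fit(model, m, toy, p0=[1e-3, 5.0, 9.0, 10.0])
# The fitted coefficient n_sig recovers the injected signal rate
```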
\newpage

# Preselection and data quality

To reduce the background and increase the signal sensitivity, a selection of events that satisfy certain requirements is introduced, taking into account different variables. The selection is divided into two stages. The first one (the preselection) introduces some general, physics-motivated selections using kinematic variables and is also used to ensure a high trigger efficiency. In the second stage, the discriminants provided by different taggers are used to identify jets originating from the decay of a vector boson. After the preselection, it is verified that the simulated samples represent the real data well by comparing data with simulation in the signal region as well as in a sideband region, where no signal events are expected.

## Preselection

First, all events are cleaned of jets with $p_t < \SI{200}{\giga\eV}$ or with pseudorapidity $|\eta| > 2.4$. This discards soft background and ensures the particles are in the barrel region of the detector for optimal track reconstruction. Furthermore, all events in which one of the two highest-$p_t$ jets has an angular separation smaller than 0.8 from any electron or muon are discarded, to allow future use of the data in studies investigating the leptonic decay channel of the vector boson.

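This cleaning step can be sketched as follows; jets are represented as hypothetical `(pt, eta)` pairs:

```python
PT_MIN = 200.0   # GeV, minimum jet transverse momentum
ETA_MAX = 2.4    # barrel acceptance in pseudorapidity

def clean_jets(jets):
    """Remove jets that are too soft or outside the barrel region."""
    return [(pt, eta) for pt, eta in jets if pt >= PT_MIN and abs(eta) <= ETA_MAX]

jets = [(350.0, 0.5), (150.0, 0.2), (400.0, -3.0), (250.0, 1.1)]
print(clean_jets(jets))  # → [(350.0, 0.5), (250.0, 1.1)]
```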
From a decaying q\* particle, two jets are expected in the final state, and their dijet invariant mass is used to reconstruct the mass of the q\* particle. A requirement of at least two jets is therefore added, accounting for the possibility of additional jets, for example from gluon radiation off a quark or other QCD effects. In that case, the two jets with the highest $p_t$ are used for the reconstruction of the q\* mass. The distributions of the number of jets before and after this selection can be seen in [@fig:njets]. The light blue filled histogram shows the QCD background; the green and red lines show the expected signal for a decay of the q\* particle to qW with a mass of 2 TeV (green) and 5 TeV (red). Comparing the left to the right distributions, it is clear that the requirement of at least two jets reduces the background significantly while keeping nearly all signal events.

\begin{figure}
\begin{minipage}{\textwidth}
\centering\textbf{Comparison for 2016}
\end{minipage}
\begin{minipage}{0.5\textwidth}
\includegraphics{./figures/2016/v1_Cleaner_N_jets_stack.eps}
\end{minipage}
\begin{minipage}{0.5\textwidth}
\includegraphics{./figures/2016/v1_Njet_N_jets_stack.eps}
\end{minipage}
\begin{minipage}{\textwidth}
\vspace{0.1cm}
\centering\textbf{Comparison for the combined dataset}
\end{minipage}
\begin{minipage}{0.5\textwidth}
\includegraphics{./figures/combined/v1_Cleaner_N_jets_stack.eps}
\end{minipage}
\begin{minipage}{0.5\textwidth}
\includegraphics{./figures/combined/v1_Njet_N_jets_stack.eps}
\end{minipage}
\caption{Comparison of the number-of-jets distribution before and after the cut at number of jets $\ge$ 2. \newline
Left: distribution before the cut. Right: distribution after the cut. \newline
The signal curves are scaled up by a factor of 10,000 to be visible.}
\label{fig:njets}
\end{figure}

The next selection uses $\Delta\eta = |\eta_1 - \eta_2|$, with $\eta_1$ and $\eta_2$ being the $\eta$ of the two jets with the highest transverse momentum. The q\* particle is expected to be very heavy relative to the centre-of-mass energy of the collision and is therefore produced almost at rest. Its decay products should thus be close to back-to-back, which means the $\Delta\eta$ distribution is expected to peak at zero. At the same time, particles originating from QCD effects are expected to have a larger $\Delta\eta$. To maintain comparability, the same selection as in the previous research, $\Delta\eta \le 1.3$, is used. In the top two distributions of [@fig:deta], this cut is marked by a vertical black line. The difference in the $m_{jj}$ distribution shows the strong reduction of the background by this cut.

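The cut itself is straightforward; a minimal sketch on hypothetical jet pseudorapidities:

```python
DETA_MAX = 1.3  # same selection as in the previous research

def passes_deta(eta1, eta2):
    """Accept the event if the two leading jets are close in pseudorapidity."""
    return abs(eta1 - eta2) <= DETA_MAX

print(passes_deta(0.4, -0.5))  # → True  (back-to-back-like topology)
print(passes_deta(1.2, -1.0))  # → False (QCD-like, large separation)
```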
\begin{figure}
\begin{minipage}{\textwidth}
\centering\textbf{$\Delta\eta$ cut with signal scaled up by 10,000}
\end{minipage}
\begin{minipage}{0.5\textwidth}
\includegraphics{./figures/2016/v1_Njet_deta_stack.eps}
\end{minipage}
\begin{minipage}{0.5\textwidth}
\includegraphics{./figures/combined/v1_Njet_deta_stack.eps}
\end{minipage}
\begin{minipage}{\textwidth}
\vspace{0.1cm}
\centering\textbf{$m_{jj}$ distribution before the cut}
\end{minipage}
\begin{minipage}{0.5\textwidth}
\includegraphics{./figures/2016/v1_Njet_invMass_stack.eps}
\end{minipage}
\begin{minipage}{0.5\textwidth}
\includegraphics{./figures/combined/v1_Njet_invMass_stack.eps}
\end{minipage}
\begin{minipage}{\textwidth}
\vspace{0.1cm}
\centering\textbf{$m_{jj}$ distribution after the cut}
\end{minipage}
\begin{minipage}{0.5\textwidth}
\includegraphics{./figures/2016/v1_Eta_invMass_stack.eps}
\end{minipage}
\begin{minipage}{0.5\textwidth}
\includegraphics{./figures/combined/v1_Eta_invMass_stack.eps}
\end{minipage}
\caption{Demonstration of the effect of the $\Delta\eta$ cut at $\Delta\eta \le 1.3$ on the $m_{jj}$ distribution. \newline
Left: partial dataset of $\SI{35.92}{\per\femto\barn}$. Right: full dataset of $\SI{137.19}{\per\femto\barn}$.}
\label{fig:deta}
\end{figure}

The last requirement of the preselection is on the dijet invariant mass: $m_{jj} \ge \SI{1050}{\giga\eV}$. It is needed to reach a trigger efficiency above 99 %, with a soft-drop mass cut of $m_{SDM} > \SI{65}{\giga\eV}$ applied to the jet with the highest transverse momentum. A comparison of the $m_{jj}$ distribution before and after the selection can be seen in [@fig:invmass]. This cut also has a large impact on the background, which typically consists of events with lower dijet masses, whereas the q\* is expected to have a very high invariant mass of more than 1 TeV. The $m_{jj}$ distribution should be a smoothly falling function for the QCD background and peak at the simulated resonance mass for signal events.

\begin{figure}
\begin{minipage}{\textwidth}
\centering\textbf{Comparison for 2016}
\end{minipage}
\begin{minipage}{0.5\textwidth}
\includegraphics{./figures/2016/v1_Eta_invMass_stack.eps}
\end{minipage}
\begin{minipage}{0.5\textwidth}
\includegraphics{./figures/2016/v1_invmass_invMass_stack.eps}
\end{minipage}
\begin{minipage}{\textwidth}
\centering\textbf{Comparison for the combined dataset}
\end{minipage}
\begin{minipage}{0.5\textwidth}
\includegraphics{./figures/combined/v1_Eta_invMass_stack.eps}
\end{minipage}
\begin{minipage}{0.5\textwidth}
\includegraphics{./figures/combined/v1_invmass_invMass_stack.eps}
\end{minipage}
\caption{Comparison of the invariant mass distribution before and after the cut at $m_{jj} \ge \SI{1050}{\giga\eV}$. It shows the expected smoothly falling shape of the background, whereas the signal peaks at the simulated resonance mass. \newline
Left: distribution before the cut. Right: distribution after the cut.}
\label{fig:invmass}
\end{figure}

After the preselection, the signal efficiency for the 2016 samples of q\* decaying to qW ranges from 48 % at 1.6 TeV to 49 % at 7 TeV. For the decay to qZ, the efficiencies are between 45 % (1.6 TeV) and 50 % (7 TeV). The background is reduced to 5 % of the original events. For the combined data of the three years, the values look similar: for the decay to qW, signal efficiencies between 49 % (1.6 TeV) and 56 % (7 TeV) are reached, whereas the efficiencies for the decay to qZ are in the range of 46 % (1.6 TeV) to 50 % (7 TeV). Here, the background is reduced to 8 % of the original events. So while around 50 % of the signal is kept, the background is already reduced to less than a tenth.

## Data - Monte Carlo comparison

To ensure that the simulation reproduces the data well, the simulated QCD background sample is compared to the data of the corresponding years collected by the CMS detector. This is done separately for the partial dataset of 2016 and for the full dataset. In [@fig:data-mc], this comparison is shown for the distributions of the variables used during the preselection. To compensate for the simulation overpredicting the scale of the QCD background, the histograms are rescaled so that the dijet invariant mass distributions of data and simulation have the same integral. The invariant mass distribution of the 2016 data falls slightly faster than the simulated one; apart from that, the distributions are in very good agreement.

For analysing the data from the CMS experiment, jet energy corrections have to be applied. These calibrate the ECAL and HCAL parts of the CMS so that the energy of the detected particles can be measured correctly. The corrections used were those recommended by the CMS group for internal use [@JEC].

\begin{figure}
\begin{minipage}{0.5\textwidth}
\centering\textbf{2016}
\end{minipage}
\begin{minipage}{0.5\textwidth}
\centering\textbf{Combined}
\end{minipage}
\begin{minipage}{0.5\textwidth}
\includegraphics{./figures/2016/DATA/v1_invmass_N_jets.eps}
\end{minipage}
\begin{minipage}{0.5\textwidth}
\includegraphics{./figures/combined/DATA/v1_invmass_N_jets.eps}
\end{minipage}
\begin{minipage}{0.5\textwidth}
\includegraphics{./figures/2016/DATA/v1_invmass_deta.eps}
\end{minipage}
\begin{minipage}{0.5\textwidth}
\includegraphics{./figures/combined/DATA/v1_invmass_deta.eps}
\end{minipage}
\begin{minipage}{0.5\textwidth}
\includegraphics{./figures/2016/DATA/v1_invmass_invMass.eps}
\end{minipage}
\begin{minipage}{0.5\textwidth}
\includegraphics{./figures/combined/DATA/v1_invmass_invMass.eps}
\end{minipage}
\caption{Comparison of data with the Monte Carlo simulation.}
\label{fig:data-mc}
\end{figure}

### Sideband region

The sideband region is introduced to make sure no bias between data and the Monte Carlo simulation is present and to further verify their agreement. It is a region in which no signal events are expected. Again, data and simulation are compared. For this analysis, the region where the soft-drop mass of both of the two highest-$p_t$ jets is above 105 GeV is chosen. This is well above the 91 GeV mass of the Z boson, the heavier of the two vector bosons, so it is very unlikely that a jet with a larger mass originates from the decay of a vector boson. In [@fig:sideband], the comparison of data with simulation in the sideband region is shown for the soft-drop mass distribution as well as the dijet invariant mass distribution. In the sideband region, data and simulation match very well.

\begin{figure}
\begin{minipage}{\textwidth}
\centering\textbf{2016}
\end{minipage}
\begin{minipage}{0.5\textwidth}
\includegraphics{./figures/2016/sideband/v1_SDM_SoftDropMass_1.eps}
\end{minipage}
\begin{minipage}{0.5\textwidth}
\includegraphics{./figures/2016/sideband/v1_SDM_invMass.eps}
\end{minipage}
\begin{minipage}{\textwidth}
\centering\textbf{Combined dataset}
\end{minipage}
\begin{minipage}{0.5\textwidth}
\includegraphics{./figures/combined/sideband/v1_SDM_SoftDropMass_1.eps}
\end{minipage}
\begin{minipage}{0.5\textwidth}
\includegraphics{./figures/combined/sideband/v1_SDM_invMass.eps}
\end{minipage}
\caption{Comparison of data with the Monte Carlo simulation in the sideband region.}
\label{fig:sideband}
\end{figure}

\newpage

# Jet substructure selection

So far, it has been verified that the data collected by CMS and the simulation are in good agreement after the preselection and that the applied cuts introduce no unwanted side effects. Now a further selection has to be introduced to reduce the background enough to be able to look for the hypothetical signal events in the data.

This is done by distinguishing between QCD and signal events using a tagger that identifies jets coming from a vector boson. Two different taggers are used so that their performance can be compared later. The analysed decay includes either a W or a Z boson, which are very heavy compared to the particles produced in QCD effects. This is exploited by adding a selection on the soft-drop mass of a jet: the soft-drop mass of at least one of the two leading jets is required to be between $\SI{35}{\giga\eV}$ and $\SI{105}{\giga\eV}$. This cut already provides a good separation of QCD and signal events, on which the two taggers presented next can build.

Both taggers provide a discriminant used to classify an event as the decay of a vector boson or as originating from QCD effects. The cut value on this discriminant is optimised afterwards to achieve the maximum possible signal significance.

## N-Subjettiness

The N-subjettiness [@TAU21] $\tau_N$ is a jet shape variable designed to identify boosted hadronically decaying objects. When a vector boson decays hadronically, it produces two quarks, each causing a jet. In the decay of a q\* particle, however, the vector boson is highly boosted and so are its decay products; after applying a clustering algorithm, they therefore appear as just one jet. The N-subjettiness quantifies whether one jet might consist of two subjets by using the kinematics and positions of the constituent particles of that jet. It is defined as

\begin{equation}
\tau_N = \frac{1}{d_0} \sum_k p_{T,k} \cdot \min\{ \Delta R_{1,k}, \Delta R_{2,k}, \dots, \Delta R_{N,k} \}
\end{equation}

where $k$ runs over the constituent particles of a given jet, $p_{T,k}$ are their transverse momenta, and $\Delta R_{J,k} = \sqrt{(\Delta\eta)^2 + (\Delta\phi)^2}$ is the distance between a candidate subjet axis $J$ and a constituent particle $k$ in the $\eta$-$\phi$ plane. It quantifies to what degree a jet can be regarded as composed of $N$ subjets. In the hadronic decay of a highly boosted vector boson, two subjets are expected, so $\tau_2$ would seem a natural choice for a discriminant. However, studies showed that the ratio $\tau_{21} = \tau_2/\tau_1$, rather than $\tau_2$ directly, is a better discriminant between QCD effects and events originating from the decay of a boosted vector boson.

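The definition can be illustrated on a toy jet; the constituents, subjet axes and the normalisation $d_0 = \sum_k p_{T,k} \, R_0$ chosen here are illustrative assumptions:

```python
import math

R0 = 0.8  # distance parameter of the jets in this analysis

def delta_r(a, b):
    """Distance in the eta-phi plane between two (eta, phi) points."""
    deta = a[0] - b[0]
    dphi = (a[1] - b[1] + math.pi) % (2.0 * math.pi) - math.pi
    return math.hypot(deta, dphi)

def tau_n(constituents, axes):
    """tau_N = (1/d0) * sum_k pt_k * min_J dR(J, k), with d0 = sum_k pt_k * R0."""
    d0 = sum(pt for pt, _, _ in constituents) * R0
    return sum(pt * min(delta_r((eta, phi), axis) for axis in axes)
               for pt, eta, phi in constituents) / d0

# Toy jet made of two narrow particle clusters (pt, eta, phi); values are illustrative
jet = [(100.0, 0.10, 0.00), (80.0, 0.12, 0.02),
       (90.0, -0.10, 0.30), (70.0, -0.11, 0.28)]
tau1 = tau_n(jet, [(0.0, 0.15)])                   # one-subjet hypothesis
tau2 = tau_n(jet, [(0.11, 0.01), (-0.105, 0.29)])  # two-subjet hypothesis
print(tau2 / tau1 < 0.5)  # a small tau2/tau1 indicates a two-prong (V-like) jet
```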
The lower $\tau_{21}$ is, the more likely the jet is to originate from the decay of a vector boson. Therefore a selection is introduced requiring $\tau_{21}$ of one candidate jet to be smaller than a value determined by the optimisation process described in [@sec:opt]. As candidate jet, the one of the two highest-$p_t$ jets passing the soft-drop mass window is used; if both pass, the one with the higher $p_t$ is chosen.

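The candidate-jet choice shared by both taggers can be sketched as follows; jets are hypothetical `(pt, m_softdrop)` pairs with the window in GeV:

```python
SD_LO, SD_HI = 35.0, 105.0  # soft-drop mass window in GeV

def candidate_jet(leading, subleading):
    """Among the two leading jets, return the one inside the soft-drop mass
    window; if both pass, prefer the higher-pt jet; None if neither passes."""
    passing = [j for j in (leading, subleading) if SD_LO <= j[1] <= SD_HI]
    return max(passing, key=lambda j: j[0]) if passing else None

print(candidate_jet((800.0, 80.0), (600.0, 90.0)))  # → (800.0, 80.0)
print(candidate_jet((800.0, 20.0), (600.0, 90.0)))  # → (600.0, 90.0)
```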
## DeepAK8

The DeepAK8 tagger [@DEEP_BOOSTED] uses a deep neural network (DNN) to identify jets originating from a vector boson. It reduces the background rate by up to a factor of ~10 at the same signal efficiency compared to non-machine-learning approaches like the N-subjettiness method. This is shown in [@fig:ak8_eff], which compares the background and signal efficiency of the DeepAK8 tagger with, among others, the $\tau_{21}$ tagger that is also used in this analysis.

![Comparison of tagger efficiencies, showing, among others, the DeepAK8 and $\tau_{21}$ taggers used in this analysis [@DEEP_BOOSTED].](./figures/deep_ak8.pdf){#fig:ak8_eff width=60%}

The DNN has two input lists for each jet. The first is a list of up to 100 constituent particles of the jet, sorted by decreasing $p_t$. For each particle, 42 properties such as $p_t$, energy deposit, charge and the angular separation between the particle and the jet or subjet axes are included. The second is a list of up to seven secondary vertices, each with 15 features such as kinematics, displacement and quality criteria. To process these inputs, a customised DNN architecture has been developed. It consists of two convolutional neural networks (CNNs) that each process one of the input lists. The outputs of the two CNNs are then combined and processed by a fully connected network to classify the jet. The network was trained on a sample of 40 million jets; another 10 million jets were used for development and validation.

In this thesis, the mass-decorrelated version of the DeepAK8 tagger, called DeepAK8-MD but referred to simply as DeepAK8 in the following, is used. It adds an additional mass-predictor layer, trained to quantify how strongly the output of the non-decorrelated tagger is correlated with the mass of a particle. Its output is fed back to the network as a penalty, so the network avoids using particle features correlated with the mass. The result is a largely mass-decorrelated tagger for heavy resonances that does not introduce a bias in the jet mass shape. As can be seen in [@fig:ak8_eff], it does not perform as well as the non-mass-decorrelated version, but still better than the other taggers it was compared to.

The higher the discriminant value of the DeepAK8 tagger, called WvsQCD resp. ZvsQCD (referred to as VvsQCD in the following), the more likely the jet is to originate from the decay of a vector boson. Therefore, using the same candidate-jet choice as for the N-subjettiness tagger, a selection is applied requiring the candidate jet to have a VvsQCD value greater than a threshold determined by the optimisation presented next.

## Optimisation {#sec:opt}

To find the best cut value on the discriminants provided by the two taggers, a figure of merit quantifying the quality of a cut is needed. For this, the significance $\frac{S}{\sqrt{B}}$ is used, where $S$ is the number of signal events and $B$ the number of background events in a given interval. This formula assumes a Gaussian error on the background, so it is evaluated at the 2 TeV mass point of the decay to qW, where enough background events exist to justify this assumption: by the central limit theorem [@CLT], the sum of identically distributed random variables converges to a Gaussian distribution. The significance represents how well the signal can be distinguished from the background, in units of the standard deviation of the background. As interval, a 10 % margin around the nominal resonance mass is chosen. The significance is then calculated for different selections on the discriminant of the two taggers and plotted as a function of the minimum (DeepAK8) resp. maximum (N-subjettiness) discriminant value allowed to pass the selection.

The optimisation is performed using only the data from 2018, assuming the taggers perform similarly on the data of the different years.

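Such a scan can be sketched as follows; the discriminant distributions are hypothetical stand-ins, not the simulated samples used in the analysis:

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical discriminant shapes: signal peaks towards 1, background towards 0
sig = rng.beta(8, 2, size=10_000)
bkg = rng.beta(2, 8, size=1_000_000)

def significance(cut):
    """S / sqrt(B) for events with discriminant >= cut."""
    s = np.count_nonzero(sig >= cut)
    b = np.count_nonzero(bkg >= cut)
    return s / np.sqrt(b) if b > 0 else 0.0

cuts = np.linspace(0.5, 0.99, 50)
best = max(cuts, key=significance)
# The optimal cut balances signal efficiency against background rejection
```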
\begin{figure}
\begin{minipage}{0.5\textwidth}
\includegraphics{./figures/sig-db.pdf}
\end{minipage}
\begin{minipage}{0.5\textwidth}
\includegraphics{./figures/sig-tau.pdf}
\end{minipage}
\caption{Significance plots for the DeepAK8 (left) and N-subjettiness (right) tagger at the 2 TeV mass point.}
\label{fig:sig}
\end{figure}

As a result, the $\tau_{21}$ cut is placed at $\le 0.35$, confirming the value chosen in the previous research, and the DeepAK8 cut is placed at $\ge 0.95$. For the DeepAK8 tagger, a cut at 0.97 would give a slightly higher significance, but it lies very close to the edge where the significance drops sharply, and the tighter the cut, the less background remains to calculate the cross section limits, especially at higher resonance masses. Therefore the slightly less strict cut is chosen.

For both taggers, a low-purity category is also introduced. Using the cuts optimised for 2 TeV, very few background events remain at higher resonance masses, but such events are needed to reliably calculate cross section limits. Therefore, in the final cross section calculation, the two categories are combined to retain a high signal sensitivity for all simulated mass points between 1.6 TeV and 7 TeV. As low-purity category for the N-subjettiness tagger, a cut of $0.35 < \tau_{21} < 0.75$ is used. For the DeepAK8 tagger, the complement of the high-purity category is used: $VvsQCD < 0.95$.

\newpage

# Signal extraction {#sec:extr}

With the optimisation complete, the optimal selections for the N-subjettiness as well as the DeepAK8 tagger are applied to the simulated samples and to the data collected by the CMS experiment. The fit described in [@sec:moa] is performed for all mass points of the decays to qW and qZ, separately for the partial dataset of $\SI{35.92}{\per\femto\barn}$ and the complete dataset of $\SI{137.19}{\per\femto\barn}$.

To test for the presence of a resonance in the data, cross section limits on the signal are calculated using the frequentist asymptotic limit method described in [@ASYMPTOTIC_LIMIT]. Using the parameters and signal rate obtained by the method described in [@sec:moa], together with a shape analysis of the data recorded by the CMS experiment, it determines an expected and an observed cross section limit from a signal-plus-background versus background-only hypothesis test. It also calculates upper and lower bounds of the expected cross section limit corresponding to a confidence level of 95 %.

In the absence of the q\* particle in the data, the observed limit lies within the $2\sigma$ band (corresponding to a 95 % confidence level) around the expected limit. The observed limit is plotted together with a theory line representing the cross section expected if the q\* predicted by [@QSTAR_THEORY] existed. Since no significant deviation from the Standard Model is found, the crossing of the theory line with the observed limit is calculated, yielding a mass up to which the existence of the q\* particle can be excluded. To estimate the uncertainty of this result, the crossings of the theory line plus and minus its uncertainty with the observed limit are also calculated.

## Systematic Uncertainties

The variables used in this analysis are affected by systematic uncertainties. For the calculation of the signal cross section limits, four sources of such uncertainties are considered.

First, the uncertainty of the jet energy corrections. When measuring a particle's energy with the ECAL or HCAL part of the CMS, the electronic signals sent by the photodetectors in the calorimeters have to be converted to actual energy values. An error in this calibration shifts the measured energy to higher or lower values, which also shifts the position of the signal peak in the $m_{jj}$ distribution. This uncertainty is estimated to be 2 %.

Second, the tagger does not work perfectly: some jets that do not originate from a V boson are wrongly selected, while some that do are rejected. This influences the events chosen for the analysis and is therefore also considered as an uncertainty, estimated to be 6 %.

Third, the uncertainty on the parameters of the background fit is considered, as it can slightly change the background shape and therefore influence how many signal and background events are reconstructed from the data.

Fourth, the uncertainty on the luminosity influences the normalisation of the processes. Its value is 2.5 % [@LUMI_UNC].

\newpage

# Results

This chapter starts by presenting the results for the partial dataset of 2016, with an integrated luminosity of $\SI{35.92}{\per\femto\barn}$, using both taggers and comparing them to the previous research [@PREV_RESEARCH]. It then presents the results for the combined dataset, with an integrated luminosity of $\SI{137.19}{\per\femto\barn}$, again using both taggers and comparing their performance.

## Partial dataset

Using the $\SI{35.92}{\per\femto\barn}$ of data collected by the CMS experiment during 2016, the cross section limits seen in [@fig:res2016] were obtained.

As described in [@sec:extr], the calculated cross section limits are used to derive a mass limit, i.e. the lowest possible mass of the q\* particle, by finding the crossing of the theory line with the observed cross section limit. In [@fig:res2016dw;@fig:res2016dz] it can be seen that the observed limit using the DeepAK8 tagger is considerably higher in the region where theory and observed limit cross than when using the N-subjettiness tagger. The two lines therefore cross at lower resonance masses, resulting in lower exclusion limits on the mass of the q\* particle, so the DeepAK8 tagger performs worse than the N-subjettiness tagger in terms of establishing those limits, as can be seen in [@tbl:res2016]. The table also shows the upper and lower bounds on the mass limit, found by calculating the crossing of the theory line plus resp. minus its uncertainty. Because both the theory line and the observed limit fall only slowly in the high-TeV region, even a small theory uncertainty can cause a large difference in the mass limit.

: Mass limits found using the partial dataset of $\SI{35.92}{\per\femto\barn}$ {#tbl:res2016}

| Decay | Tagger | Limit [TeV] | Upper Limit [TeV] | Lower Limit [TeV] |
|-------|-------------|-------------|-------------------|-------------------|
| qW | $\tau_{21}$ | 5.39 | 6.01 | 4.99 |
| qW | DeepAK8 | 4.96 | 5.19 | 4.84 |
| qZ | $\tau_{21}$ | 4.86 | 4.96 | 4.70 |
| qZ | DeepAK8 | 4.62 | 4.71 | 4.49 |

\begin{figure}%
\centering
\subfloat[Decay to qW, using N-subjettiness tagger]{%
\label{fig:res2016tw}%
\includegraphics[width=0.5\textwidth]{./figures/results/brazilianFlag_QtoqW_2016tau_13TeV.pdf}}
\subfloat[Decay to qW, using DeepAK8 tagger]{%
\includegraphics[width=0.5\textwidth]{./figures/results/brazilianFlag_QtoqW_2016db_13TeV.pdf}%
\label{fig:res2016dw}}\\
\subfloat[Decay to qZ, using N-subjettiness tagger]{%
\includegraphics[width=0.5\textwidth]{./figures/results/brazilianFlag_QtoqZ_2016tau_13TeV.pdf}%
\label{fig:res2016tz}}%
\subfloat[Decay to qZ, using DeepAK8 tagger]{%
\includegraphics[width=0.5\textwidth]{./figures/results/brazilianFlag_QtoqZ_2016db_13TeV.pdf}%
\label{fig:res2016dz}}%
\caption{Cross section limits for the partial dataset of 2016 using the $\tau_{21}$ tagger and the DeepAK8 tagger.}
\label{fig:res2016}
\end{figure}

### Comparison with existing results
|
|
|
|
The result will now be compared to an existing result using the same dataset. This research, however, uses a newer
|
|
detector calibration as well as an improved reconstruction so slight variations in the results are to be expected.
|
|
|
|
The limit established with the N-subjettiness tagger on the partial dataset is $\SI{0.39}{\tera\eV}$ (decay to qW) and $\SI{0.16}{\tera\eV}$ (decay to qZ) higher than the one from the previous research, which found 5 TeV for the decay to qW and 4.7 TeV for the decay to qZ. This is mainly because, in our data, the observed limit near the intersection point happens to lie in the lower region of the expected limit interval, causing a very late crossing with the theory line when using the N-subjettiness tagger (as can be seen in [@fig:res2016]). Comparing the expected limits, the values calculated in this thesis differ from those of the previous research by between 2 % and 30 %. Neither of the two results is consistently lower or higher; instead, the differences fluctuate. As already noted, slight variations in the results were expected, so the results can be said to be in good agreement. The cross section limits of the previous research are shown in [@fig:prev].

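
The quoted mass limit corresponds to the point where the observed limit curve crosses the falling theory cross section, which is why a downward fluctuation of the observed limit pushes the crossing to higher masses. A minimal sketch of finding such a crossing by linear interpolation between grid points (all numbers below are illustrative, not the analysis values):

```python
import numpy as np

def crossing_mass(masses, limit, theory):
    """Mass at which the limit curve crosses the theory curve.

    Assumes both curves are sampled on the same mass grid and that the
    theory cross section falls faster than the limit, so the difference
    limit - theory changes sign exactly once (negative -> positive).
    """
    diff = np.asarray(limit, dtype=float) - np.asarray(theory, dtype=float)
    # last grid point where the theory cross section still exceeds the limit
    i = np.where(diff < 0)[0][-1]
    # linear interpolation between the two points bracketing the sign change
    x0, x1 = masses[i], masses[i + 1]
    y0, y1 = diff[i], diff[i + 1]
    return x0 - y0 * (x1 - x0) / (y1 - y0)

# illustrative curves only: masses excluded up to ~5.4 TeV
masses = np.array([5.0, 5.2, 5.4, 5.6, 5.8])   # TeV
theory = np.array([8.0, 5.0, 3.1, 1.9, 1.2])   # fb, steeply falling
limit  = np.array([4.0, 3.6, 3.3, 3.0, 2.8])   # fb, observed 95% CL
print(round(crossing_mass(masses, limit, theory), 2))  # -> 5.38
```

This illustrates the mechanism only; the actual limits were computed with the statistical tools of the analysis.
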
\begin{figure}
\begin{minipage}{0.5\textwidth}
\includegraphics[width=\textwidth]{./figures/results/prev_qW.png}
\end{minipage}
\begin{minipage}{0.5\textwidth}
\includegraphics[width=\textwidth]{./figures/results/prev_qZ.png}
\end{minipage}
\caption{Previous results of the cross section limits for q\* decaying to qW (left) and q\* decaying to qZ (right)
\cite{PREV_RESEARCH}.}
\label{fig:prev}
\end{figure}

## Combined dataset

Using the full available dataset of $\SI{137.19}{\per\femto\barn}$, the cross section limits shown in [@fig:resCombined] were obtained. Compared to using only the 2016 dataset, the cross section limits are reduced to about 50 %. This demonstrates the large improvement achieved by using more than three times the amount of data.

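
For a background-dominated search, the expected cross section limit scales roughly as $1/\sqrt{L}$ with the integrated luminosity $L$, so the observed reduction to about 50 % matches this rule of thumb. A quick back-of-the-envelope check (not part of the statistical machinery of the analysis):

```python
import math

lumi_2016 = 35.92    # fb^-1, partial 2016 dataset
lumi_full = 137.19   # fb^-1, combined 2016-2018 dataset

# expected-limit ratio ~ sqrt(L_old / L_new) in the background-dominated regime
ratio = math.sqrt(lumi_2016 / lumi_full)
print(f"{ratio:.2f}")  # -> 0.51, i.e. limits reduced to about half
```
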
The mass limits obtained for the combined years are presented in [@tbl:resCombined].

: Mass limits found using $\SI{137.19}{\per\femto\barn}$ of data {#tbl:resCombined}

| Decay | Tagger | Limit [TeV] | Upper Limit [TeV] | Lower Limit [TeV] |
|-------|-------------|-------------|-------------------|-------------------|
| qW | $\tau_{21}$ | 6.00 | 6.26 | 5.74 |
| qW | DeepAK8 | 6.11 | 6.31 | 5.39 |
| qZ | $\tau_{21}$ | 5.49 | 5.76 | 5.29 |
| qZ | DeepAK8 | 4.95 | 5.13 | 4.85 |

The combination of the three years improved not only the cross section limits but also the limit on the mass of the q\* particle. The final result is 1 TeV higher for the decay to qW and almost 0.8 TeV higher for the decay to qZ than what was concluded by the previous research [@PREV_RESEARCH].

\begin{figure}%
\centering
\subfloat[Decay to qW, using N-subjettiness tagger]{%
\label{fig:resCombinedtw}%
\includegraphics[width=0.5\textwidth]{./figures/results/brazilianFlag_QtoqW_Combinedtau_13TeV.pdf}}
\subfloat[Decay to qW, using DeepAK8 tagger]{%
\includegraphics[width=0.5\textwidth]{./figures/results/brazilianFlag_QtoqW_Combineddb_13TeV.pdf}%
\label{fig:resCombineddw}}\\
\subfloat[Decay to qZ, using N-subjettiness tagger]{%
\includegraphics[width=0.5\textwidth]{./figures/results/brazilianFlag_QtoqZ_Combinedtau_13TeV.pdf}%
\label{fig:resCombinedtz}}%
\subfloat[Decay to qZ, using DeepAK8 tagger]{%
\includegraphics[width=0.5\textwidth]{./figures/results/brazilianFlag_QtoqZ_Combineddb_13TeV.pdf}%
\label{fig:resCombineddz}}%
\caption{Results of the cross section limits for the combined dataset using the $\tau_{21}$ tagger and the DeepAK8
tagger.}
\label{fig:resCombined}
\end{figure}

## Comparison of taggers

The results presented in [@tbl:res2016; @tbl:resCombined] show that the DeepAK8 tagger was not able to significantly improve the results compared to the N-subjettiness tagger. For further comparison, [@fig:limit_comp] shows the expected limits of the two taggers for the q\* $\rightarrow$ qW and the q\* $\rightarrow$ qZ decays. It can be seen that the DeepAK8 tagger is at best as good as the N-subjettiness tagger. This was not the expected result, as the deep neural network was already found to provide a higher significance in the optimisation done in [@sec:opt], and a higher significance should also result in lower cross section limits. To make sure there is no mistake in the setup, the expected cross section limits using only the high purity category of the two taggers with 2018 data are also compared in [@fig:comp_2018]. There, the cross section limits calculated with the DeepAK8 tagger are slightly lower than those from the N-subjettiness tagger, showing that the method used for optimisation works, but that the assumption that it would also apply to the combined dataset did not hold.

This can be explained by training issues that were identified only recently. The DeepAK8 tagger was trained on the data of the year 2016 and therefore performs differently on the data of the other years. This caused the DeepAK8 tagger to perform significantly worse than it could have, for several reasons. First, the optimisation done for the data of the year 2018 could not be transferred to the other datasets. Second, even for the data of 2016, a newer version of the background simulation was used, which, in combination with the samples used for the signal, turned out to be the worst case scenario for the training used. Recently, the training was improved to perform better across all datasets, but those changes could not be incorporated into this thesis within a reasonable timeframe.

\newpage

\begin{figure}
\begin{minipage}{0.5\textwidth}
\includegraphics[width=\textwidth]{./figures/limit_comp_w.pdf}
\end{minipage}
\begin{minipage}{0.5\textwidth}
\includegraphics[width=\textwidth]{./figures/limit_comp_z.pdf}
\end{minipage}
\caption{Comparison of the expected limits of the two taggers using different datasets. Left: decay to qW. Right:
decay to qZ.}
\label{fig:limit_comp}
\end{figure}

{#fig:comp_2018 width=55%}

\clearpage

# Summary

In this thesis, a search for the q\* particle decaying to q + W and q + Z was presented. Data of proton-proton collisions at the LHC with an integrated luminosity of $\SI{137.19}{\per\femto\barn}$, collected by the CMS experiment at a centre-of-mass energy of $\sqrt{s} = \SI{13}{\tera\eV}$, was searched. A partial dataset of $\SI{35.92}{\per\femto\barn}$ was also analysed in order to compare the results to previous research. Monte Carlo simulations were used to model the QCD multijet background and the signal shapes.

A selection was introduced to reduce background events and enhance signal sensitivity. This selection required at least two jets, a separation $|\Delta\eta| \le 1.3$ between the two highest-$p_T$ jets, an invariant mass of the two highest-$p_T$ jets greater than $\SI{1050}{\giga\eV}$, and a soft-drop mass of at least one jet between $\SI{35}{\giga\eV}$ and $\SI{105}{\giga\eV}$.

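
These cuts can be sketched as a small event filter. This is an illustration only, not the analysis code: the `Jet` record, its field names, and the precomputed dijet mass `mjj` are assumptions, and the $\Delta\eta$ requirement is implemented as an upper bound, as is standard in dijet resonance searches:

```python
from dataclasses import dataclass

@dataclass
class Jet:
    pt: float    # transverse momentum [GeV]
    eta: float   # pseudorapidity
    msd: float   # soft-drop mass [GeV]

def passes_selection(jets, mjj, deta_max=1.3, mjj_min=1050.0,
                     msd_lo=35.0, msd_hi=105.0):
    """Selection sketch: >= 2 jets, |Delta eta| cut on the two leading-pt
    jets, a dijet invariant mass threshold, and a soft-drop mass window
    on at least one of the two leading jets."""
    if len(jets) < 2:
        return False
    j1, j2 = sorted(jets, key=lambda j: j.pt, reverse=True)[:2]
    if abs(j1.eta - j2.eta) > deta_max:
        return False
    if mjj <= mjj_min:
        return False
    return any(msd_lo < j.msd < msd_hi for j in (j1, j2))

jets = [Jet(900.0, 0.4, 82.0), Jet(850.0, -0.5, 20.0)]
print(passes_selection(jets, mjj=1800.0))  # -> True
```
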
Two taggers, the DeepAK8 and the N-subjettiness tagger, have been used to identify jets originating from the decay of a vector boson. For both of them, two categories were introduced: a high purity category, aiming for maximal signal sensitivity in the low TeV region of the invariant mass spectrum, and a low purity category, aiming for better statistics in the high TeV region. For the DeepAK8 tagger, a high purity category of $VvsQCD > 0.95$ and a low purity category of $VvsQCD \le 0.95$ were used. For the N-subjettiness tagger, the high purity category was $\tau_{21} < 0.35$ and the low purity category $0.35 < \tau_{21} < 0.75$. These values were obtained by optimising for the highest possible signal significance.

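
The category assignment above can be sketched as a small function; the tagger names and the interface are placeholders for whatever the analysis framework provides, only the thresholds are taken from the text:

```python
def purity_category(tagger, score):
    """Assign a jet to the high or low purity category, or None if untagged.

    Thresholds as in this thesis: DeepAK8 uses the VvsQCD score (high
    purity above 0.95, low purity otherwise); the N-subjettiness tagger
    uses tau21 (high purity below 0.35, low purity up to 0.75).
    """
    if tagger == "deepak8":
        return "high" if score > 0.95 else "low"
    if tagger == "tau21":
        if score < 0.35:
            return "high"
        if score < 0.75:
            return "low"
        return None  # tau21 >= 0.75: jet fails both categories
    raise ValueError(f"unknown tagger: {tagger}")

print(purity_category("tau21", 0.5))     # -> low
print(purity_category("deepak8", 0.97))  # -> high
```
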
A combined fit of signal plus background to the dijet invariant mass distribution has been used to determine their shape parameters and the expected signal rate. With those results, the cross section limits were extracted from the data. Because no significant deviation from the Standard Model was observed, new exclusion limits on the mass of the q\* particle were set: 6.1 TeV from the decay to qW and 5.5 TeV from the decay to qZ. These limits are about 1 TeV higher than the ones found in the previous research, which were 5 TeV and 4.7 TeV, respectively.

The performance of the two taggers has been compared and found to produce similar results. This was unexpected, as the DeepAK8 tagger was supposed to perform better than the N-subjettiness tagger. The limit obtained for the decay to qW was $\SI{0.1}{\tera\eV}$ better with the DeepAK8 tagger than with the N-subjettiness tagger, but for the decay to qZ it was $\SI{0.5}{\tera\eV}$ worse. The performance of the DeepAK8 tagger is likely to improve significantly with an updated training that was not yet available in the framework used by this thesis.


\newpage

\nocite{*}