bsc-thesis/thesis.tex

% Options for packages loaded elsewhere
\PassOptionsToPackage{unicode}{hyperref}
\PassOptionsToPackage{hyphens}{url}
%
\documentclass[
  12pt,
  british,
  a4paper,
]{article}
\usepackage{lmodern}
\usepackage{amssymb,amsmath}
\usepackage{ifxetex,ifluatex}
\ifnum 0\ifxetex 1\fi\ifluatex 1\fi=0 % if pdftex
  \usepackage[T1]{fontenc}
  \usepackage[utf8]{inputenc}
  \usepackage{textcomp} % provide euro and other symbols
\else % if luatex or xetex
  \usepackage{unicode-math}
  \defaultfontfeatures{Scale=MatchLowercase}
  \defaultfontfeatures[\rmfamily]{Ligatures=TeX,Scale=1}
  \setmainfont[]{Times New Roman}
\fi
% Use upquote if available, for straight quotes in verbatim environments
\IfFileExists{upquote.sty}{\usepackage{upquote}}{}
\IfFileExists{microtype.sty}{% use microtype if available
  \usepackage[]{microtype}
  \UseMicrotypeSet[protrusion]{basicmath} % disable protrusion for tt fonts
}{}
\makeatletter
\@ifundefined{KOMAClassName}{% if non-KOMA class
  \IfFileExists{parskip.sty}{%
    \usepackage{parskip}
  }{% else
    \setlength{\parindent}{0pt}
    \setlength{\parskip}{6pt plus 2pt minus 1pt}}
}{% if KOMA class
  \KOMAoptions{parskip=half}}
\makeatother
\usepackage{xcolor}
\IfFileExists{xurl.sty}{\usepackage{xurl}}{} % add URL line breaks if available
\IfFileExists{bookmark.sty}{\usepackage{bookmark}}{\usepackage{hyperref}}
\hypersetup{
  pdfauthor={David Leppla-Weber},
  pdflang={en-GB},
  hidelinks,
  pdfcreator={LaTeX via pandoc}}
\urlstyle{same} % disable monospaced font for URLs
\usepackage[top=2.5cm,left=2.5cm,right=2.5cm,bottom=2cm]{geometry}
\usepackage{longtable,booktabs}
% Correct order of tables after \paragraph or \subparagraph
\usepackage{etoolbox}
\makeatletter
\patchcmd\longtable{\par}{\if@noskipsec\mbox{}\fi\par}{}{}
\makeatother
% Allow footnotes in longtable head/foot
\IfFileExists{footnotehyper.sty}{\usepackage{footnotehyper}}{\usepackage{footnote}}
\makesavenoteenv{longtable}
\usepackage{graphicx,grffile}
\makeatletter
\def\maxwidth{\ifdim\Gin@nat@width>\linewidth\linewidth\else\Gin@nat@width\fi}
\def\maxheight{\ifdim\Gin@nat@height>\textheight\textheight\else\Gin@nat@height\fi}
\makeatother
% Scale images if necessary, so that they will not overflow the page
% margins by default, and it is still possible to overwrite the defaults
% using explicit options in \includegraphics[width, height, ...]{}
\setkeys{Gin}{width=\maxwidth,height=\maxheight,keepaspectratio}
% Set default figure placement to htbp
\makeatletter
\def\fps@figure{htbp}
\makeatother
\setlength{\emergencystretch}{3em} % prevent overfull lines
\providecommand{\tightlist}{%
  \setlength{\itemsep}{0pt}\setlength{\parskip}{0pt}}
\setcounter{secnumdepth}{5}
\usepackage[onehalfspacing]{setspace}
\usepackage{siunitx}
\usepackage{tikz-feynman}
\usepackage{csquotes}
\usepackage{abstract}
\pagenumbering{gobble}
\setlength{\parskip}{0.4em}
\bibliographystyle{lucas_unsrt}
\makeatletter
\@ifpackageloaded{subfig}{}{\usepackage{subfig}}
\@ifpackageloaded{caption}{}{\usepackage{caption}}
\captionsetup[subfloat]{margin=0.5em}
\AtBeginDocument{%
\renewcommand*\figurename{Figure}
\renewcommand*\tablename{Table}
}
\AtBeginDocument{%
\renewcommand*\listfigurename{List of Figures}
\renewcommand*\listtablename{List of Tables}
}
\@ifpackageloaded{float}{}{\usepackage{float}}
\floatstyle{ruled}
\@ifundefined{c@chapter}{\newfloat{codelisting}{h}{lop}}{\newfloat{codelisting}{h}{lop}[chapter]}
\floatname{codelisting}{Listing}
\newcommand*\listoflistings{\listof{codelisting}{List of Listings}}
\makeatother
\ifxetex
  % Load polyglossia as late as possible: uses bidi with RTL langages (e.g. Hebrew, Arabic)
  \usepackage{polyglossia}
  \setmainlanguage[variant=british]{english}
\else
  \usepackage[shorthands=off,main=british]{babel}
\fi
\usepackage[]{biblatex}
\addbibresource{bibliography.bib}

\author{David Leppla-Weber}
\date{}

\begin{document}

\begin{titlepage}
\begin{center}
       \vspace*{1cm}
       \Huge
       \rule{\textwidth}{0.1cm}
       \textbf{Search for excited quark states decaying to qW/qZ with the CMS experiment}
       \rule{\textwidth}{0.1cm}

       \vspace{2.5cm}
       \Large
       von\\
       \LARGE
       David Leppla-Weber\\
       \Large
       \vspace{0.5cm}
       Geboren am\\
       18.09.1996

       \vfill

       Bachelorarbeit im Studiengang Physik\\
       Universität Hamburg\\
       November 2019

   \end{center}
\end{titlepage}

\newpage
\mbox{}
\vfill

\begin{enumerate}
\def\labelenumi{\arabic{enumi}.}
\tightlist
\item
  Gutachter: Dr.~Andreas Hinzmann
\item
  Gutachter: Jun.-Prof.~Dr.~Gregor Kasieczka
\end{enumerate}

\newpage

\begin{abstract}
A search for an excited quark state, called q*, is presented using data of proton-proton collisions at the LHC recorded
by the CMS experiment during the years 2016, 2017 and 2018 with a centre-of-mass energy of $\sqrt{s} =
\SI{13}{\tera\eV}$ and an integrated luminosity of $\SI{137.19}{\per\femto\barn}$. Its decay channels to q
+ W and q + Z with the vector bosons further decaying hadronically to $q + q\bar{q}'$ resp. $q + q\bar{q}$, resulting in
  two jets in the final state, are analysed. The dijet invariant mass spectrum of those two jets is then used to look
for a resonance and to reconstruct the q* mass. To identify jets originating from the decay of a vector boson, a
V-tagger is needed. For that, the new DeepAK8 tagger, based on a neural network, is compared to the older N-subjettiness
tagger. In the result, no significant deviation from the Standard Model can be observed, therefore the q* is excluded up
to a mass of 6.1\ TeV (qW) resp. 5.5\ TeV (qZ) with a confidence level of 95 \%. This limit is about 1\ TeV higher than
the limits found by a previous research of data with an integrated luminosity of $\SI{35.92}{\per\femto\barn}$ collected
by the CMS experiment in 2016, excluding the q* particle up to a mass of 5.0\ TeV resp. 4.7\ TeV. The DeepAK8 tagger is
found to currently be at the same level as the N-subjettiness tagger, giving a $\SI{0.1}{\tera\eV}$ better result for
the decay to qW but a by $\SI{0.5}{\tera\eV}$ worse one for the decay to qZ. By optimising the neural network's training
for the datasets of 2016, 2017 and 2018, the sensitivity can likely be improved.

\end{abstract}
\newpage
\renewcommand{\abstractname}{Zusammenfassung}
\begin{abstract}

In dieser Arbeit wird eine Suche nach angeregten Quarkzuständen, genannt q*, durchgeführt. Dafür werden Daten von
Proton-Proton Kollisionen am LHC mit einer integrierten Luminosität von $\SI{137.19}{\per\femto\barn}$ analysiert,
welche über die Jahre 2016, 2017 und 2018 bei einer Schwerpunktsenergie von $\sqrt{s} = \SI{13}{\tera\eV}$ vom CMS
Experiment aufgenommen wurden. Es wird der Zerfall des q* Teilchens zu q + W und q + Z untersucht, bei anschließendem
hadronischen Zerfall des Vektorbosons zu $q\bar{q}'$ bzw. $q\bar{q}$. Der gesamte Zerfall resultiert damit in zwei Jets,
mithilfe deren invariantem Massenspektrum die q* Masse rekonstruiert und nach einer Resonanz gesucht wird. Zur
Identifizerung von Jets, welche durch den Zerfall eines Vektorbosons entstanden sind, wird ein V-Tagger benötigt.
Hierfür wird der neue DeepAK8 Tagger, welcher auf einem neuronalen Netzwerk basiert, mit dem älteren N-Subjettiness
Tagger verglichen. Im Ergebnis kann keine signifikante Abweichung vom Standardmodell beobachtet werden. Das q* Teilchen
wird mit einem Konfidenzniveau von 95 \% bis zu einer Masse von 6.1\ TeV (qW) bzw. 5.5\ TeV (qZ) ausgeschlossen. Das Limit
liegt etwa 1\ TeV höher, als das anhand des $\SI{35.92}{\per\femto\barn}$ großen Datensatzes von 2016 gefundene von 5.0
TeV bzw. 4.7\ TeV. Beim Zerfall zu qW erzielt der DeepAK8 Tagger ein um $\SI{0.1}{\tera\eV}$ besseres Ergebnis, als der
N-Subjettiness Tagger, beim Zerfall zu qZ jedoch ein um $\SI{0.5}{\tera\eV}$ schlechteres. Durch Verbesserung des
Trainings des neuronalen Netzwerkes für die drei Datensätze von 2016, 2017 und 2018, gibt es aber noch Potential die
Sensitivität zu verbessern.

\end{abstract}

\newpage
\setcounter{tocdepth}{3}
\tableofcontents

\newpage
\pagenumbering{arabic}

\hypertarget{introduction}{%
\section{Introduction}\label{introduction}}

The Standard Model is a very successful theory in describing most of the
interactions happening between particles. Still, it has a lot of
limitations, that show that it isn't yet a full \enquote{theory of
everything}. To solve these shortcomings, lots of theories beyond the
standard model exist that try to expand the Standard Model in different
ways to solve these issues.

One category of such theories is based on a composite quark model.
Quarks are currently considered elementary particles by the Standard
Model. The composite quark models on the other hand predict that quarks
consist of particles unknown to us so far or can bind to other particles
using unknown forces. This could explain the symmetries between
particles and reduce the number of constants needed to explain the
properties of the known particles. One common prediction of those
theories are excited quark states. Those are quark states of higher
energy that can decay to an unexcited quark under the emission of a
boson. This thesis will look for their decay to a vector boson that then
further decays hadronically. The final state of this decay consists only
of quarks forming two jets, making Quantum Chromodynamics the main
background.

In a previous research \autocite{PREV_RESEARCH}, an exclusion limit for
the mass of an excited quark has already been set using data from the
2016 run of the Large Hadron Collider with an integrated luminosity of
\(\SI{35.92}{\per\femto\barn}\). Since then, a lot more data has been
collected by the CMS experiment, totalling to
\(\SI{137.19}{\per\femto\barn}\) of data usable for research. This
thesis takes advantage of this larger dataset as well as a new technique
to identify decays of highly boosted particles based on a deep neural
network. By using more data and new tagging techniques, it aims to
either confirm the existence of the q* particle or improve the
previously set lower limit of 5 TeV respectively 4.7 TeV for the decay
to qW respectively qZ on its mass to even higher values. It will also
directly compare the performance of this new tagging technique to an
older tagger based on jet substructure studies used in the previous
research.

In chapter 2, a theoretical background will be presented briefly
explaining the Standard Model, its shortcomings and the theory of
excited quarks. Then, in chapter 3, the Large Hadron Collider and the
Compact Muon Solenoid, the detector that collected the data for this
analysis, will be described. After that, in chapters 4-7, the main
analysis part follows, describing how the data were used to extract
limits on the mass of the excited quark particle. At the very end, in
chapter 8, the results are presented and compared to previous research.

\newpage

\hypertarget{theoretical-motivation}{%
\section{Theoretical motivation}\label{theoretical-motivation}}

This chapter presents a short summary of the theoretical background
relevant to this thesis. It first gives an introduction to the standard
model itself and some of the issues it raises. It then goes on to
explain the background processes of quantum chromodynamics and the
theory of q*, which are the relevant phenomena for the search described
in this thesis.

\hypertarget{sec:sm}{%
\subsection{Standard Model}\label{sec:sm}}

The Standard Model of physics proved to be very successful in describing
three of the four fundamental interactions currently known: the
electromagnetic, weak and strong interaction. The fourth, gravity, could
not yet be successfully included in this theory.

The Standard Model divides all particles into spin-\(\frac{n}{2}\)
fermions and spin-n bosons, where n could be any integer but so far is
only known to be one for fermions and either one (gauge bosons) or zero
(scalar bosons) for bosons. Fermions are further classified into quarks
and leptons. Quarks and leptons can also be categorized into three
generations, each of which contains two particles, also called flavours.
For leptons, the three generations each consist of a charged lepton and
its corresponding neutrino, namely the electron, the muon and the tau.
The three quark generations consist of the up and down, the charm and
strange, and the top and bottom quark. A full list of particles of the
standard model can be found in Fig.~\ref{fig:sm}. Furthermore, all
fermions have an associated anti particle with reversed charge. Bound
states of multiple quarks also exist and are called hadrons.

\begin{figure}
\hypertarget{fig:sm}{%
\centering
\includegraphics[width=0.5\textwidth,height=\textheight]{./figures/sm_wikipedia.pdf}
\caption{Elementary particles of the Standard Model and their mass
charge and spin \autocite{SM}.}\label{fig:sm}
}
\end{figure}

The gauge bosons, namely the photon, \(W^\pm\) bosons, \(Z^0\) boson,
and gluon, are mediators of the different forces of the standard model.

The photon is responsible for the electromagnetic force and therefore
interacts with all electrically charged particles. It itself carries no
electromagnetic charge and has no mass. Possible interactions are either
scattering or absorption. Photons of different energies can also be
described as electromagnetic waves of different wavelengths.

The \(W^\pm\) and \(Z^0\) bosons mediate the weak force. All quarks and
leptons carry a flavour, which is a conserved value in all interactions
but the weak one. There, a quark or lepton can, by interacting with a
\(W^\pm\) boson, change its flavour. The probabilities of this happening
are determined by the Cabibbo-Kobayashi-Maskawa matrix:

\begin{equation}
  V_{CKM} =
    \begin{pmatrix}
      |V_{ud}| & |V_{us}| & |V_{ub}| \\
      |V_{cd}| & |V_{cs}| & |V_{cb}| \\
      |V_{td}| & |V_{ts}| & |V_{tb}|
    \end{pmatrix}
  =
    \begin{pmatrix}
      0.974 & 0.225 & 0.004 \\
      0.224 & 0.974 & 0.042 \\
      0.008 & 0.041 & 0.999
    \end{pmatrix}
\end{equation}

The probability of a quark changing its flavour from \(i\) to \(j\) is
given by the square of the absolute value of the matrix element
\(V_{ij}\). It is easy to see, that the change of flavour in the same
generation is way more likely than any other flavour change.

Due to their high masses of 80.39 GeV resp. 91.19 GeV, the \(W^\pm\) and
\(Z^0\) bosons themselves decay very quickly. Either in the leptonic or
hadronic decay channel. In the leptonic channel, the \(W^\pm\) decays to
a lepton and the corresponding anti-lepton neutrino, in the hadronic
channel it decays to a quark and an anti-quark of a different flavour.
Due to the \(Z^0\) boson having no charge, it always decays to a fermion
and its anti-particle, in the leptonic channel this might be for example
an electron - positron pair, in the hadronic channel an up and anti-up
quark pair. This thesis examines the hadronic decay channel, where both
vector bosons decay to two quarks.

The quantum chromodynamics (QCD) describes the strong interaction of
particles. It applies to all particles carrying colour (e.g.~quarks).
The force is mediated by gluons. These bosons carry colour as well,
although they don't carry only one colour but rather a combination of a
colour and an anticolour, and can therefore interact with themselves and
exist in eight different variants. As a result of this, processes where
a gluon decays into two gluons are possible. Furthermore the strength of
the strong force, binding colour carrying particles, increases with
their distance making it at a certain point more energetically efficient
to form a new quark - antiquark pair than separating the two particles
even further. This effect is known as colour confinement. Due to this
effect, colour carrying particles can't be observed directly, but rather
form so called jets that cause hadronic showers in the detector. Those
jets are cone like structures made of hadrons and other particles. The
effect is called hadronisation \autocite{HADRONIZATION}.

\hypertarget{shortcomings-of-the-standard-model}{%
\subsubsection{Shortcomings of the Standard
Model}\label{shortcomings-of-the-standard-model}}

While being very successful in describing the effects observed in
particle colliders or the particles reaching earth from cosmological
sources, the Standard Model still has several shortcomings.

\begin{itemize}
\tightlist
\item
  \textbf{Gravity}: as already noted, the standard model doesn't include
  gravity as a force.
\item
  \textbf{Dark Matter}: observations of the rotational velocity of
  galaxies can't be explained by the known matter. Dark matter currently
  the most popular theory to explain those.
\item
  \textbf{Matter-antimatter asymmetry}: The amount of matter vastly
  outweights the amount of antimatter in the observable universe. This
  can't be explained by the standard model, which predicts a similar
  amount of matter and antimatter.
\item
  \textbf{Symmetries between particles}: Why do exactly three
  generations of fermions exist? Why is the charge of a quark exactly
  one third of the charge of a lepton? How are the masses of the
  particles related? Those and more questions cannot be answered by the
  standard model.
\item
  \textbf{Hierarchy problem}: The weak force is approximately
  \(10^{24}\) times stronger than gravity and so far, there's no
  satisfactory explanation as to why that is.
\end{itemize}

\hypertarget{sec:qs}{%
\subsection{Excited quark states}\label{sec:qs}}

One category of theories that try to explain the symmetries between
particles of the standard model are the composite quark models. Those
state, that quarks consist of some particles unknown so far. This could
explain the symmetries between the different fermions. A common
prediction of those models are excited quark states (q*, q**,
q***\ldots) \autocite{QSTAR_THEORY}. Similar to atoms, that can be
excited by the absorption of a photon and can then decay again under
emission of a photon with an energy corresponding to the excited state,
those excited quark states could decay under the emission of any boson.
Quarks are measured to be smaller than \(10^{-18}\) m. This corresponds
to an energy scale of approximately 1 TeV. Therefore the excited quark
states are expected to be in that energy region. That will cause the
emitted boson to be highly boosted.

\begin{figure}
\centering
\feynmandiagram [large, horizontal=qs to v] {
  a -- qs -- b,
  qs -- [fermion, edge label=\(q*\)] v,
  q1 [particle=\(q\)] -- v -- w [particle=\(W\)],
  q2 [particle=\(q\)] -- w -- q3 [particle=\(q\)],
};
\caption{Feynman diagram showing the decay of a q* particle to a W boson and a quark with the W boson decaying
hadronically.} \label{fig:qsfeynman}
\end{figure}

This thesis will search data collected by the CMS in the years 2016,
2017 and 2018 for the decay of a single excited quark state q* to a
quark and a vector boson . An example of a q* decaying to a quark and a
W boson can be seen in Fig.~\ref{fig:qsfeynman}. As explained in
Sec.~\ref{sec:sm}, the vector boson can then decay either in the
hadronic or leptonic decay channel. This research investigates only the
hadronic channel with two quarks in the final state. Because the boson
is highly boosted, those will be very close together and therefore
appear to the detector as only one jet. This means that the investigated
decay of a q* particle will have two jets in the final state and will
therefore be hard to distinguish from the QCD background described in
Sec.~\ref{sec:qcdbg}.

The choice of only examining the decay of the q* particle to the vector
bosons is motivated by the branching ratios calculated for the decay
\autocite{QSTAR_THEORY}:

\begin{longtable}[]{@{}llll@{}}
\caption{Branching ratios of the decaying q* particle.}\tabularnewline
\toprule
decay mode & br. ratio {[}\%{]} & decay mode & br. ratio
{[}\%{]}\tabularnewline
\midrule
\endfirsthead
\toprule
decay mode & br. ratio {[}\%{]} & decay mode & br. ratio
{[}\%{]}\tabularnewline
\midrule
\endhead
\(U^* \rightarrow ug\) & 83.4 & \(D^* \rightarrow dg\) &
83.4\tabularnewline
\(U^* \rightarrow dW\) & 10.9 & \(D^* \rightarrow uW\) &
10.9\tabularnewline
\(U^* \rightarrow u\gamma\) & 2.2 & \(D^* \rightarrow d\gamma\) &
0.5\tabularnewline
\(U^* \rightarrow uZ\) & 3.5 & \(D^* \rightarrow dZ\) &
5.1\tabularnewline
\bottomrule
\end{longtable}

The decay to the vector bosons have the second highest branching ratio.
The decay to a gluon and a quark is the dominant decay, but virtually
impossible to distinguish from the QCD background described in the next
section. This makes the decay to the vector bosons the most promising
choice.

To reconstruct the mass of the q* particle from an event successfully
recognized to be the decay of such a particle, the dijet invariant mass
has to be calculated. This can be achieved by adding the four momenta of
the two jets in the final state, vectors consisting of the energy and
momentum of a particle, together. From the four momentum it's easy to
derive the mass by solving \(E=\sqrt{p^2 + m^2}\) for m.

A search for the excited quark predicted by this theory has already been
investigated in \autocite{PREV_RESEARCH} analysing data with an
integrated luminosity of \(\SI{35.92}{\per\femto\barn}\) recorded by the
CMS experiment in 2016, excluding the q* particle up to a mass of 5 TeV
resp. 4.7 TeV for the decay to qW resp. qZ analysing the hadronic decay
of the vector boson. This thesis aims to either exclude the particle to
higher masses or find a resonance showing its existence using more data
that is available now.

\hypertarget{sec:qcdbg}{%
\subsubsection{Quantum Chromodynamic background}\label{sec:qcdbg}}

In this thesis, a decay with two jets in the final state will be
analysed. Therefore it will be hard to distinguish the signal processes
from QCD effects. Those can also produce two jets in the final state, as
can be seen in Fig.~\ref{fig:qcdfeynman}. They are also happening very
often in a proton proton collision, as it is happening in the Large
Hadron Collider. This is caused by the structure of the proton. It not
only consists of three quarks, called valence quarks, but also of a lot
of quark-antiquark pairs connected by gluons, called the sea quarks,
that exist due to the self interaction of the gluons binding the three
valence quarks. Therefore the QCD multijet backgroubd is the dominant
background of the signal described in Sec.~\ref{sec:qs}.

\begin{figure}
\centering
\feynmandiagram [horizontal=v1 to v2] {
    q1 [particle=\(q\)] -- [fermion] v1 -- [gluon] g1 [particle=\(g\)],
    v1 -- [gluon] v2,
    q2 [particle=\(q\)] -- [fermion] v2 -- [gluon] g2 [particle=\(g\)],
};
\feynmandiagram [horizontal=v1 to v2] {
    g1 [particle=\(g\)] -- [gluon] v1 -- [gluon] g2 [particle=\(g\)],
    v1 -- [gluon] v2,
    g3 [particle=\(g\)] -- [gluon] v2 -- [gluon] g4 [particle=\(g\)],
};
\caption{Two examples of QCD processes resulting in two jets.} \label{fig:qcdfeynman}
\end{figure}

\newpage

\hypertarget{experimental-setup}{%
\section{Experimental Setup}\label{experimental-setup}}

Following on, the experimental setup used to gather the data analysed in
this thesis will be described.

\hypertarget{large-hadron-collider}{%
\subsection{Large Hadron Collider}\label{large-hadron-collider}}

The Large Hadron Collider \autocite{LHC_MACHINE} is the world's largest
and most powerful particle accelerator. It has a circumference of 27 km
and can accelerate two beams of protons to an energy of 6.5 TeV
resulting in a collision with a centre of mass energy of 13 TeV. It is
home to several experiments, between others the Compact Muon Solenoid
(CMS), which is the one used for the search presented in this thesis. It
is a general-purpose detector to investigate the particles that form
during particle collisions. The LHC may also be used for colliding ions
but this ability is to no interest for this research.

Because of the collision of two beams with particles of the same charge,
it is not possible to use the same magnetic field for both beams.
Therefore opposite magnetic-dipole fields exist in both rings to be able
to accelerate the beams in opposite directions.

Particle colliders are characterized by their luminosity L. It is a
quantity to be able to calculate the number of events per second
generated in a collision by \(\dot{N}_{event} = L\sigma_{event}\) with
\(\sigma_{event}\) being the cross section of the event. The LHC aims
for a peak luminosity of \(10^{34}\si{\per\square\centi\metre\per\s}\).
This is achieved by colliding two bunches of protons every
\(\SI{25}{ns}\). Each proton beam thereby consists of 2'808 bunches.
Furthermore, the integrated Luminosity, defined as \(\int Ldt\), can be
used to describe the amount of data collected over a specific time
interval.

\hypertarget{compact-muon-solenoid}{%
\subsection{Compact Muon Solenoid}\label{compact-muon-solenoid}}

The data used in this thesis was recorded by the Compact Muon Solenoid
(CMS) \autocite{CMS_REPORT}. It is one of the four main experiments at
the Large Hadron Collider. It can detect all elementary particles of the
standard model except neutrinos. For that, it has an onion like setup,
as can be seen in Fig.~\ref{fig:cms_setup}. The particles produced in a
collision first go through a tracking system. They then pass an
electromagnetic as well as a hadronic calorimeter. This part is
surrounded by a superconducting solenoid that generates a magenetic
field of 3.8 T. Outside of the solenoid are big muon chambers. In 2016
the CMS captured data of an integrated luminosity of
\(\SI{37.80}{\per\femto\barn}\). In 2017 it collected
\(\SI{44.98}{\per\femto\barn}\) and in 2018
\(\SI{63.67}{\per\femto\barn}\) \autocite{CMS_LUMI}. The amount of data
usable for research is \(\SI{35.92}{\per\femto\barn}\),
\(\SI{41.53}{\per\femto\barn}\) and \(\SI{59.74}{\per\femto\barn}\) for
the years 2016, 2017 and 2018, totalling to
\(\SI{137.19}{\per\femto\barn}\) of data.

\begin{figure}
\hypertarget{fig:cms_setup}{%
\centering
\includegraphics{./figures/cms_setup.png}
\caption{The setup of the Compact Muon Solenoid showing its onion like
structure, the different detector parts and where different particles
are detected \autocite{CMS_PLOT}.}\label{fig:cms_setup}
}
\end{figure}

\hypertarget{coordinate-conventions}{%
\subsubsection{Coordinate conventions}\label{coordinate-conventions}}

Per convention, the z axis points along the beam axis in the direction
of the magnetic fields of the solenoid, the y axis upwards and the x
axis horizontal towards the LHC centre. The azimuthal angle \(\phi\),
which describes the angle in the x - y plane, the polar angle
\(\theta\), which describes the angle in the y - z plane and the
pseudorapidity \(\eta\), which is defined as
\(\eta = -ln\left(tan\frac{\theta}{2}\right)\) are also introduced. The
coordinates are visualised in Fig.~\ref{fig:cmscoords}. Furthermore, to
describe a particle's momentum, often the transverse momentum, \(p_t\)
is used. It is the component of the momentum transversal to the beam
axis. Before the collision, the transverse momentum is zero, therefore,
due to conservation of energy, the sum of all transverse momenta after
the collision has to be zero, too. If this is not the case for the
detected events, it implies particles that weren't detected such as
neutrinos.

\begin{figure}
\hypertarget{fig:cmscoords}{%
\centering
\includegraphics[width=0.6\textwidth,height=\textheight]{./figures/cms_coordinates.png}
\caption{Coordinate conventions of the CMS illustrating the use of
\(\eta\) and \(\phi\). The Z axis is in beam direction
\autocite{COORD_PLOT}.}\label{fig:cmscoords}
}
\end{figure}

\hypertarget{the-tracking-system}{%
\subsubsection{The tracking system}\label{the-tracking-system}}

The tracking system is built of two parts, closest to the collision is a
pixel detector and around that silicon strip sensors. They are used to
measure their charge sign, direction and momentum to be later able to
reconstruct the tracks of charged particles. They are as close to the
collision as possible to be able to identify secondary vertices.

\hypertarget{the-electromagnetic-calorimeter}{%
\subsubsection{The electromagnetic
calorimeter}\label{the-electromagnetic-calorimeter}}

The electromagnetic calorimeter measures the energy of photons and
electrons. It is made of tungstate crystal and photodetectors. When
passed by particles, the crystal produces scintillation light in
proportion to the particle's energy. This light is measured by the
photodetectors that convert it to an electrical signal. To measure a
particles energy, it has to leave its whole energy in the ECAL, which is
true for photons and electrons, but not for other particles such as
hadrons and muons. Those interact with matter differently and therefore
only leave some energy in the ECAL but are not stopped by it.

\hypertarget{the-hadronic-calorimeter}{%
\subsubsection{The hadronic
calorimeter}\label{the-hadronic-calorimeter}}

The hadronic calorimeter (HCAL) is used to detect high energy hadronic
particles. It surrounds the ECAL and is made of alternating layers of
active and absorber material. While the absorber material with its high
density causes the hadrons to shower, the active material then detects
those showers and measures their energy, similar to how the ECAL works.

\hypertarget{the-solenoid}{%
\subsubsection{The solenoid}\label{the-solenoid}}

The solenoid, giving the detector its name, is one of the most important
features. It creates a magnetic field of 3.8 T and therefore makes it
possible to measure momentum of charged particles by bending their
tracks.

\hypertarget{the-muon-system}{%
\subsubsection{The muon system}\label{the-muon-system}}

Outside of the solenoid, but still in its return yoke, there is only the
muon system. It consists of three types of gas detectors, the drift
tubes, cathode strip chambers and resistive plate chambers. It covers a
total of \(0 < |\eta| < 2.4\). The muons are the only detected
particles, that can pass all the other systems without a significant
energy loss.

\hypertarget{the-trigger-system}{%
\subsubsection{The Trigger system}\label{the-trigger-system}}

The CMS features a two level trigger system. It is necessary because the
detector is unable to process all the events due to limited bandwidth.
The Level 1 trigger reduces the event rate from 40 MHz to 100 kHz, the
software based High Level trigger is then able to further reduce the
rate to 1 kHz. The Level 1 trigger uses the data from the
electromagnetic and hadronic calorimeters as well as the muon chambers
to decide whether to keep an event. The High Level trigger uses a
streamlined version of the CMS offline reconstruction software for its
decision making.

\hypertarget{the-particle-flow-algorithm}{%
\subsubsection{The Particle Flow
algorithm}\label{the-particle-flow-algorithm}}

The particle flow algorithm \autocite{PARTICLE_FLOW} is used to identify
and reconstruct all the particles arising from the proton - proton
collision by using all the information available from the different
sub-detectors. It does so by extrapolating the tracks through the
different calorimeters and associating clusters they cross with them.
The set of clusters already associated to a track is then no more used
for the reconstruction of other particles. This is first done for muons
and then for charged hadrons, so a muon can't give rise to a wrongly
identified charged hadron. Due to Bremsstrahlung photon emission,
electrons are harder to reconstruct. For them a specific track
reconstruction algorithm is used \autocite{ERECO}. After identifying
charged hadrons, muons and electrons, all remaining clusters within the
HCAL correspond to neutral hadrons and within ECAL to photons. When the
list of particles and their corresponding deposits is established, it
can be used to determine the particles four momenta. From that, the
missing transverse energy can be calculated and tau particles can be
reconstructed by their decay products.

\hypertarget{jet-clustering}{%
\subsection{Jet clustering}\label{jet-clustering}}

Because of the hadronisation it is not possible to uniquely identify the
originating particle of a jet. Nonetheless, several algorithms exist to
help with this problem. The algorithm used in this thesis is the
anti-\(k_t\) \autocite{ANTIKT} clustering algorithm. It arises from a
generalization of several other clustering algorithms, namely the
\(k_t\), Cambridge/Aachen and SISCone clustering algorithms.

The anti-\(k_t\) clustering algorithm associates high \(p_t\) particles
with the lower \(p_t\) particles surrounding them within a radius R in
the \(\eta\) - \(\phi\) plane forming cone like jets. If two jets
overlap, the jets shape is changed according to its hardness in regards
to the transverse momentum. A softer particles jet will change its shape
more than a harder particles. A visual comparison of four different
clustering algorithms can be seen in Fig.~\ref{fig:antiktcomparison}. It
shows, that the jets reconstructed using the anti-\(k_t\) algorithm have
the clearest cone like shape and is therefore chosen for this thesis.
For this analysis, a radius of 0.8 is used.

\begin{figure}
\hypertarget{fig:antiktcomparison}{%
\centering
\includegraphics{./figures/antikt-comparision.png}
\caption{Comparison of the \(k_t\), Cambridge/Aachen, SISCone and
anti-\(k_t\) algorithms clustering a sample parton-level event with many
random soft \enquote{ghosts}
\autocite{ANTIKT}.}\label{fig:antiktcomparison}
}
\end{figure}

Furthermore, to approximate the mass of a heavy particle that caused a
jet, the soft-drop mass \autocite{SDM} can be used. In its calculation,
to reduce the contamination from initial state radiation, underlying
event and multiple hadron scattering, wide angle soft particles are
removed from the jet. It therefore is more accurate in determining the
mass of a particle causing a jet than taking the mass of all constituent
particles of the jet combined.

\newpage

\hypertarget{sec:moa}{%
\section{Method of analysis}\label{sec:moa}}

This section gives an overview over how the data collected by CMS is
going to be analysed to be able to either exclude the q* particle to
even higher masses than already done or confirm its existence.

As described in Sec.~\ref{sec:qs}, the decay of the q* particle to a
quark and a vector boson with the vector boson then decaying
hadronically will be investigated. This is the second most probable
decay of the q* particle and easier to analyse than the dominant decay
to a quark and a gluon. Therefore it is a good choice for this research.
It results in two jets, because the decay products of the heavy vector
boson are highly boosted, causing them to be very close together and
therefore be reconstructed as one jet. The dijet invariant mass of the
two jets in the final state is then used to reconstruct the mass of the
q* particle. The only background considered is the QCD multijet
background described in Sec.~\ref{sec:qcdbg}. A selection using
different kinematic variables as well as a tagger to identify jets from
the decay of a vector boson is introduced to reduce the background and
increase the sensitivity for the signal. After that, it will be looked
for a peak in the dijet invariant mass distribution at the resonance
mass of the q* particle.

The data studied were collected by the CMS experiment in the years 2016,
2017 and 2018. They are analysed with the Particle Flow algorithm to
reconstruct jets and all the other particles forming during the
collision. The jets are then clustered using the anti-\(k_t\) algorithm
with the distance parameter R being 0.8.

The analysis will be conducted in two steps. First, only the data
collected by the CMS experiment in 2016 with an integrated luminosity of
\(\SI{35.92}{\per\femto\barn}\) will be used to compare the results to
the previous analysis \autocite{PREV_RESEARCH}. Then the combined data
from 2016, 2017 and 2018 with an integrated luminosity of
\(\SI{137.19}{\per\femto\barn}\) will be used to improve the previously
set limits for the mass of the q* particle. Also, two different
V-tagging methods will be used to compare their performance. One based
on the N-subjettiness variable used in the previous research
\autocite{PREV_RESEARCH}, the other being a novel approach using a deep
neural network, that will be explained in the following.

\hypertarget{signal-and-background-modelling}{%
\subsection{Signal and Background
modelling}\label{signal-and-background-modelling}}

Before looking at the data collected by the CMS experiment, Monte Carlo
simulations \autocite{MONTECARLO} of background and signal are used to
understand how the data is expected to look like. To replicate the QCD
background processes, the different particle interactions that take
place in a proton - proton collision are simulated using the
probabilities provided by the Standard Model by calculating the cross
sections of the different Feynman diagrams. This was done using MadGraph
\autocite{MADGRAPH} and Pythia 8 \autocite{PYTHIA8}. Later on, also
detector effects (like its limited resolution) are applied to make sure,
they look like real data coming from the CMS detector.

The q* signal samples are simulated by the probabilities given by the q*
theory \autocite{QSTAR_THEORY} and assuming a cross section of
\(\SI{1}{\pico\barn}\). The simulation was done using MadGraph
\autocite{MADGRAPH} for eleven masspoints between 1.6 TeV and 7 TeV.
Because of the expected high mass, the signal width will be dominated by
the resolution of the detector, not by the natural resonance width.

The dijet invariant mass distribution of the QCD background is expected
to smoothly fall with higher masses. It is therefore fitted using the
following smooth falling function with three parameters p0, p1, p2:
\begin{equation}
\frac{dN}{dm_{jj}} = \frac{p_0 \cdot ( 1 - m_{jj} / \sqrt{s} )^{p_2}}{ (m_{jj} / \sqrt{s})^{p_1}}
\end{equation} Whereas \(m_{jj}\) is the invariant mass of the dijet and
\(p_0\) is a normalisation parameter. It is the same function as used in
the previous research studying 2016 data only but was also found to
reliably reproduce the background shape of the other years.

The signal is fitted using a double sided crystal ball function. It has
six parameters:

\begin{itemize}
\tightlist
\item
  mean: the functions mean, in this case the resonance mass
\item
  sigma: the functions width, in this case the resolution of the
  detector due to the very small resonance width expected
\item
  n1, n2, alpha1, alpha2: parameters influencing the shape of the left
  and right tail
\end{itemize}

A gaussian and a poisson function have also been studied but found to be
not able to reproduce the signal shape as they couldn't model the tails
on both sides of the peak.

A linear combination of the signal and background function is then
fitted to a toy dataset with gaussian errors obtained by adding
simulated background and signal. The resulting coefficients of said
combination then show the expected signal rate for the simulated signal
cross section of \(\SI{1}{\pico\barn}\). An example of such a fit can be
seen in Fig.~\ref{fig:cb_fit}. In this figure, a binning of 200 GeV is
used for presentational purposes. The analysis itself is conducted using
a 1 GeV binning. It can be seen that the fit works very well and
therefore confirms the functions chosen to model signal and background.
This is supported by a \(\chi^2 /\) ndof of 0.5 and a found mean for the
signal at 2999 \(\pm\) 23 \(\si{\giga\eV}\) which is in very good
agreement with the expected 3000 GeV mean. Those numbers clearly show
that the method in use is able to successfully describe the simulated
toy data.

\begin{figure}
\hypertarget{fig:cb_fit}{%
\centering
\includegraphics{./figures/cb_fit.pdf}
\caption{Combined fit of signal and background on a toy dataset with
gaussian errors and a simulated resonance mass of 3
TeV.}\label{fig:cb_fit}
}
\end{figure}

\newpage

\hypertarget{preselection-and-data-quality}{%
\section{Preselection and data
quality}\label{preselection-and-data-quality}}

To reduce the background and increase the signal sensitivity, a
selection of events that satisfy certain requirements is introduced.
This is done taking into account different variables. The selection is
divided into two stages. The first one (the preselection) introduces
some general physics motivated selections using kinematic variables and
is also used to ensure a high trigger efficiency. In the second part,
the discriminants introduced by different taggers will be used to
identify jets originating from the decay of a vector boson. After the
preselection, it is made sure, that the simulated samples represent the
real data well by comparing the data with the simulation in the signal
as well as a sideband region, where no signal events are expected.

\hypertarget{preselection}{%
\subsection{Preselection}\label{preselection}}

First, all events are cleaned of jets with a
\(p_t < \SI{200}{\giga\eV}\) and a pseudorapidity \(|\eta| > 2.4\). This
is to discard soft background and to make sure the particles are in the
barrel region of the detector for an optimal track reconstruction.
Furthermore, all events with one of the two highest \(p_t\) jets having
an angular separation smaller than 0.8 from any electron or muon are
discarded to allow future use of the data in studies investigating the
leptonic decay channel of the vector boson.

From a decaying q* particle, two jets are expected in the final state.
The dijet invariant mass of those two jets will be used to reconstruct
the mass of the q* particle. Therefore a cut is added to have at least 2
jets, accounting for the possibility of more jets, for example caused by
gluon radiation of a quark or other QCD effects. If this is the case,
the two jets with the highest \(p_t\) are used for the reconstruction of
the q* mass. The distributions of the number of jets before and after
the selection can be seen in Fig.~\ref{fig:njets}. The light blue filled
histogram shows the QCD background, the green and red line show the
expected signal for a decay of the q* particle to qW with a mass of 2
TeV (green) and 5 TeV (red). By comparing the left to the right
distributions, it is clear that the requirement of at least 2 jets
reduces the background significantly while keeping mostly all signal
events.

\begin{figure}
\begin{minipage}{\textwidth}
\centering\textbf{Comparison for 2016}
\end{minipage}
\begin{minipage}{0.5\textwidth}
\includegraphics{./figures/2016/v1_Cleaner_N_jets_stack.eps}
\end{minipage}
\begin{minipage}{0.5\textwidth}
\includegraphics{./figures/2016/v1_Njet_N_jets_stack.eps}
\end{minipage}
\begin{minipage}{\textwidth}
\vspace{0.1cm}
\centering\textbf{Comparison for the combined dataset}
\end{minipage}
\begin{minipage}{0.5\textwidth}
\includegraphics{./figures/combined/v1_Cleaner_N_jets_stack.eps}
\end{minipage}
\begin{minipage}{0.5\textwidth}
\includegraphics{./figures/combined/v1_Njet_N_jets_stack.eps}
\end{minipage}
\caption{Comparison of the number of jet distribution before and after the cut at number of jets $\ge$ 2. \newline
Left: distribution before the cut. Right: distribution after the cut. \newline
The signal curves are amplified by a factor of 10'000 to be visible.}
\label{fig:njets}
\end{figure}

The next selection is done using \(\Delta\eta = |\eta_1 - \eta_2|\),
with \(\eta_1\) and \(\eta_2\) being the \(\eta\) of the two jets with
the highest transverse momentum. The q* particle is expected to be very
heavy in regards to the center of mass energy of the collision and will
therefore be almost stationary. Its decay products should therefore be
close to back to back, which means the \(\Delta\eta\) distribution is
expected to peak at zero. At the same time, particles originating from
QCD effects are expected to have a higher \(\Delta\eta\). To maintain
comparability, the same selection as in previous research of
\(\Delta\eta \le 1.3\) is used. In the top two distributions of
Fig.~\ref{fig:deta}, this cut is marked by a vertical black line. The
difference in the \(m_{jj}\) distribution shows the strong reduction of
the background by this cut.

\begin{figure}
\begin{minipage}{\textwidth}
\centering\textbf{$\Delta\eta$ cut with signal amplified by 10'000}
\end{minipage}
\begin{minipage}{0.5\textwidth}
\includegraphics{./figures/2016/v1_Njet_deta_stack.eps}
\end{minipage}
\begin{minipage}{0.5\textwidth}
\includegraphics{./figures/combined/v1_Njet_deta_stack.eps}
\end{minipage}
\begin{minipage}{\textwidth}
\vspace{0.1cm}
\centering\textbf{$m_{jj}$ distribution before the cut}
\end{minipage}
\begin{minipage}{0.5\textwidth}
\includegraphics{./figures/2016/v1_Njet_invMass_stack.eps}
\end{minipage}
\begin{minipage}{0.5\textwidth}
\includegraphics{./figures/combined/v1_Njet_invMass_stack.eps}
\end{minipage}
\begin{minipage}{\textwidth}
\vspace{0.1cm}
\centering\textbf{$m_{jj}$ distribution after the cut}
\end{minipage}
\begin{minipage}{0.5\textwidth}
\includegraphics{./figures/2016/v1_Eta_invMass_stack.eps}
\end{minipage}
\begin{minipage}{0.5\textwidth}
\includegraphics{./figures/combined/v1_Eta_invMass_stack.eps}
\end{minipage}
\caption{Demonstration of the effect of the $\Delta\eta$ cut at $\Delta\eta \le 1.3$ on the $m_{jj}$ distribution.
\newline
Left: Partial dataset of $\SI{35.92}{\per\femto\barn}$ Right: Full dataset of $\SI{137.19}{\per\femto\barn}$.
}
\label{fig:deta}
\end{figure}

The last selection in the preselection is on the dijet invariant mass:
\(m_{jj} \ge \SI{1050}{\giga\eV}\). It is important for a trigger
efficiency higher than 99 \% with a soft-drop mass cut of
\(m_{SDM} > \SI{65}{\giga\eV}\) applied to the jet with the highest
transverse momentum. A comparison of the \(m_{jj}\) distribution before
and after the selection can be seen in Fig.~\ref{fig:invmass}. Also, it
has a huge impact on the background because it usually consists of
lighter particles. The q* on the other hand is expected to have a very
high invariant mass of more than 1 TeV. The \(m_{jj}\) distribution
should be a smoothly falling function for the QCD background and peak at
the simulated resonance mass for the signal events.

\begin{figure}
\begin{minipage}{\textwidth}
\centering\textbf{Comparison for 2016}
\end{minipage}
\begin{minipage}{0.5\textwidth}
\includegraphics{./figures/2016/v1_Eta_invMass_stack.eps}
\end{minipage}
\begin{minipage}{0.5\textwidth}
\includegraphics{./figures/2016/v1_invmass_invMass_stack.eps}
\end{minipage}
\begin{minipage}{\textwidth}
\centering\textbf{Comparison for the combined dataset}
\end{minipage}
\begin{minipage}{0.5\textwidth}
\includegraphics{./figures/combined/v1_Eta_invMass_stack.eps}
\end{minipage}
\begin{minipage}{0.5\textwidth}
\includegraphics{./figures/combined/v1_invmass_invMass_stack.eps}
\end{minipage}
\caption{Comparison of the invariant mass distribution before and after the cut at $m_{jj} \ge \SI{1050}{\giga\eV}$. It
shows the expected smooth falling functions of the background whereas the signal peaks at the simulated resonance mass.
\newline
Left: distribution before the cut. Right: distribution after the cut.}
\label{fig:invmass}
\end{figure}

After the preselection, the signal efficiency for q* decaying to qW of
2016 ranges from 48 \% for 1.6 TeV to 49 \% for 7 TeV. Decaying to qZ,
the efficiencies are between 45 \% (1.6 TeV) and 50 \% (7 TeV). The
amount of background after the preselection is reduced to 5 \% of the
original events. For the combined data of the three years those values
look similar. Decaying to qW signal efficiencies between 49 \% (1.6 TeV)
and 56 \% (7 TeV) are reached, whereas the efficiencies when decaying to
qZ are in the range of 46 \% (1.6 TeV) to 50 \% (7 TeV). Here, the
background could be reduced to 8 \% of the original events. So while
keeping around 50 \% of the signal, the background was already reduced
to less than a tenth.

\hypertarget{data---monte-carlo-comparison}{%
\subsection{Data - Monte Carlo
Comparison}\label{data---monte-carlo-comparison}}

To ensure that the simulation reproduces the data well, the simulated
QCD background sample is now being compared to the data of the
corresponding year collected by the CMS detector. This is done for the
partial dataset of year 2016 and for the full dataset separately. In
Fig.~\ref{fig:data-mc}, this comparison can be seen for the
distributions of the variables used during the preselection. To
compensate for the simulation overpredicting the scale of the QCD
background, histograms are rescaled, so that the dijet invariant mass
distributions of data and simulation have the same integral. The
invariant mass distribution of the data of 2016 falls slightly faster
than the simulated one, apart from that, the distributions are in very
good agreement.

For analysing the data from the CMS experiment, jet energy corrections
have to be applied. Those are to calibrate the ECAL and HCAL parts of
the CMS, so the energy of the detected particles can be measured
correctly. The corrections used were recommended by the CMS group for
internal use \autocite{JEC}.

\begin{figure}
\begin{minipage}{0.5\textwidth}
\centering\textbf{2016}
\end{minipage}
\begin{minipage}{0.5\textwidth}
\centering\textbf{Combined}
\end{minipage}
\begin{minipage}{0.5\textwidth}
\includegraphics{./figures/2016/DATA/v1_invmass_N_jets.eps}
\end{minipage}
\begin{minipage}{0.5\textwidth}
\includegraphics{./figures/combined/DATA/v1_invmass_N_jets.eps}
\end{minipage}
\begin{minipage}{0.5\textwidth}
\includegraphics{./figures/2016/DATA/v1_invmass_deta.eps}
\end{minipage}
\begin{minipage}{0.5\textwidth}
\includegraphics{./figures/combined/DATA/v1_invmass_deta.eps}
\end{minipage}
\begin{minipage}{0.5\textwidth}
\includegraphics{./figures/2016/DATA/v1_invmass_invMass.eps}
\end{minipage}
\begin{minipage}{0.5\textwidth}
\includegraphics{./figures/combined/DATA/v1_invmass_invMass.eps}
\end{minipage}
\caption{Comparison of data with the Monte Carlo simulation.}
\label{fig:data-mc}
\end{figure}

\hypertarget{sideband-region}{%
\subsubsection{Sideband region}\label{sideband-region}}

The sideband region is introduced to make sure no bias in the data and
Monte Carlo simulation is introduced and also to verify the agreement of
data and simulation. It is a region in which no signal event is
expected. Again, data and the Monte Carlo simulation are compared. For
this analysis, the region where the soft-drop mass of both of the two
jets with the highest transverse momentum is more than 105 GeV is
chosen. 105 GeV is well above the mass of 91 GeV of the Z boson, the
heavier vector boson. Therefore it is very unlikely, that an event with
a particle heavier than that originates from the decay of a vector
boson. In Fig.~\ref{fig:sideband}, the comparison of data with
simulation in the sideband region can be seen for the soft-drop mass
distribution as well as the dijet invariant mass distribution. It can be
seen, that in the sideband region data and simulation match very well.

\begin{figure}
\begin{minipage}{\textwidth}
\centering\textbf{2016}
\end{minipage}
\begin{minipage}{0.5\textwidth}
\includegraphics{./figures/2016/sideband/v1_SDM_SoftDropMass_1.eps}
\end{minipage}
\begin{minipage}{0.5\textwidth}
\includegraphics{./figures/2016/sideband/v1_SDM_invMass.eps}
\end{minipage}
\begin{minipage}{\textwidth}
\centering\textbf{Combined dataset}
\end{minipage}
\begin{minipage}{0.5\textwidth}
\includegraphics{./figures/combined/sideband/v1_SDM_SoftDropMass_1.eps}
\end{minipage}
\begin{minipage}{0.5\textwidth}
\includegraphics{./figures/combined/sideband/v1_SDM_invMass.eps}
\end{minipage}
\caption{Comparison of data with the Monte Carlo simulation in the sideband region.}
\label{fig:sideband}
\end{figure}

\newpage

\hypertarget{jet-substructure-selection}{%
\section{Jet substructure selection}\label{jet-substructure-selection}}

So far it was made sure, that the data collected by the CMS and the
simulation are in good agreement after the preselection and no unwanted
side effects are introduced in the data by the used cuts. Now another
selection has to be introduced, to further reduce the background to be
able to look for the hypothetical signal events in the data.

This is done by distinguishing between QCD and signal events using a
tagger to identify jets coming from a vector boson. Two different
taggers will be used to later compare their performance. The decay
analysed includes either a W or Z boson, which are, compared to the
particles in QCD effects, very heavy. This can be used by adding a
selection using the soft-drop mass of a jet. The soft-drop mass of at
least one of the two leading jets is expected to be within
\(\SI{35}{\giga\eV}\) and \(\SI{105}{\giga\eV}\). This cut already
provides a good separation of QCD and signal events, on which the two
taggers presented next can build.

Both taggers provide a discriminant to choose whether an event can be
classified as the decay of a vector boson or originates from QCD
effects. This value will be optimized afterwards to make sure the
maximum signal significance possible is achieved.

\hypertarget{n-subjettiness}{%
\subsection{N-Subjettiness}\label{n-subjettiness}}

The N-subjettiness \autocite{TAU21} \(\tau_N\) is a jet shape parameter
designed to identify boosted hadronically-decaying objects. When a
vector boson decays hadronically, it produces two quarks each causing a
jet. But in the case of the decay of a q* particle, the vector boson is
highly boosted and so are its decay products. They therefore appear,
after applying a clustering algorithm, as just one jet. This algorithm
now tries to figure out, whether one jet might consist of two subjets by
using the kinematics and positions of the constituent particles of this
jet. The N-subjettiness is defined as

\begin{equation} \tau_N = \frac{1}{d_0} \sum_k p_{T,k} \cdot \text{min}\{ \Delta R_{1,k}, \Delta R_{2,k}, …, \Delta
R_{N,k} \} \end{equation}

with k going over the constituent particles in a given jet, \(p_{T,k}\)
being their transverse momenta and
\(\Delta R_{J,k} = \sqrt{(\Delta\eta)^2 + (\Delta\phi)^2}\) being the
distance of a candidate subjet J and a constituent particle k in the
\(\eta\) - \(\phi\) plane. It quantifies to what degree a jet can be
regarded as a jet composed of \(N\) subjets. In the hadronic decay of a
highly boosted vector boson, two subjets are expected. Therefore it
seems that \(\tau_2\) would be a good choice for a discriminant.
However, experiments showed, that rather than using \(\tau_2\) directly,
the ratio \(\tau_{21} = \tau_2/\tau_1\) is a better discriminant between
QCD effects and events originating from the decay of a boosted vector
boson.

The lower the \(\tau_{21}\) is, the more likely a jet is caused by the
decay of a vector boson. Therefore a selection will be introduced, so
that \(\tau_{21}\) of one candidate jet is smaller then some value that
will be determined by the optimisation process described in the next
chapter. As candidate jet the one of the two highest \(p_t\) jets
passing the soft-drop mass window is used. If both of them pass, the one
with higher \(p_t\) is chosen.

\hypertarget{deepak8}{%
\subsection{DeepAK8}\label{deepak8}}

The DeepAK8 tagger \autocite{DEEP_BOOSTED} uses a deep neural network
(DNN) to identify decays originating in a vector boson. It reduces the
background rate by up to a factor of \textasciitilde10 with the same
signal efficiency compared to non-machine-learning approaches like the
N-Subjettiness method. This is shown by Fig.~\ref{fig:ak8_eff}, showing
a comparison of background and signal efficiency of the DeepAK8 tagger
with, between others, the \(\tau_{20}\) tagger that is also used in this
analysis.

\begin{figure}
\hypertarget{fig:ak8_eff}{%
\centering
\includegraphics[width=0.6\textwidth,height=\textheight]{./figures/deep_ak8.pdf}
\caption{Comparison of tagger efficiencies, showing, between others, the
DeepAK8 and \(\tau_{21}\) tagger used in this analysis
\autocite{DEEP_BOOSTED}.}\label{fig:ak8_eff}
}
\end{figure}

The DNN has two input lists for each jet. The first is a list of up to
100 constituent particles of the jet, sorted by decreasing \(p_t\). A
total of 42 properties of the particles such es \(p_t\), energy deposit,
charge and the angular momentum between the particle and the jet or
subjet axes are included. The second input list is a list of up to seven
secondary vertices, each with 15 features, such as the kinematics,
displacement and quality criteria. To process those inputs, a customised
DNN architecture has been developed. It consists of two convolutional
neural networks (CNN) that each process one of the input lists. The
outputs of the two CNNs are then combined and processed by a
fully-connected network to identify the jet. The network was trained
with a sample of 40 million jets, another 10 million jets were used for
development and validation.

In this thesis, the mass decorrelated version of the DeepAK8 tagger,
called DeepAK8-MD but further referred to as only DeepAK8, is used. It
adds an additional mass predictor layer, that is trained to quantify how
strongly the output of the non-decorrelated tagger is correlated to the
mass of a particle. Its output is fed back to the network as a penalty
so it avoids using features of the particles correlated to their mass.
The result is a largely mass decorrelated tagger of heavy resonances,
that doesn't introduce a bias in the jet mass shape. As can be seen in
Fig.~\ref{fig:ak8_eff}, it performs not as good as the
non-mass-decorrelated version, but still better than the other taggers
it was compared to.

The higher the discriminant value, called WvsQCD resp. ZvsQCD (further
referred to as only VvsQCD), of the DeepAK8 tagger, the more likely is
the jet to be caused by the decay of a vector boson. Therefore, using
the same way to choose a candidate jet as for the N-subjettiness tagger,
a selection is applied so that this candidate jet has a VvsQCD value
greater than some value determined by the optimisation presented next.

\hypertarget{sec:opt}{%
\subsection{Optimisation}\label{sec:opt}}

To figure out the best value to cut on the discriminants introduced by
the two taggers, a value to quantify how good a cut is has to be
introduced. For that, the significance calculated by
\(\frac{S}{\sqrt{B}}\) will be used. S stands for the amount of signal
events and B for the amount of background events in a given interval.
This value assumes a gaussian error on the background so it will be
calculated for the 2 TeV masspoint of the decay to qW where enough
background events exist to justify this assumption, which follows from
the central limit theorem \autocite{CLT} that states, that for identical
distributed random variables, their sum converges to a gaussian
distribution. The significance represents how good the signal can be
distinguished from the background in units of the standard deviation of
the background. As interval, a 10 \% margin around the resonance nominal
mass is chosen. The significance is then calculated for different
selections on the discriminant of the two taggers and then plotted in
dependence on the minimum resp. maximum allowed value of the
discriminant to pass the selection for the DeepAK8 resp. the
N-subjettiness tagger.

The optimisation process is done using only the data from year 2018,
assuming the taggers have similar performances on the data of the
different years.

\begin{figure}
  \begin{minipage}{0.5\textwidth}
    \includegraphics{./figures/sig-db.pdf}
  \end{minipage}
  \begin{minipage}{0.5\textwidth}
    \includegraphics{./figures/sig-tau.pdf}
  \end{minipage}
\caption{Significance plots for the DeepAK8 (left) and N-subjettiness (right) tagger at the 2 TeV masspoint.}
\label{fig:sig}
\end{figure}

As a result, the \(\tau_{21}\) cut is placed at \(\le 0.35\), confirming
the value previous research chose and the deep boosted cut is placed at
\(\ge 0.95\). For the DeepAK8 tagger, 0.97 would give a slightly higher
significance but as it is very close to the edge where the significance
drops very low and the higher the cut the less background will be left
to calculate the cross section limits, especially at higher resonance
masses, the slightly less strict cut is chosen.

For both taggers also a low purity category is introduced. Using the
cuts optimized for 2 TeV, there are very few background events left for
higher resonance masses, but to reliably calculate cross section limits,
those are needed. Therefore in the final cross section calculation, the
two categories are combined to have a high signal sensitivity for all
masspoints between 1.6 TeV and 7 TeV that were simulated. As low purity
category for the N-subjettiness tagger, a cut at
\(0.35 < \tau_{21} < 0.75\) is used. For the DeepAK8 tagger the opposite
cut from the high purity category is used: \(VvsQCD < 0.95\).

\newpage

\hypertarget{sec:extr}{%
\section{Signal extraction}\label{sec:extr}}

After the optimisation, now the optimal selection for the N-subjettiness
as well as the DeepAK8 tagger is found and applied to the simulated
samples as well as the data collected by the CMS experiment. The fit
described in Sec.~\ref{sec:moa} is performed for all masspoints of the
decay to qW and qZ and for the partial dataset of
\(\SI{35.92}{\per\femto\barn}\) as well as the complete dataset of
\(\SI{137.19}{\per\femto\barn}\) separately.

To test for the presence of a resonance in the data, the cross section
limits of the signal event are calculated using a frequentist asymptotic
limit criterion described in \autocite{ASYMPTOTIC_LIMIT}. Using the
parameters and signal rate obtained by the method described in
Sec.~\ref{sec:moa} as well as a shape analysis of the data recorded by
the CMS experiment, it determines an expected and an observed cross
section limit by doing a signal + background versus background-only
hypothesis test. It also calculates upper and lower limits of the
expected cross section corresponding to a confidence level of 95 \%.

In the absence of the q* particle in the data, the observed limits lie
within the \(2\sigma\) environment, meaning a 95 \% confidence level, of
the expected limit. This observed limit is plotted together with a
theory line, representing the cross section limits expected, if the q*
predicted by \autocite{QSTAR_THEORY} would exist. Since no significant
deviation from the Standard Model is found while looking for the
resonance, the crossing of the theory line with the observed limit is
calculated, to have a limit of mass up to which the existence of the q*
particle can be excluded. To find the uncertainty of this result, the
crossing of the theory line plus, respectively minus, its uncertainty
with the observed limit is also calculated.

\hypertarget{systematic-uncertainties}{%
\subsection{Systematic Uncertainties}\label{systematic-uncertainties}}

The variables used in this analysis are affected by systematic
uncertainties. For calculating the cross section of the signal, four
sources of such uncertainties are considered.

First, the uncertainty of the Jet Energy Corrections. When measuring a
particle's energy with the ECAL or HCAL part of the CMS, the electronic
signals send by the photodetectors in the calorimeters have to be
converted to actual energy values. Therefore an error in this
calibration causes the energy measured to be shifted to higher or lower
values causing also the position of the signal peak in the \(m_{jj}\)
distribution to vary. The uncertainty is approximated to be 2 \%.

Second, the tagger does not work perfectly and therefore some events,
that don't originate from a V boson are wrongly chosen and on the other
hand sometimes events that do originate from one are not. It influences
the events chose for analysis and is therefore also considered as an
uncertainty, which is approximated to be 6 \%.

Third, the uncertainty of the parameters of the background fit is also
considered, as it might change the background shape a little and
therefore influence how many signal and background events are
reconstructed from the data.

Fourth, the uncertainty on the luminosity influences the normalization
of the processes. Its value is 2.5 \% \autocite{LUMI_UNC}. \newpage

\hypertarget{results}{%
\section{Results}\label{results}}

This chapter will start by presenting the results for the partial
dataset of year 2016 with an integrated luminosity of
\(\SI{35.92}{\per\femto\barn}\) using both taggers and comparing it to
the previous research \autocite{PREV_RESEARCH}. It will then go on
showing the results for the combined dataset with an integrated
luminosity of \(\SI{137.19}{\per\femto\barn}\), again using both taggers
comparing their performances.

\hypertarget{partial-dataset}{%
\subsection{Partial dataset}\label{partial-dataset}}

Using the \(\SI{35.92}{\per\femto\barn}\) of data collected by the CMS
experiment during 2016, the cross section limits seen in
Fig.~\ref{fig:res2016} were obtained.

As described in Sec.~\ref{sec:extr}, the calculated cross section limits
are used to then calculate a mass limit, meaning the lowest possible
mass of the q* particle, by finding the crossing of the theory line with
the observed cross section limit. In Fig.~\ref{fig:res2016dw} it can be
seen, that the observed limit using the DeepAK8 tagger in the region
where theory and observed limit cross is very high compared to when
using the N-subjettiness tagger. Therefore the two lines cross at lower
resonance masses, which results in lower exclusion limits on the mass of
the q* particle causing the DeepAK8 tagger to perform worse than the
N-subjettiness tagger in regards of establishing those limits as can be
seen in Table~\ref{tbl:res2016}. The table also shows the upper and
lower limits on the mass found by calculating the crossing of the theory
plus resp. minus its uncertainty. Due to the theory and the observed
limits line being slowly falling in the high TeV region, even a small
uncertainty of the theory can cause a high difference of the mass limit.

\hypertarget{tbl:res2016}{}
\begin{longtable}[]{@{}lllll@{}}
\caption{\label{tbl:res2016}Mass limits found using the partial dataset
of \(\SI{35.92}{\per\femto\barn}\)}\tabularnewline
\toprule
Decay & Tagger & Limit {[}TeV{]} & Upper Limit {[}TeV{]} & Lower Limit
{[}TeV{]}\tabularnewline
\midrule
\endfirsthead
\toprule
Decay & Tagger & Limit {[}TeV{]} & Upper Limit {[}TeV{]} & Lower Limit
{[}TeV{]}\tabularnewline
\midrule
\endhead
qW & \(\tau_{21}\) & 5.39 & 6.01 & 4.99\tabularnewline
qW & DeepAK8 & 4.96 & 5.19 & 4.84\tabularnewline
qZ & \(\tau_{21}\) & 4.86 & 4.96 & 4.70\tabularnewline
qZ & DeepAK8 & 4.62 & 4.71 & 4.49\tabularnewline
\bottomrule
\end{longtable}

\begin{figure}%
  \centering
  \subfloat[Decay to qW, using N-subjettiness tagger]{%
  \label{fig:res2016tw}%
  \includegraphics[width=0.5\textwidth]{./figures/results/brazilianFlag_QtoqW_2016tau_13TeV.pdf}}
  \subfloat[Decay to qW, using DeepAK8 tagger]{%
  \includegraphics[width=0.5\textwidth]{./figures/results/brazilianFlag_QtoqW_2016db_13TeV.pdf}%
  \label{fig:res2016dw}}\\
  \subfloat[Decay to qZ, using N-subjettiness tagger]{%
  \includegraphics[width=0.5\textwidth]{./figures/results/brazilianFlag_QtoqZ_2016tau_13TeV.pdf}%
  \label{fig:res2016tz}}%
  \subfloat[Decay to qZ, using DeepAK8 tagger]{%
  \includegraphics[width=0.5\textwidth]{./figures/results/brazilianFlag_QtoqZ_2016db_13TeV.pdf}%
  \label{fig:res2016dz}}%
\caption{Results of the cross section limits for the partial dataset of 2016 using the $\tau_{21}$ tagger and the deep
boosted tagger.}
\label{fig:res2016}
\end{figure}

\hypertarget{comparison-with-existing-results}{%
\subsubsection{Comparison with existing
results}\label{comparison-with-existing-results}}

The result will now be compared to an existing result using the same
dataset. This research, however, uses a newer detector calibration as
well as an improved reconstruction so slight variations in the results
are to be expected.

The limit established by using the N-subjettiness tagger with the
partial dataset is \(\SI{0.39}{\tera\eV}\) (decay to qW) resp.
\(\SI{0.16}{\tera\eV}\) (decay to qZ) higher than the one from previous
research, which was found to be 5 TeV for the decay to qW and 4.7 TeV
for the decay to qZ. This is mainly due to the fact, that in our data,
the observed limit at the intersection point happens to be in the lower
region of the expected limit interval and therefore causing a very late
crossing with the theory line when using the N-subjettiness tagger (as
can be seen in Fig.~\ref{fig:res2016}). Comparing the expected limits,
there is a difference between 2 \% and 30 \%, between the values
calculated by this thesis compared to the previous research. It is not,
however, that one of the two results was constantly lower or higher but
rather fluctuating. As already noted, a slight variations in the results
was expected, therefore it can be said, that the results are in good
agreement. The cross section limits of the previous research can be seen
in Fig.~\ref{fig:prev}.

\begin{figure}
\begin{minipage}{0.5\textwidth}
\includegraphics{./figures/results/prev_qW.png}
\end{minipage}
\begin{minipage}{0.5\textwidth}
\includegraphics{./figures/results/prev_qZ.png}
\end{minipage}
\caption{Previous results of the cross section limits for q\* decaying to qW (left) and q\* decaying to qZ (right)
\cite{PREV_RESEARCH}.}
\label{fig:prev}
\end{figure}

\hypertarget{combined-dataset}{%
\subsection{Combined dataset}\label{combined-dataset}}

Using the full available dataset of \(\SI{137.19}{\per\femto\barn}\),
the cross section limits seen in Fig.~\ref{fig:resCombined} were
obtained. The cross section limits are, compared to only using the 2016
dataset, reduced to about 50 \%. This shows the big improvement achieved
by using more than three times the amount of data.

The results for the mass limits of the combined years are presented in
the following table.

\hypertarget{tbl:resCombined}{}
\begin{longtable}[]{@{}lllll@{}}
\caption{\label{tbl:resCombined}Mass limits found using
\(\SI{137.19}{\per\femto\barn}\) of data}\tabularnewline
\toprule
Decay & Tagger & Limit {[}TeV{]} & Upper Limit {[}TeV{]} & Lower Limit
{[}TeV{]}\tabularnewline
\midrule
\endfirsthead
\toprule
Decay & Tagger & Limit {[}TeV{]} & Upper Limit {[}TeV{]} & Lower Limit
{[}TeV{]}\tabularnewline
\midrule
\endhead
qW & \(\tau_{21}\) & 6.00 & 6.26 & 5.74\tabularnewline
qW & DeepAK8 & 6.11 & 6.31 & 5.39\tabularnewline
qZ & \(\tau_{21}\) & 5.49 & 5.76 & 5.29\tabularnewline
qZ & DeepAK8 & 4.95 & 5.13 & 4.85\tabularnewline
\bottomrule
\end{longtable}

The combination of the three years not just improved the cross section
limits, but also the limit for the mass of the q* particle. The final
result is 1 TeV higher for the decay to qW and almost 0.8 TeV higher for
the decay to qZ than what was concluded by the previous research
\autocite{PREV_RESEARCH}.

\begin{figure}%
  \centering
  \subfloat[Decay to qW, using N-subjettiness tagger]{%
  \label{fig:resCombinedtw}%
  \includegraphics[width=0.5\textwidth]{./figures/results/brazilianFlag_QtoqW_Combinedtau_13TeV.pdf}}
  \subfloat[Decay to qW, using DeepAK8 tagger]{%
  \includegraphics[width=0.5\textwidth]{./figures/results/brazilianFlag_QtoqW_Combineddb_13TeV.pdf}%
  \label{fig:resCombineddw}}\\
  \subfloat[Decay to qZ, using N-subjettiness tagger]{%
  \includegraphics[width=0.5\textwidth]{./figures/results/brazilianFlag_QtoqZ_Combinedtau_13TeV.pdf}%
  \label{fig:resCombinedtz}}%
  \subfloat[Decay to qZ, using DeepAK8 tagger]{%
  \includegraphics[width=0.5\textwidth]{./figures/results/brazilianFlag_QtoqZ_Combineddb_13TeV.pdf}%
  \label{fig:resCombineddz}}%
\caption{Results of the cross section limits for the combined dataset using the $\tau_{21}$ tagger and the DeepAK8
tagger.}
\label{fig:resCombined}
\end{figure}

\hypertarget{comparison-of-taggers}{%
\subsection{Comparison of taggers}\label{comparison-of-taggers}}

The results presented in Table~\ref{tbl:res2016} show, that the DeepAK8
tagger was not able to significantly improve the results compared to the
N-subjettiness tagger. For further comparison, in
Fig.~\ref{fig:limit_comp} the expected limits of the different taggers
for the q* \(\rightarrow\) qW and the q* \(\rightarrow\) qZ decay are
shown. It can be seen, that the DeepAK8 is at best as good as the
N-subjettiness tagger. This was not the expected result, as the deep
neural network was already found to provide a higher significance in the
optimisation done in Sec.~\ref{sec:opt}. The higher significance should
also result in lower cross section limits. To make sure, there is no
mistake in the setup, also the expected cross section limits using only
the high purity category of the two taggers with 2018 data are compared
in Fig.~\ref{fig:comp_2018}. There, the cross section limits calculated
using the DeepAK8 tagger are a bit lower than with the N-subjettiness
tagger, showing, that the method used for optimisation is working but
the assumption of it also applying to the combined dataset did not hold.

This can be explained by some training issues identified lately. The
training of the DeepAK8 tagger was done for the data of year 2016. It
therefore performs differently for the data of the other years. This
caused the DeepAK8 tagger to perform significantly worse than it could
have for several reasons. First, the optimisation done for the data of
year 2018 could therefore not be applied to the other datasets. Second,
even for the data of 2016, a newer version of the background simulation
was used, that, in combination with the samples used for the signal,
turned out to be the worst case scenario for the used training.
Recently, the training was improved to better perform across all
datasets, but those changes could not be incorporated into this thesis
due to it not being possible to do this in a reasonable timeframe.
\newpage

\begin{figure}
\begin{minipage}{0.5\textwidth}
\includegraphics{./figures/limit_comp_w.pdf}
\end{minipage}
\begin{minipage}{0.5\textwidth}
\includegraphics{./figures/limit_comp_z.pdf}
\end{minipage}
\caption{Comparison of expected limits of the different taggers using different datasets. Left: decay to qW. Right:
decay to qZ}
\label{fig:limit_comp}
\end{figure}

\begin{figure}
\hypertarget{fig:comp_2018}{%
\centering
\includegraphics[width=0.55\textwidth,height=\textheight]{./figures/limit_comp_2018.pdf}
\caption{Comparison of DeepAK8 and N-subjettiness tagger in the high
purity category using the data from year 2018.}\label{fig:comp_2018}
}
\end{figure}

\clearpage
\newpage

\hypertarget{summary}{%
\section{Summary}\label{summary}}

In this thesis, a search for the q* particle decaying to q + W and q + Z
was presented. Data of proton - proton collisions at the LHC of an
integrated luminosity of \(\SI{137.19}{\per\femto\barn}\) collected by
the CMS experiment at a centre-of-mass energy of
\(\sqrt{s} = \SI{13}{\tera\eV}\) has been searched. Also a partial
dataset of \(\SI{35.92}{\per\femto\barn}\) was analysed, to be able to
compare the results to previous research. Monte Carlo simulations were
used to model the QCD multijet background and signal shapes.

A selection was introduced to reduce background events and enhance
signal sensitivity. This selection required at least two jets, a
\(\Delta\eta \ge 1.3\) between the two highest \(p_t\) jets, an
invariant mass of the two highest \(p_t\) jets greater than
\(\SI{1050}{\giga\eV}\) and a soft-drop mass of at least one jet between
\(\SI{35}{\giga\eV}\) and \(\SI{105}{\giga\eV}\).

Two taggers, the DeepAK8 and the N-subjettiness tagger, have been used
to identify jets originating from the decay of a vector boson. For both
of them, two categories were introduced. A high purity category, aiming
for maximal signal sensitivity in the low TeV region of the invariant
mass spectrum and a low purity category, aiming for better statistics in
the high TeV region. For the DeepAK8 tagger, a high purity category of
\(VvsQCD > 0.95\) and a low purity category of \(VvsQCD \le 0.95\) was
used. For the N-subjettiness tagger the high purity category was
\(\tau_{21} < 0.35\) and the low purity category
\(0.35 < \tau_{21} < 0.75\). These values were obtained by optimising
for the highest possible significance of the signal.

A combined fit to the dijet invariant mass distribution of background
plus signal has been used to determine their shape parameters and the
expected signal rate. With those results, the cross section limits were
extracted from the data. Because no significant deviation from the
Standard Model was observed, new exclusion limits for the mass of the q*
particle were set. These are 6.1 TeV by analyzing the decay to qW,
respectively 5.5 TeV for the decay to qZ. Those limits are about 1 TeV
higher than the ones found in previous research, that found them to be 5
TeV resp. 4.7 TeV.

The performance of the two taggers used have been compared and found to
produce similar results. This was unexpected, as the DeepAK8 tagger was
supposed to perform better than the N-subjettiness tagger. The result
analysing the decay to qW was \(\SI{0.1}{\tera\eV}\) better using the
DeepAK8 than with the N-subjettiness tagger, but analysing the decay to
qZ it was \(\SI{0.5}{\tera\eV}\) worse. The performance of the DeepAK8
tagger is likely to significantly improve with an updated training that
was not yet available in the framework used by this thesis.

\newpage

\nocite{*}

\printbibliography

\newpage
\appendix

\hypertarget{expected-and-observed-cross-section-limits}{%
\section{Expected and observed cross section
limits}\label{expected-and-observed-cross-section-limits}}

\begin{longtable}[]{@{}lllll@{}}
\caption{Cross Section limits using 2016 data and the N-subjettiness
tagger for the decay to qW}\tabularnewline
\toprule
Mass {[}TeV{]} & Exp. limit {[}pb{]} & Upper limit {[}pb{]} & Lower
limit {[}pb{]} & Obs. limit {[}pb{]}\tabularnewline
\midrule
\endfirsthead
\toprule
Mass {[}TeV{]} & Exp. limit {[}pb{]} & Upper limit {[}pb{]} & Lower
limit {[}pb{]} & Obs. limit {[}pb{]}\tabularnewline
\midrule
\endhead
1.6 & 0.10 & 0.15 & 0.074 & 0.082\tabularnewline
1.8 & 0.077 & 0.11 & 0.054 & 0.041\tabularnewline
2.0 & 0.054 & 0.076 & 0.039 & 0.040\tabularnewline
2.5 & 0.024 & 0.034 & 0.017 & 0.041\tabularnewline
3.0 & 0.013 & 0.018 & 0.009 & 0.021\tabularnewline
3.5 & 0.0070 & 0.0099 & 0.005 & 0.004\tabularnewline
4.0 & 0.0042 & 0.0060 & 0.003 & 0.0017\tabularnewline
4.0 & 0.0042 & 0.0060 & 0.003 & 0.0017\tabularnewline
4.5 & 0.0035 & 0.0048 & 0.0027 & 0.0025\tabularnewline
5.0 & 0.0027 & 0.0036 & 0.0021 & 0.0024\tabularnewline
6.0 & 0.0010 & 0.0016 & 0.00068 & 0.00062\tabularnewline
7.0 & 0.00063 & 0.0010 & 0.00039 & 0.00086\tabularnewline
\bottomrule
\end{longtable}

\begin{longtable}[]{@{}lllll@{}}
\caption{Cross Section limits using 2016 data and the deep boosted
tagger for the decay to qW}\tabularnewline
\toprule
Mass {[}TeV{]} & Exp. limit {[}pb{]} & Upper limit {[}pb{]} & Lower
limit {[}pb{]} & Obs. limit {[}pb{]}\tabularnewline
\midrule
\endfirsthead
\toprule
Mass {[}TeV{]} & Exp. limit {[}pb{]} & Upper limit {[}pb{]} & Lower
limit {[}pb{]} & Obs. limit {[}pb{]}\tabularnewline
\midrule
\endhead
1.6 & 0.18 & 0.25 & 0.13 & 0.38\tabularnewline
1.8 & 0.11 & 0.16 & 0.078 & 0.12\tabularnewline
2.0 & 0.082 & 0.12 & 0.058 & 0.095\tabularnewline
2.5 & 0.033 & 0.047 & 0.024 & 0.037\tabularnewline
3.0 & 0.016 & 0.023 & 0.012 & 0.011\tabularnewline
3.5 & 0.0084 & 0.012 & 0.0059 & 0.0068\tabularnewline
4.0 & 0.0046 & 0.0067 & 0.0032 & 0.0034\tabularnewline
4.5 & 0.0028 & 0.0041 & 0.0019 & 0.0037\tabularnewline
5.0 & 0.0018 & 0.0027 & 0.0012 & 0.0040\tabularnewline
6.0 & 0.0011 & 0.0017 & 0.00071 & 0.0016\tabularnewline
7.0 & 0.00065 & 0.0011 & 0.00041 & 0.0011\tabularnewline
\bottomrule
\end{longtable}

\begin{longtable}[]{@{}lllll@{}}
\caption{Cross Section limits using 2016 data and the N-subjettiness
tagger for the decay to qZ}\tabularnewline
\toprule
Mass {[}TeV{]} & Exp. limit {[}pb{]} & Upper limit {[}pb{]} & Lower
limit {[}pb{]} & Obs. limit {[}pb{]}\tabularnewline
\midrule
\endfirsthead
\toprule
Mass {[}TeV{]} & Exp. limit {[}pb{]} & Upper limit {[}pb{]} & Lower
limit {[}pb{]} & Obs. limit {[}pb{]}\tabularnewline
\midrule
\endhead
1.6 & 0.087 & 0.12 & 0.062 & 0.07\tabularnewline
1.8 & 0.067 & 0.095 & 0.048 & 0.034\tabularnewline
2.0 & 0.047 & 0.066 & 0.034 & 0.033\tabularnewline
2.5 & 0.019 & 0.026 & 0.013 & 0.032\tabularnewline
3.0 & 0.010 & 0.015 & 0.0074 & 0.018\tabularnewline
3.5 & 0.0060 & 0.0084 & 0.0043 & 0.0035\tabularnewline
4.0 & 0.0035 & 0.0050 & 0.0025 & 0.0014\tabularnewline
4.5 & 0.0023 & 0.0034 & 0.0016 & 0.0018\tabularnewline
5.0 & 0.0016 & 0.0023 & 0.0011 & 0.0019\tabularnewline
6.0 & 0.00082 & 0.0013 & 0.00054 & 0.00049\tabularnewline
7.0 & 0.00050 & 0.00083 & 0.00031 & 0.00066\tabularnewline
\bottomrule
\end{longtable}

\begin{longtable}[]{@{}lllll@{}}
\caption{Cross Section limits using 2016 data and deep boosted tagger
for the decay to qZ}\tabularnewline
\toprule
Mass {[}TeV{]} & Exp. limit {[}pb{]} & Upper limit {[}pb{]} & Lower
limit {[}pb{]} & Obs. limit {[}pb{]}\tabularnewline
\midrule
\endfirsthead
\toprule
Mass {[}TeV{]} & Exp. limit {[}pb{]} & Upper limit {[}pb{]} & Lower
limit {[}pb{]} & Obs. limit {[}pb{]}\tabularnewline
\midrule
\endhead
1.6 & 0.15 & 0.22 & 0.11 & 0.33\tabularnewline
1.8 & 0.10 & 0.14 & 0.072 & 0.085\tabularnewline
2.0 & 0.077 & 0.11 & 0.056 & 0.064\tabularnewline
2.5 & 0.027 & 0.038 & 0.019 & 0.041\tabularnewline
3.0 & 0.015 & 0.021 & 0.010 & 0.0087\tabularnewline
3.5 & 0.0084 & 0.012 & 0.006 & 0.0066\tabularnewline
4.0 & 0.0049 & 0.0071 & 0.0035 & 0.0045\tabularnewline
4.5 & 0.0032 & 0.0046 & 0.0022 & 0.0026\tabularnewline
5.0 & 0.0022 & 0.0033 & 0.0015 & 0.0041\tabularnewline
6.0 & 0.0012 & 0.0019 & 0.00081 & 0.0018\tabularnewline
7.0 & 0.00057 & 0.00092 & 0.00037 & 0.00093\tabularnewline
\bottomrule
\end{longtable}

\begin{longtable}[]{@{}lllll@{}}
\caption{Cross Section limits using the combined data and the
N-subjettiness tagger for the decay to qW}\tabularnewline
\toprule
Mass {[}TeV{]} & Exp. limit {[}pb{]} & Upper limit {[}pb{]} & Lower
limit {[}pb{]} & Obs. limit {[}pb{]}\tabularnewline
\midrule
\endfirsthead
\toprule
Mass {[}TeV{]} & Exp. limit {[}pb{]} & Upper limit {[}pb{]} & Lower
limit {[}pb{]} & Obs. limit {[}pb{]}\tabularnewline
\midrule
\endhead
1.6 & 0.057 & 0.08 & 0.041 & 0.034\tabularnewline
1.8 & 0.040 & 0.056 & 0.028 & 0.043\tabularnewline
2.0 & 0.028 & 0.040 & 0.020 & 0.048\tabularnewline
2.5 & 0.013 & 0.018 & 0.0091 & 0.015\tabularnewline
3.0 & 0.0066 & 0.0092 & 0.0047 & 0.012\tabularnewline
3.5 & 0.0038 & 0.0053 & 0.0027 & 0.0047\tabularnewline
4.0 & 0.0022 & 0.0031 & 0.0016 & 0.0011\tabularnewline
4.5 & 0.0013 & 0.0019 & 0.00094 & 0.00068\tabularnewline
5.0 & 0.00084 & 0.0012 & 0.00060 & 0.00059\tabularnewline
6.0 & 0.00044 & 0.00066 & 0.00030 & 0.00041\tabularnewline
7.0 & 0.00022 & 0.00036 & 0.00014 & 0.00043\tabularnewline
\bottomrule
\end{longtable}

\begin{longtable}[]{@{}lllll@{}}
\caption{Cross Section limits using the combined data and the deep
boosted tagger for the decay to qW}\tabularnewline
\toprule
Mass {[}TeV{]} & Exp. limit {[}pb{]} & Upper limit {[}pb{]} & Lower
limit {[}pb{]} & Obs. limit {[}pb{]}\tabularnewline
\midrule
\endfirsthead
\toprule
Mass {[}TeV{]} & Exp. limit {[}pb{]} & Upper limit {[}pb{]} & Lower
limit {[}pb{]} & Obs. limit {[}pb{]}\tabularnewline
\midrule
\endhead
1.6 & 0.067 & 0.095 & 0.047 & 0.12\tabularnewline
1.8 & 0.043 & 0.061 & 0.030 & 0.054\tabularnewline
2.0 & 0.033 & 0.047 & 0.024 & 0.047\tabularnewline
2.5 & 0.013 & 0.019 & 0.0095 & 0.011\tabularnewline
3.0 & 0.0065 & 0.0092 & 0.0046 & 0.0050\tabularnewline
3.5 & 0.0034 & 0.0048 & 0.0024 & 0.0041\tabularnewline
4.0 & 0.0018 & 0.0026 & 0.0013 & 0.0013\tabularnewline
4.5 & 0.0011 & 0.0016 & 0.00074 & 0.0012\tabularnewline
5.0 & 0.00068 & 0.0010 & 0.00046 & 0.0015\tabularnewline
6.0 & 0.00038 & 0.00060 & 0.00024 & 0.00034\tabularnewline
7.0 & 0.00021 & 0.00035 & 0.00013 & 0.00046\tabularnewline
\bottomrule
\end{longtable}

\begin{longtable}[]{@{}lllll@{}}
\caption{Cross Section limits using the combined data and the
N-subjettiness tagger for the decay to qZ}\tabularnewline
\toprule
Mass {[}TeV{]} & Exp. limit {[}pb{]} & Upper limit {[}pb{]} & Lower
limit {[}pb{]} & Obs. limit {[}pb{]}\tabularnewline
\midrule
\endfirsthead
\toprule
Mass {[}TeV{]} & Exp. limit {[}pb{]} & Upper limit {[}pb{]} & Lower
limit {[}pb{]} & Obs. limit {[}pb{]}\tabularnewline
\midrule
\endhead
1.6 & 0.051 & 0.072 & 0.037 & 0.030\tabularnewline
1.8 & 0.035 & 0.050 & 0.026 & 0.036\tabularnewline
2.0 & 0.025 & 0.035 & 0.018 & 0.042\tabularnewline
2.5 & 0.011 & 0.015 & 0.0076 & 0.012\tabularnewline
3.0 & 0.0058 & 0.0081 & 0.0041 & 0.011\tabularnewline
3.5 & 0.0033 & 0.0046 & 0.0023 & 0.0042\tabularnewline
4.0 & 0.0019 & 0.0027 & 0.0014 & 0.00097\tabularnewline
4.5 & 0.0012 & 0.0017 & 0.00084 & 0.00059\tabularnewline
5.0 & 0.00077 & 0.0011 & 0.00054 & 0.00051\tabularnewline
6.0 & 0.00039 & 0.00057 & 0.00026 & 0.00036\tabularnewline
7.0 & 0.00019 & 0.00031 & 0.00013 & 0.00036\tabularnewline
\bottomrule
\end{longtable}

\begin{longtable}[]{@{}lllll@{}}
\caption{Cross Section limits using the combined data and deep boosted
tagger for the decay to qZ}\tabularnewline
\toprule
Mass {[}TeV{]} & Exp. limit {[}pb{]} & Upper limit {[}pb{]} & Lower
limit {[}pb{]} & Obs. limit {[}pb{]}\tabularnewline
\midrule
\endfirsthead
\toprule
Mass {[}TeV{]} & Exp. limit {[}pb{]} & Upper limit {[}pb{]} & Lower
limit {[}pb{]} & Obs. limit {[}pb{]}\tabularnewline
\midrule
\endhead
1.6 & 0.067 & 0.095 & 0.047 & 0.095\tabularnewline
1.8 & 0.044 & 0.063 & 0.032 & 0.048\tabularnewline
2.0 & 0.032 & 0.045 & 0.023 & 0.045\tabularnewline
2.5 & 0.012 & 0.017 & 0.0088 & 0.013\tabularnewline
3.0 & 0.0064 & 0.009 & 0.0046 & 0.0032\tabularnewline
3.5 & 0.0036 & 0.0051 & 0.0026 & 0.0039\tabularnewline
4.0 & 0.0021 & 0.0029 & 0.0015 & 0.0027\tabularnewline
4.5 & 0.0013 & 0.0018 & 0.00088 & 0.00094\tabularnewline
5.0 & 0.00083 & 0.0012 & 0.00057 & 0.00150\tabularnewline
6.0 & 0.00046 & 0.00072 & 0.00031 & 0.00043\tabularnewline
7.0 & 0.00023 & 0.00037 & 0.00015 & 0.00049\tabularnewline
\bottomrule
\end{longtable}

\newpage
\section*{Erklärung}

Hiermit bestätige ich, dass die vorliegende Bachelorarbeit von mir
selbstständig verfasst wurde und ich keine anderen als die angegebenen
Hilfsmittel - insbesondere keine im Quellenverzeichnis nicht benannten
Internet-Quellen - benutzt habe. Die Arbeit wurde bisher weder gesamt
noch in Teilen einer anderen Prüfungsbehörde vorgelegt. Die eingereichte
schriftliche Fassung entspricht der auf dem elekronischen
Speichermedium. Ich bin damit einverstanden, dass die Bachelorarbeit
veröffentlicht wird.

\vspace{5cm}

\parbox{5cm}{\hrule
\strut \footnotesize Ort, Datum} \hspace{1cm}\parbox{5cm}{\hrule
\strut \footnotesize David Leppla-Weber}

\end{document}