Abstract
Most data in genome-wide phylogenetic analysis (phylogenomics) is essentially
multidimensional, posing a major challenge to human comprehension and
computational analysis. Moreover, statistical learning models from data science
cannot be applied directly to a set of phylogenetic trees, because the space of
phylogenetic trees is not Euclidean. In fact, the space of phylogenetic trees is a
tropical Grassmannian in terms of the max-plus algebra. Therefore, to classify
multilocus data sets for phylogenetic analysis, we propose tropical support
vector machines (SVMs).
Linear SVMs are supervised learning models that can be formulated in terms
of quadratic optimization problems and that classify using hyperplanes in
a high-dimensional Euclidean space. Here we study hard margin tropical
SVMs introduced by Gärtner and Jaggi and define soft margin tropical
SVMs in the setting of tropical geometry. Then we show that both hard
margin tropical SVMs and soft margin tropical SVMs can be formulated
as linear optimization problems. For hard margin tropical SVMs, we give
necessary and sufficient conditions for the feasibility of the linear optimization
problem and, when a feasible solution exists, an explicit formula for its
optimal value. For soft margin tropical SVMs, we give necessary conditions
for the feasibility of the linear optimization problem. Computational
experiments show that our
methods work well with data sets generated under the multispecies coalescent
model.
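The geometric primitives behind a max-plus tropical hyperplane can be illustrated with a minimal sketch. This is not the authors' SVM formulation or their linear program; it only shows how a point is assigned to a sector of a hyperplane with normal vector omega, and it assumes the commonly used expression for the tropical distance from a point to such a hyperplane (the gap between the largest and second-largest values of omega_i + x_i). The function names are hypothetical and chosen for illustration.

```python
def max_plus_scores(omega, x):
    """Return the values omega_i + x_i whose maximum defines the hyperplane."""
    return [w + xi for w, xi in zip(omega, x)]


def tropical_sector(omega, x, tol=1e-12):
    """Index of the unique coordinate attaining max_i(omega_i + x_i).

    Returns None when the maximum is attained at least twice, i.e. when
    x lies on the tropical hyperplane itself.
    """
    scores = max_plus_scores(omega, x)
    best = max(scores)
    winners = [i for i, s in enumerate(scores) if best - s <= tol]
    return winners[0] if len(winners) == 1 else None


def distance_to_tropical_hyperplane(omega, x):
    """Gap between the largest and second-largest score (assumed formula)."""
    scores = sorted(max_plus_scores(omega, x), reverse=True)
    return scores[0] - scores[1]


if __name__ == "__main__":
    omega = [0.0, 0.0, 0.0]
    point = [3.0, 1.0, 0.0]
    print(tropical_sector(omega, point))
    print(distance_to_tropical_hyperplane(omega, point))
```

Both quantities are unchanged when the same constant is added to every coordinate of the point, which matches the convention that points of the tropical projective torus are defined only up to such shifts.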
Keywords
linear programming, phylogenetic tree, supervised learning,
non-Euclidean data, tropical geometry
Mathematical Subject Classification
Primary: 14T90, 90C24, 92B10
Milestones
Received: 26 September 2022
Revised: 27 February 2023
Accepted: 5 March 2023
Published: 16 May 2024
© 2023 MSP (Mathematical Sciences Publishers).