Seminars
 
Intelligent Arxiv: Sort daily papers by learning users' topic preferences
CSIC
 
Location
Sala Alberto Lobo (ICE building, UAB Campus)
Date
21/02/2020
 
We present and discuss some novel applications of the Latent Dirichlet Allocation (LDA) technique of Machine Learning (ML). The first is in the field of New Physics (NP) searches at the LHC, where we are currently applying this unsupervised ML technique to find NP as emerging topics. Motivated by this powerful tool, we pursued the goal of sorting daily Arxiv papers in given field(s) according to individual user preferences.

We model a scientific paper as a combination of scientific knowledge from diverse topics, brought together into a new problem. We then apply the (unsupervised) Machine Learning technique LDA to construct and extract topics from the corpus of papers. We obtain the topic weights of the available and new papers in the Arxiv, and determine each user's preference in topics from that user's preference in papers.
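To make the topic-extraction step concrete, here is a minimal sketch of LDA fitted with a collapsed Gibbs sampler in plain Python. It is a toy illustration, not the IArxiv.org implementation: the function name `lda_gibbs`, the hyperparameters `alpha` and `beta`, and the four tiny example "papers" are all assumptions made for this example.

```python
import random

def lda_gibbs(docs, n_topics, n_iter=200, alpha=0.1, beta=0.01, seed=0):
    """Toy collapsed Gibbs sampler for Latent Dirichlet Allocation.

    docs is a list of token lists. Returns (doc_topic, vocab, topic_word),
    where doc_topic[d] holds the topic weights of document d (summing to 1).
    """
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    n_words = len(vocab)

    n_dk = [[0] * n_topics for _ in docs]            # document-topic counts
    n_kw = [[0] * n_words for _ in range(n_topics)]  # topic-word counts
    n_k = [0] * n_topics                             # tokens assigned to each topic
    z = []                                           # topic assignment of every token

    # Start from a random topic assignment for every token.
    for d, doc in enumerate(docs):
        z_d = []
        for w in doc:
            k = rng.randrange(n_topics)
            z_d.append(k)
            n_dk[d][k] += 1
            n_kw[k][word_id[w]] += 1
            n_k[k] += 1
        z.append(z_d)

    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                v = word_id[w]
                k = z[d][i]
                # Remove the token's current assignment from the counts.
                n_dk[d][k] -= 1
                n_kw[k][v] -= 1
                n_k[k] -= 1
                # Conditional probability of each topic given all other tokens.
                weights = [
                    (n_dk[d][t] + alpha) * (n_kw[t][v] + beta) / (n_k[t] + n_words * beta)
                    for t in range(n_topics)
                ]
                k = rng.choices(range(n_topics), weights=weights)[0]
                z[d][i] = k
                n_dk[d][k] += 1
                n_kw[k][v] += 1
                n_k[k] += 1

    # Smoothed topic weights per document.
    doc_topic = [
        [(n_dk[d][t] + alpha) / (len(doc) + n_topics * alpha) for t in range(n_topics)]
        for d, doc in enumerate(docs)
    ]
    return doc_topic, vocab, n_kw

# Hypothetical mini-corpus: two "collider" abstracts and two "cosmology" abstracts.
docs = [
    "higgs boson collider decay higgs".split(),
    "collider jet decay boson".split(),
    "galaxy redshift cosmology galaxy".split(),
    "cosmology inflation redshift".split(),
]
doc_topic, vocab, topic_word = lda_gibbs(docs, n_topics=2)
for weights in doc_topic:
    print([round(w, 2) for w in weights])
```

Each printed row is the topic weight vector of one paper. In practice a library implementation would replace this sampler; the sketch only shows what "topic weights of a paper" means.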

This allows us to determine each user's personal preference for new papers according to their topic weight distributions. We have created the web interface IArxiv.org, where users can read personally sorted daily Arxiv releases (and more) while the algorithm learns their preferences, therefore yielding a more accurate sorting every day. The current IArxiv.org version runs on the categories astro-ph, gr-qc, hep-ph and hep-th.
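One simple way to picture the sorting step is sketched below. It assumes the topic weights of each paper are already available, takes a user's topic preference to be the rating-weighted average of the topic vectors of the papers that user rated, and scores new papers by the dot product with that preference. The actual scoring rule used by IArxiv.org is not given here, so the functions `user_preference` and `rank_new_papers`, the rating scale, and all numbers are hypothetical.

```python
def user_preference(topic_weights, ratings):
    """Rating-weighted average of the topic vectors of the papers a user rated.

    topic_weights maps paper id -> list of topic weights.
    ratings maps paper id -> non-negative rating (assumed scale).
    """
    n_topics = len(next(iter(topic_weights.values())))
    pref = [0.0] * n_topics
    total = 0.0
    for paper_id, rating in ratings.items():
        for t, w in enumerate(topic_weights[paper_id]):
            pref[t] += rating * w
        total += rating
    # With no ratings yet, fall back to a neutral all-zero preference.
    return [p / total for p in pref] if total else pref

def rank_new_papers(pref, new_papers):
    """Sort paper ids by decreasing dot product between preference and topic weights."""
    def score(paper_id):
        return sum(p * w for p, w in zip(pref, new_papers[paper_id]))
    return sorted(new_papers, key=score, reverse=True)

# Hypothetical topic weights over two topics for already-rated papers.
rated = {"A": [0.9, 0.1], "B": [0.2, 0.8]}
ratings = {"A": 2.0, "B": 1.0}   # this user liked paper A more than paper B
pref = user_preference(rated, ratings)

# Hypothetical topic weights of today's new papers.
new_papers = {"X": [0.1, 0.9], "Y": [0.8, 0.2]}
ranked = rank_new_papers(pref, new_papers)
print(ranked)  # ['Y', 'X']
```

Because the preference vector is recomputed whenever a new rating arrives, the ordering can adapt as the user keeps rating papers, which is the behaviour the abstract describes.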
 