BEGIN:VCALENDAR
VERSION:2.0
PRODID:researchseminars.org
CALSCALE:GREGORIAN
X-WR-CALNAME:researchseminars.org
BEGIN:VEVENT
SUMMARY:Matthew Lee (University of Bristol)
DTSTART:20201015T130000Z
DTEND:20201015T140000Z
DTSTAMP:20260404T111410Z
UID:Essex-DataScience/1
DESCRIPTION:Title: <a href="https://stable.researchseminars.org/talk/Essex
 -DataScience/1/">EpiViz: an implementation of Circos plots for epidemiolog
 ists</a>\nby Matthew Lee (University of Bristol) as part of (ED-3S) Essex 
 Data Science Seminar Series\n\n\nAbstract\nEpidemiology studies predominan
 tly focus on single exposure and single outcome associations. However\, bi
 ological pathways involve numerous processes and identifying meaningful in
 termediate associations that can be taken forward for further analysis is 
 complex. This is particularly the case for studies involving metabolomics 
 data\, as effects rarely occur in isolation. Gaining a global overview of
  hundreds of exposure/outcome associations may therefore aid downstream
  analyses. Visual inspection is one of the main modes of understanding
  global exposure/outcome associations. EpiViz is a wrapper that makes
  producing Circos plots simple and efficient for those new to programming
  and data visualisation.\n
LOCATION:https://stable.researchseminars.org/talk/Essex-DataScience/1/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Godwin Osuntoki (University of Essex)
DTSTART:20201022T130000Z
DTEND:20201022T140000Z
DTSTAMP:20260404T111410Z
UID:Essex-DataScience/2
DESCRIPTION:Title: <a href="https://stable.researchseminars.org/talk/Essex
 -DataScience/2/">Bayesian Analysis of chromosomal interactions in Hi-C dat
 a using the hidden Markov random field model</a>\nby Godwin Osuntoki (Univ
 ersity of Essex) as part of (ED-3S) Essex Data Science Seminar Series\n\n\
 nAbstract\nThere are different biological methods that have been developed
  over the years for analysis of the 3D structure of the DNA. Few computati
 onal and statistical methods have\, however\, been developed to analyze da
 ta generated using the Hi-C method. We follow statistical methodology to e
 xplore the Hi-C data. The Hi-C data is well suited to be analyzed using a
  finite mixture model. The Potts model\, a hidden Markov random field mode
 l\, was employed to analyze the hidden (latent) components. The hidden com
 ponents through the Potts model can be categorized into k components (k = 
 2\,3…\,K). Using the Metropolis-within-Gibbs approach to analyze the dat
 a\, the proposed method was able to detect interactions (short and long ra
 nge) and loops. A large part of the significant interactions that we detec
 t are found within Topologically Associating Domains\, which are among the
  3D structures known to occur in Hi-C data.\n
LOCATION:https://stable.researchseminars.org/talk/Essex-DataScience/2/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Nosheen Faiz (University of Essex)
DTSTART:20201105T140000Z
DTEND:20201105T150000Z
DTSTAMP:20260404T111410Z
UID:Essex-DataScience/4
DESCRIPTION:Title: <a href="https://stable.researchseminars.org/talk/Essex
 -DataScience/4/">Assessing how feature selection and hyper-parameters infl
 uence optimal trees ensemble and random projection</a>\nby Nosheen Faiz (U
 niversity of Essex) as part of (ED-3S) Essex Data Science Seminar Series\n
 \n\nAbstract\nOur work investigates the effect of feature selection on thr
 ee methods: Random Forest (Breiman 2001)\, Optimal Trees Ensemble (Khan et
  al 2016) and Random Projection (Canning and Samworth 2017) in high dimens
 ional settings. To this end\, LASSO has been considered for selecting the 
 most important features based on training data for dimension reduction. Ad
 ditionally\, the influence of various hyper-parameters regulating the thre
 e methods has also been assessed. Analysis on several benchmark datasets i
 s given to illustrate the phenomena. The results reveal that feature selec
 tion improves the predictive performance of the Random Forest and Random P
 rojection methods in addition to reducing the computational burden. The pe
 rformance of Optimal Trees Ensemble is less influenced by feature selectio
 n.\n
LOCATION:https://stable.researchseminars.org/talk/Essex-DataScience/4/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Peng Liu (University of Essex)
DTSTART:20201112T140000Z
DTEND:20201112T150000Z
DTSTAMP:20260404T111410Z
UID:Essex-DataScience/5
DESCRIPTION:Title: <a href="https://stable.researchseminars.org/talk/Essex
 -DataScience/5/">Ordering and Inequalities for Mixtures on Risk Aggregatio
 n</a>\nby Peng Liu (University of Essex) as part of (ED-3S) Essex Data Sci
 ence Seminar Series\n\n\nAbstract\nAggregation sets\, which represent mode
 l uncertainty due to unknown dependence\, are an important object in the s
 tudy of robust risk aggregation. In this talk\, we investigate ordering re
 lations between two aggregation sets for which the sets of marginals are r
 elated by two simple operations: distribution mixtures and quantile mixtur
 es. Intuitively\, these operations "homogenize" marginal distributions by
  making them similar. As a general conclusion from our results\, more
  "homogeneous" marginals lead to a larger aggregation set\, and thus more
  severe model uncertainty\, although the situation for quantile mixtures
  is much more complicated than that for distribution mixtures.\nWe
  proceed t
 o study inequalities on the worst-case values of risk measures in risk agg
 regation\, which represent conservative calculation of regulatory capital.
   Among other results\, we obtain an order relation on VaR under quantile 
 mixture for marginal distributions with monotone densities. Numerical resu
 lts are presented to visualize the theoretical results and further inspire
  some conjectures.\nFinally\, we discuss the connection of our results to 
 joint mixability and to merging p-values in multiple hypothesis testing.\n
LOCATION:https://stable.researchseminars.org/talk/Essex-DataScience/5/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Tolulope Fadina (University of Essex)
DTSTART:20210225T140000Z
DTEND:20210225T150000Z
DTSTAMP:20260404T111410Z
UID:Essex-DataScience/6
DESCRIPTION:Title: <a href="https://stable.researchseminars.org/talk/Essex
 -DataScience/6/">Symmetric measures of variability induced by risk measure
 s</a>\nby Tolulope Fadina (University of Essex) as part of (ED-3S) Essex D
 ata Science Seminar Series\n\n\nAbstract\nGeneral measures of variability 
 induced by risk measures are investigated for their potential applications
  to risk management. We emphasize the three classes of variability meas
 ures generated by the Value-at-Risk\, Expected Shortfall\, and the Expecti
 les. Their properties are explored\, and we obtain a characterization resu
 lt on general model spaces. Convergence properties and asymptotic normalit
 y of the empirical variability measures estimators are established. An app
 lication of the variability measures to financial data is also investigate
 d.\n
LOCATION:https://stable.researchseminars.org/talk/Essex-DataScience/6/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Ioana Olan (University of Cambridge)
DTSTART:20201126T140000Z
DTEND:20201126T150000Z
DTSTAMP:20260404T111410Z
UID:Essex-DataScience/7
DESCRIPTION:Title: <a href="https://stable.researchseminars.org/talk/Essex
 -DataScience/7/">Detecting the hierarchical structure of the cell nucleus<
 /a>\nby Ioana Olan (University of Cambridge) as part of (ED-3S) Essex Data
  Science Seminar Series\n\n\nAbstract\nChromatin consists of DNA wrapped a
 round histones and forms complex three-dimensional structures within the c
 ell nucleus with various degrees of compaction. Genes have been shown to b
 e repressed by their proximity to the nuclear periphery or activated by be
 ing in contact with special regulatory regions called enhancers. Thus the 
 relative positioning of genes and their interactions with other regions ar
 e very important in determining whether they are expressed or not. Interac
 tions between pairs of genomic regions have been studied using assays such
  as Hi-C\, which generate large matrices estimating interaction frequencie
 s. We use such interaction estimates as weights in a network whose nodes a
 re equally sized genomic regions and perform nested community detection in
  order to resolve the relative positioning of genomic regions of interest 
 and model the interior of the cell nucleus. Our biological model is cellul
 ar senescence\, a phenotype associated with dramatic changes in its chroma
 tin interactions network relative to normal cells. Senescence corresponds 
 to permanent cell cycle arrest and has been shown to act as a protective b
 arrier against tumourigenesis.\n
LOCATION:https://stable.researchseminars.org/talk/Essex-DataScience/7/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Josh Bull (University of Oxford)
DTSTART:20201203T140000Z
DTEND:20201203T150000Z
DTSTAMP:20260404T111410Z
UID:Essex-DataScience/8
DESCRIPTION:Title: <a href="https://stable.researchseminars.org/talk/Essex
 -DataScience/8/">Can maths tell us how to win at Fantasy Football?</a>\nby
  Josh Bull (University of Oxford) as part of (ED-3S) Essex Data Science Se
 minar Series\n\n\nAbstract\nFantasy Football is an online game played by m
 illions of people every year\, in which players attempt to predict the out
 come of football matches over the course of a season. To the surprise of e
 veryone (including myself)\, I was lucky enough to be crowned the winner o
 f the 2019-20 Fantasy Premier League\, one of the largest competitions in 
 the UK. As a researcher in Mathematical Oncology at the University of Oxfo
 rd\, people have asked me whether I used maths to win – while I followed
  some strategies at the time\, I didn’t have any proof that they were in
  some sense mathematically optimal. However\, mathematical modelling is a 
 tool which is capable of exploring exactly these kinds of questions: how c
 an we identify the best strategies to tackle complex problems? What types 
 of data are important to consider\, and how should we use them to inform o
 ur decisions? In this talk\, I’ll analyse how different quantitative app
 roaches can be used to tackle key questions in Fantasy Football\, and iden
 tify the strengths and weaknesses of these frameworks. Finally\, I’ll ad
 dress the question: Can maths tell us how to win at Fantasy Football?\n
LOCATION:https://stable.researchseminars.org/talk/Essex-DataScience/8/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Osama Mahmoud (University of Essex)
DTSTART:20210211T140000Z
DTEND:20210211T150000Z
DTSTAMP:20260404T111410Z
UID:Essex-DataScience/9
DESCRIPTION:Title: <a href="https://stable.researchseminars.org/talk/Essex
 -DataScience/9/">Slope-Hunter: A robust method for index-event bias correc
 tion in genome-wide association studies of conditional analyses</a>\nby Os
 ama Mahmoud (University of Essex) as part of (ED-3S) Essex Data Science Se
 minar Series\n\n\nAbstract\nBackground: Studying genetic associations with
  prognosis (e.g. survival\, subsequent events) is problematic due to selec
 tion bias - also termed index event bias or collider bias - whereby select
 ion on disease status can induce associations between causes of incidence 
 with prognosis. A current method for adjusting genetic associations for th
 is bias assumes there is no genetic correlation between incidence and prog
 nosis\, which may not be a plausible assumption.\n\nMethods: We propose an
  alternative\, the ‘Slope-Hunter’ approach\, which is unbiased even wh
 en there is genetic correlation between incidence and prognosis. Our appro
 ach has two stages. First\, we use cluster-based techniques to identify: v
 ariants affecting neither incidence nor prognosis (these should not suffer
  bias and only a random sub-sample of them are retained in the analysis)\;
  variants affecting prognosis only (excluded from the analysis). Second\, 
 we fit a cluster-based model to identify the class of variants only affect
 ing incidence\, and use this class to estimate the adjustment factor.
  The underlying assumption of our approach is that variants affe
 cting only incidence explain more variation in incidence than any group of
  variants with unique effects\, e.g. via same exposure\, on both incidence
  and prognosis.\n\nResults: Simulation studies showed that
  our approach eliminates the bias and outperforms alternatives in the pres
 ence of genetic correlation\, and performs as well as alternatives under n
 o genetic correlation when its assumption is satisfied. We applied the ‘
 Slope-Hunter’ method to a study of fasting blood insulin levels (FI) con
 ditional on body mass index (BMI)\, estimated the index event bias\, and a
 djusted conditional associations of the lead variants with FI. Our estimat
 es suggested that there were common causes of BMI and FI of concordant dir
 ections of effect\, that are in line with previously observed association
  between obesity and insulin resistance.\n\nConclusions: Our approach is u
 nbiased even in the presence of genetic correlation between incidence and 
 progression when the underlying assumptions hold. Bias-adjusting methods s
 hould be used to carry out causal analyses when conditioning on incidence.
 \n
LOCATION:https://stable.researchseminars.org/talk/Essex-DataScience/9/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Yanchun Bao (University of Essex)
DTSTART:20201217T140000Z
DTEND:20201217T150000Z
DTSTAMP:20260404T111410Z
UID:Essex-DataScience/10
DESCRIPTION:Title: <a href="https://stable.researchseminars.org/talk/Essex
 -DataScience/10/">Estimating mode effects from a sequential mixed-modes ex
 periment</a>\nby Yanchun Bao (University of Essex) as part of (ED-3S) Esse
 x Data Science Seminar Series\n\n\nAbstract\nThe large-scale household pan
 el study Understanding Society (The U.K. Household Longitudinal Study UKHL
 S) has\, until recently\, used interviewers to administer its questionnair
 es\, but is now in the process of allowing individuals to participate usin
 g the web. Survey data are known to be affected by survey mode so a sequen
 tial mode-effects experiment was carried out to evaluate the impact of th
 is change on the panel. In this talk we present a novel estimator and an
 alysis strategy to quantify the impact of mode across a wide range of vari
 ables\, with large mode effects on the covariance of a pair of variables u
 sed to indicate an increased risk that statistical analyses involving this
  pair will be affected.\n
LOCATION:https://stable.researchseminars.org/talk/Essex-DataScience/10/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Rafal Kulakowski (University of Essex)
DTSTART:20210204T140000Z
DTEND:20210204T150000Z
DTSTAMP:20260404T111410Z
UID:Essex-DataScience/11
DESCRIPTION:by Rafal Kulakowski (University of Essex) as part of (ED-3S) E
 ssex Data Science Seminar Series\n\nAbstract: TBA\n
LOCATION:https://stable.researchseminars.org/talk/Essex-DataScience/11/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Yassir Rabhi (University of Essex)
DTSTART:20201210T140000Z
DTEND:20201210T150000Z
DTSTAMP:20260404T111410Z
UID:Essex-DataScience/12
DESCRIPTION:Title: <a href="https://stable.researchseminars.org/talk/Essex
 -DataScience/12/">Copulas and measures of dependence under length-biased s
 ampling and informative censoring</a>\nby Yassir Rabhi (University of Esse
 x) as part of (ED-3S) Essex Data Science Seminar Series\n\n\nAbstract\nLen
 gth-biased data are often encountered in cross-sectional surveys and preva
 lent-cohort studies on disease durations. Under length-biased sampling sub
 jects with longer disease durations have a greater chance to be observed.
  As a result\, covariate values linked to the longer survivors are
  favoured b
 y the sampling mechanism. When the sampled durations are also subject to r
 ight censoring\, the censoring is informative. Modelling dependence struct
 ure without adjusting for these issues leads to biased results. In this ta
 lk\, I will present a study on copulas for modelling dependence when the c
 ollected data are length-biased and account for both informative censoring
  and covariate bias. I will address the nonparametric estimation of the bi
 variate distribution\, copula function and its density\, and Kendall and S
 pearman measures for right-censored length-biased data. The proposed estim
 ator of the bivariate CDF is a Hadamard-differentiable functional of two M
 LEs\, Kaplan-Meier and empirical CDF\, and inherits their efficiencies. Ba
 sed on this estimator\, we devise estimators for copula function and a loc
 al-polynomial estimator for copula density that accounts for boundary bias
 . In addition\, I will introduce estimators for Kendall and Spearman measu
 res. The weak convergence of the estimators will also be discussed. The pr
 oposed method is then applied to analyse a set of right-censored length-bi
 ased data on survival with dementia\, collected as part of a nationwide st
 udy in Canada.\n
LOCATION:https://stable.researchseminars.org/talk/Essex-DataScience/12/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Carolin Strobl (Universität Zürich)
DTSTART:20201119T140000Z
DTEND:20201119T150000Z
DTSTAMP:20260404T111410Z
UID:Essex-DataScience/13
DESCRIPTION:Title: <a href="https://stable.researchseminars.org/talk/Essex
 -DataScience/13/">A Statistician’s Botanical Garden - The Ideas behind T
 rees\, Model-Based Trees and Random Forests</a>\nby Carolin Strobl (Univer
 sität Zürich) as part of (ED-3S) Essex Data Science Seminar Series\n\n\n
 Abstract\nClassification and regression trees\, model-based trees and rand
 om forests are powerful statistical methods from the field of machine lear
 ning. They have been shown to achieve a high prediction accuracy\, especia
 lly in big data applications with many predictor variables and complex ass
 ociation patterns (such as nonlinear and higher-order interaction effects)
 . While individual trees are easy to interpret\, random forests are "black
  box" prediction methods. They do\, however\, provide variable importance 
 measures\, that are being used to judge the relevance of the individual pr
 edictor variables. The aim of this presentation is to introduce the ration
 ale behind trees\, model-based trees and random forests\, to illustrate th
 eir potential for high-dimensional data exploration\, e.g.\, in psychologi
 cal research\, but also to point out limitations and potential pitfalls in
  their practical application.\n
LOCATION:https://stable.researchseminars.org/talk/Essex-DataScience/13/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Shenggang Hu (University of Essex)
DTSTART:20221013T130000Z
DTEND:20221013T140000Z
DTSTAMP:20260404T111410Z
UID:Essex-DataScience/14
DESCRIPTION:Title: <a href="https://stable.researchseminars.org/talk/Essex
 -DataScience/14/">Statistical disaggregation - a Monte Carlo approach for 
 imputation under constraints</a>\nby Shenggang Hu (University of Essex) as
  part of (ED-3S) Essex Data Science Seminar Series\n\nLecture held in NTC.
 1.04.\n\nAbstract\nStatistical disaggregation has become more and more imp
 ortant for smart energy systems. A typical example of such disaggregation 
 problems is to learn energy consumption for a higher resolution level (dat
 a recorded at higher frequency) based on data at a lower resolution (data 
 recorded at lower frequency). Constrained models are often used in such pr
 oblems and they are often very useful compared to their unconstrained coun
 terparts in terms of reducing uncertainty and leading to an improvement of
  the overall performance. However\, these constrained models usually are n
 ot expressible as ordinary distributions due to their intractable density 
 functions which makes it hard to conduct further analysis. This paper intr
 oduces a novel constrained Monte Carlo sampling algorithm based on Langevi
 n diffusions and rejection sampling to solve the problem of sampling from 
 constrained models. This new method is then applied to a statistical disag
 gregation problem for an electricity consumption dataset. Our approach pr
 ovides excellent accuracy of data imputation\, based on our simulation stu
 dies and data analysis. The new method is also justified theoretically.\n
LOCATION:https://stable.researchseminars.org/talk/Essex-DataScience/14/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Prof Christian Martin Hennig (University of Bologna\, UCL)
DTSTART:20221103T140000Z
DTEND:20221103T150000Z
DTSTAMP:20260404T111410Z
UID:Essex-DataScience/15
DESCRIPTION:Title: <a href="https://stable.researchseminars.org/talk/Essex
 -DataScience/15/">Advances in using cluster analysis for species delimitat
 ion</a>\nby Prof Christian Martin Hennig (University of Bologna\, UCL) as 
 part of (ED-3S) Essex Data Science Seminar Series\n\nLecture held in STEM 
 3.1.\n\nAbstract\nBiological species are often delimited based on genetic 
 multilocus data using methods for inferring phylogenetic trees or model- o
 r distance-based cluster analysis. A major problem here is that genetic di
 ssimilarity does not only arise from separated species\, but also if subpo
 pulations of a species live in geographically distant areas without geneti
 c exchange. In any case\, be it using partitioning cluster analysis or hie
 rarchical trees\, it is a hard problem to decide the number of species\, a
 nd whether groups that are candidates for being species actually belong to
 gether. I will discuss the use of some new approaches for clustering and
  estimating the number of clusters for this problem\, focusing particul
 arly on testing whether observed genetic heterogeneity within a species ca
 ndidate group can be explained by geographical distance rather than consis
 ting of separate species. This requires hypothesis testing in a distance-d
 istance regression model. I will also discuss the integration of such a te
 sting routine in a fully automated method for species delimitation.\n\nRef
 erence\n\nHausdorf\, B\, Hennig\, C. Species delimitation and geography. M
 ol Ecol Resour. 2020\; 20: 950–960. https://doi.org/10.1111/1755-0998.1
 3184\n
LOCATION:https://stable.researchseminars.org/talk/Essex-DataScience/15/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Dr Johan van der Molen (University of Cambridge)
DTSTART:20221124T140000Z
DTEND:20221124T150000Z
DTSTAMP:20260404T111410Z
UID:Essex-DataScience/16
DESCRIPTION:Title: <a href="https://stable.researchseminars.org/talk/Essex
 -DataScience/16/">Dirichlet process mixture inconsistency for the number o
 f components: how worried should we be in practice?</a>\nby Dr Johan van d
 er Molen (University of Cambridge) as part of (ED-3S) Essex Data Science S
 eminar Series\n\nLecture held in STEM 3.1.\n\nAbstract\nBayesian nonparame
 tric mixture models are widely used for model-based clustering due to thei
 r flexibility and conceptual simplicity\, as well as the availability of e
 fficient sampling methods for performing inference. However\, recent work
  has established that such models have undesirable asymptotic properties r
 egarding the estimation of the number of clusters. For instance\, Dirichle
 t Process Mixtures (DPMs) have been shown to be inconsistent for the numbe
 r of clusters\, and overestimation of the number of clusters has been obse
 rved in practice for finite samples. Finite mixtures with a prior on the n
 umber of components - also known as Mixtures of Finite Mixtures (MFMs) - h
 ave been suggested as an asymptotically consistent alternative\, but the e
 ffects of model misspecification can still result in asymptotic inconsis
 tency and poor estimation of the number of clusters in practice. \n\nHere 
 we specifically focus on estimation of the number of clusters in Bayesian 
 nonparametric mixtures in practice\, including the impact of Markov chain 
 Monte Carlo (MCMC) post-processing algorithms for summarisation and identi
 fication of a final representative summary clustering. We consider practic
 al scenarios of low to moderate dimension\, through both simulation studie
 s and applications to real biomolecular data. In the situations we conside
 r\, we confirm that even when the parametric form of the mixture component
  distributions is correctly specified\, DPMs lead to mild overestimation o
 f the number of clusters for finite samples. However\, we also demonstrate
  that this can be corrected by common summarisation methods\, suggesting t
 hat applications of DPMs in practice may be more robust than the theory mi
 ght suggest. We show that\, for both DPMs and MFMs\, mixture component den
 sity misspecification typically leads to more dramatic overestimation\, wi
 th DPMs providing slightly worse estimates than MFMs\, but with the common
  pattern of “true” clusters in the data being split into smaller subcl
 usters due to additional mixture components being required to flexibly cap
 ture features of the data inadequately described by the misspecified model
 s. We consider implications for high-dimensional data analysis\, in which 
 simplifying assumptions that are commonly made in practice for computation
 al tractability (e.g. assuming a diagonal covariance matrix for Gaussian m
 ixture components) are also expected to result in model misspecification. 
 As part of our work\, we compare popular MCMC post-processing algorithms f
 or identifying a final summary clustering\, and show that although some of
  them have a positive impact on results\, others can introduce severe over
 estimation of the number of clusters\, even when the underlying posterior 
 distribution from which samples are being drawn is centred on the true num
 ber of clusters. This is joint work with Yannis Chaumeny\, Paul Kirk\, Ant
 hony Davidson.\n
LOCATION:https://stable.researchseminars.org/talk/Essex-DataScience/16/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Dr Alexei Vernitski (University of Essex)
DTSTART:20221027T130000Z
DTEND:20221027T140000Z
DTSTAMP:20260404T111410Z
UID:Essex-DataScience/17
DESCRIPTION:Title: <a href="https://stable.researchseminars.org/talk/Essex
 -DataScience/17/">Using machine learning to solve mathematical problems an
 d to search for examples and counterexamples in pure maths research</a>\nb
 y Dr Alexei Vernitski (University of Essex) as part of (ED-3S) Essex Data 
 Science Seminar Series\n\nLecture held in STEM 3.1.\n\nAbstract\nOur recen
 t research can be generally described as applying state-of-the-art technol
 ogies of machine learning to suitable mathematical problems. We use both r
 einforcement learning and supervised learning (underpinned by deep learnin
 g). As to mathematical problems we consider\, they include learning to unt
 angle a braid (this problem is not unlike the problem of solving the Rubik
  cube)\, learning to find the parity of a permutation (as compared to the 
 classical problem of deep learning of learning to find the parity bit of a
  binary array)\, comparing mathematical mistakes made by artificial intell
 igence with those made by human mathematicians\, etc.\n
LOCATION:https://stable.researchseminars.org/talk/Essex-DataScience/17/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Qiuyi Hong (University of Essex)
DTSTART:20221117T140000Z
DTEND:20221117T150000Z
DTSTAMP:20260404T111410Z
UID:Essex-DataScience/18
DESCRIPTION:Title: <a href="https://stable.researchseminars.org/talk/Essex
 -DataScience/18/">A Bilevel Game-Theoretic Decision-Making Framework for
  Strategic Retailers in Both Local and Wholesale Electricity Markets</a>\n
 by Qiuyi Hong (University of Essex) as part of (ED-3S) Essex Data Science
  Sem
 inar Series\n\nLecture held in STEM 3.1.\n\nAbstract\nIn this talk we prop
 ose a bilevel game-theoretic model for multiple strategic retailers partic
 ipating in both wholesale and local electricity markets while considering 
 customers’ switching behaviours. At the upper level\, each retailer maxi
 mizes its own profit by making optimal offering decisions in the retail ma
 rket and bidding decisions in the day-ahead wholesale (DAW) and local powe
 r exchange (LPE) markets. The interaction among multiple strategic retaile
 rs is formulated using the Bertrand competition model. For the lower level
 \, there are three optimisation problems. First\, the customers’ welfare
  maximisation problem with their switching behaviors is formulated to capt
 ure the demand responses from customers. Second\, a market-clearing proble
 m is formulated for the independent system operator (ISO) in the DAW marke
 t. Third\, a novel LPE market is developed for retailers to facilitate the
 ir power balancing. In addition\, the bilevel multi-leader multi-follower 
 Stackelberg game forms an equilibrium problem with equilibrium constraints
  (EPEC) problem\, which is solved by the diagonalization algorithm. Numeri
 cal results demonstrate the feasibility and effectiveness of the EPEC mode
 l and the importance of modeling customers’ switching behaviors. We corr
 oborate that incentivising customers’ switching behaviors and increasing
  the number of retailers facilitates retail competition\, which results in
  reducing strategic retailers’ retail prices and profits. Moreover\, the
  relationship between customers’ switching behaviors and welfare is refl
 ected by a balance between the electricity purchasing cost (i.e.\, electri
 city price) and the electricity consumption level.\n
LOCATION:https://stable.researchseminars.org/talk/Essex-DataScience/18/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Mateo Salles (University of Essex)
DTSTART:20230209T140000Z
DTEND:20230209T150000Z
DTSTAMP:20260404T111410Z
UID:Essex-DataScience/19
DESCRIPTION:Title: <a href="https://stable.researchseminars.org/talk/Essex
 -DataScience/19/">Supervised Learning for Untangling Braids</a>\nby Mateo 
 Salles (University of Essex) as part of (ED-3S) Essex Data Science Seminar
  Series\n\nLecture held in STEM 3.1.\n\nAbstract\nUntangling a braid is a 
 typical multi-step process\, and reinforcement learning can be used to tra
 in an agent to untangle braids. Here we present another approach. Starting
  from the untangled braid\, we produce a dataset of braids using breadth-f
 irst search and then apply behavioral cloning to train an agent on the out
 put of this search. As a result\, the (inverses of) steps predicted by the
  agent turn out to be an unexpectedly good method of untangling braids\, i
 ncluding those braids which did not feature in the dataset.\n
LOCATION:https://stable.researchseminars.org/talk/Essex-DataScience/19/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Dr. Peng Liu (University of Kent)
DTSTART:20230504T130000Z
DTEND:20230504T140000Z
DTSTAMP:20260404T111410Z
UID:Essex-DataScience/20
DESCRIPTION:Title: <a href="https://stable.researchseminars.org/talk/Essex
 -DataScience/20/">Optimal Smooth Approximation for Quantile Matrix Factori
 sation</a>\nby Dr. Peng Liu (University of Kent) as part of (ED-3S) Essex 
 Data Science Seminar Series\n\nLecture held in STEM 3.1.\n\nAbstract\nMatr
 ix Factorisation (MF) is essential to many estimation tasks. Most existing
  matrix factorisation methods focus on least squares matrix factorisation 
 (LSMF)\, which aims to minimise a smooth L2 loss between observations and 
 their dependent matrix measurement variables. In reality\, however\, L1 lo
 ss and check loss are widely used in regression to deal with outliers or o
 bservations contaminated by skewed or heavy-tailed noise. Although under c
 ertain conditions\, linear convergence to the global optimality can be est
 ablished for matrix factorisation under the L2 loss\, there is a lack of p
 rovably efficient algorithms for solving matrix factorisation under non-sm
 ooth losses. In this paper\, we investigate Quantile Matrix Factorization 
 (QMF)\, the counterpart of Quantile Regression in matrix estimation\, that
  adopts a tunable check loss and introduces robustness to matrix estimatio
 n for skewed and heavy-tailed observations\, which are prevalent in realit
 y. To deal with the non-smooth loss\, we propose Nesterov smoothed QMF (Ns
 QMF)\, extending Nesterov’s optimal smooth approximation technique to th
 e matrix factorisation setting. We then present an alternating minimizatio
 n algorithm to solve the smooth NsQMF efficiently. We mathematically prove
  that solving the smoothed NsQMF is equivalent to solving the original non
 -smooth QMF problem and that our proposed algorithm achieves linear conver
 gence to the global optimality of QMF. Numerical evaluations verify our th
 eoretical findings and demonstrate that NsQMF significantly outperforms th
 e commonly used LSMF and prior approximate smoothing heuristics for QMF un
 der various noise distributions.\n
LOCATION:https://stable.researchseminars.org/talk/Essex-DataScience/20/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Dr. Xiaochuan Yang (University of Brunel)
DTSTART:20230525T130000Z
DTEND:20230525T140000Z
DTSTAMP:20260404T111410Z
UID:Essex-DataScience/21
DESCRIPTION:Title: <a href="https://stable.researchseminars.org/talk/Essex
 -DataScience/21/">Some recent progress in random geometric graphs: beyond 
 the standard regimes</a>\nby Dr. Xiaochuan Yang (University of Brunel) as 
 part of (ED-3S) Essex Data Science Seminar Series\n\nLecture held in STEM 
 3.1.\n\nAbstract\nI will survey some recent joint works with Mathew Penros
 e (Bath) on the cluster structure of random geometric graphs in a regime 
 that is less discussed in the literature. The statistics of interest incl
 ude the number of k-components\, the number of components\, the number of 
 vertices in the giant component\, and the connectivity threshold. We show 
 LLN and normal/Poisson approximation by Stein's method.\n
LOCATION:https://stable.researchseminars.org/talk/Essex-DataScience/21/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Dr. Yufei Zhang (London School of Economics & Political Science)
DTSTART:20230511T130000Z
DTEND:20230511T140000Z
DTSTAMP:20260404T111410Z
UID:Essex-DataScience/22
DESCRIPTION:Title: <a href="https://stable.researchseminars.org/talk/Essex
 -DataScience/22/">Exploration-exploitation trade-off for continuous-time r
 einforcement learning</a>\nby Dr. Yufei Zhang (London School of Economics 
 & Political Science) as part of (ED-3S) Essex Data Science Seminar Series\
 n\nLecture held in STEM 3.1.\n\nAbstract\nRecently\, reinforcement learnin
 g (RL) has attracted substantial research interest. Much of the attention
  and success\, however\, has been for the discrete-time setting. Continuou
 s-time RL\, despite its natural analytical connection to stochastic contro
 ls\, has been largely unexplored\, with limited progress. In particular\
 , characterising sample efficiency for continuous-time RL algorithms remai
 ns a challenging and open problem.\n\nIn this talk\, we develop a framewor
 k to analyse model-based reinforcement learning in the episodic setting. W
 e then apply it to optimise exploration-exploitation trade-off for linear-
 convex RL problems\, and report sublinear (or even logarithmic) regret bou
 nds for a class of learning algorithms inspired by filtering theory. The a
 pproach is probabilistic\, involving analysing learning efficiency using c
 oncentration inequalities for correlated continuous-time observations\, an
 d applying stochastic control theory to quantify the performance gap betwe
 en applying greedy policies derived from estimated and true models.\n
LOCATION:https://stable.researchseminars.org/talk/Essex-DataScience/22/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Prof. Chenggui Yuan (Swansea University)
DTSTART:20230601T130000Z
DTEND:20230601T140000Z
DTSTAMP:20260404T111410Z
UID:Essex-DataScience/24
DESCRIPTION:Title: <a href="https://stable.researchseminars.org/talk/Essex
 -DataScience/24/">Numerical solutions of SDEs with irregular coefficients<
 /a>\nby Prof. Chenggui Yuan (Swansea University) as part of (ED-3S) Essex 
 Data Science Seminar Series\n\nLecture held in STEM 3.1.\n\nAbstract\nStoc
 hastic differential equations (SDEs) with irregular coefficients have been
  widely studied. In this talk\, I will discuss the strong convergence and 
 the weak convergence of SDEs with irregular coefficients. The convergenc
 e rate will be investigated under different irregular conditions on coeffi
 cients.\n
LOCATION:https://stable.researchseminars.org/talk/Essex-DataScience/24/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Dr. Robert Gaunt (The University of Manchester)
DTSTART:20230615T130000Z
DTEND:20230615T140000Z
DTSTAMP:20260404T111410Z
UID:Essex-DataScience/25
DESCRIPTION:Title: <a href="https://stable.researchseminars.org/talk/Essex
 -DataScience/25/">Normal approximation for the posterior in exponential fa
 milies</a>\nby Dr. Robert Gaunt (The University of Manchester) as part of 
 (ED-3S) Essex Data Science Seminar Series\n\nLecture held in STEM 3.1.\n\n
 Abstract\nIn this talk I'll introduce quantitative Bernstein-von Mises typ
 e bounds on the normal approximation of the posterior distribution in expo
 nential family models when centering either around the posterior mode or a
 round the maximum likelihood estimator. Our bounds\, obtained through a ve
 rsion of Stein’s method\, are non-asymptotic\, and data-dependent\; they
  are of the correct order both in the total variation and Wasserstein dist
 ances\, as well as for approximations for expectations of smooth functions
  of the posterior. All our results are valid for univariate and multivaria
 te posteriors alike\, and do not require a conjugate prior setting. We ill
 ustrate our findings on a variety of exponential family distributions\, in
 cluding Poisson\, multinomial and normal distribution with unknown mean an
 d variance. The resulting bounds have an explicit dependence on the prior 
 distribution and on sufficient statistics of the data from the sample\, an
 d thus provide insight into how these factors may affect the quality of th
 e normal approximation.\n
LOCATION:https://stable.researchseminars.org/talk/Essex-DataScience/25/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Dr. Arthur Maheo (Amazon)
DTSTART:20230622T130000Z
DTEND:20230622T140000Z
DTSTAMP:20260404T111410Z
UID:Essex-DataScience/26
DESCRIPTION:Title: <a href="https://stable.researchseminars.org/talk/Essex
 -DataScience/26/">Benders decomposition for public transportation</a>\nby 
 Dr. Arthur Maheo (Amazon) as part of (ED-3S) Essex Data Science Seminar Se
 ries\n\nLecture held in STEM 3.1.\n\nAbstract\nCanberra (Australia) wants 
 to design a transportation network combining high-frequency buses with on-
 demand taxis. The resulting hub-and-shuttle network design problem is a la
 rge\, difficult mixed-integer program. We identified how to decompose the 
 problem – design first\, route second – and used a modern Benders deco
 mposition on the resulting formulation.\nThis new approach is orders of ma
 gnitude faster\, allowing us to solve full instances where a standard appr
 oach can only do small ones.\n
LOCATION:https://stable.researchseminars.org/talk/Essex-DataScience/26/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Prof Boris Mirkin (National Research University Higher School of E
 conomics)
DTSTART:20231006T120000Z
DTEND:20231006T130000Z
DTSTAMP:20260404T111410Z
UID:Essex-DataScience/27
DESCRIPTION:Title: <a href="https://stable.researchseminars.org/talk/Essex
 -DataScience/27/">Anomalous clustering at various data formats</a>\nby Pro
 f Boris Mirkin (National Research University Higher School of Economics) a
 s part of (ED-3S) Essex Data Science Seminar Series\n\nLecture held in 1N1
 .4.1.\n\nAbstract\nAnomalous clustering is a method for extracting cluster
 s one-by-one. It is an extension of the Principal Component Analysis meth
 od to zero-one matrix factorization settings. After a brief overview of va
 rious versions of the method\, including its extensions to similarity da
 ta\, spatial data\, and fuzzy clustering\, I am going to concentrate on a
  most recent development\, a triple-stage application of the approach to t
 he analysis of spatial-temporal patterns in a coastal oceanic phenomenon o
 f upwelling (see Nascimento et al. 2023).\n
LOCATION:https://stable.researchseminars.org/talk/Essex-DataScience/27/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Dr Jin Zhu (LSE)
DTSTART:20231019T130000Z
DTEND:20231019T140000Z
DTSTAMP:20260404T111410Z
UID:Essex-DataScience/28
DESCRIPTION:Title: <a href="https://stable.researchseminars.org/talk/Essex
 -DataScience/28/">A Tuning-Free Algorithm for Sparsity-Constraint Optimiza
 tion</a>\nby Dr Jin Zhu (LSE) as part of (ED-3S) Essex Data Science Semina
 r Series\n\nLecture held in STEM 3.1.\n\nAbstract\nSparsity-constraint opt
 imization has wide applicability in signal processing\, statistics\, and m
 achine learning. Existing fast algorithms must burdensomely tune parameter
 s\, such as the step size or the implementation of precise stop criteria\,
  which may be challenging to determine in practice. To address this issue\
 , we develop an algorithm named sparsity-constraint optimization via splic
 ing iteration (SCOPE) to optimize nonlinear differentiable objective fun
 ctio
 ns with strong convexity and smoothness in low dimensional subspaces. Algo
 rithmically\, the SCOPE algorithm converges effectively without tuning par
 ameters. Theoretically\, SCOPE has a linear convergence rate and converges
  to a solution that recovers the true support set when it correctly specif
 ies the sparsity. We also develop parallel theoretical results without res
 tricted-isometry-property-type conditions. We apply SCOPE’s versatility 
 and power to solve sparse quadratic optimization\, learn sparse classifier
 s\, and recover sparse Markov networks for binary variables. The numerical
  results on these specific tasks reveal that SCOPE perfectly identifies th
 e true support set with a 10–1000 speedup over the standard exact solver
 \, confirming SCOPE’s algorithmic and theoretical merits. Our open-sourc
 e Python package scope based on C++ implementation is publicly available o
 n GitHub\, reaching a ten-fold speedup on the competing convex relaxation 
 methods implemented by the cvxpy library.\n
LOCATION:https://stable.researchseminars.org/talk/Essex-DataScience/28/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Dr Shenggang Hu (University of Warwick)
DTSTART:20231026T130000Z
DTEND:20231026T140000Z
DTSTAMP:20260404T111410Z
UID:Essex-DataScience/29
DESCRIPTION:Title: <a href="https://stable.researchseminars.org/talk/Essex
 -DataScience/29/">Differential Privacy of Bayesian Posterior under Contami
 nation</a>\nby Dr Shenggang Hu (University of Warwick) as part of (ED-3S) 
 Essex Data Science Seminar Series\n\nLecture held in STEM 3.1.\n\nAbstract
 \nIn recent years\, differential privacy has been adopted by tech companie
 s and governmental agencies as the standard for measuring privacy in algor
 ithms. We study the level of differential privacy in Bayesian posterior sa
 mpling setups. As opposed to the common privatization approach of injectin
 g Laplace/Gaussian noise into the output\, Huber's contamination model is 
 considered\, where we replace at random the data points with samples from 
 a heavy-tailed distribution. The derived bound for the differential privac
 y level in our approach matches the existing literature while lifting the 
 restriction on bounded observation space. We further consider the effect o
 f sample size on privacy level and conclude that asymptotically the contam
 ination approach is fully private at no cost of information loss.\n
LOCATION:https://stable.researchseminars.org/talk/Essex-DataScience/29/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Prof Wolfgang Hardle (Humboldt-Universität zu Berlin\, Germany)
DTSTART:20240118T140000Z
DTEND:20240118T150000Z
DTSTAMP:20260404T111410Z
UID:Essex-DataScience/30
DESCRIPTION:Title: <a href="https://stable.researchseminars.org/talk/Essex
 -DataScience/30/">Data Science in a Math-Less Digital Society</a>\nby Prof
  Wolfgang Hardle (Humboldt-Universität zu Berlin\, Germany) as part of (E
 D-3S) Essex Data Science Seminar Series\n\nLecture held in STEM 3.1.\n\nAb
 stract\nIn an increasingly digital and data-driven world\, the importance 
 of data science cannot be overstated. Data science\, by itself\, carries 
 a "push to analyse" button though\, that lets the analyst forget about 
 the "math behind the machine learning tools".\n\nWe cover a few example
 s\, where data science needs math in order to be understood and applied.\n
 \nBy the end of this talk\, attendees will gain a fresh perspective on dat
 a science's role in a math-less digital society. They will leave with prac
 tical insights\, tools\, and strategies to leverage data effectively\, fos
 tering a culture of data-driven decision-making that transcends mathematic
 al barriers.\n
LOCATION:https://stable.researchseminars.org/talk/Essex-DataScience/30/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Dr Dimitra Kosta (University of Edinburgh)
DTSTART:20231123T134500Z
DTEND:20231123T144500Z
DTSTAMP:20260404T111410Z
UID:Essex-DataScience/31
DESCRIPTION:Title: <a href="https://stable.researchseminars.org/talk/Essex
 -DataScience/31/">Maximum likelihood estimation of toric Fano varieties</a
 >\nby Dr Dimitra Kosta (University of Edinburgh) as part of (ED-3S) Essex 
 Data Science Seminar Series\n\nLecture held in Zoom.\n\nAbstract\nI will t
 alk about the maximum likelihood estimation problem for several classes of
  toric Fano models. I will start by exploring the maximum likelihood degre
 e for all 2-dimensional Gorenstein toric Fano varieties. I will show that 
 the ML degree is equal to the degree of the surface in every case except f
 or the quintic del Pezzo surface with two ordinary double points and provi
 de explicit expressions that allow one to compute the maximum likelihood e
 stimate in closed form whenever the ML degree is less than 5. I will explo
 re the reasons for the ML degree drop using A-discriminants and intersecti
 on theory. If there is time\, I will discuss toric Fano varieties as
 sociated to 3-valent phylogenetic trees and their ML degree.\n
LOCATION:https://stable.researchseminars.org/talk/Essex-DataScience/31/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Dr Richard Mann (University of Leeds)
DTSTART:20240201T140000Z
DTEND:20240201T150000Z
DTSTAMP:20260404T111410Z
UID:Essex-DataScience/33
DESCRIPTION:Title: <a href="https://stable.researchseminars.org/talk/Essex
 -DataScience/33/">Collective decision-making by rational agents</a>\nby Dr
  Richard Mann (University of Leeds) as part of (ED-3S) Essex Data Science 
 Seminar Series\n\nLecture held in STEM 3.1.\n\nAbstract\nThe decisions mad
 e by others are a valuable source of social information about the world\, 
 because they may have knowledge that we lack. This means that when one age
 nt makes a given choice\, it can induce others to do so as well. In this t
 alk I will describe a theory of rational agents who optimally utilise the 
 social information provided by others\, and explore the dynamics this prod
 uces at the individual and group level. In particular\, I will show how th
 e implicit beliefs such agents hold about the physical and social environm
 ent shape their response to each other\, and how changes to the environmen
 t that conflict with these beliefs can dramatically alter collective behav
 iour and impact the success of groups.\n
LOCATION:https://stable.researchseminars.org/talk/Essex-DataScience/33/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Dr Jinyu Tian (Macau University of Science and Technology)
DTSTART:20231214T140000Z
DTEND:20231214T150000Z
DTSTAMP:20260404T111410Z
UID:Essex-DataScience/34
DESCRIPTION:Title: <a href="https://stable.researchseminars.org/talk/Essex
 -DataScience/34/">Discreteness Problem in Adversarial Machine Learning</a>
 \nby Dr Jinyu Tian (Macau University of Science and Technology) as part of
  (ED-3S) Essex Data Science Seminar Series\n\n\nAbstract\nAdversarial exam
 ples (AEs) of deep neural networks (DNNs) are receiving ever-increasing at
 tention because they help in understanding the mechanism of DNNs and provi
 de a novel perspective of the ethics of deep learning applications. In man
 y real scenarios\, AEs have to be discrete (e.g. digital images). Most exi
 sting works achieve the discreteness relying on the discretization of cont
 inuous AEs. Unfortunately\, they cannot sufficiently control the spatial d
 ifference before and after discretizing continuous AEs\, which will lead t
 o two side-effects: degrading the attack capability of the obtained discre
 te AEs or introducing the extra distortion. \n\nIn this work\, we propose 
 an adversarial attack called Discrete Attack (DATK) to produce continuous 
 AEs tightly close to their discrete counterparts. Owing to the negligib
 le sp
 atial distance between them\, the expected discrete AEs perform with the s
 ame powerful attack capability as the continuous AEs without an extra dist
 ortion overhead. More precisely\, the proposed DATK generates AEs fro
 m a no
 vel perspective by directly modeling adversarial perturbations (APs) as di
 screte random variables. The AE generation problem thus reduces to the est
 imation of the distribution of discrete APs. Since this problem typically 
 is nondifferentiable\, we relax it with the proposed reparameterizing tri
 cks
  and obtain an approximated continuous distribution of discrete APs. Our t
 heoretical proof shows that\, by virtue of the continuous APs sampled fr
 om th
 e approximated distribution\, the spatial distance between the resultant c
 ontinuous AEs and their discrete counterparts is tightly bounded\, which 
 significantly overcomes the side-effects caused by the discretization. Ext
 ensive results over ImageNet\, CIFAR10 and TU Berlin Sketch demonstrate th
 e superiority of our method when attacking representative DNNs including V
 GG19\, ResNet50\, DenseNet121 and MobileNetV2. It is also verified that ou
 r DATK is more robust against the state-of-the-art adversarial detect
 ion me
 thods.\n
LOCATION:https://stable.researchseminars.org/talk/Essex-DataScience/34/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Dr Hong Duong (University of Birmingham)
DTSTART:20231130T140000Z
DTEND:20231130T150000Z
DTSTAMP:20260404T111410Z
UID:Essex-DataScience/35
DESCRIPTION:Title: <a href="https://stable.researchseminars.org/talk/Essex
 -DataScience/35/">Model Reduction of Complex Systems</a>\nby Dr Hong Duong
  (University of Birmingham) as part of (ED-3S) Essex Data Science Seminar 
 Series\n\nLecture held in STEM 3.1.\n\nAbstract\nComplex systems in nature
  and in applications (such as molecular systems\, crowd dynamics\, swarmin
 g\, opinion formation\, just to name a few) are often described by systems
  of stochastic differential equations (SDEs) and partial differential equa
 tions (PDEs). It is often analytically impossible or computationally prohi
 bitively expensive to deal with the full models due to their high dimensio
 nality (degrees of freedom\, number of involved parameters\, etc.). It is 
 thus of great importance to approximate such large and complex systems by 
 simpler and lower dimensional ones\, while still preserving the essential 
 information from the original model. This procedure is referred to as mode
 l reduction or coarse-graining in the literature. In this talk\, I will pr
 esent methods for qualitative and quantitative coarse-graining of several 
 SDEs and PDEs\, in the presence or absence of a scale-separation.\n
LOCATION:https://stable.researchseminars.org/talk/Essex-DataScience/35/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Dr Yuyu Chen (University of Melbourne)
DTSTART:20231116T130000Z
DTEND:20231116T140000Z
DTSTAMP:20260404T111410Z
UID:Essex-DataScience/36
DESCRIPTION:Title: <a href="https://stable.researchseminars.org/talk/Essex
 -DataScience/36/">Diversification of infinite-mean Pareto distributions</a
 >\nby Dr Yuyu Chen (University of Melbourne) as part of (ED-3S) Essex Data
  Science Seminar Series\n\nLecture held in Zoom.\n\nAbstract\nWe show the 
 perhaps surprising inequality that the weighted average of negatively depe
 ndent super-Pareto random variables\, possibly caused by triggering events
 \, is larger than one such random variable in the sense of first-order sto
 chastic dominance. The class of super-Pareto distributions is extremely he
 avy-tailed and it includes the class of infinite-mean Pareto distributions
 . We discuss several implications of this result via an equilibrium analys
 is in a risk exchange market. First\, diversification of super-Pareto loss
 es increases portfolio risk\, and thus a diversification penalty exists. S
 econd\, agents with super-Pareto losses will not share risks in a market e
 quilibrium. Third\, transferring losses from agents bearing super-Pareto l
 osses to external parties without any losses may arrive at an equilibrium 
 which benefits every party involved. The empirical studies show that our n
 ew inequality can be observed empirically for real datasets that fit well 
 with extremely heavy tails.\n
LOCATION:https://stable.researchseminars.org/talk/Essex-DataScience/36/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Dr Xiaochun Meng (University of Bath)
DTSTART:20240509T130000Z
DTEND:20240509T140000Z
DTSTAMP:20260404T111410Z
UID:Essex-DataScience/37
DESCRIPTION:Title: <a href="https://stable.researchseminars.org/talk/Essex
 -DataScience/37/">Angular Combining of Forecasts of Probability Distributi
 ons</a>\nby Dr Xiaochun Meng (University of Bath) as part of (ED-3S) Essex
  Data Science Seminar Series\n\nLecture held in STEM 3.1.\n\nAbstract\nWhe
 n multiple forecasts are available for a probability distribution\, foreca
 st combining enables a pragmatic synthesis of the information to extract t
 he wisdom of the crowd. A linear opinion pool has been widely used\, where
 by the combining is applied to the probability predictions of the distribu
 tional forecasts. However\, it has been argued that this will tend to deli
 ver overdispersed distributional forecasts\, prompting the combination to 
 be applied\, instead\, to the quantile predictions of the distributional f
 orecasts. Results from different applications are mixed\, leaving it as an
  empirical question whether to combine probabilities or quantiles. In this
  paper\, we present an alternative approach. Looking at the distributional
  forecasts\, combining the probability forecasts can be viewed as vertical
  combining\, with quantile forecast combining seen as horizontal combining
 . Our proposal is to allow combining to take place on an angle between the
  extreme cases of vertical and horizontal combining. We term this angular 
 combining. The angle is a parameter that can be optimized using a proper s
 coring rule. For implementation\, we provide a pragmatic numerical approac
 h and a simulation algorithm. Among our theoretical results\, we show that
 \, as with vertical and horizontal averaging\, angular averaging results i
 n a distribution with mean equal to the average of the means of the distri
 butions that are being combined. We also show that angular averaging produ
 ces a distribution with lower variance than vertical averaging\, and\, und
 er certain assumptions\, greater variance than horizontal averaging. We pr
 ovide empirical support for angular combining using weekly distributional 
 forecasts of Covid mortality.\n
LOCATION:https://stable.researchseminars.org/talk/Essex-DataScience/37/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Mahendra Singh Rajpoot (University of Essex)
DTSTART:20240125T140000Z
DTEND:20240125T150000Z
DTSTAMP:20260404T111410Z
UID:Essex-DataScience/38
DESCRIPTION:Title: <a href="https://stable.researchseminars.org/talk/Essex
 -DataScience/38/">Large Language Models: A Stepping Stone for AGI!</a>\nby
  Mahendra Singh Rajpoot (University of Essex) as part of (ED-3S) Essex Dat
 a Science Seminar Series\n\nLecture held in STEM 3.1.\n\nAbstract\nIn the 
 rapidly evolving landscape of Artificial Intelligence (AI)\, Large Languag
 e Models (LLMs) have emerged as a transformative force\, showcasing remark
 able capabilities in natural language understanding and generation. This p
 resentation delves into the pivotal role that LLMs play as a stepping ston
 e towards achieving Artificial General Intelligence (AGI). We explore the 
 fundamental principles\, applications\, and underlying mechanisms that pro
 pel LLMs while contemplating their implications for the broader goal of AG
 I. The talk will navigate through recent advancements\, challenges\, and e
 thical considerations in harnessing the potential of LLMs\, ultimately env
 isioning their contribution to the evolution of comprehensive artificial i
 ntelligence.\n
LOCATION:https://stable.researchseminars.org/talk/Essex-DataScience/38/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Dr Yi Zhang (University of Birmingham)
DTSTART:20240425T130000Z
DTEND:20240425T140000Z
DTSTAMP:20260404T111410Z
UID:Essex-DataScience/40
DESCRIPTION:Title: <a href="https://stable.researchseminars.org/talk/Essex
 -DataScience/40/">On discounted Markov decision processes and their extens
 ions</a>\nby Dr Yi Zhang (University of Birmingham) as part of (ED-3S) Ess
 ex Data Science Seminar Series\n\nLecture held in 4SW.6.28.\n\nAbstract\nT
 he theory for discounted Markov decision processes (MDPs) has been well de
 veloped. In this talk we review some basic results concerning their occupa
 tion measures\, which are convenient for the studies of optimal control pr
 oblems with constraints. After that\, we discuss the possibility of their 
 extensions to more general models (uniformly absorbing MDPs\, absorbing MD
 Ps\, or more general MDPs with total criteria). The studies of absorbing M
 DPs have been active recently.\n
LOCATION:https://stable.researchseminars.org/talk/Essex-DataScience/40/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Dr Kareemah Chopra (University of Essex)
DTSTART:20240208T140000Z
DTEND:20240208T150000Z
DTSTAMP:20260404T111410Z
UID:Essex-DataScience/41
DESCRIPTION:Title: <a href="https://stable.researchseminars.org/talk/Essex
 -DataScience/41/">[Cancelled] The Bunching Behaviour of Cows</a>\nby Dr Ka
 reemah Chopra (University of Essex) as part of (ED-3S) Essex Data Science 
 Seminar Series\n\nLecture held in STEM 3.1.\n\nAbstract\nBunching behavior
  in cattle may occur for several reasons including enabling social interac
 tions\, a response to stress or danger\, or due to shared interest in reso
 urces such as feeding or watering areas. There is evidence in pasture graz
 ed cattle that bunching may occur more frequently at higher ambient temper
 atures\, possibly due to sharing of fly-load or to seek shade from the dir
 ect sun under heat stress conditions. Here we demonstrate how bunching beh
 avior is associated with higher ambient temperatures in a barn-housed UK d
 airy herd. A real-time local positioning system (RTLS) was used\, as part 
 of a precision livestock farming (PLF) approach\, to track the spatial pos
 ition and activity of a commercial dairy herd (c100 cows) in a freestall b
 arn continuously at high temporal resolution for 4 mo between August and N
 ovember 2014. Bunching was determined using 4 different spatial measures d
 etermined on an hourly basis: herd full and core range size\, mean herd in
 ter-cow distance (ICD)\, and mean herd nearest neighbor distance (NND). Fo
 r hourly mean ambient temperatures above 20°C\, the herd showed higher bu
 nching behavior with increasing ambient temperature (i.e.\, reduced full a
 nd core range size\, ICD\, and NND). Aggregated space-use intensity was fo
 und to positively correlate with localized variations in temperature acros
 s the barn (as measured by animal mounted sensors)\, but the level of corr
 elation decreased at higher ambient barn temperatures. Bunching behavior m
 ay increase localized temperatures experienced by individuals and hence ma
 y be a maladaptive behavioral response in housed dairy cattle\, which are 
 known to suffer heat stress at higher temperatures. Our study is the first
  to use high-resolution positional data to provide evidence of association
 s between bunching behavior and higher ambient temperatures for a barn-hou
 sed dairy herd in a temperate region (UK). Further studies are needed to e
 xplore the exact mechanisms for this response to inform both welfare and p
 roduction management.\n
LOCATION:https://stable.researchseminars.org/talk/Essex-DataScience/41/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Professor Richard J. Samworth (University of Cambridge)
DTSTART:20240229T140000Z
DTEND:20240229T150000Z
DTSTAMP:20260404T111410Z
UID:Essex-DataScience/42
DESCRIPTION:Title: <a href="https://stable.researchseminars.org/talk/Essex
 -DataScience/42/">Isotonic subgroup selection</a>\nby Professor Richard J.
  Samworth (University of Cambridge) as part of (ED-3S) Essex Data Science 
 Seminar Series\n\nLecture held in STEM 3.1.\n\nAbstract\nGiven a sample of
  covariate-response pairs\, we consider the subgroup selection problem of 
 identifying a subset of the covariate domain where the regression function
  exceeds a pre-determined threshold. We introduce a computationally-feasib
 le approach for subgroup selection in the context of multivariate isotonic
  regression based on martingale tests and multiple testing procedures for 
 logically-structured hypotheses. Our proposed procedure satisfies a non-as
 ymptotic\, uniform Type I error rate guarantee with power that attains the
  minimax optimal rate up to poly-logarithmic factors. Extensions cover cla
 ssification\, isotonic quantile regression and heterogeneous treatment ef
 fect settings.\n
LOCATION:https://stable.researchseminars.org/talk/Essex-DataScience/42/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Professor Edward Rochead (Defence Science and Technology Laborator
 y)
DTSTART:20240307T140000Z
DTEND:20240307T150000Z
DTSTAMP:20260404T111410Z
UID:Essex-DataScience/43
DESCRIPTION:Title: <a href="https://stable.researchseminars.org/talk/Essex
 -DataScience/43/">The Alliance for Data Science Professionals</a>\nby Prof
 essor Edward Rochead (Defence Science and Technology Laboratory) as part o
 f (ED-3S) Essex Data Science Seminar Series\n\nLecture held in STEM 3.1.\n
 \nAbstract\nThe talk will begin by introducing the Alliance\, its members 
 and how it was formed. It will then explain how individuals can become acc
 redited as Advanced Data Science Professionals and also describe the plans
  being formed to accredit degrees. It is expected that the discussion woul
 d focus on how the AfDSP can work with academic colleagues and ensure accr
 editation is attractive and meaningful to them\, and also consider how it 
 may feed into the employability of graduates in relevant disciplines.\n
LOCATION:https://stable.researchseminars.org/talk/Essex-DataScience/43/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Laurel Ariane Regibeau-Rockett (Stanford University)
DTSTART:20240321T140000Z
DTEND:20240321T150000Z
DTSTAMP:20260404T111410Z
UID:Essex-DataScience/44
DESCRIPTION:Title: <a href="https://stable.researchseminars.org/talk/Essex
 -DataScience/44/">Hurricanes as heat engines</a>\nby Laurel Ariane Regibea
 u-Rockett (Stanford University) as part of (ED-3S) Essex Data Science Semi
 nar Series\n\nLecture held in STEM 3.1.\n\nAbstract\nHurricanes are danger
 ous and destructive atmospheric phenomena\, frequently causing loss of liv
 es worldwide. Improving our understanding of hurricanes can help improve h
 urricane forecasts and projections of their response to climate change. On
 e conceptual model of the hurricane\, which has supported major advancemen
 ts in hurricane science\, is the conceptualization of the hurricane as a h
 eat engine.  This theoretical framework supports research at the intersect
 ion of physics\, mathematics\, and atmospheric science. In this seminar\, 
 we will review this important theoretical model and some of its applicatio
 ns\, together with possible directions of future research in this interdis
 ciplinary domain.\n
LOCATION:https://stable.researchseminars.org/talk/Essex-DataScience/44/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Professor Mariachiara Di Cesare (University of Essex)
DTSTART:20240314T140000Z
DTEND:20240314T150000Z
DTSTAMP:20260404T111410Z
UID:Essex-DataScience/45
DESCRIPTION:Title: <a href="https://stable.researchseminars.org/talk/Essex
 -DataScience/45/">Institute of Public Health and Wellbeing opportunities t
 o enhance research for all</a>\nby Professor Mariachiara Di Cesare (Univer
 sity of Essex) as part of (ED-3S) Essex Data Science Seminar Series\n\nLec
 ture held in STEM 3.1.\n\nAbstract\nThe IPHW\, established in 2022\, repre
 sents a major strategic innovation for the University of Essex\, bringing 
 together our community of experts to provide pioneering leadership in the 
 production of world-class research\, knowledge exchange and impact. Workin
 g with regional\, national\, and international partners\, the IPHW is driv
 en by a collective goal of creating a healthier and fairer society. During
  this seminar we will discuss the IPHW mission\, vision\, and strategy and
  look at opportunities to enhance interdisciplinary research in the field 
 of health and wellbeing with a special focus on data science.\n
LOCATION:https://stable.researchseminars.org/talk/Essex-DataScience/45/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Dr Maria Brigida Ferraro (Sapienza University of Rome)
DTSTART:20240530T130000Z
DTEND:20240530T140000Z
DTSTAMP:20260404T111410Z
UID:Essex-DataScience/46
DESCRIPTION:Title: <a href="https://stable.researchseminars.org/talk/Essex
 -DataScience/46/">Two-mode clustering in a fuzzy setting: methods and clus
 ter validity indices</a>\nby Dr Maria Brigida Ferraro (Sapienza University
  of Rome) as part of (ED-3S) Essex Data Science Seminar Series\n\nLecture 
 held in STEM 3.1.\n\nAbstract\nThe aim of clustering is to find a partitio
 n of the rows (e.g. objects) of a data matrix based on the values assumed 
 on a set of variables (columns). Two objects belong to the same cluster if
  the corresponding rows are close to each other according to a certain met
 ric based on all the variables. However\, it can be reasonable to seek clu
 sters such that objects assigned to the same cluster are close to each oth
 er with respect to a subset of variables. The research interest can also b
 e reversed\, i.e.\, the goal is to find clusters of variables close to ea
 ch other in terms of a subset of objects. Standard clustering algorithms a
 re not adequate to accomplish these tasks. For this purpose\, two-mode clu
 stering methods have been introduced. Two-mode clustering consists in simu
 ltaneously partitioning modes (e.g.\, objects and variables) of an observe
 d two-mode data matrix.\n\nIn the literature\, two-mode clustering methods
  have been extensively studied and extended along various directions. Mos
 t of them are based on the classical approach to clustering\, i.e.\, the o
 bjects (or the variables) are either assigned or not to the clusters. A mo
 re powerful and flexible exploratory approach is represented by introducin
 g fuzziness in the clustering process. In this case\, the objects (or the 
 variables) are no longer either assigned or not to the clusters\, but belo
 ng to the clusters with the so-called (fuzzy) membership degrees taking va
 lues in the interval [0\,1]. A high membership degree\, close to 1\, recog
 nizes an object (or variable) strongly assigned to a cluster\, i.e.\, an o
 bject (or variable) very close to the corresponding cluster prototype.\n\n
 Starting from the Double k-Means\, we propose a class of two-mode clusteri
 ng algorithms in a fuzzy framework\, including some robust proposals\, ta
 king into account that\, in this case\, different kinds of outliers exist
  and should be considered. In addition\, in order to evaluate the two fuz
 zy partitions and to choose the optimal numbers of clusters\, new cluster 
 validity indices are introduced. The proposed measures are defined in ter
 ms of the compactness within each cluster and separation between clusters.
  Starting from some well-known indices in standard fuzzy clustering\, som
 e generalizations to the two-mode case are addressed. The adequacy of the
  proposals is checked by means of simulation and real-case studies.\n
LOCATION:https://stable.researchseminars.org/talk/Essex-DataScience/46/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Dr Mohamed Bader (University of Portsmouth)
DTSTART:20240627T130000Z
DTEND:20240627T140000Z
DTSTAMP:20260404T111410Z
UID:Essex-DataScience/47
DESCRIPTION:Title: <a href="https://stable.researchseminars.org/talk/Essex
 -DataScience/47/">Beyond Theory: Machine Learning and AI in Action for Ear
 ly Behavior and Outcome Prediction</a>\nby Dr Mohamed Bader (University of
  Portsmouth) as part of (ED-3S) Essex Data Science Seminar Series\n\nLectu
 re held in 1N1.4.1.\n\nAbstract\nThis talk explores the practical applicat
 ion of machine learning and AI in predicting early behaviors and outcomes 
 across healthcare\, digital marketing\, and industrial sectors. In healthc
 are\, particularly ICUs\, we discuss how AI models forecast patient outcom
 es and improve resource management. In digital marketing\, the focus is on
  how AI anticipates consumer behaviors to optimize marketing strategies. L
 astly\, in industrial applications\, we examine AI's role in predicting ma
 intenance needs and enhancing operational reliability.\n
LOCATION:https://stable.researchseminars.org/talk/Essex-DataScience/47/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Professor Guy Nason (Imperial College London)
DTSTART:20240516T130000Z
DTEND:20240516T140000Z
DTSTAMP:20260404T111410Z
UID:Essex-DataScience/48
DESCRIPTION:Title: <a href="https://stable.researchseminars.org/talk/Essex
 -DataScience/48/">Network Time Series</a>\nby Professor Guy Nason (Imperia
 l College London) as part of (ED-3S) Essex Data Science Seminar Series\n\n
 Lecture held in STEM 3.1.\n\nAbstract\nA network time series is a multivar
 iate time series where the individual series are known to be linked by som
 e underlying network structure. Sometimes this network is known a priori\,
  and sometimes the network has to be created\, often inferred from the mul
 tivariate series itself. Network time series are becoming increasingly com
 mon\, long\, and collected over a large number of variables. We are partic
 ularly interested in network time series whose network structure changes o
 ver time.\n\nWe describe some recent developments in the modeling of netwo
 rk time series via generalized network autoregressive (GNAR) process model
 s. These models use regular autoregressive links between a variable and it
 s past and between a variable and the past of its neighbours. GNAR models 
 are highly parsimonious and\, hence\, work well for short series or those 
 afflicted by worrying amounts of missing data. For the same reason\, they 
 tend not to overfit and often exhibit excellent forecasting performance\, 
 especially when compared to alternatives such as vector autoregressive mod
 els.\n\nThis talk explains the GNAR model and some interesting variants. W
 e introduce some new tools for model selection and exhibit their use on ep
 idemic and economic data.\n
LOCATION:https://stable.researchseminars.org/talk/Essex-DataScience/48/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Dr Kareemah Chopra (University of Essex)
DTSTART:20240502T130000Z
DTEND:20240502T140000Z
DTSTAMP:20260404T111410Z
UID:Essex-DataScience/50
DESCRIPTION:Title: <a href="https://stable.researchseminars.org/talk/Essex
 -DataScience/50/">The Bunching Behaviour of Cows</a>\nby Dr Kareemah Chopr
 a (University of Essex) as part of (ED-3S) Essex Data Science Seminar Seri
 es\n\n\nAbstract\nBunching behavior in cattle may occur for several reason
 s including enabling social interactions\, a response to stress or danger\
 , or due to shared interest in resources such as feeding or watering areas
 . There is evidence in pasture grazed cattle that bunching may occur more 
 frequently at higher ambient temperatures\, possibly due to sharing of fly
 -load or to seek shade from the direct sun under heat stress conditions. H
 ere we demonstrate how bunching behavior is associated with higher ambient
  temperatures in a barn-housed UK dairy herd. A real-time local positionin
 g system (RTLS) was used\, as part of a precision livestock farming (PLF) 
 approach\, to track the spatial position and activity of a commercial dair
 y herd (c. 100 cows) in a freestall barn continuously at high temporal res
 olution for 4 months between August and November 2014. Bunching was determ
 ined u
 sing 4 different spatial measures determined on an hourly basis: herd full
  and core range size\, mean herd inter-cow distance (ICD)\, and mean herd 
 nearest neighbor distance (NND). For hourly mean ambient temperatures abov
 e 20°C\, the herd showed higher bunching behavior with increasing ambient
  temperature (i.e.\, reduced full and core range size\, ICD\, and NND). Ag
 gregated space-use intensity was found to positively correlate with locali
 zed variations in temperature across the barn (as measured by animal mount
 ed sensors)\, but the level of correlation decreased at higher ambient bar
 n temperatures. Bunching behavior may increase localized temperatures expe
 rienced by individuals and hence may be a maladaptive behavioral response 
 in housed dairy cattle\, which are known to suffer heat stress at higher t
 emperatures. Our study is the first to use high-resolution positional data
  to provide evidence of associations between bunching behavior and higher 
 ambient temperatures for a barn-housed dairy herd in a temperate region (U
 K). Further studies are needed to explore the exact mechanisms for this re
 sponse to inform both welfare and production management.\n
LOCATION:https://stable.researchseminars.org/talk/Essex-DataScience/50/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Anusa Suwanwong (University of Essex)
DTSTART:20240620T130000Z
DTEND:20240620T140000Z
DTSTAMP:20260404T111410Z
UID:Essex-DataScience/51
DESCRIPTION:Title: <a href="https://stable.researchseminars.org/talk/Essex
 -DataScience/51/">A Gene Selection Method for Classification with Three Cl
 asses Using Proportional Overlapping Scores</a>\nby Anusa Suwanwong (Unive
 rsity of Essex) as part of (ED-3S) Essex Data Science Seminar Series\n\nLe
 cture held in STEM 3.1.\nAbstract: TBA\n
LOCATION:https://stable.researchseminars.org/talk/Essex-DataScience/51/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Dr Yu Wu (Southwest Jiaotong University\, China)
DTSTART:20241010T110000Z
DTEND:20241010T120000Z
DTSTAMP:20260404T111410Z
UID:Essex-DataScience/52
DESCRIPTION:Title: <a href="https://stable.researchseminars.org/talk/Essex
 -DataScience/52/">Efforts to overcome the curse of dimensionality in seque
 ntial decision-making problems</a>\nby Dr Yu Wu (Southwest Jiaotong Univer
 sity\, China) as part of (ED-3S) Essex Data Science Seminar Series\n\n\nAb
 stract\nSequential decision-making problems are widespread\, and solving t
 hem is essential for enhancing efficiency\, reducing costs\, and optimizin
 g resource allocation. However\, these problems are notoriously difficult 
 due to the “curse of dimensionality.” Given the diversity of sequentia
 l decision-making problems and the broad applicability of solution methods
 \, this talk will primarily focus on the complex Dynamic Vehicle Routing P
 roblem (DVRP). It will start by elucidating the specific challenges posed
  by the curse of dimensionality\, including the exponential growth of stat
 e space\, action space\, and transition probabilities. Then\, the talk wil
 l examine and discuss existing techniques to address these challenges\, su
 ch as state aggregation\, initial policy generation\, offline-online polic
 y improvement\, state (or state-action) value function representation\, an
 d methods for updating and leveraging probabilistic laws. Finally\, by mod
 ularly deconstructing\, updating\, and recombining these techniques\, this
  talk will propose new approaches to potentially overcome the curse of dim
 ensionality in the future.\n
LOCATION:https://stable.researchseminars.org/talk/Essex-DataScience/52/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Prof Tengyao Wang (LSE)
DTSTART:20241128T120000Z
DTEND:20241128T130000Z
DTSTAMP:20260404T111410Z
UID:Essex-DataScience/53
DESCRIPTION:Title: <a href="https://stable.researchseminars.org/talk/Essex
 -DataScience/53/">High-dimensional changepoint estimation with heterogeneo
 us missingness</a>\nby Prof Tengyao Wang (LSE) as part of (ED-3S) Essex Da
 ta Science Seminar Series\n\nLecture held in CTC 3.02.\n\nAbstract\nWe pro
 pose a new method for changepoint estimation in partially-observed\, high-
 dimensional time series that undergo a simultaneous change in mean in a sp
 arse subset of coordinates.  Our first methodological contribution is to i
 ntroduce a 'MissCUSUM' transformation\, that captures the interaction betw
 een the signal strength and the level of missingness in each coordinate.  
 In order to borrow strength across the coordinates\, we project these Miss
 CUSUM statistics along a direction found as the solution to an optimisatio
 n problem.  The changepoint can then be estimated as the location of the p
 eak of the projected series.  In a model that allows different missingness
  probabilities in different component series\, we identify that the key in
 teraction between the missingness and the signal is an observation-probabi
 lity-weighted sum of squares of the signal change in each coordinate.   Mo
 re specifically\, we prove that the angle between the estimated and oracle
  projection directions\, as well as the changepoint location error\, are c
 ontrolled with high probability by the sum of two terms\, both involving t
 his weighted sum of squares\, and representing the error incurred due to n
 oise and due to missingness respectively.  The striking effectiveness of o
 ur methodology is further demonstrated both on simulated data\, and on an 
 oceanographic data set.\n
LOCATION:https://stable.researchseminars.org/talk/Essex-DataScience/53/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Dr Eleni-Rosalina Andrinopoulou (Erasmus Medical Center Rotterdam)
DTSTART:20241205T120000Z
DTEND:20241205T130000Z
DTSTAMP:20260404T111410Z
UID:Essex-DataScience/54
DESCRIPTION:Title: <a href="https://stable.researchseminars.org/talk/Essex
 -DataScience/54/">Assessing Risk Indicators in Clinical Practice with Join
 t Models</a>\nby Dr Eleni-Rosalina Andrinopoulou (Erasmus Medical Center R
 otterdam) as part of (ED-3S) Essex Data Science Seminar Series\n\nLecture 
 held in https://essex-university.zoom.us/j/98548135065.\n\nAbstract\nThe i
 ncreasing availability of clinical measures (e.g.\, electronic medical rec
 ords) leads to collecting many different types of information. This inform
 ation mainly includes multiple longitudinal measurements collected during 
 follow-up visits of the patient to the clinic. Big data is the key element
  to new developments and precision medicine. Nowadays\, individualized\, d
 ynamic predictions are popular in different medical fields because they im
 prove patient care. In particular\, it is of high clinical interest to pre
 dict future measurements to assess recovery in patients who experienced a
  stroke or predict life expectancy for patients with Cystic Fibrosis.\n\nP
 hysicians collect a variety of measurements over time to assess the severi
 ty and progression of a disease. Even though all outcomes will be consider
 ed together intuitively\, they will usually be analyzed separately. It is 
 biologically relevant to study them together\; therefore\, it is more appr
 opriate to analyze them assuming a single statistical model. This\, howeve
 r\, poses many challenges. In particular\, different characteristics of th
 e patients' longitudinal profiles could influence the outcome(s) of intere
 st. For example\, the rate of change could be a better predictor than the 
 actual value. Therefore\, it is essential to assume the correct associatio
 n structure when obtaining dynamic predictions.\n
LOCATION:https://stable.researchseminars.org/talk/Essex-DataScience/54/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Dr Victoria Volodina (University of Exeter)
DTSTART:20241114T120000Z
DTEND:20241114T130000Z
DTSTAMP:20260404T111410Z
UID:Essex-DataScience/55
DESCRIPTION:Title: <a href="https://stable.researchseminars.org/talk/Essex
 -DataScience/55/">Polynomial structural equation models (SEMs) for decisio
 n support systems</a>\nby Dr Victoria Volodina (University of Exeter) as p
 art of (ED-3S) Essex Data Science Seminar Series\n\nLecture held in CTC 3.
 02.\n\nAbstract\nProbabilistic graphical models are widely used to model c
 omplex systems with uncertainty. In a decision-support context\, Bayesian 
 analysis may focus on obtaining beliefs from a decision maker and multiple
  expert panels about individual variables or propagating the effects of ne
 w evidence through the graph G\, coherently updating beliefs in vertices t
 hat are not yet established. The associated computations can become cumber
 some. To facilitate the decision-making process\, we propose adopting the 
 polynomial structural equation model (SEM) to depict complex relationships
  between individual variables in graph G and with a utility function in po
 lynomial form. Since the marginal posterior distributions of individual va
 riables can become analytically intractable\, we develop a nonparametric m
 essage-passing algorithm that propagates information throughout the graph 
 using only moments\, enabling exact calculation of expected utility scores
 . We illustrate the proposed methodology with examples and an application 
 to decision problems in energy planning and healthcare.\n
LOCATION:https://stable.researchseminars.org/talk/Essex-DataScience/55/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Dr Jonas Latz (Manchester University)
DTSTART:20250227T120000Z
DTEND:20250227T130000Z
DTSTAMP:20260404T111410Z
UID:Essex-DataScience/56
DESCRIPTION:Title: <a href="https://stable.researchseminars.org/talk/Essex
 -DataScience/56/">Cancelled</a>\nby Dr Jonas Latz (Manchester University) 
 as part of (ED-3S) Essex Data Science Seminar Series\n\nLecture held in CT
 C 3.02.\nAbstract: TBA\n
LOCATION:https://stable.researchseminars.org/talk/Essex-DataScience/56/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Dr Jianya Lu (University of Essex)
DTSTART:20241017T110000Z
DTEND:20241017T120000Z
DTSTAMP:20260404T111410Z
UID:Essex-DataScience/57
DESCRIPTION:Title: <a href="https://stable.researchseminars.org/talk/Essex
 -DataScience/57/">Distribution estimation for time series via DNN-based GA
 Ns with an application to change-point estimation</a>\nby Dr Jianya Lu (Un
 iversity of Essex) as part of (ED-3S) Essex Data Science Seminar Series\n\
 nLecture held in CTC 3.02.\n\nAbstract\nGenerative adversarial network
 s (GANs) have recently been applied to estimating the distribution of inde
 pendent and identically distributed data\, and have attracted a lot of res
 earch attention. In this talk\,  I'll demonstrate the effectiveness of GAN
 s in estimating the distribution of stationary time series. Theoretically\
 , we derive a non-asymptotic error bound for the Deep Neural Network (DNN)
 -based GANs estimator for the stationary distribution of the time series. 
 Our approach is based on the blocking technique and the $m$-dependence app
 roximation technique that divides the time series into interlacing blocks 
 of equal size and then constructs independent blocks. Based on the theoret
 ical analysis\, we propose an algorithm for estimating the change-point in
  time series distribution. Numerical results of Monte Carlo experiments an
 d real data application are given to validate our theory and algorithm.\n
LOCATION:https://stable.researchseminars.org/talk/Essex-DataScience/57/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Prof. Michael J. Daniels (University of Florida\, USA)
DTSTART:20250213T120000Z
DTEND:20250213T130000Z
DTSTAMP:20260404T111410Z
UID:Essex-DataScience/58
DESCRIPTION:Title: <a href="https://stable.researchseminars.org/talk/Essex
 -DataScience/58/">A Bayesian nonparametric approach for evaluating the cau
 sal effect of treatment in observational cohort studies with semi-competin
 g risks</a>\nby Prof. Michael J. Daniels (University of Florida\, USA) as
  part of (ED-3S) Essex Data Science Seminar Series\n\nLecture held in CT
 C 3
 .02.\n\nAbstract\nWe develop a Bayesian nonparametric (BNP) approach to ev
 aluate the causal effect of treatment where a nonterminal event may be cen
 sored by one (or more) terminal event(s)\, but not vice versa (i.e.\, semi
 -competing risks). Based on the idea of principal stratification\, we defi
 ne a novel estimand for the causal effect of treatment on the nonterminal 
 event. We introduce identification assumptions (using factorizations based
  on vine copulas) indexed by  sensitivity parameters and show how to draw 
 inference using our BNP approach. We illustrate our methodology using data
  from a cardiovascular cohort study.\n
LOCATION:https://stable.researchseminars.org/talk/Essex-DataScience/58/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Dr Soudeep Deb (Indian Institute of Management Bangalore)
DTSTART:20241212T120000Z
DTEND:20241212T130000Z
DTSTAMP:20260404T111410Z
UID:Essex-DataScience/59
DESCRIPTION:Title: <a href="https://stable.researchseminars.org/talk/Essex
 -DataScience/59/">Structural breaks in the spatial network of real estate 
 dynamics: A study of UK property transactions</a>\nby Dr Soudeep Deb (Indi
 an Institute of Management Bangalore) as part of (ED-3S) Essex Data Scienc
 e Seminar Series\n\nLecture held in CTC 3.02.\n\nAbstract\nThe real estate
  market is a dynamic system shaped by various economic\, social\, and envi
 ronmental factors\, making the detection of structural changes crucial for
  understanding market trends\, managing risks\, and guiding investment and
  policy decisions. This study examines temporal changes in Greater London
 ’s real estate market using weekly data at the MSOA (Middle Layer Super 
 Output Areas) level through a two-stage methodology. First\, Local Indicat
 ors of Spatial Association (LISA) are applied to identify significant clus
 ters of high and low property prices and spatial outliers\, which are then
  integrated into a network framework incorporating geographical distance. 
 Second\, structural breaks in market dynamics are detected using network L
 aplacians\, capturing both gradual and abrupt shifts over time. These find
 ings are further leveraged to develop a localized house price index and a 
 data-driven zoning approach\, offering enhanced tools for evaluating prope
 rty prices with greater accuracy\, benefiting both investors and policymak
 ers.\n
LOCATION:https://stable.researchseminars.org/talk/Essex-DataScience/59/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Dr Wasiur R. KhudaBukhsh (University of Nottingham)
DTSTART:20250515T110000Z
DTEND:20250515T120000Z
DTSTAMP:20260404T111410Z
UID:Essex-DataScience/60
DESCRIPTION:Title: <a href="https://stable.researchseminars.org/talk/Essex
 -DataScience/60/">Enzyme kinetic reactions as interacting particle systems
 : Stochastic averaging and parameter inference</a>\nby Dr Wasiur R. KhudaB
 ukhsh (University of Nottingham) as part of (ED-3S) Essex Data Science Sem
 inar Series\n\nLecture held in CTC 3.02.\n\nAbstract\nWe consider a stocha
 stic model of multistage Michaelis--Menten (MM) type enzyme kinetic reacti
 ons describing the conversion of substrate molecules to a product through 
 several intermediate species. The high-dimensional\, multiscale nature of 
 these reaction networks presents significant computational challenges\, es
 pecially in statistical estimation of reaction rates. This difficulty is a
 mplified when direct data on system states are unavailable\, and one only 
 has access to a random sample of product formation times. To address this\
 , we proceed in two stages. First\, under certain technical assumptions ak
 in to those made in the Quasi-steady-state approximation (QSSA) literature
 \, we prove two asymptotic results: a stochastic averaging principle that 
 yields a lower-dimensional model\, and a functional central limit theorem 
 that quantifies the associated fluctuations. Next\, for statistical infere
 nce of the parameters of the original MM reaction network\, we develop a m
 athematical framework involving an interacting particle system (IPS) and p
 rove a propagation of chaos result that allows us to write a product-form 
 likelihood function. The novelty of the IPS-based inference method is that
  it does not require information about the state of the system and works w
 ith only a random sample of product formation times. We provide numerical 
 examples to illustrate the efficacy of the theoretical results. Preprint: 
 https://arxiv.org/abs/2409.06565\n
LOCATION:https://stable.researchseminars.org/talk/Essex-DataScience/60/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Kirstii Badcock (University of Essex)
DTSTART:20241121T120000Z
DTEND:20241121T130000Z
DTSTAMP:20260404T111410Z
UID:Essex-DataScience/61
DESCRIPTION:Title: <a href="https://stable.researchseminars.org/talk/Essex
 -DataScience/61/">SMSAS and Impact: Ensuring your research makes a differe
 nce</a>\nby Kirstii Badcock (University of Essex) as part of (ED-3S) Essex
  Data Science Seminar Series\n\nLecture held in CTC 3.02.\n\nAbstract\nI w
 ill give a short overview of national research assessments\, why impact is
  important and how it is scored in the REF drawing on examples of high-sco
 ring impact case studies taken from REF21 for Unit of Assessment 10.\n
LOCATION:https://stable.researchseminars.org/talk/Essex-DataScience/61/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Yi Xia (University of Essex)
DTSTART:20250116T120000Z
DTEND:20250116T130000Z
DTSTAMP:20260404T111410Z
UID:Essex-DataScience/62
DESCRIPTION:Title: <a href="https://stable.researchseminars.org/talk/Essex
 -DataScience/62/">A Numerical Investigation of Non-Convex Reinsurance Prob
 lems Using a Method of Homotopy Optimization with Perturbations and Ensemb
 les</a>\nby Yi Xia (University of Essex) as part of (ED-3S) Essex Data Sci
 ence Seminar Series\n\nLecture held in CTC 3.02.\nAbstract: TBA\n
LOCATION:https://stable.researchseminars.org/talk/Essex-DataScience/62/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Stuart McDonald (Longevity and Demographic Insights)
DTSTART:20250130T120000Z
DTEND:20250130T130000Z
DTSTAMP:20260404T111410Z
UID:Essex-DataScience/63
DESCRIPTION:Title: <a href="https://stable.researchseminars.org/talk/Essex
 -DataScience/63/">Mortality in the wake of the pandemic</a>\nby Stuart McD
 onald (Longevity and Demographic Insights) as part of (ED-3S) Essex Data S
 cience Seminar Series\n\nLecture held in CTC 3.02.\n\nAbstract\nStuart McD
 onald will discuss the role of actuaries in providing modelling and insigh
 ts and addressing misinformation during the Covid-19 pandemic. He will sho
 w the impact of the pandemic and resulting NHS pressures on UK mortality r
 at
 es\, and the challenge this presents for projecting future mortality.\n
LOCATION:https://stable.researchseminars.org/talk/Essex-DataScience/63/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Dr Rishideep Roy (University of Essex)
DTSTART:20250313T120000Z
DTEND:20250313T130000Z
DTSTAMP:20260404T111410Z
UID:Essex-DataScience/64
DESCRIPTION:Title: <a href="https://stable.researchseminars.org/talk/Essex
 -DataScience/64/">Real-time within game forecasting in football</a>\nby Dr
  Rishideep Roy (University of Essex) as part of (ED-3S) Essex Data Science
  Seminar Series\n\nLecture held in CTC 3.02.\n\nAbstract\nWe employ a Baye
 sian methodology to predict the results of soccer matches in real-time. Us
 ing sequential data of various events throughout the match\, we utilise a 
 multinomial probit regression in a novel framework to estimate the time-va
 rying impact of covariates and to forecast the outcome. English Premier Le
 ague data from eight seasons are used to evaluate the efficacy of our meth
 od. Different evaluation metrics establish that the proposed model outperf
 orms potential competitors inspired by existing statistical or machine lea
 rning algorithms. Additionally\, we apply robustness checks to demonstrate
  the model’s accuracy across various scenarios.\n
LOCATION:https://stable.researchseminars.org/talk/Essex-DataScience/64/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Dr Mengchu Li (University of Birmingham)
DTSTART:20250522T110000Z
DTEND:20250522T120000Z
DTSTAMP:20260404T111410Z
UID:Essex-DataScience/65
DESCRIPTION:Title: <a href="https://stable.researchseminars.org/talk/Essex
 -DataScience/65/">Differential privacy analysis of Langevin algorithms</a>
 \nby Dr Mengchu Li (University of Birmingham) as part of (ED-3S) Essex Dat
 a Science Seminar Series\n\nLecture held in CTC 3.02.\n\nAbstract\nIn this
  talk\, I will introduce the concept of differential privacy\, the prevail
 ing framework for developing statistical procedures while quantifying the 
 amount of privacy offered to each individual in the data set. Differential
  privacy guarantees are often achieved by injecting noise into determinist
 ic algorithms\, and this fact makes a large class of sampling algorithms n
 aturally private without any modifications. I will focus on the simplest u
 nadjusted Langevin algorithm and discuss several attempts to characterise 
 its privacy guarantees under the differential privacy framework.\n
LOCATION:https://stable.researchseminars.org/talk/Essex-DataScience/65/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Romina Hashami (University of Essex)
DTSTART:20251016T120000Z
DTEND:20251016T130000Z
DTSTAMP:20260404T111410Z
UID:Essex-DataScience/66
DESCRIPTION:Title: <a href="https://stable.researchseminars.org/talk/Essex
 -DataScience/66/">Can News Predict the Direction of Oil Price Volatility? 
 A Language Model Approach with SHAP Explanations</a>\nby Romina Hashami (U
 niversity of Essex) as part of (ED-3S) Essex Data Science Seminar Series\n
 \nLecture held in 4.311.\n\nAbstract\nFinancial markets can be highly sens
 itive to news\, investor sentiment and economic indicators\, leading to im
 portant asset price fluctuations.  In this study we focus on crude oil\, d
 ue to its crucial role in commodity markets and global economy. Specifical
 ly we are interested in understanding the directional changes of oil price
  volatility\, and for this purpose\, we investigate whether news alone -- 
 without incorporating traditional market data -- can effectively predict t
 he direction of oil price movements. Using a decade-long dataset from Eiko
 n (2014–2024)\, we develop an ensemble learning framework to extract pre
 dictive signals from financial news. Our approach leverages diverse sentim
 ent analysis techniques and cutting-edge language models\, including FastT
 ext\, FinBERT\, Gemini\, and LLaMA\, to capture market sentiment and textu
 al patterns. We benchmark our model against the Heterogeneous Autoregressi
 ve (HAR) model and assess statistical significance using the McNemar test.
  Notably\, while most sentiment-based indicators do not consistently outpe
 rform HAR\, the raw news count emerges as a robust predictor. Among embedd
 ing techniques\, FastText proves most effective for forecasting directiona
 l movements. Furthermore\, SHAP-based interpretation at the word level rev
 eals evolving predictive drivers across market regimes: pre-pandemic empha
 sis on supply-demand and economic terms\; early pandemic focus on uncertai
 nty and macroeconomic instability\; post-shock attention to long-term reco
 very indicators\; and war-period sensitivity to geopolitical and regional 
 oil market disruptions. These findings highlight the predictive power of n
 ews-driven features and the value of explainable NLP in financial forecast
 ing.\n
LOCATION:https://stable.researchseminars.org/talk/Essex-DataScience/66/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Dr Peter Young (King's College London)
DTSTART:20251106T130000Z
DTEND:20251106T140000Z
DTSTAMP:20260404T111410Z
UID:Essex-DataScience/67
DESCRIPTION:Title: <a href="https://stable.researchseminars.org/talk/Essex
 -DataScience/67/">Argumentation Theory and its Applications in Data Scienc
 e</a>\nby Dr Peter Young (King's College London) as part of (ED-3S) Essex 
 Data Science Seminar Series\n\nLecture held in 4.311.\n\nAbstract\nArgumen
 tation theory is a branch of artificial intelligence (AI) that models how 
 claims and arguments interact and how conflicting viewpoints can be resolv
 ed. While rooted in formal logic\, argumentation theory has found new rele
 vance in data science\, offering tools for modelling and analysing debates
 \, explainable AI\, and decision-making with uncertain and contradictory i
 nformation.\n\nIn this talk\, I will introduce the foundational concepts o
 f argumentation theory\, tracing its development from formal logic to comp
 utational models. I will then explore its integration with data-driven app
 roaches\, highlighting my recent work in the analysis of social media disc
 ourse and in information synthesis for portfolio optimisation. These examp
 les illustrate how ideas from argumentation can complement approaches in d
 ecision-making and statistical learning.\n\nThis talk is aimed at research
 ers interested in the intersection of AI\, data science\, and computationa
 l social science\, and does not assume any prior background in logic or ar
 gumentation.\n
LOCATION:https://stable.researchseminars.org/talk/Essex-DataScience/67/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Yi Xia (University of Essex)
DTSTART:20251120T130000Z
DTEND:20251120T140000Z
DTSTAMP:20260404T111410Z
UID:Essex-DataScience/68
DESCRIPTION:by Yi Xia (University of Essex) as part of (ED-3S) Essex Data 
 Science Seminar Series\n\nLecture held in 4.311.\nAbstract: TBA\n
LOCATION:https://stable.researchseminars.org/talk/Essex-DataScience/68/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Dr Dane Grundy (Aviva)
DTSTART:20251127T130000Z
DTEND:20251127T140000Z
DTSTAMP:20260404T111410Z
UID:Essex-DataScience/69
DESCRIPTION:by Dr Dane Grundy (Aviva) as part of (ED-3S) Essex Data Scienc
 e Seminar Series\n\nLecture held in 4.311.\nAbstract: TBA\n
LOCATION:https://stable.researchseminars.org/talk/Essex-DataScience/69/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Dr Yang Lu (University of Concordia)
DTSTART:20251204T140000Z
DTEND:20251204T150000Z
DTSTAMP:20260404T111410Z
UID:Essex-DataScience/70
DESCRIPTION:by Dr Yang Lu (University of Concordia) as part of (ED-3S) Ess
 ex Data Science Seminar Series\n\nLecture held in 4.311.\nAbstract: TBA\n
LOCATION:https://stable.researchseminars.org/talk/Essex-DataScience/70/
END:VEVENT
END:VCALENDAR
