Prostate cancer is one of the most common cancers in men. In its advanced stages, tumours can metastasise to other tissues in the body, which is often fatal. In this thesis, we perform a genetic analysis of prostate cancer tumours at different metastatic sites using data science, machine learning and topological network analysis methods. We present a general procedure for pre-processing gene expression datasets and for pre-filtering significant genes by analytical methods. We then use machine learning models to filter key genes further and to classify tumours by secondary site. Finally, we perform gene co-expression network analysis and community detection on samples from different types of prostate cancer secondary site. Of the 14,379 genes considered, 13 are selected as the genes most strongly related to metastatic prostate cancer, and secondary site classification based on them achieves approximately 92% accuracy under cross-validation. In addition, we provide preliminary insights into the co-expression patterns of genes in gene co-expression networks. Project code is available at https://github.com/zcablii/Master_cancer_project.
In this paper, we propose a novel method to forecast the results of elections using only the official results of previous elections. The method is based on the voter model with stubborn nodes and uses theoretical results developed in our previous work. We look at popular vote shares for the Conservative and Labour parties in the UK and for the Republican and Democratic parties in the US. We compute time-evolving estimates of the model parameters and use these to forecast the vote share of each party in any election. We obtain a mean absolute error of 4.74\%. As a by-product, our parameter estimates provide meaningful insight into the political landscape, indicating the proportion of voters who are strongly for or strongly against each of the parties considered.
We address the phenomenon of sedimentation of opinions in networks. We investigate how agents who never change their minds ("stubborn" agents) can influence the opinion of a social group and foster the formation of polarised communities. We study the voter model, in which users are divided into two camps and repeatedly update their opinions based on those of the users they are connected to. Assuming a proportion of the agents are stubborn, the distribution of opinions reaches an equilibrium. We give novel formulas, based on Markov chain analysis, to compute the distribution of opinions at any time and the speed of convergence to the stationary equilibrium. Theoretical results are supported by numerical experiments on synthetic data, and we discuss a strategy to mitigate the polarisation phenomenon.
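To make the dynamics concrete, the following is a minimal simulation sketch of a voter model with stubborn agents. It is an illustration only and not the implementation used in this work: the complete-graph topology, the asynchronous update rule, and the names `simulate_voter_model`, `n_free`, `n_stubborn_0`, `n_stubborn_1` and `share_of_ones` are all assumptions introduced here.

```python
import random


def simulate_voter_model(n_free, n_stubborn_0, n_stubborn_1, steps, seed=0):
    """Simulate a voter model with stubborn agents (illustrative sketch).

    Assumptions (not taken from the paper): every agent can observe every
    other agent (complete graph), and one free agent updates per step.
    Returns the final list of opinions; the stubborn agents occupy the last
    n_stubborn_0 + n_stubborn_1 positions and never update.
    """
    rng = random.Random(seed)
    # Free agents start with a random opinion in {0, 1}.
    opinions = [rng.randint(0, 1) for _ in range(n_free)]
    # Stubborn agents hold a fixed opinion for the whole run.
    opinions += [0] * n_stubborn_0 + [1] * n_stubborn_1
    n_total = len(opinions)

    for _ in range(steps):
        i = rng.randrange(n_free)    # only free agents may update
        j = rng.randrange(n_total)   # agent whose opinion is observed
        if j != i:
            opinions[i] = opinions[j]  # the free agent copies that opinion
    return opinions


def share_of_ones(opinions):
    """Fraction of all agents currently holding opinion 1."""
    return sum(opinions) / len(opinions)


if __name__ == "__main__":
    final = simulate_voter_model(n_free=50, n_stubborn_0=5,
                                 n_stubborn_1=10, steps=5000, seed=1)
    print("share holding opinion 1:", share_of_ones(final))
```

Because stubborn agents of both camps are present, the simulated population cannot reach full consensus, so the share of opinion 1 keeps fluctuating instead of fixating. Averaging that share over many seeds gives an empirical counterpart to the equilibrium distribution described above.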