A new evaluation framework for topic modeling algorithms based on synthetic corpora

Jan 28, 2019
Hanyu Shi, Martin Gerlach, Isabel Diersen, Doug Downey, Luis A. N. Amaral

* Accepted for AISTATS 2019; code available at https://github.com/amarallab/synthetic_benchmark_topic_model; main text (11 pages, 5 figures) and supplementary material (14 pages, 11 figures)

A standardized Project Gutenberg corpus for statistical analysis of natural language and quantitative linguistics

Dec 19, 2018
Martin Gerlach, Francesc Font-Clos


A network approach to topic models

Jul 19, 2018
Martin Gerlach, Tiago P. Peixoto, Eduardo G. Altmann

* Science Advances 4, eaaq1360 (2018) 
* 22 pages, 10 figures, code available at https://topsbm.github.io/ 

Generalized Entropies and the Similarity of Texts

Nov 11, 2016
Eduardo G. Altmann, Laercio Dias, Martin Gerlach

* J. Stat. Mech. 014002 (2017) 
* 13 pages, 6 figures; Results presented at the StatPhys-2016 meeting in Lyon 

Similarity of symbol frequency distributions with heavy tails

Apr 15, 2016
Martin Gerlach, Francesc Font-Clos, Eduardo G. Altmann

* Phys. Rev. X 6, 021009 (2016) 
* 13 pages, 7 figures 

Statistical laws in linguistics

Feb 11, 2015
Eduardo G. Altmann, Martin Gerlach

* Proceedings of the Flow Machines Workshop: Creativity and Universality in Language, Paris, June 18 to 20, 2014 

Scaling laws and fluctuations in the statistics of word frequencies

Nov 04, 2014
Martin Gerlach, Eduardo G. Altmann

* New Journal of Physics 16 (2014), 113010 
* 19 pages, 4 figures 

Extracting information from S-curves of language change

Oct 30, 2014
Fakhteh Ghanbarnejad, Martin Gerlach, Jose M. Miotto, Eduardo G. Altmann

* J. R. Soc. Interface, vol. 11, no. 101, 20141044 (6 December 2014)
* 9 pages, 5 figures, Supplementary Material is available at http://dx.doi.org/10.6084/m9.figshare.1221782 

Stochastic model for the vocabulary growth in natural languages

Apr 04, 2013
Martin Gerlach, Eduardo G. Altmann

* Phys. Rev. X 3, 021006 (2013) 
* corrected typos and errors in reference list; 10 pages text, 15 pages supplemental material; to appear in Physical Review X 
