American communities of color have shouldered a disproportionate share of the COVID-19 pandemic’s burden. How can we visualize this impact?
Brief example about data interpolation
Today, I want to talk about some data interpolation I had to do recently. As part of a project of mine, I had to deal with US census data. As you probably know, the US census collects data on many aspects of US society (population, education, income, race, and many ...
Getting longitude-latitude coordinates for a (long) list of cities using Python and a free API
[This post was originally published on my blog] Today I’ve decided to expand the number of cities included on my murder rate map to everywhere with 100,000+ people. In order to do that using the FBI data (which only includes the names of the cities), I need to find the ...
Step-by-step: How to plot a map with slider to represent time evolution of murder rate in the US using Plotly
[This post was originally published on my blog] This post is based on this other one I posted a few days ago, where I’m exploring a new data set about murder rates in the US. I decided to write a post detailing how to plot a map of said ...
Interactive visualizations with Plotly
For the last couple of years, I’ve been using Plotly to create visually appealing interactive plots. You can create an account at plot.ly and then create and edit your plots from their online platform (they also have a premium option with extra features), but I prefer using Plotly offline, just ...
New paper out in PLOS Biology on Why potentially important genes are ignored
We just published a new paper performing a large-scale investigation of the reasons why potentially important genes are ignored.
New paper in Nature Human Behaviour on Personality Types
We just published a new paper investigating personality types in four large datasets (>1.5M respondents), finding robust support for at least four personality types.
New paper in Science Advances combining topic models and complex networks
Topic models are a popular way to extract information from text data, but their most popular flavours (based on Dirichlet priors, such as LDA) make unreasonable assumptions about the data which severely limit their applicability. Martin Gerlach, a member of the Amaral Lab, and co-authors explore an alternative way of doing topic modelling, based on stochastic block models (SBM), thus exploiting a mathematical connection with finding community structure in networks. A network approach to topic models, Science Advances 4, eaaq1360 (2018)
Alumni consulting company acquired by Ideo
Ideo is a San Francisco Bay Area consulting firm that helps companies design new products and services. It has 52 employees in Chicago and will add 16 more, including 15 data scientists, with the acquisition of Datascope Analytics. Datascope, founded by two Northwestern University and Amaral Lab alumni, has partnered ...
New paper linking fractal dynamics of worms to aging and stress
Luiz G. A. Alves, Peter B. Winter, Leonardo N. Ferreira, Renée M. Brielmann, Richard I. Morimoto, and Luís A. N. Amaral: Long-range correlations and fractal dynamics in C. elegans: Changes with aging and stress Physical Review E 96, 022417 (2017)
Shootings in U.S. schools are linked to increased unemployment
By Megan Fellman EVANSTON – A rigorous Northwestern University study of a quarter-century of data has found that economic insecurity is related to the rate of gun violence at K-12 and postsecondary schools in the United States. When it becomes more difficult for people coming out of school to find ...
New paper out in G3: Genes|Genomes|Genetics!
The recent work of Chuyue Yang, a talented undergraduate (now recently graduated!), and graduate student Adam Hockenberry is now online. This work is a part of a multi-year collaboration with Professor Michael Jewett investigating the mechanisms by which the sequence of messenger-RNA can influence its translation. In this particular work, ...
Unintended effects of data privacy in healthcare
I would like to draw attention to an instance where I believe the pendulum has swung too far: the Health Insurance Portability and Accountability Act (HIPAA).
The ultimate book about folktales
I’d like to use my first blog post to advertise the ultimate story book, which also happens to be one of the most successful works of information-based science: the Aarne-Thompson index. Briefly, this index tries nothing less than to identify and categorize every folklore tale. Although the first version of the index is more than a hundred years old, it has remained a useful tool for folklore research ever since. Part of the success of the Aarne-Thompson index seems to stem from timeless design decisions for organizing data.
Toward a Social Contract in Data Journalism
In the ongoing saga of American politics, we the voters have seen some pretty improbable things this election cycle. To many, the starkest instance of the improbable has been Donald Trump’s rise to presumptive nominee for the GOP. But I won’t be talking about the situation in question; instead, I want to discuss the state of data journalism in the wake of this campaign season.
Computational research, no longer a red-headed stepchild!
This week I had the almost obscene pleasure of participating in Northwestern’s Computational Research Day as a chairperson and poster judge. I typically cringe at the thought of attending conferences and symposia, since I am mainly a homebody (I love my desk, computer, research, and daily schedule), but at the symposium ...