Editor's Picks
-
The Gender Shades project
This evaluation focuses on gender classification as a motivating example to show the need for increased transparency in the performance of any AI products and services that focus on human subjects. Bias in this context is defined as having practical differences in gender classification error rates between groups...
Data Science Articles & Videos
-
The Machine Learning Reproducibility Crisis
It’s hard to explain to people who haven’t worked with machine learning, but we’re still back in the dark ages when it comes to tracking changes and rebuilding models from scratch. It’s so bad it sometimes feels like stepping back in time to when we coded without source control...
-
Marketing for Data Science: A 7 Step ‘Go-to-Market’ Plan for Your Next Data Product
As a product scientist at Indeed (product science is a team in data science — learn more here!), I think about launching both business products and internal data products. This has helped me see that marketing techniques for launching goods and services can also be applied to launching data products internally. With this perspective, I’ve helped the tools I developed become among the top 10% most used at Indeed...
-
Evolution is the New Deep Learning
Like Deep Learning (DL), EC was introduced decades ago, and it is currently experiencing a similar boost from the available big compute and big data. However, it addresses a distinctly different need: Whereas DL focuses on modeling what we already know, EC focuses on creating new knowledge. In that sense, it is the next step up from DL...
-
You need 16 times the sample size to estimate an interaction than to estimate a main effect
The most important point here, though, has nothing to do with statistical significance. It’s just this: Based on some reasonable assumptions regarding main effects and interactions, you need 16 times the sample size to estimate an interaction than to estimate a main effect. And this implies a major, major problem with the usual plan of designing a study with a focus on the main effect, maybe even preregistering, and then looking to see what shows up in the interactions...
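The excerpt does not spell out its "reasonable assumptions", so here is a minimal sketch of one set of assumptions that reproduces the factor of 16. It assumes a balanced 2x2 design, equal residual standard deviation `sigma` in every cell, and an interaction half the size of the main effect. The function names and the `effect_ratio` parameter are illustrative, not taken from the article.

```python
import math


def se_main_effect(sigma: float, n: int) -> float:
    """Standard error of a difference between two group means, n/2 observations each."""
    return math.sqrt(sigma**2 / (n / 2) + sigma**2 / (n / 2))  # = 2 * sigma / sqrt(n)


def se_interaction(sigma: float, n: int) -> float:
    """Standard error of a difference of differences over four cells, n/4 observations each."""
    return math.sqrt(4 * sigma**2 / (n / 4))  # = 4 * sigma / sqrt(n)


def sample_size_ratio(effect_ratio: float = 0.5) -> float:
    """Factor by which n must grow so the interaction is estimated with the
    same estimate-to-standard-error ratio as the main effect.

    effect_ratio is the ASSUMED size of the interaction relative to the
    main effect (0.5 means half as large).
    """
    # The ratio of standard errors does not depend on sigma or n.
    se_ratio = se_interaction(1.0, 1000) / se_main_effect(1.0, 1000)
    # Standard errors shrink like 1/sqrt(n), so the required n scales
    # with the square of the precision gap.
    return (se_ratio / effect_ratio) ** 2


if __name__ == "__main__":
    print(round(sample_size_ratio(0.5)))
```

Under these assumptions the interaction's standard error is twice that of the main effect and its size is half, so the precision gap is a factor of 4 and the sample size must grow by 4 squared. If the interaction were assumed to be as large as the main effect (`effect_ratio=1.0`), the same calculation would give a factor of 4 instead.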
-
Network structure from rich but noisy data
Here we present a general formalism for the optimal inference of network structure from rich but noisy data, and show how it can be applied to a range of data types...
-
Adversarial Logit Pairing
In this paper, we develop improved techniques for defending against adversarial examples at scale. First, we implement the state of the art version of adversarial training at unprecedented scale on ImageNet and investigate whether it remains effective in this setting - an important open scientific question...
-
There’s No Such Thing as a Data Scientist
The discipline has dramatically risen in popularity over the past few years. And while the number of data science jobs has increased, clarity around the role has declined. This post takes advantage of Indeed’s tremendous amounts of behavioral data to describe trends in the field and more specific definitions for data science roles...
Jobs
-
Data Scientist, Growth Insights - Spotify - NYC
We are looking for a Data Scientist to join the band and help drive a data-first culture with focus on growth. As a Data Scientist, our mission is to turn our 200 petabytes of data into insights and gain a deep understanding of music and listeners to impact the strategy and direction of Spotify. You will study user behavior, strategic initiatives, markets, content, and new features and bring data and insights into every decision we make. Above all, your work will impact how we think about user growth and how we can make Spotify available and accessible for more people in the world...
Training & Resources
Books