I applied online. I interviewed at Natera in Nov 2020
Interview
There was a phone screening with a recruiter, then a 45-minute technical interview with a senior statistician where I was asked about statistics and modeling concepts; there were no behavioral questions. After that, there would have been a coding assessment, and then probably several more stages, but I didn't make it past the technical interview.
Given this data, if you were trying to assess the effectiveness of this drug in the presence of these other confounding variables, what would you do/what sort of model would you fit?
At least a four-step interview process:
1. Phone interview with a written (Google Doc) test
2. In person interview with team lead covering statistical concepts
3. Coding interview (don't be fooled: it's not a biostatistics coding interview but a data science + algorithms coding interview; there is no modeling, data management, or statistical testing involved)
4. A day of intensive interviews with 5 (?) separate team members - did not make it to this stage
The most disappointing part of this process was how much energy and time it took just to get this far, knowing that an even more intense step was coming next. It took over a month to get from initial contact to the third interview. After the third interview I did not hear back from anyone, and I had to reach out a week and a half later to find out I was rejected. I can only imagine how much worse it would have been if I had gone through the "intensive interview" day, with 3+ hours of additional interviewing and who knows how many more weeks of waiting, only to be rejected.
Interview questions [1]
Question 1
In the coding interview:
The first question was to write a function that takes a numeric list and returns the value of the empirical CDF for each number in that list.
The second was to run a logistic regression (one line of code, very simple)
The third was FizzBuzz
The fourth was to write an algorithm that adds up all the digits of any given number (e.g., 132 => 1+3+2 = 6).
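For anyone preparing, here is a minimal sketch of how the first, third, and fourth questions might be answered. Python is an assumption (the language was not specified), and the function names `empirical_cdf`, `fizzbuzz`, and `digit_sum` are made up for illustration; this is one reasonable approach, not the answer the interviewers were looking for.

```python
from bisect import bisect_right


def empirical_cdf(values):
    """Return (x, F(x)) for each x, where F(x) is the share of values <= x."""
    ordered = sorted(values)
    n = len(ordered)
    # bisect_right counts how many sorted values are <= x, so ties are handled.
    return [(x, bisect_right(ordered, x) / n) for x in values]


def fizzbuzz(n):
    """Return the FizzBuzz sequence for 1..n as a list of strings."""
    result = []
    for i in range(1, n + 1):
        if i % 15 == 0:
            result.append("FizzBuzz")
        elif i % 3 == 0:
            result.append("Fizz")
        elif i % 5 == 0:
            result.append("Buzz")
        else:
            result.append(str(i))
    return result


def digit_sum(number):
    """Sum the decimal digits of an integer; the sign is ignored."""
    return sum(int(digit) for digit in str(abs(number)))
```

Sorting once and using binary search keeps the empirical CDF at O(n log n) rather than comparing every pair of values, which is the kind of detail an algorithms-style interviewer may ask about.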
I applied through a recruiter. I interviewed at Natera in Aug 2020
Interview
I was approached by a recruiter on LinkedIn. The entire interview process lasted about 5 weeks. It included an initial HR/recruiter screening, an introductory and slightly technical phone call with the hiring manager, a technical interview with a Biostatistician, and finally a 4+-hour panel interview with various managers and members of Natera.
Overall, the interview was fairly easy to average. However, the Associate Director of Data Science was very rude. He would cut you off and ask questions that were not pertinent to his own problem. It didn't seem like he knew what he was asking for; instead, he tried to trap you into a corner.
Interview questions [1]
Question 1
Explain confounding variables.
How would you calculate a p-value by hand?
What is a p-value, and how would you characterize its pros and cons?
How can we increase power in a study?
Calculate sensitivity for a test/small sample.
Find the confidence interval of a classifier that has 99.5% accuracy.
How would you test if a single die was loaded on one side? Subsequently, how would you test this with 90% power?
How do you write an SAP?
You are posed with a question: "What are the chances a retired person who makes >100k has a kid who makes >100k?". How would you go about solving this question?
Given a list of numbers, write a function that returns a list in which each element is a tuple of the data point and its empirical CDF value.
Given the names of the predictor variables, the outcome variable, and the dataset, provide the code to perform a logistic regression.
Write a function that accepts any integer and returns the sum of all the digits in the number.
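For the sensitivity and confidence-interval questions above, a rough Python sketch may help with preparation. The question as remembered gives no sample size, so any numeric interval depends on an assumed n; the n = 1000 in the usage note is purely illustrative. The Wilson score interval is used here as one reasonable choice when accuracy is close to 100%, where the plain normal approximation behaves poorly; it is not necessarily what the interviewer expected. The function names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def sensitivity(true_positives, false_negatives):
    """Sensitivity (recall) = TP / (TP + FN)."""
    return true_positives / (true_positives + false_negatives)


def wilson_interval(successes, n, confidence=0.95):
    """Wilson score interval for a binomial proportion such as accuracy."""
    if n <= 0:
        raise ValueError("n must be positive")
    # Two-sided critical value, e.g. about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    p_hat = successes / n
    denom = 1 + z**2 / n
    center = (p_hat + z**2 / (2 * n)) / denom
    half_width = z * sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2)) / denom
    return center - half_width, center + half_width
```

For example, `wilson_interval(995, 1000)` treats 99.5% accuracy as 995 correct out of an assumed 1000 predictions, and `sensitivity(8, 2)` covers a small sample with 8 true positives and 2 false negatives.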