It started with a recruiter ping on LinkedIn in early January. Role was something like "Machine Learning Engineer, Integrity" on the AI/ranking side. I had applied cold maybe three weeks earlier and had genuinely forgotten about it. The recruiter was efficient, no fluff. She walked me through the loop in maybe twelve minutes: one coding screen, then if that cleared, a full virtual onsite with five rounds back to back across two days. ML Design, two Coding rounds, Product Sense/Metrics, and Behavioral. She mentioned the loop would be graded holistically. That word, holistically, I would think about a lot over the next three weeks.
Interview questions [1]
Question 1
The coding screen was a variant of the longest subarray problem with a constraint on the sum, and the follow-up asked me to handle an infinite stream where I could only keep a fixed-size window in memory, which is where the deque approach came in.
The graph problem on day one was something like: find the minimum number of steps to convert one word to another, where each step changes exactly one character and every intermediate word must exist in a given dictionary. Classic word ladder, but with a follow-up asking me to return all shortest paths, not just the count, which is where BFS alone breaks down and you need to layer in some backtracking.
The ML Design prompt was the feed freshness one I mentioned, but the specific follow-up that pushed me hardest was when they asked how I would handle the cold start problem for new creators who have no engagement history at all. I went with a content-based fallback using embedding similarity to established creators in the same topic cluster, which they probed by asking how I would evaluate whether that fallback was actually working without contaminating my main ranking metrics.
The Product Sense question was exactly: Reels engagement is down 8% week over week, you are the ML lead in the diagnosis meeting, what do you do first? The follow-up that exposed me was when I said I would look at content quality signals and they asked me to name three specific features I would pull and in what order. I named watch-through rate and re-watch rate cleanly but stumbled on the third and said something vague about "interaction diversity" that did not land.
The behavioral question about pushing back on a committed technical decision was the one that surprised me most structurally, because they did not just want the story, they wanted to know what I would do differently now, which forces you to critique your past self in real time in front of a stranger.
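For anyone prepping the word ladder follow-up (return all shortest paths, not just the count), here is a minimal sketch of the BFS plus backtracking idea. It is not what I wrote in the room, just one common way to do it. The function name `all_shortest_ladders` and the assumption that words are lowercase ASCII of equal length are mine, not part of the prompt as I was given it.

```python
from collections import defaultdict, deque
from string import ascii_lowercase


def all_shortest_ladders(begin, end, dictionary):
    """Return every shortest one-letter-at-a-time path from begin to end.

    Assumption (mine): all words are lowercase ASCII and the same length.
    """
    words = set(dictionary)
    if end not in words:
        return []

    # Phase 1: BFS from begin. For each word, remember every predecessor
    # that reaches it along a shortest path.
    parents = defaultdict(set)
    dist = {begin: 0}
    queue = deque([begin])
    while queue:
        word = queue.popleft()
        if word == end:
            break  # every word one level above end is already expanded
        for i in range(len(word)):
            for c in ascii_lowercase:
                nxt = word[:i] + c + word[i + 1:]
                if nxt == word or nxt not in words:
                    continue
                if nxt not in dist:
                    dist[nxt] = dist[word] + 1
                    queue.append(nxt)
                if dist[nxt] == dist[word] + 1:
                    parents[nxt].add(word)

    if end not in dist:
        return []

    # Phase 2: backtrack from end to begin through the parent links.
    paths = []

    def backtrack(word, suffix):
        if word == begin:
            paths.append([begin] + suffix)
            return
        for parent in parents[word]:
            backtrack(parent, [word] + suffix)

    backtrack(end, [])
    return sorted(paths)  # sorted only to make the output deterministic
```

The point of the two phases is that BFS alone gives you the distance, while the parent sets keep every tie, so the backtracking step only ever walks edges that lie on some shortest path.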
Coding screen was 45 min, 2 problems, LeetCode easy and medium level. No ability to run code (text editor only), but the interviewer helpfully pasted some examples to run through, and I had to walk through the code with those examples to prove it was correct. Interviewer also probed edge cases, so make sure to prep edge cases even if LeetCode doesn't require them.
Made it to onsite. The interviewer was very pushy and kept saying I was wrong before I even finished my sentence. Worst interview ever. The other interviews went decently, but I guess I was too bad at system design.
The interview was an OA about building a database from scratch. It wasn't too bad, though I think it could benefit from some more edge cases to test problem solving.