I applied through an employee referral. I interviewed at NVIDIA (Bengaluru) in July 2018
Interview
The interview felt good. The questions were of high quality: they stick to fundamentals but are tricky and sometimes difficult. I interviewed for an automotive team, and they tested C, computer architecture, OS, and hardware thoroughly.
Interview questions [2]
Question 1
For Automotive team:
Telephone:
Write a function to find the max/min of two integers without using a comparison operator.
What is a static function in C?
What is the significance of the volatile keyword in C?
What are the different types of scheduling? What type does Linux use?
What are the different stages of a process? How is the scheduler linked to these stages?
What is priority inversion?
Explain the JPEG format.
Let’s say there is a system with only 1 timer, and you want to write a multi-threaded program. Each thread wants to use a function called "sleep(input)" to sleep for the given input time, and any thread may call this function. How do you make sure that every thread sleeps for exactly its intended duration with a single available timer in the system? [Hint: can you imagine a data structure and logic to meet this requirement?]
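One common way to approach the single-timer question above is a delta list: pending sleeps are kept sorted by expiry, and each entry stores only the time remaining after the entry before it, so the one timer only has to track the head. The sketch below is a minimal illustration under that assumption, not the interviewer's expected answer. All names (`struct sleeper`, `delta_insert`, `delta_tick`, `delta_pop_expired`) are hypothetical, and time is counted in abstract ticks.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One pending sleep request (all names here are hypothetical). */
struct sleeper {
    int thread_id;           /* thread waiting on this entry */
    uint32_t delta;          /* ticks remaining after the previous entry expires */
    struct sleeper *next;
};

/* Queue a request that should expire `ticks` from now.
 * The list stays sorted by expiry, so the single timer only ever
 * needs to track the head entry. */
void delta_insert(struct sleeper **head, struct sleeper *s, uint32_t ticks)
{
    assert(head != NULL && s != NULL);
    struct sleeper **pp = head;
    while (*pp != NULL && (*pp)->delta <= ticks) {
        ticks -= (*pp)->delta;   /* consume time covered by earlier entries */
        pp = &(*pp)->next;
    }
    s->delta = ticks;
    s->next = *pp;
    if (*pp != NULL)
        (*pp)->delta -= ticks;   /* successor keeps its absolute expiry */
    *pp = s;
}

/* Called once per timer tick: only the head entry is decremented. */
void delta_tick(struct sleeper **head)
{
    assert(head != NULL);
    if (*head != NULL && (*head)->delta > 0)
        (*head)->delta--;
}

/* Unlink and return the next expired sleeper, or NULL if none is due.
 * The caller is assumed to loop on this after every tick. */
struct sleeper *delta_pop_expired(struct sleeper **head)
{
    assert(head != NULL);
    struct sleeper *s = *head;
    if (s == NULL || s->delta != 0)
        return NULL;
    *head = s->next;
    s->next = NULL;
    return s;
}
```

In a real system the list would also need protection against concurrent access (for example a lock, or masking the timer interrupt while inserting), and the woken thread would be handed back to the scheduler; both are left out of this sketch.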
Round 2:
Let’s say I give you any number of queues, along with two functions, enqueue and dequeue, that do the respective tasks of enqueuing and dequeuing. Can you design a stack with queues? Come up with 2 designs: make the push operation costly in one design and the pop operation costly in the other.
Can you define the node structure of a singly linked list?
Can you make the node a heterogeneous one, i.e. the data in the node can be of any type: char, int, float, double, structure, structure of structures, etc.?
What is a semaphore? What is a binary semaphore?
What is a mutex? What is the difference between a mutex and a binary semaphore?
Let’s say I have a linked list. There is one writer who can delete any node in the list. And there are n readers who read the content from the list. How do you design a synchronization scheme to achieve this?
Can you describe round robin (RR) scheduling in detail? Let’s say a process P1 is in the ready queue at time t=0ms. Another process P2 enters the ready queue at time t=2ms. Let’s say process P1 takes 2ms to complete. Explain the scenarios when the time slice is 1ms, 2ms, and 3ms for the RR scheduler.
What are the different segments of memory in the controller?
Where is dynamic memory allocated? If I want to allocate 32 KB of memory dynamically, is it contiguous in memory? If yes, why? If no, why?
What is virtual memory? Explain with a detailed diagram.
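For the stack-from-queues question in this round, the push-costly design can be sketched as follows. This is a minimal illustration, assuming a small fixed-capacity ring buffer stands in for the queues the interviewer provides; `struct queue`, `struct qstack`, `QCAP` and the `qstack_*` functions are hypothetical names. Each push enqueues the new value into the empty spare queue, drains the main queue behind it, and swaps roles, so pop is a single dequeue.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define QCAP 16  /* arbitrary capacity chosen for this sketch */

/* Plain FIFO ring buffer standing in for the provided queue. */
struct queue {
    int buf[QCAP];
    size_t head;
    size_t count;
};

bool enqueue(struct queue *q, int v)
{
    if (q->count == QCAP)
        return false;
    q->buf[(q->head + q->count) % QCAP] = v;
    q->count++;
    return true;
}

bool dequeue(struct queue *q, int *out)
{
    if (q->count == 0)
        return false;
    *out = q->buf[q->head];
    q->head = (q->head + 1) % QCAP;
    q->count--;
    return true;
}

/* Stack built from two queues; q[active] holds items newest-first. */
struct qstack {
    struct queue q[2];
    int active;
};

void qstack_init(struct qstack *s)
{
    assert(s != NULL);
    s->q[0].head = s->q[0].count = 0;
    s->q[1].head = s->q[1].count = 0;
    s->active = 0;
}

/* Costly push: O(n) moves so that the newest item ends up at the front. */
bool qstack_push(struct qstack *s, int v)
{
    assert(s != NULL);
    struct queue *main_q = &s->q[s->active];
    struct queue *spare = &s->q[1 - s->active];   /* always empty here */
    int tmp;
    if (main_q->count == QCAP)
        return false;
    enqueue(spare, v);
    while (dequeue(main_q, &tmp))
        enqueue(spare, tmp);
    s->active = 1 - s->active;
    return true;
}

/* Cheap pop: the front of the active queue is the top of the stack. */
bool qstack_pop(struct qstack *s, int *out)
{
    assert(s != NULL && out != NULL);
    return dequeue(&s->q[s->active], out);
}
```

The pop-costly variant mirrors this: push is a plain enqueue, and pop moves all but the last element into the spare queue before dequeuing the one that remains.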
Round 3:
What is the NVMe BAR address?
Explain the process of bootloading.
Explain the architecture of a generic flash controller. Consider NVMe front end and a Flash back end. Discuss various design considerations.
Why are NVMe submission and completion queues needed?
Explain inter process communication (IPC) and the need for it.
Round 4:
Declare:
Pointer to an integer
Array of pointers
Pointer to an array
Function pointers
Array of function pointers
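The five declarations listed above can be written as in the sketch below. The variable and function names (`value`, `row`, `add`, `sub`, `declarations_demo`) are made up for the illustration; only the declaration syntax is the point.

```c
#include <assert.h>

int add(int a, int b) { return a + b; }
int sub(int a, int b) { return a - b; }

int declarations_demo(void)
{
    int value = 7;
    int row[3] = {10, 20, 30};

    int *p = &value;                              /* pointer to an integer */
    int *ptrs[3] = {&row[0], &row[1], &row[2]};   /* array of 3 pointers to int */
    int (*parr)[3] = &row;                        /* pointer to an array of 3 ints */
    int (*fp)(int, int) = add;                    /* pointer to a function */
    int (*ops[2])(int, int) = {add, sub};         /* array of 2 function pointers */

    assert(*p == 7);
    /* 7 + 20 + 30 + 3 + 2 */
    return *p + *ptrs[1] + (*parr)[2] + fp(1, 2) + ops[1](5, 3);
}
```

The parentheses decide the meaning: `int *ptrs[3]` is an array because `[]` binds tighter than `*`, while `int (*parr)[3]` forces the pointer to bind first.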
Puzzle: Let’s say there is a 6x6 matrix of shelves. Can you place 14 bottles on these shelves in such a way that every row and every column has an even number of bottles?
There are two 32-bit numbers, A and B. Write a macro to mask n bits (towards the right) from bit position p of A into B.
Let’s say you are the person in charge of booting the system and getting it up and running. Your source code on the eMMC drive has to be loaded into RAM, and then the reset vector should start the main code. Unfortunately, booting failed. How will you debug it, and what steps will you take? Assume the eMMC comes from the customer company. You are allowed to ask the customer company questions; what would you ask them?
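The bit-masking macro question in this round is worded loosely, so the sketch below picks one reading and labels it as an assumption: copy the n-bit field of A whose highest bit is at position p (extending towards bit 0) into the same position in B, leaving the rest of B untouched. `FIELD_MASK`, `COPY_BITS` and `copy_bits` are hypothetical names, and the interviewer may have intended a different layout.

```c
#include <assert.h>
#include <stdint.h>

/* Mask covering n bits, from bit p down to bit p-n+1.
 * Assumed preconditions: 1 <= n <= 32, p <= 31, p + 1 >= n. */
#define FIELD_MASK(p, n) \
    ((((n) >= 32u) ? 0xFFFFFFFFu : ((UINT32_C(1) << (n)) - 1u)) \
     << ((p) - (n) + 1u))

/* Take the masked field from a and every other bit from b. */
#define COPY_BITS(a, b, p, n) \
    (((b) & ~FIELD_MASK(p, n)) | ((a) & FIELD_MASK(p, n)))

/* Function wrapper that checks the preconditions before using the macro. */
uint32_t copy_bits(uint32_t a, uint32_t b, unsigned p, unsigned n)
{
    assert(n >= 1 && n <= 32 && p <= 31 && p + 1 >= n);
    return COPY_BITS(a, b, p, n);
}
```

For example, with A = 0xABCD1234, B = 0, p = 7 and n = 4, bits 7..4 of A (the nibble 0x3) land in B, giving 0x30. The n >= 32 special case is there because shifting a 32-bit value by 32 is undefined in C.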
Round 5:
What is the role of ADC?
Let’s say I have an SPI bus with an ADC connected to it, which reads an analog signal and delivers digital values to the microprocessor (uP). How will you design your sampling system?
Explain different signals you have analyzed using digital signal analyzers. When did you go to analog oscilloscopes in your career?
Can you design the equivalent of a C++ fraction class in C? The design should keep the numerator and denominator private while exposing functions to print and modify the fraction. It should cover public and private variables and functions. [Hint: static]
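One way to approach the fraction-class question is an opaque type plus `static` helpers: the header exposes only an incomplete `struct fraction` and the public functions, while the fields and internal helpers stay in the .c file. The sketch below shows both halves in one listing for brevity. The file split and all names (`fraction_new`, `fraction_set`, `fraction_format`, `fraction_print`, `fraction_free`) are assumptions made for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* ---- public part: what a hypothetical fraction.h would declare ---- */
typedef struct fraction fraction;            /* opaque: fields are hidden */
fraction *fraction_new(int num, int den);
int fraction_set(fraction *f, int num, int den);
int fraction_format(const fraction *f, char *buf, size_t len);
void fraction_print(const fraction *f);
void fraction_free(fraction *f);

/* ---- private part: would live only in fraction.c ---- */
struct fraction {
    int num;
    int den;
};

/* static gives internal linkage, so other files cannot call this helper. */
static int gcd(int a, int b)
{
    while (b != 0) {
        int t = a % b;
        a = b;
        b = t;
    }
    return a;
}

/* The only way to modify the fields; returns -1 for a zero denominator. */
int fraction_set(fraction *f, int num, int den)
{
    assert(f != NULL);
    if (den == 0)
        return -1;
    if (den < 0) {                           /* keep the sign in the numerator */
        num = -num;
        den = -den;
    }
    int g = gcd(abs(num), den);
    f->num = num / g;
    f->den = den / g;
    return 0;
}

fraction *fraction_new(int num, int den)
{
    fraction *f = malloc(sizeof *f);
    if (f != NULL && fraction_set(f, num, den) != 0) {
        free(f);
        f = NULL;
    }
    return f;
}

/* Writes "num/den" into buf; returns what snprintf returns. */
int fraction_format(const fraction *f, char *buf, size_t len)
{
    assert(f != NULL && buf != NULL);
    return snprintf(buf, len, "%d/%d", f->num, f->den);
}

void fraction_print(const fraction *f)
{
    assert(f != NULL);
    printf("%d/%d\n", f->num, f->den);
}

void fraction_free(fraction *f)
{
    free(f);
}
```

Because callers only ever see the incomplete type, they cannot read or write `num` and `den` directly, which plays the role of C++ `private`; the non-static functions play the role of the public interface.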
Round 6:
There are 3 bags with 2 balls in each bag. The balls in 3 bags are WW, WB, BB respectively. If I pick a ball from a bag and find that the ball is W, what is the probability that the next ball I pick from the same bag is also W?
I have a structure:
struct data {
    U32 data;
    struct data *next;
};
I have a memory region where my data is being written. How will you detect if some other process has come along and corrupted your data? [Hint: What would you add to the structure to achieve this goal?]
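A typical response to the hint above is to add integrity fields to the structure, such as a magic number and a checksum over the payload. The sketch below is one possible version under that assumption. `DATA_MAGIC`, the `magic` and `check` fields, and the helper functions are all hypothetical additions, and the XOR checksum is deliberately simple.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t U32;

#define DATA_MAGIC 0xC0FFEE42u   /* arbitrary marker value for this sketch */

struct data {
    U32 magic;                   /* added: marks a node this code initialized */
    U32 data;
    struct data *next;
    U32 check;                   /* added: checksum over the fields above */
};

/* Simple XOR checksum; the pointer is truncated to 32 bits on purpose. */
U32 data_checksum(const struct data *d)
{
    return d->magic ^ d->data ^ (U32)(uintptr_t)d->next ^ 0xA5A5A5A5u;
}

/* Call after every legitimate write to the node. */
void data_seal(struct data *d)
{
    assert(d != NULL);
    d->magic = DATA_MAGIC;
    d->check = data_checksum(d);
}

/* Call before every read; false means the node was modified behind our back. */
bool data_is_intact(const struct data *d)
{
    assert(d != NULL);
    return d->magic == DATA_MAGIC && d->check == data_checksum(d);
}
```

An XOR checksum misses some multi-field changes that cancel out, so a CRC is a common stronger choice when the cost is acceptable.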
Round 7:
I have a singly linked list. I want to delete a node, and I am giving you only that node's address. How will you do it in O(1) time complexity?
Write C code to find the number of 1s in a given number. Can you optimize it?
Let’s say you are capturing images at 30 frames per second on a host CPU. You are given a discrete GPU to process these images. You have a DMA to transfer the images from the CPU RAM to GPU. How are you going to design this system?
Can you explain the cache algorithms?
Explain priority inversion with an example of 3 threads.
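The first two questions of this round have well-known short answers, sketched below with hypothetical names (`struct node`, `delete_in_o1`, `count_ones`). The O(1) delete copies the successor's contents into the target node and unlinks the successor, which by construction cannot handle the tail node. The bit count uses the `x &= x - 1` trick, which clears the lowest set bit on each pass, so the loop runs once per 1 bit instead of once per bit position.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct node {
    int value;
    struct node *next;
};

/* Delete `target` from its list without access to the head.
 * Returns the node that was physically unlinked (for the caller to free),
 * or NULL when target is the tail, where this trick does not apply. */
struct node *delete_in_o1(struct node *target)
{
    assert(target != NULL);
    struct node *victim = target->next;
    if (victim == NULL)
        return NULL;
    target->value = victim->value;   /* target takes over its successor's data */
    target->next = victim->next;     /* and skips over the successor */
    return victim;
}

/* Count set bits: each iteration clears the lowest set bit. */
unsigned count_ones(uint32_t x)
{
    unsigned n = 0;
    while (x != 0) {
        x &= x - 1;
        n++;
    }
    return n;
}
```

One caveat worth raising in an interview: after `delete_in_o1`, any outside pointer to the successor node now refers to an unlinked node, so the trick is only safe when no one else holds such pointers.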
I applied online. The process took 4 weeks. I interviewed at NVIDIA (Santa Clara, CA) in Aug 2025
Interview
30-minute Teams interview.
It started with my introduction covering my most relevant experience, followed by very extensive and detailed questions about the technical details and division of labor.
Then came a free-form 10-minute basic coding question in both Python and C++.
It ended with me asking questions about the position.
Interview questions [1]
Question 1
What's the division of labor and what's your contribution?
I applied online. The process took 2 months. I interviewed at NVIDIA (Taipei) in Jan 2024
Interview
First interview with hiring manager, followed by four engineers for technical interviews.
Programming problems included data structures, bit-wise operations, API design, and CPU optimization techniques. There were LeetCode-style problems, but not many of them.
Interview questions [1]
Question 1
They showed me a relatively simple function, with some resource allocation logic inside, and asked how I would design the function's signature so that a caller would be able to access the allocated resource but leave the release of it to the same callee.
I applied online. I interviewed at NVIDIA (Bengaluru)
Interview
The first round was on MS Teams. They asked a few questions on memory management, spin locks, mutexes, semaphores, I2C multi-master, clock stretching, ARM architecture, privilege levels, registers, and the memory protection unit, then moved on to C programming.
Interview questions [1]
Question 1
Insert a node in a linked list.
Swap odd and even bits.
Declare a pointer to a dynamically allocated 2D array.
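Two of the C questions above can be sketched briefly. The bit swap assumes a 32-bit value and uses the masks 0xAAAAAAAA (odd positions) and 0x55555555 (even positions). The 2D array uses a pointer to a variable-length array type so that a single `malloc` yields contiguous storage with `grid[r][c]` indexing. This relies on C99 variably modified types, which are optional in C11 but supported by GCC and Clang. The function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Swap every odd-position bit with its even-position neighbour. */
uint32_t swap_odd_even_bits(uint32_t x)
{
    return ((x & 0xAAAAAAAAu) >> 1) | ((x & 0x55555555u) << 1);
}

/* Allocate a rows x cols grid in one block, fill it with its linear
 * index, and return the last cell (or -1 if allocation fails). */
int grid_last_cell(size_t rows, size_t cols)
{
    assert(rows > 0 && cols > 0);

    /* grid is a pointer to an array of `cols` ints, so grid[r] is row r */
    int (*grid)[cols] = malloc(rows * sizeof *grid);
    if (grid == NULL)
        return -1;

    for (size_t r = 0; r < rows; r++)
        for (size_t c = 0; c < cols; c++)
            grid[r][c] = (int)(r * cols + c);

    int last = grid[rows - 1][cols - 1];
    free(grid);                  /* one malloc, so one free */
    return last;
}
```

The other common answer for the 2D array is an `int **` with one allocation per row; it works without variable-length array types but gives up contiguity and needs a loop to free.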