Data engineer interview questions shared by candidates
Describe the different JOINs in SQL. Given a categorical variable with thousands of distinct values, how would you encode it? How would you find linear trends in a time series? How would you find the flow in a matrix showing temperature? ...
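For the linear-trend question, one common answer is to fit a straight line to the series by ordinary least squares and read the slope as the trend. The sketch below is only an illustration under that assumption: the function name `linear_trend` is hypothetical, the observations are assumed to be evenly spaced, and the time index is simply 0, 1, 2, ...

```python
from typing import Sequence, Tuple


def linear_trend(values: Sequence[float]) -> Tuple[float, float]:
    """Fit y = intercept + slope * t by least squares, with t = 0, 1, 2, ...

    Hypothetical helper for illustration; assumes evenly spaced observations.
    """
    n = len(values)
    if n < 2:
        raise ValueError("need at least two observations to fit a trend")

    t_mean = (n - 1) / 2          # mean of 0..n-1
    y_mean = sum(values) / n

    # Slope = covariance(t, y) / variance(t)
    cov = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(values))
    var = sum((t - t_mean) ** 2 for t in range(n))

    slope = cov / var
    intercept = y_mean - slope * t_mean
    return slope, intercept
```

A positive slope suggests an upward trend and a negative slope a downward one. In practice a seasonal component or outliers can distort the fit, so this is a starting point rather than a complete answer.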
I answered all the questions with confidence. The two phone interviewers were very nice and did not hesitate to repeat or further explain each question.
Given a categorical variable with thousands of distinct values, how would you encode it? Can you please answer this?
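One possible answer, offered as a sketch rather than the expected one: with thousands of categories, one-hot encoding produces a very wide, sparse matrix, so two frequently used alternatives are frequency encoding (replace each category with its relative frequency) and the hashing trick (map each category to one of a fixed number of buckets). The function names and the bucket count below are hypothetical choices for illustration.

```python
import hashlib
from collections import Counter
from typing import Dict, List, Sequence


def frequency_encode(column: Sequence[str]) -> List[float]:
    """Replace each category with its relative frequency in the column."""
    counts: Dict[str, int] = Counter(column)
    total = len(column)
    return [counts[value] / total for value in column]


def hash_encode(column: Sequence[str], n_buckets: int = 256) -> List[int]:
    """Map each category to a bucket index in [0, n_buckets).

    Uses MD5 only as a stable hash (Python's built-in hash() is salted
    per process for strings). Distinct categories may collide.
    """
    return [
        int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16) % n_buckets
        for value in column
    ]
```

Frequency encoding keeps a single numeric column but maps categories with equal counts to the same value. Hashing bounds the dimensionality at the cost of collisions. Target encoding or learned embeddings are other options an interviewer may be looking for.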
Please use the awswrangler module to create a list of the files in the input folder that are not in the output folder.
There is an AWS S3 bucket with two folders. Here is the initial code:
    import awswrangler as wr
    input_folder = 's3://mf-pythontest/in'
    output_folder = 's3://mf-pythontest/out'
The required output is: ['doc_003.parquet']
You must use the awswrangler package: https://github.com/awslabs/aws-data-wrangler
You will need AWS credentials to access this public bucket.
***TIP*** The solution should have no more than three lines of code.
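A possible approach, sketched under assumptions: `wr.s3.list_objects` returns the full S3 paths under a prefix, so listing both folders, reducing each path to its file name, and taking a set difference gives the files missing from the output. The wrapper name `missing_in_output` and its `list_objects` parameter are hypothetical; the parameter exists only so the logic can be tried without AWS credentials.

```python
from typing import Callable, List, Optional


def missing_in_output(
    input_folder: str,
    output_folder: str,
    list_objects: Optional[Callable[[str], List[str]]] = None,
) -> List[str]:
    """Return file names present under input_folder but not output_folder."""
    if list_objects is None:
        # Imported lazily so the function can be exercised with a stand-in
        # lister when awswrangler or AWS credentials are unavailable.
        import awswrangler as wr

        list_objects = wr.s3.list_objects

    def names(folder: str) -> set:
        # Keep only the last path component; skip empty names, which can
        # appear if the listing includes a folder placeholder key.
        return {p.rsplit("/", 1)[-1] for p in list_objects(folder)} - {""}

    return sorted(names(input_folder) - names(output_folder))
```

With the variables from the initial code, the same idea fits the three-line tip: build a set of file names from `wr.s3.list_objects(input_folder)`, build another from `wr.s3.list_objects(output_folder)`, and take the difference as a list.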