Data engineer Interview Questions | Glassdoor.com.au

Data engineer Interview Questions

15 data engineer interview questions shared by candidates

Top Interview Questions


Please use the awswrangler module to create a list of the files in the input folder that are not in the output folder. There is an AWS S3 bucket with two folders. Here is the initial code:

    import awswrangler as wr
    input_folder = 's3://mf-pythontest/in'
    output_folder = 's3://mf-pythontest/out'

The required output is: ['doc_003.parquet']

You must use the awswrangler package: https://github.com/awslabs/aws-data-wrangler
You will need to have some AWS credentials to access this public bucket.

***TIP*** The solution should have no more than three lines of code.

1 Answer

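    import awswrangler as wr
    from os import path

    input_folder = 's3://mf-pythontest/in'
    output_folder = 's3://mf-pythontest/out'

    # Return the bare file names (prefix stripped) of every object under an S3 folder.
    get_filenames = lambda folder_path: [path.basename(file_path) for file_path in wr.s3.list_objects(folder_path)]
    # List the output folder once so S3 is not queried again for every input file.
    output_filenames = set(get_filenames(output_folder))
    # Keep the input files that have no same-named file in the output folder.
    [filename for filename in get_filenames(input_folder) if filename not in output_filenames]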
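The filtering step in this answer can be tried without AWS credentials. Below is a minimal sketch that takes two plain lists of paths in place of the results of wr.s3.list_objects. The function name missing_from_output and the file names doc_001.parquet and doc_002.parquet are hypothetical; only doc_003.parquet comes from the question. It uses posixpath rather than os.path on the assumption that S3 keys always use forward slashes.

```python
import posixpath


def missing_from_output(input_paths, output_paths):
    """Return the names of input files with no same-named output file, in input order."""
    output_names = {posixpath.basename(p) for p in output_paths}
    return [
        posixpath.basename(p)
        for p in input_paths
        if posixpath.basename(p) not in output_names
    ]


# Hypothetical listings standing in for wr.s3.list_objects(...) results.
input_paths = [
    "s3://mf-pythontest/in/doc_001.parquet",
    "s3://mf-pythontest/in/doc_002.parquet",
    "s3://mf-pythontest/in/doc_003.parquet",
]
output_paths = [
    "s3://mf-pythontest/out/doc_001.parquet",
    "s3://mf-pythontest/out/doc_002.parquet",
]

print(missing_from_output(input_paths, output_paths))  # ['doc_003.parquet']
```

Building a set of output names first keeps each membership check constant-time, which matters once the folders hold more than a handful of files.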

Biggest mistakes, career path thinking, dream job


Generators and decorators in Python, basics of Spark, the difference between a DataFrame and a Dataset in Spark, what an RDD is
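For the Python part of this question, a short illustration may help. The sketch below is not taken from any candidate's answer; the names count_calls and read_batches are hypothetical. It shows a generator that yields rows in fixed-size batches and a decorator that counts how often the wrapped function is called.

```python
import functools


def count_calls(func):
    """Decorator: wrap func and record how many times it has been called."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        wrapper.calls += 1
        return func(*args, **kwargs)
    wrapper.calls = 0
    return wrapper


@count_calls
def read_batches(rows, batch_size):
    """Generator: lazily yield rows in lists of at most batch_size items."""
    batch = []
    for row in rows:
        batch.append(row)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        # Emit the final, possibly shorter, batch.
        yield batch


batches = list(read_batches(range(5), 2))
print(batches)             # [[0, 1], [2, 3], [4]]
print(read_batches.calls)  # 1
```

The generator produces one batch at a time instead of building the whole result in memory, and functools.wraps keeps the original function name and docstring on the decorated function.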


Data modelling, star schema and snowflake schema, indexes, SCD types
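As one illustration of the SCD (slowly changing dimension) part of this question, here is a minimal in-memory sketch of a Type 2 update, where a changed attribute closes the current row and appends a new one so that history is preserved. The function name apply_scd2, the row layout (key, attrs, valid_from, valid_to) and the sample values are all assumptions made for this example, not part of the interview question.

```python
from datetime import date


def apply_scd2(history, key, new_attrs, effective):
    """Apply a Type 2 change: expire the current row for key and append a new version."""
    current = next(
        (row for row in history if row["key"] == key and row["valid_to"] is None),
        None,
    )
    if current is not None:
        if current["attrs"] == new_attrs:
            return history  # Nothing changed, so no new version is needed.
        current["valid_to"] = effective  # Expire the previous version.
    history.append(
        {"key": key, "attrs": new_attrs, "valid_from": effective, "valid_to": None}
    )
    return history


# Hypothetical customer dimension keyed by customer id 1.
history = []
apply_scd2(history, 1, {"city": "Sydney"}, date(2023, 1, 1))
apply_scd2(history, 1, {"city": "Melbourne"}, date(2024, 6, 1))

for row in history:
    print(row["attrs"]["city"], row["valid_from"], row["valid_to"])
```

A Type 1 change would instead overwrite the attribute in place and keep no history; the valid_from and valid_to columns are what make point-in-time queries possible in the Type 2 variant.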

What is the difference between bagging and boosting?

The process of training neural networks.

- Tell me about yourself.
- Why data engineering?
- What experience do you have in Python?

What projects have you worked on in the past?

There were no questions of the traditional format that would be of interest here. Simply things about past technologies used and the challenges described above.

- About past projects (architecture, skills used, roles ...)
- About specific AWS technologies
