AWS, Google: Data Engineer Interview Questions


Role of Data Engineers at Amazon

  • Collect, store and manage huge quantities of data.
  • Convert raw data into information that can be used to make decisions.
  • Be at the forefront of data-driven decision-making by working closely with data scientists, product managers, and software engineers.
  • Build and maintain database architectures.
  • Coordinate with product managers, software engineers, and data scientists to work on common projects that involve leveraging datasets.
  • Leverage SQL and programming to build algorithms.
  • Perform data modeling and carry out ETL design in keeping with best practices.

Skills and Qualifications Required to Be a Data Engineer at Amazon

  • 4+ years of experience in Python, SQL and ETL design
  • Proven experience in data modeling and building data pipeline architectures
  • 3+ years of experience in big data analytics, with workflow management engines (Airflow, AWS Step Functions, Google Cloud Composer…)
  • Proven experience in working with cloud analytics platforms or MPP analytics platforms such as AWS Redshift, Google BigQuery, Teradata, or Netezza
  • Proven experience in SQL Performance Tuning
  • Proven experience in designing database pipeline architectures
  • Experience in using big data analytics tools such as Spark, Impala, Hive, Presto
  • Experience in E2E process optimization
  • Experience with anomaly/outlier detection

Interview Questions

  • Algorithms and data structures
  • Metric and visualization solution designs
  • Spark, EMR
  • Reporting tools like Tableau and Excel
  • SQL
  • Data pipeline design
  • DB performance tuning
  • Statistics and modeling

Leadership Principles

  • Customer Obsession: meet customer expectations
  • Ownership: go beyond your job responsibilities and take on a challenging project
  • Insist on the Highest Standards: improve quality of a project and motivate others
  • Think Big: significant professional achievement, make a bold and challenging decision, great impact
  • Bias for Action: take a calculated risk, and take the initiative to correct a problem
  • Earn Trust: speak up in a difficult or uncomfortable environment, gain the trust of your team
  • Dive Deep: complicated problem you’ve had to deal with, utilize in-depth data
  • Have Backbone; Disagree and Commit: something you believe in that nobody else does
  • Deliver Results: push to deliver results even when the team has given up on something

STAR: answer clearly in the form below

  • Situation
  • Task
  • Action
  • Result


  • Minimum Qualifications: 2+ years of software development, data engineering, business intelligence, data science, or a related field, with experience in manipulating, processing, and extracting value from datasets. 4+ years of experience in designing, building, and deploying cloud-based solution architectures.
  • Collaborative Experience: skillfully communicating, organizing and analyzing
  • Analytical Experience: experience designing data models and data warehouses and using SQL and NoSQL database management systems.
  • Preferred Qualifications: Master’s degree

Behavioral Questions

  • What is your 5-year professional plan?
  • Describe a time you failed to reach a goal.
  • Describe how you work effectively with others and achieve the desired result.
  • Tell me about a project you’re proud of.

Product sense & Business Cases

  • What kinds of spam would you expect on YouTube, and how would you deal with them?
  • How would you explain cloud computing to a 6-year-old?
  • How many cans of blue paint were sold in the United States last year?

Data Analysis & Coding

  • Find the number of emails received by each user under each built-in email label. The email labels are ~
  • Find the total AdWords earnings for each business type. Output the business types along with the total earnings.
  • Write code to generate samples from a random normal distribution and plot them.
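
For the normal distribution question, one possible answer can be sketched as below. This is a minimal sketch and not a model answer from any interviewer: the function names `sample_normal` and `text_histogram` are hypothetical, and it uses only the Python standard library, drawing the "plot" as a text histogram so that it runs anywhere.

```python
import math
import random
from collections import Counter


def sample_normal(n, mu=0.0, sigma=1.0, seed=None):
    """Draw n samples from a normal distribution with mean mu and std dev sigma."""
    rng = random.Random(seed)  # seeded generator so runs are reproducible
    return [rng.gauss(mu, sigma) for _ in range(n)]


def text_histogram(samples, bin_width=0.5, max_bar=50):
    """Render the samples as a text histogram, one line per bin."""
    # Bucket each sample by the index of the bin it falls into.
    counts = Counter(math.floor(x / bin_width) for x in samples)
    peak = max(counts.values())
    lines = []
    for b in range(min(counts), max(counts) + 1):
        # Scale bars so the fullest bin is max_bar characters wide.
        bar = "#" * round(counts.get(b, 0) * max_bar / peak)
        lines.append(f"{b * bin_width:6.2f} | {bar}")
    return "\n".join(lines)


if __name__ == "__main__":
    samples = sample_normal(10_000, seed=42)
    print(text_histogram(samples))
```

If matplotlib is available, passing the same samples to `matplotlib.pyplot.hist` is the more usual way to produce the plot; the text version above just avoids the extra dependency.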

Modeling Techniques

  • Why use feature selection? If two predictors are highly correlated, what is the effect on the coefficients in the logistic regression? What are the confidence intervals of coefficients?
  • For sample size n, the margin of error is 3. How many more samples do we need to bring the margin of error down to 0.3?
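
The margin of error question can be reasoned through with a short sketch. It assumes the usual setting in which the margin of error is proportional to 1/sqrt(n), with the confidence level and population variance held fixed. Under that assumption, cutting the margin of error by a factor of 10 (from 3 to 0.3) requires 10² = 100 times as many samples, so 99n additional samples. The function name `extra_samples_needed` is hypothetical.

```python
def extra_samples_needed(n, current_moe, target_moe):
    """Additional samples required to shrink the margin of error.

    Assumes margin of error is proportional to 1 / sqrt(n), so the total
    sample size must grow by the square of the reduction factor.
    """
    factor = (current_moe / target_moe) ** 2  # e.g. (3 / 0.3) ** 2
    total = n * factor                        # total samples required
    return total - n                          # samples beyond the n we have


if __name__ == "__main__":
    # With n = 100 as an illustrative starting sample size.
    print(extra_samples_needed(100, current_moe=3, target_moe=0.3))
```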




Data Engineer interested in constructing Data-Driven Architecture with Cloud Service
